I've seen Samba servers where files are still locked even though the corresponding smbd process no longer exists. At first I thought it was a reiserfs problem, but it also occurred on ext3. I ran parallel tdbtorture tests on different filesystems and noticed that on all of them (reiser3, reiser4, ext2/3, xfs) tdbtorture sooner or later throws fatal error messages like: rec_read bad magic 0x42424242 at offset=776. It looks like there are some problems in the tdb code.
OK, I'm trying to reproduce this. Any more details on how long it takes to reproduce with tdbtorture? Jeremy.
I can't reproduce this on Fedora Core 1 (2.4.22 kernel) with ext3 + ACL patches. What kernel are you seeing the tdbtorture corruptions on? Have you tried modifying it to use only pread/pwrite rather than mmap? Jeremy.
I was using the SuSE 9.2 kernel (2.6.8 based) on a 1.4GHz x86 system. I ran 4 to 6 tdbtorture instances in parallel on the same filesystem, each in a different directory, restarting them again and again; it took about 10 minutes to see those errors. The filesystems were freshly created and contained just the torture files. I did not modify the torture test. If you want, I can make whatever modifications you suggest and rerun the test.
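The procedure described above (several instances in parallel, each in its own directory, restarted repeatedly) could be scripted roughly as follows. This is only a sketch: the helper names `run_once` and `stress`, the directory layout, and the default marker string are made up for illustration, and the command to run is passed in by the caller rather than assuming any particular tdbtorture options.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_once(cmd, workdir):
    """Run one instance of cmd inside workdir; return (exit code, combined output)."""
    proc = subprocess.run(
        cmd,
        cwd=workdir,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return proc.returncode, proc.stdout


def stress(cmd, base_dir, instances=4, rounds=3, marker="bad magic"):
    """Start `instances` copies of cmd in parallel, `rounds` times over.

    Each copy runs in its own subdirectory of base_dir (hypothetical layout).
    Returns a list of (round, instance, output) for every run whose output
    contains `marker`.
    """
    base = Path(base_dir)
    dirs = []
    for i in range(instances):
        d = base / f"run{i}"
        d.mkdir(parents=True, exist_ok=True)
        dirs.append(d)

    hits = []
    for rnd in range(rounds):
        # One worker thread per instance; each thread just waits on its child.
        with ThreadPoolExecutor(max_workers=instances) as pool:
            results = list(pool.map(lambda d: run_once(cmd, d), dirs))
        for i, (_code, out) in enumerate(results):
            if marker in out:
                hits.append((rnd, i, out.strip()))
    return hits


if __name__ == "__main__" and len(sys.argv) > 2:
    # Example invocation (paths are placeholders):
    #   python stress.py /mnt/testfs /absolute/path/to/tdbtorture
    base_dir, *command = sys.argv[1:]
    for rnd, inst, out in stress(command, base_dir):
        print(f"round {rnd}, instance {inst}: {out}")
```

The command should be given as an absolute path, since each instance runs with its working directory changed to its own subdirectory.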
I'd like you to remove the -DHAVE_MMAP=1 from the standalone compile and then try and reproduce the error. This will tell me if it's in the kernel or in the tdb libraries. Jeremy.
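For readers unfamiliar with the distinction being tested here: the same bytes in a file can be reached either through explicit pread/pwrite calls or through a memory mapping. The sketch below is a generic illustration of those two access paths in Python, not tdb's actual code; the helper names, the little-endian layout, and the scratch file are assumptions made for the example, and only the magic value and offset are borrowed from the error message in this report.

```python
import mmap
import os
import struct
import tempfile

MAGIC = 0x42424242   # value borrowed from the reported error message
OFFSET = 776         # offset borrowed from the reported error message


def read_u32_pread(fd, offset):
    """Read a 32-bit value with an explicit positioned read (no mapping)."""
    return struct.unpack("<I", os.pread(fd, 4, offset))[0]


def read_u32_mmap(fd, offset):
    """Read the same 32-bit value through a read-only memory mapping."""
    with mmap.mmap(fd, 0, access=mmap.ACCESS_READ) as mapping:
        return struct.unpack_from("<I", mapping, offset)[0]


def demo():
    """Write a value with pwrite, then read it back through both paths."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        # Positioned write; the file is extended to OFFSET + 4 bytes.
        os.pwrite(fd, struct.pack("<I", MAGIC), OFFSET)
        return read_u32_pread(fd, OFFSET), read_u32_mmap(fd, OFFSET)


if __name__ == "__main__":
    via_pread, via_mmap = demo()
    print(hex(via_pread), hex(via_mmap))
```

If a failure appears with only one of the two paths, that points at the mapping layer; if it appears with both, as turned out to be the case later in this thread, the cause is more likely elsewhere.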
I think this ended up being a bug in the torture test, didn't it?
Even without -DHAVE_MMAP=1 I got those errors, but I didn't investigate much further. I will try once more later with a recent Samba version and a newer kernel and keep you up to date here. I can't see why this should be a tdbtorture bug. It might, however, explain why some people have problems with locked files not being "unlocked" with recent Samba versions.
No, I know what this is. I tracked down the problem but then didn't update the bug (sorry). It's not something we would run into in smbd or the rest of Samba; it's a problem in the way tdbtorture uses the tdb library (it allows a re-open race). I'll update this when I have more time. Jeremy.
Didn't this get fixed now? In tdbtorture at least?
Yes, this got fixed by tridge's changes to tdb.c, integrated in 3.0.20a. Jeremy.