We are using Oracle's Linear Tape File System Library Edition (LTFSLE) to store files on a Linux server. LTFSLE exposes metadata for files and directories that are on tape. We share the directory structure containing these files and directories through Samba. The underlying file system type is LTFS Open. Samba clients using Windows Explorer can click on directories and files, which causes the tape to load and perform POSIX-type file system operations (read, write, etc.). Everything works fine until a directory holds more than 563 entries, whether files alone or a combination of files and directories. For example, if we have 800 files in the top-level directory on the Linux server, only 563 can be seen or operated on from the Windows client.
We first saw this issue running Samba 3.X and upgraded to 4.0.
We have tried the latest Windows 7 and Windows 10 and see the same behavior.
We are running Oracle Enterprise Linux 2.6.32-431.29.2.el6.x86_64.
The server rpms for samba are:
Note that the filesystem is backed by LTFS Open; the actual file system exposed through Samba is of type LTFS_LE. Here is the "mount" output:
/mnt/LTFS_LE/metadata on /LTFSLE type LTFS_LE (rw)
Hi Bob, can you upload a file containing the ls -l output from the directory you're trying to list, and also a smbd debug level 10 log from the client trying to list the same directory. A wireshark trace plus your smb.conf file would also be helpful.
(In reply to Jeremy Allison from comment #2)
Yes, we will work on gathering this data and let you know.
Created attachment 11846 [details]
ls -l output, smbd debug level 10 log from the client, wireshark trace, and smb.conf file
Jeremy, the information that you requested is attached.
Set to unconfirmed as the requested info is attached.
OK - the relevant frame is 66 in the wireshark trace. This contains a chained SMB2 packet containing a directory open (Create) followed by 2 find requests.
Looking at log.oldstyle tells me something very interesting. I have a suspicion that telldir() on your LTFS filesystem is broken. Samba depends on a working telldir() to break directory listings into multiple chunks.
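To make the dependency concrete, here is a toy Python model of a chunked listing that saves a cookie after each chunk and seeks back to it before the next one. It is only a sketch and is not Samba or LTFS code: `ToyDir`, `list_in_chunks`, `broken_after` and the behaviour of the broken stream (every position past a threshold reports 0x7FFFFFFF, and seeking to that value lands at end of directory) are assumptions made up for illustration.

```python
# Toy model only: NOT Samba or LTFS code. All names here are hypothetical.

END_COOKIE = 0x7FFFFFFF  # assumed "last offset" marker of the broken stream


class ToyDir:
    """In-memory directory stream with telldir/seekdir-style cookies."""

    def __init__(self, names, broken_after=None):
        self.names = list(names)
        # If set, every position >= broken_after reports END_COOKIE.
        self.broken_after = broken_after
        self.pos = 0

    def readdir(self):
        """Return the next name, or None at end of directory."""
        if self.pos >= len(self.names):
            return None
        name = self.names[self.pos]
        self.pos += 1
        return name

    def telldir(self):
        """Return a cookie for the current position."""
        if self.broken_after is not None and self.pos >= self.broken_after:
            return END_COOKIE  # no longer identifies the real position
        return self.pos  # healthy case: one unique cookie per position

    def seekdir(self, cookie):
        """Reposition the stream to a cookie returned by telldir()."""
        self.pos = len(self.names) if cookie == END_COOKIE else cookie


def list_in_chunks(d, chunk_size):
    """List a directory in chunks, restarting each chunk from a saved cookie."""
    seen = []
    cookie = d.telldir()
    while True:
        d.seekdir(cookie)  # resume where the previous chunk stopped
        for _ in range(chunk_size):
            name = d.readdir()
            if name is None:
                return seen
            seen.append(name)
        cookie = d.telldir()  # remember where to restart the next chunk


names = ["file%04d" % i for i in range(800)]
print(len(list_in_chunks(ToyDir(names), 100)))                    # healthy stream
print(len(list_in_chunks(ToyDir(names, broken_after=563), 100)))  # broken stream
```

In this toy run the healthy stream yields all 800 names, while the broken one stops early as soon as a chunk boundary falls past the point where the cookies stop being meaningful. The exact cut-off depends on the chunk size chosen here, so it is not meant to reproduce the 563 figure.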
From the man pages:
TELLDIR(3) Linux Programmer's Manual TELLDIR(3)
telldir - return current location in directory stream
long telldir(DIR *dirp);
So on your x86_64 this should return a 64-bit cookie that represents the current directory position. Samba stores this 'long' so that it can restart the listing from that position. Now let's look at what is in log.oldstyle. If I restrict my query to the lines with mid=27 (the first FIND request) I see:
grep 'smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset' log.oldstyle.mod|wc -l
Hmmm. Close to the number you're seeing. Let's look closer...
grep 'smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset' log.oldstyle.mod
smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 0 <- special start of directory offset
smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 2147483648 <- special Samba dot dot directory offset (0x80000000)
smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 6066357
smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 6329624
smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 13096377
smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 13998200
smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 2130973939
smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 2132715850
smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 2135244613
smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 2147483647
smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset -1 <- special Samba end of directory offset
However - look at the offset just *before* the -1 value. 2147483647 == 0x7FFFFFFF
This looks suspiciously to me like an internal 'magic' number that might be used by the LTFS filesystem to mean 'last offset'. But it's only a 32-bit value. This is a 64-bit system - these cookies should go much larger than that.
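The same check can be scripted instead of eyeballing the grep output. The following is a rough Python sketch, assuming the log line format shown above; the helper names `offsets` and `summarize` are made up for illustration, and the sample lines are a subset of the ones quoted in this comment.

```python
import re

# Matches the debug lines quoted above, e.g.
#   smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 6066357
OFFSET_RE = re.compile(
    r"smbd_dirptr_get_entry: dirptr (0x[0-9a-fA-F]+) now at offset (-?\d+)"
)
INT32_MAX = 0x7FFFFFFF

# Offsets that are Samba's own markers, as annotated in the listing above:
# 0 (start of directory), 0x80000000 (dot dot), -1 (end of directory).
SAMBA_SPECIAL = {0, 0x80000000, -1}


def offsets(lines, dirptr=None):
    """Extract the logged offsets, optionally for a single dirptr value."""
    found = []
    for line in lines:
        m = OFFSET_RE.search(line)
        if m and (dirptr is None or m.group(1) == dirptr):
            found.append(int(m.group(2)))
    return found


def summarize(offs):
    """Report how the filesystem-provided cookies relate to the 32-bit limit."""
    fs_cookies = [o for o in offs if o not in SAMBA_SPECIAL]
    return {
        "entries": len(offs),
        "fs_cookies": len(fs_cookies),
        "max_cookie": max(fs_cookies, default=None),
        "hits_int32_max": INT32_MAX in fs_cookies,
        "all_within_int32": all(0 <= o <= INT32_MAX for o in fs_cookies),
    }


sample = """\
smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 0
smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 2147483648
smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 6066357
smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 2135244613
smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 2147483647
smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset -1
""".splitlines()

print(summarize(offsets(sample, dirptr="0x16803d0")))
```

Against the real log, something like `summarize(offsets(open("log.oldstyle.mod"), dirptr="0x16803d0"))` should give the same picture: if every filesystem cookie stays within the signed 32-bit range and the largest one is exactly 0x7FFFFFFF, that points at the filesystem rather than at Samba.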
I'm guessing you might not see this in 'ls', as /bin/ls doesn't use the telldir() call, so ignores return cookies.
This is looking more and more to me like an LTFS bug. Can you investigate inside LTFS to discover when and why telldir() might return 0x7FFFFFFF?
The samba telldir call maps to the LTFSLE readdir call.
This calls into the EXT3 readdir function. The offset is not handled correctly.
We learned that the issue did not occur with Samba 2.x, which was current when the LTFSLE code was originally developed.
Changes in Samba 3.x appear to have exacerbated this in our code.
We are investigating this from our side. Thanks for the help!
Please advise on whether you would prefer to close this, or wait for us to confirm our fix. I am not sure how to set the status in either event.
Thanks for letting me know. I'll close this one out on our side, but I'd love an update on when you get a fixed version of LTFS Open to your customers so I can advise that modern Samba will work with it.