Bug 11741 - Samba clients do not list all files that are present in a directory on the server share.
Status: RESOLVED INVALID
Alias: None
Product: Samba 4.0
Classification: Unclassified
Component: File services
Version: 4.0.0
Hardware: x64 Linux
Importance: P5 normal
Target Milestone: ---
Assignee: Samba QA Contact
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-02-17 18:36 UTC by Bob
Modified: 2016-02-19 17:24 UTC

See Also:


Attachments
ls -l output, smbd debug level 10 log from the client, wireshark trace, and smb.conf file (326.18 KB, application/zip)
2016-02-17 20:05 UTC, Bob

Description Bob 2016-02-17 18:36:43 UTC
We are using Oracle's Linear Tape File System Library Edition (LTFSLE) to store files on a Linux server. LTFSLE exposes metadata for files and directories that are on tape. We share the directory structure that contains these files and directories through Samba. The underlying file system type is LTFS Open. From Windows Explorer, Samba clients can click on directories and files, which causes the tape to load and POSIX-style file system operations (read, write, etc.) to be performed. Everything works fine until a directory holds more than 563 entries, whether files alone or a mix of files and subdirectories. For example, if we have 800 files in the top-level directory on the Linux server, only 563 can be seen or operated on from the Windows client.
We first saw this issue running Samba 3.x and have since upgraded to 4.0.

We have tried the latest Windows 7 and Windows 10 and see the same behavior.
We are running Oracle Enterprise Linux with kernel 2.6.32-431.29.2.el6.x86_64.
The server rpms for samba are:
samba4-winbind-4.0.0-67.el6_7.rc4.x86_64
samba4-common-4.0.0-67.el6_7.rc4.x86_64
samba4-libs-4.0.0-67.el6_7.rc4.x86_64
samba4-4.0.0-67.el6_7.rc4.x86_64
samba4-client-4.0.0-67.el6_7.rc4.x86_64
Comment 1 Bob 2016-02-17 18:49:50 UTC
Note that the file system is backed by LTFS Open, but the actual file system exposed through Samba is of type LTFS_LE. Here is the "mount" output:

/mnt/LTFS_LE/metadata on /LTFSLE type LTFS_LE (rw)
Comment 2 Jeremy Allison 2016-02-17 18:55:50 UTC
Hi Bob, can you upload a file containing the ls -l output from the directory you're trying to list, and also an smbd debug level 10 log from the client trying to list the same directory? A Wireshark trace plus your smb.conf file would also be helpful.
Comment 3 Bob 2016-02-17 19:05:09 UTC
(In reply to Jeremy Allison from comment #2)
Yes, we will work on gathering this data and let you know.
Comment 4 Bob 2016-02-17 20:05:45 UTC
Created attachment 11846
ls -l output, smbd debug level 10 log from the client, wireshark trace, and smb.conf file

Jeremy, the information that you requested is attached.
Comment 5 Bob 2016-02-17 20:17:04 UTC
Set to unconfirmed as the requested info is attached.
Comment 6 Jeremy Allison 2016-02-17 23:38:34 UTC
OK - the relevant frame is 66 in the Wireshark trace. This contains a chained SMB2 packet containing a directory open (Create) followed by 2 find requests.

Looking at log.oldstyle tells me something very interesting. I have a suspicion that telldir() on your LTFS filesystem is broken. Samba depends on a working telldir() to break directory listings into multiple chunks.
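To illustrate that dependence, here is a minimal sketch in C. It is not Samba's actual code, and the function names count_plain and count_chunked are made up for this example. It walks a directory once in a single readdir() loop, then again in fixed-size chunks, saving a telldir() cookie at the end of each chunk and calling seekdir() on it before the next. On a file system with working cookies, both walks should see the same number of entries.

```c
#include <dirent.h>
#include <stddef.h>

/* Count entries with one uninterrupted readdir() loop.
 * No cookies are involved, so a broken telldir() cannot affect it.
 * Returns -1 if the directory cannot be opened. */
long count_plain(const char *path)
{
    DIR *dirp = opendir(path);
    if (dirp == NULL)
        return -1;

    long total = 0;
    while (readdir(dirp) != NULL)
        total++;

    closedir(dirp);
    return total;
}

/* Count entries in chunks of `chunk` entries. After each full chunk the
 * position is saved with telldir(), and the next chunk starts by
 * seeking back to that saved cookie with seekdir().
 * Returns -1 on a bad chunk size or if the directory cannot be opened.
 * For brevity, a readdir() error is treated like end of directory. */
long count_chunked(const char *path, int chunk)
{
    if (chunk < 1)
        return -1;

    DIR *dirp = opendir(path);
    if (dirp == NULL)
        return -1;

    long total = 0;
    long resume = telldir(dirp);        /* cookie for the start of the directory */

    for (;;) {
        seekdir(dirp, resume);          /* resume where the last chunk stopped */

        int got = 0;
        while (got < chunk && readdir(dirp) != NULL)
            got++;
        total += got;

        if (got < chunk)                /* short chunk: end of directory reached */
            break;

        resume = telldir(dirp);         /* remember where the next chunk begins */
    }

    closedir(dirp);
    return total;
}
```

If count_chunked() returns fewer entries than count_plain() for the same unchanged directory, a cookie somewhere along the way did not lead back to the right position, and a chunked listing would be cut short.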

From the man pages:

---------------------------------------------------------------------
TELLDIR(3)                                                    Linux Programmer's Manual                                                    TELLDIR(3)

NAME
       telldir - return current location in directory stream

SYNOPSIS
       #include <dirent.h>

       long telldir(DIR *dirp);
---------------------------------------------------------------------

So on your x86_64 system this should return a 64-bit cookie that represents the current directory position. Samba saves this 'long' so it can restart the listing from that position later. Now let's look at what is in log.oldstyle. If I restrict my query to the lines with mid=27 (the first FIND request) I see:


grep 'smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset' log.oldstyle.mod|wc -l
520

Hmmm. Close to the number you're seeing. Let's look closer...

grep 'smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset' log.oldstyle.mod

  smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 0                    <- special start of directory offset
  smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 2147483648           <- special Samba dot dot directory offset (0x80000000)
  smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 6066357
  smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 6329624
  smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 13096377
  smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 13998200
  ...
  more entries
  ...
  smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 2130973939
  smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 2132715850
  smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 2135244613
  smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset 2147483647
  smbd_dirptr_get_entry: dirptr 0x16803d0 now at offset -1                  <- special Samba end of directory offset
  
However - look at the offset just *before* the -1 value. 2147483647 == 0x7FFFFFFF

This looks suspiciously to me like an internal 'magic' number that might be used by the LTFS filesystem to mean 'last offset'. But it's only a 32-bit value. This is a 64-bit system - these cookies should go much larger than that.

I'm guessing you might not see this in 'ls', as /bin/ls doesn't use the telldir() call, so it ignores the returned cookies.

This is looking more and more to me like an LTFS bug. Can you investigate inside LTFS to discover when and why telldir() might return 0x7FFFFFFF?
Comment 7 Bob 2016-02-19 17:19:39 UTC
The Samba telldir call maps to the LTFSLE readdir call.
This calls into the EXT3 readdir function. The offset is not handled correctly.
We learned that the issue did not occur with Samba 2.x, which was current when the LTFSLE code was originally developed.
Changes in Samba 3.x appear to have exacerbated this in our code.
We are investigating this from our side. Thanks for the help!
Please advise on whether you would prefer to close this or wait for us to confirm our fix. I am not sure how to set the status in either event.
Comment 8 Jeremy Allison 2016-02-19 17:24:57 UTC
Thanks for letting me know. I'll close this one out on our side, but I'd love an update on when you get a fixed version of LTFS Open to your customers so I can advise that modern Samba will work with it.

Thanks!

Jeremy.