Bug 3404 - infinite ls loop with mount.cifs on w2003 cifs server
Status: RESOLVED FIXED
Product: CifsVFS
Classification: Unclassified
Component: kernel fs
Version: 2.6
Hardware: Alpha Windows 2003
Importance: P3 major
Target Milestone: ---
Assigned To: Steve French
Reported: 2006-01-13 11:07 UTC by Miguel de Jesus
Modified: 2009-04-17 13:41 UTC (History)

Description Miguel de Jesus 2006-01-13 11:07:15 UTC
dist. CentOS-4.2
kernel  2.6.9-22.0.1 with cifs module 1.34
samba 3.0.21a

the kernel was recompiled for iso8859-1 due to char translation problems with utf8 (default)

____________
fstab entry:

//win2003/foo$ /foo cifs ro,credentials=/path/secret,uid=48,gid=48,dir_mode=0100,file_mode=0500,iocharset=iso8859-1 0 0
_____________

Problem:

When mounting a read-only win2003 server share with multiple subdirectories using mount.cifs, an infinite loop occurs in some subdirectories. E.g. the directory etc in //win2003/foo$ gives, on the linux server:

> ls /foo/etc/etc/etc/etc/ ...
> etc


The behaviour is quite odd, since it does not occur for every directory in the share. Some of them are reported accurately.
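
A minimal sketch, not part of the original report, of how the affected directories could be located without hanging on the loop. It walks the mount to a bounded depth and flags any path whose last few components all share one name, as in /foo/etc/etc/etc. The function name, the repeat threshold and the depth limit are illustrative assumptions.

```python
import os


def find_repeating_paths(root, min_repeats=3, max_depth=8):
    """Return directories under root whose last `min_repeats` path
    components are all the same name (e.g. .../etc/etc/etc)."""
    suspicious = []
    stack = [(root, [])]  # (directory path, components below root)
    while stack:
        path, parts = stack.pop()
        if len(parts) >= min_repeats and len(set(parts[-min_repeats:])) == 1:
            suspicious.append(path)
            continue  # do not descend further into a suspected loop
        if len(parts) >= max_depth:
            continue  # hard bound so a looping mount cannot hang the scan
        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append((entry.path, parts + [entry.name]))
        except OSError:
            # A directory that cannot be read is skipped, not treated as fatal.
            continue
    return sorted(suspicious)
```

Calling find_repeating_paths("/foo") on the mount from the fstab entry would list the suspect directories. A legitimately nested tree with repeated names is flagged too, so each hit still needs to be checked against the server.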
Comment 1 Steve French 2006-01-13 11:17:55 UTC
I have a theory that this is related to the "resume name" getting corrupted (and would be fixed with slightly more current cifs). When the returned search data exceeds 16K (the smb/cifs network buffer size can be smaller than this too, if the server negotiates a smaller size at session setup time), the first search request (SMB FindFirst) is followed by one or more SMB FindNext requests, each of which starts at a "resume name" that can be truncated/corrupted on cifs releases earlier than 1.35. One additional search problem was fixed in cifs version 1.37: incorrect search output could result when files were deleted from a directory while the client was searching (readdir) a large directory.

If this fails with the cifs version in 2.6.15 (1.39 IIRC), I would be even more interested in looking at it. The version of cifs on the download site for RHEL4 (1.39 is probably the last one I backported) should work on most 2.6.9-based distros. If you try that, let me know whether it fails.
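
The resume-name theory above can be illustrated with a toy model. This is a sketch only, not CIFS client or server code. The helpers find_first, find_next and list_directory are hypothetical. The thread does not say how a server reacts to a resume name it does not recognise, so restarting the search from the beginning is purely an assumption made for the illustration.

```python
def find_first(entries, batch):
    """Toy FindFirst: return the first `batch` names."""
    return entries[:batch]


def find_next(entries, resume_name, batch):
    """Toy FindNext: continue after `resume_name`.

    ASSUMPTION: an unrecognised resume name makes this toy server start over.
    """
    start = entries.index(resume_name) + 1 if resume_name in entries else 0
    return entries[start:start + batch]


def list_directory(entries, batch, truncate_to=None, max_requests=20):
    """Toy client. Returns (names, completed)."""
    names = find_first(entries, batch)
    chunk = names
    requests = 1
    while len(chunk) == batch:  # a full batch suggests more entries may follow
        if requests >= max_requests:
            return names, False  # gave up: the listing never terminated
        resume = chunk[-1]
        if truncate_to is not None:
            resume = resume[:truncate_to]  # simulate a truncated resume name
        chunk = find_next(entries, resume, batch)
        names = names + chunk
        requests += 1
    return names, True
```

With ten entries named file.00 to file.09 and a batch of 4, an intact resume name yields the full listing in three requests. With truncate_to=4 the resume name becomes "file", which matches no entry, so the toy server keeps returning the first batch until max_requests stops the client.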
Comment 2 Miguel de Jesus 2006-01-13 14:23:34 UTC
Dear Steve,

thanks for the help.

I've recompiled the source tree with the cifs-1.39rhel4 source, however the same problem persists.
I also compiled cifs-1.40 but couldn't load the module.

The problem seems to happen after the 2nd subdirectory level of some directories.

I was previously using mount.smb with 2.4 kernel and it worked beautifully, however we needed to upgrade the distro (kernel 2.6.9), due to security problems, and mount.smb is giving us a lot of problems now. Most of them are solved, so I read in some posts, by changing to mount.cifs. However ...

Thank you for your time.

Miguel
Comment 3 Steve French 2006-01-13 14:45:55 UTC
Could you attach an ethereal network trace of the failing directory search (or equivalently save a binary network trace via tcpdump if you don't have ethereal installed)?

An additional piece of data that would be somewhat helpful: just before the failure, turn on debugging ("echo 1 > /proc/fs/cifs/cifsFYI"), recreate the problem, and then send the debugging output ("dmesg > debug_output").
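
The debugging steps above could be scripted roughly as follows. This is a sketch under assumptions: the function name and its parameters are hypothetical, only the /proc/fs/cifs/cifsFYI path and the dmesg command come from the comment, and it would need to run as root on the affected client.

```python
import subprocess


def capture_cifs_debug(reproduce, fyi_path="/proc/fs/cifs/cifsFYI",
                       out_path="debug_output", dmesg_cmd=("dmesg",)):
    """Enable cifs debugging, run `reproduce`, then save the kernel log."""
    # Equivalent of: echo 1 > /proc/fs/cifs/cifsFYI
    with open(fyi_path, "w") as f:
        f.write("1\n")
    try:
        reproduce()  # e.g. a callable that lists the failing directory
    finally:
        # Equivalent of: dmesg > debug_output
        log = subprocess.run(list(dmesg_cmd), capture_output=True,
                             text=True, check=False).stdout
        with open(out_path, "w") as f:
            f.write(log)
    return out_path
```

The kernel log is saved even if the reproduction step raises, so a hang that is interrupted still leaves debug_output behind.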
Comment 4 Miguel de Jesus 2006-01-16 08:18:35 UTC
(In reply to comment #3)

Dear Steve,

I've sent some debug data directly to you.

Thank you,

Regards,

Miguel 
Comment 5 Miguel de Jesus 2006-02-01 05:56:48 UTC
tried to mount the same win2003 share with

dist. SuSE 9.2
kernel  2.6.8-24.18 with cifs module 1.40
samba-3.0.21b-1.1.2
arch x86_64

the infinite loop ls problem disappeared, however it cannot read relatively large files

kernel errors>
kernel:  CIFS VFS: Calculated size 0x27 vs actual length 0x48
kernel:  CIFS VFS: bad smb size detected for Mid=1141

apparently the system reads the first bytes of the file and then the process hangs

regards
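
For reference, a small sketch of how the two kernel messages above could be pulled out of a log and converted to decimal. The regular expressions are written against the two lines quoted in this comment only. Pairing each size message with the next "bad smb size" line is an assumption, and the function name is hypothetical.

```python
import re

SIZE_RE = re.compile(
    r"CIFS VFS: Calculated size (0x[0-9a-fA-F]+) vs actual length (0x[0-9a-fA-F]+)"
)
MID_RE = re.compile(r"CIFS VFS: bad smb size detected for Mid=(\d+)")


def summarize_size_errors(lines):
    """Return one dict per size mismatch found in the given log lines."""
    results = []
    pending = None
    for line in lines:
        m = SIZE_RE.search(line)
        if m:
            calculated = int(m.group(1), 16)
            actual = int(m.group(2), 16)
            pending = {"calculated": calculated, "actual": actual,
                       "difference": actual - calculated, "mid": None}
            results.append(pending)
            continue
        m = MID_RE.search(line)
        if m and pending is not None:
            # ASSUMPTION: the Mid line refers to the mismatch reported just before it.
            pending["mid"] = int(m.group(1))
            pending = None
    return results
```

For the messages above this gives a calculated size of 39 bytes (0x27) against an actual length of 72 bytes (0x48), a difference of 33 bytes, for Mid 1141.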
Comment 6 Steve French 2006-02-14 21:19:34 UTC
On the problem reading from large files: any chance you could get an ethereal trace of that? (I don't think you sent one, but let me know if you already did and I missed it.)
Comment 7 Miguel de Jesus 2006-02-16 05:09:09 UTC
Dear Steve

I've posted some more ethereal traces.
Comment 8 Miguel de Jesus 2006-02-16 13:00:03 UTC
Large transfer problem solved by giving additional read/execute permissions to the "mounting" user on the win2003 file server. Good transfer rates are also achieved.

The infinite loop still persists on large directories.
Comment 9 Miguel de Jesus 2006-02-16 17:28:59 UTC
I think we might have isolated the infinite loop problem.

As far as we know it only occurs on alpha systems and for folder names of four characters or fewer. After renaming 2-character folders to 5 characters the problem disappears.
Comment 10 Bohdan Linda 2006-07-18 02:39:37 UTC
Negative, I have encountered the same error with kernel 2.6.17 and samba version 3.0.22.

I am on x86 and my dir names are longer than 2-3 chars.
Comment 11 Bohdan Linda 2006-07-18 02:43:47 UTC
sorry, forgot to add:

/share/Infrastructure/Infrastructure/Infrastructure# ls
ls: reading directory .: Object is remote


where /share is the mountpoint. I need to traverse 3 levels to get this error message. The subdirs are looped.
Comment 12 Shirish S. Pargaonkar 2009-03-31 22:14:52 UTC
This problem may be fixed.

I have two shares, /mnt/smb_a mounted over with a share from a samba server (Version 3.0.33-3.7.el5)
and /mnt/smb_c mounted over with a share from a Windows 2003 server.

I do not encounter a looping or remote object error with the cifs version I have, 1.59.

cifstest6:# ls -l /mnt/smb_a/testdir/dir.0/dir.4/dir.4/dir.4
total 80
drwxr-xr-x 2 root root 0 Mar 31 21:05 dir.0
drwxr-xr-x 2 root root 0 Mar 31 21:05 dir.1
drwxr-xr-x 2 root root 0 Mar 31 21:05 dir.2
drwxr-xr-x 2 root root 0 Mar 31 21:05 dir.3
drwxr-xr-x 2 root root 0 Mar 31 21:05 dir.4
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.0
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.1
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.10
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.11
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.12
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.13
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.14
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.15
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.16
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.17
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.18
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.19
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.2
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.3
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.4
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.5
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.6
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.7
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.8
-rw-r--r-- 1 root root 0 Mar 31 21:05 file.9

cifstest6: # ls -l /mnt/smb_c/testdir/dir.0/dir.4/dir.4/dir.4
total 0
drwxrwxrwx 2 root root 0 Mar 31 21:58 dir.0
drwxrwxrwx 2 root root 0 Mar 31 21:58 dir.1
drwxrwxrwx 2 root root 0 Mar 31 21:58 dir.2
drwxrwxrwx 2 root root 0 Mar 31 21:58 dir.3
drwxrwxrwx 2 root root 0 Mar 31 21:58 dir.4
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.0
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.1
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.10
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.11
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.12
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.13
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.14
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.15
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.16
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.17
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.18
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.19
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.2
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.3
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.4
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.5
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.6
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.7
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.8
-rwxrwSrwx 1 root root 0 Mar 31 21:58 file.9
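
A test tree with the same naming as the listings above (dir.0 to dir.4 and file.0 to file.19 in each directory) could be generated with something like the sketch below. How the original tree was created is not stated in this comment, so the function name, its defaults and the rule that the deepest directories hold only files are all assumptions.

```python
import os


def make_test_tree(root, depth=2, dirs_per_level=5, files_per_dir=20):
    """Create nested dir.N directories with empty file.N files in each.

    Returns the number of directories created below `root`.
    """
    os.makedirs(root, exist_ok=True)
    for i in range(files_per_dir):
        # Empty files, matching the zero sizes in the listing.
        with open(os.path.join(root, "file.%d" % i), "w"):
            pass
    count = 0
    if depth > 0:
        for i in range(dirs_per_level):
            sub = os.path.join(root, "dir.%d" % i)
            count += 1 + make_test_tree(sub, depth - 1,
                                        dirs_per_level, files_per_dir)
    return count
```

The directory count grows as dirs_per_level to the power of depth, so a small depth is advisable when creating the tree on a share.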
Comment 13 Steve French 2009-04-17 13:41:53 UTC
Appears to have been fixed long ago.