Bug 2698 - directory listings goes into infinite loop
Status: CLOSED FIXED
Product: Samba 3.0
Classification: Unclassified
Component: File Services
Version: 3.0.14a
Hardware: x86 Linux
Importance: P3 normal
Target Milestone: none
Assigned To: Jeremy Allison
QA Contact: Samba QA Contact
Depends on:
Blocks:
Reported: 2005-05-11 00:56 UTC by Alexander Vodomerov
Modified: 2005-08-24 10:27 UTC (History)
0 users

See Also:


Attachments
tcpdump session log (in binary format) (65.93 KB, application/octet-stream)
2005-05-30 00:41 UTC, Alexander Vodomerov
tcpdump session log (in binary format), full version (235.74 KB, application/octet-stream)
2005-05-31 00:26 UTC, Alexander Vodomerov
Patch (3.49 KB, patch)
2005-05-31 11:59 UTC, Jeremy Allison

Description Alexander Vodomerov 2005-05-11 00:56:48 UTC
An attempt to list files on a WinXP machine goes into an infinite loop. This
happens with 3.0.13, 3.0.14a and 3.0.15pre2. With 3.0.12 everything works fine.
 
$ bin/smbclient -d 3 //172.16.34.151/video 
... skipped some info ... 
Client started (version 3.0.15pre2). 
... 
Domain=[SMS] OS=[Windows 5.1] Server=[Windows 2000 LAN Manager] 
dos_clean_name [] 
smb: \> ls 
received 30 entries (eos=0) 
received 30 entries (eos=0) 
received 30 entries (eos=0) 
received 30 entries (eos=0) 
received 30 entries (eos=0) 
received 30 entries (eos=0) 
received 30 entries (eos=0) 
received 30 entries (eos=0) 
received 30 entries (eos=0) 
... and this repeats forever ... 
 
There really are 30 files in this folder, nothing unusual (no DFS, domains or
anything like that). It seems the search never terminates but keeps restarting
from the beginning.
 
I've looked at the diff between 3.0.12 and 3.0.13. The only interesting change
is in source/libsmb/clilist.c:205
 
-                       SSVAL(param,4,(FLAG_TRANS2_FIND_REQUIRE_RESUME|FLAG_TRANS2_FIND_CLOSE_IF_END)); /* resume required + close on end */
+                       SSVAL(param,10,(FLAG_TRANS2_FIND_REQUIRE_RESUME|FLAG_TRANS2_FIND_CLOSE_IF_END));        /* resume required + close on
 
As I understand from the documentation, if the continue bit is clear, the search
will be rewound. Before this change the flags field contained garbage (because
the flags were written over the info level instead), so this bit had a random
value.
There is also a comment by JRA saying that the continue bit doesn't work.
However, that comment was written before the 4->10 fix, so perhaps it is not the
continue bit that doesn't work but an info level with bit 3 set (just a
hypothesis). Maybe we can use it?
 
I would be glad to provide any additional info (debug sessions, sniffer logs, 
anything else). If nothing helps, I can provide a shell on my machine.
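
For readers following the offset discussion above, here is a minimal sketch in C of how the fixed part of a findnext parameter block is laid out. It is an illustration, not the Samba code: put_le16 and get_le16 are hypothetical stand-ins for the SSVAL/SVAL macros, the flag values are the ones I recall from the CIFS documentation and should be treated as assumptions, and the block is zero-filled here whereas the report describes the untouched flags field as containing garbage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flag values as recalled from the CIFS documentation (assumed, not
 * copied from the Samba headers). */
#define FIND_CLOSE_IF_END   0x0002
#define FIND_REQUIRE_RESUME 0x0004
#define FIND_CONTINUE       0x0008

/* Fixed part of the parameter block; the resume file name follows it. */
#define FINDNEXT_FIXED_LEN 12

/* Store a 16-bit value little-endian; hypothetical stand-in for SSVAL. */
static void put_le16(uint8_t *buf, size_t off, uint16_t v)
{
    buf[off] = (uint8_t)(v & 0xff);
    buf[off + 1] = (uint8_t)(v >> 8);
}

/* Read a 16-bit little-endian value; hypothetical stand-in for SVAL. */
static uint16_t get_le16(const uint8_t *buf, size_t off)
{
    return (uint16_t)(buf[off] | (buf[off + 1] << 8));
}

/*
 * Fill the fixed part of the block. flags_off = 10 gives the layout of
 * the "+" line in the diff; flags_off = 4 reproduces the "-" line, where
 * the flags land on top of the info level.
 */
static void build_findnext_params(uint8_t param[FINDNEXT_FIXED_LEN],
                                  uint16_t search_id, uint16_t max_count,
                                  uint16_t info_level, size_t flags_off)
{
    assert(flags_off + 2 <= FINDNEXT_FIXED_LEN);
    memset(param, 0, FINDNEXT_FIXED_LEN);
    put_le16(param, 0, search_id);   /* search handle */
    put_le16(param, 2, max_count);   /* max entries per reply */
    put_le16(param, 4, info_level);  /* information level */
    /* bytes 6..9: resume key, left as zero in this sketch */
    put_le16(param, flags_off, FIND_REQUIRE_RESUME | FIND_CLOSE_IF_END);
}
```

Calling build_findnext_params with flags_off = 4 and then reading offset 4 back with get_le16 shows the info level replaced by the flag value, while offset 10 keeps whatever was there before.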
Comment 1 Gerald (Jerry) Carter 2005-05-11 05:05:05 UTC
Jeremy, I know you worked on this but I couldn't find a 
specific svn log entry that referred to a fix.
Comment 2 Alexander Vodomerov 2005-05-27 23:21:20 UTC
I'm using Debian, and this is fixed by the latest smbclient update (package
3.0.14a-3).
It is exactly the same bug as
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=309798
 
Comment 3 Derrell Lipman 2005-05-28 05:51:57 UTC
If the filesystem on the XP machine is FAT, this is likely related to the bug
that was fixed (properly, finally, we think) two days ago.  Please try the code
in svn to ensure that it is now working.
Comment 4 Alexander Vodomerov 2005-05-28 10:35:43 UTC
The directory for which I filed the bug report now really works (with the new
Debian package that contains the patch from svn).
However, I found another machine on which the same bug happens :(
Here are the details:
 
[alex@lorien scripts]$ smbclient //LIGHTHOUSE/Babylon5 -d 3 
lp_load: refreshing parameters 
Initialising global parameters 
params.c:pm_process() - Processing configuration file "/etc/samba/smb.conf" 
Processing section "[global]" 
added interface ip=172.16.16.100 bcast=172.16.19.255 nmask=255.255.252.0 
added interface ip=172.16.36.100 bcast=172.16.39.255 nmask=255.255.252.0 
Client started (version 3.0.14a-Debian). 
resolve_wins: Attempting wins lookup for name LIGHTHOUSE<0x20> 
resolve_wins: using WINS server 10.0.0.1 and tag '*' 
Got a positive name query response from 10.0.0.1 ( 172.16.14.136 ) 
Connecting to 172.16.14.136 at port 445 
Password: 
Doing spnego session setup (blob length=16) 
server didn't supply a full spnego negprot 
Got challenge flags: 
Got NTLMSSP neg_flags=0x628a0215 
NTLMSSP: Set final flags: 
Got NTLMSSP neg_flags=0x60080215 
NTLMSSP Sign/Seal - Initialising with flags: 
Got NTLMSSP neg_flags=0x60080215 
Domain=[LIGHTHOUSE] OS=[Windows 5.1] Server=[Windows 2000 LAN Manager] 
dos_clean_name [] 
smb: \> cd bonus\CD2\BOOK\DDM\ddm 
dos_clean_name [\bonus\CD2\BOOK\DDM\ddm\] 
dos_clean_name [\bonus\CD2\BOOK\DDM\ddm\\] 
smb: \bonus\CD2\BOOK\DDM\ddm\> ls 
received 18 entries (eos=0) 
received 17 entries (eos=0) 
received 17 entries (eos=0) 
received 17 entries (eos=0) 
received 17 entries (eos=0) 
received 17 entries (eos=0) 
... and so on 
 
The same loop. The filesystem appears to be NTFS (I'm not sure; I've only
checked from a WinXP laptop, which shows an NTFS file system when I map the
network drive. Can I trust this info?)
Also, OS=[Windows 5.1] Server=[Windows 2000 LAN Manager] means Win2K or WinXP.
Comment 5 Derrell Lipman 2005-05-29 13:25:58 UTC
Ok.  Would you please provide an ethereal or tcpdump packet trace, in RAW,
BINARY format.  We'll look into it.
Comment 6 Alexander Vodomerov 2005-05-30 00:41:01 UTC
I've grabbed a dump using the command
  tcpdump -ni eth0 -w 1.log host 172.16.17.65
and put it into the 1.log.gz attachment.
172.16.17.65 is the WinXP server, and 172.16.16.100 is the Linux client.
Comment 7 Alexander Vodomerov 2005-05-30 00:41:38 UTC
Created attachment 1249 [details]
tcpdump session log (in binary format)
Comment 8 Derrell Lipman 2005-05-30 06:12:04 UTC
Your patch is fabulous... except, unfortunately, tcpdump truncates packets at
only 68 bytes which isn't enough data to see what's going on here.

If you wouldn't mind rerunning your packet dump with "-s 0", I'd really
appreciate it:

  tcpdump -s 0 -ni eth0 -w 1.log host 172.16.17.65 

Thanks!
Comment 9 Alexander Vodomerov 2005-05-31 00:26:31 UTC
Created attachment 1251 [details]
tcpdump session log (in binary format), full version

Oh, I haven't used tcpdump for a long time and forgot about this quirk. I'm very
sorry.
Here is the right dump (I hope).
Comment 10 Jeremy Allison 2005-05-31 09:18:51 UTC
No it's not the right dump. It's an smbd log, not a packet trace. We need to see
the packet trace.
Jeremy.
Comment 11 Derrell Lipman 2005-05-31 10:09:43 UTC
Disregard last message.  The packet trace is fine.  We're working on it.
Comment 12 Jeremy Allison 2005-05-31 10:57:09 UTC
Ok, I've figured out the problem. What happens is that the files being returned
are in cyrillic unicode. The smbclient code reads these names via the iconv
interface, converting from UCS-2 on the wire to whatever internal character
format smbclient is set to (usually utf8). It so happens that one of the
characters in the last returned name in the findfirst call is 0xab, which for
some reason (probably the selected unix character set in smb.conf) is not valid
in this character set. So the iconv conversion returns '_' for this character.
(I bet there's an "iconv failed" message somewhere in the smbclient log.)

So when the code asks for the "findnext" it sends the wrong name (as the 0xab
character has been translated to '_'). It seems that when a missing name is
requested then W2K just starts from the beginning of the directory again, and so
the whole thing loops.
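
To make the failure mode concrete, here is a small sketch in C of a lossy round trip. It does not call iconv: the "narrow" charset is a hypothetical one that can only represent 7-bit ASCII, and all function names are made up for the illustration. The point is only that once a character has been replaced by '_', converting the name back cannot reproduce what the server sent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Convert n 16-bit code units to a hypothetical narrow charset that can
 * only hold 7-bit ASCII. Anything else becomes '_'. dst needs n + 1
 * bytes. Returns the number of characters that were replaced.
 */
static size_t ucs2_to_narrow(const uint16_t *src, size_t n, char *dst)
{
    size_t replaced = 0;
    for (size_t i = 0; i < n; i++) {
        if (src[i] < 0x80) {
            dst[i] = (char)src[i];
        } else {
            dst[i] = '_';
            replaced++;
        }
    }
    dst[n] = '\0';
    return replaced;
}

/* Convert the narrow form back to 16-bit code units. */
static void narrow_to_ucs2(const char *src, size_t n, uint16_t *dst)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint16_t)(unsigned char)src[i];
}

/* Returns 1 if the name is unchanged after a round trip, 0 otherwise. */
static int survives_round_trip(const uint16_t *name, size_t n)
{
    char narrow[64];
    uint16_t back[64];

    assert(n < sizeof(narrow));
    ucs2_to_narrow(name, n, narrow);
    narrow_to_ucs2(narrow, n, back);
    for (size_t i = 0; i < n; i++) {
        if (back[i] != name[i])
            return 0;
    }
    return 1;
}
```

A name that fails survives_round_trip is exactly the kind of name that cannot safely be sent back to the server as a resume point after conversion.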

There are two ways to fix this. The first is to keep a "raw copy" of the last
returned filename and use that in the findnext call, bypassing the conversion
from the unix character set back to UCS-2. This is probably the preferred way as
it will cause the directory listing to correctly continue. The second is to keep
a "first non . or .. name seen" record and terminate prematurely if we see this
name again. This will truncate the directory listing but alert people to the
fact that the iconv conversion has failed.
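
The two options can be compared with a toy model in C. Everything here is hypothetical: the "server" is an array of names that restarts from the beginning when asked to resume after a name it does not know (the behaviour described above for W2K), lossy_copy stands in for a failed charset conversion, and none of the names correspond to real Samba functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_MAX_LEN 32

/* Stand-in for a failed charset conversion: bytes >= 0x80 become '_'. */
static void lossy_copy(const char *raw, char *out)
{
    size_t i;
    for (i = 0; raw[i] != '\0' && i < NAME_MAX_LEN - 1; i++)
        out[i] = ((unsigned char)raw[i] >= 0x80) ? '_' : raw[i];
    out[i] = '\0';
}

/* Toy server: index of the entry after `resume`, or 0 (restart from the
 * beginning) when the name is not in the directory. */
static size_t server_resume_index(const char *const *names, size_t count,
                                  const char *resume)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(names[i], resume) == 0)
            return i + 1;
    }
    return 0;
}

/*
 * List the directory in batches. With resume_with_raw != 0 the last name
 * is sent back untouched (first option). Otherwise the lossy copy is
 * sent, and the listing stops as soon as the first name shows up again
 * (second option). Returns the number of entries received, or -1 when
 * the restart was detected.
 */
static long list_dir(const char *const *names, size_t count, size_t batch,
                     int resume_with_raw)
{
    char resume[NAME_MAX_LEN];
    const char *first_seen = NULL;
    size_t pos = 0;
    long received = 0;

    assert(batch > 0);
    while (pos < count) {
        size_t end = (pos + batch < count) ? pos + batch : count;

        if (first_seen != NULL && strcmp(names[pos], first_seen) == 0)
            return -1;               /* listing wrapped around */
        if (first_seen == NULL)
            first_seen = names[pos];

        received += (long)(end - pos);
        if (end == count)
            break;                   /* end of search */

        if (resume_with_raw) {
            strncpy(resume, names[end - 1], NAME_MAX_LEN - 1);
            resume[NAME_MAX_LEN - 1] = '\0';
        } else {
            lossy_copy(names[end - 1], resume);
        }
        pos = server_resume_index(names, count, resume);
    }
    return received;
}
```

With a directory such as { "a.txt", "b\xab.txt", "c.txt", "d.txt" } and a batch size of 2, resuming with the raw name walks all four entries, while resuming with the converted name sends "b_.txt", gets the first batch again, and stops with -1 instead of looping forever.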

I'd be interested to see if "unix character set = utf8" fixes this bug....

Jeremy.
Comment 13 Alexander Vodomerov 2005-05-31 11:38:29 UTC
> (I bet there's an "iconv failed" message somewhere in the smbclient log. 
Yes, you are absolutely right! 
 
When run with -d 3 it writes messages like these:
 
smb: \> cd ddm 
dos_clean_name [\ddm\] 
dos_clean_name [\ddm\\] 
smb: \ddm\> ls 
convert_string_internal: Conversion error: Illegal multibyte sequence(?) 
convert_string_internal: Conversion error: Illegal multibyte sequence(?) 
convert_string_internal: Conversion error: Illegal multibyte sequence(?) 
convert_string_internal: Conversion error: Illegal multibyte sequence(?) 
... 
 
I've posted a detailed log (-d 10) at the Debian bug tracker:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=311157 
 
> I'd be interested to see if "unix character set = utf8" fixes this bug.... 
"unix charset = utf8" fixes the bug. It doesn't go into an infinite loop.
However, the directory listing is not correct: every filename is cut at the
first non-convertible character (only the beginning is printed). Is this the
intended behaviour?
Comment 14 Jeremy Allison 2005-05-31 11:59:06 UTC
Created attachment 1252 [details]
Patch
Comment 15 Jeremy Allison 2005-05-31 12:03:11 UTC
The patch I added fixes it, in that the directory listing should now always
terminate. We get the raw unicode data from the last filename and always use it
for findnext, so we know it's a valid filename as given from Windows.
However, this will not fix the real bug here, which is that your iconv library
isn't translating characters such as 0xab correctly into utf8. 0xab should be a
valid unicode character (from the linux character map this is "U+00AB
LEFT-POINTING DOUBLE ANGLE QUOTATION MARK") and should convert correctly into
utf8 without truncation.
This should now be fixed in svn.
Jeremy.
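
For reference, a minimal C sketch of what a correct conversion of such a character looks like. utf8_encode is a hypothetical helper written for this note (it is not part of Samba or iconv) that follows the standard UTF-8 encoding rules; for U+00AB it produces the two bytes 0xC2 0xAB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one Unicode code point as UTF-8 into out (up to 4 bytes).
 * Returns the number of bytes written, or 0 for values that are not
 * encodable (surrogates, or anything above U+10FFFF).
 */
static size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                /* surrogate halves are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

A Cyrillic letter such as U+0416 goes through the same two-byte branch, so neither kind of character should force a '_' substitution or a truncated name when the target is utf8.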
Comment 16 Gerald (Jerry) Carter 2005-08-24 10:27:47 UTC
sorry for the spam, cleaning up the database to prevent unnecessary reopens of bugs.