smbc_dirent::namelen,commentlen are commented as such: "The length of the name/comment string in bytes (includes null terminator)" However testing shows that the strlen() on the members and the *len values are the same, i.e. the len values do not include the \0 character. This came up while investigating why Krusader and then Konqueror and then KIOSlave/SMB show the comments with the last character missing. The KIOSlave/SMB code uses commentlen - 1 as the number of characters to UTF8-decode, naturally thinking they ignore the terminating null. Testing against samba-client 3.0.2 shows that the *len members account for the terminating null. Test printing the dirent type, the name, the strlen(name), namelen, comment, strlen(comment), commentlen: box#1 { uname: Linux uriel 2.4.21-9.0.1.EL (i386) samba: samba-client-3.0.2-6.3E out : [WORKGROUP] ALDRATECH 9:10 (SPACE) 5:6 } box#2 { uname: Linux rpopescu 2.6.12-1.1398_FC4 (x86_64) samba: samba-client-3.0.14a-2 out : [WORKGROUP] ALDRATECH 9:9 (SPACE) 5:5 }
Confirmed that comments in header file were incorrect, and corrected the comments.
In order to fix the kioslave/smb code, can you please provide advice on how I can determine whether the samba-client returns N or N+1 ? Btw, why the change and when did it happen ? Thanks in advance.
I don't recall making that change to the code, but I'll take your word that it did change. I thought it had always been the length not including the null terminator (meaning that the comment was always wrong). In any case, if there are old versions out there where the length can include the null terminator, then if you need to work with all library versions, I recommend that you use strlen(name) and strlen(comment) directly rather than depending on the length fields.
The test case produces different results between samba 3.0.2 and 3.0.14a, versions which I have available. I'm attaching the test case. In version 1.73.2.4 int the old CVS code, branch SAMBA_3_0_RELEASE the code goes pretty much like: static int add_dirent(...) { //snip if (comment) u_comment_len = push_utf8_allocate(&u_comment, comment); //snip dirent->commentlen = u_comment_len; //snip } In the SVN code version 8329 I see the following: static int add_dirent(...) { int comment_len = (comment == NULL ? 0 : strlen(comment)); //snip dirent->commentlen = comment_len; //snip } So something definitely changed there, did not track down the behaviour of push_utf8_allocate() though, but from the name it seems that it allocates memory, and such code usually also allocates space for the final \0 while returning the allocated size, as opposed to strlen().
Created attachment 1397 [details] Test-case Test case code, showing different results between 3.0.2 and 3.0.14
Ok, those changes were a part of a series of bug fixes many many moons ago pertaining to incorrect character set conversion at times. The character set conversions were removed, and left to the caller to do with as he pleases.
Thanks for the explanation, that makes sense. I've issued a bug report with kde, suggesting they change the code to use strlen(commentlen) instead of commentlen - 1 Cheers!
sorry for the same, cleaning up the database to prevent unecessary reopens of bugs.