Bug 859 - smbclient outputs multibyte characters wrong
Status: CLOSED FIXED
Alias: None
Product: Samba 3.0
Classification: Unclassified
Component: smbclient
Version: 3.0.0
Hardware: All All
Importance: P3 normal
Target Milestone: none
Assignee: Jeremy Allison
Reported: 2003-12-05 10:53 UTC by Björn Jacke
Modified: 2005-08-24 10:19 UTC (History)
Description Björn Jacke 2003-12-05 10:53:13 UTC
If the "server string" of a Samba or Win* server contains multibyte characters,
smbclient -L shows it incorrectly. If the "server string" of the server is just
a string containing an umlaut, like "schön", smbclient shows it correctly.
When it contains Japanese characters, it does not show it correctly.
An smbclient -L servername that should show "ありがとう" (thank you in
Japanese) as the server string shows something like this in a UTF-8 locale:
     IPC$           IPC       IPC Service (ÒüéÒéèÒüîÒü¿Òüå)

I hope this bugzilla is configured so that it cares about charsets ...
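
Editorial illustration (not part of the original report): the garbled string above is what you get when the UTF-8 bytes of "ありがとう" are decoded as CP850. The sketch below only reproduces that byte-level effect in Python. It makes no claim about where in smbclient or smbd the conversion happens, and the choice of CP850 is an assumption based on the output shown, not something the report states.

```python
# Hypothetical reproduction of the garbled output; not taken from Samba's source.
original = "ありがとう"

# The UTF-8 encoding uses three bytes per character here (0xE3 0x81 0x82 for "あ", ...).
utf8_bytes = original.encode("utf-8")

# Decoding those same bytes as CP850 yields one Latin character per byte.
garbled = utf8_bytes.decode("cp850")
print(garbled)  # ÒüéÒéèÒüîÒü¿Òüå

# The damage is reversible as long as no byte was dropped on the way.
recovered = garbled.encode("cp850").decode("utf-8")
print(recovered == original)  # True
```

Because every byte survives, the pattern points to a wrong charset label on the output path rather than to data loss.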
Comment 1 Jeremy Allison 2003-12-05 15:22:35 UTC
What have you configured your character set parameters as in
your smb.conf ? You do know these need to be set correctly ?

Jeremy.
Comment 2 Björn Jacke 2003-12-06 07:34:16 UTC
Yes, of course; it is the default, utf8. The smb.conf is also UTF-8 encoded, and
clients like w2k and xp also show the Japanese server string correctly. Only
smbclient does not, as shown above.
Comment 3 Jeremy Allison 2003-12-08 16:42:41 UTC
Ok, I cannot reproduce this. I have the following set in my smb.conf

        unix charset = utf8
        dos charset = cp932
        server string = 院陰

Doing

bin/smbclient -L //localhost

I see :

.....
        IPC$           IPC       IPC Service (院陰)
        ADMIN$         IPC       IPC Service (院陰)
Anonymous login successful
 
        Server               Comment
        ---------            -------
        J2                   院陰

This looks correct to me. Please tell me what the problem you
see is. Remember, you must tell the smbd what dos client codepage
you are using - not just the unix codepage of utf8.

Jeremy.
Comment 4 Björn Jacke 2003-12-09 00:55:04 UTC
I'm sorry, I guess you tried the wrong things because this bugzilla is
configured in a way that it is only capable of accepting ASCII correctly. I have
put the main information on

http://j3e.de/sambabug859.html

The dos charset should not matter at all in this case, just the unix charset,
which is utf8. Win XP and 2000 clients show the server string correctly.
(Okay, XP also has a bug which causes the Japanese string to be shown
incorrectly in one place.)
Comment 5 Björn Jacke 2003-12-09 03:48:32 UTC
Okay, in addition to what I wrote there, I just found out that "dos charset"
*does* matter here, though in my opinion it should be irrelevant at this point.
smbclient uses "dos charset" as its output encoding. smbclient should use
the locale's charset for output encoding. If I set "dos charset" to "utf8" the
output of smbclient is correct!
smbclient should get its charset by doing nl_langinfo(CODESET), shouldn't it?
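
Editorial illustration (not part of the original comment): the locale lookup proposed here can be sketched in Python, whose `locale.nl_langinfo(locale.CODESET)` wraps the C call of the same name on Unix-like systems. The helper name `terminal_codeset` is hypothetical, and this only shows what the lookup returns; it is not how smbclient chooses its output charset.

```python
import locale


def terminal_codeset():
    """Return the character encoding name of the user's locale (Unix only)."""
    try:
        # Adopt the locale from the environment (LANG, LC_ALL, LC_CTYPE).
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # Unsupported locale setting: stay in the default "C" locale.
        pass
    return locale.nl_langinfo(locale.CODESET)


print(terminal_codeset())
```

In a UTF-8 locale this typically reports UTF-8, which is the value the comment argues smbclient should use for terminal output.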
Comment 6 Jeremy Allison 2003-12-09 09:52:16 UTC
> smbclient should get its charset by doing nl_langinfo(CODESET), shouldn't it?

No. smbclient gets its charset from the "dos charset" parameter.
This *does* matter for displaying a server string as it is retrieved
using the RAP calls that use DOS charsets.

I don't think this is a bug. Closing.

Jeremy.
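
Editorial illustration (not part of the original comment): if the server string travels in a DOS codepage, whether it survives depends on whether that codepage can represent the characters. The Python sketch below checks this for the two strings from the report. The helper `fits_codepage` is hypothetical, and using CP850 as the codepage that applies when "dos charset" is left unset is an assumption here, not something stated in this thread.

```python
def fits_codepage(text, codepage):
    """Return True if every character of text can be encoded in codepage."""
    try:
        text.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


# The umlaut example fits a Western European DOS codepage.
print(fits_codepage("schön", "cp850"))       # True

# The Japanese example does not fit CP850 but does fit CP932.
print(fits_codepage("ありがとう", "cp850"))  # False
print(fits_codepage("ありがとう", "cp932"))  # True
```

This matches the report's observation that "schön" displays correctly while the Japanese string does not, though it does not by itself explain the results listed in Comment 7.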
Comment 7 Björn Jacke 2003-12-09 12:01:46 UTC
hm, but smbclient is not a dos client but a unicode talking client isn't it?
and there are strange effects:
1. starting the server with "dos charset = utf8"
 - smbclient with "dos charset = utf8"  -> correct Jap. server string
 - smbclient with "dos charset = cp932" -> correct Jap. server string
 - smbclient without server string explicitly set -> broken Jap server string
2. starting the server with "dos charset = cp932"
 - smbclient with "dos charset = utf8"  -> correct server string
 - smbclient with "dos charset = cp932" -> correct server string
 - smbclient without server string explicitly set -> broken Jap server string
The "dos charset" seems to trigger something but it does not do what one would 
expect it to do at this point.
Comment 8 Jeremy Allison 2003-12-09 12:52:49 UTC
> but smbclient is not a dos client but a unicode talking client isn't it?

Not for server string. It's a RAP (DOS Codepage) client.

smbclient is behaving as designed. You *must* set
the correct DOS codepage.

I'm closing this (again). Please don't re-open it just because you don't
want to set the dos codepage.

Jeremy.
Comment 9 Björn Jacke 2003-12-09 14:07:21 UTC
I did not reopen because I did not believe you or because I didn't want to set 
"dos charset".
As I wrote, I can set "dos charset" to utf8 or to cp932 and in *both* cases I 
get a correct result, though one of the settings must be the wrong one.
Comment 10 Gerald (Jerry) Carter (dead mail address) 2004-03-16 11:58:28 UTC
jeremy says no bug here.
Comment 11 Gerald (Jerry) Carter (dead mail address) 2005-08-24 10:19:21 UTC
sorry for the same, cleaning up the database to prevent unnecessary reopens of bugs.