Bug 6442 - smbclient cannot treat non-ASCII chars.
Summary: smbclient cannot treat non-ASCII chars.
Status: RESOLVED INVALID
Alias: None
Product: Samba 3.3
Classification: Unclassified
Component: Client tools
Version: 3.3.4
Hardware: x86 Linux
Importance: P3 normal
Target Milestone: ---
Assignee: Samba Bugzilla Account
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-06-05 16:07 UTC by TAKAHASHI Motonobu
Modified: 2009-06-05 22:35 UTC

See Also:


Attachments
Command output (UTF-8 encoded) (2.44 KB, text/plain)
2009-06-05 18:19 UTC, TAKAHASHI Motonobu

Description TAKAHASHI Motonobu 2009-06-05 16:07:51 UTC
smbclient in Samba 3.3.4 cannot handle Japanese filenames (and probably other multibyte characters).
I tested smbclient from Samba 3.3.4, 3.0.20 and 3.0.7, with the locale set correctly.

----
smb: \> dir
----

Samba 3.3.4 : cannot display Japanese filenames.
Samba 3.0.20: can display Japanese filenames.
Samba 3.0.7 : can display Japanese filenames.


----
smb: \> mkdir <JAPANESE CHARS>
----

Samba 3.3.4 : the directory name is converted to a run of "_" characters
Samba 3.0.20: the directory name is converted to a run of "_" characters

----
smb: \> get <JAPANESE CHARS>
----

Samba 3.3.4 / Samba 3.0.20:  
smb: \> get <JAPANESE CHARS>.txt
NT_STATUS_OBJECT_NAME_NOT_FOUND opening remote file \____.txt

Samba 3.0.7 : works well.
Comment 1 Jeremy Allison 2009-06-05 16:13:07 UTC
More details please on how to reproduce. This is very important. What settings do you have in your smb.conf ?
Comment 2 TAKAHASHI Motonobu 2009-06-05 18:17:16 UTC
(In reply to comment #1)
> What settings do you have in your smb.conf ?

I tested with "smbclient -s /dev/null", which means no smb.conf is read.

Setting smb.conf like:

-----
dos charset = CP932
unix charset = EUCJP-MS
-----

The smbclient debug l
also does not work.

Locale setting is:

-----
melrose:/usr/local/samba/bin> locale
LANG=ja_JP.EUC-JP
LC_CTYPE="ja_JP.EUC-JP"
LC_NUMERIC="ja_JP.EUC-JP"
LC_TIME="ja_JP.EUC-JP"
LC_COLLATE="ja_JP.EUC-JP"
LC_MONETARY="ja_JP.EUC-JP"
LC_MESSAGES="ja_JP.EUC-JP"
LC_PAPER="ja_JP.EUC-JP"
LC_NAME="ja_JP.EUC-JP"
LC_ADDRESS="ja_JP.EUC-JP"
LC_TELEPHONE="ja_JP.EUC-JP"
LC_MEASUREMENT="ja_JP.EUC-JP"
LC_IDENTIFICATION="ja_JP.EUC-JP"
LC_ALL=
-----
Comment 3 TAKAHASHI Motonobu 2009-06-05 18:19:18 UTC
Created attachment 4247
Command output (UTF-8 encoded)

Command output at log level 3.
I also tried log level 10, but it produced no other meaningful messages.
Comment 4 Jeremy Allison 2009-06-05 18:21:06 UTC
Does it work with "display charset" set as well as "dos charset" and "unix charset" ? Although it should work from the locale with no smb.conf.
Jeremy.
Comment 5 TAKAHASHI Motonobu 2009-06-05 18:29:43 UTC
(In reply to comment #4)

Sorry, my mistake.

I had misspelled the locale name: jp_JP.eucJP instead of ja_JP.eucJP.

With "display charset = eucJP-ms" set, it works correctly.

After correcting the misspelled locale setting and removing "display charset", it also works well.

I will close this bug as INVALID.
Comment 6 Jeremy Allison 2009-06-05 18:38:56 UTC
Ok, without an smb.conf file here are the defaults for charsets:

        /* using UTF8 by default allows us to support all chars */
        string_set(&Globals.unix_charset, DEFAULT_UNIX_CHARSET);

#if defined(HAVE_NL_LANGINFO) && defined(CODESET)
        /* If the system supports nl_langinfo(), try to grab the value
           from the user's locale */
        string_set(&Globals.display_charset, "LOCALE");
#else
        string_set(&Globals.display_charset, DEFAULT_DISPLAY_CHARSET);
#endif

        /* Use codepage 850 as a default for the dos character set */
        string_set(&Globals.dos_charset, DEFAULT_DOS_CHARSET);

DEFAULT_UNIX_CHARSET is "UTF-8"
DEFAULT_DOS_CHARSET is "CP850"

So only the "display" charset is picked up from the locale, and then only if HAVE_NL_LANGINFO and CODESET are both defined at build time. On my Ubuntu box both are defined.

Can you try setting the defaults for unix_charset in param/loadparm.c to be "LOCALE" and see if this fixes the problem ? It won't fix the DOS charset (ascii on the wire) issue though, as there's no way of setting this unless you use an smb.conf - the UNIX system doesn't know about this character setting.

Jeremy.

Comment 7 TAKAHASHI Motonobu 2009-06-05 22:35:59 UTC
As you said, the "unix charset" parameter must be set in order to handle Japanese filenames typed at the keyboard correctly.

As far as I have tested, the "dos charset" parameter made no visible difference, though it may still have some effect.

Additionally, when the readline library is used, these settings are needed in .inputrc so that readline handles Japanese characters (charsets that use the 8th bit) correctly:

-----
set convert-meta off
set meta-flag    on
set input-meta   on
set output-meta  on
-----