The Samba-Bugzilla – Bug 1572
japanese UTF8 chars and CIFS
Last modified: 2005-11-14 09:41:33 UTC
As recommended, I switched from SMBFS to CIFS for mounting shares, but this
didn't solve my problem of not seeing UTF-8 filenames (Japanese kanji in this case).
The only difference is that instead of "cutting off the file" or "not showing it"
like SMBFS did, CIFS just shows ?
In the log file I see this:
[2004/07/30 13:09:33, 0] lib/iconv.c:utf8_pull(506)
short utf8 char
But I get exactly the same message when mounted via SMBFS.
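The wording "short utf8 char" suggests a multi-byte UTF-8 sequence that ends before its last byte. That reading is an assumption here, not something confirmed from lib/iconv.c. Under that assumption, a minimal Python sketch of the same condition (the function name decodes_cleanly is made up for illustration):

```python
def decodes_cleanly(raw: bytes) -> bool:
    """Return True if raw is complete, valid UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        # Raised for truncated or otherwise invalid sequences.
        return False


full = "よ".encode("utf-8")   # one kana, three bytes in UTF-8
cut = full[:2]                # last byte missing: a "short" character

print(len(full), decodes_cleanly(full))
print(len(cut), decodes_cleanly(cut))
```

If a filename buffer is cut at a byte boundary inside a character, a decoder sees exactly this kind of incomplete sequence.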
Server: Debian/Testing 2.6.4 with 3.0.5 samba
Client: Gentoo/stable 2.6.5-gentoo-1 3.0.5 samba
I can't test this from my debian test box, because debian doesn't ship
mount.cifs. I have to compile that by hand first.
I can also see problems with Japanese filenames, though simple umlauts don't
cause any problems. Creating files works fine, "ls"'ing does not. When Japanese filenames
should be displayed, they are partly displayed but a lot of other garbage is
When I "ls" not the directory where the file resides but the full path including
the file, it is displayed correctly.
I have samba 3.0.5 and the SUSE 2.6.5 kernel of SLES9.
I have not even tried to create files via CIFS in Japanese, I just tried to view
What is the cifs vfs version (see fs/cifs/CHANGES file or modinfo on cifs.ko)?
Are the UTF-8 NLS kernel modules loaded on the client? The translation from Unicode
(16-bit Unicode) to the client's code page is done by kernel functions that depend on the
optional build of certain NLS kernel modules. Samba on the server depends on a
different mapping table for mapping from Unicode to UTF-8.
Are you overriding iocharset on the mount option on the client?
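As a rough illustration of why the client-side character set matters, the Python sketch below converts a name from 16-bit Unicode into two different target charsets. It only mimics the idea of the conversion and is not the kernel NLS code; the function name to_local and the choice of iso-8859-1 as the non-UTF-8 charset are assumptions made for the example.

```python
def to_local(wire_bytes: bytes, charset: str) -> bytes:
    """Convert a UTF-16LE name into the given local charset.

    Characters the target charset cannot represent become '?'.
    """
    text = wire_bytes.decode("utf-16-le")
    return text.encode(charset, errors="replace")


name = "ようこそ"
wire = name.encode("utf-16-le")   # 16-bit Unicode form of the name

print(to_local(wire, "utf-8"))        # every character survives
print(to_local(wire, "iso-8859-1"))   # b'????'
```

A target charset without the Japanese characters can only produce placeholders, which would look much like the "?" output described at the top of this report.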
Samba works correctly here: with "unix charset=utf8", Windows clients have no problem
with Japanese filenames.
bjacke@pell:~> /sbin/modinfo cifs
version: 1.18 8EA897319BE63BB99E39A8E
You can test it on your own, just "touch ようこそ".
Unfortunately this bugzilla is still not running in UTF-8, so you won't see
anything useful after touch ;-)
I'll attach a tar file of a UTF-8 encoded Japanese filename. Try to put that via
cifs on a win* or samba server.
Created attachment 591 [details]
tar archive with empty japanese filename
And I forgot to mention: yes, the utf8 NLS module is loaded and it's the default NLS
here. It works fine with German umlauts, which are also multibyte, but those
Japanese files fail, and only in directory listings; opening seems to work, and
creating works too.
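Umlauts and kana are both multibyte in UTF-8, so "multibyte" alone does not separate the working case from the failing one. One difference that is easy to check is how the UTF-8 length compares with the UTF-16 length. A small Python sketch (the sample names and the helper encoded_lengths are only illustrations, and this says nothing about what the cifs module does internally):

```python
def encoded_lengths(name: str) -> tuple[int, int]:
    """Return (UTF-8 byte length, UTF-16 byte length) of a name."""
    return len(name.encode("utf-8")), len(name.encode("utf-16-le"))


for sample in ("müller", "ようこそ"):
    utf8_len, utf16_len = encoded_lengths(sample)
    # True when the UTF-8 form needs more bytes than the UTF-16 form.
    print(sample, utf8_len, utf16_len, utf8_len > utf16_len)
```

An umlaut takes two bytes in either encoding, while each kana takes three bytes in UTF-8 and two in UTF-16, so only the Japanese name grows when converted to UTF-8.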
Same here: UTF8 is compiled into the kernel, as is CIFS, so I'll probably get the
same results as him.
Overriding iocharset is not working with mount.cifs. At least there is no such
option mentioned in the man page.
Note that iocharset is a supported cifs mount option (see fs/cifs/README for
more details, and the code that implements it is in fs/cifs/connect.c)
If I mount with the option "iocharset=utf8", then ls either core dumps or goes
into "D" state.
Second, if I go with Konqueror to "smb://user@server/folder/" I can see all the
UTF8 Japanese files there perfectly.
You are mixing things up. Konqueror does not use cifs, it uses libsmb of samba.
Steve, did you try extracting that tar onto a cifs mounted share?
Yeah sorry, just wanted to note that about Konqueror.
Yeah, and if you call that ls twice, it will hang and bring down cifsd, and in
the end I had to reboot my box :)
So don't use iocharset with mount.cifs.
To keep this bug up to date, here is some information from the cifs mailing list:
the cifs module will fail with filenames whose UTF-8 representation is longer than
the corresponding UTF-16 representation of the same filename. That is always a
problem with Japanese or Chinese filenames. This will eventually be fixed in
readdir is fixed in 2.6.10 or later for the case of utf8 characters in a string
that average longer than 2 bytes (patches also for some earlier kernels on the