Bug 8951 - filenames consisting of Chinese characters shown as ???
Status: CLOSED FIXED
Alias: None
Product: CifsVFS
Classification: Unclassified
Component: kernel fs
Version: 2.5
Hardware: All Linux
Importance: P5 major
Target Milestone: ---
Assignee: Steve French
Reported: 2012-05-21 17:13 UTC by Thanos
Modified: 2016-02-25 18:22 UTC

Description Thanos 2012-05-21 17:13:20 UTC
Hi,

I'm a bit confused on whether multi-byte characters are fully supported or not.

On Debian unstable (kernel 3.2.0-2-686-pae) I mount a CIFS share where one file name consists of Chinese characters (e.g. 
Comment 1 Thanos 2012-05-22 09:41:04 UTC
(for some reason my initial comment got corrupted, here it goes again)

I'm a bit confused on whether multi-byte characters are fully supported or not.

On Debian unstable (kernel 3.2.0-2-686-pae) I mount a CIFS share where one file
name consists of Chinese characters (e.g. 
Comment 2 Thanos 2012-05-22 09:42:48 UTC
(arghhh... I had put some Chinese characters as a file name example and bugzilla ate the rest of the message!)

I'm a bit confused on whether multi-byte characters are fully supported or not.

On Debian unstable (kernel 3.2.0-2-686-pae) I mount a CIFS share where one file
name consists of Chinese characters. When listing the contents of the share, the Chinese file name appears as ????????, while other file names consisting of Latin/Greek characters appear fine. gvfs/smbclient show the Chinese file name correctly.

Is this supposed to work? Is it a configuration issue?

Regards
Comment 3 Steve French 2014-10-19 00:23:51 UTC
Chinese characters should work fine as long as they fall within both the UCS-2 and UTF-8 character sets.  CIFS maps characters to UCS-2 on the wire by default (almost all modern servers use UCS-2 Unicode on the wire), so assuming that your client handles these characters, there should be no issues translating between the two encodings.

Does this still cause problems for you?
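
To illustrate the translation described above, here is a minimal userspace sketch in Python. It is not the kernel code path: it assumes the 16-bit wire encoding can be modelled with Python's "utf-16-le" codec, and the file name and the helper names (is_ucs2, wire_to_local) are hypothetical, chosen only for this example.

```python
def is_ucs2(name: str) -> bool:
    """Return True if every character fits in one 16-bit code unit.

    That means a code point in the Basic Multilingual Plane that is
    not a surrogate, so no surrogate pairs are needed on the wire.
    """
    return all(
        ord(ch) <= 0xFFFF and not 0xD800 <= ord(ch) <= 0xDFFF
        for ch in name
    )


def wire_to_local(wire: bytes) -> bytes:
    """Decode a 16-bit little-endian name and re-encode it as UTF-8."""
    return wire.decode("utf-16-le").encode("utf-8")


# Hypothetical file name made of two common Chinese characters.
name = "\u4e2d\u6587.txt"

# Assumption: model the on-the-wire form as UTF-16LE bytes.
wire = name.encode("utf-16-le")

# What a UTF-8 client would see after conversion.
local = wire_to_local(wire)

print(is_ucs2(name))                 # characters are all in the BMP
print(local.decode("utf-8") == name)  # round trip loses nothing
```

For names whose characters all fit in one 16-bit unit, the round trip is lossless, which matches the expectation that such names should display correctly on a UTF-8 client.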
Comment 4 Steve French 2016-02-25 18:22:49 UTC
Please reopen if you still see this problem and have a way to reproduce it.  Chinese characters should be fully supported as long as they can be encoded in Unicode, just as they would be with smbclient.  Behavior should be similar, although smbclient uses userspace libraries to convert Unicode UCS-2 to UTF-8, while the kernel client uses the Linux kernel libraries for the conversions.
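
The cause of the ???????? listing was never established in this report. One way such output can arise in general is a conversion into a local character set that cannot represent the characters. The Python sketch below shows that effect in userspace; it assumes a single-byte target charset and uses a hypothetical helper (to_local_charset) and hypothetical file names. It is not a reproduction of the kernel's conversion code.

```python
def to_local_charset(name: str, charset: str) -> str:
    """Convert a name to a target charset and back, as it would be displayed.

    Characters the charset cannot represent are replaced with '?'
    (Python's errors="replace" behaviour for encoding).
    """
    return name.encode(charset, errors="replace").decode(charset)


# Hypothetical file names: one Chinese, one Greek.
chinese = "\u4e2d\u6587\u6587\u4ef6.txt"
greek = "\u03b1\u03b2\u03b3.txt"

# Assumption: the local charset is a single-byte Greek code page.
shown_chinese = to_local_charset(chinese, "iso8859-7")
shown_greek = to_local_charset(greek, "iso8859-7")

# With UTF-8 as the target, nothing is lost.
shown_utf8 = to_local_charset(chinese, "utf-8")

print(shown_chinese)
print(shown_greek == greek)
print(shown_utf8 == chinese)
```

If a setup like this were the cause, it would be consistent with Greek and Latin names displaying correctly while Chinese names do not. mount.cifs accepts an iocharset option (for example iocharset=utf8) that selects the charset used for local path names, so checking that option would be a reasonable first step when reproducing the problem.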