Bug 3231 - Bad character translation
Summary: Bad character translation
Status: RESOLVED FIXED
Alias: None
Product: CifsVFS
Classification: Unclassified
Component: kernel fs
Version: 2.6
Hardware: All Linux
Importance: P3 normal
Target Milestone: ---
Assignee: Steve French
QA Contact:
URL: http://lists.samba.org/archive/samba/...
Keywords:
Depends on:
Blocks:
 
Reported: 2005-11-01 15:14 UTC by Leonard den Ottolander
Modified: 2006-03-22 07:36 UTC

Description Leonard den Ottolander 2005-11-01 15:14:12 UTC
Client: CentOS-4, linux-2.6.9-11, LANG=en_US.UTF-8, samba-client-3.0.10-1.4E.
Server: Windows NT 4.0.

Certain characters in file names that smbclient displays correctly are not
translated correctly on a cifs mount.

$ smbclient -U auser //david-bowie/Test
Password:
Domain=[EVERYTHING] OS=[Windows NT 4.0] Server=[NT LAN Manager 4.0]
smb: \> ls
  .                                   D        0  Sat Oct 29 18:58:49 2005
  ..                                  D        0  Sat Oct 29 18:58:49 2005
  ellipsis zijn heel fijn (…).doc      A    24064  Sat Oct 29 18:57:14 2005
  Nogmaals ellipsis ….doc           A    24064  Sat Oct 29 18:58:31 2005
  één document á €50.doc         A    24064  Sat Oct 29 18:55:28 2005
  één document.doc                  A    24064  Sat Oct 29 18:54:20 2005
  ‘‰’.doc                       A    24064  Sat Oct 29 18:57:55 2005
  “quotes”.doc                    A    24064  Sat Oct 29 18:53:40 2005

                52004 blocks of size 262144. 2165 blocks available

However, the last two file names are reported incorrectly on a cifs mount:

$ sudo mount -t cifs -o username=auser //david-bowie/Test /mnt/tmp
Password:
[admin@george-harrison ~]$ ls /mnt/tmp
één document á €50.doc  ellipsis zijn heel fijn (…).doc  ‘?颂ꋩ??
één document.doc        Nogmaals ellipsis ….doc          “???????鲂닩?
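
A possible way to narrow this down is to look at the raw bytes the mount returns for each name, rather than at what the terminal renders. The following is only a minimal sketch: the helper name `dump_names` is hypothetical, and it assumes the mount is expected to deliver UTF-8 names, as the LANG setting above suggests.

```python
import os


def dump_names(directory):
    """Return (raw_name_bytes, is_valid_utf8) for every entry in directory.

    Passing a bytes path to os.listdir makes it return the names as raw
    bytes, so nothing is decoded or replaced before we inspect it.
    """
    results = []
    for raw in sorted(os.listdir(os.fsencode(directory))):
        try:
            raw.decode("utf-8")
            valid = True
        except UnicodeDecodeError:
            valid = False
        results.append((raw, valid))
    return results


def format_report(entries):
    """Render the entries as one line of hex bytes plus a verdict each."""
    lines = []
    for raw, valid in entries:
        verdict = "valid UTF-8" if valid else "NOT valid UTF-8"
        lines.append(f"{raw.hex(' ')}  {verdict}")
    return "\n".join(lines)
```

Calling `print(format_report(dump_names("/mnt/tmp")))` on the client would show whether the two bad names contain invalid byte sequences or valid UTF-8 for the wrong characters.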
Comment 1 Leonard den Ottolander 2005-11-15 10:40:51 UTC
I'm also seeing a problem with certain Chinese characters. This causes my rsync backup to fail because it cannot find the file:

file has vanished: "<snip>/&#32854;&#35478;&#38570;&#39657;&#59818;&#43435;&#42734;&#60074;&#43942;&#39914;&#61102;&#43690;&#44778;&#59814;&#43695;&#47854;&#60078;&#43690;"
rsync warning: some files vanished before they could be transferred (code 24) at main.c(702)

As with my original report this file name is reported correctly in smbclient.
Comment 2 Leonard den Ottolander 2005-11-15 10:53:39 UTC
Although this is not obvious in the previous comment, characters 1, 2, 3, 4 and 10 are actually displayed correctly. The others are displayed as "hex blocks" in the email.

Also note that the file name in the example actually ends in ".doc.lnk" (a link in the "Recent" folder of a user profile). smbclient reports:

&#32854;&#35477;&#24555;&#27138;&#21644;&#24184;&#31119;&#30340;&#26032;&#24180;.doc.lnk
Comment 3 Leonard den Ottolander 2005-11-15 11:03:14 UTC
Sorry for the spam and the mess. The copy and paste in comment #1 was taken from Evolution, from the report sent by cron (the failing rsync).

The copy and paste in comment #2 is from smbclient in a gnome-terminal.

To complete the picture, here is a copy and paste from an ls on the actual cifs mount in a gnome-terminal:

&#32854;&#35478;&#38570;&#39657;&#59818;??&#60074;?&#39914;&#61102;?&#44778;&#59814;?&#47854;&#60078;?

So please ignore the copy and paste from comment #1 and focus on #2 (smbclient) and #3 (ls on a cifs mount). Note the discrepancy between the second characters.
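
The discrepancy can be made explicit by turning both pasted names into lists of code points and comparing them position by position. This is a rough sketch, not part of the original diagnosis: the two strings are copied from comment #2 and comment #3 without the ".doc.lnk" suffix, and the helper names `code_points` and `first_difference` are made up for illustration.

```python
import re

# Name as reported by smbclient (comment #2) and by ls on the cifs mount
# (comment #3), kept as numeric character references exactly as pasted.
SMBCLIENT = ("&#32854;&#35477;&#24555;&#27138;&#21644;"
             "&#24184;&#31119;&#30340;&#26032;&#24180;")
CIFS_LS = ("&#32854;&#35478;&#38570;&#39657;&#59818;??&#60074;?&#39914;"
           "&#61102;?&#44778;&#59814;?&#47854;&#60078;?")


def code_points(text):
    """Convert a mix of &#NNNNN; references and literal characters to ints."""
    points = []
    for match in re.finditer(r"&#(\d+);|(.)", text, flags=re.DOTALL):
        ref, literal = match.groups()
        points.append(int(ref) if ref is not None else ord(literal))
    return points


def first_difference(a, b):
    """Return the 0-based index of the first differing position, or None."""
    for index, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return index
    return None


smb = code_points(SMBCLIENT)
cifs = code_points(CIFS_LS)
diff = first_difference(smb, cifs)

print(len(smb), len(cifs))                     # 10 18
print(diff, hex(smb[diff]), hex(cifs[diff]))   # 1 0x8a95 0x8a96
```

So the names already diverge at the second character (U+8A95 from smbclient, U+8A96 on the mount), and the mount shows 18 characters where smbclient shows 10.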
Comment 4 Leonard den Ottolander 2006-03-22 07:36:08 UTC
The issue has evidently been fixed at some point before kernel-2.6.14 (I am not sure exactly when).

Let's see if I can close this report...