Bug 7822 - mount.cifs and Umlaut in share name
Summary: mount.cifs and Umlaut in share name
Alias: None
Product: Samba 3.4
Classification: Unclassified
Component: Client Tools
Version: 3.4.7
Hardware: Other Linux
Importance: P3 normal
Target Milestone: ---
Assignee: Jeff Layton
QA Contact: Samba QA Contact
Depends on:
Reported: 2010-11-26 03:39 UTC by Andreas Heinlein
Modified: 2010-12-01 09:19 UTC

Wireshark trace of unsuccessful attempt using mount.cifs (1.43 KB, application/cap)
2010-11-26 03:40 UTC, Andreas Heinlein
Wireshark trace of successful attempt using windows (15.76 KB, application/cap)
2010-11-26 03:40 UTC, Andreas Heinlein
Wireshark trace of successful attempt using nautilus (15.05 KB, application/cap)
2010-11-26 03:40 UTC, Andreas Heinlein

Description Andreas Heinlein 2010-11-26 03:39:29 UTC
I need to mount a CIFS share (eventually via fstab, for now manually
from a terminal) that has both a space and a German umlaut in its name. I
cannot get mount.cifs to mount it; it always complains that it cannot find the share. Mounting from Windows or from nautilus works, however.

I managed to get around the space problem in fstab with the \040 trick,
but I cannot find a way to correctly encode the umlaut. When looking at
the output of "mount.cifs --verbose '//server/Täst Freigabe' /mnt", it
looks like it is accessing the correct share, but it does not work.

I also got a hint here
(https://bugs.launchpad.net/ubuntu/+source/gnome-vfs/+bug/414865) to
pipe the share name through iconv, but "mount.cifs $(echo //server/Täst
Freigabe | iconv -t850) /mnt" also does not work.

What can I do? Changing the share name is currently not an option, there
are just too many users with links/bookmarks to it.

Additional info as requested from Günter attached
$ sudo mount.cifs "//mail/Täst Freigabe" /mnt -o user=ah --verbose

mount.cifs kernel mount options: unc=//mail\Täst Freigabe,ver=1,user=ah,ip=,pass=********
mount error(6): No such device or address
Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)
$ mount.cifs -V
mount.cifs version: 1.12-3.4.7
$ modinfo cifs
filename:       /lib/modules/2.6.32-25-generic/kernel/fs/cifs/cifs.ko
version:        1.61
description:    VFS to access servers complying with the SNIA CIFS Specification e.g. Samba and Windows
license:        GPL
author:         Steve French <sfrench@us.ibm.com>
srcversion:     144C5A7956082C40177846E
vermagic:       2.6.32-25-generic SMP mod_unload modversions 586 
parm:           CIFSMaxBufSize:Network buffer size (not including header). Default: 16384 Range: 8192 to 130048 (int)
parm:           cifs_min_rcv:Network buffers in pool. Default: 4 Range: 1 to 64 (int)
parm:           cifs_min_small:Small network buffers in pool. Default: 30 Range: 2 to 256 (int)
parm:           cifs_max_pending:Simultaneous requests to server. Default: 50 Range: 2 to 256 (int)
$ smbclient -L //mail -U ah
Domain=[AG] OS=[Unix] Server=[Samba 3.4.8]
$ printenv | egrep "(LANG|LC)"
Comment 1 Andreas Heinlein 2010-11-26 03:40:14 UTC
Created attachment 6088
Wireshark trace of unsuccessful attempt using mount.cifs
Comment 2 Andreas Heinlein 2010-11-26 03:40:38 UTC
Created attachment 6089
Wireshark trace of successful attempt using windows
Comment 3 Andreas Heinlein 2010-11-26 03:40:56 UTC
Created attachment 6090
Wireshark trace of successful attempt using nautilus
Comment 4 Andreas Heinlein 2010-11-26 03:42:33 UTC
Forgot to mention this is on Ubuntu 10.04 LTS, with the server being Debian 5.0.

I also tried using mount.cifs from cifs-utils 4.7 built from source, but nothing obvious changed.
Comment 5 Jeff Layton 2010-11-26 06:16:35 UTC
It looks like we're translating the a + umlaut incorrectly here. One big difference is that cifs isn't capitalizing it before sending the request but windows and nautilus do.

Does this work better if you capitalize the entire sharename first? That is, could you try mounting this instead?

Comment 6 Andreas Heinlein 2010-11-26 07:22:13 UTC
No, capitalizing does not work. Wireshark dump shows the UNC path is now translated to \\MAIL\T?\344ST FREIGABE, but that still doesn't work.
Comment 7 Guenter Kukkukk 2010-11-26 17:46:42 UTC
when i look at my own network sniff, the tcon tree name is *correctly*
filled in here in UCS.
So the question is, why the mapping from the local character set
to ucs is not working on your side.

When no "iocharset" mount option is specified, cifs internally uses
to set the mapping.
This default kernel nls charset is set with e.g.
in the kernel build (inside .config).

The string "Täst" is coded in utf8 and ucs as:
          T     |     ä     |     s     |   t
UTF8: 0x54      | 0xc3 0xa4 | 0x73      | 0x74      (so "ä" needs 2 bytes)
UCS:  0x54 0x00 | 0xe4 0x00 | 0x73 0x00 | 0x74 0x00 (ucs on the wire)

The interesting part in your error case is, that *both* bytes from
the UTF8 char "ä" are converted to ucs - which is just simply wrong:

          T     |          ä          |     s     |   t
UTF8: 0x54      |   0xc3      0xa4    | 0x73      | 0x74
UCS:  0x54 0x00 | 0x1c 0x25 0xf1 0x00 | 0x73 0x00 | 0x74 0x00

So what kind of mapping is done here?
0xc3 ---> 0x1c 0x25  (This resulting unicode sequence is really strange) 
0xa4 ---> 0xf1 0x00
Anyway, this doesn't look like utf8 to ucs mapping at all.
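
The two tables above can be reproduced in a few lines of Python. This is only an illustrative sketch and not part of the original analysis: it assumes that "UCS" on the wire means little-endian 16-bit code units (identical to UTF-16LE for these characters), and the `observed` bytes are copied by hand from the failing-case table.

```python
# Sketch: expected encodings of "Täst" versus the bytes seen in the failing trace.
text = "T\u00e4st"  # "Täst", written with an escape so the source file encoding does not matter

utf8_bytes = text.encode("utf-8")      # what a UTF-8 terminal hands to mount.cifs
ucs_bytes = text.encode("utf-16-le")   # assumption: "UCS" on the wire == UTF-16LE here


def hexs(data: bytes) -> str:
    """Format bytes in the same 0x.. style used in the tables."""
    return " ".join(f"0x{b:02x}" for b in data)


# Bytes copied from the failing-case table (hand-transcribed, not read from the trace).
observed = bytes([0x54, 0x00, 0x1C, 0x25, 0xF1, 0x00, 0x73, 0x00, 0x74, 0x00])
observed_text = observed.decode("utf-16-le")

print("UTF8:    ", hexs(utf8_bytes))
print("UCS:     ", hexs(ucs_bytes))
print("observed:", [f"U+{ord(ch):04X}" for ch in observed_text])
```

The last line lists the code points the server actually received, which makes it easier to see that one "ä" was turned into two unrelated characters.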

Andreas, what do you get with the following? 
wc -c <enter>
Täst <enter>

Here i get:
wc -c
wc -m
Or this one:
echo Täst > utf8.txt
hexdump -C utf8.txt 
00000000  54 c3 a4 73 74 0a                                 |T..st.|
You can also try to force cifs to use utf8 mapping by using
the "iocharset=utf8" mount option.

Atm i've no further ideas.
Cheers, Günter
Comment 8 Jeff Layton 2010-11-26 20:42:47 UTC
Good catch on the difference in the length, Günter...

Yes, I think the problem may be some sort of character set mismatch. Could you tell me what CONFIG_NLS_DEFAULT is set to on your kernel?
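
One possible way to answer this without the kernel source at hand is to read the config file that many distributions install next to the kernel. The sketch below is an assumption-laden helper, not something from this thread: the `/boot/config-<release>` location is a Debian/Ubuntu convention and may not exist elsewhere, and `nls_default` is a hypothetical function name.

```python
import platform
from pathlib import Path
from typing import Optional


def nls_default(config_text: str) -> Optional[str]:
    """Return the value of CONFIG_NLS_DEFAULT from kernel config text, or None if unset."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_NLS_DEFAULT="):
            # The value is quoted in .config, e.g. CONFIG_NLS_DEFAULT="utf8"
            return line.split("=", 1)[1].strip('"')
    return None


# Assumption: the distribution ships the running kernel's config under /boot.
config_path = Path("/boot") / f"config-{platform.release()}"
try:
    if config_path.is_file():
        print("CONFIG_NLS_DEFAULT =", nls_default(config_path.read_text(errors="replace")))
    else:
        print(f"{config_path} not found; the kernel config may live elsewhere on this system")
except OSError as exc:
    print(f"could not read {config_path}: {exc}")
```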
Comment 9 Guenter Kukkukk 2010-11-26 22:20:19 UTC
.config of my current kernel build contains

One must take care that the corresponding module (or build-in) is also set:

Normally a lot of NLS modules are set active.

Cheers, Günter

A nice site about "UTF-8 encoding table and Unicode characters":
Comment 10 Jeff Layton 2010-11-27 06:40:02 UTC
Right, I assume that Günter's default mapping is "correct". I'm more curious if there's a mismatch in Andreas' rig.
Comment 11 Andreas Heinlein 2010-11-27 12:39:57 UTC
Thanks for working on this so hard. But I'll have to wait until Monday to try anything out, as this is at work.
Comment 12 Guenter Kukkukk 2010-11-27 22:02:49 UTC
I was just thinking about this a bit more... how to get your results
on a "plain vanilla linux kernel"... like the one i also use.

I think i now know what's going on.

When i force cifs to use a different character mapping with:
  mount.cifs //server/share /mnt -o user=gk,iocharset=cp437
i get the same ucs mess on the wire like you.

Brad Hards (bradh) on the samba irc channel was so kind to install
the kernel source on his kubuntu 10.04 LTS machine.

Oh well,
CONFIG_NLS_DEFAULT="cp437"   !!!
was set on his machine.
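
The effect of that default can be imitated in userspace. The following Python sketch is an approximation of what happens, not the kernel's actual code path: it assumes the share name arrives as UTF-8 bytes, that each byte is then interpreted through the configured charset, and that the result is sent as UTF-16LE. `wire_bytes` is a hypothetical helper name.

```python
def wire_bytes(share: str, iocharset: str) -> bytes:
    """Approximate the on-the-wire name when UTF-8 input is read as `iocharset`."""
    raw = share.encode("utf-8")      # bytes typed in a UTF-8 locale
    as_seen = raw.decode(iocharset)  # how the chosen charset interprets those bytes
    return as_seen.encode("utf-16-le")


def hexs(data: bytes) -> str:
    """Format bytes as space-separated 0x.. values."""
    return " ".join(f"0x{b:02x}" for b in data)


name = "T\u00e4st"  # "Täst"
good = wire_bytes(name, "utf-8")
bad = wire_bytes(name, "cp437")

print("iocharset=utf8 :", hexs(good))
print("iocharset=cp437:", hexs(bad))
```

With cp437 the two UTF-8 bytes of "ä" (0xc3 0xa4) are treated as two separate characters, which yields the 0x1c 0x25 0xf1 0x00 sequence seen in the failing trace.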

Andreas, I'm sure when you use
   mount.cifs '//mail/Täst Freigabe' /mnt -o user=ah,iocharset=utf8
all will work as expected.

So, regarding cifs one could say "it's NOTABUG" ...

Cause other linux file systems also use
the side-effects of this CONFIG_NLS_DEFAULT setting should be
made available to a broader audience.

Andreas, would you please be so kind as to open a bug with
Ubuntu (Canonical)?
It's really strange that they still use cp437 these days.
As jlayton noted on irc (regarding UTF8):
   "especially given that they have libc set up for it"

Cheers, Günter

Only for kernel hackers. The usage of CONFIG_NLS_DEFAULT inside
the kernel source tree:

Comment 13 Andreas Heinlein 2010-11-29 04:02:10 UTC
What the *****?!? It works with iocharset=utf8, you're completely right. Well, I will certainly open that bug report with the Ubuntu kernel, but I'm somewhat disappointed now. I thought Ubuntu was one of the first distros with complete UTF-8 support, but it seems they missed something here.

Please keep this report open for a moment so I can post the link to the Ubuntu bug here for documentation.
Comment 14 Jeff Layton 2010-11-29 05:43:42 UTC
Ok, reassigning to Kukks since he did the heavy lifting on this.

I'm going to go ahead and mark the bug as INVALID. You should still be able to reference this bug in the Ubuntu bug report. Please reopen it if we need to discuss it further.