Bug 5959 - cp932 character that ends '\' put at end of line is mis-recognized as a continuous marker.
Product: Samba 3.2
Classification: Unclassified
Component: i18n
Hardware: x86 Linux
Importance: P3 trivial
Target Milestone: ---
Assigned To: Björn Jacke
QA Contact: Samba QA Contact
Duplicates: 957
Reported: 2008-12-09 11:14 UTC by TAKAHASHI Motonobu
Modified: 2009-05-26 06:58 UTC (History)
Description TAKAHASHI Motonobu 2008-12-09 11:14:40 UTC
A multibyte character whose last byte is 0x5c ('\'), when placed at the end of a line, is misrecognized as the line-continuation marker '\'. For example, assume the following smb.conf:

  [global]
  unix charset = CP932

  [share1]
  comment = testtest<0x955c>

  [share2]
  comment = test<0x955c>test


# <0x955c> denotes a Kanji character encoded in CP932 as the two bytes 0x95 0x5c.

The comment line of share1 is broken, because its final byte 0x5c falls at the end of the line.
The comment line of share2 is not, because the 0x5c byte is followed by other characters.

In CP932 encoding, <0x955c> must be recognized as a single, inseparable Kanji character, not as two separate characters 0x95 and 0x5c.
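The difference between the two readings can be reproduced outside Samba. The following is a minimal Python sketch, not Samba's actual parameter parser: the function names are hypothetical, and it only assumes that a byte-wise parser treats any trailing 0x5c as a continuation marker, which is the behaviour described above.

```python
def ends_with_continuation_bytewise(line: bytes) -> bool:
    """Naive check: any trailing 0x5c byte counts as a continuation marker."""
    return line.rstrip(b" \t\r\n").endswith(b"\\")


def ends_with_continuation_decoded(line: bytes, charset: str = "cp932") -> bool:
    """Charset-aware check: decode first, then inspect the last character."""
    return line.decode(charset).rstrip(" \t\r\n").endswith("\\")


# The two comment lines from the report, as raw CP932 bytes.
share1 = b"comment = testtest\x95\x5c"
share2 = b"comment = test\x95\x5ctest"
# A line that really does end in a backslash.
continued = b"comment = testtest\\"

for name, line in (("share1", share1), ("share2", share2), ("continued", continued)):
    print(name,
          ends_with_continuation_bytewise(line),
          ends_with_continuation_decoded(line))
```

Only the byte-wise check reports share1 as a continued line; the decoded check sees a Kanji character as the last character and still recognizes the genuine trailing backslash.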
Comment 1 TAKAHASHI Motonobu 2008-12-16 09:50:42 UTC
*** Bug 957 has been marked as a duplicate of this bug. ***
Comment 2 Björn Jacke 2009-05-26 06:58:50 UTC
This is a widely known problem with Shift JIS and CP932. That's why EUC-JP and now UTF-8 should be preferred. Any ASCII-incompatible character set is broken by design for use in locale environments. Samba will most probably never add any workarounds for this. You can use CP932 for "dos charset", but as "unix charset" it will be (at least partly) broken. You can convert existing file names to a different charset with convmv.
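Why EUC-JP and UTF-8 avoid the problem can be checked with a short Python sketch. It relies only on Python's built-in codecs; the helper name is hypothetical. It takes the Kanji character from the report, re-encodes it, and tests whether any byte of the multibyte sequence equals 0x5c.

```python
def has_backslash_byte(text: str, charset: str) -> bool:
    """Return True if the encoded form of text contains the byte 0x5c."""
    return 0x5C in text.encode(charset)


# The Kanji character from the report, recovered from its CP932 bytes.
kanji = b"\x95\x5c".decode("cp932")

for charset in ("cp932", "euc_jp", "utf-8"):
    encoded = kanji.encode(charset)
    print(charset, encoded.hex(), has_backslash_byte(kanji, charset))
```

In EUC-JP and UTF-8 every byte of a multibyte sequence is 0x80 or higher, so a byte-wise scan for '\' cannot match inside a character.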