5959 – cp932 character that ends '\' put at end of line is mis-recognized as a continuous marker.

Status:	RESOLVED WONTFIX

Alias:	None

Product:	Samba 3.2
Classification:	Unclassified
Component:	i18n
Version:	3.2.5
Hardware:	x86 Linux

Importance:	P3 trivial
Target Milestone:	---
Assignee:	Björn Jacke
QA Contact:	Samba QA Contact

URL:
Keywords:

Duplicates (1):	957
Depends on:
Blocks:

Reported:	2008-12-09 11:14 UTC by TAKAHASHI Motonobu
Modified:	2009-05-26 06:58 UTC
CC List:	0 users

See Also:

Description TAKAHASHI Motonobu 2008-12-09 11:14:40 UTC

A multibyte character whose last byte is '\' (0x5c), when placed at the end of a line, is misrecognized as the line-continuation marker '\'. For example, given this smb.conf:

-----
[global]
  unix charset = CP932

[share1]
  comment = testtest<0x955c>
  ...

[share2]
  comment = test<0x955c>test
  ...

-----

# <0x955c> denotes a Kanji character encoded in CP932 (bytes 0x95 0x5c).

The comment line of share1 is broken, because the byte 0x5c falls at the end of the line.
The comment line of share2 is not, because the 0x5c is followed by more text.

In CP932 encoding, <0x955c> must be recognized as a single, inseparable Kanji character, not as two separate characters 0x95 and 0x5c.
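
The difference between the two readings can be reproduced outside Samba. The following is a minimal Python sketch, not Samba's parser: the function names are hypothetical, and it only assumes that a byte-wise check treats any trailing 0x5c as a continuation marker, while a charset-aware check decodes the line as CP932 first.

```python
# Illustrative only: this is NOT Samba's config parser.
# Both function names are hypothetical.

def naive_is_continued(raw_line: bytes) -> bool:
    """Byte-wise check: any trailing 0x5c counts as a continuation marker."""
    return raw_line.rstrip(b"\r\n").endswith(b"\\")


def charset_aware_is_continued(raw_line: bytes, charset: str = "cp932") -> bool:
    """Decode first, so a 0x5c that is the second byte of a double-byte
    character is not mistaken for a standalone backslash."""
    return raw_line.rstrip(b"\r\n").decode(charset).endswith("\\")


# The two lines from the report, plus a genuine continuation line.
share1 = b"  comment = testtest\x95\x5c\n"
share2 = b"  comment = test\x95\x5ctest\n"
real_continuation = b"  comment = test \\\n"

for name, line in [("share1", share1),
                   ("share2", share2),
                   ("real_continuation", real_continuation)]:
    print(name, naive_is_continued(line), charset_aware_is_continued(line))
```

Only share1 is classified differently by the two checks, which matches the behaviour described above.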

Comment 1 TAKAHASHI Motonobu 2008-12-16 09:50:42 UTC

*** Bug 957 has been marked as a duplicate of this bug. ***

Comment 2 Björn Jacke 2009-05-26 06:58:50 UTC

This is a widely known problem with Shift JIS and CP932. That is why EUC-JP, and now UTF-8, should be preferred. Any ASCII-incompatible character set is broken by design for use in locale environments. Samba will most probably never add workarounds for this. You can use CP932 for "dos charset", but as "unix charset" it will be (at least partly) broken. You can convert existing file names to a different charset with convmv.
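
As a small illustration of the point about ASCII compatibility, the Python sketch below encodes the character from the report in CP932, EUC-JP and UTF-8 and checks whether the ASCII backslash byte 0x5c appears in the result. The helper name is hypothetical, and the check covers only this one character, not the encodings as a whole.

```python
# Illustrative only; contains_backslash_byte is a hypothetical helper.

def contains_backslash_byte(text: str, codec: str) -> bool:
    """Return True if the encoded form of text contains the byte 0x5c."""
    return 0x5C in text.encode(codec)


# The Kanji character from the report, obtained from its CP932 bytes.
kanji = b"\x95\x5c".decode("cp932")

for codec in ("cp932", "euc_jp", "utf-8"):
    print(codec, kanji.encode(codec).hex(), contains_backslash_byte(kanji, codec))
```

For this character, only the CP932 form contains 0x5c, so a byte-oriented parser would misread it only under CP932.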