Bug 1156 - Filenames corrupt, if they contain "number" character
Summary: Filenames corrupt, if they contain "number" character
Status: RESOLVED INVALID
Alias: None
Product: Samba 3.0
Classification: Unclassified
Component: Extended Characters
Version: 3.0.1
Hardware: All FreeBSD
Importance: P3 normal
Target Milestone: none
Assignee: Alexander Bokovoy
Reported: 2004-03-06 05:13 UTC by Pavel V.Zheltobryukhov
Modified: 2005-11-14 09:24 UTC

Description Pavel V.Zheltobryukhov 2004-03-06 05:13:33 UTC
I have Samba 3.0.1 on FreeBSD 5.2.
My charset settings:
display charset = KOI8-R
unix charset = KOI8-R
dos charset = CP866
When I create a file from Windows with a name that contains the "number" symbol ("#" in KOI8-R; I don't know
how the same symbol is represented in CP866), the name of the file becomes unreadable.
Such filenames work fine on Samba 2.2.8, though: the "number" symbol in CP866 is successfully converted to #
in KOI8-R when the file is created.
Comment 1 Gerald (Jerry) Carter (dead mail address) 2004-03-12 08:31:34 UTC
Jeremy, any ideas?
Comment 2 Pavel V.Zheltobryukhov 2004-03-14 21:22:55 UTC
One additional comment:

The "№" symbol in a filename is converted to the
"copyright" symbol in Samba 2.2.8a,
not to "#" as I posted before.
Comment 3 Alexander Bokovoy 2004-03-16 01:11:19 UTC
This is neither a Samba error nor an iconv error. It is a deliberate deviation
from the existing standard by the Russian community, which someone introduced
into Samba 2.2 a long time ago. The simple fact is that KOI8-R, by definition,
does not contain the Numero sign at all, while CP866 (the DOS encoding) does. So
a proper conversion between them is impossible. In the past, people violated the
KOI8-R standard and mapped Numero to some unimportant position for convenience.
Since iconv(3) in glibc and libiconv implement strict KOI8-R, this violation has
now been removed, at the cost of some inconvenience.
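
The mismatch can be reproduced outside Samba. The following is only a minimal
sketch: it uses Python's built-in cp866 and koi8_r codecs as a stand-in for a
strict iconv conversion, and the helper names (convert, try_convert) are
hypothetical, not anything from Samba or iconv.

```python
# Sketch only: Python's strict codecs stand in for a strict iconv conversion.
NUMERO_CP866 = b"\xfc"  # the byte CP866 assigns to the Numero sign (U+2116)


def convert(raw: bytes, src: str, dst: str) -> bytes:
    """Decode raw bytes from src and re-encode them strictly in dst."""
    return raw.decode(src).encode(dst)


def try_convert(raw: bytes, src: str, dst: str):
    """Return the converted bytes, or None if dst lacks a needed character."""
    try:
        return convert(raw, src, dst)
    except UnicodeEncodeError:
        return None


numero = NUMERO_CP866.decode("cp866")

# Ordinary Cyrillic letters exist in both code pages and convert cleanly.
print(try_convert("имя".encode("cp866"), "cp866", "koi8_r"))

# The Numero sign has no position in strict KOI8-R, so this yields None.
print(try_convert(NUMERO_CP866, "cp866", "koi8_r"))
```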

The solution is to switch to UTF-8, ISO8859-5, or CP1251 as the Unix encoding.
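
As a rough check of that suggestion, the sketch below asks Python's built-in
codecs which of the candidate encodings can represent the Numero sign. The
codec names are Python's spellings of the encodings mentioned above, and
can_encode is a hypothetical helper; this says nothing about how Samba or
iconv perform the conversion internally.

```python
# Sketch only: test which candidate Unix encodings can hold the Numero sign.
NUMERO = "\u2116"  # the Numero sign

# Python codec names for KOI8-R, UTF-8, ISO8859-5 and CP1251.
CANDIDATES = ["koi8_r", "utf_8", "iso8859_5", "cp1251"]


def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


support = {codec: can_encode(NUMERO, codec) for codec in CANDIDATES}

for codec, ok in support.items():
    print(f"{codec}: {'ok' if ok else 'cannot represent the Numero sign'}")
```

In the reporter's configuration, this would correspond to changing the
"unix charset" line from KOI8-R to one of the encodings reported as ok.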
Comment 4 Gerald (Jerry) Carter (dead mail address) 2005-11-14 09:24:32 UTC
database cleanup