The Samba-Bugzilla – Bug 5162
using iconv with pre7 chops last special character in filenames
Last modified: 2008-07-26 10:22:52 UTC
I'm syncing two directories:
src on WinXP @ x86
dst on Ubuntu 7.10 64-bit
On my src I have Hebrew filenames.
While syncing, filenames are correctly converted to UTF-8 on the dst (otherwise they are unreadable on the Linux machine); however, if a filename ends with a Hebrew character, that character is chopped.
To illustrate the transformation,
let a-z signify English letters
and A-Z signify Hebrew letters:
src filename -> dst filename
abc -> abc
ABc -> ABc
ABC -> AB
A -> <empty>
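The expected behaviour in the table above can be sketched in Python. This is only an illustration, not rsync's code: the helper name `convert_name` is hypothetical, and ISO-8859-8 is an assumed source charset (the report does not establish which charset the Windows side actually uses). A correct conversion keeps every character, including a trailing Hebrew letter:

```python
def convert_name(raw: bytes, src: str = "iso8859_8", dst: str = "utf-8") -> bytes:
    """Re-encode a filename from the source charset to the destination charset."""
    return raw.decode(src).encode(dst)


# Hypothetical sample names: Latin only, Hebrew + Latin, Hebrew only, one Hebrew letter.
names = ["abc", "\u05d0\u05d1c", "\u05d0\u05d1\u05d2", "\u05d0"]

for name in names:
    raw = name.encode("iso8859_8")   # bytes as stored on the source side (assumed charset)
    converted = convert_name(raw)    # bytes expected on the UTF-8 destination
    # A correct conversion round-trips to the same characters; nothing is chopped.
    assert converted.decode("utf-8") == name
    print(len(name), "chars ->", len(converted), "UTF-8 bytes")
```

Comparing the character count before and after conversion, as the loop does, is a quick way to spot a name that lost its last character.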
I'm using pre release 7 with --enable-iconv=. on both machines.
On cygwin I had to hardcode #define HAVE_ICONV_OPEN 1 into config.h to get it to compile correctly.
Note also that if I use --log-file on the client, the log file shows the correctly transformed filenames, yet they are created chopped on the dst filesystem.
Note also that simply copying the files from the XP machine using a Samba mount works perfectly.
This error applies to both files and directories.
If a directory was created with a mangled name on the dst, it is not populated with any files, only with its subdirs.
Created attachment 3081
patch for finding iconv_open under cygwin
This fixes the configure script not finding iconv_open under cygwin. It seems to work OK on Fedora 7 as well. The AC_CHECK_FUNCS check for iconv_open fails because the libiconv header redefines iconv_open to be libiconv_open. AC_CHECK_FUNCS doesn't include iconv.h, so it fails to find the function. I'm not adept at autoconf; there may be a better way to do this.
Thanks for the configure patch. I added a slightly changed version to the latest dev version (which you can download via git, rsync or the latest nightly tar file).
To help me debug the problem, can you create a tar file with some filenames that don't convert correctly? And also specify a name for the source character set (so that I can use an --iconv=SOURCE,utf8 spec for the test) and the command-line you used.
Created attachment 3084
I'm not sure how to determine the locale charsets.
I guess Windows uses ISO-8859-8 and Linux uses UTF-8, but I'm not certain about it.
I just realized that explicitly specifying --iconv=HEBREW,UTF-8 solves the issue.
So I guess the error is in the locale autodetection under cygwin and Windows. Using --iconv=.,UTF-8 still chops letters.
Thanks for the test file.
If you supply two --verbose options (e.g. -vv) to the rsync command, you'll see the charset info for the client and the server output like this:
client charset: HEBREW
server charset: UTF-8
That will show you what the default_charset() function determined for the transfer when using ".".
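For comparison, the charset that a process's locale reports can be inspected from Python. This sketch does not show how default_charset() is implemented in rsync (the thread does not say); it only illustrates the general idea of deriving a default charset from the environment's locale. The helper name `locale_charset` is hypothetical.

```python
import codecs
import locale


def locale_charset() -> str:
    """Return the charset name that the current locale settings report."""
    # Adopt the user's environment locale (LANG, LC_ALL, ...) if it is usable.
    try:
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        pass  # unsupported locale settings: keep the default "C" locale
    return locale.getpreferredencoding(False)


name = locale_charset()
print("locale charset:", name)
try:
    # Normalise to the codec name Python uses, if it recognises the charset.
    print("python codec:", codecs.lookup(name).name)
except LookupError:
    print("python has no codec for this charset")
```

If the reported charset differs from the one the filenames are actually stored in, naming the charset explicitly, as with --iconv=HEBREW,UTF-8 in this report, sidesteps the autodetection.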
Since "HEBREW" worked for you (and indeed, worked fine for me too), this looks like it may be an iconv library bug, but I'd like to double-check that.
No response to last query, so closing (user has things working).