vfs_fruit must handle trailing period and space mapping to and from Unicode private range, because the OS X client will always map them. See bug #11207.
*** Bug 11965 has been marked as a duplicate of this bug. ***
I think we should consider to map trailing period and tailing space unconditionally in smbd.
Linux cifs vfs maps them the same way as OS X clients since d2809afc253bf6c90e15.
Windows clients would at least be able to read the files with trailing period and tailing space (which would just stay private unicode glyphs there accordingly).
Can you explain exactly what you're proposing ? In terms of incoming encodings -> smbd transformation -> disk storage and then disk storage -> smbd tranformation -> outgoing encoding ?
I mean always map tailing space (0x20) on local disc to 0xF028 on the wire and local trailing period (0x2E) to 0xF029 on the wire.
b704e70b7cf48f9b67c07d585168e102dfa30bb4 is the correct git hash for the change in cifs vfs
... and the always apply the reverse mapping from wire encoding to local "encoding"
Instead of name mangling trailing '.' and ' ' (dot and space) we should map those characters to the Unicode private range.
The following vfs_catia config illustrates this:
vfs objects = catia
catia:mappings = 0x20:0xf028,0x2e:0xf029
But we need this in the path translation and mapping subsystems.
We still see this issue in the 4.10 branch; whereby an OS X client cannot see a folder that ends in a trailing space. Using 'catia:mappings = 0x20:0xf028,0x2e:0xf029' shows the folder and allows it's contents to be removed, however the folder results in an error -8003.
Could you advise what the current recommendations around this are?
Unfortunately preventing the creation of these files is not possible, as they are being recovered from long term backups, and already have these invalid file names.
(In reply to Björn Jacke from comment #4)
Is there some explanation for this mapping? To preserve compatibility with macOS (and Linux) clients that create such files?
I mean - in general currently existing instances of those files or folders on the server side will get name mangled from the server, which makes usage less straightforward so I am for some nicer solution to this.
The mangling of names ending in space or dot is due to the fact these are illegal names under Windows, and thus SMB2.
By default Samba mangles illegal names on the server into 8.3 names.
If you want them mapped into the private unicode range, then using the catia VFS module with:
'catia:mappings = 0x20:0xf028,0x2e:0xf029'
should do the trick. If that has problems, then it's a bug in vfs_catia and should be addressed there.
(In reply to Jeremy Allison from comment #10)
Won't that map any space? If it was that easy I would have added that to fruit years ago... :)
Oh doh ! Of course Ralph, sorry :-).
This could be done by adding a new catia parameter -
and then extending string_replace_allocate() to take an additional:
struct char_mappings **trailing_cmaps
(can be null if unused) parameter which only does the mapping translation for the last ucs2 character in the string.
It's really ugly, but string_replace_allocate() is only used by vfs_catia and lib/adouble.c so it's not too bad.
What do you think ?
The new function signature would be:
NTSTATUS string_replace_allocate(connection_struct *conn,
const char *name_in,
struct char_mappings **cmaps,
struct char_mappings **trailing_cmaps,
enum vfs_translate_direction direction)
Would that work ?
(In reply to Jeremy Allison from comment #10)
I understood why those characters cannot be used, but was a bit curious why to map them to exactly the proposed characters (and not some else). But I guess that is just to support compatibility with macOS and Linux that already use them.
In general I'd just be happy if this could get implemented somehow, since the current mangling is not that nice from a user perspective.
Unfortunately I cannot be of much help on the technical side, though. It does not look too ugly, maybe you are referring to the single purpose of **trailing_cmaps that cannot be used for other cases like leading characters?
(In reply to Jeremy Allison from comment #14)
yes, that would work. Afaict it will only really help macOS clients as that's the only client that can handle the mapped characters.
We'd also need tests (vfs.fruit testsuite already has a few iirc).
Well once it's in place the Linux kernel cifsfs client could also do the remapping for terminating characters.
Created attachment 16378 [details]
raw patch implementing catia:terminating mappings.
Just in case you want to try it out. Compiles but obviously needs more work.
(In reply to Ralph Böhme from comment #16)
If I understand comment #2 correctly that is already supported on Linux right now?
(In reply to Claudius Ellsel from comment #19)
Maybe, haven't tested or checked the code or docs.
(In reply to Ralph Böhme from comment #20)
I just tried to find the relevant part, but no luck. I assumed that "d2809afc253bf6c90e15" refers to a commit hash. With browser based tools, I didn't find that, though.
Jeremy: the trailing dot/space mappings are in the Linux cifs module since quite a while, see bug 11206
Claudius: read comment 5 here for the correct commit hash, also see the problems that it has in bug 13712
(In reply to Björn Jacke from comment #22)
Thanks! I guess I have skipped that comment when skimming through the bug.
So basically it seems to be there on the client side for Linux with
- https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/cifs?id=45e8a2583d97ca758a55c608f78c4cef562644d1 (initial commit)
- https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/cifs?id=b704e70b7cf48f9b67c07d585168e102dfa30bb4 (the hash from comment #5 which fixes the a mixup in mapping characters)
- https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/cifs?id=57c176074057531b249cf522d90c22313fa74b0b (recent commit also applying the mapping when the trailing period or space is not in the last component of the path)
So if I understand this correctly, the client is able to map those characters correctly but currently they will be treated as is client-mapped on the server side. And this bug here is about also making the server able to understand the mapping of those characters. That will as I understand then do the following:
- Paths with trailing periods or spaces on the server are no longer delivered mangled to clients but instead also get mapped.
- When clients create paths with mapped trailing spaces or periods, the server knows how to "unmap" them again for storage on its internal file system.