Bug 11255 - Handling of trailing period and trailing space in vfs_fruit or smbd
Summary: Handling of trailing period and trailing space in vfs_fruit or smbd
Status: ASSIGNED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services (show other bugs)
Version: unspecified
Hardware: All All
: P5 normal (vote)
Target Milestone: ---
Assignee: Ralph Böhme
QA Contact: Samba QA Contact
URL:
Keywords:
: 11965 (view as bug list)
Depends on:
Blocks: 3792 4317 7001 10491
  Show dependency treegraph
 
Reported: 2015-05-05 12:41 UTC by Ralph Böhme
Modified: 2021-01-10 16:17 UTC (History)
8 users (show)

See Also:


Attachments
raw patch implementing catia:terminating mappings. (5.39 KB, patch)
2021-01-07 19:09 UTC, Jeremy Allison
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Ralph Böhme 2015-05-05 12:41:53 UTC
vfs_fruit must handle trailing period and space mapping to and from Unicode private range, because the OS X client will always map them. See bug #11207.
Comment 1 Björn Jacke 2017-11-06 22:58:35 UTC
*** Bug 11965 has been marked as a duplicate of this bug. ***
Comment 2 Björn Jacke 2017-11-06 23:08:30 UTC
I think we should consider to map trailing period and tailing space unconditionally in smbd.

Linux cifs vfs maps them the same way as OS X clients since d2809afc253bf6c90e15.

Windows clients would at least be able to read the files with trailing period and tailing space (which would just stay private unicode glyphs there accordingly).
Comment 3 Jeremy Allison 2017-11-06 23:12:20 UTC
Can you explain exactly what you're proposing ? In terms of incoming encodings -> smbd transformation -> disk storage and then disk storage -> smbd tranformation -> outgoing encoding ?
Comment 4 Björn Jacke 2017-11-06 23:32:16 UTC
I mean always map tailing space (0x20) on local disc to 0xF028 on the wire and local trailing period (0x2E) to 0xF029 on the wire.
Comment 5 Björn Jacke 2017-11-06 23:36:07 UTC
b704e70b7cf48f9b67c07d585168e102dfa30bb4 is the correct git hash for the change in cifs vfs
Comment 6 Björn Jacke 2017-11-06 23:41:55 UTC
... and the always apply the reverse mapping from wire encoding to local "encoding"
Comment 7 Ralph Böhme 2017-11-07 12:02:51 UTC
Ack.

Instead of name mangling trailing '.' and ' ' (dot and space) we should map those characters to the Unicode private range.

The following vfs_catia config illustrates this:

    vfs objects = catia
    catia:mappings = 0x20:0xf028,0x2e:0xf029

But we need this in the path translation and mapping subsystems.
Comment 8 Ryan Launchbury 2020-03-04 11:20:37 UTC
Hi all,

We still see this issue in the 4.10 branch; whereby an OS X client cannot see a folder that ends in a trailing space. Using 'catia:mappings = 0x20:0xf028,0x2e:0xf029' shows the folder and allows it's contents to be removed, however the folder results in an error -8003.

Could you advise what the current recommendations around this are?

Unfortunately preventing the creation of these files is not possible, as they are being recovered from long term backups, and already have these invalid file names.

Best,
Ryan
Comment 9 Claudius Ellsel 2021-01-06 22:27:04 UTC
(In reply to Björn Jacke from comment #4)
Is there some explanation for this mapping? To preserve compatibility with macOS (and Linux) clients that create such files?

I mean - in general currently existing instances of those files or folders on the server side will get name mangled from the server, which makes usage less straightforward so I am for some nicer solution to this.
Comment 10 Jeremy Allison 2021-01-06 22:39:49 UTC
The mangling of names ending in space or dot is due to the fact these are illegal names under Windows, and thus SMB2.

By default Samba mangles illegal names on the server into 8.3 names.

If you want them mapped into the private unicode range, then using the catia VFS module with:

'catia:mappings = 0x20:0xf028,0x2e:0xf029'

should do the trick. If that has problems, then it's a bug in vfs_catia and should be addressed there.
Comment 11 Ralph Böhme 2021-01-06 23:06:59 UTC
(In reply to Jeremy Allison from comment #10)
Won't that map any space? If it was that easy I would have added that to fruit years ago... :)
Comment 12 Jeremy Allison 2021-01-07 01:28:41 UTC
Oh doh ! Of course Ralph, sorry :-).
Comment 13 Jeremy Allison 2021-01-07 01:42:49 UTC
This could be done by adding a new catia parameter -

catia:trailing_mappings

and then extending string_replace_allocate() to take an additional:

struct char_mappings **trailing_cmaps

(can be null if unused) parameter which only does the mapping translation for the last ucs2 character in the string.

It's really ugly, but string_replace_allocate() is only used by vfs_catia and lib/adouble.c so it's not too bad.

What do you think ?
Comment 14 Jeremy Allison 2021-01-07 01:44:36 UTC
The new function signature would be:

NTSTATUS string_replace_allocate(connection_struct *conn,
                                 const char *name_in,
                                 struct char_mappings **cmaps,
                                 struct char_mappings **trailing_cmaps,
                                 TALLOC_CTX *mem_ctx,
                                 char **mapped_name,
                                 enum vfs_translate_direction direction)

Would that work ?
Comment 15 Claudius Ellsel 2021-01-07 12:43:39 UTC
(In reply to Jeremy Allison from comment #10)
I understood why those characters cannot be used, but was a bit curious why to map them to exactly the proposed characters (and not some else). But I guess that is just to support compatibility with macOS and Linux that already use them.

In general I'd just be happy if this could get implemented somehow, since the current mangling is not that nice from a user perspective.

Unfortunately I cannot be of much help on the technical side, though. It does not look too ugly, maybe you are referring to the single purpose of **trailing_cmaps that cannot be used for other cases like leading characters?
Comment 16 Ralph Böhme 2021-01-07 17:41:53 UTC
(In reply to Jeremy Allison from comment #14)
yes, that would work. Afaict it will only really help macOS clients as that's the only client that can handle the mapped characters.

We'd also need tests (vfs.fruit testsuite already has a few iirc).
Comment 17 Jeremy Allison 2021-01-07 17:44:20 UTC
Well once it's in place the Linux kernel cifsfs client could also do the remapping for terminating characters.
Comment 18 Jeremy Allison 2021-01-07 19:09:38 UTC
Created attachment 16378 [details]
raw patch implementing catia:terminating mappings.

Just in case you want to try it out. Compiles but obviously needs more work.
Comment 19 Claudius Ellsel 2021-01-08 16:01:16 UTC
(In reply to Ralph Böhme from comment #16)
If I understand comment #2 correctly that is already supported on Linux right now?
Comment 20 Ralph Böhme 2021-01-08 16:10:08 UTC
(In reply to Claudius Ellsel from comment #19)
Maybe, haven't tested or checked the code or docs.
Comment 21 Claudius Ellsel 2021-01-08 16:46:52 UTC
(In reply to Ralph Böhme from comment #20)
I just tried to find the relevant part, but no luck. I assumed that "d2809afc253bf6c90e15" refers to a commit hash. With browser based tools, I didn't find that, though.
Comment 22 Björn Jacke 2021-01-09 23:50:14 UTC
Jeremy: the trailing dot/space mappings are in the Linux cifs module since quite a while, see bug 11206

Claudius: read comment 5 here for the correct commit hash, also see the problems that it has in bug 13712
Comment 23 Claudius Ellsel 2021-01-10 16:17:30 UTC
(In reply to Björn Jacke from comment #22)
Thanks! I guess I have skipped that comment when skimming through the bug.

So basically it seems to be there on the client side for Linux with
- https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/cifs?id=45e8a2583d97ca758a55c608f78c4cef562644d1 (initial commit)
- https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/cifs?id=b704e70b7cf48f9b67c07d585168e102dfa30bb4 (the hash from comment #5 which fixes the a mixup in mapping characters)
- https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/cifs?id=57c176074057531b249cf522d90c22313fa74b0b (recent commit also applying the mapping when the trailing period or space is not in the last component of the path)

So if I understand this correctly, the client is able to map those characters correctly but currently they will be treated as is client-mapped on the server side. And this bug here is about also making the server able to understand the mapping of those characters. That will as I understand then do the following:
- Paths with trailing periods or spaces on the server are no longer delivered mangled to clients but instead also get mapped.
- When clients create paths with mapped trailing spaces or periods, the server knows how to "unmap" them again for storage on its internal file system.