Looks like there's an incompatibility between streams and the internal snapshot path handling in Samba that nobody has hit as of yet. When a client wants to access a snapshot version of a file, we internally pass the snapshop date information by attaching it to the filename using this pattern: FILE\@GMT-YYYY.MM.DD-HH.MM.SS In this case, FILE has streams attached to it, so the client sends a request to open the snapshot of that stream: 637 - Copy.JPG:com.apple.quarantine so internally the path becomes: 637 - Copy.JPG:com.apple.quarantine\@GMT-YYYY.MM.DD-HH.MM.SS Later, we have a path validation function that has this condition: if (stream_started) { switch (*s) { case '/': case '\\': return NT_STATUS_OBJECT_NAME_INVALID; case ':': if (s[1] == '\0') { return NT_STATUS_OBJECT_NAME_INVALID; } if (strchr_m(&s[1], ':')) { return NT_STATUS_OBJECT_NAME_INVALID; } break; } } While processing the path, when we hit the ':' stream_started is set to true and from that point onward any backslash '\' in the path is treated as invalid. Unfortunately we've just added a backslash as part of the snapshot time tagging to the name, so the above check is triggered and we fail the request with NT_STATUS_OBJECT_NAME_INVALID.
Will be working on a fix after SambaXP...
(In reply to Ralph Böhme from comment #0) We actually did hit that :) Sits in my queue for farther investigation...
OK, the correct fix here is to change the TWRP handling inside smb2_create.c to add the @GMT path just before the last pathname component instead of at the end. We can't add it to the start of the path, as it may be a DFS path (which is why I put it at the end). And of course now I know we shouldn't add it at the end (due to the file:streamname problem :-). So we need to reverse look for a '/' (or a '\' if a Windows open) and insert the @GMT string just before that.
(In reply to Jeremy Allison from comment #3) What parts rely on the current mechanics of encoding it in the path? Is there any other consumer of this then vfs_shadow_copy[2]? Just trying to get an idea about how much work it would be to convert to putting the timestamp as an additional member into struct smb_filenmae...
Yes, this is the long-term solution but more work than we can do for a release fix. It means a VFS ABI change of course, and then we need to ensure all pathname processing copies around the TWRP timestamp (and there are still crappy areas of the code that use only const char * paths - look at the parent dir processing etc.). IMHO this is a Samba 5.0.0 fix.
(In reply to Jeremy Allison from comment #5) Hm, but wouldn't there be just two producers (create context processing for smb2 and incoming path processing for smb1) and one consumer (vfs_shadow_copy)? Any other consumer of this?
Yes, only two producers - but look at all the pathname consuming code. There are many cases were we use things like parent_directory() that returns const char * paths, not smb_filename structs. We need to understand what they need to do with a TWRP token in the smb_filename struct. The complexity of where the shadow copies can exist is the problem (fucking shadow_copy2 module). Note that the canonicalization code in source3/smbd/filename.c converts the pathname (after DFS processing) into: * Canonicalize any incoming pathname potentially containining * a @GMT-token into a path that looks like: * * @GMT-YYYY-MM-DD-HH-MM-SS/path/name/components/last_component * * Leaves single path @GMT-token -component alone: * * @GMT-YYYY-MM-DD-HH-MM-SS -> @GMT-YYYY-MM-DD-HH-MM-SS * * Eventually when struct smb_filename is updated and the VFS * ABI is changed this will remove the @GMT-YYYY-MM-DD-HH-MM-SS * and store in the struct smb_filename as a struct timeval field * instead. */ So the pathname should always *start* with the @GMT token, keeping it constant over any parent_directory pathname processing etc. That keeps it available to the shadow_copy2 code whatever processing gets done with it. In order to move to the nirvana mentioned in the comment above, we'll need to remove *all* const char * code and move to struct smb_filename instead.
(In reply to Jeremy Allison from comment #7) > Yes, only two producers - but look at all the pathname consuming code. hm. Does it matter? I still don't get it, why does any component other then the cosumers that do the TWRP -> path translation (ie the relevant VFS modules) have to care *at all*? Can you give me an example of a single path based function outside of the VFS that would break if, for a snapshot path, it's just handed the pure path? Eg, for SMB2, we'd just stick the timestamp into struct smb_filename and then the only ones that have to care about it are the VFS modules who then just do the final path translation. I guess I'm missing something somewhere? :)
In current code, incoming TWRP is currently processed into: "@GMT/path/to/snapshot/file" parent_pathname() returns a const char * of "@GMT/path/to/snapshot" so shadow_copy2 still can get to the correct snapshot parent directory. With TWRP copied into smb_filename, pathname becomes: struct smb_filename { "path/to/snapshot/file" struct timeval @GMT twrp } parent_pathname() now returns const char * "path/to/snapshot", loses the @GMT twrp token. Inside shadow_copy2 it doesn't see a snapshot pathname, looks at wrong parent directory. This may be wrong, but this is the scenario I'm thinking about. Test moving it to the struct smb_filename and see if the shadow_copy2 snapshot torture test code still works.
In order to remove this possible issue, *all* pathname processing needs to move to struct smb_filename, not const char * anywhere. Then any pathname containing a TWRP timestamp can be correctly passed into a derived struct smb_filename, and converted into a TWRP-less struct smb_filename that points to the actual snapshot path inside the shadow_copy2 code and passed onto the lower-level VFS functions. The complexity comes as VFS functions can recurse instead of always calling VFS_XXX_NEXT(). For example, SMB_VFS_OPEN() calls SMB_VFS_STAT, not SMB_VFS_NEXT_STAT() - so it restarts processing at the top level, including any non-idempotent pathname conversions. Maybe you and I need to sit down and determine if this needs fixing so that stacked VFS modules only ever call lower-level ones. This part of the VFS stacking model was never properly thought through in design I thin.
(In reply to Jeremy Allison from comment #9) ah, ok, got it. Sorry for being so dumb. :) check_parent_access() is one example that would break, non_widelink_open() another one. So at least all users of parent_dirname() need checking. Hopefully there aren't many other path munging function that get used in the same sense. Hm, hm.... Also, thanks to your efforts, the VFS is now completely struct smb_filename based, so the conversion might not be too troublesome. Otot, metze and I thought the same about the impersonation stuff, so... :)
These need checking: $ git grep "struct smb_filename" source3/ | egrep -v ')|,|modules|torture|\.h:' | wc -l 147 :)
Created attachment 14680 [details] WIP patch for master, needs test
Created attachment 14714 [details] Patch for 4.8 and 4.9 backported from master Patch for this bug must be pushed *after* bug #13688.
Re-assigning to Karolin for inclusion in 4.9.next, 4.8.next.
(In reply to Jeremy Allison from comment #15) Pushed to autobuild-v4-{8,9}-test.
(In reply to Karolin Seeger from comment #16) Pushed to both branches. Closing out bug report. Thanks!