Bug 15385 - vfs_shadow_copy2 bug with shadow:localtime=yes and shadow:snapprefix
Summary: vfs_shadow_copy2 bug with shadow:localtime=yes and shadow:snapprefix
Status: NEW
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: VFS Modules (show other bugs)
Version: 4.16.10
Hardware: All All
: P5 minor (vote)
Target Milestone: ---
Assignee: Samba QA Contact
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-06-03 10:10 UTC by Andrea Venturoli
Modified: 2024-02-09 10:17 UTC (History)
4 users (show)

See Also:


Attachments
Patch that solves the double local to GMT conversion (365 bytes, text/x-csrc)
2023-06-03 10:10 UTC, Andrea Venturoli
no flags Details
Second revision of the patch (409 bytes, patch)
2023-06-09 16:51 UTC, Andrea Venturoli
no flags Details
Fixed patch (411 bytes, patch)
2023-06-13 12:07 UTC, Andrea Venturoli
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Andrea Venturoli 2023-06-03 10:10:36 UTC
Created attachment 17905 [details]
Patch that solves the double local to GMT conversion

Hello.

I think I found a bug in vfs_shadow_copy2.
The situation I'll describe arises on FreeBSD + ZFS + Samba 4.16 + py-zfs-autobackup; however, I think this is not specific to any of these components. I also believe the bug is still there in 4.18 (although I heard shadow_copy2 is going to be replaced in later versions).

I'm letting py-zfs-autobackup produce hourly snapshots with a retention policy that, while getting more and more sparse, goes back to a year.
E.g.

zfs list -t snap | grep share|sort
zroot/ezjail/fs/shares@auto_zroot-20220627080000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20220727080000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20220827190000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20220926080000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20221025090000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20221124080000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20221225120000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20230123090000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20230222090000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20230412120000             17.3K      -     30.6K  -
zroot/ezjail/fs/shares@auto_zroot-20230424100000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20230511080000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20230518080000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20230528130000             16.0K      -     32.0K  -
zroot/ezjail/fs/shares@auto_zroot-20230529080000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20230530090000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20230530110000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20230530120000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20230530130000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20230530140000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20230530150000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20230530160000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20230530170000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20230530180000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20230530190000                0B      -     33.3K  -
zroot/ezjail/fs/shares@auto_zroot-20230531090000                0B      -     33.3K  -

Then in my smb.conf I have:
[global]
vfs objects=full_audit shadow_copy2
shadow:sort = desc
shadow:format=-%Y%m%d%H%M%S
shadow:delimiter=-
shadow:snapprefix=auto_zroot
shadow:localtime=yes
shadow:snapdir=.zfs/snapshot

Notice "shadow:localtime=yes", as the above timestamps are in the CEST timezone.

When a Windows (tried 10 and 2012) client connects to a share and asks for snapshots (with FSCTL_GET_SHADOW_COPY_DATA), tcpdump shows the full list is reported; however only snapshots of the last week are shown to the user in the property window; furthermore, attempting to open most snapshots fails with "the specified shapshots does not exist anymore" (or similar message).



Changing "shadow:localtime" to "no", everything works properly (apart of course, the displayed times being wrong).



So I debugged the code and I think I found the cause:
_ when FSCTL_GET_SHADOW_COPY_DATA is called, shadow_copy2_snapshot_to_gmt will parse all the snapshots name, correctly identifing them, converting the timestamps to GMT and saving them as such in the internal list of snapshots;
_ the list is sent to the client also with timestamps in GMT;
_ the client will then ask for snapshots by their GMT time; however shadow_copy2_posix_gmt_string will again consider this a local time and convert it again to GMT (so doing it twice).



I'm attaching a patch that solves this (or at least it does on all the systems where I've tried it).
Comment 1 Peter Eriksson 2023-06-07 08:07:35 UTC
I'm not sure the patch is correct for all cases. With the patch in use on our system my Windows 10 client does not show all snapshots/versions. Without it the snapshots show up and are usable.

FreeBSD 12.4, Samba 4.18.3

smb.conf:

shadow:snapdir = .zfs/snapshot
shadow:format = auto-%Y-%m-%d.%H:%M:%S
shadow:sort = desc
shadow:localtime = yes
shadow:snapdirseverywhere = yes
vfs objects = shadow_copy2 zfsacl


Our snapshot names use the local timezone:

root@filur00:/home/peter86 # ls -l .zfs/snapshot/|egrep 2023-06-07
drwx------  11 peter86  employee-liu.se  24 May 25 13:45 auto-2023-06-07.07:00:00
drwx------  11 peter86  employee-liu.se  24 May 25 13:45 auto-2023-06-07.08:00:00
drwx------  11 peter86  employee-liu.se  24 May 25 13:45 auto-2023-06-07.09:00:00
drwx------  11 peter86  employee-liu.se  24 May 25 13:45 auto-2023-06-07.10:00:00


Without the patch I see the following versions of a test file:

root@filur00:/home/peter86 # ls -l FILUR00
-rwx------  1 peter86  employee-liu.se  8 Jun  7 09:04 FILUR00

(Output from Windows 10 file browser, "Restore previous version")

Today (1)
FILUR00    2023-06-07 08:32

A long time ago (2)
FILUR00     2021-09-29 14:24
FILUR00     2020-03-28 17:13

(I updated it at 08:32, and then a snapshot was taken at 09:00 and then I modified it again)


With the patch in use only this version is displayed:

A long time ago (1)
FILUR00    2021-09-29 14:24
Comment 2 Andrea Venturoli 2023-06-07 08:45:39 UTC
Thanks for testing this.
As I said I'm not sure my patch is fine for all cases, but I'd be glad if we could reach that point.

How do you run Samba 4.18 on FreeBSD 12.4?
AFAIK even Samba 4.16 requires at least FreeBSD 13.1...
Then, there is no port for 4.18: are you trying to create one? In that case I'd be glad to test it.
Maybe the patch for 4.18 should be different than that for 4.16 or maybe there are other reasons.

So you're trying to get "previous versions" of a *file*?
I always get an empty list for that and I thought this was expected behaviour...

Does it work properly (with and without patch) if you try to get "previous versions" of a directory?

If so, possibly the patch is incomplete (but first I'd need to check why I don't get anything for files).
Comment 3 Peter Eriksson 2023-06-08 06:29:18 UTC
> How do you run Samba 4.18 on FreeBSD 12.4?
> AFAIK even Samba 4.16 requires at least FreeBSD 13.1...

I don't use the port version of Samba but build it myself instead.
I have a "BUILD" script that downloads, unpacks, patches, configures and builds it. You can find my script (and some patches) at: http://www.grebo.net/~peter/samba-BUILD.tgz

The patches are for some of our local fixes, just remove them if you don't want them. 

Build with "./BUILD". Install (into /liu/pkg/samba/<version>/{bin,sbin,lib,etc} with "./BUILD install"


> So you're trying to get "previous versions" of a *file*?

Yes, it works.

> Does it work properly (with and without patch) if you try to get 
> "previous  versions" of a directory?

Yes it works for directories (and files) without your patch.
Comment 4 Andrea Venturoli 2023-06-08 10:46:20 UTC
(In reply to Peter Eriksson from comment #3)

Thanks.

I set up a test environment, where I compiled plain 4.18 (no patches from port, no patches from you).
I still get no previous versions for files and those for dirs are wrong (as described in my first post).
With my patch, at least those for directories work properly.

I then added your patch and retested (with and without my patch): results are the same.

I really have no idea what could be different in our two setups.
I'll try to bring up a whole new system from scratch and try again.
Comment 5 Peter Eriksson 2023-06-08 11:21:24 UTC
Different time zones? I'm at CEST (CET + Summer Time 1) which is 2 hours off GMT.

root@filur00:/export/software/Build/samba # env TZ=GMT date
Thu Jun  8 11:10:20 GMT 2023

root@filur00:/export/software/Build/samba # date
Thu Jun  8 13:10:22 CEST 2023


Does py-zfs-autobackup use the local timezone or GMT for snapshot names?
Comment 6 Andrea Venturoli 2023-06-08 11:36:07 UTC
(In reply to Peter Eriksson from comment #5)

I'm at CEST too (Italy).

py-zfs-autobackup uses local time (otherwise all of this would be pointless :)
Comment 7 Peter Eriksson 2023-06-08 11:40:23 UTC
Ok, did some more testing here.

With these settings in smb.conf things work fine for me:

shadow:format = auto-%Y-%m-%d.%H:%M:%S
;; shadow:snapprefix=auto                                                                           
;; shadow:delimiter=-       


With these I only see some of the snapshots (and not the right ones):

shadow:format = -%Y-%m-%d.%H:%M:%S
shadow:snapprefix=auto
shadow:delimiter=-



These are the same for both tests:

shadow:snapdir = .zfs/snapshot
shadow:sort = desc
shadow:localtime = yes
shadow:snapdirseverywhere = yes
Comment 8 Andrea Venturoli 2023-06-08 14:04:07 UTC
Hmm... this is an interesting observation.

There are two code paths in shadow_copy2_posix_gmt_string (the function I patched):
_ in your case, with no shadow:snapprefix, regexes are not used and the output of localtime_r is passed directly to strftime;
_ OTOH, with shadow:snapprefix=whatever, regexes are used and the output of localtime_r is passed to shadow_copy2_saved_snapname.

The original source code works in the former case, but seems to be broken on the latter.
My previous patch inverted this (the former broke, the latter worked).

Both work here if I modify that line to:
if (config->use_localtime && priv->snaps->regex == NULL) {

Again, this is a dirty hack; someone with more in-depth knowledge might spot a better solution.

Of course, I could use your config (with just "shadow:format=auto_zroot-%Y%m%d%H%M%S"); probably it would even be faster.
Yet I still think this is a bug.

P.S.
I still see no snaps on files; since you do, could you try this please?
Comment 9 Peter Eriksson 2023-06-08 14:26:31 UTC
Do you have "snapdirseverywhere = yes" enabled? It doesn't matter for us, but we have the snapshot directories mostly in the same directory as the share dir is, but if it isn't then perhaps the code have problems locating the snapshot dir?

Where is the .zfs/snapshot directory located relative to the file and the share "root"?


> root@filur00:/home/peter86 # cd /home/peter86
> 
> root@filur00:/home/peter86 # df -h .
> Filesystem            Size    Used   Avail Capacity  Mounted on
> DATA/staff/peter86     20G    377M     20G     2%    /export/staff/peter86
> 
> root@filur00:/home/peter86 # ls -l .zfs/snapshot | tail -4
> drwx------  11 peter86  employee-liu.se  24 May 25 13:45 auto-2023-06-08.13:00:00
> drwx------  11 peter86  employee-liu.se  24 May 25 13:45 auto-2023-06-08.14:00:00
> drwx------  11 peter86  employee-liu.se  24 May 25 13:45 auto-2023-06-08.15:00:00
> drwx------  11 peter86  employee-liu.se  24 May 25 13:45 auto-2023-06-08.16:00:00
> 
> root@filur00:/home/peter86 # ls -l FILUR00
> -rwx------+ 1 peter86  employee-liu.se  19 Jun  8 11:18 FILUR00
>
> root@filur00:/home/peter86 # ls -l .zfs/snapshot/auto-2023-06-07.08:00:00/FILUR00
> -rwx------  1 peter86  employee-liu.se  6 Sep 29  2021 .zfs/snapshot/auto-2023-06-07.08:00:00/FILUR00
> root@filur00:/home/peter86 # ls -l .zfs/snapshot/auto-2023-06-07.09:00:00/FILUR00
> -rwx------  1 peter86  employee-liu.se  16 Jun  7 08:32 .zfs/snapshot/auto-2023-06-07.09:00:00/FILUR00
> root@filur00:/home/peter86 # ls -l .zfs/snapshot/auto-2023-06-07.10:00:00/FILUR00
> -rwx------  1 peter86  employee-liu.se  8 Jun  7 09:04 .zfs/snapshot/auto-2023-06-07.10:00:00/FILUR00


And the view from Windows file browser -> Right click on the file -> Restore previous versions shows:

> Yesterday (2)
> FILUR00     2023-06-06 09:04
> FILUR00     2023-06-07 08:32
> 
> A long time ago (2)
> FILUR00     2021-09-29 14:24
> FILUR00     2020-03-28 17:13
Comment 10 Peter Eriksson 2023-06-08 14:27:55 UTC
And from the Windows machine I mount that share as:

> NET USE X: \\filur00\peter86
Comment 11 Andrea Venturoli 2023-06-08 14:33:43 UTC
(In reply to Peter Eriksson from comment #9)

> Do you have "snapdirseverywhere = yes" enabled?

No.
The only shadow related parameters I have are those I posted.



> perhaps the code have problems locating the snapshot dir?

No.
I followed the code in gdb and the snapshot dir is properly found.
The problem is related to time handling.



> Where is the .zfs/snapshot directory located relative to the file and the share "root"?

It can depend, but usually I have a dedicated fileset, e.g. /shares, so it's /shares/.zfs/snapshot.
Comment 12 Andrea Venturoli 2023-06-09 16:51:51 UTC
Created attachment 17915 [details]
Second revision of the patch
Comment 13 Andrea Venturoli 2023-06-13 12:07:15 UTC
Created attachment 17916 [details]
Fixed patch