Bug 13972 - Different Device Id for GlusterFS FUSE mount is causing data loss in CTDB cluster
Summary: Different Device Id for GlusterFS FUSE mount is causing data loss in CTDB clu...
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: VFS Modules (show other bugs)
Version: 4.10.4
Hardware: x64 Linux
: P5 major (vote)
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-05-27 12:54 UTC by Anoop C S
Modified: 2020-01-03 09:05 UTC (History)
5 users (show)

See Also:


Attachments
patch for master (6.55 KB, patch)
2019-07-07 12:04 UTC, Michael Adam
no flags Details
patch for v4-10 (8.59 KB, patch)
2019-08-02 06:39 UTC, Anoop C S
obnox: review+
amitay: review+
Details
patch for v4-9 (8.59 KB, patch)
2019-08-02 06:41 UTC, Anoop C S
amitay: review+
obnox: review+
Details
patch for v4-11 (6.83 KB, patch)
2019-08-23 08:55 UTC, Anoop C S
amitay: review-
Details
patch for v4-11 (8.72 KB, patch)
2019-08-26 05:40 UTC, Anoop C S
amitay: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Anoop C S 2019-05-27 12:54:04 UTC
When vfs_glusterfs_fuse module is used to export FUSE mounted GlusterFS volumes as Samba shares across more than one node in a CTDB cluster, it is possible that same file/directory on the back-end file system is identified differently from clients connected via different public IPs. This happens due to the fact that Device Id, which constitute for file_id creation in Samba, corresponding to FUSE mounts on different nodes can be different depending upon the order in which they are getting mounted on the system. Following are the steps to reproduce the problem:

1. Have a Samba-CTDB-Gluster cluster spanning across two nodes with at least two public IPs.
2. Mount CTDB lock volume and other volume/s in different order such that you get two different Device Ids for `stat` command output on FUSE mount of the same volume(not the CTDB lock volume) on different nodes.
2. Use vfs_glusterfs_fuse module to export FUSE mount on two nodes as Samba shares.
3. Try to lock and write to the same file using two different public IPs in parallel from a client machine.

It is observed that data is being overwritten when same file is opened/locked for modification from client/s using different IPs. The bug is caused because the Device Id from step 2 are different for the two GlusterFS FUSE mounts. Samba servers identify the file/directories using a file_id which is created using the Device Id and inode number. Since the Device Id of the GlusterFS FUSE mount on the two Samba servers is not the same, the same file on the two Samba shares will end up having different file_ids. This means lock states are also not synchronized.

The bug is also present in configurations where GlusterFS FUSE mounts are exported without the help of vfs_glusterfs_fuse.
Comment 1 Ralph Böhme 2019-05-27 13:54:44 UTC
Maybe just document in the manpage that you have to use vfs_fileid alongside the module to address this?
Comment 2 Michael Adam 2019-05-29 11:20:58 UTC
(In reply to Ralph Böhme from comment #1)
> Maybe just document in the manpage that you have to use vfs_fileid alongside
> the module to address this?

The problem is that fileid does not work with gluster fuse: 
1) fsid is always 0 for fuse mounts
2) the fsname (from mtab) is not necessarily constant across nodes.

So we are working on a patch that will chop the host part off of the fsname and just use the volume name.

We are still thinking whether we should be doing a new mode in vfs_fileid (that would only be useful for gluster fuse mounts at the moment) or rather add an automatic mode into the glusterfs_fuse vfs module. (We do have patches for both.) The advantage of the fileid module is that it is a very small patch, but requires additional config changes for the users/customers and is a mode only (currently) useful for gluster. The advantage of doing it in the glusterfs_fuse module is that it would be fixed just by adding the glusterfs_fuse module (which you should do anyway when re-exporting a gluster-fuse mount). We can also special case some of the logic to this. But the patch is bigger.

Will update here. 

In the meantime, opinions on the above are very welcome! :-)
Comment 3 Michael Adam 2019-07-07 12:04:47 UTC
Created attachment 15290 [details]
patch for master

For reference here, this is the patch for master, already submitted as MR https://gitlab.com/samba-team/samba/merge_requests/620
Comment 4 Amitay Isaacs 2019-07-17 04:18:03 UTC
This patch breaks the build on freebsd 11 and 12.   There is no setmntent/getmntent on freebsd. 

1. Do we need to make this vfs module LINUX only?

2. Can we make part of the code ifdef LINUX?

3. Any other solution?


Here are the errors:


[3419/3918] Compiling source3/modules/vfs_glusterfs_fuse.c
../../source3/modules/vfs_glusterfs_fuse.c:94:6: warning: implicit declaration of function 'setmntent' is invalid in C99 [-Wimplicit-function-declaration]
        f = setmntent("/etc/mtab", "r");
            ^
../../source3/modules/vfs_glusterfs_fuse.c:94:4: warning: incompatible integer to pointer conversion assigning to 'FILE *' (aka 'struct __sFILE *') from 'int' [-Wint-conversion]
        f = setmntent("/etc/mtab", "r");
          ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../source3/modules/vfs_glusterfs_fuse.c:99:14: warning: implicit declaration of function 'getmntent' is invalid in C99 [-Wimplicit-function-declaration]
        while ((m = getmntent(f))) {
                    ^
../../source3/modules/vfs_glusterfs_fuse.c:99:12: warning: incompatible integer to pointer conversion assigning to 'struct mntent *' from 'int' [-Wint-conversion]
        while ((m = getmntent(f))) {
                  ^ ~~~~~~~~~~~~
../../source3/modules/vfs_glusterfs_fuse.c:104:13: error: incomplete definition of type 'struct mntent'
                if (stat(m->mnt_dir, &st) != 0) {
                         ~^
../../source3/modules/vfs_glusterfs_fuse.c:89:9: note: forward declaration of 'struct mntent'
        struct mntent *m;
               ^
../../source3/modules/vfs_glusterfs_fuse.c:110:16: error: incomplete definition of type 'struct mntent'
                p = strrchr(m->mnt_fsname, ':');
                            ~^
../../source3/modules/vfs_glusterfs_fuse.c:89:9: note: forward declaration of 'struct mntent'
        struct mntent *m;
               ^
../../source3/modules/vfs_glusterfs_fuse.c:112:9: error: incomplete definition of type 'struct mntent'
                        p = m->mnt_fsname;
                            ~^
../../source3/modules/vfs_glusterfs_fuse.c:89:9: note: forward declaration of 'struct mntent'
        struct mntent *m;
               ^
../../source3/modules/vfs_glusterfs_fuse.c:138:2: warning: implicit declaration of function 'endmntent' is invalid in C99 [-Wimplicit-function-declaration]
        endmntent(f);
        ^
Comment 5 Michael Adam 2019-07-31 22:24:26 UTC
The patch has been merged. I will provide a follow-up patch to fix the FreeBSD build. Apologies for the delay.
Comment 6 Michael Adam 2019-07-31 22:54:48 UTC
Proposed a fix patch as https://gitlab.com/samba-team/samba/merge_requests/682 : Only compile the module when `setmntent()` is found in the system.
Comment 7 Anoop C S 2019-08-02 06:39:52 UTC
Created attachment 15365 [details]
patch for v4-10
Comment 8 Anoop C S 2019-08-02 06:41:04 UTC
Created attachment 15366 [details]
patch for v4-9
Comment 9 Michael Adam 2019-08-08 13:26:39 UTC
Comment on attachment 15365 [details]
patch for v4-10

Thanks for doing the backport, Anoop!
Comment 10 Michael Adam 2019-08-20 05:53:51 UTC
Karo, can you please add the patches to the 4.9 and 4.10 branches?
Comment 11 Karolin Seeger 2019-08-23 08:38:23 UTC
(In reply to Michael Adam from comment #10)
Pushed to autobuild-v4-{10,9}-test.
Comment 12 Anoop C S 2019-08-23 08:55:41 UTC
Created attachment 15419 [details]
patch for v4-11

Just noticed that backport for v4-11 is not requested. Attaching the patch.
Comment 13 Amitay Isaacs 2019-08-26 03:22:30 UTC
Comment on attachment 15419 [details]
patch for v4-11

The second patch for checking setmnt is missing in this backport..
Comment 14 Anoop C S 2019-08-26 05:40:26 UTC
Created attachment 15428 [details]
patch for v4-11

Sorry, forgot to include follow-up fix previously. Attached new patch set.
Comment 15 Karolin Seeger 2019-08-27 10:18:32 UTC
(In reply to Anoop C S from comment #14)
Pushed to autobuild-v4-11-test.
Comment 16 Karolin Seeger 2019-09-03 11:41:46 UTC
Pushed to all branches.
Closing out bug report.

Thanks!