We are currently testing the Ceph VFS module to provide access to our Ceph cluster for Windows and macOS clients, with an Ubuntu SMB server acting as a gateway for these clients. We can establish sessions between our clients and the Samba server, and the clients can view data on the Ceph cluster and transfer it to their local devices. However, we are unable to transfer files from the clients to the cluster.

We have tried Samba versions 4.14 to 4.17 and Ubuntu versions 20.04 to 23.04 without success. We are currently using Samba 4.17.7 on Ubuntu 23.04 with a 6.2 kernel.

When we try to write a test file (DellInstaller_x64.exe) from a Windows client to a samba_tests/ folder on the cluster via the "vfs" share on the SMB server, we receive the following errors in the server logs:

[2023/07/14 14:14:54.336643, 5, pid=1332465, effective(27080, 27080), real(27080, 0)] ../../source3/smbd/smb2_trans2.c:3490(smbd_do_qfilepathinfo)
  smbd_do_qfilepathinfo: samba_tests/DellInstaller_x64.exe (fnum 319841074) level=1048 max_data=252
[2023/07/14 14:14:54.336648, 10, pid=1332465, effective(27080, 27080), real(27080, 0)] ../../source3/smbd/dosmode.c:715(fdos_mode)
  fdos_mode: samba_tests/DellInstaller_x64.exe
[2023/07/14 14:14:54.336653, 10, pid=1332465, effective(27080, 27080), real(27080, 0), class=vfs] ../../source3/modules/vfs_ceph.c:1309(cephwrap_fgetxattr)
  cephwrap_fgetxattr: [CEPH] fgetxattr(0x5609a735f4b0, 0x5609a74bf220, user.DOSATTRIB, 0x7ffc79383910, 256)
[2023/07/14 14:14:54.336658, 0, pid=1332465, effective(27080, 27080), real(27080, 0)] ../../source3/smbd/fd_handle.c:115(fsp_get_io_fd)
  fsp_get_io_fd: fsp [samba_tests/DellInstaller_x64.exe] is a path referencing fsp
[2023/07/14 14:14:54.336678, 10, pid=1332465, effective(27080, 27080), real(27080, 0), class=vfs] ../../source3/modules/vfs_ceph.c:1311(cephwrap_fgetxattr)
  cephwrap_fgetxattr: [CEPH] fgetxattr(...) = -9
[2023/07/14 14:14:54.336683, 5, pid=1332465, effective(27080, 27080), real(27080, 0)] ../../source3/smbd/dosmode.c:387(fget_ea_dos_attribute)
  fget_ea_dos_attribute: Cannot get attribute from EA on file samba_tests/DellInstaller_x64.exe: Error = Bad file descriptor

This is the "vfs" share section in our /etc/samba/smb.conf:

[vfs]
    comment = Home Directories
    path = /ivan/
    vfs objects = ceph
    ceph: config_file = /etc/ceph/ceph.conf
    ceph: user_id = samba.gw
    read only = no
    oplocks = no
    kernel share modes = no
    inherit acls = Yes
    valid users = ivan

I think this may arise from the following lines in the cephwrap_fgetxattr function, where "ret" is assigned:

static ssize_t cephwrap_fgetxattr(struct vfs_handle_struct *handle,
                                  struct files_struct *fsp,
                                  const char *name,
                                  void *value,
                                  size_t size)
{
    int ret;
    DBG_DEBUG("[CEPH] fgetxattr(%p, %p, %s, %p, %llu)\n",
              handle, fsp, name, value, llu(size));
    ret = ceph_fgetxattr(handle->data, fsp_get_io_fd(fsp), name, value, size);

It seems that "fsp" is being passed to "fsp_get_io_fd" even though it is a pathref file handle. I was wondering whether a conditional similar to the one in "cephwrap_flistxattr" could be used to catch this, along the lines of the following:

static ssize_t cephwrap_fgetxattr(struct vfs_handle_struct *handle,
                                  struct files_struct *fsp,
                                  const char *name,
                                  void *value,
                                  size_t size)
{
    int ret;
    DBG_DEBUG("[CEPH] fgetxattr(%p, %p, %s, %p, %llu)\n",
              handle, fsp, name, value, llu(size));
    if (!fsp->fsp_flags.is_pathref) {
        /*
         * We can use an io_fd to get an xattr.
         */
        ret = ceph_fgetxattr(handle->data, fsp_get_io_fd(fsp),
                             name, value, size);
    } else {
        /*
         * This is no longer a handle based call.
         */
        ret = ceph_getxattr(handle->data, fsp->fsp_name->base_name,
                            name, value, size);
    }

However, I'm not an experienced filesystem developer, so I'm unsure whether this would fix the issue or cause further problems. Curiously, uploading using "dd" from a Linux client was successful (though the transfer rates were below 1 MB/s).
This may be a case of me writing my config incorrectly or having the wrong set-up, in which case I would be more than happy to receive pointers on how we can make our VFS server work.
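To make the proposed conditional easier to reason about outside of Samba, here is a minimal, self-contained C sketch of the same dispatch. It does not use Samba or libcephfs: struct mock_fsp, mock_get_io_fd(), mock_fgetxattr(), mock_getxattr() and the attribute value "0x20" are all hypothetical stand-ins invented for illustration. The only behaviour taken from the report is that a pathref handle yields no usable I/O descriptor and that the fd-based call then fails with "Bad file descriptor".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical, stripped-down stand-in for Samba's files_struct. */
struct mock_fsp {
	bool is_pathref;       /* mirrors fsp->fsp_flags.is_pathref */
	int io_fd;             /* only meaningful for handles opened for I/O */
	const char *base_name; /* mirrors fsp->fsp_name->base_name */
};

/* Made-up attribute payload returned by both mock lookups. */
static const char mock_value[] = "0x20";

static int mock_copy_value(void *value, size_t size)
{
	size_t len = sizeof(mock_value) - 1;

	if (size < len) {
		return -ERANGE;
	}
	memcpy(value, mock_value, len);
	return (int)len;
}

/* Assumption: a pathref handle has no usable I/O descriptor, so -1 is returned. */
static int mock_get_io_fd(const struct mock_fsp *fsp)
{
	assert(fsp != NULL);
	return fsp->is_pathref ? -1 : fsp->io_fd;
}

/* Stand-in for an fd-based xattr call: rejects an invalid descriptor. */
static int mock_fgetxattr(int fd, const char *name, void *value, size_t size)
{
	(void)name;
	if (fd < 0) {
		return -EBADF;
	}
	return mock_copy_value(value, size);
}

/* Stand-in for a path-based xattr call: needs a path, not a descriptor. */
static int mock_getxattr(const char *path, const char *name,
			 void *value, size_t size)
{
	(void)name;
	if (path == NULL) {
		return -ENOENT;
	}
	return mock_copy_value(value, size);
}

/* Current behaviour: always goes through the I/O descriptor. */
static ssize_t fgetxattr_unpatched(const struct mock_fsp *fsp, const char *name,
				   void *value, size_t size)
{
	return mock_fgetxattr(mock_get_io_fd(fsp), name, value, size);
}

/* Proposed behaviour: fall back to the path for pathref handles. */
static ssize_t fgetxattr_patched(const struct mock_fsp *fsp, const char *name,
				 void *value, size_t size)
{
	if (!fsp->is_pathref) {
		return mock_fgetxattr(mock_get_io_fd(fsp), name, value, size);
	}
	return mock_getxattr(fsp->base_name, name, value, size);
}
```

With a mock_fsp whose is_pathref is true, fgetxattr_unpatched() returns -EBADF, which mirrors the failure in the log, while fgetxattr_patched() returns the length of the mock value. Whether the real module behaves the same way depends on the actual libcephfs calls and on how Samba populates fsp, neither of which this sketch models.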
(In reply to Ivan from comment #0) Yes, this is basically the correct approach. Cc'ing some folks involved with the ceph module. Hopefully one of them can pick this up. Besides that, from what I've seen, people seem to have a better experience using a Ceph kernel mount and then just sharing that filesystem without the vfs_ceph module.
(In reply to Ralph Böhme from comment #1) We've been using the kernel mount and sharing that via Samba in production for a couple of years now, and it has been working very successfully. However, we are seeing that transfer rates from Windows clients plateau at a little over 1 Gbps (particularly with larger files > 10 GB) on 10 Gbit interfaces, whilst Linux clients can maintain ~400 MB/s so long as the wsize and rsize are increased (we've found that wsize=rsize=8MB is generally optimal). We've not found a clear way to increase the SMB packet size on the Windows client side, despite the Samba server advertising a larger maximum size. Thus our motivation for exploring the VFS module was to see how Windows behaves and whether it could provide an avenue for > 200 MB/s transfer rates without incorporating Windows servers in our estate.
(In reply to Ivan from comment #2) https://lists.samba.org/archive/samba/2023-September/246446.html It looks like the performance issue is resolved :)
This bug was referenced in samba master: 83edfcff5ccd8c4c710576b6d5612e0578d168c8
Created attachment 18197 [details] patch from master
Comment on attachment 18197 [details] patch from master Reassigning for inclusion in 4.19 and 4.18.
Pushed to autobuild-v4-{19,18}-test.
This bug was referenced in samba v4-19-test: fcbda8c7525400fe85dde5b8edd1818a9d86f307
This bug was referenced in samba v4-18-test: 849c370d92a1fca18450ba7d0064e1adab4a77e4
Closing out bug report. Thanks!
This bug was referenced in samba v4-19-stable (Release samba-4.19.4): fcbda8c7525400fe85dde5b8edd1818a9d86f307
This bug was referenced in samba v4-18-stable (Release samba-4.18.10): 849c370d92a1fca18450ba7d0064e1adab4a77e4