I have a NAS (based on Samba) that did not send a disk I/O error code to the client during a directory scan. If that is the actual behavior, is there a chance to send an error code such as STATUS_FILE_CORRUPT_ERROR to the client?

// It is used in real file system drivers such as fastfat.
// https://github.com/Microsoft/Windows-driver-samples/blob/6c1981b8504329521343ad00f32daa847fa6083a/filesys/fastfat/dirsup.c

I could reproduce this problem by simulating a failing disk with a dmsetup error mapping.

Here is the ls output, which reports an error at /mnt/test/broken/:

```
ku@samba-server:~$ ls /mnt/test/broken/
ls: reading directory '/mnt/test/broken/': Input/output error
```

However, smbd accessed through smbclient reports no error at //127.0.0.1/test/broken/:

```
ku@samba-server:~$ /usr/local/samba/bin/smbclient //127.0.0.1/test
Unable to initialize messaging context
Enter WORKGROUP\ku's password:
Try "help" to get a list of possible commands.
smb: \> ls
  .                                   D        0  Mon Dec 17 13:58:35 2018
  ..                                  D        0  Mon Dec 17 12:34:35 2018
  ok                                  D        0  Mon Dec 17 13:58:44 2018
  broken                              D        0  Mon Dec 17 13:54:24 2018
  lost+found                          D        0  Mon Dec 17 12:48:40 2018

                53 blocks of size 1024. 34 blocks available
smb: \> cd broken
smb: \broken\> ls
  .                                   D        0  Mon Dec 17 13:54:24 2018
  ..                                  D        0  Mon Dec 17 13:58:35 2018

                53 blocks of size 1024. 34 blocks available
smb: \broken\>
```

A Windows shared folder can report a status code such as NT_STATUS_DEVICE_NOT_READY on a device error:

```
smb: \folder1\folder2\folder3\> cd folder4\
smb: \folder1\folder2\folder3\folder4\> ls
NT_STATUS_DEVICE_NOT_READY listing \folder1\folder2\folder3\folder4\*
```

// NT_STATUS_DEVICE_NOT_READY is reported because the iSCSI target returns "0010 = Sense Key: Not Ready (0x2)" on a disk error.
// I built the iSCSI target on CentOS 7 using a dmsetup error mapping.

I'm not familiar with the smbd source code, so could you look into this problem? I think Samba is great software, and I'll help with fixing this problem!

kenjiuno
Here are the commands I used to create an ext2 image file and a block device that reports I/O errors.

# dd if=/dev/zero of=~/ext2 bs=60KiB count=1
# mkfs.ext2 ~/ext2
# losetup -f ~/ext2
# losetup -l
# mount /dev/loop0 /mnt/test/
# mkdir /mnt/test/folder1
# mkdir /mnt/test/folder1/folder2
# mkdir /mnt/test/folder1/folder2/folder3
# umount /mnt/test
# vi errormap
0 116 linear /dev/loop0 0
116 1 error
117 3 linear /dev/loop0 117
# dmsetup create baddisk < errormap
# mount /dev/mapper/baddisk /mnt/test/

[root@localhost ~]# ls /mnt/test/folder1/folder2/
ls: reading directory /mnt/test/folder1/folder2/: Input/output error

Sharing /mnt/test with smbd will be useful to verify the problem.
Can you get a debug level 10 log trace from smbd? That will allow us to see which error code we're getting back from the system call that fails when accessing the corrupt disk.
Created attachment 14755 [details] 2 logs: log-bad=smbdShouldEncountIoError, log-good=shouldnt
Thanks, I have attached 2 logs.

log-bad = smbclient asked smbd to access the directory with the I/O error.
log-good = smbclient asked smbd to access a normal directory.

From a quick look, the first significant difference starts at:

log-good 8034: smbd_dirptr_get_entry: dirptr 0x5655334faff0 now at offset 9223372036854775807
log-bad 8034: smbd_dirptr_get_entry: dirptr 0x5604de79ac60 now at offset -1

My commands:

# smbd -i -S --debuglevel=10 -s ~/smb.conf > log
# smbclient -c "cd /folder1/folder2/ ; ls" //192.168.2.51/test
From my quick research, the readdir failure (the errno variable) in vfswrap_readdir does not seem to be processed anywhere up the call chain.

---
At vfswrap_readdir
> result = readdir(dirp);
https://github.com/samba-team/samba/blob/390871602d244510c941f3c978d2f4371bb62bb7/source3/modules/vfs_default.c#L430
---
Referenced at vfs_readdirname
> ptr = SMB_VFS_READDIR(conn, (DIR *)p, sbuf);
https://github.com/samba-team/samba/blob/a4a85aca3242f56e94a487e7eb3e09684bd397da/source3/smbd/vfs.c#L753
---
Referenced at ReadDirName
> while ((n = vfs_readdirname(conn, dirp->dir, sbuf, &talloced))) {
https://github.com/samba-team/samba/blob/f21bc3addaafc857f0645378d4635d91c620c2f9/source3/smbd/dir.c#L1903
---
Referenced at dptr_normal_ReadDirName
> while ((name = ReadDirName(dptr->dir_hnd, poffset, pst, &talloced))
https://github.com/samba-team/samba/blob/f21bc3addaafc857f0645378d4635d91c620c2f9/source3/smbd/dir.c#L729
---
Referenced at dptr_ReadDirName
> name_temp = dptr_normal_ReadDirName(dptr, poffset, pst,
https://github.com/samba-team/samba/blob/f21bc3addaafc857f0645378d4635d91c620c2f9/source3/smbd/dir.c#L764
---
Referenced at smbd_dirptr_get_entry
> dname = dptr_ReadDirName(ctx, dirptr, &cur_offset, &sbuf);
https://github.com/samba-team/samba/blob/f21bc3addaafc857f0645378d4635d91c620c2f9/source3/smbd/dir.c#L1133
---
Referenced at get_dir_entry
> ok = smbd_dirptr_get_entry(ctx,
https://github.com/samba-team/samba/blob/f21bc3addaafc857f0645378d4635d91c620c2f9/source3/smbd/dir.c#L1361
---
Referenced at reply_search
> finished = !get_dir_entry(ctx,
https://github.com/samba-team/samba/blob/f03392a0aef9da195b1f9cb2442802d82e2dcb55/source3/smbd/reply.c#L1942
For later reference: readdir fails with errno EIO.

```
[root@localhost ~]# ./a.out /mnt/test/folder1/folder2/
opendir 0xc3e010 0
readdir (nil) 5
```

#define EIO 5 /* I/O error */

> EIO
> Input/output error (POSIX.1)

https://linux.die.net/man/3/errno
Yes, vfswrap_readdir() won't process errno; that should be done at the calling layer when readdir() returns NULL.

Ah, I see the problem. readdir() is a horrid interface in that end-of-directory and error are both returned as NULL, and the caller must distinguish between them. That means errno must be set to 0 *before* calling readdir() and then checked on return.

I'll see what I can do.
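
For illustration, the errno handling described above could look like the following standalone sketch. It is not Samba code: scan_dir() is a hypothetical helper that only uses the POSIX opendir()/readdir()/closedir() calls, and it shows one way to tell end-of-directory apart from a failure such as EIO.

```c
#include <dirent.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of Samba): count the entries in `path`.
 * Returns 0 if the end of the directory was reached normally, or the
 * errno value of the failing opendir()/readdir() call otherwise.
 */
int scan_dir(const char *path, size_t *count)
{
    DIR *dirp;
    int err = 0;

    *count = 0;

    dirp = opendir(path);
    if (dirp == NULL) {
        return errno;
    }

    for (;;) {
        struct dirent *de;

        errno = 0;              /* clear before every readdir() call */
        de = readdir(dirp);
        if (de == NULL) {
            err = errno;        /* 0: end of directory, non-zero: error */
            break;
        }
        (*count)++;
    }

    closedir(dirp);
    return err;
}
```

On the broken directory from the reproduction steps, a call such as scan_dir("/mnt/test/folder1/folder2/", &n) would be expected to return EIO rather than 0, which is the information that currently gets lost on the way up to the SMB reply.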
Created attachment 15337 [details] 0001-readdir-can-return-NT_STATUS_FILE_CORRUPT_ERROR.patch
I have posted a rough patch. It can reply STATUS_FILE_CORRUPT_ERROR to smbclient when readdir() fails with EIO.

```
smb: \> ls
  .                                   D        0  Tue Jul 30 15:41:40 2019
  ..                                  D        0  Tue Jul 30 14:40:23 2019
  file1                               N        6  Tue Jul 30 15:31:37 2019
  folder2                             D        0  Tue Jul 30 15:56:30 2019
  file2                               N        6  Tue Jul 30 15:31:40 2019
  folder3                             D        0  Tue Jul 30 15:56:36 2019
  file3                               N        6  Tue Jul 30 15:31:42 2019
  folder1                             D        0  Tue Jul 30 15:56:28 2019

                8887 blocks of size 1024. 7991 blocks available
smb: \> ls
NT_STATUS_FILE_CORRUPT_ERROR listing \*
```
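
The attached patch is not reproduced here. As a rough, standalone sketch of the idea only, the errno saved after a failed readdir() could be translated into a status value as shown below. The function name and the SKETCH_ constants are hypothetical and are not Samba identifiers; the numeric values are the NTSTATUS codes for STATUS_SUCCESS, STATUS_UNSUCCESSFUL and STATUS_FILE_CORRUPT_ERROR, and the fallback for other errno values is an assumption made for this example.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical names; values are the corresponding NTSTATUS codes. */
#define SKETCH_STATUS_SUCCESS            0x00000000u
#define SKETCH_STATUS_UNSUCCESSFUL       0xC0000001u
#define SKETCH_STATUS_FILE_CORRUPT_ERROR 0xC0000102u

/*
 * Hypothetical mapping (not the code from the patch): translate the
 * errno observed after readdir() returned NULL into a status value.
 */
uint32_t readdir_errno_to_status(int saved_errno)
{
    switch (saved_errno) {
    case 0:
        /* readdir() returned NULL with errno untouched: end of directory */
        return SKETCH_STATUS_SUCCESS;
    case EIO:
        /* the case discussed in this report */
        return SKETCH_STATUS_FILE_CORRUPT_ERROR;
    default:
        /* assumption: any other failure is reported generically */
        return SKETCH_STATUS_UNSUCCESSFUL;
    }
}
```

Keeping the translation in one place like this means the layers between readdir() and the SMB reply only have to pass the saved errno (or the resulting status) upward instead of collapsing it into a NULL or false return.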