Attempting to create a symlink from a Mac OS X 10.10.5 client fails with an Input/output error when the share is enabled for AIO. Samba version is 4.2.4. $ ln -s foo.txt symlink3.txt ln: symlink3.txt: Input/output error The content of the file after the error message appears to be an "mfsymlink" with the correct contents but the client treats them as files rather than symlinks. When the directory is listed (ls -l, for instance) all of the mfsymlinks in the share appear as text files in the share with AIO enabled. When the directory is listed with AIO disabled all of the mfsymlinks that were created look like symlinks regardless of if there was an error on create. From the server, the resulting files, either created successfully or with an error appear to have the same contents: # md5sum symlink*txt cdc02dd0400e70ba5126c5c8ea45e137 symlink.txt cdc02dd0400e70ba5126c5c8ea45e137 symlink2.txt cdc02dd0400e70ba5126c5c8ea45e137 symlink3.txt cdc02dd0400e70ba5126c5c8ea45e137 symlink4.txt cdc02dd0400e70ba5126c5c8ea45e137 symlink5.txt cdc02dd0400e70ba5126c5c8ea45e137 symlink6.txt cdc02dd0400e70ba5126c5c8ea45e137 symlink7.txt [/srv/symlink7.txt] 00000000 58 53 79 6d 0a 30 30 30 37 0a 34 66 64 38 63 63 |XSym.0007.4fd8cc| 00000010 38 35 63 61 39 65 65 62 64 32 66 61 33 63 35 35 |85ca9eebd2fa3c55| 00000020 30 30 36 39 63 65 32 38 34 36 0a 66 6f 6f 2e 74 |0069ce2846.foo.t| 00000030 78 74 0a 20 20 20 20 20 20 20 20 20 20 20 20 20 |xt. | 00000040 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 | | * 00000420 20 20 20 20 20 20 20 20 20 20 20 | | 0000042b I am attaching smbd logs and tcpdump packet captures for the create error to this bug. File system is XFS on CentOS 7. I have pasted below what I think may be the relevant section of the log from the error case. This corresponds to a STATUS_INTERNAL_ERROR appearing in the related tcpdump: [2015/11/09 16:19:31.824661, 10, pid=6794, effective(401113, 400513), real(401113, 0)] ../source3/smbd/aio.c:962(aio_pwrite_smb2_done) pwrite_recv returned 1067, err = no error [2015/11/09 16:19:31.824721, 3, pid=6794, effective(401113, 400513), real(401113, 0)] ../source3/smbd/aio.c:966(aio_pwrite_smb2_done) aio_pwrite_smb2_done: request cancelled (mid[2338]) [2015/11/09 16:19:31.824790, 10, pid=6794, effective(401113, 400513), real(401113, 0)] ../source3/smbd/smb2_server.c:2706(smbd_smb2_request_error_ex) smbd_smb2_request_error_ex: idx[9] status[NT_STATUS_INTERNAL_ERROR] || at ../source3/smbd/smb2_write.c:135 [2015/11/09 16:19:31.824871, 10, pid=6794, effective(401113, 400513), real(401113, 0)] ../source3/smbd/smb2_server.c:2597(smbd_smb2_request_done_ex) smbd_smb2_request_done_ex: idx[9] status[NT_STATUS_INTERNAL_ERROR] body[8] dyn[yes:1] at ../source3/smbd/smb2_server.c:2754 We have tested with aio read/write size as both 1 and 1024. A summarized/filtered smb.conf: [global] workgroup = FOREST1 realm = FOREST1.EXAMPLE.NET security = ADS syslog = 0 debug pid = Yes min receivefile size = 131072 load printers = No printcap name = /dev/null disable spoolss = Yes show add printer wizard = No template shell = /sbin/nologin idmap config qa2k8: read only = no idmap config qa2k8: backend = rid idmap config qa2k8: range = 8000100-16000100 idmap config *: read only = yes idmap config *: range = 200000-210000 idmap config forest1: read only = no idmap config forest1: backend = rid idmap config forest1: range = 400000-8000000 idmap config * : backend = tdb force unknown acl user = Yes [symtest_aio] path = /srv read only = No inherit permissions = Yes inherit acls = Yes aio read size = 1 aio write size = 1 map acl inherit = Yes hide unreadable = Yes veto files = /.snapshot/ store dos attributes = Yes delete readonly = Yes dos filemode = Yes vfs objects = acl_xattr [symtest_noaio] path = /srv read only = No inherit permissions = Yes inherit acls = Yes map acl inherit = Yes hide unreadable = Yes veto files = /.snapshot/ store dos attributes = Yes delete readonly = Yes dos filemode = Yes vfs objects = acl_xattr My search uncovered Samba bug 11224, which seems like it could be related, however I'm using Samba 4.2.4 which should already include the fix for that bug. Please let me know if there is any other information I can provide.
Tarball was too large to attach, please access it here. http://asynchrono.us/files/symlink-aio-issue.tar.xz Let me know if there are any issues getting to it.
I'll take a look at the wireshark traces. That error comes from doing a cancel whilst the IO is still outstanding (which is strange).
Oh, looks like a chaining issue. Client is doing SMB2 Create + GetInfo + Write + Close and it looks like the file id isn't set inside the write when it gets scheduled...
Ah - looking at : https://bugzilla.samba.org/show_bug.cgi?id=11224 looks to me like we're deliberately refusing this request as we want the write to go async and it's not the last in the chain. Is this allowable behavior from the Mac client w.r.t. the SMB2 spec ? I don't think so (i.e. this looks like a Mac client SMB2 bug to me) but I'll check.
OK, pretty sure this is a Mac client bug but I'll know more tomorrow. There are some mitigation strategies we can do but I'll update tomorrow. Jeremy.
Right - went through the spec very carefully. This is a Mac client bug. They send (from your packet trace): 4173 SMB2 Create Request File: symlink5.txt;GetInfo Request FILE_INFO/SMB2_FILE_INTERNAL_INFO;Write Request Len:1067 Off:0;Close Request Now in MS-SMB2.pdf we have: ------------------------------------------------------------------------------------------------ 3.3.5.2.7 Handling Compounded Requests If the NextCommand field in the SMB2 header of the request is not equal to 0, the server MUST process the received request as a compounded series of requests. The server SHOULD<206> fail requests in a compound chain which require asynchronous processing. and note <206> is: <206> Section 3.3.5.2.7: When an operation in a compound request requires asynchronous processing, Windows servers fail them with STATUS_INTERNAL_ERROR except for the following two cases: when a create request in the compound request triggers an oplock break, or when the operation is last in the compound request. If a create request in a compound chain is processed asynchronously due to an oplock break, Windows servers send only one interim response to the client, which includes the compounded results of all prior operations in the request. If other create requests in the compound chain are completed before the oplock break, the level of the broken oplock is not updated in those create responses in the compound response. If the operation is last in the compound request, a single interim response is sent, which includes the compounded results of all prior operations in the request. Processing of the final operation in the request then proceeds asynchronously. ------------------------------------------------------------------------------------------------ When you set "aio write size = 1" in your smb.conf it means that *ALL* SMB2 write calls will be forced to use asynchronous processing. So the client should be able to handle this by resending the SMB2 write call in a non-compound request - and it looks like against a Windows server that write call doesn't go async, so they never see this problem. The only immediate fix is to set: "aio write size = X" where X is the max length of a possible symlink that the Mac can set. That should solve the issue in production. In the meantime I'll raise this with Apple. The only other fix we could do it explicitly refuse to go async in a write call that isn't the end of a chain, but that might cause other problems.
(In reply to Jeremy Allison from comment #6) aio write size = 1068 should work around this. Can someone test this?
(In reply to Stefan Metzmacher from comment #7) Sorry for the delay. Yes the workaround of setting 'aio write size = 1068' does appear to work for the basic tests we performed.
Now if only we could get Apple to stop their client sending illegal requests :-(.
Sorry guys, this bugreport slipped my attention. I could have saved you some valuable time, as I ran across this issue last year. This is a known OS X client issue that I have already reported to Apple (via sidechannels). I haven't filed a proper Apple radar for this one yet. This issue only affects MF-symlinks, which is just a fallback from a OS X SMB perspective. A server side fix would be to implement symlinks via reparse points which is the preferred method of OS X SMB client. This is also what Apple recommended in private conversation.
Ralph - Apple is violating the protocol spec here. Yes, they're getting away with it *for now* on Windows, but that could change at any time. The Apple client needs to change here.
(In reply to Jeremy Allison from comment #11) Sure. Maybe Apple fixed this in a release lately, haven't tested since then. Anyone who is affected may either file a radar with Apple and hope for the best, or help us to implement symlinks via reparse points. :)
Was catching up on the mailing list and it looks like fix for bug 13047 might resolve this issue. If I'm correct should this bug be marked as fixed or duplicate?
Yep, duplicate. I *thought* we already had a bug for this but couldn't find it :-). *** This bug has been marked as a duplicate of bug 13047 ***