Bug 7067 - Linux asynchronous IO (aio) can cause smbd to fail to respond to a read or write.
Product: Samba 3.5
Classification: Unclassified
Component: File services
Hardware: All Linux
Importance: P3 major
Target Milestone: ---
Assigned To: Karolin Seeger
QA Contact: Samba QA Contact
Reported: 2010-01-26 18:48 UTC by Jeremy Allison
Modified: 2010-03-08 18:48 UTC (History)

git-am format patch for 3.5.0. (4.78 KB, patch)
2010-01-28 16:26 UTC, Jeremy Allison
jra: review? (metze)
vl: review+
git-am format patch for 3.4.6. (4.78 KB, patch)
2010-01-28 16:47 UTC, Jeremy Allison
no flags
git-am format patch for 3.3.11. (4.83 KB, patch)
2010-01-28 16:57 UTC, Jeremy Allison
no flags

Description Jeremy Allison 2010-01-26 18:48:57 UTC
On Linux, aio is implemented via threads within glibc that signal completion by sending a POSIX realtime signal to the main thread. If an aio thread takes too long to complete, and smbd switches from a non-zero userid (which initiated the IO) to root (in order to do housekeeping tasks), then the aio thread's attempt to send the completion signal to the main thread can fail with a permission denied error.

smbd uses setresuid() to change from a non-zero uid to root, or from root to a non-zero uid. There is a race condition inside the glibc setresuid() implementation that can leave the main thread running as uid root whilst the aio thread is still a non-zero uid. (Threads under Linux can have different uids, but glibc must present a POSIX thread interface, which specifies that all threads in a process must be under the same uid/gid credentials.) If the aio completes during this race window and the aio thread tries to signal the main thread, then the signal is lost and the sending thread gets an EPERM response.

smbd then has "lost" the aio reply to the scheduled read or write, and can never reply to the SMB request.
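
To make the EPERM case above concrete, here is a minimal sketch in C of the permission rule documented in kill(2), applied to the race window. It is a model only, not kernel or glibc code: `struct thread_creds`, `may_signal()` and `completion_delivered_in_race_window()` are hypothetical names, capabilities are reduced to "euid 0 is privileged", and it assumes the main thread is already root in all three uid slots while the aio thread still carries the user's uid.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical snapshot of one thread's credentials (not a kernel or glibc type). */
struct thread_creds {
    uid_t ruid;   /* real uid */
    uid_t euid;   /* effective uid */
    uid_t suid;   /* saved set-user-ID */
};

/*
 * Simplified form of the rule in kill(2): an unprivileged sender may signal
 * a target only if the sender's real or effective uid equals the target's
 * real or saved uid. Assumption: "euid 0" stands in for CAP_KILL here.
 */
bool may_signal(const struct thread_creds *sender,
                const struct thread_creds *target)
{
    if (sender->euid == 0)
        return true;
    return sender->ruid == target->ruid || sender->ruid == target->suid ||
           sender->euid == target->ruid || sender->euid == target->suid;
}

/*
 * Models the race window: the main thread has already switched to root,
 * but the aio helper thread still runs as the user.
 * Returns true if the completion signal would be permitted.
 */
bool completion_delivered_in_race_window(uid_t user_uid)
{
    struct thread_creds main_thread = { 0, 0, 0 };
    struct thread_creds aio_thread  = { user_uid, user_uid, user_uid };
    return may_signal(&aio_thread, &main_thread);
}
```

Under these assumptions, an aio thread still running as a non-zero uid is refused when it signals a main thread that has become root, which matches the lost completion described above.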
Comment 1 Jeremy Allison 2010-01-26 18:51:51 UTC
See tridge's junkcode aio_uid.c program, which demonstrates the problem.

Comment 2 Jeremy Allison 2010-01-28 16:26:59 UTC
Created attachment 5242 [details]
git-am format patch for 3.5.0.

This has gone into master. Metze & Volker please review for 3.5.0.
Comment 3 Jeremy Allison 2010-01-28 16:47:06 UTC
Created attachment 5243 [details]
git-am format patch for 3.4.6.
Comment 4 Jeremy Allison 2010-01-28 16:57:11 UTC
Created attachment 5244 [details]
git-am format patch for 3.3.11.
Comment 5 Volker Lendecke 2010-02-03 00:57:24 UTC
Comment on attachment 5242 [details]
git-am format patch for 3.5.0.

It won't do any harm (I think) if it does not work right; I don't think we kill other processes out of the blue with this.

Another question around this patch: Can we get rid of the become_root calls in message_notify() with this patch in?

Comment 6 Jeremy Allison 2010-02-03 14:00:35 UTC
Only for Linux I think (get rid of the become_root() calls). So for portability I'd leave them. Also that's *way* more of a change than I'm prepared to make right now :-).

Comment 7 Jeremy Allison 2010-02-03 14:25:25 UTC
Re-assigning to Karolin for inclusion in 3.5.0, 3.4.6 and 3.3.11.

Comment 8 Karolin Seeger 2010-02-04 03:08:53 UTC
Pushed to all branches.
Closing out bug report.

Comment 9 Jeremy Allison 2010-03-08 18:48:46 UTC
Just as a postmortem comment.

This bug actually was already fixed by commits in master, which changed us to use setreuid() in preference to setresuid(), and added become_root()/unbecome_root() pairs around the aio_read()/aio_write() calls.

So this fix was unnecessary and simply caused a security hole. Damn :-(.
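
For readers unfamiliar with the bracketing pattern mentioned above (switching credentials around the aio submission and switching back afterwards), a rough sketch in C follows. This is not Samba's become_root()/unbecome_root() code, which is not reproduced here; `submit_with_euid()`, `submit_fn` and `record_euid()` are hypothetical names, and plain seteuid() is swapped in as a stand-in for Samba's own credential switching.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical callback type: performs the I/O submission, returns 0 on success. */
typedef int (*submit_fn)(void *arg);

/*
 * Hypothetical helper: run submit() with the effective uid set to run_as,
 * then restore the previous effective uid. Returns submit()'s result, or -1
 * with errno set if either credential switch fails.
 */
int submit_with_euid(uid_t run_as, submit_fn submit, void *arg)
{
    uid_t saved = geteuid();
    int ret, saved_errno;

    if (seteuid(run_as) != 0)
        return -1;

    ret = submit(arg);          /* e.g. a function that calls aio_read() */
    saved_errno = errno;

    if (seteuid(saved) != 0)
        return -1;              /* could not switch back: treat as fatal */

    errno = saved_errno;        /* report submit()'s errno, not seteuid()'s */
    return ret;
}

/* Example callback: records the effective uid active during submission. */
int record_euid(void *arg)
{
    *(uid_t *)arg = geteuid();
    return 0;
}
```

A caller running with root privileges could pass 0 as `run_as` and a callback that issues the aio request; the intent is that the submission happens under root credentials and the previous effective uid is restored before returning.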