Bug 6693 - wrong logic in "Failed to read all inotify data" - crash with large directories
wrong logic in "Failed to read all inotify data" - crash with large directories
Status: RESOLVED FIXED
Product: Samba 3.3
Classification: Unclassified
Component: File services
unspecified
Other Linux
: P3 normal
: ---
Assigned To: Karolin Seeger
Samba QA Contact
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2009-09-05 07:40 UTC by Michael Tokarev
Modified: 2009-09-08 02:46 UTC (History)
1 user (show)

See Also:
vl: review+


Attachments
Fix for reading the full buffer from inotify (1.50 KB, patch)
2009-09-05 11:15 UTC, Simo Sorce
no flags Details
Fix for reading the full buffer from inotify (1.59 KB, patch)
2009-09-05 11:43 UTC, Simo Sorce
vl: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Tokarev 2009-09-05 07:40:06 UTC
in inotify code (source/smbd/notify_inotify.c) there's this code fragment (simplified):

        ioctl(in->fd, FIONREAD, &bufsize);
        e0 = (struct inotify_event *)TALLOC_SIZE(in, bufsize);

        if (sys_read(in->fd, e0, bufsize) != bufsize) {
                DEBUG(0,("Failed to read all inotify data\n"));
                talloc_free(e0);
                return;
        }

This code assumes that read(2) returns all available data in single step.  There's no such guarantee in POSIX, and it seems that at least with linux kernel 2.6.31-rc8 this assumption in this very place is not true anymore.

In my case FIONREAD returns more than read actually reads (read returned some value rounded to 4096, but FIONREAD had several bytes more).  This results in the above message printed in the log, which is harmless by its own.

But the thing is that we now out of sync with the structures coming from kernel, so the next read on this inotify fd will start from some middle of the structure.  And smbd happily crash here tryin to interpret the garbage.

[2009/09/05 14:16:09,  0] smbd/notify_inotify.c:inotify_handler(247)
  Failed to read all inotify data
*** glibc detected *** /usr/sbin/smbd: free(): invalid next size (fast): 0x08636618 ***
======= Backtrace: =========
/lib/i686/cmov/libc.so.6[0xf7d91624]
/lib/i686/cmov/libc.so.6(cfree+0x96)[0xf7d93826]
/usr/lib/libtalloc.so.1(talloc_free+0x153)[0xf7e8c443]
/usr/sbin/smbd[0x821661e]
/usr/sbin/smbd(run_events+0x1c2)[0x81f3947]
/usr/sbin/smbd[0x80d781c]
/usr/sbin/smbd(smbd_process+0xf8)[0x80d9ed0]
/usr/sbin/smbd(main+0xfa2)[0x80a1d1f]
/lib/i686/cmov/libc.so.6(__libc_start_main+0xe5)[0xf7d39455]
/usr/sbin/smbd[0x809edb1]
...

This happens with several 3.3.x and 3.4.0 versions (I didn't try 3.2.x).
Reverting to kernel 2.6.30 fixes it again.

But note that I'd not say it's a kernel issue - the whole logic is wrong in samba.

See also http://forums.gentoo.org/viewtopic-t-790838.html with exactly the same problem.
Comment 1 Kai Blin 2009-09-05 08:47:09 UTC
This also applies to 3.4, even though the read has been replaced by a sys_read, that only fixes the EINTR handling, this bug still persists.
Comment 2 Simo Sorce 2009-09-05 11:15:02 UTC
Created attachment 4649 [details]
Fix for reading the full buffer from inotify

This patch should ensure that the full buffer is read even if the kernel returns a short read.
We also kill the inotify descriptor if something bad happens so we do not read garbage going forward.
Comment 3 Simo Sorce 2009-09-05 11:43:27 UTC
Created attachment 4650 [details]
Fix for reading the full buffer from inotify

Adds nt_errstr(status) output to the debug message and casts e2 to (char *) when calling read_data() to silence a warning.
Comment 4 Simo Sorce 2009-09-05 11:45:09 UTC
Volker,
please review.

Karolin,
Volker already acked on samba-technical@, if this new revision is fine it should be committed to 3.4.1 as well as it fixes potential segfaults and in general crazed out inotify events.
Comment 5 Jeremy Allison 2009-09-06 18:42:56 UTC
Re-assigning to Karolin for inclusion in 3.4.1 (and the v3-3-test tree, should go into there cleanly with source3 -> source).
Jeremy.
Comment 6 Michael Tokarev 2009-09-07 11:20:57 UTC
It looks like the real bug was actually in kernel, not in samba.  In kernel 2.6.31-rc8, fixed in 2.6.31-rc9.

<a href="http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=83cb10f0ef3c96162be92339ccf8c0c9c9f2d13e">commit 83cb10f0ef3c96162be92339ccf8c0c9c9f2d13e</a>:
 0db501bd0610ee0c0 introduced a regresion in that it now sends a nul
 terminator but the length accounting when checking for space or
 reporting to userspace did not take this into account.

So no way for samba to fix that by trying to read "full data".
Yet I think it's good this is addressed in samba: original logic was somewhat more fragile than necessary.
Comment 7 Karolin Seeger 2009-09-08 02:46:02 UTC
Pushed to v3-4-test and v3-3-test.
Closing out bug report.

Thanks!