7687 – tdb does not scale on zfs

Status:	RESOLVED WONTFIX

Alias:	None

Product:	TDB
Classification:	Unclassified
Component:	libtdb
Version:	unspecified
Hardware:	x86 Solaris

Importance:	P3 major
Target Milestone:	---
Assignee:	Samba QA Contact
QA Contact:	Samba QA Contact

Reported:	2010-09-17 04:39 UTC by Arne Jansen
Modified:	2010-09-18 14:07 UTC
CC List:	1 user

Description Arne Jansen 2010-09-17 04:39:31 UTC

If connections.tdb is hosted on a ZFS filesystem, the system slows down massively once it reaches a few hundred smbd processes. The reason is that the mmap implementation for ZFS-based files is still fundamentally broken with respect to performance.
Workaround: set "use mmap = no" in smb.conf, or host connections.tdb in /tmp or on UFS.
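
For context on what the workaround changes: as far as I understand, with "use mmap = no" tdb stops accessing the file through a shared memory mapping and falls back to ordinary read/write system calls. The sketch below contrasts the two access paths in plain POSIX C. It is not tdb code; read_record and demo_roundtrip are hypothetical names chosen for illustration.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for pread() and mkstemp() under strict modes */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of libtdb: copy len bytes starting at
 * offset off from an open file into buf. With use_mmap != 0 the file is
 * accessed through a shared mapping, otherwise through pread().
 * Returns 0 on success, -1 on error. */
int read_record(int fd, off_t off, void *buf, size_t len, int use_mmap)
{
    assert(buf != NULL);

    if (use_mmap) {
        struct stat st;
        void *map;

        if (fstat(fd, &st) != 0)
            return -1;
        /* Reject ranges that fall outside the file. */
        if (off < 0 || off > st.st_size || len > (size_t)(st.st_size - off))
            return -1;
        map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (map == MAP_FAILED)
            return -1;
        memcpy(buf, (const char *)map + off, len);
        munmap(map, (size_t)st.st_size);
        return 0;
    }

    /* No mapping: one system call per access. */
    return pread(fd, buf, len, off) == (ssize_t)len ? 0 : -1;
}

/* Write a small payload to a temporary file and read it back through
 * both paths. Returns 0 if both return the bytes that were written. */
int demo_roundtrip(void)
{
    char path[] = "/tmp/mmap_demo_XXXXXX";
    const char payload[] = "connections";
    char a[sizeof(payload)], b[sizeof(payload)];
    int rc = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);   /* the file disappears once fd is closed */

    if (write(fd, payload, sizeof(payload)) == (ssize_t)sizeof(payload) &&
        read_record(fd, 0, a, sizeof(a), 1) == 0 &&
        read_record(fd, 0, b, sizeof(b), 0) == 0 &&
        memcmp(a, payload, sizeof(a)) == 0 &&
        memcmp(b, payload, sizeof(b)) == 0)
        rc = 0;

    close(fd);
    return rc;
}
```

Both paths return the same data; the difference is only in how the kernel serves the access, which is where the filesystem's mmap implementation comes into play.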

Comment 1 Volker Lendecke 2010-09-17 12:30:31 UTC

Can you give us some hints on how Samba can detect that it's running into that situation? This will btw be valid for all tdb files, not just connections.tdb.

I'm inclined to reject this as invalid; from user space I don't know a sane way to detect this. The only thing that we might want to do is to make the "use mmap" option a per-tdb thing.

Volker

Comment 2 Arne Jansen 2010-09-17 12:57:30 UTC

Volker,

thanks for your reply. Of course it's valid for all tdbs, but with connections.tdb it is very prominent. The main reason I opened this bug is that I don't want anyone else to run into this situation, because it took a very long time to pin down. Setting "use mmap = no" is IMHO a valid solution, but I don't know how it can best be communicated to the user, so an autodetection might be good. A simple way to detect which filesystem a file is hosted on is to issue an fstatvfs call on the fd. On zfs, the f_basetype is "zfs".

Arne
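
A minimal sketch of the check described above, assuming the Solaris layout of struct statvfs. The f_basetype member is only used when building on Solaris (guarded by __sun); on other platforms the helper reports "unknown". fd_is_on_zfs and path_is_on_zfs are hypothetical names for illustration, not Samba or tdb functions.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/statvfs.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd lives on ZFS, 0 if it does not,
 * and -1 if that cannot be determined (fstatvfs failed, or the platform
 * has no f_basetype member). */
int fd_is_on_zfs(int fd)
{
    struct statvfs sv;

    if (fstatvfs(fd, &sv) != 0)
        return -1;
#if defined(__sun)
    /* Solaris-specific: f_basetype holds the filesystem type name. */
    return strncmp(sv.f_basetype, "zfs", sizeof(sv.f_basetype)) == 0 ? 1 : 0;
#else
    (void)sv;   /* no portable field carries the filesystem type name */
    return -1;
#endif
}

/* Convenience wrapper: open path read-only, run the check, close again. */
int path_is_on_zfs(const char *path)
{
    int fd, result;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    result = fd_is_on_zfs(fd);
    close(fd);
    return result;
}
```

A caller could treat a return value of 1 as a hint to open the database without mmap, and treat both 0 and -1 as "leave the configured behaviour alone".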

Comment 3 Volker Lendecke 2010-09-18 14:07:33 UTC

f_basetype seems to be non-standard; it appears to be a Solaris thing. Moreover, fstatvfs, at least on Linux, can be *VERY* expensive: I have seen calls that walk /etc/mtab and touch all directories mentioned there.

I'm closing this as WONTFIX. We do have a workaround, but Oracle should much rather go fix ZFS.

Nevertheless, thanks for pointing this out.

Volker