In tdb_transaction_commit() the DB will occasionally repack.
To do this, after the transaction locks are released to other callers, a new tdb_transaction is opened, used and attempted to be closed.
This fails if the lock order is violated, eg:
ltdb: tdb(../private/sam.ldb.d/DC=SAMBA,DC=EXAMPLE,DC=COM.ldb): tdb_transaction_prepare_commit: failed to upgrade hash locks: Locking error
ldb: ltdb: tdb(../private/sam.ldb.d/DC=SAMBA,DC=EXAMPLE,DC=COM.ldb): ../../lib/tdb/common/tdb.c:1120 Failed to commit
Created attachment 15160 [details]
First iteration patch under test in the merge request
Can you take a very careful look at this.
I've not dared trying to rework TDB to do this without giving up the transaction locks, but that would also be a reasonable approach.
This original code was added by tridge as:
Author: Andrew Tridgell <email@example.com>
Date: Mon Jun 1 13:11:39 2009 +1000
auto-repack in transactions that expand the tdb
The idea behind this is to recover from badly fragmented free
lists. Choosing the point where the file expands is fairly arbitrary,
but seems to work well.
so it's been there for a long time. I see why he placed it where it is, but there's no reason it *has* to be in the exit path of
tdb_transaction_commit(), after the transaction has been committed successfully.
Initially, your patch certainly looks like it will do the right thing, but I'd like to discuss more first, to see if there are other options we might look at.
Created attachment 15162 [details]
Slightly modified version
So I'd be slightly happier with the following, as it only fails the transaction if tdb_repack() returns -1 and tdb->ecode != TDB_ERR_LOCK, which I think might be safer. Otherwise we might miss an error in write inside the repack code.
Also adds your patch #2 to make a complete replacement.
What do you think ? Happy to discuss on the phone.
Another thing we might think about is adding in a tdb_repack() on the ldb tdb databases as a periodic (60 seconds?) timer in the main samba ldap server code, outside of any transactions (possibly from the main daemon). Again, ignore any errors except for lock errors.
ctdb_vacuum.c already does this (not on a timer).
Sorry, meant to add:
ctdb_vacuum.c already does this (not on a timer) but on child connection after fork.
FYI, I also looked at moving the tdb_repack() call to just before the call to _tdb_transaction_cancel() - which might also be a solution without having to check for lock order violations.
However we'd then also need to check that the flag TDB_DISALLOW_NESTING isn't set for the tdb, otherwise the nested transaction request inside tdb_repack() would always fail.
That would mean that any tdb opened with TDB_DISALLOW_NESTING doesn't get implicit repack in tdb_transaction_commit() calls. This seems to be heavily used inside ctdb, so we'd need to consult with the ctdb folks before landing that alternate change.
Your patch (or my variation thereof) looks safer IMHO.
Yeah, LDB sets TDB_DISALLOW_NESTING because it does it's own (even dodgier) nesting. It would have to be a special modification of the code that rebuilds the memory/disk side of the transaction but not the locking side.
I'm also quite concerned however that we should NEVER return an error from tdb_transaction_commit() after successfully writing data, as that would mislead the application and client, which needs to know what is in the DB.
I'll give you a call.
(In reply to Jeremy Allison from comment #6)
Thanks for taking the phone call. There we agreed that TDB should never return tdb_repack() errors on the basis that the transaction that was successfully written must return success to the callers, even if the repack went very badly.
You asked that I try to make repack compulsory to determine if Samba can ever repack. The answer is yes, we can perform some operations, so repack is happening except when we loose the race.
I'll therefore propose my patch. It may still cause lock order issues elsewhere but it should be rarer.
We can later determine if it is possible to do the repack while still holding the lock, after locking away the user's original work.