Created attachment 12253 [details] Level 10 Log On a Samba 4.4.4 ctdb cluster the following messages where (repeatedly) found in the log: [2016/06/30 08:45:02.645180, 1] ../librpc/ndr/ndr.c:578(ndr_pull_error) ndr_pull_error(11): Pull bytes 8 (../librpc/ndr/ndr_basic.c:236) [2016/06/30 08:45:02.645213, 1] ../source3/locking/share_mode_lock.c:312(parse_share_modes) ndr_pull_share_mode_lock failed: Buffer Size Error Looking closer this is coming from ctdb tombstone records in the node local ltdb. I was able to reproduce this yesterday on my lxc based devel cluster, for some reason however since doing a git bisect I can't anymore. I see two ways to address this: o add yet another check for tombstone records to a dbwrap caller, or o fix dbwrap In dbwrap we already check for tombstone records in at least two places: o when fetching records from remote ctdb nodes via ctdbd_parse() o in db_ctdb_traverse() Just db_ctdb_ltdb_parser() lacks a check for tombstone records.
Created attachment 12254 [details] Possible patch for master
One note: the crash at the end of the log stems from adding a smb_panic() for debugging.
Created attachment 12255 [details] Full SBT
Oh, and I can reproduce it, I was just looking at the wrong place. Looks like this is present since a looooong time, still bisecting. Wonder why this wasn't noticed before...
Oh, I missed this: the patch from bug 10008 already had this fix in, 1cae59ce112ccb51b45357a52b902f80fce1eef1, but it had to be reverted in 925625b52886d40b50fc631bad8bdc81970f7598. I *think* my patch might be a proper fix without the risk of a deadlock, because it *won't* call out to ctdb but return ENOENT (im terms of NTSTATUS).
Created attachment 12345 [details] Patch for 4.5 cherry-picked from master
Created attachment 12346 [details] Patch for 4.3 and 4.4 backported from master
Comment on attachment 12254 [details] Possible patch for master >From c26c3c354db42a688551d46d8583352a5c7c9ad7 Mon Sep 17 00:00:00 2001 >From: Ralph Boehme <slow@samba.org> >Date: Fri, 1 Jul 2016 17:45:53 +0200 >Subject: [PATCH 1/2] WIP: dbwrap_ctdb: move struct db_ctdb_parse_record_state > >Will be needed in the next commit. No change in behavour. > >Signed-off-by: Ralph Boehme <slow@samba.org> >--- > source3/lib/dbwrap/dbwrap_ctdb.c | 16 ++++++++-------- > 1 file changed, 8 insertions(+), 8 deletions(-) > >diff --git a/source3/lib/dbwrap/dbwrap_ctdb.c b/source3/lib/dbwrap/dbwrap_ctdb.c >index df5a34f..c557902 100644 >--- a/source3/lib/dbwrap/dbwrap_ctdb.c >+++ b/source3/lib/dbwrap/dbwrap_ctdb.c >@@ -99,6 +99,14 @@ static int db_ctdb_ltdb_parser(TDB_DATA key, TDB_DATA data, > return 0; > } > >+struct db_ctdb_parse_record_state { >+ void (*parser)(TDB_DATA key, TDB_DATA data, void *private_data); >+ void *private_data; >+ uint32_t my_vnn; >+ bool ask_for_readonly_copy; >+ bool done; >+}; >+ > static NTSTATUS db_ctdb_ltdb_parse( > struct db_ctdb_ctx *db, TDB_DATA key, > void (*parser)(TDB_DATA key, struct ctdb_ltdb_header *header, >@@ -1215,14 +1223,6 @@ static struct db_record *db_ctdb_try_fetch_locked(struct db_context *db, > return fetch_locked_internal(ctx, mem_ctx, key, true); > } > >-struct db_ctdb_parse_record_state { >- void (*parser)(TDB_DATA key, TDB_DATA data, void *private_data); >- void *private_data; >- uint32_t my_vnn; >- bool ask_for_readonly_copy; >- bool done; >-}; >- > static void db_ctdb_parse_record_parser( > TDB_DATA key, struct ctdb_ltdb_header *header, > TDB_DATA data, void *private_data) >-- >2.5.0 > > >From 5ddd6e537beffe3fef855d81a0c342a8feda235b Mon Sep 17 00:00:00 2001 >From: Ralph Boehme <slow@samba.org> >Date: Fri, 1 Jul 2016 17:21:04 +0200 >Subject: [PATCH 2/2] WIP: dbwrap_ctdb: check for ltdb tombstone records > >When fetching records from remote ctdb nodes via ctdbd_parse() or in >db_ctdb_traverse(), we already checks for tombstone records and skip >them. Add the same check to db_ctdb_ltdb_parser() when fetching records >from the ltbd. > >Signed-off-by: Ralph Boehme <slow@samba.org> >--- > source3/lib/dbwrap/dbwrap_ctdb.c | 22 ++++++++++++++++++++++ > 1 file changed, 22 insertions(+) > >diff --git a/source3/lib/dbwrap/dbwrap_ctdb.c b/source3/lib/dbwrap/dbwrap_ctdb.c >index c557902..b3097b8 100644 >--- a/source3/lib/dbwrap/dbwrap_ctdb.c >+++ b/source3/lib/dbwrap/dbwrap_ctdb.c >@@ -79,6 +79,7 @@ struct db_ctdb_ltdb_parse_state { > void (*parser)(TDB_DATA key, struct ctdb_ltdb_header *header, > TDB_DATA data, void *private_data); > void *private_data; >+ bool tombstone; > }; > > static int db_ctdb_ltdb_parser(TDB_DATA key, TDB_DATA data, >@@ -91,6 +92,14 @@ static int db_ctdb_ltdb_parser(TDB_DATA key, TDB_DATA data, > return -1; > } > >+ if (data.dsize == sizeof(struct ctdb_ltdb_header)) { >+ /* >+ * record tombstone >+ */ >+ state->tombstone = true; >+ return -1; >+ } >+ > state->parser( > key, (struct ctdb_ltdb_header *)data.dptr, > make_tdb_data(data.dptr + sizeof(struct ctdb_ltdb_header), >@@ -105,6 +114,7 @@ struct db_ctdb_parse_record_state { > uint32_t my_vnn; > bool ask_for_readonly_copy; > bool done; >+ bool tombstone; > }; > > static NTSTATUS db_ctdb_ltdb_parse( >@@ -113,6 +123,8 @@ static NTSTATUS db_ctdb_ltdb_parse( > TDB_DATA data, void *private_data), > void *private_data) > { >+ struct db_ctdb_parse_record_state *rec_state = >+ (struct db_ctdb_parse_record_state *)private_data; > struct db_ctdb_ltdb_parse_state state; > int ret; > >@@ -122,6 +134,9 @@ static NTSTATUS db_ctdb_ltdb_parse( > ret = tdb_parse_record(db->wtdb->tdb, key, db_ctdb_ltdb_parser, > &state); > if (ret == -1) { >+ if (state.tombstone) { >+ rec_state->tombstone = true; >+ } > return NT_STATUS_NOT_FOUND; > } > return NT_STATUS_OK; >@@ -1267,6 +1282,7 @@ static NTSTATUS db_ctdb_parse_record(struct db_context *db, TDB_DATA key, > state.parser = parser; > state.private_data = private_data; > state.my_vnn = ctdbd_vnn(ctx->conn); >+ state.tombstone = false; > > if (ctx->transaction != NULL) { > struct db_ctdb_transaction_handle *h = ctx->transaction; >@@ -1301,6 +1317,12 @@ static NTSTATUS db_ctdb_parse_record(struct db_context *db, TDB_DATA key, > return NT_STATUS_OK; > } > >+ if (NT_STATUS_EQUAL(status, NT_STATUS_NOT_FOUND) >+ && state.tombstone) >+ { >+ return NT_STATUS_NOT_FOUND; >+ } >+ > ret = ctdbd_parse(ctx->conn, ctx->db_id, key, > state.ask_for_readonly_copy, parser, private_data); > if (ret != 0) { >-- >2.5.0 >
Damn! :)
Reassigning to Karolin for inclusion in 4.3, 4.4 and 4.5. Thanks!
(In reply to Ralph Böhme from comment #10) Pushed to autobuild-v4-[5|4|3]-test.
(In reply to Karolin Seeger from comment #11) Pushed to all branches. Closing out bug report. Thanks!