The Samba-Bugzilla – Attachment 11817 Details for
Bug 11707
CTDB crashes during database recovery
Home
|
New
|
Browse
|
Search
|
[?]
|
Reports
|
Requests
|
Help
|
New Account
|
Log In
[x]
|
Forgot Password
Login:
[x]
[patch]
Patch for v4-4 branch
0001-Revert-ctdb-daemon-Check-packet-generation-against-d.patch (text/plain), 7.21 KB, created by
Amitay Isaacs
on 2016-02-09 13:16:36 UTC
(
hide
)
Description:
Patch for v4-4 branch
Filename:
MIME Type:
Creator:
Amitay Isaacs
Created:
2016-02-09 13:16:36 UTC
Size:
7.21 KB
patch
obsolete
>From ac1a7d5bf44a2ed45cc3a0e147109fee2fbca9df Mon Sep 17 00:00:00 2001 >From: Amitay Isaacs <amitay@gmail.com> >Date: Tue, 2 Feb 2016 15:58:37 +1100 >Subject: [PATCH] Revert "ctdb-daemon: Check packet generation against database > generation" > >This reverts commit 0ff90f4fac74e61192aff100b168e38ce0adfabb. > >BUG: https://bugzilla.samba.org/show_bug.cgi?id=11707 > >The checks against database generation are not required since >the global generation is updated as part of updating vnnmap >before the actual database recovery. This change was done in >5aab31a39a3589b910a78b96071d6aa5e6547696. > >Checking only against the database generation is incomplete. It can >cause CTDB to abort if the following sequence of events happen. > > - CTDB gets REQ_DMASTER packet (gen1) > This packet processing gets deferred to get a record lock > > - CTDB goes into recovery, marks RECOVERY_ACTIVE > CTDB recovery helper updates vnnmap (gen2) > > - CTDB processes REQ_DMASTER packet (gen1) > The check against database generation (gen1) succeeds. > The check for lmaster is now invalid because VNNMAP has changed. > This will cause CTDB to abort due to protocol error. > >Reverting the patch stops processing packets of older generation before >they get into call processing. > >Signed-off-by: Amitay Isaacs <amitay@gmail.com> >Signed-off-by: Martin Schwenke <martin@meltin.net> > >Autobuild-User(master): Martin Schwenke <martins@samba.org> >Autobuild-Date(master): Tue Feb 9 12:39:24 CET 2016 on sn-devel-144 > >(cherry picked from commit b71c2e42308d23f08e1dd38c9a45ee8f25c65404) >--- > ctdb/server/ctdb_call.c | 59 ++++++++++------------------------------------- > ctdb/server/ctdb_server.c | 21 +++++++++++++---- > 2 files changed, 28 insertions(+), 52 deletions(-) > >diff --git a/ctdb/server/ctdb_call.c b/ctdb/server/ctdb_call.c >index db9fb6a..b61754e 100644 >--- a/ctdb/server/ctdb_call.c >+++ b/ctdb/server/ctdb_call.c >@@ -586,23 +586,6 @@ void ctdb_request_dmaster(struct ctdb_context *ctdb, struct ctdb_req_header *hdr > size_t len; > int ret; > >- ctdb_db = find_ctdb_db(ctdb, c->db_id); >- if (!ctdb_db) { >- ctdb_send_error(ctdb, hdr, -1, >- "Unknown database in request. db_id==0x%08x", >- c->db_id); >- return; >- } >- >- if (hdr->generation != ctdb_db->generation) { >- DEBUG(DEBUG_DEBUG, >- ("ctdb operation %u request %u from node %u to %u had an" >- " invalid generation:%u while our generation is:%u\n", >- hdr->operation, hdr->reqid, hdr->srcnode, hdr->destnode, >- hdr->generation, ctdb_db->generation)); >- return; >- } >- > key.dptr = c->data; > key.dsize = c->keylen; > data.dptr = c->data + c->keylen; >@@ -614,6 +597,14 @@ void ctdb_request_dmaster(struct ctdb_context *ctdb, struct ctdb_req_header *hdr > sizeof(record_flags)); > } > >+ ctdb_db = find_ctdb_db(ctdb, c->db_id); >+ if (!ctdb_db) { >+ ctdb_send_error(ctdb, hdr, -1, >+ "Unknown database in request. db_id==0x%08x", >+ c->db_id); >+ return; >+ } >+ > dmaster_defer_setup(ctdb_db, hdr, key); > > /* fetch the current record */ >@@ -642,7 +633,7 @@ void ctdb_request_dmaster(struct ctdb_context *ctdb, struct ctdb_req_header *hdr > if (header.dmaster != hdr->srcnode) { > DEBUG(DEBUG_ALERT,("pnn %u dmaster request for new-dmaster %u from non-master %u real-dmaster=%u key %08x dbid 0x%08x gen=%u curgen=%u c->rsn=%llu header.rsn=%llu reqid=%u keyval=0x%08x\n", > ctdb->pnn, c->dmaster, hdr->srcnode, header.dmaster, ctdb_hash(&key), >- ctdb_db->db_id, hdr->generation, ctdb_db->generation, >+ ctdb_db->db_id, hdr->generation, ctdb->vnn_map->generation, > (unsigned long long)c->rsn, (unsigned long long)header.rsn, c->hdr.reqid, > (key.dsize >= 4)?(*(uint32_t *)key.dptr):0)); > if (header.rsn != 0 || header.dmaster != ctdb->pnn) { >@@ -657,7 +648,7 @@ void ctdb_request_dmaster(struct ctdb_context *ctdb, struct ctdb_req_header *hdr > if (header.rsn > c->rsn) { > DEBUG(DEBUG_ALERT,("pnn %u dmaster request with older RSN new-dmaster %u from %u real-dmaster=%u key %08x dbid 0x%08x gen=%u curgen=%u c->rsn=%llu header.rsn=%llu reqid=%u\n", > ctdb->pnn, c->dmaster, hdr->srcnode, header.dmaster, ctdb_hash(&key), >- ctdb_db->db_id, hdr->generation, ctdb_db->generation, >+ ctdb_db->db_id, hdr->generation, ctdb->vnn_map->generation, > (unsigned long long)c->rsn, (unsigned long long)header.rsn, c->hdr.reqid)); > } > >@@ -902,6 +893,7 @@ void ctdb_request_call(struct ctdb_context *ctdb, struct ctdb_req_header *hdr) > return; > } > >+ > ctdb_db = find_ctdb_db(ctdb, c->db_id); > if (!ctdb_db) { > ctdb_send_error(ctdb, hdr, -1, >@@ -910,15 +902,6 @@ void ctdb_request_call(struct ctdb_context *ctdb, struct ctdb_req_header *hdr) > return; > } > >- if (hdr->generation != ctdb_db->generation) { >- DEBUG(DEBUG_DEBUG, >- ("ctdb operation %u request %u from node %u to %u had an" >- " invalid generation:%u while our generation is:%u\n", >- hdr->operation, hdr->reqid, hdr->srcnode, hdr->destnode, >- hdr->generation, ctdb_db->generation)); >- return; >- } >- > call = talloc(hdr, struct ctdb_call); > CTDB_NO_MEMORY_FATAL(ctdb, call); > >@@ -1193,15 +1176,6 @@ void ctdb_reply_call(struct ctdb_context *ctdb, struct ctdb_req_header *hdr) > return; > } > >- if (hdr->generation != state->generation) { >- DEBUG(DEBUG_DEBUG, >- ("ctdb operation %u request %u from node %u to %u had an" >- " invalid generation:%u while our generation is:%u\n", >- hdr->operation, hdr->reqid, hdr->srcnode, hdr->destnode, >- hdr->generation, state->generation)); >- return; >- } >- > > /* read only delegation processing */ > /* If we got a FETCH_WITH_HEADER we should check if this is a ro >@@ -1296,16 +1270,7 @@ void ctdb_reply_dmaster(struct ctdb_context *ctdb, struct ctdb_req_header *hdr) > DEBUG(DEBUG_ERR,("Unknown db_id 0x%x in ctdb_reply_dmaster\n", c->db_id)); > return; > } >- >- if (hdr->generation != ctdb_db->generation) { >- DEBUG(DEBUG_DEBUG, >- ("ctdb operation %u request %u from node %u to %u had an" >- " invalid generation:%u while our generation is:%u\n", >- hdr->operation, hdr->reqid, hdr->srcnode, hdr->destnode, >- hdr->generation, ctdb_db->generation)); >- return; >- } >- >+ > key.dptr = c->data; > key.dsize = c->keylen; > data.dptr = &c->data[key.dsize]; >diff --git a/ctdb/server/ctdb_server.c b/ctdb/server/ctdb_server.c >index b30ecaa..7d42c38 100644 >--- a/ctdb/server/ctdb_server.c >+++ b/ctdb/server/ctdb_server.c >@@ -225,11 +225,22 @@ void ctdb_input_pkt(struct ctdb_context *ctdb, struct ctdb_req_header *hdr) > goto done; > } > >- /* Push the check for generation in the handlers for these >- * operations. Check database generation instead of global >- * generation. Since the database context is not available >- * here, push the check in the operations. >- */ >+ /* for ctdb_call inter-node operations verify that the >+ remote node that sent us the call is running in the >+ same generation instance as this node >+ */ >+ if (ctdb->vnn_map->generation != hdr->generation) { >+ DEBUG(DEBUG_DEBUG,(__location__ " ctdb operation %u" >+ " request %u" >+ " length %u from node %u to %u had an" >+ " invalid generation id:%u while our" >+ " generation id is:%u\n", >+ hdr->operation, hdr->reqid, >+ hdr->length, >+ hdr->srcnode, hdr->destnode, >+ hdr->generation, ctdb->vnn_map->generation)); >+ goto done; >+ } > } > > switch (hdr->operation) { >-- >2.5.0 >
You cannot view the attachment while viewing its details because your browser does not support IFRAMEs.
View the attachment on a separate page
.
View Attachment As Raw
Flags:
martins
:
review+
Actions:
View
Attachments on
bug 11707
: 11817