Bug 15294 - Backing up offline domain with lmdb >= 0.9.29 doesn't work
Status: RESOLVED INVALID
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: AD: LDB/DSDB/SAMDB
Version: 4.18.0rc1
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Samba QA Contact
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-01-27 17:32 UTC by Andreas Schneider
Modified: 2023-02-15 14:03 UTC

See Also:


Description Andreas Schneider 2023-01-27 17:32:26 UTC
Using /home/asn/workspace/projects/samba/asn-fix/st/tmp/iuC7GKiUIv/smb.conf as restored domain's smb.conf
Invalid data for index  DN=@INDEXLIST

Failed to connect to 'mdb:///home/asn/workspace/projects/samba/asn-fix/st/offlinebackupdc/private/sam.ldb.d/DC=BACKUPDOM,DC=SAMBA,DC=EXAMPLE,DC=COM.ldb' with backend 'mdb': Unable to load ltdb cache records for backend 'ldb_mdb backend'
module samba_dsdb initialization failed : Operations error
Unable to load modules for /home/asn/workspace/projects/samba/asn-fix/st/offlinebackupdc/private/sam.ldb: Unable to load ltdb cache records for backend 'ldb_mdb backend'
ERROR(ldb): uncaught exception - Unable to load ltdb cache records for backend 'ldb_mdb backend'
  File "/home/asn/workspace/projects/samba/asn-fix/bin/python/samba/netcmd/__init__.py", line 230, in _run
    return self.run(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asn/workspace/projects/samba/asn-fix/bin/python/samba/netcmd/domain_backup.py", line 512, in run
    samdb = SamDB(url=samdb_path, session_info=system_session(), lp=lp,
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asn/workspace/projects/samba/asn-fix/bin/python/samba/samdb.py", line 90, in __init__
    super(SamDB, self).__init__(url=url, lp=lp, modules_dir=modules_dir,
  File "/home/asn/workspace/projects/samba/asn-fix/bin/python/samba/__init__.py", line 114, in __init__
    self.connect(url, flags, options)
  File "/home/asn/workspace/projects/samba/asn-fix/bin/python/samba/samdb.py", line 106, in connect
    super(SamDB, self).connect(url=url, flags=flags,
Failed to restore backup using: 
python3 ./bin/samba-tool domain backup restore --backup-file=/home/asn/workspace/projects/samba/asn-fix/st/tmp/eauyZmonOv/samba-backup-2023-01-27T17-25-54.215455.tar.bz2 --targetdir=/home/asn/workspace/projects/samba/asn-fix/st/offlinebackupdc --newservername=offlinebackupdc --host-ip=10.53.57.44 --configfile=/home/asn/workspace/projects/samba/asn-fix/st/tmp/iuC7GKiUIv/smb.conf at /home/asn/workspace/projects/samba/asn-fix/selftest/target/Samba4.pm line 3226.

Reproducible with:

make testenv SELFTEST_TESTENV=offlinebackupdc
Comment 1 Andreas Schneider 2023-01-27 17:33:26 UTC
Looks like the workaround from https://bugzilla.samba.org/show_bug.cgi?id=14676 doesn't work anymore.
Comment 2 Andreas Schneider 2023-01-27 20:25:23 UTC
#0  lmdb_parse_record (ldb_kv=<optimized out>, key=..., parser=<optimized out>, ctx=<optimized out>) at ../../lib/ldb/ldb_mdb/ldb_mdb.c:413
#1  0x00007fffdf25b1d6 in ldb_kv_cache_load (module=<optimized out>) at ../../lib/ldb/ldb_key_value/ldb_kv_cache.c:453
#2  0x00007fffdf24b22b in ldb_kv_init_store (ldb_kv=ldb_kv@entry=0x61100001c020, name=name@entry=0x7fffdf26a9e0 "ldb_mdb backend", ldb=ldb@entry=0x612000008da0, options=options@entry=0x0, _module=_module@entry=0x7fffffff98e0) at ../../lib/ldb/ldb_key_value/ldb_kv.c:2168
#3  0x00007fffdf2705b0 in lmdb_connect (ldb=<optimized out>, url=<optimized out>, flags=<optimized out>, options=<optimized out>, _module=<optimized out>) at ../../lib/ldb/ldb_mdb/ldb_mdb.c:1141
#4  0x00007ffff6270f77 in ldb_module_connect_backend (ldb=ldb@entry=0x612000008da0, url=0x61100001bee0 "mdb:///home/asn/workspace/projects/samba/asn-fix/st/offlinebackupdc/private/sam.ldb.d/DC=BACKUPDOM,DC=SAMBA,DC=EXAMPLE,DC=COM.ldb", options=0x0, backend_module=backend_module@entry=0x7fffffff98e0) at ../../lib/ldb/common/ldb_modules.c:223
#5  0x00007fffdf1574bd in new_partition_from_dn (ldb=ldb@entry=0x612000008da0, data=data@entry=0x60e000026520, mem_ctx=<optimized out>, dn=dn@entry=0x60f000016f30, filename=filename@entry=0x610000004dc8 "sam.ldb.d/DC=BACKUPDOM,DC=SAMBA,DC=EXAMPLE,DC=COM.ldb", backend_db_store=<optimized out>, partition=<optimized out>) at ../../source4/dsdb/samdb/ldb_modules/partition_init.c:257
#6  0x00007fffdf1592ea in partition_reload_if_required (module=module@entry=0x60d000005ad0, data=data@entry=0x60e000026520, parent=parent@entry=0x0) at ../../source4/dsdb/samdb/ldb_modules/partition_init.c:513
#7  0x00007fffdf151e28 in partition_read_lock (module=0x60d000005ad0) at ../../source4/dsdb/samdb/ldb_modules/partition.c:1492
#8  0x00007ffff62727be in ldb_next_read_lock (module=0x60d000005ad0, module@entry=0x60d000007670) at ../../lib/ldb/common/ldb_modules.c:664
#9  0x00007fffdef4ed00 in schema_read_lock (module=0x60d000007670) at ../../source4/dsdb/samdb/ldb_modules/schema_load.c:614
#10 0x00007ffff62727be in ldb_next_read_lock (module=0x60d000007670, module@entry=0x60d000004750) at ../../lib/ldb/common/ldb_modules.c:664
#11 0x00007fffdefb163b in samba_dsdb_init (module=<optimized out>) at ../../source4/dsdb/samdb/ldb_modules/samba_dsdb.c:483
#12 0x00007ffff62716d9 in ldb_module_init_chain (ldb=ldb@entry=0x612000008da0, module=0x60d000004750) at ../../lib/ldb/common/ldb_modules.c:365
#13 0x00007ffff6271ade in ldb_load_modules (ldb=ldb@entry=0x612000008da0, options=options@entry=0x0) at ../../lib/ldb/common/ldb_modules.c:447
#14 0x00007ffff626f3d7 in ldb_connect (ldb=ldb@entry=0x612000008da0, url=0x7fffdfb60060 "/home/asn/workspace/projects/samba/asn-fix/st/offlinebackupdc/private/sam.ldb", flags=<optimized out>, options=options@entry=0x0) at ../../lib/ldb/common/ldb.c:274
#15 0x00007ffff62e11e6 in py_ldb_connect (self=<optimized out>, args=<optimized out>, kwargs=<optimized out>) at ../../lib/ldb/pyldb.c:1260


410             lmdb->error = mdb_get(txn, dbi, &mdb_key, &mdb_data);

(gdb) p lmdb->error                                                                       
$17 = -30796


-30796 is MDB_CORRUPTED
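
For anyone decoding other values of lmdb->error from a debugger session, here is a minimal lookup sketch. The names and numbers are the return codes as I know them from lmdb.h in the 0.9.x series (they start at MDB_KEYEXIST = -30799 and count upwards); only the first few are listed, and the helper name lmdb_error_name is made up for this example. Please check against the header of the lmdb version actually installed.

```python
# Subset of LMDB return codes, in lmdb.h order (assumed from the 0.9.x header).
# MDB_KEYEXIST is -30799 and each following code is one higher.
_LMDB_ERROR_NAMES = [
    "MDB_KEYEXIST",
    "MDB_NOTFOUND",
    "MDB_PAGE_NOTFOUND",
    "MDB_CORRUPTED",
    "MDB_PANIC",
    "MDB_VERSION_MISMATCH",
    "MDB_INVALID",
    "MDB_MAP_FULL",
]
LMDB_ERRORS = {-30799 + i: name for i, name in enumerate(_LMDB_ERROR_NAMES)}


def lmdb_error_name(code):
    """Map an LMDB return code to its symbolic name (hypothetical helper)."""
    if code == 0:
        return "MDB_SUCCESS"
    # Positive values are plain errno codes and are not covered by this table.
    return LMDB_ERRORS.get(code, "unknown (%d)" % code)
```

With this table, the value printed by gdb above resolves to MDB_CORRUPTED, and a plain missing key would have shown up as -30798 (MDB_NOTFOUND) instead.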
Comment 3 Andreas Schneider 2023-01-27 20:29:23 UTC
Interestingly with the following patch:

diff --git a/lib/ldb/ldb_mdb/ldb_mdb.c b/lib/ldb/ldb_mdb/ldb_mdb.c
index c163321d5a7..762bc2c9d92 100644
--- a/lib/ldb/ldb_mdb/ldb_mdb.c
+++ b/lib/ldb/ldb_mdb/ldb_mdb.c
@@ -406,6 +406,7 @@ static int lmdb_parse_record(struct ldb_kv_private *ldb_kv,
 
        mdb_key.mv_size = key.length;
        mdb_key.mv_data = key.data;

+       fprintf(stderr, "XXXXXX %*s\n", (int)key.length, (char *)key.data);
 
        lmdb->error = mdb_get(txn, dbi, &mdb_key, &mdb_data);
        if (lmdb->error != MDB_SUCCESS) {

It prints a lot of DNs and then AddressSanitizer reports a stack-buffer-overflow:

=================================================================
==4075957==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffcc7fc80a5 at pc 0x7f645327481e bp 0x7ffcc7fc7cb0 sp 0x7ffcc7fc7460
READ of size 23 at 0x7ffcc7fc80a5 thread T0
    #0 0x7f645327481d in printf_common(void*, char const*, __va_list_tag*) (/lib64/libasan.so.8+0x7481d)
    #1 0x7f645327504e in vfprintf (/lib64/libasan.so.8+0x7504e)
    #2 0x7f645327513e in __interceptor_fprintf (/lib64/libasan.so.8+0x7513e)
    #3 0x7f64388c95f0 in lmdb_parse_record ../../lib/ldb/ldb_mdb/ldb_mdb.c:409
    #4 0x7f64388a7ca0 in ldb_kv_search_key ../../lib/ldb/ldb_key_value/ldb_kv_search.c:194 
    #5 0x7f64388a81c4 in ldb_kv_search_dn1 ../../lib/ldb/ldb_key_value/ldb_kv_search.c:273 
    #6 0x7f64388a928c in ldb_kv_search_and_return_base ../../lib/ldb/ldb_key_value/ldb_kv_search.c:505
    #7 0x7f64388a928c in ldb_kv_search ../../lib/ldb/ldb_key_value/ldb_kv_search.c:671
    #8 0x7f64388a5d28 in ldb_kv_callback ../../lib/ldb/ldb_key_value/ldb_kv.c:1971
    #9 0x7f6451a643fd in tevent_common_invoke_timer_handler ../../lib/tevent/tevent_timed.c:376
    #10 0x7f6451a64ace in tevent_common_loop_timer_delay ../../lib/tevent/tevent_timed.c:453
    #11 0x7f6451a6936e in epoll_event_loop_once ../../lib/tevent/tevent_epoll.c:923
    #12 0x7f6451a6255d in std_event_loop_once ../../lib/tevent/tevent_standard.c:110
    #13 0x7f6451a530e7 in _tevent_loop_once ../../lib/tevent/tevent.c:823
    #14 0x7f6451aac938 in ldb_wait ../../lib/ldb/common/ldb.c:645
    #15 0x7f645223986e in py_ldb_add ../../lib/ldb/pyldb.c:1496
    #16 0x7f6452ddeedd in method_vectorcall_VARARGS_KEYWORDS (/lib64/libpython3.11.so.1.0+0x1deedd)
    #17 0x7f6452dd0b56 in PyObject_Vectorcall (/lib64/libpython3.11.so.1.0+0x1d0b56)
    #18 0x7f6452dc2f43 in _PyEval_EvalFrameDefault (/lib64/libpython3.11.so.1.0+0x1c2f43)
    #19 0x7f6452dbed69 in _PyEval_Vector (/lib64/libpython3.11.so.1.0+0x1bed69)
    #20 0x7f6452de94aa in _PyVectorcall_Call (/lib64/libpython3.11.so.1.0+0x1e94aa)
    #21 0x7f6452dc7176 in _PyEval_EvalFrameDefault (/lib64/libpython3.11.so.1.0+0x1c7176)
    #22 0x7f6452dbed69 in _PyEval_Vector (/lib64/libpython3.11.so.1.0+0x1bed69)
    #23 0x7f6452de94aa in _PyVectorcall_Call (/lib64/libpython3.11.so.1.0+0x1e94aa)
    #24 0x7f6452dc7176 in _PyEval_EvalFrameDefault (/lib64/libpython3.11.so.1.0+0x1c7176)
    #25 0x7f6452dbed69 in _PyEval_Vector (/lib64/libpython3.11.so.1.0+0x1bed69)
    #26 0x7f6452de94aa in _PyVectorcall_Call (/lib64/libpython3.11.so.1.0+0x1e94aa)
    #27 0x7f6452dc7176 in _PyEval_EvalFrameDefault (/lib64/libpython3.11.so.1.0+0x1c7176)
    #28 0x7f6452dbed69 in _PyEval_Vector (/lib64/libpython3.11.so.1.0+0x1bed69)
    #29 0x7f6452dfcf4c in method_vectorcall (/lib64/libpython3.11.so.1.0+0x1fcf4c)
    #30 0x7f6452de94aa in _PyVectorcall_Call (/lib64/libpython3.11.so.1.0+0x1e94aa)
    #31 0x7f6452dc7176 in _PyEval_EvalFrameDefault (/lib64/libpython3.11.so.1.0+0x1c7176)
    #32 0x7f6452dbed69 in _PyEval_Vector (/lib64/libpython3.11.so.1.0+0x1bed69)
    #33 0x7f6452dfce94 in method_vectorcall (/lib64/libpython3.11.so.1.0+0x1fce94)
    #34 0x7f6452dc7176 in _PyEval_EvalFrameDefault (/lib64/libpython3.11.so.1.0+0x1c7176)
    #35 0x7f6452dbed69 in _PyEval_Vector (/lib64/libpython3.11.so.1.0+0x1bed69)
    #36 0x7f6452dc7176 in _PyEval_EvalFrameDefault (/lib64/libpython3.11.so.1.0+0x1c7176)
    #37 0x7f6452dbed69 in _PyEval_Vector (/lib64/libpython3.11.so.1.0+0x1bed69)
    #38 0x7f6452e485ab in PyEval_EvalCode (/lib64/libpython3.11.so.1.0+0x2485ab)
    #39 0x7f6452e676e2 in run_eval_code_obj (/lib64/libpython3.11.so.1.0+0x2676e2)
    #40 0x7f6452e63c19 in run_mod (/lib64/libpython3.11.so.1.0+0x263c19)
    #41 0x7f6452e79c21 in pyrun_file (/lib64/libpython3.11.so.1.0+0x279c21)
    #42 0x7f6452e793e8 in _PyRun_SimpleFileObject (/lib64/libpython3.11.so.1.0+0x2793e8)
    #43 0x7f6452e79057 in _PyRun_AnyFileObject (/lib64/libpython3.11.so.1.0+0x279057)
    #44 0x7f6452e72d1a in Py_RunMain (/lib64/libpython3.11.so.1.0+0x272d1a)
    #45 0x7f6452e3846a in Py_BytesMain (/lib64/libpython3.11.so.1.0+0x23846a)
    #46 0x7f6452a4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)
    #47 0x7f6452a4a5c8 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x275c8)
    #48 0x5597eecca094 in _start (/usr/bin/python3.11+0x1094)

Address 0x7ffcc7fc80a5 is located in stack of thread T0 at offset 85 in frame
    #0 0x7f64388a7db2 in ldb_kv_search_dn1 ../../lib/ldb/ldb_key_value/ldb_kv_search.c:225 

  This frame has 2 object(s):
    [32, 48) 'key' (line 231)
    [64, 85) 'guid_key' (line 230) <== Memory access at offset 85 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow (/lib64/libasan.so.8+0x7481d) in printf_common(void*, char const*, __va_list_tag*)
Shadow bytes around the buggy address:
  0x100018ff0fc0: 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1 04 f2 00 00
  0x100018ff0fd0: f2 f2 00 00 f2 f2 00 00 f3 f3 00 00 00 00 00 00
  0x100018ff0fe0: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
  0x100018ff0ff0: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
  0x100018ff1000: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00
=>0x100018ff1010: f2 f2 00 00[05]f3 f3 f3 f3 f3 00 00 00 00 00 00
  0x100018ff1020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100018ff1030: 00 00 00 00 f1 f1 f1 f1 f1 f1 01 f2 04 f3 f3 f3
  0x100018ff1040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100018ff1050: 00 00 00 00 00 00 f1 f1 f1 f1 00 f3 f3 f3 00 00
  0x100018ff1060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07  
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb


Is the key invalid?
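
A possible explanation that was not confirmed anywhere in this report: the debug fprintf in the patch uses "%*s", where the integer argument is only a minimum field width, so %s still reads until it finds a NUL byte. If the key is a binary buffer without a terminator (ASan reports 'guid_key' as 21 bytes, [64, 85)), the read would run past its end, which matches a READ starting at offset 85. "%.*s" would pass the length as a precision and bound the read. The sketch below only models that difference in Python; c_percent_s, the key contents and the "<redzone>" bytes are invented for illustration and are not ldb code.

```python
def c_percent_s(memory, start, precision=None):
    """Model how C's %s consumes bytes: stop at NUL, or after `precision` bytes.

    A field width (the "*" in "%*s") only pads the output, so it is not a
    parameter here: it never limits how many bytes are read.
    """
    out = bytearray()
    i = start
    while i < len(memory) and memory[i] != 0:
        if precision is not None and len(out) >= precision:
            break
        out.append(memory[i])
        i += 1
    return bytes(out)


# 21-byte key with no NUL terminator (contents are made up).
key = b"GUID=0123456789abcdef"
# Whatever happens to sit behind the key on the stack (made up as well).
stack = key + b"<redzone>\x00"

over_read = c_percent_s(stack, 0)                      # like "%*s"
bounded = c_percent_s(stack, 0, precision=len(key))    # like "%.*s"
```

If this guess is right, the ASan report comes from the debug print itself and says nothing about whether the key is valid; the MDB_CORRUPTED result from mdb_get is a separate matter.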
Comment 4 Andreas Schneider 2023-02-07 19:11:07 UTC
I don't know why or how, but it started working. Very strange.
Comment 5 Andreas Schneider 2023-02-14 13:29:32 UTC
I can reproduce it only on Fedora 37, not on openSUSE Tumbleweed.

One difference is that Fedora has Python 3.11 while Tumbleweed still has 3.10.
Comment 6 Andreas Schneider 2023-02-15 14:03:39 UTC
This seems to be a btrfs bug!

See
https://bugzilla.redhat.com/show_bug.cgi?id=2169947
https://bugzilla.kernel.org/show_bug.cgi?id=217042
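
To check whether a given test environment is affected, it helps to know which filesystem the restored database directory lives on. The following is a rough sketch that assumes a Linux mount table in the usual /proc/self/mounts format; fs_type_for is a hypothetical helper and it does not handle escaped characters (such as \040 for spaces) in mount points.

```python
import os


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the longest mount point containing path.

    mounts_text is the content of a mount table in /proc/self/mounts format:
    "<device> <mount point> <fs type> <options> ..." per line.
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # ">=" lets a later mount on the same mount point win.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type
```

On a real system this could be called with the contents of /proc/self/mounts and the testenv's private directory (for example st/offlinebackupdc/private); a result of "btrfs" would put the setup in the range of the bugs linked above.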