Bug 14199 - LMDB domaindns/forestdns partition corruption with bind9_dlz
Summary: LMDB domaindns/forestdns partition corruption with bind9_dlz
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: AD: LDB/DSDB/SAMDB
Version: unspecified
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
Reported: 2019-11-14 16:37 UTC by Denis Cardon
Modified: 2020-01-23 20:04 UTC

Attachments
Proposed patch applies to V4.11 (7.23 KB, patch)
2020-01-12 19:58 UTC, Gary Lockyer
abartlet: review+
gary: ci-passed+
Proposed patch applies to V4.10 (8.75 KB, patch)
2020-01-12 20:04 UTC, Gary Lockyer
abartlet: review+
gary: ci-passed+
Proposed patch applies to V4.9 (8.79 KB, patch)
2020-01-13 00:55 UTC, Gary Lockyer
abartlet: review+
gary: ci-passed+

Description Denis Cardon 2019-11-14 16:37:31 UTC
samba_upgradedns --dns-backend=BIND9_DLZ creates the hard links /var/lib/samba/bind-dns/dns/sam.ldb.d/{DC=DOMAINDNSZONES,DC=FORESTDNSZONES},DC=TESTING,DC=LAN.ldb for the LDB DNS partition files, but it does not create them for the matching *-lock files. Samba and BIND therefore each use their own lock file for the same database.

I have had corrupted DNS partitions a few times when using BIND9_DLZ together with LMDB.

[root@srvads.testing.lan ~]# ls -li /var/lib/samba/private/sam.ldb.d/
total 375340
525701 -rw-r--r-- 1 root root   8552448 août   5 09:05 CN=CONFIGURATION,DC=TESTING,DC=LAN.ldb
525681 -rw-r--r-- 1 root root   6400128 nov.  14 17:24 CN=CONFIGURATION,DC=TESTING,DC=LAN.ldb-lock
525654 -rw-r--r-- 1 root root   8843264 févr.  5  2019 CN=SCHEMA,CN=CONFIGURATION,DC=TESTING,DC=LAN.ldb
525653 -rw-r--r-- 1 root root   6400128 nov.  14 17:24 CN=SCHEMA,CN=CONFIGURATION,DC=TESTING,DC=LAN.ldb-lock
525770 -rw-r--r-- 1 root root 354435072 nov.  14 17:23 DC=TESTING,DC=LAN.ldb
525704 -rw-r--r-- 1 root root   6400128 nov.  14 17:24 DC=TESTING,DC=LAN.ldb-lock
525798 -rw-rw---- 2 root bind  10674176 nov.  14 16:58 DC=DOMAINDNSZONES,DC=TESTING,DC=LAN.ldb
525782 -rw-r--r-- 1 root root   6400128 nov.  14 17:24 DC=DOMAINDNSZONES,DC=TESTING,DC=LAN.ldb-lock
525810 -rw-rw---- 2 root bind   1368064 févr.  5  2019 DC=FORESTDNSZONES,DC=TESTING,DC=LAN.ldb
525800 -rw-r--r-- 1 root root   6400128 nov.  14 17:24 DC=FORESTDNSZONES,DC=TESTING,DC=LAN.ldb-lock
525470 -rw-rw---- 2 root bind    421888 nov.  14 16:58 metadata.tdb
[root@srvads.testing.lan ~]# ls -li /var/lib/samba/bind-dns/dns/sam.ldb.d/
total 29224
525141 -rw-rw---- 1 root bind  8552448 oct.  22 18:58 CN=CONFIGURATION,DC=TESTING,DC=LAN.ldb
524808 -rw-r--r-- 1 bind bind  6400128 nov.  14 17:23 CN=CONFIGURATION,DC=TESTING,DC=LAN.ldb-lock
525144 -rw-rw---- 1 root bind  8843264 oct.  22 18:58 CN=SCHEMA,CN=CONFIGURATION,DC=TESTING,DC=LAN.ldb
524382 -rw-r--r-- 1 bind bind  6400128 nov.  14 17:23 CN=SCHEMA,CN=CONFIGURATION,DC=TESTING,DC=LAN.ldb-lock
524980 -rw-rw---- 1 root bind    40960 oct.  22 18:58 DC=TESTING,DC=LAN.ldb
524981 -rw-rw---- 1 root bind  6400128 nov.  14 17:23 DC=TESTING,DC=LAN.ldb-lock
525798 -rw-rw---- 2 root bind 10674176 nov.  14 16:58 DC=DOMAINDNSZONES,DC=TESTING,DC=LAN.ldb
524866 -rw-r--r-- 1 bind bind  6400128 nov.  14 17:23 DC=DOMAINDNSZONES,DC=TESTING,DC=LAN.ldb-lock
525810 -rw-rw---- 2 root bind  1368064 févr.  5  2019 DC=FORESTDNSZONES,DC=TESTING,DC=LAN.ldb
524941 -rw-r--r-- 1 bind bind  6400128 nov.  14 17:23 DC=FORESTDNSZONES,DC=TESTING,DC=LAN.ldb-lock
525470 -rw-rw---- 2 root bind   421888 nov.  14 16:58 metadata.tdb
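
The inode comparison in the two listings above can also be scripted. This is only a rough sketch, not part of Samba: the function name unshared_dns_files and the prefix filter are my own assumptions, and it only looks at the DNS zone partitions, since the other partitions are expected to be separate copies in the bind-dns directory.

```python
import os

# Assumption: only these partitions are meant to be shared between
# the Samba private directory and the bind-dns directory.
DNS_PREFIXES = ("DC=DOMAINDNSZONES", "DC=FORESTDNSZONES")


def unshared_dns_files(private_dir, bind_dir):
    """Return DNS partition file names that exist in both directories
    but do not refer to the same inode (i.e. are not hard links)."""
    unshared = []
    for name in sorted(os.listdir(bind_dir)):
        if not name.upper().startswith(DNS_PREFIXES):
            continue
        private_path = os.path.join(private_dir, name)
        bind_path = os.path.join(bind_dir, name)
        if not os.path.exists(private_path):
            continue
        # samefile() compares device and inode numbers, like ls -li.
        if not os.path.samefile(private_path, bind_path):
            unshared.append(name)
    return unshared
```

On the system shown above I would expect this to report the two *.ldb-lock files and neither of the *.ldb files.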

The *-lock files are not handled in the script /usr/lib64/python3.6/site-packages/samba/provision/sambadns.py:856:

    try:
        os.link(os.path.join(samldb_dir, metadata_file),
                os.path.join(dns_samldb_dir, metadata_file))
        os.link(os.path.join(private_dir, domainzone_file),
                os.path.join(dns_dir, domainzone_file))
        if forestzone_file:
            os.link(os.path.join(private_dir, forestzone_file),
                    os.path.join(dns_dir, forestzone_file))
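
To illustrate what seems to be missing, here is a minimal sketch of linking a partition file together with its lock file. It is not the attached patch and not code from sambadns.py; the helper name link_with_lock is hypothetical, and I am assuming the lock file is always named like the partition file with "-lock" appended, as in the listings above.

```python
import os


def link_with_lock(src_dir, dst_dir, filename):
    """Hard-link filename and, if it exists, filename + '-lock'
    from src_dir into dst_dir. Returns the names that were linked."""
    linked = []
    for name in (filename, filename + "-lock"):
        src = os.path.join(src_dir, name)
        dst = os.path.join(dst_dir, name)
        if not os.path.exists(src):
            # The lock file may be absent, so skip it quietly.
            continue
        if os.path.exists(dst):
            if os.path.samefile(src, dst):
                # Already the same inode, nothing to do.
                continue
            # Replace a stale, separate copy with a hard link.
            os.unlink(dst)
        os.link(src, dst)
        linked.append(name)
    return linked
```

Replacing an existing lock file like this is only safe while nothing has the old file open, so both samba and named would need to be stopped first.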
Comment 1 Gary Lockyer 2020-01-12 19:58:39 UTC
Created attachment 15723
Proposed patch applies to V4.11

CI run: https://gitlab.com/samba-team/devel/samba/pipelines/108226370
Comment 2 Gary Lockyer 2020-01-12 20:04:28 UTC
Created attachment 15724
Proposed patch applies to V4.10

CI run: https://gitlab.com/samba-team/devel/samba/pipelines/108246724
Comment 3 Andrew Bartlett 2020-01-12 21:05:03 UTC
Comment on attachment 15723
Proposed patch applies to V4.11

Thanks. 

We should probably come up with some explanatory text to put in WHATSNEW at least suggesting a dbcheck and how to fix existing databases. 

However, avoiding issues on any new domains is an important start, so I've approved this.
Comment 4 Andrew Bartlett 2020-01-12 21:06:57 UTC
Assigning to Karolin for 4.10.next and 4.11.next
Comment 5 Gary Lockyer 2020-01-13 00:55:23 UTC
Created attachment 15725
Proposed patch applies to V4.9

CI run: https://gitlab.com/samba-team/devel/samba/pipelines/108246200
Comment 6 Karolin Seeger 2020-01-14 08:18:59 UTC
(In reply to Andrew Bartlett from comment #4)
Pushed to autobuild-v4-11-test.
Comment 7 Karolin Seeger 2020-01-15 09:15:28 UTC
(In reply to Karolin Seeger from comment #6)
Pushed to v4-11-test.
Closing out bug report.

Thanks!
Comment 8 Andrew Bartlett 2020-01-23 20:04:25 UTC
We still need to work out some robust steps to fix existing installations.  

I've confirmed running the script

./source4/scripting/bin/samba_upgradedns  --dns-backend=BIND9_DLZ

WILL create the correct lock files.  Of course, BIND9 must be stopped at the time this is done, otherwise it will still have the old, separate lock files open.

It is important to confirm that the -lock files, like the DNS partition .ldb files, have 2 (hard) links, per the second column of:

[abartlet@labdc samba]$ ls -la st/labdc/bind-dns/dns/sam.ldb.d/
total 45204
drwxrwxr-x. 2 abartlet abartlet     4096 Jan 23 20:00  .
drwxrwx---. 3 abartlet abartlet     4096 Jan 23 20:00  ..
-rw-rw-r--. 1 abartlet abartlet 14127104 Jan 23 20:00 'CN=CONFIGURATION,DC=LABDOM,DC=SAMBA,DC=EXAMPLE,DC=COM.ldb'
-rw-rw-r--. 1 abartlet abartlet 17866752 Jan 23 20:00 'CN=SCHEMA,CN=CONFIGURATION,DC=LABDOM,DC=SAMBA,DC=EXAMPLE,DC=COM.ldb'
-rw-r--r--. 2 abartlet abartlet   647168 Jan 23 19:58 'DC=DOMAINDNSZONES,DC=LABDOM,DC=SAMBA,DC=EXAMPLE,DC=COM.ldb'
-rw-r--r--. 2 abartlet abartlet  6400128 Jan 23 20:00 'DC=DOMAINDNSZONES,DC=LABDOM,DC=SAMBA,DC=EXAMPLE,DC=COM.ldb-lock'
-rw-r--r--. 2 abartlet abartlet   368640 Jan 23 19:58 'DC=FORESTDNSZONES,DC=LABDOM,DC=SAMBA,DC=EXAMPLE,DC=COM.ldb'
-rw-r--r--. 2 abartlet abartlet  6400128 Jan 23 20:00 'DC=FORESTDNSZONES,DC=LABDOM,DC=SAMBA,DC=EXAMPLE,DC=COM.ldb-lock'
-rw-rw-r--. 1 abartlet abartlet    40960 Jan 23 20:00 'DC=LABDOM,DC=SAMBA,DC=EXAMPLE,DC=COM.ldb'
-rw-r--r--. 1 abartlet abartlet  6400128 Jan 23 20:00 'DC=LABDOM,DC=SAMBA,DC=EXAMPLE,DC=COM.ldb-lock'
-rw-rw----. 2 abartlet abartlet   421888 Jan 23 19:58  metadata.tdb

However, this will not resolve any existing corruption. For that, it is best to re-join from whichever DC has not been impacted by this issue.
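
The link-count check described above could be scripted along these lines. This is a sketch under my own assumptions rather than a supported tool: the function name and the prefix filter are hypothetical, and it only reads st_nlink (the second column of ls -l) for the DNS zone partition files and their -lock files.

```python
import os

# Assumption: only the DNS zone partitions are expected to be hard-linked.
DNS_PREFIXES = ("DC=DOMAINDNSZONES", "DC=FORESTDNSZONES")


def dns_files_missing_hard_link(directory):
    """Return the DNS partition files (.ldb and .ldb-lock) in directory
    whose link count is below 2."""
    missing = []
    for name in sorted(os.listdir(directory)):
        if not name.upper().startswith(DNS_PREFIXES):
            continue
        # st_nlink is the number of hard links to the inode.
        if os.stat(os.path.join(directory, name)).st_nlink < 2:
            missing.append(name)
    return missing
```

An empty result for the bind-dns/dns/sam.ldb.d/ directory would match the listing above. A non-empty result names the files that still need to be re-linked with BIND9 stopped.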