Bug 14970 - SIGSEGV in resolve_oids_parse_tree_need when using paged_results with "anr"
Summary: SIGSEGV in resolve_oids_parse_tree_need when using paged_results with "anr"
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: AD: LDB/DSDB/SAMDB (show other bugs)
Version: unspecified
Hardware: All All
: P5 normal (vote)
Target Milestone: ---
Assignee: Andrew Bartlett
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-02-07 17:42 UTC by Arvid Requate
Modified: 2023-08-02 12:11 UTC (History)
4 users (show)

See Also:


Attachments
gdb.log (9.85 KB, text/plain)
2022-02-07 17:42 UTC, Arvid Requate
no flags Details
0001-Fix-segfault-in-paged_results.patch (1.95 KB, patch)
2022-02-07 17:43 UTC, Arvid Requate
no flags Details
WIP patches for master (3.27 KB, patch)
2022-02-07 23:44 UTC, Stefan Metzmacher
no flags Details
0001-paged_results-Avoid-ldb_oom-if-no-attrs-are-specified.patch (1.28 KB, patch)
2022-02-08 15:38 UTC, Arvid Requate
no flags Details
gdb-2.log (22.77 KB, text/plain)
2022-02-10 18:08 UTC, Arvid Requate
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Arvid Requate 2022-02-07 17:42:32 UTC
Created attachment 17150 [details]
gdb.log

Bug 14952 fixed a segmentation fault in paged_results which was introduced by 4d99cab6172 while fixing CVE-2020-10730.

We found another issue in the paged_results module, which causes segmentation faults that are triggered in real life e.g. when an Administrator uses ADUC to perform wildcard searches for accounts. In that case we see LDAP search requests like this in the LDB traces:

ldbsearch -H "ldap://$(hostname -f)"  -s sub  '(&(|(&(objectCategory=person)
(objectSid=*)(!(samAccountType:1.2.840.113556.1.4.804:=3)))
(&(objectCategory=person)(!(objectSid=*)))(&(objectCategory=group)(groupType:
1.2.840.113556.1.4.804:=14)))(anr=a*))' objectClass userAccountControl name 
description -P  --controls "paged_results:0:16" ## or similar

Metze suggested debugging this with a direct ldbsearch against the sam.ldb and we found that simple paged searches like

ldbsearch -H sam.ldb -s sub '(|(foo=bar)(anr=c*))' \
          --controls "paged_results:1:1"

ldbsearch -H sam.ldb -s sub '(!(anr==Administrator))' foo \
          --controls "paged_results:1:1"

triggers undefined behavior that frequently ends up in a segmentation fault. As usual with memory corruption, the exact behavior varies on details like the exact search filter and the attributes requested.

The common thing is, that the `anr` (Ambiguous Name Resolution) is used in the search filter and that paged_results are requested and return more than a single page. In that situation, the core dump is triggered in a subsequent call to resolve_oids, which attempts to use a nested ldb_parse_tree structure that was generated by anr.c and stored by paged_results in its private_data for later reference. Unfortunately the code of paged_results attaches the required talloc memory chunk to the `req`, which seems to get freed in between requested result pages.

The attached gdb output shows the backtrace (generated with Samba 4.13.13, but the issue persists in later versions too).

I'll attach a proposal for a patch.
Comment 1 Arvid Requate 2022-02-07 17:43:59 UTC
Created attachment 17151 [details]
0001-Fix-segfault-in-paged_results.patch

The attached patch fixed the problem in my tests.
Comment 2 Stefan Metzmacher 2022-02-07 23:44:53 UTC
Created attachment 17152 [details]
WIP patches for master

Hi Arvid,

thanks for the information!

Can you try if this patch would also fix the problem for you?
Comment 3 Arvid Requate 2022-02-08 08:45:04 UTC
Yes, works and avoids the memory leak that I proposed.
Comment 4 Arvid Requate 2022-02-08 15:38:24 UTC
Created attachment 17153 [details]
0001-paged_results-Avoid-ldb_oom-if-no-attrs-are-specified.patch

The attached patch fixes one regression from your patch: In case I don't specify any particular attributes with ldbsearch, with the new patch the search aborts like this:

root@test:~/ ldbsearch  -H sam.ldb '(|(foo=bar)(anr=a*))' \
              --controls=paged_results:1:1  
search failed - ldb out of memory at ../../source4/dsdb/samdb/ldb_modules/paged_results.c:751


Using the --trace option I see:

ldb_trace_request: SEARCH
 dn: DC=ucs50domain,DC=net
 scope: UNKNOWN
 expr: (|(foo=bar)(anr=a*))
 attr: <ALL>
 control: 1.2.840.113556.1.4.319  crit:1  data:yes

So, in this case req->op.search.attrs is NULL and that should not be mistaken for a talloc oom condition. The attached patch fixed this in my test.
Comment 5 Arvid Requate 2022-02-10 18:08:07 UTC
Created attachment 17156 [details]
gdb-2.log

Unfortunately Metzes patch (Comment 2 plus mine from Comment 4) does not fix the issues completely. A colleague of mine chose a different page size of 2 and then the issue occurred again, e.g. with:

ldbsearch -H /var/lib/samba/private/sam.ldb \
 "(|(objectclass=person)(anr=Administrator))" objectClass \
 --controls paged_results:0:2

Attached you can find a gdb bt full plus some observations throughout some points in the call stack.

I checked again with my initial path proposal (which may be bad for other reasons) and it fixed the issue.
Comment 6 Jennifer Sutton 2022-02-14 00:55:57 UTC
In my testing, this bug is fixed by commit 5f0590362c5c0c5ee20503a67467f9be2d50e73b:

    CVE-2021-3670 dsdb/anr: Do a copy of the potentially anr query before starting to modify it
    
    RN: Do not modify the caller-supplied memory in the anr=* handling to
    allow clear logging of the actual caller request after it has been processed.

which appears in Samba 4.14 and 4.15, but is not in 4.13.
Comment 7 Stefan Metzmacher 2022-03-28 11:41:01 UTC
(In reply to Joseph Sutton from comment #6)

Arvid, can you please check it 5f0590362c5c0c5ee20503a67467f9be2d50e73b
works for 4.13 as well? So that we can close this?
Comment 8 Arvid Requate 2022-03-29 16:28:09 UTC
Segfaults are still happening with ldbsearches like

ldbsearch \
  -H /var/lib/samba/private/sam.ldb \
  "(|(foo=bar)(anr=a))" \
  objectClass \
  --controls paged_results:0:2

The only case this patch fixed is the ldbsearch with anr and page size 1. 
With every other number of page sizes the samba process still crashes. 

Since the default page size in ADUAC uses 64, this patch won't help IRL.

I wanted to add some test cases in source4/dsdb/tests/python/vlv.py to make it reproducible but was unable to figure out where/how/when these test cases get called. If you could give a hint, I could offer adding the tests as a patch. Btw, what do you prefer, attaching patches here or creating a pull request in Gitlab?
Comment 9 Andrew Bartlett 2022-03-29 18:33:51 UTC
Patches for master should start in GitLab per https://wiki.samba.org/index.php/Contribute
Comment 10 Arvid Requate 2022-10-13 10:30:35 UTC
(In reply to Andrew Bartlett from comment #9)

Is this a request to me? I don't get it. I only reported the issue and suggested an area in the source code that may be relevant to the issue. But the issue is still unsolved and of practical relevance, as a simple ADUC search triggers this, and as the service segfaults, there are no results returned.
Comment 11 Andrew Bartlett 2022-10-13 18:46:46 UTC
(In reply to Arvid Requate from comment #8)
I just tried this on master and it doesn't segfault, even under valgrind.

Is this only an issue on older versions?
Comment 12 Arvid Requate 2023-08-01 16:21:51 UTC
I did a quick check with Samba 4.18.4 and couldn't reproduce it either, neither with the synthetic ldbsearch examples given above nor with the ADUC user search (which uses anr). Strange but in a good way :-)
Comment 13 Andrew Bartlett 2023-08-01 20:55:23 UTC
I've been looking over this carefully, and I will likely take the memory correctness patches anyway and add a shallow copy of the tree (because it is easily implemented).

I'm pretty sure the patch referenced in comment #6 is the correct one, as the way that the ldb_parse_tree would corrupt is by being modified without being copied, such that the new children in the tree are not part of the memory held long-term by the talloc_steal().

Arvid, Do you know what version you were testing with in comment #8?
Comment 14 Andrew Bartlett 2023-08-02 01:41:18 UTC
With Samba 4.13.13 I can reproduce this, and it is fixed once 5f0590362c5c0c5ee20503a67467f9be2d50e73b is applied.

Given this, my analysis and comment #12 I'm going to close this, but open a related bug and MR for the memory handling improvements.
Comment 15 Arvid Requate 2023-08-02 08:24:29 UTC
Yes, 4.13.13 it was.
Comment 16 Samba QA Contact 2023-08-02 12:11:13 UTC
This bug was referenced in samba master:

c67534fe3ff1652dcf95eac2030778b066cdf7a4