Bug 14684 - buffer overruns in talloc_string_sub2() due to false strstr_m matches
Summary: buffer overruns in talloc_string_sub2() due to false strstr_m matches
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: Other (show other bugs)
Version: unspecified
Hardware: All All
: P5 normal (vote)
Target Milestone: ---
Assignee: Jeremy Allison
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-04-06 02:34 UTC by Douglas Bagnall
Modified: 2022-03-07 00:13 UTC (History)
1 user (show)

See Also:


Attachments
The fuzzer. (7.74 KB, patch)
2021-04-06 02:34 UTC, Douglas Bagnall
no flags Details
the read crash (2.00 KB, application/octet-stream)
2021-04-06 02:35 UTC, Douglas Bagnall
no flags Details
the write crash (1.94 KB, application/octet-stream)
2021-04-06 02:37 UTC, Douglas Bagnall
no flags Details
proof of concept not reliant on fuzzers (10.81 KB, patch)
2021-04-06 23:06 UTC, Douglas Bagnall
no flags Details
proof of concept (10.75 KB, patch)
2021-04-06 23:10 UTC, Douglas Bagnall
no flags Details
the crash test reduced to short mostly printable strings (439 bytes, text/plain)
2021-04-08 05:13 UTC, Douglas Bagnall
no flags Details
The fix (3.29 KB, patch)
2021-04-08 09:24 UTC, Douglas Bagnall
no flags Details
WIP patch for master (6.17 KB, patch)
2021-06-16 06:05 UTC, Douglas Bagnall
no flags Details
patch for master (6.95 KB, patch)
2021-06-16 07:09 UTC, Douglas Bagnall
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Douglas Bagnall 2021-04-06 02:34:45 UTC
Created attachment 16580 [details]
The fuzzer.

In some circumstances, I think where the strings don't make much sense as any encoding and/or have nul words, strstr_m(haystack, needle) will return haystack when there is no match (i.e. it should return NULL). That in turn seems to be due to confusion in strstr_w().

In one case we have a little out of bounds read. In the other it looks like heap corruption resulting in talloc abort.

strstr_w is only used in strstr_m.
strstr_m is used all over the place.
talloc_string_sub2 is used all over the place, mostly via talloc_string_sub.


Valgrind of the read example, the the write example. The write example crashes Valgrind.

$ valgrind bin/fuzz_talloc_string_sub2 ../samba-fuzz/fuzz/results/fuzz_talloc_string_sub2/SIGABRT.PC.555555586099.STACK.18b0690404.CODE.-6.ADDR.0.INSTR.lea____-0x840\(%rbp\)\,%rdx.fuzz
==3719941== Memcheck, a memory error detector
==3719941== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==3719941== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==3719941== Command: bin/fuzz_talloc_string_sub2 ../samba-fuzz/fuzz/results/fuzz_talloc_string_sub2/SIGABRT.PC.555555586099.STACK.18b0690404.CODE.-6.ADDR.0.INSTR.lea____-0x840(%rbp),%rdx.fuzz
==3719941== 
==3719941== Invalid read of size 1
==3719941==    at 0x483EF46: strlen (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==3719941==    by 0x48CFFA9: talloc_string_sub2 (util_str.c:279)
==3719941==    by 0x1098C8: LLVMFuzzerTestOneInput (fuzz_talloc_string_sub2.c:63)
==3719941==    by 0x1092F9: main (afl-fuzz-main.c:46)
==3719941==  Address 0x7508842 is 30 bytes before a block of size 528 in arena "client"
==3719941== 
==3719941== Invalid read of size 1
==3719941==    at 0x4842B60: memmove (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==3719941==    by 0x48CFFCE: talloc_string_sub2 (util_str.c:279)
==3719941==    by 0x1098C8: LLVMFuzzerTestOneInput (fuzz_talloc_string_sub2.c:63)
==3719941==    by 0x1092F9: main (afl-fuzz-main.c:46)
==3719941==  Address 0x7508842 is 30 bytes before a block of size 528 in arena "client"
==3719941== 
==3719941== 
==3719941== HEAP SUMMARY:
==3719941==     in use at exit: 1,384 bytes in 8 blocks
==3719941==   total heap usage: 1,309 allocs, 1,301 frees, 189,176 bytes allocated
==3719941== 
==3719941== LEAK SUMMARY:
==3719941==    definitely lost: 0 bytes in 0 blocks
==3719941==    indirectly lost: 0 bytes in 0 blocks
==3719941==      possibly lost: 1,384 bytes in 8 blocks
==3719941==    still reachable: 0 bytes in 0 blocks
==3719941==         suppressed: 0 bytes in 0 blocks
==3719941== Rerun with --leak-check=full to see details of leaked memory
==3719941== 
==3719941== For lists of detected and suppressed errors, rerun with: -s
==3719941== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)




$ valgrind bin/fuzz_talloc_string_sub2 ../samba-fuzz/fuzz/results/fuzz_talloc_string_sub2/SIGABRT.PC.55555558a309.STACK.1876abc907.CODE.-6.ADDR.0.INSTR.lea____-0x840\(%rbp\)\,%rdx.fuzz
==3719673== Memcheck, a memory error detector
==3719673== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==3719673== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==3719673== Command: bin/fuzz_talloc_string_sub2 ../samba-fuzz/fuzz/results/fuzz_talloc_string_sub2/SIGABRT.PC.55555558a309.STACK.1876abc907.CODE.-6.ADDR.0.INSTR.lea____-0x840(%rbp),%rdx.fuzz
==3719673==
==3719673== Invalid write of size 8
==3719673==    at 0x4842A9B: memmove (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==3719673==    by 0x48CFFE5: talloc_string_sub2 (util_str.c:281)
==3719673==    by 0x1098C8: LLVMFuzzerTestOneInput (fuzz_talloc_string_sub2.c:63)
==3719673==    by 0x1092F9: main (afl-fuzz-main.c:46)
==3719673==  Address 0x7508818 is 568 bytes inside a block of size 572 alloc'd
==3719673==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==3719673==    by 0x48DE365: __talloc_with_prefix (talloc.c:783)
==3719673==    by 0x48DE4FF: __talloc (talloc.c:825)
==3719673==    by 0x48E1FDC: __talloc_strlendup (talloc.c:2454)
==3719673==    by 0x48E2096: talloc_strdup (talloc.c:2470)
==3719673==    by 0x48CFCF5: talloc_string_sub2 (util_str.c:217)
==3719673==    by 0x1098C8: LLVMFuzzerTestOneInput (fuzz_talloc_string_sub2.c:63)
==3719673==    by 0x1092F9: main (afl-fuzz-main.c:46)
==3719673==
==3719673== Invalid write of size 8
==3719673==    at 0x4842A83: memmove (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==3719673==    by 0x48CFFE5: talloc_string_sub2 (util_str.c:281)
==3719673==    by 0x1098C8: LLVMFuzzerTestOneInput (fuzz_talloc_string_sub2.c:63)
==3719673==    by 0x1092F9: main (afl-fuzz-main.c:46)
==3719673==  Address 0x7508820 is 4 bytes after a block of size 572 alloc'd
==3719673==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==3719673==    by 0x48DE365: __talloc_with_prefix (talloc.c:783)
==3719673==    by 0x48DE4FF: __talloc (talloc.c:825)
==3719673==    by 0x48E1FDC: __talloc_strlendup (talloc.c:2454)
==3719673==    by 0x48E2096: talloc_strdup (talloc.c:2470)
==3719673==    by 0x48CFCF5: talloc_string_sub2 (util_str.c:217)
==3719673==    by 0x1098C8: LLVMFuzzerTestOneInput (fuzz_talloc_string_sub2.c:63)
==3719673==    by 0x1092F9: main (afl-fuzz-main.c:46)

==3719673==
==3719673== Invalid write of size 8
==3719673==    at 0x4842A8B: memmove (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==3719673==    by 0x48CFFE5: talloc_string_sub2 (util_str.c:281)
==3719673==    by 0x1098C8: LLVMFuzzerTestOneInput (fuzz_talloc_string_sub2.c:63)
==3719673==    by 0x1092F9: main (afl-fuzz-main.c:46)
==3719673==  Address 0x7508828 is 12 bytes after a block of size 572 alloc'd
==3719673==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==3719673==    by 0x48DE365: __talloc_with_prefix (talloc.c:783)
==3719673==    by 0x48DE4FF: __talloc (talloc.c:825)
==3719673==    by 0x48E1FDC: __talloc_strlendup (talloc.c:2454)
==3719673==    by 0x48E2096: talloc_strdup (talloc.c:2470)
==3719673==    by 0x48CFCF5: talloc_string_sub2 (util_str.c:217)
==3719673==    by 0x1098C8: LLVMFuzzerTestOneInput (fuzz_talloc_string_sub2.c:63)
==3719673==    by 0x1092F9: main (afl-fuzz-main.c:46)
==3719673==
==3719673== Invalid write of size 8
==3719673==    at 0x4842A93: memmove (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==3719673==    by 0x48CFFE5: talloc_string_sub2 (util_str.c:281)
==3719673==    by 0x1098C8: LLVMFuzzerTestOneInput (fuzz_talloc_string_sub2.c:63)
==3719673==    by 0x1092F9: main (afl-fuzz-main.c:46)
==3719673==  Address 0x7508830 is 20 bytes after a block of size 572 alloc'd
==3719673==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==3719673==    by 0x48DE365: __talloc_with_prefix (talloc.c:783)
==3719673==    by 0x48DE4FF: __talloc (talloc.c:825)
==3719673==    by 0x48E1FDC: __talloc_strlendup (talloc.c:2454)
==3719673==    by 0x48E2096: talloc_strdup (talloc.c:2470)
==3719673==    by 0x48CFCF5: talloc_string_sub2 (util_str.c:217)
==3719673==    by 0x1098C8: LLVMFuzzerTestOneInput (fuzz_talloc_string_sub2.c:63)
==3719673==    by 0x1092F9: main (afl-fuzz-main.c:46)
==3719673==

valgrind: m_mallocfree.c:305 (get_bszB_as_is): Assertion 'bszB_lo == bszB_hi' failed.
valgrind: Heap block lo/hi size mismatch: lo = 640, hi = 144680345676153346.

[... valgrind proceeds to die ...]
Comment 1 Douglas Bagnall 2021-04-06 02:35:50 UTC
Created attachment 16581 [details]
the read crash
Comment 2 Douglas Bagnall 2021-04-06 02:37:21 UTC
Created attachment 16582 [details]
the write crash
Comment 3 Douglas Bagnall 2021-04-06 10:18:33 UTC
(In reply to Douglas Bagnall from comment #0)

The valgrind line numbers and file names were off (master has changed). Here are up-to-date snippets:

==3784219== Invalid read of size 1
==3784219==    at 0x483EF46: strlen (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==3784219==    by 0x486D409: talloc_string_sub2 (substitute.c:248)
==3784219==    by 0x1098C8: LLVMFuzzerTestOneInput (fuzz_talloc_string_sub2.c:63)
==3784219==    by 0x1092F9: main (afl-fuzz-main.c:46)
==3784219==  Address 0x74f4842 is 30 bytes before a block of size 528 in arena "client"


==3785419== Invalid write of size 8
==3785419==    at 0x4842A9B: memmove (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==3785419==    by 0x486D445: talloc_string_sub2 (substitute.c:250)
==3785419==    by 0x1098C8: LLVMFuzzerTestOneInput (fuzz_talloc_string_sub2.c:63)
==3785419==    by 0x1092F9: main (afl-fuzz-main.c:46)
==3785419==  Address 0x74f4818 is 568 bytes inside a block of size 572 alloc'd
==3785419==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==3785419==    by 0x48D0365: __talloc_with_prefix (talloc.c:783)
==3785419==    by 0x48D04FF: __talloc (talloc.c:825)
==3785419==    by 0x48D3FDC: __talloc_strlendup (talloc.c:2454)
==3785419==    by 0x48D4096: talloc_strdup (talloc.c:2470)
==3785419==    by 0x486D142: talloc_string_sub2 (substitute.c:185)
==3785419==    by 0x1098C8: LLVMFuzzerTestOneInput (fuzz_talloc_string_sub2.c:63)
==3785419==    by 0x1092F9: main (afl-fuzz-main.c:46)


In substitute.c, 248 is this memmove, and 250 is the memcpy, and the problem in either case is that p is wrong, due to the strstr_m error:


		if (li != lp) {
			memmove(p+li,p+lp,strlen(p+lp)+1);
		}
		memcpy(p, in, li);
		s = p + li;
		ls += ld;
Comment 4 Jeremy Allison 2021-04-06 19:53:05 UTC
Hi Douglas, can you give me an example of a "bad" input to strstr_m() so I can try and follow what is going wrong here. Currently I don't understand the bug :-(.
Comment 5 Douglas Bagnall 2021-04-06 23:06:12 UTC
Created attachment 16583 [details]
proof of concept not reliant on fuzzers

Not quite what you asked for Jeremy, but a step in that direction. Here is a stand alone proof of concept.

$ bin/default/lib/util/crash_talloc_string_sub2
Aborted
Comment 6 Douglas Bagnall 2021-04-06 23:10:12 UTC
Created attachment 16584 [details]
proof of concept

THIS is the proof-of-concept patch that will actually apply on master.
Comment 7 Jeremy Allison 2021-04-06 23:11:24 UTC
Oh that's perfect Douglas, thanks ! I can now walk through this with a debugger and figure out what's going wrong (hopefully). Thanks *so much* for doing this.
Comment 8 Douglas Bagnall 2021-04-06 23:25:03 UTC
In strstr_m, push_ucs2_talloc(mem_ctx, &find_w, findstr, &converted_size) is putting a 0 uint16_t at around position 146? 

and so it looks like that matches the beginning of src_w.
Comment 9 Douglas Bagnall 2021-04-06 23:31:22 UTC
(In reply to Douglas Bagnall from comment #8)
because codepoint F002 is in a private use area?

I will stop for now.
Comment 10 Douglas Bagnall 2021-04-08 05:13:38 UTC
Created attachment 16585 [details]
the crash test reduced to short mostly printable strings

The new attachment is this:

#include "includes.h"

const char string[] =  "++++++\xD8\xBB";
const char pattern[] = "++++\xF0\x80\x80\x80+++++";
const char replace[] = "......................";

int main(int argc, char *argv[])
{
	char *t = NULL;
	printf("%s\n %zu\n", string, strlen(string));
	printf("%s\n %zu\n", pattern, strlen(pattern));

	t = talloc_string_sub(NULL, string, pattern, replace);

	printf("%s\n %zu\n", t, strlen(t));
	talloc_free(t);
	return 0;
}



which produces a lot of output under valgrind, and this when run normally:


++++++ػ
 8
++++����+++++
 13
......................
 22

(sometimes finishing earlier with "munmap_chunk(): invalid pointer\n Aborted").


My theory is still that the high bytes in *pattern* look like a terminator, so *pattern* looks like it is "++++", matching the start of *string*. Then it loses track of which length it is using.

If *string* contains no high bytes we don't seem to go down this road.

So there are sort of 2 bugs. One is that we do this replacement at all. The other is that we do it so badly.
Comment 11 Douglas Bagnall 2021-04-08 05:17:21 UTC
I haven't found callers where the pattern is not a known constant string (typically something like "%M"), in which case there is no security concern, or even any need to backport. But I haven't looked exhaustively.
Comment 12 Douglas Bagnall 2021-04-08 08:56:06 UTC
Ha, yes, the sequence "\xF0\x80\x80\x80" is read by utf8_pull() in the smb_iconv code as the 4-byte encoding of '\0'.

As you can see, it skips past the high bits check here, then as there are no low bits it encodes zero:

		if ((c[0] & 0xf8) == 0xf0) {
			unsigned int codepoint;
			if (in_left < 4 ||
			    (c[1] & 0xc0) != 0x80 ||
			    (c[2] & 0xc0) != 0x80 ||
			    (c[3] & 0xc0) != 0x80) {
				errno = EILSEQ;
				goto error;
			}
			codepoint =
				(c[3]&0x3f) |
				((c[2]&0x3f)<<6) |
				((c[1]&0x3f)<<12) |
				((c[0]&0x7)<<18);


(that's lib/util/charset/iconv.c line 858-872).


*Presumably* real iconv rejects over-length encodings like this, and users of that will be fine. I believe the spec and battle-worn wisdom agree on that behaviour.

The question is: does anyone really use our iconv? If they do, we really need to fix this, as there will be other routes to the same bug. If not, well, we might as well fix it to get compatible behaviour.
Comment 13 Douglas Bagnall 2021-04-08 09:24:55 UTC
Created attachment 16586 [details]
The fix

With this patch, we get the following. Note that not only does it not upset valgrind, it also produces the right answer ("++++����+++++" does not occur in "++++++ػ", so there is no substitution).


$ valgrind ./bin/default/lib/util/crash_talloc_string_sub2
==29596== Memcheck, a memory error detector
==29596== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==29596== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==29596== Command: ./bin/default/lib/util/crash_talloc_string_sub2
==29596== 
++++++ػ
 8
++++����+++++
 13
++++++ػ
 8
==29596== 
==29596== HEAP SUMMARY:
==29596==     in use at exit: 1,106 bytes in 6 blocks
==29596==   total heap usage: 1,238 allocs, 1,232 frees, 173,179 bytes allocated
==29596== 
==29596== LEAK SUMMARY:
==29596==    definitely lost: 0 bytes in 0 blocks
==29596==    indirectly lost: 0 bytes in 0 blocks
==29596==      possibly lost: 1,106 bytes in 6 blocks
==29596==    still reachable: 0 bytes in 0 blocks
==29596==         suppressed: 0 bytes in 0 blocks
==29596== Rerun with --leak-check=full to see details of leaked memory
==29596== 
==29596== For counts of detected and suppressed errors, rerun with: -v
==29596== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Comment 14 Jeremy Allison 2021-04-08 15:42:28 UTC
(In reply to Douglas Bagnall from comment #13)

Douglas, this is *amazingly* great work - thanks ! I don't think anyone uses our iconv in preference to the system iconv, so I'm guessing this isn't a security issue.

But thanks a million for the fix !
Comment 15 Douglas Bagnall 2021-04-08 18:44:13 UTC
(In reply to Jeremy Allison from comment #14)
> I don't think anyone uses our iconv in preference to the system iconv, 
> so I'm guessing this isn't a security issue.

Yeah, I wonder about embedded things, but they wouldn't get the patches anyway.

Should we just get rid of the internal iconv?

I will add a test or two.
Comment 16 Douglas Bagnall 2021-04-16 03:59:31 UTC
Björn, since you just offered an informed opinion on the list about --without-iconv, I am wondering how many people you think might be a) affected by this, and b) able to patch?

(I'm assuming --without-iconv means using our local mini-iconv, where the bug lies).
Comment 17 Jeremy Allison 2021-05-19 21:33:32 UTC
Ping. Can we get this fix into master please ?
Comment 18 Björn Jacke 2021-05-20 12:11:21 UTC
(In reply to Douglas Bagnall from comment #16)
the problem with some iconv implementations is that they do not have the charset names that Samba likes to make use of. On AIX for example you have these charsets available in iconv:

ASCII-GR BIG5-HKSCS CNS11643.1986-1 CNS11643.1986-2 GB18030 GBK IBM-037 IBM-1006 IBM-1046 IBM-1124 IBM-1129 IBM-1251 IBM-1252 IBM-1363 IBM-278 IBM-500 IBM-850 IBM-850 IBM-850-GL IBM-850-GR IBM-856 IBM-921 IBM-922 IBM-932 IBM-943 IBM-eucCN IBM-eucJP IBM-eucKR IBM-eucTW IBM-sbdTW IBM-udcJP IBM-udcTW ISCII.1991 ISO-2022-JP ISO-8859-1 ISO8859-1 ISO8859-1 ISO8859-1-GL ISO8859-1-GL ISO8859-1-GR ISO8859-1-GR ISO8859-15 ISO8859-15 ISO8859-15-GL ISO8859-15-GR ISO8859-2 ISO8859-2-GL ISO8859-2-GR ISO8859-3 ISO8859-3-GL ISO8859-3-GR ISO8859-4 ISO8859-4-GL ISO8859-4-GR ISO8859-5 ISO8859-5-GL ISO8859-5-GR ISO8859-6 ISO8859-6-GL ISO8859-6-GR ISO8859-7 ISO8859-7-GL ISO8859-7-GR ISO8859-8 ISO8859-8-GL ISO8859-8-GR ISO8859-9 ISO8859-9-GL ISO8859-9-GR JISX0201.1976-0 JISX0208.1983-0 KSC5601.1987-0 KSC5601.1987-1 TIS-620 UCS-2 UNICODE-2 UTF-16 UTF-16le UTF-32 UTF-8 big5 ct fold7 fold8 uucode

No plain ASCII, no UCS-2LE for example, even though "UCS-2", which is available, is the same as UCS-2LE on AIX and on Linux' iconv also.

Currently all the buildtools/wafsamba/samba_conftests.py charset tests fail but the HAVE_CHARSET_ defines are never used, so the tests are quite useless at the moment. Errors will pop up at runtime, when unsupported charsets are being used only.
Comment 19 Douglas Bagnall 2021-06-03 03:36:55 UTC
(In reply to Jeremy Allison from comment #17)
ping acknowledged.

I am hoping to get onto it in the enxt few days.
Comment 20 Douglas Bagnall 2021-06-16 06:05:44 UTC
Created attachment 16655 [details]
WIP patch for master

I'll put this here because it is dinner time, and as I was going to make the knownfail to bridge the tests and the fix, I found one test crashed instead of failed. Which I know shouldn't be a surprise, given what it's testing.

So I am putting this here lest I forget about it again.
Comment 21 Douglas Bagnall 2021-06-16 07:09:05 UTC
Created attachment 16656 [details]
patch for master

dinner is late. Here is the working patch.
Comment 22 Samba QA Contact 2021-06-18 04:28:04 UTC
This bug was referenced in samba master:

50047588c0c8da2e1ffa0b08a8dc5d31e49f6a3b
1ea1816629104e16d9b180bee8f4ccc6292869ed
Comment 23 Douglas Bagnall 2021-06-18 04:44:05 UTC
This will backport easily if we want it.
Comment 24 Björn Jacke 2022-03-07 00:13:02 UTC
looks like we can close this now.