Bug 14429 - time machine backups crash smbd when fruit:locking = netatalk
Summary: time machine backups crash smbd when fruit:locking = netatalk
Status: NEW
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services (show other bugs)
Version: 4.12.3
Hardware: All All
: P5 normal (vote)
Target Milestone: ---
Assignee: Samba QA Contact
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-07-02 20:00 UTC by Andrew Walker
Modified: 2022-01-17 08:44 UTC (History)
2 users (show)

See Also:


Attachments
WIP patch (2.31 KB, patch)
2020-07-02 20:00 UTC, Andrew Walker
no flags Details
pcap taken from samba server (17.40 KB, application/vnd.tcpdump.pcap)
2020-07-02 20:05 UTC, Andrew Walker
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Andrew Walker 2020-07-02 20:00:01 UTC
Created attachment 16107 [details]
WIP patch

User provided core of smbd crash. bt below.

frame 7 we try to talloc_free a non-talloced struct(static_share_mode_record):
#7  _talloc_free (ptr=0x7fffffffd468, location=0x80185bd9b "../../source3/locking/share_mode_lock.c:659") at ../../lib/talloc/talloc.c:1771
1771		tc = talloc_chunk_from_ptr(ptr);
(gdb) p *(struct db_record*)ptr
$3 = {db = 0x811209360, key = {dptr = 0x7fffffffd720 "\363_\301\034ޜ\337V", dsize = 24}, value = {dptr = 0x0, dsize = 0}, value_valid = true, storev = 0x806ea1970 <dbwrap_watched_do_locked_storev>, 
  delete_rec = 0x806ea26d0 <dbwrap_watched_do_locked_delete>, private_data = 0x7fffffffd5f8}

We get here because get_static_share_mode_data() always returns NT_STATUS_NOT_FOUND when this particular code path is followed. This is because the share mode data is stored in static_share_mode_record_value in this instance. dbwrap_watched_do_locked_fn() -> share_mode_do_locked_fn().

In my testing environment I was able to fix the crash by adding a function to return share-mode data from static_share_mode_record_value in case attempt from static_share_mode_record fails with NT_STATUS_NOT_FOUND. 

Now I'm not sure whether this is the correct way to address this issue. It's something I quickly threw together today when I saw the issue.

----------------------

#0  thr_kill () at thr_kill.S:3
#1  0x000000080ff5fe34 in __raise (s=6) at /truenas-releng/freenas/_BE/os/lib/libc/gen/raise.c:52
#2  0x000000080fed60d9 in abort () at /truenas-releng/freenas/_BE/os/lib/libc/stdlib/abort.c:67
#3  0x0000000806e8403c in dump_core () at ../../source3/lib/dumpcore.c:338
#4  0x0000000806e7a677 in smb_panic_s3 (why=0x802146c34 "Bad talloc magic value - unknown value") at ../../source3/lib/util.c:853
#5  0x000000080129c1f7 in smb_panic (why=0x188de <error: Cannot access memory at address 0x188de>) at ../../lib/util/fault.c:174
#6  0x0000000802143038 in talloc_chunk_from_ptr (ptr=0x7fffffffd468) at ../../lib/talloc/talloc.c:512
#7  _talloc_free (ptr=0x7fffffffd468, location=0x80185bd9b "../../source3/locking/share_mode_lock.c:659") at ../../lib/talloc/talloc.c:1771
#8  0x00000008016e1719 in get_share_mode_lock (mem_ctx=<optimized out>, id=..., servicepath=0x0, smb_fname=0x0, old_write_time=0x0) at ../../source3/locking/share_mode_lock.c:659
#9  0x00000008016d99d0 in get_existing_share_mode_lock (mem_ctx=0x188de, id=...) at ../../source3/locking/locking.c:471
#10 0x00000008018045ac in contend_level2_oplocks_begin_default (fsp=0x810e675e0, type=<optimized out>) at ../../source3/smbd/oplock.c:1302
#11 smbd_contend_level2_oplocks_begin (fsp=0x810e675e0, type=<optimized out>) at ../../source3/smbd/oplock.c:1343
#12 0x00000008016dbf31 in brl_lock_posix (br_lck=<optimized out>, plock=0x7fffffffd2f8) at ../../source3/locking/brlock.c:893
#13 brl_lock (br_lck=<optimized out>, smblctx=3941576778, pid=..., start=<optimized out>, size=<optimized out>, lock_type=(unknown: 287923008), lock_flav=<optimized out>, blocker_pid=0x7fffffffd778, 
    psmblctx=0x7fffffffd790) at ../../source3/locking/brlock.c:1023
#14 0x00000008016d9542 in do_lock_fn (value=..., modified_dependent=<optimized out>, private_data=0x7fffffffd740) at ../../source3/locking/locking.c:267
#15 0x00000008016e217d in share_mode_do_locked_fn (rec=0x7fffffffd468, value=..., private_data=0x0) at ../../source3/locking/share_mode_lock.c:745
#16 0x0000000806ea268b in dbwrap_watched_do_locked_fn (subrec=0x7fffffffd508, subrec_value=..., private_data=0x7fffffffd5f8) at ../../source3/lib/dbwrap/dbwrap_watch.c:354
#17 0x000000080484c85d in db_tdb_do_locked (db=0x811209260, key=..., fn=0x806ea25d0 <dbwrap_watched_do_locked_fn>, private_data=0x7fffffffd5f8) at ../../lib/dbwrap/dbwrap_tdb.c:222
#18 0x000000080484969c in dbwrap_do_locked (db=0x811209260, key=..., fn=0x806ea25d0 <dbwrap_watched_do_locked_fn>, private_data=<optimized out>) at ../../lib/dbwrap/dbwrap.c:550
#19 0x0000000806ea108c in dbwrap_watched_do_locked (db=0x811209360, key=..., fn=0x8016e2110 <share_mode_do_locked_fn>, private_data=0x7fffffffd6d0) at ../../source3/lib/dbwrap/dbwrap_watch.c:372
#20 0x000000080484969c in dbwrap_do_locked (db=0x811209360, key=..., fn=0x8016e2110 <share_mode_do_locked_fn>, private_data=<optimized out>) at ../../lib/dbwrap/dbwrap.c:550
#21 0x00000008016e1f8f in share_mode_do_locked (id=..., fn=0x8016d94a0 <do_lock_fn>, private_data=0x7fffffffd740) at ../../source3/locking/share_mode_lock.c:793
#22 0x00000008016d9269 in do_lock (fsp=0x810e675e0, req_mem_ctx=<optimized out>, req_guid=<optimized out>, smblctx=<optimized out>, count=1, offset=9223372036854775799, lock_type=READ_LOCK, 
    lock_flav=POSIX_LOCK, pblocker_pid=0x0, psmblctx=0x0) at ../../source3/locking/locking.c:331
#23 0x000000081b7c1451 in fruit_check_access (handle=<optimized out>, fsp=<optimized out>, access_mask=137, share_mode=1) at ../../source3/modules/vfs_fruit.c:653
#24 fruit_create_file (handle=<optimized out>, req=0x81b343640, root_dir_fid=0, smb_fname=0x810e675e0, access_mask=137, share_access=1, create_disposition=3, create_options=0, file_attributes=162, 
    oplock_request=256, lease=0x810f678f8, allocation_size=0, private_flags=0, sd=0x0, ea_list=0x0, result=0x810f67888, pinfo=0x810f6789c, in_context_blobs=0x7fffffffd9a0, out_context_blobs=0x81b2a1d40)
    at ../../source3/modules/vfs_fruit.c:4088
#25 0x00000008017e3d19 in smbd_smb2_create_send (mem_ctx=0x81b342b80, ev=<optimized out>, smb2req=0x81b342b80, in_oplock_level=64 '@', in_impersonation_level=2, in_desired_access=137, 
    in_file_attributes=162, in_share_access=1, in_create_disposition=3, in_create_options=0, in_name=<optimized out>, in_context_blobs=...) at ../../source3/smbd/smb2_create.c:983
#26 smbd_smb2_request_process_create (smb2req=<optimized out>) at ../../source3/smbd/smb2_create.c:268
#27 0x00000008017d9b9c in smbd_smb2_request_dispatch (req=<optimized out>) at ../../source3/smbd/smb2_server.c:2695
#28 0x00000008017dcf7a in smbd_smb2_io_handler (xconn=0x810e661e0, fde_flags=<optimized out>) at ../../source3/smbd/smb2_server.c:4046
#29 smbd_smb2_connection_handler (ev=<optimized out>, fde=<optimized out>, flags=<optimized out>, private_data=<optimized out>) at ../../source3/smbd/smb2_server.c:4084
#30 0x0000000802d64a3c in tevent_common_invoke_fd_handler (fde=0x810e4d800, flags=<optimized out>, removed=0x0) at ../../lib/tevent/tevent_fd.c:138
#31 0x0000000802d6b81f in kqueue_event_loop (kqueue_ev=0x810e8ce60, nkp=<optimized out>) at ../../lib/tevent/tevent_kqueue.c:611
#32 kqueue_event_loop_once (ev=<optimized out>, location=<optimized out>) at ../../lib/tevent/tevent_kqueue.c:1187
#33 0x0000000802d6952d in std_event_loop_once (ev=0x810e54060, location=0x8018be238 "../../source3/smbd/process.c:4170") at ../../lib/tevent/tevent_standard.c:114
#34 0x0000000802d63c61 in _tevent_loop_once (ev=0x810e54060, location=0x8018be238 "../../source3/smbd/process.c:4170") at ../../lib/tevent/tevent.c:775
#35 0x0000000802d63ec2 in tevent_common_loop_wait (ev=0x810e54060, location=0x8018be238 "../../source3/smbd/process.c:4170") at ../../lib/tevent/tevent.c:898
#36 0x0000000802d6959d in std_event_loop_wait (ev=0x810e54060, location=0x8018be238 "../../source3/smbd/process.c:4170") at ../../lib/tevent/tevent_standard.c:145
#37 0x00000008017c7dfd in smbd_process (ev_ctx=0x810e54060, msg_ctx=<optimized out>, sock_fd=39, interactive=<optimized out>) at ../../source3/smbd/process.c:4170
#38 0x000000000102caaf in smbd_accept_connection (ev=<optimized out>, fde=<optimized out>, flags=<optimized out>, private_data=<optimized out>) at ../../source3/smbd/server.c:1012
#39 0x0000000802d64a3c in tevent_common_invoke_fd_handler (fde=0x810e4d640, flags=<optimized out>, removed=0x0) at ../../lib/tevent/tevent_fd.c:138
#40 0x0000000802d6b81f in kqueue_event_loop (kqueue_ev=0x810e8c460, nkp=<optimized out>) at ../../lib/tevent/tevent_kqueue.c:611
#41 kqueue_event_loop_once (ev=<optimized out>, location=<optimized out>) at ../../lib/tevent/tevent_kqueue.c:1187
#42 0x0000000802d6952d in std_event_loop_once (ev=0x810e54060, location=0x1031c16 "../../source3/smbd/server.c:1359") at ../../lib/tevent/tevent_standard.c:114
#43 0x0000000802d63c61 in _tevent_loop_once (ev=0x810e54060, location=0x1031c16 "../../source3/smbd/server.c:1359") at ../../lib/tevent/tevent.c:775
#44 0x0000000802d63ec2 in tevent_common_loop_wait (ev=0x810e54060, location=0x1031c16 "../../source3/smbd/server.c:1359") at ../../lib/tevent/tevent.c:898
#45 0x0000000802d6959d in std_event_loop_wait (ev=0x810e54060, location=0x1031c16 "../../source3/smbd/server.c:1359") at ../../lib/tevent/tevent_standard.c:145
#46 0x000000000102b00f in smbd_parent_loop (ev_ctx=0x810e54060, parent=<optimized out>) at ../../source3/smbd/server.c:1359
Comment 1 Andrew Walker 2020-07-02 20:02:45 UTC
Steps to reproduce:
1) configure time machine share
2) set fruit:locking = netatalk
3) initiate time machine backup. In my lab I'm using 10.15.5 (19F101)
Comment 2 Andrew Walker 2020-07-02 20:05:16 UTC
Created attachment 16108 [details]
pcap taken from samba server

Add pcap of MacOS session for reference.
Comment 3 Andrew Walker 2020-07-02 20:17:13 UTC
That said, it's sorely tempting to just do something like this:

truenas# diff -udp share_mode_lock.c.orig share_mode_lock.c
--- share_mode_lock.c.orig	2020-07-01 14:12:32.537160630 -0700
+++ share_mode_lock.c	2020-07-01 14:16:06.895029703 -0700
@@ -656,7 +656,9 @@ struct share_mode_lock *get_share_mode_lock(
 	if (!NT_STATUS_IS_OK(status)) {
 		DBG_DEBUG("get_static_share_mode_data failed: %s\n",
 			  nt_errstr(status));
-		TALLOC_FREE(static_share_mode_record);
+		if (static_share_mode_record_talloced) {
+			TALLOC_FREE(static_share_mode_record);
+		}
 		goto fail;
 	}
Comment 4 Ralph Böhme 2020-07-02 20:20:48 UTC
(In reply to Andrew Walker from comment #3)
Oh what a mess. If noone else gets to it earlier, I'll try to take a look next week as I'm currenlty burried beneath other stuff.
Comment 5 Andrew Walker 2020-07-14 12:09:39 UTC
This appears like it may be a duplicate of Bug 14392.
Comment 6 Andrew Walker 2020-07-14 12:10:07 UTC
WIP patch is for v4-12
Comment 7 Ralph Böhme 2022-01-17 08:44:52 UTC
Opening bug for public access, this is not a security issue.