Bug 9026 - 3.6.6 upgrade from 3.5.x fails with "Couldn't migrate printers tdb file: NT_STATUS_NO_MEMORY"
3.6.6 upgrade from 3.5.x fails with "Couldn't migrate printers tdb file: NT_S...
Status: RESOLVED FIXED
Product: Samba 3.6
Classification: Unclassified
Component: Printing
3.6.6
All All
: P5 major
: ---
Assigned To: Karolin Seeger
Samba QA Contact
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2012-07-04 16:07 UTC by Chris Smith
Modified: 2016-05-18 08:09 UTC (History)
4 users (show)

See Also:


Attachments
level 10 smbd.log (128.08 KB, application/octet-stream)
2012-07-04 16:07 UTC, Chris Smith
no flags Details
ntdrivers.tdb (60.00 KB, application/octet-stream)
2012-07-04 16:07 UTC, Chris Smith
no flags Details
ntprinters.tdb (328.00 KB, application/octet-stream)
2012-07-04 16:08 UTC, Chris Smith
no flags Details
Proposed fix based on master (4.36 KB, patch)
2012-07-06 12:12 UTC, David Disseldorp
gd: review+
Details
patch for 3.6.x (38.13 KB, patch)
2012-07-16 13:27 UTC, Guenther Deschner
gd: review+
Details
additional patch for 3.6.x (1.21 KB, patch)
2012-08-28 12:32 UTC, Guenther Deschner
ddiss: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Chris Smith 2012-07-04 16:07:13 UTC
Created attachment 7689 [details]
level 10 smbd.log

3.6.6 fails to start smbd after upgrade from 3.5.x (15/16).

smbd.log level 10:
=============================================
[2012/07/04 10:13:18.361396,  5, effective(0, 0), real(0, 0)] printing/nt_printing_migrate_internal.c:54(rename_file_with_suffix)
  moved '/var/lib/samba/ntdrivers.tdb' to '/var/lib/samba/ntdrivers.tdb.bak'
[2012/07/04 10:13:18.361589,  3, effective(0, 0), real(0, 0)] lib/charcnv.c:537(convert_string_talloc)
  convert_string_talloc: Conversion error: Illegal multibyte sequence(�0)
[2012/07/04 10:13:18.361664,  0, effective(0, 0), real(0, 0)] lib/charcnv.c:543(convert_string_talloc)
  Conversion error: Illegal multibyte sequence(�0)
[2012/07/04 10:13:18.361726,  1, effective(0, 0), real(0, 0)] ../librpc/ndr/ndr.c:414(ndr_pull_error)
  ndr_pull_error(5): Bad character conversion with flags 0x42
[2012/07/04 10:13:18.361786,  2, effective(0, 0), real(0, 0)] printing/nt_printing_migrate.c:207(printing_tdb_migrate_printer)
  printer pull failed: Character Conversion Error
[2012/07/04 10:13:18.361911,  0, effective(0, 0), real(0, 0)] printing/nt_printing_migrate_internal.c:237(nt_printing_tdb_migrate)
  Couldn't migrate printers tdb file: NT_STATUS_NO_MEMORY
=============================================
Comment 1 Chris Smith 2012-07-04 16:07:49 UTC
Created attachment 7690 [details]
ntdrivers.tdb
Comment 2 Chris Smith 2012-07-04 16:08:22 UTC
Created attachment 7691 [details]
ntprinters.tdb
Comment 3 David Disseldorp 2012-07-04 16:14:29 UTC
I'll attempts to reproduce this one using the tdbs provided.
Comment 4 David Disseldorp 2012-07-05 10:54:46 UTC
(In reply to comment #3)
> I'll attempts to reproduce this one using the tdbs provided.

The ntprinting_printer->printer_data[0].name field appears to be at fault.
It contains the following, which is parsed as a UTF8 null-terminated string:
{0x50, 0x52, 0x49, 0x56, 0xe2, 0x30, 0x0} = "PRIV\342\060"


(gdb) bt
#0  convert_string_talloc (ctx=0x555556110e70, from=from@entry=CH_UTF8, to=to@entry=CH_UNIX, src=<optimized out>, srclen=7, dst=0x7fffffffd8c0, converted_size=0x7fffffffd8c8,
    allow_bad_conv=false) at lib/charcnv.c:537
#1  0x00005555559a5af5 in ndr_pull_string (ndr=ndr@entry=0x555556117040, ndr_flags=ndr_flags@entry=1, s=s@entry=0x5555561191b8) at ../librpc/ndr/ndr_string.c:273
#2  0x00005555559c28be in ndr_pull_ntprinting_printer_data (ndr=ndr@entry=0x555556117040, ndr_flags=ndr_flags@entry=1, r=0x5555561191b0) at librpc/gen_ndr/ndr_ntprinting.c:467
#3  0x00005555559c36cf in ndr_pull_ntprinting_printer (ndr=ndr@entry=0x555556117040, ndr_flags=ndr_flags@entry=3, r=r@entry=0x7fffffffdb30) at ../librpc/ndr/ndr_ntprinting.c:61
#4  0x00005555559a00ee in ndr_pull_struct_blob (blob=blob@entry=0x7fffffffd9f0, mem_ctx=mem_ctx@entry=0x555556110e70, p=p@entry=0x7fffffffdb30,
    fn=0x5555559c35c0 <ndr_pull_ntprinting_printer>) at ../librpc/ndr/ndr.c:911
#5  0x00005555559bef50 in printing_tdb_migrate_printer (mem_ctx=mem_ctx@entry=0x555556110e70, winreg_pipe=winreg_pipe@entry=0x555556115a60,
    key_name=key_name@entry=0x555556116fd9 "x642e", data=data@entry=0x5555561181e0 "H\020", length=2842) at printing/nt_printing_migrate.c:203
#6  0x00005555559bf7b4 in migrate_internal (mem_ctx=mem_ctx@entry=0x555556110e70, tdb_path=tdb_path@entry=0x555556114250 "/var/lib/samba/ntprinters.tdb",
    winreg_pipe=0x555556115a60) at printing/nt_printing_migrate_internal.c:129
#7  0x00005555559bfd72 in nt_printing_tdb_migrate (msg_ctx=0x555556103b60) at printing/nt_printing_migrate_internal.c:234
#8  0x000055555590f5c7 in spoolss_init_cb (ptr=<optimized out>) at rpc_server/rpc_ep_setup.c:671
#9  0x000055555587ca30 in rpc_srv_register (version=version@entry=1, clnt=clnt@entry=0x555555bf3c14 "spoolss", srv=srv@entry=0x555555bf3c14 "spoolss",
    iface=0x5555560dae20 <ndr_table_spoolss>, cmds=cmds@entry=0x5555560f6400 <api_spoolss_cmds>, size=size@entry=110, rpc_srv_cb=rpc_srv_cb@entry=0x7fffffffde60)
    at rpc_server/srv_pipe_register.c:220
#10 0x000055555584ff50 in rpc_spoolss_init (rpc_srv_cb=rpc_srv_cb@entry=0x7fffffffde60) at librpc/gen_ndr/srv_spoolss.c:8652
#11 0x000055555590fe15 in dcesrv_ep_setup (ev_ctx=<optimized out>, msg_ctx=msg_ctx@entry=0x555556103b60) at rpc_server/rpc_ep_setup.c:1117
#12 0x000055555565e17d in main (argc=<optimized out>, argv=0x7fffffffe428) at smbd/server.c:1256
(gdb) up 1
#1  0x00005555559a5af5 in ndr_pull_string (ndr=ndr@entry=0x555556117040, ndr_flags=ndr_flags@entry=1, s=s@entry=0x5555561191b8) at ../librpc/ndr/ndr_string.c:273
273                             if (!convert_string_talloc(ndr->current_mem_ctx, chset,
(gdb) l
268
269                     NDR_PULL_NEED_BYTES(ndr, len1);
270                     if (len1 == 0) {
271                             as = talloc_strdup(ndr->current_mem_ctx, "");
272                     } else {
273                             if (!convert_string_talloc(ndr->current_mem_ctx, chset,
274                                                        CH_UNIX,
275                                                        ndr->data+ndr->offset, len1,
276                                                        (void **)(void *)&as,
277                                                        &converted_size, false))
(gdb) p len1
$17 = 7
(gdb) p *(ndr->data+ndr->offset)@7
$18 = "PRIV\342\060"
(gdb) p /x *(ndr->data+ndr->offset)@7
$19 = {0x50, 0x52, 0x49, 0x56, 0xe2, 0x30, 0x0}
(gdb) up 1
#2  0x00005555559c28be in ndr_pull_ntprinting_printer_data (ndr=ndr@entry=0x555556117040, ndr_flags=ndr_flags@entry=1, r=0x5555561191b0) at librpc/gen_ndr/ndr_ntprinting.c:467
467                                     NDR_CHECK(ndr_pull_string(ndr, NDR_SCALARS, &r->name));
(gdb) up 1
#3  0x00005555559c36cf in ndr_pull_ntprinting_printer (ndr=ndr@entry=0x555556117040, ndr_flags=ndr_flags@entry=3, r=r@entry=0x7fffffffdb30) at ../librpc/ndr/ndr_ntprinting.c:61
61                                      NDR_CHECK(ndr_pull_ntprinting_printer_data(ndr, NDR_SCALARS, &r->printer_data[r->count]));
(gdb) p *r
$20 = {info = {attributes = 4168, priority = 1, default_priority = 1, starttime = 0, untiltime = 0, status = 0, cjobs = 8, averageppm = 0, changeid = 10221829, c_setprinter = 0, 
    setuptime = 1320686522, servername = 0x5555561155e0 "", printername = 0x5555561178c0 "x642e", sharename = 0x555556117930 "x642e", 
    portname = 0x5555561179a0 "Samba Printer Port", drivername = 0x555556118d60 "x642e", comment = 0x555556118dd0 "Lexmark X642e", 
    location = 0x555556118e60 "Upstairs Management", sepfile = 0x555556115bc0 "", printprocessor = 0x555556116f70 "winprint", datatype = 0x555556118f00 "RAW", 
    parameters = 0x555556115d90 ""}, devmode = 0x555556118f70, count = 0, printer_data = 0x5555561191b0}
Comment 5 David Disseldorp 2012-07-05 13:28:25 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > I'll attempts to reproduce this one using the tdbs provided.
> 
> The ntprinting_printer->printer_data[0].name field appears to be at fault.
> It contains the following, which is parsed as a UTF8 null-terminated string:
> {0x50, 0x52, 0x49, 0x56, 0xe2, 0x30, 0x0} = "PRIV\342\060"
> 

Actually, this data should be in ntprinting_printer->devmode->nt_dev_private, we're just not correctly pulling that field as far as I can tell.

357                             NDR_CHECK(ndr_pull_generic_ptr(ndr, &_ptr_nt_dev_private));
(gdb) p r->driverextra 
$1 = 1728
...
(gdb) p /x *(ndr->data + ndr->offset)@8
$8 = {0x1, 0x0, 0x0, 0x0, 0xc0, 0x6, 0x0, 0x0}

That's the nt_dev_private data blog descriptor, the length 0x000006c0 matches r->driverextra. Furthermore:

(gdb) p *(ndr->data + ndr->offset + 8 + r->driverextra)@100
$13 = "\001\000\000\000PrinterDriverData\000\000\000\000\000\000\000\000\000\001\000\000\000PrinterDriverData\\TrayFormSize\000\004\000\000\000\004\000\000\000\326\000\000\000\001\000\000\000PrinterDriverData\\T"

That looks like the _real_ ntprinting_printer->printer_data, which contains registry key-value pairs.
Comment 6 David Disseldorp 2012-07-05 15:26:12 UTC
(In reply to comment #5)
...
> Actually, this data should be in ntprinting_printer->devmode->nt_dev_private,
> we're just not correctly pulling that field as far as I can tell.

There appears to be a bug in ndr_pull_DATA_BLOB(). The data blog length is not pulled, when called with any of the LIBNDR_ALIGN_FLAGS set.

This bug is more prevalent with the fix for bug 8373, as LIBNDR_ALIGN_FLAGS now includes LIBNDR_FLAG_NOALIGN, where it used to only include the less used LIBNDR_FLAG_ALIGN[248] flags.

I've tested with the following patch, with which unmarshalling of ntprinting_printer data succeeds:

diff --git a/librpc/ndr/ndr_basic.c b/librpc/ndr/ndr_basic.c
index 7a4e22a..89b94ae 100644
--- a/librpc/ndr/ndr_basic.c
+++ b/librpc/ndr/ndr_basic.c
@@ -1294,6 +1294,8 @@ _PUBLIC_ enum ndr_err_code ndr_pull_DATA_BLOB(struct ndr_pull *ndr, int ndr_flag
                if (ndr->data_size - ndr->offset < length) {
                        length = ndr->data_size - ndr->offset;
                }
+               ndr->offset += length;
+               NDR_CHECK(ndr_pull_uint32(ndr, NDR_SCALARS, &length));
        } else {
                NDR_CHECK(ndr_pull_uint32(ndr, NDR_SCALARS, &length));
        }

It still needs further testing, I'd also like feedback from someone more experienced with the ndr code.
Comment 7 David Disseldorp 2012-07-06 11:53:17 UTC
(In reply to comment #6)

> There appears to be a bug in ndr_pull_DATA_BLOB(). The data blog length is not
> pulled, when called with any of the LIBNDR_ALIGN_FLAGS set.
> 
> This bug is more prevalent with the fix for bug 8373, as LIBNDR_ALIGN_FLAGS now
> includes LIBNDR_FLAG_NOALIGN, where it used to only include the less used
> LIBNDR_FLAG_ALIGN[248] flags.
> 
> I've tested with the following patch, with which unmarshalling of
> ntprinting_printer data succeeds:

After looking further into ndr_push/pull_DATA_BLOB usage, I consider this change incorrect.

There are 3 use cases for DATA_BLOB marshalling/unmarshalling:

1)
ndr_push_DATA_BLOB and ndr_pull_DATA_BLOB when called with LIBNDR_FLAG_ALIGN* alignment flags set, are used to push/pull padding bytes _only_. The length is determined by the alignment required and the current ndr offset.
e.g. dcerpc.idl:
        typedef struct {
...
                [flag(NDR_ALIGN8)]    DATA_BLOB _pad;
        } dcerpc_request;

2)
When called with the LIBNDR_FLAG_REMAINING flag, all remaining bytes in the ndr buffer are pushed/pulled.
e.g. dcerpc.idl:
        typedef struct {
...
                [flag(NDR_REMAINING)] DATA_BLOB stub_and_verifier;
        } dcerpc_request;

3)
When called without alignment flags, push/pull a uint32 length _and_ a corresponding byte array to/from the ndr buffer.
e.g. drsblobs.idl
        typedef [public] struct { 
...
                DATA_BLOB data;
        } DsCompressedChunk;



The fix for bug 8373 changed the definition of "alignment flags", such that when called with NDR_NOALIGN ndr_push/pull_DATA_BLOB behaves as (1: padding bytes) rather than (3: uint32 length + byte array).

This breaks marshalling/unmarshalling for the following structures.
eventlog.idl:
        typedef [flag(NDR_NOALIGN|NDR_PAHEX),public] struct {
...
                DATA_BLOB sid;
...
        } eventlog_Record_tdb;

ntprinting.idl:
        typedef [flag(NDR_NOALIGN),public] struct {
...
                DATA_BLOB *nt_dev_private; 
        } ntprinting_devicemode;

        typedef [flag(NDR_NOALIGN),public] struct {
...
                DATA_BLOB data;
        } ntprinting_printer_data;
Comment 8 David Disseldorp 2012-07-06 12:12:17 UTC
Created attachment 7692 [details]
Proposed fix based on master
Comment 9 Chris Smith 2012-07-06 12:17:57 UTC
(In reply to comment #8)
> Created attachment 7692 [details]
> Proposed fix based on master

No patch then for 3.6.6 ?
Comment 10 David Disseldorp 2012-07-06 12:29:22 UTC
(In reply to comment #9)
> (In reply to comment #8)
> > Created attachment 7692 [details] [details]
> > Proposed fix based on master
> 
> No patch then for 3.6.6 ?

The same patch applies cleanly to 3.6.6.
Comment 11 Chris Smith 2012-07-06 13:57:26 UTC
(In reply to comment #10)
> The same patch applies cleanly to 3.6.6.

Thanks. I may get brave enough to try it over the weekend. So far 7 releases of 3.6 have been unusable for one reason or another. Beginning to feel like the 3.6 series is a forgotten stepchild. I keep wondering why 3.5 keeps getting new releases when 3.6 is supposed to replace it (but in some ways I'm glad since 3.6 has not been kind) and the major focus is on 4.0.
Comment 12 David Disseldorp 2012-07-06 14:04:09 UTC
(In reply to comment #11)
> (In reply to comment #10)
> > The same patch applies cleanly to 3.6.6.
> 
> Thanks. I may get brave enough to try it over the weekend. So far 7 releases of
> 3.6 have been unusable for one reason or another. Beginning to feel like the
> 3.6 series is a forgotten stepchild. I keep wondering why 3.5 keeps getting new
> releases when 3.6 is supposed to replace it (but in some ways I'm glad since
> 3.6 has not been kind) and the major focus is on 4.0.

Thanks for providing the logs and tdb, would have been much harder to get to the bottom of this without your report.

Please continue to raise bugs for any other 3.6 issues you are seeing, the 3.6 series is certainly not intended as a forgotten stepchild.
Comment 13 Guenther Deschner 2012-07-06 17:05:20 UTC
Comment on attachment 7692 [details]
Proposed fix based on master

Looks good, I pushed it to master, along with other fixes and tests.
Comment 14 David Disseldorp 2012-07-06 17:19:15 UTC
Thanks Günther.

Karolin, could you please merge to 3.6?
Comment 15 Guenther Deschner 2012-07-16 13:27:17 UTC
Created attachment 7707 [details]
patch for 3.6.x

v3-6-test version of that patch.
Comment 16 Guenther Deschner 2012-07-16 13:27:52 UTC
Karolin, please pick up for 3.6.x. Thanks!
Comment 17 Guenther Deschner 2012-07-27 08:18:42 UTC
Karolin, are there any problems with this patchset ? It really is a blocker for 3.6.7.
Comment 18 Karolin Seeger 2012-07-27 19:50:14 UTC
(In reply to comment #17)
> Karolin, are there any problems with this patchset ? It really is a blocker for
> 3.6.7.

Blocker do not exist any longer. I will push the patch soon.
Comment 19 Karolin Seeger 2012-07-27 19:52:11 UTC
(In reply to comment #18)
> (In reply to comment #17)
> > Karolin, are there any problems with this patchset ? It really is a blocker for
> > 3.6.7.
> 
> Blocker do not exist any longer. I will push the patch soon.

s/Blocker/Blockers/
Comment 20 Karolin Seeger 2012-07-29 18:45:38 UTC
Pushed to v3-6-test.
Closing out bug report.

Thanks!
Comment 21 Karolin Seeger 2012-08-28 09:48:30 UTC
These patches break the samba3.posix_s3.local.ndr test:

[1/1 in 0s] samba3.posix_s3.local.ndr (s3dc)
UNEXPECTED(failure): samba3.posix_s3.local.ndr .ntprinting_printer
REASON: _StringException: _StringException: ../source4/torture/ndr/ndr.c:51: Expression `ndr->offset == ndr->data_size' failed: 3233 unread bytes

command: bin/smbtorture4 --configfile=$SMB_CONF_PATH --maximum-runtime=$SELFTEST_MAXTIME --target=$SELFTEST_TARGET --basedir=$SELFTEST_TMPDIR --option="torture:winbindd_netbios_name=$SERVER" --option="torture:winbindd_netbios_domain=$DOMAIN" --option=torture:sharedelay=100000 --option=torture:progress=no --format=subunit $LISTOPT //$SERVER_IP/tmp -U$USERNAME%$PASSWORD local.ndr $LOADLIST 2>&1 | /data/git/samba/v3-6-test/selftest/filter-subunit $LISTOPT --fail-on-empty --prefix="samba3.posix_s3.local.ndr ."
expanded command: bin/smbtorture4 --configfile=/data/git/samba/v3-6-test/source3/st/client/client.conf --maximum-runtime=1200 --target=samba3 --basedir=/data/git/samba/v3-6-test/source3/st/tmp --option="torture:winbindd_netbios_name=LOCALS3DC2" --option="torture:winbindd_netbios_domain=SAMBA-TEST" --option=torture:sharedelay=100000 --option=torture:progress=no --format=subunit $LISTOPT //127.0.0.2/tmp -Ukseeger%locals3dc2pass local.ndr $LOADLIST 2>&1 | /data/git/samba/v3-6-test/selftest/filter-subunit $LISTOPT --fail-on-empty --prefix="samba3.posix_s3.local.ndr ."
ERROR: Testsuite[samba3.posix_s3.local.ndr (s3dc)]
REASON: Exit code was 1

Is this an issue with the test or has a bug been introduced?
Comment 22 Guenther Deschner 2012-08-28 12:32:53 UTC
Created attachment 7839 [details]
additional patch for 3.6.x

Ok, I see. Using the same function for testing as in master reveals the missing ndr flags for that call. Without this patch, the call "torture_suite_add_ndr_pull_test" cannot be used at all. 

All of this only affects testing...
Comment 23 David Disseldorp 2012-08-28 12:57:02 UTC
Comment on attachment 7839 [details]
additional patch for 3.6.x

Looks good Günther. This fix looks to have been in master for almost a year (430123e8).
Comment 24 Guenther Deschner 2012-08-28 13:36:38 UTC
Karolin, please add this followup patch. Thanks.
Comment 25 Karolin Seeger 2012-08-29 08:30:35 UTC
(In reply to comment #24)
> Karolin, please add this followup patch. Thanks.

Pushed to v3-6-test.
Closing out bug report.

Thanks for investigating!