Bug 4820 - User Manager: Unable to rechange the domain
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.0
Classification: Unclassified
Component: Other
Version: unspecified
Hardware: All All
Importance: P3 normal
Target Milestone: ---
Assignee: Andrew Bartlett
QA Contact: Andrew Bartlett
Reported: 2007-07-26 13:08 UTC by Matthias Dieter Wallnöfer
Modified: 2008-10-24 14:48 UTC

Attachments
The capture of the network traffic (196.03 KB, application/octet-stream)
2007-07-30 07:36 UTC, Matthias Dieter Wallnöfer
Valgrind log (1.62 KB, text/plain)
2007-08-28 05:06 UTC, Matthias Dieter Wallnöfer
Corrected provision.reg (727 bytes, patch)
2008-01-01 08:53 UTC, Matthias Dieter Wallnöfer

Description Matthias Dieter Wallnöfer 2007-07-26 13:08:16 UTC
When I change the domain in User Manager several times in a row, I get the error "a device attached to your system is not functioning". Running SAMBA at debug level 5, I can see some SMB requests arriving before the message is displayed on the client.
In SAMBA 3 the bug was not reproducible in my test, so the domain change works correctly there.
Comment 1 Andrew Bartlett 2007-07-30 04:08:03 UTC
I'm going to have to close this as a duplicate of the stacktrace bug because, without more info, I think it's just the same thing. If it still occurs, I'll need a network trace (pcap from Wireshark).

*** This bug has been marked as a duplicate of 4821 ***
Comment 2 Matthias Dieter Wallnöfer 2007-07-30 07:11:07 UTC
No, the bug is still reproducible. I'll try to produce a Wireshark capture.
Comment 3 Matthias Dieter Wallnöfer 2007-07-30 07:36:48 UTC
Created attachment 2846
The capture of the network traffic

Here is the wireshark pcap file. 192.168.1.9 is the address of the win2k client (vmware-win2k2), 192.168.1.10 of the samba server (vmware-samba4).
Comment 4 Matthias Dieter Wallnöfer 2007-08-23 16:42:24 UTC
Maybe also interesting: I recently tested the NT Server Manager for Domains, and it seems to run fine. Some RPCs are missing in SAMBA 4, but that is not a big problem. The change domain dialog there works without problems!
Comment 5 Matthias Dieter Wallnöfer 2007-08-27 10:09:38 UTC
Your latest work on it seems to have broken the whole User Manager. I wasn't even able to see the user list!
Comment 6 Andrew Bartlett 2007-08-27 17:00:11 UTC
Works fine for me...
Comment 7 Matthias Dieter Wallnöfer 2007-08-28 02:02:25 UTC
Previously it worked (better):
- When I start the User Manager for Domains while logged in as the domain administrator, I get a message box: "Wrong parameter. Do you want to select another domain to administer?"
- Then, when I click "Yes", I see my domain in the list, but when I confirm the selection, I get "No true selection"
Comment 8 Matthias Dieter Wallnöfer 2007-08-28 05:06:32 UTC
Created attachment 2898
Valgrind log

When running SAMBA in valgrind I noticed some messages in the logfile.
Comment 9 Matthias Dieter Wallnöfer 2007-08-29 01:52:04 UTC
Interesting: I also got the same valgrind error when clicking on "Replication" in the properties of the SAMBA machine in the "Server Manager for Domains".
Comment 10 Matthias Dieter Wallnöfer 2007-08-31 05:13:47 UTC
Comment on attachment 2898
Valgrind log

OK, it was quite a mess! My SVN working directory contained patches against the registry backend for the regtree bug. I reverted those changes and am now able to see the user list again, like you, Andrew! But the bug described above, which appears when you try to change the domain several times with the dialog, *remains*!
Comment 11 Matthias Dieter Wallnöfer 2007-09-01 16:44:16 UTC
So, after long testing, I think I have found the real issue. It is caused by the samr pipe, which sometimes doesn't seem to be closed properly. For example:
- You change the domain for the first time with the "Select Domain" command
- If you then try to change it several more times, you get the error message "unconnected device"
- But if you double-click a user object and then click "OK", the samr pipe is handled correctly: you can change the domain and the error no longer appears
Comment 12 Matthias Dieter Wallnöfer 2007-09-03 01:46:00 UTC
Could you please take another look at this bug? I've described the cause in the comment above.
But *why* is the User Manager often unable to close the samr pipe, so that it then tries to reconnect and fails? I think the best approach would be to compare the results with a Windows server!
Comment 13 Andrew Bartlett 2007-09-03 04:27:45 UTC
I've reproduced this, but I can't figure out what is going wrong.  

Sorry.

In particular, the error occurs without an immediately preceding network packet: the 'change domain' list is created, apparently successfully, and then the client fails. This makes it much harder to track down. Perhaps we get some expected value wrong?
Comment 14 Matthias Dieter Wallnöfer 2007-09-03 05:50:55 UTC
I know, the bug is very tricky. I spent some hours looking for a fix and went through many SAMBA 4 source files to analyze the cause. It is caused by the samr pipe; I'm quite sure of that. Maybe there is a small error in the IDL definition. All I can say is that I tested it with SAMBA 3 and there it *works* as it should. Maybe we could compare the code?
Comment 15 Andrew Bartlett 2007-09-03 06:03:17 UTC
Samba3 and Samba4 are very different in this area, but comparing outputs might help.  Let me know if you find anything more!
Comment 16 Matthias Dieter Wallnöfer 2007-09-04 15:43:53 UTC
My hypothesis is now that the problem could also be caused by the call dcesrv_samr_QueryDisplayInfo. Maybe it doesn't return exactly the expected values, and then the User Manager refuses to close the samr pipe. What do you think?
Comment 17 Matthias Dieter Wallnöfer 2007-12-08 14:30:06 UTC
The problem has become even worse now! I am no longer able to switch to my domain at all.
Error message: "Element not found!".
Comment 18 Matthias Dieter Wallnöfer 2008-01-01 08:45:35 UTC
The "Element not found!" problem is caused by a mistake in the registry. The key "SYSTEM" is written there as "System" rather than in all uppercase. Since LDB seems to be case sensitive, the ldb_search routine couldn't find the right key, and the operations through the WINREG pipe failed.
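To illustrate the mismatch described here, a minimal Python sketch follows. It is not the LDB API or Samba code: `KeyStore` and its methods are hypothetical names for an in-memory stand-in, and folding names to uppercase is only one assumed way to make lookups case insensitive.

```python
class KeyStore:
    """Hypothetical in-memory key store; a stand-in, not the LDB API."""

    def __init__(self, case_sensitive):
        self.case_sensitive = case_sensitive
        self._keys = {}

    def _normalize(self, name):
        # Fold case only when the store should behave case-insensitively.
        return name if self.case_sensitive else name.upper()

    def add(self, name, value):
        self._keys[self._normalize(name)] = value

    def lookup(self, name):
        # Returns None when no key matches.
        return self._keys.get(self._normalize(name))


# The key was provisioned as "System", but the client asks for "SYSTEM".
strict = KeyStore(case_sensitive=True)
strict.add("System", "provisioned subtree")
strict_result = strict.lookup("SYSTEM")    # exact-match search misses

folding = KeyStore(case_sensitive=False)
folding.add("System", "provisioned subtree")
folding_result = folding.lookup("SYSTEM")  # case-folded search finds it
```

With an exact-match store the lookup misses, which corresponds to the "Element not found!" symptom; with case folding the same request succeeds regardless of how the key was spelled at provisioning time.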
Comment 19 Matthias Dieter Wallnöfer 2008-01-01 08:53:56 UTC
Created attachment 3085
Corrected provision.reg

A corrected provision.reg file for the provisioning should be enough to solve this issue and similar ones.
But it seems to me that the Windows registry is case insensitive while LDB is not.
Please note that the rechange domain problem persists!
Comment 20 Andrew Bartlett 2008-01-06 18:19:30 UTC
Jelmer,

Any comment on the registry part of this bug? Have we had a regression regarding case sensitivity in the registry?
Comment 21 Jelmer Vernooij 2008-01-06 22:35:16 UTC
Yes, this is probably a bug in the LDB registry backend. I doubt it's a regression though, it's always been like this :-)

Not sure what the easiest way is to fix this. Is it possible to do case-insensitive searches in LDB?
Comment 22 Matthias Dieter Wallnöfer 2008-01-13 13:31:23 UTC
I think I've now found the real reason why this fails:

When opening a pipe, we don't set the "OpenNoRecall" flag, unlike SAMBA 3.
Comment 23 Matthias Dieter Wallnöfer 2008-01-13 14:59:11 UTC
I investigated the case a bit more and discovered that some parameters aren't set in the response when an NTCreateX request is handled by SAMBA 4. SAMBA 3 does it right.

This includes:
- "Create options" (in my case "OpenNoRecall" was not set even though the client requested it; according to Wireshark, the 0x400000 bit in libcli/raw/smb.h under NTCREATEX_OPTIONS should be named "NTCREATEX_OPTIONS_OPEN_NO_RECALL" rather than "UNKNOWN")
- "Create action" (in my example SAMBA 3 returned 1, "File existed and was opened")
- Response "File attributes" (in my example SAMBA 3 returned 0x80, "Normal file")
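The three fields listed above can be compared between two captures with a small Python sketch. It uses only the values quoted in this comment; the helper names and the constant names other than NTCREATEX_OPTIONS_OPEN_NO_RECALL are labels chosen for this sketch, and the all-zero response is a placeholder for "not set", not values taken from the actual capture.

```python
# Values quoted in the comment above.
NTCREATEX_OPTIONS_OPEN_NO_RECALL = 0x400000
CREATE_ACTION_EXISTED_OPENED = 1   # "File existed and was opened"
FILE_ATTRIBUTE_NORMAL = 0x80       # "Normal file"


def describe_response(create_options, create_action, file_attributes):
    """Decode the three NTCreateX response fields into booleans."""
    return {
        "open_no_recall": bool(create_options & NTCREATEX_OPTIONS_OPEN_NO_RECALL),
        "existed_and_opened": create_action == CREATE_ACTION_EXISTED_OPENED,
        "normal_file": bool(file_attributes & FILE_ATTRIBUTE_NORMAL),
    }


def differing_fields(reference, observed):
    """Return the sorted names of fields whose decoded values differ."""
    return sorted(k for k in reference if reference[k] != observed[k])


# Response as reported for SAMBA 3 in this comment.
samba3 = describe_response(0x400000, 1, 0x80)

# Placeholder for a response where none of the fields are set (assumption).
unset = describe_response(0, 0, 0)

mismatches = differing_fields(samba3, unset)
```

Feeding the decoded values from a SAMBA 3 trace and a SAMBA 4 trace into `differing_fields` would list exactly which of the three fields disagree.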
Comment 24 Matthias Dieter Wallnöfer 2008-02-26 12:31:08 UTC
Andrew, have you looked into this? Or who could be the right person for this pipe problem?
Comment 25 Andrew Bartlett 2008-02-27 03:58:06 UTC
My only problem is the lack of time to work on the tests to prove the problem.
Comment 26 Matthias Dieter Wallnöfer 2008-07-16 14:07:50 UTC
Good that you have now started on NTCREATEX_OPTIONS_OPEN_NO_RECALL! Hopefully handling it correctly fixes the issue (I think it has something to do with locking).
Comment 27 Andrew Bartlett 2008-07-16 18:26:30 UTC
The NT Create & X create options are not even referenced in the IPC case, so my changes will have no effect here. 
Comment 28 Matthias Dieter Wallnöfer 2008-08-14 11:32:47 UTC
Metze's latest commits finally fixed the problem. Well done!
Comment 29 Matthias Dieter Wallnöfer 2008-10-03 15:45:56 UTC
Sadly, I have to reopen the bug, because the same problem has reappeared in newer Git revisions.
Comment 30 Matthias Dieter Wallnöfer 2008-10-24 14:48:06 UTC
Now I know the real cause of the problem: the WINREG server. My fixes are now in the main Git repo, so the problem is now *really* fixed.