Bug 13824 - libsmbclient leaks when accessing files inside nested DFS namespaces
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: libsmbclient
Version: unspecified
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Jeremy Allison
QA Contact: Samba QA Contact
Depends on:
Reported: 2019-03-07 09:26 UTC by Chris
Modified: 2019-03-12 21:59 UTC

Debug output from smbclient (373.82 KB, text/plain), 2019-03-08 09:46 UTC, Chris
Extra debug patch. (3.07 KB, patch), 2019-03-11 18:07 UTC, Jeremy Allison
Debug output from smbclient with patch (360.36 KB, text/plain), 2019-03-12 00:46 UTC, Chris
Extra debug - v2. (3.50 KB, patch), 2019-03-12 18:04 UTC, Jeremy Allison
Debug output from smbclient with patch v2 (411.47 KB, text/plain), 2019-03-12 20:22 UTC, Chris
Extra debug - v3 (4.19 KB, patch), 2019-03-12 20:58 UTC, Jeremy Allison
Debug output from smbclient with patch v3 (501.36 KB, text/plain), 2019-03-12 21:55 UTC, Chris
Debug output from smbclient with patch v3 corrected (501.22 KB, text/plain), 2019-03-12 21:59 UTC, Chris

Description Chris 2019-03-07 09:26:01 UTC
This is somewhat similar to https://bugzilla.samba.org/show_bug.cgi?id=11195 and https://bugzilla.samba.org/show_bug.cgi?id=11624

The patch in 11624 resolved the issue except in instances where we are accessing files within nested DFS namespaces. In these circumstances the same symptoms as in 11195 are seen.

To reproduce, create two DFS namespaces:-

domain.com\namespace1
domain.com\namespace2

Create a folder target inside namespace1 which points to namespace2:-

domain.com\namespace1\Target -> domain.com\namespace2

Then open and close files using domain.com\namespace1\Target

As with 11195 running...

lsof | grep microsoft

...will show multiple connections labelled "microsoft-ds (ESTABLISHED)" to the target server even after the files have been closed. The connections will only be closed once the parent process is terminated.

Accessing multiple files in this way can quickly lead to the file server returning NT_STATUS_TOO_MANY_OPENED_FILES.
Comment 1 Jeremy Allison 2019-03-07 21:53:31 UTC
Hmmm. Can you get debug logs to show exactly *when* the extra connections are being made? That would help me track down the circumstances causing this.
Comment 2 Jeremy Allison 2019-03-07 21:54:15 UTC
Or indeed, a wireshark trace might also do the trick. I need to determine why when going to the same server name we're not re-using the existing connections in the DFS connection cache.
Comment 3 Chris 2019-03-08 00:11:28 UTC
Sure, no problem.

I tried running:-

smbclient //domain/dfs -W DOMAIN.COM -U username -d10 -l/tmp --option=gensec:gse_krb5=no

...and a /tmp/log.smbclient is created but never written.

I'll try to get Wireshark running tomorrow.
Comment 4 Chris 2019-03-08 09:46:15 UTC
Created attachment 14912
Debug output from smbclient
Comment 5 Chris 2019-03-08 09:46:28 UTC
OK, I set debug to 10 and tee-d the output. Hope that's OK?

I've attached the log file. As you'll see all I was doing was navigating the folder structure and dir-ing. By the end of this test:-

lsof | grep microsoft-ds | grep smbclient -c

...was showing 30.

I also spotted that in the log there are entries like:-

dos_clean_name [\DFS\Information for Staff\Information for Staff\Information for Students]
unix_clean_name [\DFS\Information for Staff\Information for Staff\Information for Students]

...but the actual path is:-

\DFS\Information for Staff\Information for Students

So it seems to be doubling part of the path up somewhere. Not sure if this is just a display issue but I've also found a problem with renaming files in this type of nested namespace environment where libsmbclient returns that it can't find the file. Could this be related?
Comment 6 Jeremy Allison 2019-03-09 01:07:34 UTC
At a brief glance it looks like it's not finding the existing cached connections, so it's opening a new connection for every operation.

Even with SMB2 we should be finding existing connections tagged with remote_host name, look at the cli_cm_find(referring_cli, server, share) code in cli_cm_open().

Are you able to rebuild the code ? If so, I might send you a patch adding extra debugs that will show what remote names we're caching and examining why the cached connection lookup isn't working.
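
The cached-connection lookup described above can be pictured with a minimal sketch. It assumes the cache is a linked list keyed by a server/share pair and compared case-insensitively; `struct cached_conn` and `cache_find()` are hypothetical names for illustration, not Samba's `struct cli_state` or `cli_cm_find()`.

```c
#include <stddef.h>
#include <strings.h>

/* Hypothetical stand-in for a cached connection (not Samba's struct cli_state). */
struct cached_conn {
	const char *server;
	const char *share;
	struct cached_conn *next;
};

/*
 * Return the first cached entry whose server and share both match
 * (case-insensitively), or NULL when nothing in the list matches.
 */
struct cached_conn *cache_find(struct cached_conn *head,
			       const char *server,
			       const char *share)
{
	struct cached_conn *c;

	for (c = head; c != NULL; c = c->next) {
		if (strcasecmp(c->server, server) == 0 &&
		    strcasecmp(c->share, share) == 0) {
			return c;
		}
	}
	return NULL;
}
```

With a lookup of this shape, a caller that opens a new connection on every NULL result will accumulate connections whenever the server/share pair it asks for never matches what was stored.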
Comment 7 Chris 2019-03-09 12:19:11 UTC
Sure. Fire it over when you're ready and I'll build and post back more logs.

Any thoughts on the incorrect paths in the logs?
Comment 8 Jeremy Allison 2019-03-11 18:07:58 UTC
Created attachment 14916
Extra debug patch.

Can you try building with this and giving me the level 10 output. Should give me more data on the problems.

Thanks !

Comment 9 Chris 2019-03-11 23:44:36 UTC
OK, I applied the patch and built with no issues but I don't see any extra output. I'm doing this against the latest CentOS 7 srpm. I guess that's the issue?
Comment 10 Jeremy Allison 2019-03-12 00:07:10 UTC
You need to run with debug level 10. The extra calls are DBG_DEBUG (log level 10) values. If you're running with debug level 10 but don't see the new calls that at least is showing the SMB2 calls aren't going through the DFS connection manager (which is strange).
Comment 11 Chris 2019-03-12 00:10:32 UTC
So as above?

smbclient //domain/dfs -W DOMAIN.COM -U username -d10 --option=gensec:gse_krb5=no

If so then definitely not seeing any of the additional output. I'm just building from the latest source from samba.org to check if that makes a difference.
Comment 12 Jeremy Allison 2019-03-12 00:20:55 UTC
OK, just in case it's something strange to do with DBG_DEBUG(), change these calls to d_printf() to have the debug data come out in the normal output stream and re-test. If you still don't see them, then there's a problem that SMB2 just isn't going through the DFS connection caching layer.
Comment 13 Chris 2019-03-12 00:46:34 UTC
Created attachment 14917
Debug output from smbclient with patch

Rebuilt it from latest and it looks like the additional output is there.

See attached.
Comment 14 Jeremy Allison 2019-03-12 17:31:04 UTC
OK, this looks like the core of the error:

cli_resolve_path: Calling cli_dfs_get_referral on dfs_path \2012FS\DFS\Information for Staff
signed SMB2 message
cli_resolve_path: Calling cli_cm_find on server domain.com share Information for Staff
cli_cm_find: Looking for connection to server domain.com share Information for Staff
cli_cm_find: List entry server 2012FS share DFS
cli_cm_find: List entry server 2012FS share IPC$

The server / share name pair it should be looking for inside the DFS connection caching code is incorrect.

It seems to be looking for a share called "Information for Staff".

Looks like cli_dfs_get_referral() is working correctly, but then this code:

        for (count = 0; count < num_refs; count++) {
                if (!split_dfs_path(dfs_refs, refs[count].dfspath,
                                    &dfs_refs[count].server,
                                    &dfs_refs[count].share,
                                    &dfs_refs[count].extrapath)) {
                        return NT_STATUS_NOT_FOUND;
                }

                ccli = cli_cm_find(rootcli, dfs_refs[count].server,
                                   dfs_refs[count].share);
                if (ccli != NULL) {
                        extrapath = dfs_refs[count].extrapath;
                        *targetcli = ccli;
                        break;
                }
        }

specifically the split_dfs_path() code is messing up. Extra patch
to follow to discover what is being passed into this function and
what it is doing.
Comment 15 Jeremy Allison 2019-03-12 18:04:11 UTC
Created attachment 14920
Extra debug - v2.

Updated patch to give debug info inside split_dfs_path().
Comment 16 Chris 2019-03-12 20:22:18 UTC
Created attachment 14921
Debug output from smbclient with patch v2

Rebuilt and output attached. I followed the same route through the folder structure as before.
Comment 17 Jeremy Allison 2019-03-12 20:47:42 UTC
OK, here is the problem:

cli_resolve_path: Calling cli_dfs_get_referral on dfs_path \2012FS\DFS\Information for Staff
signed SMB2 message
split_dfs_path: split_dfs_path: |\domain.com\Information for Staff|
split_dfs_path: server: |domain.com|
split_dfs_path: share: |Information for Staff|
split_dfs_path: extrapath: ||

The referral lookup is returning a bogus DFS path of:

\domain.com\Information for Staff

Given that, the code tries to parse it into 'server' \ 'share' and gets the wrong result.

How are you creating these DFS links ? Are they on a Samba server ?

I will now look into adding debugs into cli_dfs_get_referral() to see what might be doing this. We're making progress (slowly :-).
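
To make the parsing step concrete, here is a minimal sketch of a server/share/extrapath split. It assumes the first two backslash-separated components are always taken as server and share; `split_unc_path()` is a hypothetical helper written for illustration and is not Samba's `split_dfs_path()`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Split "\server\share\extra..." into three parts, copying each into a
 * caller-supplied buffer of size len. Returns false if the path has no
 * server\share pair or a component does not fit in its buffer.
 * Illustrative only: this is not Samba's split_dfs_path().
 */
bool split_unc_path(const char *path,
		    char *server, char *share, char *extra, size_t len)
{
	const char *p = path;
	const char *sep;
	size_t n;

	/* Skip any leading backslashes. */
	while (*p == '\\') {
		p++;
	}

	/* Server: everything up to the next backslash. */
	sep = strchr(p, '\\');
	if (sep == NULL) {
		return false;
	}
	n = (size_t)(sep - p);
	if (n >= len) {
		return false;
	}
	memcpy(server, p, n);
	server[n] = '\0';
	p = sep + 1;

	/* Share: up to the next backslash, or the end of the string. */
	sep = strchr(p, '\\');
	n = (sep != NULL) ? (size_t)(sep - p) : strlen(p);
	if (n == 0 || n >= len) {
		return false;
	}
	memcpy(share, p, n);
	share[n] = '\0';

	/* Extra path: whatever follows the share, possibly empty. */
	p = (sep != NULL) ? sep + 1 : p + n;
	n = strlen(p);
	if (n >= len) {
		return false;
	}
	memcpy(extra, p, n + 1);
	return true;
}
```

Fed `\domain.com\Information for Staff`, a split like this reports the domain as the server and the namespace name as the share with an empty extra path, which is the same shape as the debug output above.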
Comment 18 Jeremy Allison 2019-03-12 20:58:24 UTC
Created attachment 14922
Extra debug - v3

OK, here is another version that adds dump_data() calls to the SMB2 request to get the DFS referral. This is essentially a poor man's Wireshark trace (you could also just upload the Wireshark traces :-) but will allow me to examine the request/response data for the DFS referral lookup.
Comment 19 Chris 2019-03-12 21:26:29 UTC
OK, just building now.

To answer your question about how this is all set up, it's a little out of the ordinary (I hadn't ever come across one like this before) but apparently Microsoft allow it.

There are three separate domain-based DFS namespaces:-

domain.com\DFS
domain.com\Information for Students
domain.com\Information for Staff

From here, there are folder targets like so:-

domain.com\DFS\Information for Staff -> domain.com\Information for Staff
domain.com\DFS\Information for Students -> domain.com\Information for Students

And then, within the Information for Staff namespace there's one for Information for Students:-

domain.com\Information for Staff\Information for Students -> domain.com\Information for Students

So you effectively end up with:-

domain.com\DFS\Information for Staff\Information for Students

...where "Information for Staff" and "Information for Students" are folder targets to different namespaces. These namespaces are nested within one another.

Again, this isn't something I've seen before but Microsoft do seem to suggest that namespaces can be nested in this way.

I'll post the output back as soon as I have it.
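
The nesting described above can be modelled as repeated prefix substitution. The sketch below assumes each folder target simply rewrites the matching path prefix to its target namespace, and ignores how a domain name is resolved to an actual root server; `struct dfs_link`, the `links` table, and `resolve_nested()` are hypothetical names, not part of libsmbclient.

```c
#include <stdio.h>
#include <string.h>
#include <strings.h>

#define MAX_HOPS 8

/* One folder target: a folder inside a namespace pointing at another namespace. */
struct dfs_link {
	const char *prefix;
	const char *target;
};

/* The three folder targets from the setup described above. */
static const struct dfs_link links[] = {
	{ "\\domain.com\\DFS\\Information for Staff",
	  "\\domain.com\\Information for Staff" },
	{ "\\domain.com\\DFS\\Information for Students",
	  "\\domain.com\\Information for Students" },
	{ "\\domain.com\\Information for Staff\\Information for Students",
	  "\\domain.com\\Information for Students" },
};

/*
 * Follow folder targets until no link prefix matches. Writes the final
 * path to out and returns the number of hops taken, or -1 if a buffer is
 * too small or more than MAX_HOPS substitutions are needed.
 */
int resolve_nested(const char *path, char *out, size_t len)
{
	char tmp[512];
	int hops = 0;
	size_t i;

	if (strlen(path) >= len) {
		return -1;
	}
	strcpy(out, path);

	for (;;) {
		int matched = 0;

		for (i = 0; i < sizeof(links) / sizeof(links[0]); i++) {
			size_t plen = strlen(links[i].prefix);
			int n;

			/* The prefix must end on a path-component boundary. */
			if (strncasecmp(out, links[i].prefix, plen) != 0 ||
			    (out[plen] != '\\' && out[plen] != '\0')) {
				continue;
			}
			n = snprintf(tmp, sizeof(tmp), "%s%s",
				     links[i].target, out + plen);
			if (n < 0 || (size_t)n >= sizeof(tmp) || (size_t)n >= len) {
				return -1;
			}
			strcpy(out, tmp);
			matched = 1;
			hops++;
			break;
		}
		if (!matched) {
			return hops;
		}
		if (hops >= MAX_HOPS) {
			return -1;
		}
	}
}
```

Under these assumptions, a file under `\domain.com\DFS\Information for Staff\Information for Students` takes two hops to resolve, one per folder target, so any per-hop connection that is not reused would be opened twice for each access.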
Comment 20 Jeremy Allison 2019-03-12 21:43:07 UTC
OK, this is starting to look like a strange setup we haven't come across before and the code isn't coping well with - not a generic "DFS is broken with SMB2" bug.

I still want to know what the problem is, but this might take a little longer to fix as we'll need to be able to create a regression test case that duplicates the problem so we can test the fix.
Comment 21 Chris 2019-03-12 21:55:03 UTC
Created attachment 14925
Debug output from smbclient with patch v3

Yes, I agree. I had to check the docs to confirm that it was actually possible to configure things this way but I've replicated the setup on my network just to be sure.

Latest output attached.
Comment 22 Chris 2019-03-12 21:59:15 UTC
Created attachment 14926
Debug output from smbclient with patch v3 corrected

Apologies, one of the namespace names is wrong in the previous attachment. Corrected in this one.