Bug 9697 - DsReplicaGetInfo fails due to sendto() EMSGSIZE error on UNIX domain socket
Summary: DsReplicaGetInfo fails due to sendto() EMSGSIZE error on UNIX domain socket
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.0
Classification: Unclassified
Component: DCE-RPCs and pipes
Version: 4.0.3
Hardware: All FreeBSD
Importance: P5 normal
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
Reported: 2013-03-02 19:20 UTC by Landon Fuller
Modified: 2013-03-06 11:22 UTC

Attachments
git format-patch (2.60 KB, patch)
2013-03-02 19:20 UTC, Landon Fuller
required patch from master to cherry-pick the fix (1.96 KB, patch)
2013-03-03 09:41 UTC, Andrew Bartlett
metze: review+
patches cherry-picked from master (2.85 KB, patch)
2013-03-03 09:41 UTC, Andrew Bartlett
metze: review+
patches for master to address metze's concerns and apply to tsocket_bsd (2.80 KB, patch)
2013-03-04 05:27 UTC, Andrew Bartlett
patches cherry-picked from master (2nd set) (3.17 KB, patch)
2013-03-04 23:22 UTC, Andrew Bartlett
metze: review+

Description Landon Fuller 2013-03-02 19:20:04 UTC
Created attachment 8603
git format-patch

On FreeBSD, local DCE/RPC (e.g., samba-tool drs showrepl) fails with the following error:

ERROR(runtime): DsReplicaGetInfo of type 0 failed - (-1073610723, 'NT_STATUS_RPC_PROTOCOL_ERROR')

On investigation, the call actually completes successfully, but the response is never sent because sendto() returns EMSGSIZE in unixdom_sendto().

This is caused by FreeBSD's default 2048-byte SO_SNDBUF for UNIX domain sockets; the DsReplicaGetInfo response is roughly 3004 bytes or more, and thus EMSGSIZE is returned. The send succeeds on Linux because its default SO_SNDBUF is larger, but there is no guarantee that the buffer will be sized appropriately.
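
The default send buffer discussed here can be inspected on any given platform. The following is a minimal sketch, assuming a POSIX system; the function name unix_dgram_default_sndbuf is hypothetical and not part of Samba. How the reported value relates to the largest datagram that sendto() will accept is platform dependent, so treat the result as an indication only.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a Samba function): report the SO_SNDBUF
 * value the kernel assigns to a freshly created AF_UNIX datagram
 * socket. Returns the size in bytes, or -1 on error (errno is set).
 */
static int unix_dgram_default_sndbuf(void)
{
	int bufsize = 0;
	socklen_t optlen = sizeof(bufsize);
	int fd = socket(AF_UNIX, SOCK_DGRAM, 0);

	if (fd == -1) {
		return -1;
	}
	if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bufsize, &optlen) == -1) {
		close(fd);
		return -1;
	}
	/* SO_SNDBUF is defined to be an int-sized option. */
	assert(optlen == sizeof(bufsize));
	close(fd);
	return bufsize;
}
```

Comparing the returned value on FreeBSD and Linux should show the difference in defaults described above.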

I see two possible fixes: the first is to statically configure the SO_SNDBUF for the largest possible packet size. The second would be to set SO_SNDBUF based on the actual packet size upon the return of EMSGSIZE. The latter seems safer to me, given that it may not be reasonable to predetermine the largest possible packet size, but I'm not familiar with the code base.

I've attached a patch that implements the latter solution, but I am content to implement the former, as well.

Lastly, this particular DsReplicaGetInfo response is also larger than the MTU of most network transports, and by default neither Linux nor FreeBSD will fragment 'atomically' sent messages. I do not know the code base or protocols well enough to know whether this is a problem outside of this particular UNIX domain socket failure case, but -- in theory -- this bug could occur elsewhere. I also don't know enough about DCE/RPC to know whether it would make sense to use framed stream sockets rather than relying on atomic sends.

As a work-around on FreeBSD, one can also set a higher system-wide default, e.g.: sysctl -w net.local.dgram.maxdgram=4096
Comment 1 Andrew Bartlett 2013-03-03 09:41:31 UTC
Created attachment 8604
required patch from master to cherry-pick the fix
Comment 2 Andrew Bartlett 2013-03-03 09:41:59 UTC
Created attachment 8605
patches cherry-picked from master
Comment 3 Andrew Bartlett 2013-03-03 09:42:55 UTC
Metze,

Does tsocket need a similar fix?
Comment 4 Stefan Metzmacher 2013-03-03 16:48:30 UTC
Comment on attachment 8605
patches cherry-picked from master

I think we should return the sendto() errno (EMSGSIZE) if setsockopt()
fails.
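
One way to read this suggestion: if the attempt to enlarge the buffer fails, the caller should still see the original sendto() error rather than whatever setsockopt() reported. A minimal sketch, assuming a hypothetical helper named grow_sndbuf_keep_emsgsize that is called only after sendto() has already failed with EMSGSIZE (this is not the code from the attachment):

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not a Samba function): try to raise SO_SNDBUF
 * to 'needed' bytes. Returns 0 on success. On any failure it returns
 * -1 with errno forced to EMSGSIZE, so the caller reports the error
 * of the original sendto() rather than that of setsockopt().
 */
static int grow_sndbuf_keep_emsgsize(int fd, size_t needed)
{
	int bufsize;

	assert(needed > 0);

	if (needed > INT_MAX) {
		/* SO_SNDBUF takes an int; the request cannot be expressed. */
		errno = EMSGSIZE;
		return -1;
	}
	bufsize = (int)needed;

	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF,
		       &bufsize, sizeof(bufsize)) == -1) {
		/* Hide the setsockopt() errno (e.g. EBADF, ENOBUFS). */
		errno = EMSGSIZE;
		return -1;
	}
	return 0;
}
```

With this shape, a caller that gets -1 can return immediately and errno still describes the send failure; only on 0 does it retry sendto().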
Comment 5 Stefan Metzmacher 2013-03-03 16:49:26 UTC
(In reply to comment #3)
> Metze,
> 
> Does tsocket need a similar fix?

I guess so
Comment 6 Andrew Bartlett 2013-03-04 05:27:25 UTC
Created attachment 8606
patches for master to address metze's concerns and apply to tsocket_bsd

Metze,

I don't have a good way to test this (any ideas?), but this seems to be what is needed for both issues.
Comment 7 Andrew Bartlett 2013-03-04 23:22:22 UTC
Created attachment 8607
patches cherry-picked from master (2nd set)

This patch is attachment 8606 with the cherry-pick markers from master. Apply after the other patches.
Comment 8 Stefan Metzmacher 2013-03-05 08:14:00 UTC
Comment on attachment 8604
required patch from master to cherry-pick the fix

Looks good
Comment 9 Stefan Metzmacher 2013-03-05 08:14:27 UTC
Comment on attachment 8605
patches cherry-picked from master

Looks ok
Comment 10 Stefan Metzmacher 2013-03-05 08:14:47 UTC
Comment on attachment 8607
patches cherry-picked from master (2nd set)

Looks good
Comment 11 Karolin Seeger 2013-03-06 09:23:19 UTC
Pushed to autobuild-v4-0-test.
Comment 12 Karolin Seeger 2013-03-06 11:22:36 UTC
Pushed to autobuild-v4-0-test.
Closing out bug report.

Thanks!