Bug 6999 - getpeername failed. Error was Transport endpoint - Port 139/445 ping-pong problem
Status: NEW
Alias: None
Product: Samba 3.4
Classification: Unclassified
Component: File services
Version: 3.4.2
Hardware: x86 All
Importance: P3 major
Target Milestone: ---
Assignee: Volker Lendecke
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-12-17 23:41 UTC by Jim Schatzman
Modified: 2009-12-29 16:50 UTC

See Also:


Description Jim Schatzman 2009-12-17 23:41:26 UTC
If Samba is configured with "smb ports = 139,445" and a Windows XP client is configured to use TCP-only SMB (instead of NetBIOS), the connection fails erratically. XP intermittently detects a connection fault, and Windows Explorer then freezes for several minutes, for example if you have mapped a network drive to the Samba server.

Numerous Internet tips suggest limiting Samba to port 139 only. However, that will not work if your Windows box is configured to use TCP instead of NetBIOS. You can fix the problem for these systems by configuring Samba to use only port 445, but this will then fail for Windows systems that need NetBIOS.
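
When switching between the single-port workarounds above, it can help to confirm which ports the server actually accepts connections on. The following is a minimal sketch using only the Python standard library; the function name probe_smb_ports and the default timeout are illustrative assumptions, not part of Samba.

```python
import socket


def probe_smb_ports(host, ports=(139, 445), timeout=2.0):
    """Return {port: bool}, True if a plain TCP connect to host:port succeeds.

    Hypothetical helper: it only checks that something is listening,
    not that the listener actually speaks SMB.
    """
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError on refusal or timeout
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results
```

Calling probe_smb_ports("fileserver") (where "fileserver" is a placeholder host name) after each change to "smb ports" should show whether 139, 445, or both are reachable from the client's network.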

The problem seems to be something like this:

1) Samba ping-pongs (why??) between connections on ports 139 and 445.

2) Windows deals with both ports, but when it gets a connection on port 445, it terminates any open connections on port 139 with prejudice. Samba then logs any of a variety of error messages, including:

"getpeername failed. Error was Transport endpoint is not connected"

"Error writing XXX bytes to client. -1. (Transport endpoint is not connected)"

"write_data: write failure in writing to client 0.0.0.0. Error Broken pipe"

"close_directory: Could not get share mode lock for ."

"close_remove_share_mode: Could not get share mode lock for file XXXX"

Some folks have reported merely that this garbage fills up the system log. The problem is worse than that: on Windows XP, the networking subsystem locks up for an extended period of time.


This may be at heart a Windows XP bug. XP apparently tries both ports and "picks" whichever is faster, even if it is configured for TCP only. Apparently, XP can renegotiate this decision at any time, and it seems to be when XP changes its mind about which port to use that things go wrong. Samba does not handle this gracefully at all. I suspect the problem would go away if, once a connection is established on one port (say 445), Samba simply ignored connections from the same IP on the other port (139), and vice versa. It might also work to treat the sequence of packets over the two ports as if they were on one virtual port. If these approaches are too draconian, maybe somebody else can come up with a better solution.
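
The first suggestion (pin each client IP to whichever port it connected on first) can be sketched as a small policy object. This is only an illustration of the idea in Python, not Samba code: the class name PortPinner, the idle timeout, and the injectable clock are all assumptions added to make the sketch self-contained and testable.

```python
import time


class PortPinner:
    """Hypothetical policy: accept a client only on the port it used first."""

    def __init__(self, idle_timeout=300.0, clock=time.monotonic):
        self._pinned = {}  # client_ip -> (port, last_seen)
        self._idle_timeout = idle_timeout  # assumed expiry, not from the report
        self._clock = clock

    def should_accept(self, client_ip, port):
        now = self._clock()
        entry = self._pinned.get(client_ip)
        # Forget a pin that has been idle for too long
        if entry is not None and now - entry[1] > self._idle_timeout:
            entry = None
        if entry is None:
            # First connection from this IP: pin it to this port
            self._pinned[client_ip] = (port, now)
            return True
        if port == entry[0]:
            # Same port as before: refresh the pin
            self._pinned[client_ip] = (port, now)
            return True
        # Connection attempt on the other port: ignore it
        return False

    def release(self, client_ip):
        """Drop the pin, for example when the client's session ends."""
        self._pinned.pop(client_ip, None)
```

With this policy, a client first seen on port 445 would have later attempts on port 139 refused until its pin is released or expires. Whether ignoring the second port actually keeps XP happy is untested here.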


To reproduce the problem:


1) Configure Samba with smb ports = 139,445

2) Configure a Windows box with TCP-only SMB (no NetBIOS!). Map a network drive pointing to the Linux Samba installation.

3) Exercise certain functionality, such as reading large directory trees over and over. Look for error messages in the Samba log file.
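
For step 3, a short script can count the symptom messages instead of reading the log by eye. This is a rough sketch: the patterns are taken from the messages quoted above, the "XXX" byte count is assumed to be a decimal number, and count_symptoms is a hypothetical helper name.

```python
import re
from collections import Counter

# Patterns built from the error messages quoted in this report
PATTERNS = {
    "getpeername": re.compile(
        r"getpeername failed\. Error was Transport endpoint is not connected"),
    "write_error": re.compile(r"Error writing \d+ bytes to client"),
    "write_data": re.compile(r"write_data: write failure in writing to client"),
    "close_directory": re.compile(
        r"close_directory: Could not get share mode lock"),
    "close_remove_share_mode": re.compile(
        r"close_remove_share_mode: Could not get share mode lock"),
}


def count_symptoms(lines):
    """Count how many lines match each known symptom message."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

To use it, pass an open log file, for example count_symptoms(open(path, errors="replace")), where path is wherever your smbd log lives (the location varies by installation).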
Comment 1 Jim Schatzman 2009-12-17 23:47:18 UTC
Sorry, that may have been confusing. I have heard this problem diagnosed as resulting from Windows terminating connections with prejudice. However, I think it more likely that XP is "simply" trying to shift from one port to the other midstream, and Samba doesn't understand this.