3387 – NTBACKUP - Can't backup to SAMBA Share files bigger than 4.29Gig

Summary: NTBACKUP - Can't backup to SAMBA Share files bigger than 4.29Gig

Status:	RESOLVED INVALID

Alias:	None

Product:	Samba 3.0
Classification:	Unclassified
Component:	nmbd
Version:	3.0.21a
Hardware:	x86 Linux

Importance:	P3 major
Target Milestone:	none
Assignee:	Gerald (Jerry) Carter (dead mail address)
QA Contact:	Samba QA Contact

URL:
Keywords:

Depends on:
Blocks:

Reported:	2006-01-07 20:34 UTC by Jonathan Marchand
Modified:	2006-01-17 21:03 UTC
CC List:	1 user

See Also:

Attachments
smb.conf (4.52 KB, application/octet-stream) 2006-01-12 15:58 UTC, Danny Ybarra

Description Jonathan Marchand 2006-01-07 20:34:07 UTC

I have a very strange problem with NTBACKUP writing to a SAMBA share.

NTBACKUP won't write files bigger than 4.29GB to a SAMBA share.

Here are the specs of the systems:

SAMBA Server:

Slackware 10.1
SAMBA 3.0.21a (large file support turned on)
Kernel 2.6.8

Windows Server:

Windows 2003 Server
NTBACKUP (latest version)
Data to backup is about 50GB

Description of the problem:

Copying large files (e.g. 50GB) directly from the Windows 2003 server to the SAMBA share works fine. However, it doesn't work with NTBACKUP. As soon as NTBACKUP hits the 4.29GB mark when archiving to the SAMBA server, it fails with a filesystem I/O error.

NTBACKUP works fine when the destination folder is another Windows server.

Actually, this used to work fine. Backups were made using NTBACKUP to the SAMBA server without any problems. I can't think of anything that changed on the Windows server that would have caused this to stop working.

--------

When NTBACKUP fails:

Error: The device reported an error on a request to write data to media.
Error reported: Invalid command.
There may be a hardware or media problem.
Please check the system event log for relevant failures.
The operation was ended.
Backup completed on 4/01/2006 at 10:00 AM.
Directories: 2391
Files: 7761
Bytes: 4,287,133,922
Time: 19 minutes and 17 seconds

Error: C: is not a valid drive, or you do not have access.
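
For reference, the byte count in the log above sits just under 2^32 bytes (4 GiB), the largest value a 32-bit unsigned offset can hold. The short check below only does that arithmetic. It is an illustration, not a diagnosis: the "Bytes:" figure is the amount of data NTBACKUP reported, which is assumed here to be close to the size of the backup file, and nothing in it says which layer imposes the limit.

```python
# Compare the byte count NTBACKUP reported with the 32-bit offset limit.
LIMIT_32BIT = 2 ** 32             # 4,294,967,296 bytes (4 GiB)
reported_bytes = 4_287_133_922    # "Bytes:" value copied from the backup log

# How far below the 32-bit boundary the backup stopped.
shortfall = LIMIT_32BIT - reported_bytes

print(f"32-bit limit : {LIMIT_32BIT:,} bytes ({LIMIT_32BIT / 1e9:.2f} GB)")
print(f"reported     : {reported_bytes:,} bytes ({reported_bytes / 1e9:.2f} GB)")
print(f"shortfall    : {shortfall:,} bytes ({shortfall / 2**20:.1f} MiB)")
```

Both values round to 4.29 GB, which matches the size quoted in the summary.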

I had a look at the mailing lists and saw that quite a few people had issues with NTBACKUP, but from what I read all of those bugs were patched in 3.0.5 or so.

Let me know if you need further details!

Thanks.

Comment 1 Jonathan Marchand 2006-01-08 16:37:09 UTC

Further to that, the filesystem used on the Samba server is ReiserFS.

Comment 2 Danny Ybarra 2006-01-11 16:42:52 UTC

I am having this same problem at all of our sites.  We are using the latest release of Samba, 3.0.21a.  If there is any debugging information someone needs, let me know.

Comment 3 Gerald (Jerry) Carter (dead mail address) 2006-01-12 11:18:50 UTC

I'm running now against 3.0.21 and don't seem to 
have any problems.  What are your settings for

* shadow copy on the nt backup client
* bin/smbd -b | egrep '(LARGE|ACL)'

I'm running a backup now from WinXP sp2.  And so far the backup 
file is at 13Gb.

Comment 4 Danny Ybarra 2006-01-12 13:55:40 UTC

(In reply to comment #3)
> I'm running now against 3.0.21 and don't seem to 
> have any problems.  What are your settings for
> 
> * shadow copy on the nt backup client
> * bin/smbd -b | egrep '(LARGE|ACL)'
> 
> I'm running a backup now from WinXP sp2.  And so far the backup 
> file is at 13Gb.
> 

This one is weird because it only started happening when we switched from Red Hat Enterprise Linux 3 to SLES9 using the latest version of Samba.

Here is the output of the smbd command:
fileserver1:/ # smbd -b | egrep '(LARGE|ACL)'
   HAVE_SYS_ACL_H
   HAVE_EXPLICIT_LARGEFILE_SUPPORT
   HAVE_POSIX_ACLS
   _LARGEFILE64_SOURCE
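
The same check can be scripted when several servers need comparing. The sketch below works on the text printed by `smbd -b` rather than running the command itself, and the sample is the output pasted above. The two names in `WANTED` are simply the large-file entries seen in this comment, not an authoritative list of what Samba requires, and `parse_build_options` is a hypothetical helper.

```python
from __future__ import annotations


def parse_build_options(output: str) -> set[str]:
    """Return the option names found in the text, one per non-empty line."""
    return {line.strip() for line in output.splitlines() if line.strip()}


# Output of `smbd -b | egrep '(LARGE|ACL)'` as pasted in this comment.
SAMPLE = """\
   HAVE_SYS_ACL_H
   HAVE_EXPLICIT_LARGEFILE_SUPPORT
   HAVE_POSIX_ACLS
   _LARGEFILE64_SOURCE
"""

# Assumption: these are the large-file entries we expect to see.
WANTED = {"HAVE_EXPLICIT_LARGEFILE_SUPPORT", "_LARGEFILE64_SOURCE"}

options = parse_build_options(SAMPLE)
missing = WANTED - options
print("missing:", sorted(missing) if missing else "none")
```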

Here is another excerpt from /var/log/messages:
Jan 12 13:48:15 fileserver1 smbd[20462]: [2006/01/12 13:48:15, 0] lib/util_sock.c:read_data(526)
Jan 12 13:48:15 fileserver1 smbd[11216]: [2006/01/12 13:48:15, 0] lib/util_sock.c:get_peer_addr(1222)
Jan 12 13:48:15 fileserver1 smbd[11216]:   getpeername failed. Error was transport endpoint is not connected
Jan 12 13:48:15 fileserver1 smbd[20464]: [2006/01/12 13:48:15, 0] lib/util_sock.c:get_peer_addr(1222)
Jan 12 13:48:15 fileserver1 smbd[20464]:   getpeername failed. Error was Transport endpoint is not connected
Jan 12 13:48:15 fileserver1 smbd[20464]: [2006/01/12 13:48:15, 0] lib/util_sock.c:read_data(526)
Jan 12 13:48:15 fileserver1 smbd[20464]:   read_data: read failure for 4 bytes to client 0.0.0.0. Error = Connection reset by peer
Jan 12 13:48:15 fileserver1 smbd[20462]:   read_data: read failure for 4 bytes to client 0.0.0.0. Error = Connection reset by peer
Jan 12 13:51:04 fileserver1 smbd[20476]: [2006/01/12 13:51:04, 0] smbd/service.c:set_current_service(49)
Jan 12 13:51:04 fileserver1 smbd[20476]:   chdir (/homes/Shared) failed
Jan 12 13:51:11 fileserver1 smbd[20476]: [2006/01/12 13:51:11, 0] smbd/service.c:set_current_service(49)
Jan 12 13:51:11 fileserver1 smbd[20476]:   chdir (/homes/Shared) failed
Jan 12 13:51:11 fileserver1 smbd[20476]: [2006/01/12 13:51:11, 0] smbd/service.c:set_current_service(49)
Jan 12 13:51:11 fileserver1 smbd[20476]:   chdir (/homes/Shared) failed
Jan 12 13:51:16 fileserver1 smbd[20477]: [2006/01/12 13:51:16, 0] smbd/service.c:set_current_service(49)
Jan 12 13:51:16 fileserver1 smbd[20477]:   chdir (/homes/Shared) failed
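
When the log is long, it can help to tally which smbd source locations are reporting errors. The sketch below counts the header lines of the form `file.c:function(line)` and uses a few lines copied from the excerpt above as input. The regular expression is an assumption based on the layout of these particular lines and may need adjusting for other log formats.

```python
import re
from collections import Counter

# Matches smbd debug headers such as:
#   smbd[20462]: [2006/01/12 13:48:15, 0] lib/util_sock.c:read_data(526)
HEADER = re.compile(r"smbd\[\d+\]: \[[^\]]+\] (\S+):(\w+)\((\d+)\)\s*$")

SAMPLE_LOG = """\
Jan 12 13:48:15 fileserver1 smbd[20462]: [2006/01/12 13:48:15, 0] lib/util_sock.c:read_data(526)
Jan 12 13:48:15 fileserver1 smbd[11216]: [2006/01/12 13:48:15, 0] lib/util_sock.c:get_peer_addr(1222)
Jan 12 13:48:15 fileserver1 smbd[11216]:   getpeername failed. Error was transport endpoint is not connected
Jan 12 13:51:04 fileserver1 smbd[20476]: [2006/01/12 13:51:04, 0] smbd/service.c:set_current_service(49)
Jan 12 13:51:04 fileserver1 smbd[20476]:   chdir (/homes/Shared) failed
Jan 12 13:51:11 fileserver1 smbd[20476]: [2006/01/12 13:51:11, 0] smbd/service.c:set_current_service(49)
"""

counts = Counter()
for line in SAMPLE_LOG.splitlines():
    match = HEADER.search(line)
    if match:
        source_file, function, _lineno = match.groups()
        counts[f"{source_file}:{function}"] += 1

for location, n in counts.most_common():
    print(f"{n:3d}  {location}")
```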

Comment 5 Gerald (Jerry) Carter (dead mail address) 2006-01-12 15:20:50 UTC

I'm using SUSE 10.0.  Ext3 fs.  What file system are you using?
And SUSE distro?

Comment 6 Danny Ybarra 2006-01-12 15:37:57 UTC

(In reply to comment #5)
> I'm using SUSE 10.0.  Ext3 fs.  What file system are you using?
> And SUSE distro?
> 

We are also using ext3, on SLES9 with all current patches.  The server is a Dell with 2 processors and 2 GB of RAM, running RAID 5 on a PERC3 controller with a Dell 220S storage array.

Comment 7 Gerald (Jerry) Carter (dead mail address) 2006-01-12 15:49:35 UTC

Danny, what OS are you running?  If Linux, what kernel/distro/file system?

I can retest against SLES9 later (need to add another disk to the 
VM session)

Comment 8 Danny Ybarra 2006-01-12 15:58:38 UTC

Created attachment 1677
smb.conf

Comment 9 Danny Ybarra 2006-01-12 16:01:59 UTC

I am running kernel version 2.6.5-7.244-default.  I also uploaded my smb.conf so you can see my parameters.

Comment 10 Danny Ybarra 2006-01-13 08:59:18 UTC

I just did another backup from another XP workstation and it too will not go over 4.7 GB.  Here are the errors I got from /var/log/messages.

Jan 13 09:47:52 fileserver1 smbd[23802]: [2006/01/13 09:47:52, 0] lib/util_sock.c:write_data(554)
Jan 13 09:47:52 fileserver1 smbd[23802]:   write_data: write failure in writing to client 129.162.180.222. Error Connection reset by peer
Jan 13 09:47:52 fileserver1 smbd[23802]: [2006/01/13 09:47:52, 0] lib/util_sock.c:send_smb(762)
Jan 13 09:47:52 fileserver1 smbd[23802]:   Error writing 51 bytes to client. -1. (Connection reset by peer)

Comment 11 Danny Ybarra 2006-01-17 08:33:06 UTC

I have an update.  It seems this may not be a Samba problem.  I tried writing a 20 GB file to a share on the local disk of the SLES9 server and it worked.  It seems that this problem shows up when writing to the Dell 220S SCSI storage array using a Dell PERC3 controller card.  I think this is a problem with the megaraid driver in SLES9.  I was successfully able to write a 20 GB file to other SLES9 servers that weren't using the megaraid driver.
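
A test of this kind can be scripted so it is easy to repeat against each storage target without Samba or NTBACKUP in the path. The sketch below is one possible way to do it, not the procedure used in this comment. `write_test_file` is a hypothetical helper, and the demo writes only 3 MiB to a temporary directory; to reproduce the test described here, point it at a directory on the storage under test and raise the size well past 4 GiB.

```python
import os
import tempfile


def write_test_file(path: str, total_bytes: int, chunk_size: int = 1 << 20) -> int:
    """Write total_bytes of zero bytes to path and return the resulting file size."""
    chunk = bytes(chunk_size)
    remaining = total_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(chunk[:n])
            remaining -= n
        # Push the data to the device so that storage errors surface here.
        f.flush()
        os.fsync(f.fileno())
    return os.path.getsize(path)


# Small demo run; replace the directory and size for a real test.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "bigfile.bin")
    size = write_test_file(target, 3 * (1 << 20))
    print(f"wrote {size:,} bytes to {target}")
```

If the underlying storage rejects a write, `open`, `write` or `fsync` raises `OSError`, which separates a storage or driver failure from an SMB-level one.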

Comment 12 Gerald (Jerry) Carter (dead mail address) 2006-01-17 21:03:24 UTC

Danny, thanks for the update.  I'm closing this one out now. 
Sounds like you might need to follow up with SUSE support.