Bug 102 - smbtar can't handle files >=8GB properly
Summary: smbtar can't handle files >=8GB properly
Alias: None
Product: Samba 2.2
Classification: Unclassified
Component: smbclient
Version: 2.2.8a
Hardware: All Linux
Importance: P2 major
Target Milestone: ---
Assignee: Gerald (Jerry) Carter (dead mail address)
QA Contact:
Duplicates: 104
Depends on:
Reported: 2003-05-21 00:16 UTC by Dragan Krnic
Modified: 2005-11-14 09:27 UTC (History)

See Also:

Fixes the bug for files bigger than 8 GB (740 bytes, patch)
2003-07-16 02:40 UTC, Dragan Krnic

Description Dragan Krnic 2003-05-21 00:16:50 UTC
smbclient's built-in tar feature lists the size modulo 8 GB. It effectively 
truncates the size from left if it is >=8GB. For example, a  file 
14,570,219,438 bytes big (octal 154434763656) gets listed as
 5,980,284,846 bytes big (octal 54434763656).
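
The wraparound can be reproduced with a small sketch. It assumes the size
field keeps only 11 octal digits (33 bits), so everything above bit 32 is
lost. The helper name truncate_to_octal11 is hypothetical and is not part of
Samba.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not Samba code.
 * Assumption: the header field holds 11 octal digits, i.e. 33 bits,
 * so only the low 33 bits of the real size survive.
 */
static uint64_t truncate_to_octal11(uint64_t size)
{
    return size & UINT64_C(077777777777);
}
```

Passing 14570219438 returns 5980284846, which is the size listed above.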

The good news is that it puts all 14+ GB on the tape, but tar can only
recover the roughly 6 GB listed, and complains about "obsolescent base-64 

As of GNU tar 1.13.25 (on my SuSE 8.2), the correct way to record the size of
a file that needs more than 11 octal digits is to put the binary long
0x80000000 in the first 4 bytes of the field and the size as a binary long
long in the remaining 8 bytes, trailing '\0' be damned. Although I didn't
find any documentation to that effect, this is how it works.
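
Seen from the reading side, the layout described above could be interpreted
roughly as follows. This is only a sketch under the assumptions stated in
this report (marker byte 0x80 first, size in the last 8 bytes, most
significant byte first). The function name parse_tar_size is hypothetical,
and this is not GNU tar's actual parser.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reader for a 12-byte tar size field.
 * Assumption: if the top bit of the first byte is set, the last 8 bytes
 * hold the size as a big-endian binary number; otherwise the field holds
 * octal digits, optionally preceded by spaces.
 */
static uint64_t parse_tar_size(const unsigned char field[12])
{
    uint64_t value = 0;
    size_t i;

    if (field[0] & 0x80) {
        /* Binary form: bytes 4..11, most significant byte first. */
        for (i = 4; i < 12; i++)
            value = (value << 8) | field[i];
        return value;
    }

    /* Classic form: skip leading spaces, then read octal digits. */
    i = 0;
    while (i < 12 && field[i] == ' ')
        i++;
    for (; i < 12 && field[i] >= '0' && field[i] <= '7'; i++)
        value = (value << 3) | (uint64_t)(field[i] - '0');
    return value;
}
```

A field holding the octal text 54434763656 parses to 5980284846, while a
field starting with 0x80 can carry the full 64-bit size.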
Comment 1 Tim Potter 2003-05-22 18:32:43 UTC
*** Bug 104 has been marked as a duplicate of this bug. ***
Comment 2 Dragan Krnic 2003-07-15 16:06:29 UTC
I suddenly had a little free time to fix this bug. As an example, here is the
ll of a file bigger than 8 GB:

   pmn92:/local/sub/stars # ll bigger.than.8g
   -rw-r--r--    2 root     root     9483026432 Jul  9 18:58 bigger.than.8g

If I use smbclient to make a tape archive, the size in the tar header will
indicate the size modulo 8 GB. When restoring the file from the tape, tar
only extracts that many bytes and then complains that the checksum of the
next file is wrong.

   # tar tvbf 64 /dev/nst0
   -rw-r--r-- 0/0       893091840 2003-07-09 18:58:30 ./bigger.than.8g

With the changes stated below, the tar can correctly extract the full size of
the file:

   # tar tvbf 64 /dev/nst0 
   -rw-r--r-- 0/0      9483026432 2003-07-09 18:58:30 ./bigger.than.8g

The file source/client/clitar.c has to be changed as follows:

   --- samba-2.2.8/source/client/clitar.c  2003-03-14 21:34:47.000000000 +0100 
   +++ samba-2.2.8/source/client/clitar.c  2003-07-15 21:33:31.000000000 +0100
   @@ -206,10 +206,16
      /* write out a "standard" tar format header */
      safe_strcpy(hb.dbuf.mode, amode, strlen(amode));
      oct_it((SMB_BIG_UINT)0, 8, hb.dbuf.uid);
      oct_it((SMB_BIG_UINT)0, 8, hb.dbuf.gid);
      oct_it((SMB_BIG_UINT) size, 13, hb.dbuf.size);
   +  if ( size > (SMB_BIG_UINT)077777777777LL )
   +  {    memset ( hb.dbuf.size, 0, 4 );
   +       hb.dbuf.size[0]=128;
   +   for ( i = 8, jp=(char*)&size; i; i-- )
   +      hb.dbuf.size[i+3] = *(jp++);
   +  }
      oct_it((SMB_BIG_UINT) mtime, 13, hb.dbuf.mtime);
      memcpy(hb.dbuf.chksum, "        ", sizeof(hb.dbuf.chksum));
      memset(hb.dbuf.linkname, 0, NAMSIZ);

If the file size does not fit in 11 octal digits, the size field of the
header struct, hb.dbuf.size, has to be filled with the binary value
0x80000000L in the first 4 bytes and with the network-ordered content of the
8-byte SMB_BIG_UINT in the following 8 bytes. The variables i and jp are
local variables from the original code.
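
One caveat: the loop in the patch copies the bytes of size from memory in
reverse, which yields network order only on a little-endian host. A minimal
sketch of a host-independent encoder is shown here. The function name
encode_tar_size_binary and the bare 12-byte buffer are assumptions for
illustration, not the code that was checked in.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical encoder for a 12-byte tar size field; not the Samba code.
 * Layout assumed from this report: 0x80 0x00 0x00 0x00, then the 64-bit
 * size with the most significant byte first.
 */
static void encode_tar_size_binary(uint64_t size, unsigned char field[12])
{
    int i;

    memset(field, 0, 12);
    field[0] = 0x80;    /* marker: a binary size follows */

    for (i = 0; i < 8; i++) {
        /* Shifting extracts bytes by value, independent of host byte order. */
        field[4 + i] = (unsigned char)(size >> (8 * (7 - i)));
    }
}
```

For the 9483026432-byte example file this fills the last five bytes with
02 35 3B 80 00 and leaves the bytes between the marker and the value zero.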
Comment 3 Tim Potter 2003-07-15 17:14:45 UTC
Checked in.  Thanks!

Dragan, would you mind adding patches as attachments in the future?  Click on
'Create a New Attachment' and upload the patch directly into bugzilla.

This lowers the chance of the patch becoming mangled through cutting and pasting.
Comment 4 Dragan Krnic 2003-07-16 02:40:16 UTC
Created attachment 52
Fixes the bug for files bigger than 8 GB

That is a good idea. A working patch is attached as file clitar.diff. 
The patch look-alike from my previous posting was a proof of concept.
The patch program wouldn't accept it until I trimmed the top 2 lines, and then
it says

   Hunk #1 succeeded at 208 with fuzz 2.

when I patch it in manually with --verbose, but it says nothing in rpm -bb.

I still don't understand why the 1st proposal gets rejected. Is it
the blank line or the quotes in the comment?
Comment 5 Dragan Krnic 2003-07-16 03:52:53 UTC
Hallo Tim and Jerry,

I was actually up to something bigger. There are 2 more things I would like
to improve in smbclient tarmode:

1. Pack all the relevant ACLs as they are on the source PC

   There are actually 354 unused bytes in the header. They can accommodate
   the ACLs of most files on a samba client. Things may be more complicated
   on Win servers, but for them we can overflow the ACLs into a ././@LongLink
   pre-extension. A ././@LongLink is a file with its own size set to the
   length of the long file name including the null-byte termination. But it
   may be set to any other size, the null-termination will delimit the name
   and the rest can be used for additional ACLs which couldn't fit in the
   empty space of the normal header.

   In order to do that, smbclient should have a subcommand to list ACLs.
   Can you set me on the right track, as to how to implement it?

2. The tarmode enum of <full|inc|reset|noreset> should be extended by a
   provision for tar's option "--listed-incremental=<snapshot.file>".

   This is one thing, which is really necessary, because modification
   date as in smbclient's option "-N <timestamp.file>" only takes care
   of files whose mtime has changed. It doesn't dump newly installed
   files with older modification dates, which practically means that no
   newly installed software will be dumped until the next full backup
   and neither will older files copied from other servers.

   Using the archive bits to indicate which files are to be dumped does
   not seem to work as expected either. On the one hand it isn't easy to set
   all the archive bits before a full dump, on the other the archive bit
   does not seem to be set automatically.

As I raved in a samba posting, these 2 features should go a long way towards
a reliable ez-2-use backup within samba.
Comment 6 Tim Potter 2003-07-16 17:39:03 UTC
Dude, you can access the ACL for a file or directory by using the
cli_query_secdesc() function in source/libsmb/clisecdesc.c  You'll have to write
your own smbclient command to call it though as it doesn't exist as yet.

The other trick is to take the security descriptor in a SEC_DESC structure and
turn it into a stream of bytes that you can store in a tar header, or
somewhere else.  For an example of how this is done look in
testsuite/printing/psec.c where it is done in the psec_setsec() function. 
Basically we do a prs_init(), then sec_io_desc_buf() and then store the contents
of the prs buffer.

Isn't there some weird extension to tar for storing extended attributes as
files?  I think Solaris uses something like it to store POSIX ACLs in tar
files.  The trick is to create two entries for a file in the tar archive.  The
first contains the POSIX ACLs and the second the file data.  If you are using a
normal tar then the first file will be overwritten with the second.  If your tar
command understands ACLs then it will store the contents of the first file as
the ACLs and the contents of the second as the file data.  Maybe something like
this could be used for storing Windows ACLs as well.
Comment 7 Tim Potter 2003-07-16 17:42:12 UTC
It should be possible to set/reset archive bits on a Samba server using chmod -R
to adjust one of the executable bits for the file.  I can't remember which one
corresponds to the archive bit.

The ATTRIB.EXE utility should be able to achieve the same effect from Windows.
Comment 8 Gerald (Jerry) Carter (dead mail address) 2005-11-14 09:27:48 UTC
database cleanup