Bug 7854 - Abysmal sparse file performance
Status: RESOLVED DUPLICATE of bug 5801
Alias: None
Product: rsync
Classification: Unclassified
Component: core
Version: 3.0.9
Hardware: x86 FreeBSD
Importance: P3 major
Target Milestone: ---
Assignee: Wayne Davison
QA Contact: Rsync QA Contact
Reported: 2010-12-07 21:09 UTC by grarpamp
Modified: 2016-01-14 10:42 UTC

Description grarpamp 2010-12-07 21:09:47 UTC
I have a 5.5GB file, mostly sparse. Tar performs far[!] better than rsync.
I have no ideas yet as to the cause, so this is just an FYI on the current state.
System: FreeBSD 8.1 i386, zfs.
Yes, I know the blocks used differ between the copies, but I don't know why yet; it could
just be how zfs does things, or be related to the large amount of sparseness.
There are no media errors, CPU/IO load or anything like that, and the source
and dest paths are on the same filesystem.
I've not tested times for files that are, say, 90% full instead of 90% sparse,
though a 50% sparse 35MB file was 8.5x slower with rsync than with tar and had
identical block counts and sha256 with both.

/usr/bin/time rsync -HaxiS ./a ../
>f+++++++++ a
      271.13 real       101.44 user        95.62 sys
ls -sl ./a ../a ; rm -f ../a
blocks bytes
3625 5535932416 ./a
3769 5535932416 ../a

/usr/bin/time tar -cf - ./a | /usr/bin/time tar -C .. -Sxf -
       57.67 real         1.10 user        27.77 sys
       57.67 real        10.68 user         5.87 sys
ls -sl ./a ../a ; rm -f ../a
blocks bytes
3625 5535932416 ./a
2977 5535932416 ../a
Comment 1 grarpamp 2012-02-29 07:08:41 UTC
Bump and request for 1.25 year review.

Sparse files can be created with dd.
Sparseness in reasonably random locations and densities
can be created with partial bittorrent downloads.
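
For anyone who wants a test file, here is a minimal sketch of the dd approach. The helper name make_sparse, the file name and the 10 MiB size are only illustrative assumptions; it writes a single byte at the last offset, so everything before it should be a hole on filesystems that support holes.

```sh
#!/bin/sh
# make_sparse PATH SIZE_IN_BYTES
# Hypothetical helper: create a file of the given apparent size by writing
# one zero byte at the final offset. The skipped range is left unallocated
# on filesystems that support sparse files.
make_sparse() {
    path=$1
    size=$2
    # bs=1 with seek=size-1 positions the single write at the last byte.
    dd if=/dev/zero of="$path" bs=1 count=1 seek=$((size - 1)) 2>/dev/null
}

make_sparse ./sparse-test.img 10485760   # 10 MiB apparent size
ls -sl ./sparse-test.img                 # first column: allocated blocks
```

The same file can then be fed to the rsync and tar commands from the description to compare timings.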
Comment 2 grarpamp 2012-03-10 22:52:51 UTC
Maybe related:
https://bugzilla.samba.org/show_bug.cgi?id=5801
Comment 3 Björn Jacke 2016-01-14 10:42:34 UTC
Yes, the speed penalty is there because the data needs to be analyzed on the receiver side. Only a protocol extension for unallocated sparse regions can solve this.

*** This bug has been marked as a duplicate of bug 5801 ***
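
To illustrate what analyzing the data on the receiving side means, here is a rough sketch and not rsync's actual code. The function name sparse_copy and the default block size are assumptions. Every block still has to be read and checked for non-zero bytes before the copy can decide to skip it, which is where the extra time goes.

```sh
#!/bin/sh
# sparse_copy SRC DST [BLOCKSIZE]
# Hypothetical helper, not rsync's implementation: copy SRC to DST block by
# block, skipping any block that holds only zero bytes so it can stay a hole.
sparse_copy() {
    src=$1
    dst=$2
    bs=${3:-4096}
    size=$(wc -c < "$src" | tr -d ' ')
    nblocks=$(( (size + bs - 1) / bs ))

    : > "$dst"    # start from an empty destination
    i=0
    while [ "$i" -lt "$nblocks" ]; do
        # The analysis step: count the non-zero bytes in this block.
        nonzero=$(dd if="$src" bs="$bs" skip="$i" count=1 2>/dev/null |
                  tr -d '\000' | wc -c | tr -d ' ')
        if [ "$nonzero" -gt 0 ]; then
            # Data block: write it at the same offset in the destination.
            dd if="$src" of="$dst" bs="$bs" skip="$i" seek="$i" count=1 \
               conv=notrunc 2>/dev/null
        fi
        # All-zero block: nothing is written, leaving a hole at this offset.
        i=$((i + 1))
    done

    # Set the final length so that a trailing run of zeros is preserved.
    dd if=/dev/null of="$dst" bs=1 seek="$size" count=0 2>/dev/null
}
```

A call such as `sparse_copy ./a ../a 131072` would copy the file from the description in 128 KiB blocks. Spawning dd per block is far too slow for real use; the point is only that the zero check touches all of the data, holes included.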