Bug 8979 - rsync daemon: High load while skipping hardlinks
Summary: rsync daemon: High load while skipping hardlinks
Alias: None
Product: rsync
Classification: Unclassified
Component: core
Version: 3.0.5
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Wayne Davison
QA Contact: Rsync QA Contact
Depends on:
Reported: 2012-06-05 13:44 UTC by Simon Klinkert
Modified: 2012-06-16 17:06 UTC
Description Simon Klinkert 2012-06-05 13:44:41 UTC

I am observing an rsync daemon process with high load that is stuck in an endless loop in the function check_prior() in hlink.c.
I stepped through this function with gdb, but execution never leaves its while loop.

rsync command: rsync --server -vlHtprze.iLs --timeout=600 --delete --partial --ignore-existing . <path>

gdb back trace:

#0  0x080ad49c in check_prior (file=0x8297f7c, gnum=0, prev_ndx_p=0x80449a0, flist_p=0x80449a4) at hlink.c:268
#1  0x080ae0f3 in skip_hard_link (file=0x8297f7c, flist_p=0x81bf8b4) at hlink.c:549
#2  0x080c6a14 in handle_skipped_hlink (file=0x8297f7c, itemizing=1, code=FLOG, f_out=1) at generator.c:2015
#3  0x080c4b68 in recv_generator (
    fname=0x8045900 "<path to directory>", 
    file=0x8297f7c, ndx=647, itemizing=1, code=FLOG, f_out=1) at generator.c:1400
#4  0x080c761f in generate_files (f_out=1, local_name=0x0) at generator.c:2262
#5  0x0809fa50 in do_recv (f_in=0, f_out=1, local_name=0x0) at main.c:832
#6  0x0809fdd0 in do_server_recv (f_in=0, f_out=1, argc=1, argv=0x81d06e4) at main.c:942
#7  0x0809feb2 in start_server (f_in=0, f_out=1, argc=2, argv=0x81d06e0) at main.c:972

Please let me know what further information you need.
Comment 1 Simon Klinkert 2012-06-14 06:47:45 UTC
After some more investigation with gdb, it seems that something is wrong in the function flist_for_ndx().

check_prior() calls flist_for_ndx() with ndx=759. The problem is that it returns cur_flist every time, since there is no way for it to select any flist other than cur_flist (because ndx=759 and cur_flist->ndx_start = 758).

next = 0x81dfb18, prev = 0x87033d0, files = 0x88a34e0, sorted = 0x88a34e0, file_pool = 0x81d1800, pool_boundary = 0x82bc474, used = 3, malloced = 32768, low = 0, high = 2, ndx_start = 758, flist_num = 50, parent_ndx = 49, in_progress = -2, to_redo = 0

I don't know whether this would be the correct fix, but maybe we need something like this after the loops in flist_for_ndx():

if (ndx == flist->ndx_start - 1)
    return NULL;
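
For readers following the analysis, the kind of lookup under discussion can be modeled roughly as below. This is a simplified sketch under stated assumptions, not rsync's actual flist_for_ndx(): the names struct flist_seg, seg_for_ndx and example are hypothetical, the two neighbouring segments are made-up values, and the rule that a segment also accepts ndx_start - 1 is an assumption. Only ndx_start = 758 and used = 3 are taken from the gdb dump above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for one file-list segment.
 * Field names follow the gdb dump; everything else is illustrative. */
struct flist_seg {
    struct flist_seg *next, *prev;
    int ndx_start;  /* global index of the first entry in this segment */
    int used;       /* number of entries held by this segment */
};

/* Walk from `cur` to the segment that owns `ndx`, or return NULL.
 * ASSUMPTION: a segment also accepts ndx_start - 1. */
static struct flist_seg *seg_for_ndx(struct flist_seg *cur, int ndx)
{
    struct flist_seg *seg = cur;

    assert(cur != NULL);

    /* Step backwards while ndx lies before this segment. */
    while (seg && ndx < seg->ndx_start - 1)
        seg = seg->prev;

    /* Step forwards while ndx lies past the end of this segment. */
    while (seg && ndx >= seg->ndx_start + seg->used)
        seg = seg->next;

    /* Reject an index that fell into a gap between segments. */
    if (seg && ndx < seg->ndx_start - 1)
        return NULL;

    return seg;
}

/* Example chain: the middle segment uses ndx_start = 758 and used = 3
 * from the report; both neighbours are made-up values. */
static struct flist_seg example[3] = {
    { &example[1], NULL,        750, 7 },
    { &example[2], &example[0], 758, 3 },
    { NULL,        &example[1], 762, 5 },
};
```

In this model the middle segment covers indices 758 through 760, so a lookup of 759 starting from that segment returns the same segment, while a guard on ndx == ndx_start - 1 would only match 757. Whether the real code behaves the same way in 3.0.5 would need to be checked against the source.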

I have three broken rsync daemons, but I have no idea how to reproduce this behavior.
Comment 2 Wayne Davison 2012-06-16 17:06:59 UTC
You should use the --owner option with -H or upgrade to a newer rsync.