Bug 10670 - When IP takeover occurs, NFS client prints error "stale NFS file handle"
Summary: When IP takeover occurs, NFS client prints error "stale NFS file handle"
Status: CLOSED WORKSFORME
Alias: None
Product: CTDB 2.5.x or older
Classification: Unclassified
Component: ctdb
Version: 1.0.71
Hardware: All
OS: All
Importance: P5 normal
Target Milestone: ---
Assignee: Michael Adam
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-06-23 09:44 UTC by yuxiangyang (dead mail address)
Modified: 2017-10-26 03:41 UTC
CC List: 2 users

See Also:


Description yuxiangyang (dead mail address) 2014-06-23 09:44:22 UTC
My ctdb version is:
ctdb-1.0.114.3-3.el6.x86_64

My Linux version is CentOS 6.2.

I use glusterfs as my storage backend, and my NFS clients use NFSv3.

I set up my cluster according to this article:
http://www.gluster.org/community/documentation/index.php/CTDB
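
For reference, the relevant pieces of that setup look roughly like this on my nodes (the addresses and the lock path below are placeholders, not my real ones):

    # /etc/sysconfig/ctdb -- relevant settings
    CTDB_RECOVERY_LOCK=/gluster/lock/reclock      # lock file on the shared gluster volume
    CTDB_NODES=/etc/ctdb/nodes
    CTDB_PUBLIC_ADDRESSES=/etc/ctdb/public_addresses
    CTDB_MANAGES_NFS=yes                          # let ctdb start, stop and monitor nfsd

    # /etc/ctdb/nodes -- internal address of each cluster node
    192.168.1.1
    192.168.1.2

    # /etc/ctdb/public_addresses -- floating IPs that ctdb moves between nodes
    10.0.0.101/24 eth0
    10.0.0.102/24 eth0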


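The write load on the client is essentially a dd loop like the following (the mount point and file names here are illustrative, not my exact script):

    # On the NFS client: write small files in a loop through the mounted share.
    while true; do
        dd if=/dev/zero of=/mnt/nfs/ipc_1/picd1_$(date +%s).jpg bs=1M count=10
        dd if=/dev/zero of=/mnt/nfs/ipc_2/picds1_$(date +%s).jpg bs=1M count=2
    done

    # On node1: stop ctdb to force the public IP to move to another node.
    service ctdb stop
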
When I stop ctdb on one node, node1 (an NFS client is writing small files to this node at the time), the NFS client prints "Stale NFS file handle":

dd: opening `ipc_1/picd1_613_1.jpg': Stale NFS file handle
dd: opening `ipc_1/picds1_613_1.jpg': Stale NFS file handle
dd: opening `ipc_2/picd1_613_1.jpg': Stale NFS file handle
dd: opening `ipc_2/picds1_613_1.jpg': Stale NFS file handle
dd: opening `ipc_1/picd1_613_2.jpg': Stale NFS file handle
dd: opening `ipc_1/picds1_613_2.jpg': Stale NFS file handle
dd: opening `ipc_2/picd1_613_2.jpg': Stale NFS file handle
dd: opening `ipc_2/picds1_613_2.jpg': Stale NFS file handle
dd: opening `ipc_1/picd1_613_3.jpg': Stale NFS file handle
dd: opening `ipc_1/picds1_613_3.jpg': Stale NFS file handle
dd: opening `ipc_2/picd1_613_3.jpg': Stale NFS file handle
dd: opening `ipc_2/picds1_613_3.jpg': Stale NFS file handle

After a while, another node, node2, takes over the IP, and the NFS client writes files without any error:

10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.238441 s, 44.0 MB/s
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 0.0437292 s, 48.0 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.137964 s, 76.0 MB/s
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 0.031062 s, 67.5 MB/s
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 0.0431491 s, 48.6 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.229843 s, 45.6 MB/s
2+0 records in
2+0 records out

When I start ctdb on node1 again and ctdb begins recovery, I get the same error on the NFS client:

dd: opening `ipc_1/picd1_613_1.jpg': Stale NFS file handle
dd: opening `ipc_1/picds1_613_1.jpg': Stale NFS file handle
dd: opening `ipc_2/picd1_613_1.jpg': Stale NFS file handle
dd: opening `ipc_2/picds1_613_1.jpg': Stale NFS file handle
dd: opening `ipc_1/picd1_613_2.jpg': Stale NFS file handle
dd: opening `ipc_1/picds1_613_2.jpg': Stale NFS file handle
dd: opening `ipc_2/picd1_613_2.jpg': Stale NFS file handle
dd: opening `ipc_2/picds1_613_2.jpg': Stale NFS file handle
dd: opening `ipc_1/picd1_613_3.jpg': Stale NFS file handle
dd: opening `ipc_1/picds1_613_3.jpg': Stale NFS file handle
dd: opening `ipc_2/picd1_613_3.jpg': Stale NFS file handle
dd: opening `ipc_2/picds1_613_3.jpg': Stale NFS file handle

After a while, when node1 has recovered completely, the data is written correctly:

10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.238441 s, 44.0 MB/s
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 0.0437292 s, 48.0 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.137964 s, 76.0 MB/s
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 0.031062 s, 67.5 MB/s
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 0.0431491 s, 48.6 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.229843 s, 45.6 MB/s
2+0 records in
2+0 records out
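
The takeover and recovery can be followed from any cluster node with the standard ctdb tools; shown here only as a sketch:

    # On any cluster node:
    ctdb status     # per-node flags (OK, UNHEALTHY, DISCONNECTED, BANNED) and recovery mode
    ctdb ip         # which node currently hosts each public address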

Any reply is appreciated!
Comment 1 Martin Schwenke 2017-10-26 03:38:47 UTC
Are you still seeing this?
Comment 2 Martin Schwenke 2017-10-26 03:41:09 UTC
Second attempt to close. This works for others; sending email to the reporter failed.