Bug 11238 - CTDB – GlusterFS NFS Event Script
Status: NEEDINFO
Product: CTDB 2.5.x or older
Classification: Unclassified
Component: ctdb
Version: 2.5.1
Platform: All Linux
Importance: P5 enhancement
Target Milestone: ---
Assigned To: Martin Schwenke
QA Contact: Samba QA Contact
Reported: 2015-04-26 13:07 UTC by Ben Draper
Modified: 2016-02-08 03:20 UTC (History)

Attachments:
* GlusterFS NFS Event Monitor Script (Old Version Ignore) (1.26 KB, application/x-shellscript), 2015-04-26 13:07 UTC, Ben Draper
* GlusterFS NFS Event Monitor Script (1.34 KB, application/x-shellscript), 2015-04-26 13:15 UTC, Ben Draper

Description Ben Draper 2015-04-26 13:07:27 UTC
Created attachment 10989 [details]
GlusterFS NFS Event Monitor Script (Old Version Ignore)

Hello Support,

There is no CTDB monitor script for the GlusterFS NFS implementation. The normal NFS event script that ships with CTDB cannot be used, because GlusterFS manages its own NFS service.

Without a proper monitoring script, CTDB will not initiate a failover when the GlusterFS NFS services fail. The attached script solves this problem.
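The attachment itself is not inlined here, but a minimal sketch of what such an events.d script might look like follows. This is not the attached script: the port_listening helper and the port list are illustrative, and a real CTDB event script would use the ctdb_check_tcp_ports helper from /etc/ctdb/functions.

```shell
#!/bin/bash
# Hypothetical sketch of a 60.glusternfs event script (the attachment
# is the real version).  CTDB runs events.d scripts with the event
# name as $1; a non-zero exit from the "monitor" event marks the node
# UNHEALTHY and triggers IP failover.

# Probe a local TCP port using bash's /dev/tcp pseudo-device.
port_listening () {
    (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

monitor_glusterfs_nfs () {
    local port
    for port in 2049 38465 38466 ; do
        if ! port_listening "$port" ; then
            echo "ERROR: glusterfs_nfs tcp port $port is not responding"
            return 1
        fi
    done
    return 0
}

case "$1" in
monitor)
    monitor_glusterfs_nfs || exit 1
    ;;
esac
```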

Please see testing below:

# ctdb status
Number of nodes:2
pnn:0 10.0.1.10        OK (THIS NODE)
pnn:1 10.0.1.11        OK
Generation:2096778561
Size:2
hash:0 lmaster:0
hash:1 lmaster:1
Recovery mode:NORMAL (0)
Recovery master:1
# gluster volume status smb_br01 | grep 'NFS Server on localhost'
NFS Server on localhost					2049	Y	15479
# kill -9 15479
# gluster volume status smb_br01 | grep 'NFS Server on localhost'
NFS Server on localhost					N/A	N	N/A
# ctdb status
Number of nodes:2
pnn:0 10.0.1.10        UNHEALTHY (THIS NODE)
pnn:1 10.0.1.11        OK
Generation:2096778561
Size:2
hash:0 lmaster:0
hash:1 lmaster:1
Recovery mode:NORMAL (0)
Recovery master:1

# tail /var/log/log.ctdb 
2015/04/26 14:00:29.465384 [ 2050]: Node became UNHEALTHY. Ask recovery master 1 to perform ip reallocation
2015/04/26 14:00:34.838603 [ 2050]: 60.glusternfs: ERROR: glusterfs_nfs tcp port 2049 is not responding
2015/04/26 14:00:34.841680 [ 2050]: 60.glusternfs: ERROR: glusterfs_nfs tcp port 38465 is not responding
2015/04/26 14:00:34.844732 [ 2050]: 60.glusternfs: ERROR: glusterfs_nfs tcp port 38466 is not responding
2015/04/26 14:00:45.210742 [ 2050]: 60.glusternfs: ERROR: glusterfs_nfs tcp port 2049 is not responding
2015/04/26 14:00:45.213786 [ 2050]: 60.glusternfs: ERROR: glusterfs_nfs tcp port 38465 is not responding
2015/04/26 14:00:45.216709 [ 2050]: 60.glusternfs: ERROR: glusterfs_nfs tcp port 38466 is not responding

# systemctl restart glusterd && systemctl restart glusterfsd
# gluster volume status smb_br01 | grep 'NFS Server on localhost'
NFS Server on localhost					2049	Y	18629
# ctdb status
Number of nodes:2
pnn:0 10.0.1.10        OK (THIS NODE)
pnn:1 10.0.1.11        OK
Generation:2096778561
Size:2
hash:0 lmaster:0
hash:1 lmaster:1
Recovery mode:NORMAL (0)
Recovery master:1
# 

Regards,
Ben Draper
Comment 1 Ben Draper 2015-04-26 13:15:22 UTC
Created attachment 10990 [details]
GlusterFS NFS Event Monitor Script

GlusterFS NFS Event Monitor Script
Comment 2 Martin Schwenke 2015-06-22 05:51:06 UTC
Thanks for this suggestion and sorry for taking so long to respond.

I assume that TCP ports 38465 and 38466 are RPC services?  The job
you're doing with verify_ports() can probably be done more reliably
and completely using our existing RPC port checking code.  Can you
please send me the output of "rpcinfo -p" and "rpcinfo -s" so I can
do a sanity check?
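For reference, the kind of RPC-level probe being suggested can be sketched with rpcinfo. The check_rpc_service helper below is hypothetical, and CTDB's real ctdb_check_rpc code differs in detail; the program/port mapping in the comment is a guess pending the rpcinfo output.

```shell
#!/bin/bash
# Sketch of probing RPC services via the portmapper instead of raw
# TCP ports.  check_rpc_service is a hypothetical helper, not CTDB's
# actual implementation.

check_rpc_service () {
    # $1 = RPC program name or number, $2 = version, $3 = netid
    # (tcp/udp).  rpcinfo -T asks the portmapper for the service and
    # calls its NULL procedure, so it verifies the service answers.
    rpcinfo -T "$3" localhost "$1" "$2" >/dev/null 2>&1
}

# If 38465/38466 really are mountd, the raw port checks could become:
#   check_rpc_service mountd 3 tcp
#   check_rpc_service mountd 1 tcp
#   check_rpc_service nfs 3 tcp
```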

We're in the process of folding 60.ganesha into 60.nfs so that we only
have a single, unified NFS eventscript in CTDB.

I'm in the process of reworking our RPC checking code.  It will use a
directory of configuration files (/etc/ctdb/nfs-checks.d/ by default)
and a more extensible scheme than what we currently do in
/etc/ctdb/nfs-rpc-checks.d/.

I like your idea of actually monitoring the port(s) for the portmapper
itself.  I'm planning to add a configuration file to do this by
default.  Thanks.  :-)

Anything else will be done by configuring a call-out for the NFS
system being used.  That will stop us hard-coding all sorts of rubbish
in our core code.  :-)

We will ship a callout for the Linux kernel NFS server and
configuration files for RPC checks.  These will be used by default.

We will provide a sample callout for Ganesha as documentation and will
expect NFS Ganesha to ship an up-to-date callout for CTDB.  This way
they will always be in sync (with themselves) and won't have to depend
on us to merge changes to 60.ganesha.

Gluster NFS could then also provide a very simple callout and
instructions (or a script) to setup configuration files in
/etc/ctdb/nfs-checks.d/ for the RPC checks.
Comment 3 Ben Draper 2015-06-24 09:55:00 UTC
Thanks for getting back to me Martin, I really appreciate it. I'll try and provide as much information as possible to help you with this.

I like your idea of having one handler and then callouts for the different types of NFS implementations; that would be fantastic! :-)

The issue I had is that the nfs-kernel-server service never gets started, as GlusterFS has its own NFS implementation, so I needed a different script to ensure monitoring worked properly, as you can see below. With regard to the RPC side of my checking code, you're probably right that there are better ways to do it :-)

# rpcinfo -p
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100005    3   tcp  38465  mountd
    100005    1   tcp  38466  mountd
    100003    3   tcp   2049  nfs
    100021    4   tcp  38468  nlockmgr
    100227    3   tcp   2049  nfs_acl
    100024    1   udp  58457  status
    100024    1   tcp  38265  status
    100021    1   udp    932  nlockmgr
    100021    1   tcp    934  nlockmgr
#

# rpcinfo -s
   program version(s) netid(s)                         service     owner
    100000  2,3,4     local,udp,tcp,udp6,tcp6          portmapper  superuser
    100005  1,3       tcp                              mountd      superuser
    100003  3         tcp                              nfs         superuser
    100021  1,4       udp,tcp                          nlockmgr    superuser
    100227  3         tcp                              nfs_acl     superuser
    100024  1         tcp6,udp6,tcp,udp                status      29
#

# fuser 111/tcp
111/tcp:               583
# fuser 38465/tcp
38465/tcp:            2446
# fuser 38466/tcp
38466/tcp:            2446
# fuser 2049/tcp
2049/tcp:             2446
# fuser 38468/tcp
38468/tcp:            2446
# fuser 58457/udp
58457/udp:            2683
# fuser 38265/tcp
38265/tcp:            2683
# fuser 932/udp
932/udp:              2446
# fuser 934/tcp
934/tcp:              2446

# ps -elf | grep 583 | grep -v grep
5 S rpc        583     1  0  80   0 -  9977 poll_s 17:05 ?        00:00:00 /sbin/rpcbind -w
# ps -elf | grep 2446 | grep -v grep
5 S root      2446     1  0  80   0 - 142206 futex_ 17:05 ?       00:00:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/gluster/572f9a612871bae19917989c604bd09b.socket
# ps -elf | grep 2683 | grep -v grep
5 S rpcuser   2683     1  0  80   0 - 12691 poll_s 17:05 ?        00:00:00 /sbin/rpc.statd
#

# firewall-cmd --list-all
public (default, active)
  interfaces: eno67113728
  sources:
  services: nfs rpc-bind samba ssh
  ports: 38466/tcp 38465/tcp
  masquerade: no
  forward-ports:
  icmp-blocks:
  rich rules:
#

If you need any more information, please let me know.

Thanks,
Ben
Comment 4 Martin Schwenke 2015-07-15 10:23:14 UTC
The new 60.nfs with $CTDB_NFS_CALLOUT and new /etc/ctdb/nfs-checks.d/
directory is now upstream in Samba master.

To implement your existing eventscripts you would need to:

* Implement a callout with "monitor-pre" defined and have it run
  verify_supporting_services().  Set CTDB_NFS_CALLOUT to point to
  where it is installed.

  You probably want to define "register" too, so that the callout is
  only called for "monitor-pre".  Take a look at
  nfs-linux-kernel-callout as an example.

* Implement verify_ports() using .check files in
  /etc/ctdb/nfs-checks.d/.

  If you want CTDB to become unhealthy after a single failure then you
  would just have files like:

    20.nfs.check:

    # nfs
    version=3
    unhealthy_after=1

    30.nlockmgr.check:

    # nlockmgr
    version="1 4"
    unhealthy_after=1

  and so on.

  The current default is to just check for services available on "tcp"
  but you can also do "udp".  This is done by using the "family"
  variable in the check file (there's a README to explain).

  I see I didn't implement support in ctdb_check_rpc() for properly
  checking IPv6 service availability.  That patch is now in my queue,
  so you'll be able to explicitly check for "tcp6" and "udp6" if you
  need to.  :-)

  You could either include instructions to install/create these or
  provide a directory which can be pointed to by the
  CTDB_NFS_CHECKS_DIR variable.  This variable is currently
  undocumented but we could document it.
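Putting those points together, a rough sketch of what a GlusterFS callout might look like follows. The file name nfs-glusterfs-callout and the pgrep pattern are assumptions, and verify_supporting_services here is just a stand-in for the checks in Ben's attached script.

```shell
#!/bin/bash
# Hypothetical nfs-glusterfs-callout sketch.  CTDB calls the script
# named by CTDB_NFS_CALLOUT with a method name as $1; "register"
# advertises which methods are implemented so CTDB only calls the
# callout for those.

verify_supporting_services () {
    # Stand-in check: is the GlusterFS NFS daemon running?  The pattern
    # matches the glusterfs process seen in the ps output above.
    pgrep -f 'volfile-id gluster/nfs' >/dev/null
}

nfs_callout () {
    case "$1" in
    register)
        # Only monitor-pre is implemented here.
        echo "monitor-pre"
        ;;
    monitor-pre)
        if ! verify_supporting_services ; then
            echo "ERROR: GlusterFS NFS server is not running"
            exit 1
        fi
        ;;
    esac
}

nfs_callout "$@"
```

CTDB_NFS_CALLOUT would then point at wherever this is installed, for example CTDB_NFS_CALLOUT=/etc/ctdb/nfs-glusterfs-callout.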

Will you also want to use CTDB's statd-callout?  Or will we need to
quickly add something to disable this?  Not sure if Gluster NFS has a
cluster-aware lock manager or if it needs to use CTDB's hackery to
keep track of the clients that have locks.
Comment 5 Ben Draper 2015-07-23 16:00:21 UTC
Hi Martin,

All those points make sense. I'll see if I can get some time to create the callout and the required check files to verify the ports, and to test that it all works properly from upstream with GlusterFS.

I'll have to double-check the statd-callout question. I think GlusterFS takes care of locks with its NFS implementation, but I will need to verify that to be 100% sure.

Thanks,
Ben
Comment 6 Martin Schwenke 2016-02-08 03:20:11 UTC
Hi Ben,

The NFS callout feature is now available in Samba 4.3.  Can you please check if it meets your needs?

Thanks...

peace & happiness,
martin