Bug 10752 - Apparent Memory leak in CTDB version: 2.5.3
Status: NEW
Product: CTDB 2.5.x or older
Classification: Unclassified
Component: ctdb
Version: 2.5.3
Platform: All Linux
Importance: P5 major
Assigned To: Amitay Isaacs
QA Contact: Samba QA Contact
Reported: 2014-07-29 13:36 UTC by Scott Goldman
Modified: 2016-06-21 09:51 UTC


Attachments
A compressed output of the ctdb dumpmemory command after smb was turned off (600.72 KB, application/octet-stream)
2014-08-21 14:03 UTC, Scott Goldman
The output of the ctdb statistics command after smb was turned off (2.01 KB, application/octet-stream)
2014-08-21 14:04 UTC, Scott Goldman
ctdb dumpmemory AFTER shutting down smb (652.56 KB, application/octet-stream)
2014-09-03 20:50 UTC, Scott Goldman
ctdb statistics AFTER shutting down smb (2.01 KB, application/octet-stream)
2014-09-03 20:50 UTC, Scott Goldman

Description Scott Goldman 2014-07-29 13:36:28 UTC
Over the last 24 hours, "ctdb memory_used" has continued to climb, from 24701934 at 2pm to 114631972 at 9am (with the majority of users logging off after 6pm). We have just upgraded from ctdb 1 to ctdb 2.5.

All tdb databases seem healthy.  ctdb repack 1 has been run.

Completely restarting SMB/nmbd does not reduce CTDB's memory usage; the only way to reduce it is to restart CTDB.

When memory usage gets too high, responsiveness drops (file open/close/navigation all slow down).
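The growth trend is easy to confirm by filtering "ctdb statistics" for memory_used. A minimal sketch, shown here against a captured sample rather than a live ctdbd so the parsing can be checked in isolation:

```shell
# Extract memory_used from "ctdb statistics" output.  Against a live
# daemon this would simply be:
#   ctdb statistics | awk '/memory_used/ {print $2}'
sample=' num_clients                      155
 memory_used                 114631972'
echo "$sample" | awk '/memory_used/ {print $2}'   # -> 114631972
```

Sampling this in a loop (e.g. every 10 minutes with a timestamp) gives the growth curve described above.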


[global]
	dos charset = CP850
	unix charset = UTF-8
	display charset = LOCALE
	workgroup = CAGENASGROUP
	realm = 
	netbios name = CAGENAS
	netbios aliases = 
	netbios scope = 
	server string = Samba 3.6.16
	interfaces = 
	bind interfaces only = No
	security = USER
	auth methods = 
	encrypt passwords = Yes
	client schannel = Auto
	server schannel = Auto
	allow trusted domains = Yes
	map to guest = Bad Password
	null passwords = Yes
	obey pam restrictions = No
	password server = *
	smb passwd file = /etc/samba/smbpasswd
	private dir = /etc/samba
	passdb backend = tdbsam
	algorithmic rid base = 1000
	root directory = 
	guest account = nobody
	enable privileges = Yes
	pam password change = No
	passwd program = 
	passwd chat = *new*password* %n\n *new*password* %n\n *changed*
	passwd chat debug = No
	passwd chat timeout = 2
	check password script = 
	username map = 
	password level = 0
	username level = 0
	unix password sync = No
	restrict anonymous = 0
	lanman auth = No
	ntlm auth = No
	client NTLMv2 auth = No
	client lanman auth = No
	client plaintext auth = No
	client use spnego principal = No
	send spnego principal = No
	preload modules = 
	dedicated keytab file = 
	kerberos method = default
	map untrusted to domain = No
	log level = 2
	syslog = 1
	syslog only = No
	log file = /var/log/samba/%m.log
	max log size = 5000
	debug timestamp = Yes
	debug prefix timestamp = No
	debug hires timestamp = Yes
	debug pid = No
	debug uid = No
	debug class = No
	enable core files = Yes
	smb ports = 445 139
	large readwrite = Yes
	max protocol = SMB2
	min protocol = CORE
	min receivefile size = 0
	read raw = Yes
	write raw = Yes
	disable netbios = Yes
	reset on zero vc = No
	log writeable files on exit = No
	acl compatibility = auto
	defer sharing violations = Yes
	nt pipe support = Yes
	nt status support = Yes
	announce version = 4.9
	announce as = NT
	max mux = 50
	max xmit = 65535
	name resolve order = lmhosts wins host bcast
	max ttl = 259200
	max wins ttl = 518400
	min wins ttl = 21600
	time server = No
	unix extensions = No
	use spnego = Yes
	client signing = auto
	server signing = No
	client use spnego = Yes
	client ldap sasl wrapping = plain
	enable asu support = No
	svcctl list = 
	deadtime = 15
	getwd cache = Yes
	keepalive = 300
	lpq cache time = 30
	max smbd processes = 0
	paranoid server security = Yes
	max disk size = 0
	max open files = 16404
	socket options = IPTOS_LOWDELAY TCP_NODELAY SO_KEEPALIVE
	use mmap = No
	hostname lookups = No
	name cache timeout = 660
	ctdbd socket = 
	cluster addresses = 
	clustering = Yes
	ctdb timeout = 0
	ctdb locktime warn threshold = 0
	smb2 max read = 65536
	smb2 max write = 65536
	smb2 max trans = 65536
	smb2 max credits = 8192
	load printers = No
	printcap cache time = 750
	printcap name = /dev/null
	cups server = 
	cups encrypt = No
	cups connection timeout = 30
	iprint server = 
	disable spoolss = Yes
	addport command = 
	enumports command = 
	addprinter command = 
	deleteprinter command = 
	show add printer wizard = No
	os2 driver map = 
	mangling method = hash2
	mangle prefix = 1
	max stat cache size = 256
	stat cache = Yes
	machine password timeout = 0
	add user script = 
	rename user script = 
	delete user script = 
	add group script = 
	delete group script = 
	add user to group script = 
	delete user from group script = 
	set primary group script = 
	add machine script = 
	shutdown script = 
	abort shutdown script = 
	username map script = 
	username map cache time = 0
	logon script = 
	logon path = \\%N\%U\profile
	logon drive = 
	logon home = \\%N\%U
	domain logons = No
	init logon delayed hosts = 
	init logon delay = 100
	os level = 20
	lm announce = No
	lm interval = 60
	preferred master = No
	local master = No
	domain master = Auto
	browse list = Yes
	enhanced browsing = Yes
	dns proxy = Yes
	wins proxy = No
	wins server = 
	wins support = No
	wins hook = 
	kernel oplocks = Yes
	lock spin time = 200
	oplock break wait time = 0
	ldap admin dn = 
	ldap delete dn = No
	ldap group suffix = 
	ldap idmap suffix = 
	ldap machine suffix = 
	ldap passwd sync = no
	ldap replication sleep = 1000
	ldap suffix = 
	ldap ssl = start tls
	ldap ssl ads = No
	ldap deref = auto
	ldap follow referral = Auto
	ldap timeout = 15
	ldap connection timeout = 2
	ldap page size = 1024
	ldap user suffix = 
	ldap debug level = 0
	ldap debug threshold = 10
	eventlog list = 
	add share command = 
	change share command = 
	delete share command = 
	preload = 
	lock directory = /var/lib/samba
	state directory = /var/lib/samba
	cache directory = /var/lib/samba
	pid directory = /var/run/samba
	utmp directory = 
	wtmp directory = 
	utmp = No
	default service = 
	message command = 
	get quota command = 
	set quota command = 
	remote announce = 
	remote browse sync = 
	socket address = 0.0.0.0
	nmbd bind explicit broadcast = Yes
	homedir map = auto.home
	afs username map = 
	afs token lifetime = 604800
	log nt token command = 
	time offset = 0
	NIS homedir = No
	registry shares = Yes
	usershare allow guests = No
	usershare max shares = 0
	usershare owner only = Yes
	usershare path = /var/lib/samba/usershares
	usershare prefix allow list = 
	usershare prefix deny list = 
	usershare template share = 
	allow insecure wide links = No
	async smb echo handler = No
	multicast dns register = Yes
	panic action = 
	perfcount module = 
	host msdfs = Yes
	passdb expand explicit = No
	idmap backend = tdb
	idmap cache time = 604800
	idmap negative cache time = 120
	idmap uid = 
	idmap gid = 
	template homedir = /cagefs/.rootconfig/userhome/%U
	template shell = /bin/bash
	winbind separator = \
	winbind cache time = 300
	winbind reconnect delay = 30
	winbind max clients = 200
	winbind enum users = No
	winbind enum groups = No
	winbind use default domain = No
	winbind trusted domains only = No
	winbind nested groups = Yes
	winbind expand groups = 1
	winbind nss info = template
	winbind refresh tickets = No
	winbind offline logon = No
	winbind normalize names = No
	winbind rpc only = No
	create krb5 conf = Yes
	ncalrpc dir = /var/lib/samba/ncalrpc
	winbind max domain connections = 1
	fileid:mapping = fsname
	gpfs:leases = yes
	gpfs:dfreequota = yes
	gpfs:prealloc = yes
	gpfs:sharemodes = no
	gpfs:winattr = yes
	idmap config * : range = 10000-600000
	nfs4:acedup = merge
	nfs4:chown = yes
	nfs4:mode = special
	idmap config * : backend = tdb2
	comment = 
	path = 
	username = 
	invalid users = 
	valid users = 
	admin users = 
	read list = 
	write list = 
	printer admin = 
	force user = 
	force group = 
	read only = Yes
	acl check permissions = Yes
	acl group control = No
	acl map full control = Yes
	create mask = 0744
	force create mode = 00
	security mask = 0777
	force security mode = 00
	directory mask = 0755
	force directory mode = 00
	directory security mask = 0777
	force directory security mode = 00
	force unknown acl user = No
	inherit permissions = No
	inherit acls = No
	inherit owner = No
	guest only = No
	administrative share = No
	guest ok = No
	only user = No
	hosts allow = 
	hosts deny = 
	allocation roundup size = 1048576
	aio read size = 0
	aio write size = 0
	aio write behind = 
	ea support = No
	nt acl support = Yes
	profile acls = No
	map acl inherit = No
	afs share = No
	smb encrypt = auto
	block size = 1024
	change notify = Yes
	directory name cache size = 100
	kernel change notify = No
	max connections = 0
	min print space = 0
	strict allocate = Yes
	strict sync = No
	sync always = No
	use sendfile = No
	write cache size = 0
	max reported print jobs = 0
	max print jobs = 1000
	printable = No
	print notify backchannel = Yes
	print ok = No
	printing = bsd
	cups options = 
	print command = lpr -r -P'%p' %s
	lpq command = lpq -P'%p'
	lprm command = lprm -P'%p' %j
	lppause command = 
	lpresume command = 
	queuepause command = 
	queueresume command = 
	printer name = 
	use client driver = No
	default devmode = Yes
	force printername = No
	printjob username = %U
	default case = lower
	case sensitive = Auto
	preserve case = Yes
	short preserve case = Yes
	mangling char = ~
	hide dot files = Yes
	hide special files = No
	hide unreadable = No
	hide unwriteable files = No
	delete veto files = No
	veto files = /.rootconfig/user.quota/group.quota/fileset.quota/lost+found/*.eml/*.nws/riched20.dll/*.{*}/iexplores.exe/IEXPLORES.EXE/explorer.exe/EXPLORER.EXE/autorun.*/AUTORUN.*/autorun2.inf/AUTORUN2.inf/open.exe/updates.exe/
	hide files = 
	veto oplock files = 
	map archive = Yes
	map hidden = No
	map system = No
	map readonly = yes
	mangled names = Yes
	store dos attributes = No
	dmapi support = No
	browseable = Yes
	access based share enum = No
	blocking locks = Yes
	csc policy = manual
	fake oplocks = No
	locking = Yes
	oplocks = Yes
	level2 oplocks = Yes
	oplock contention limit = 2
	posix locking = No
	strict locking = Auto
	share modes = Yes
	dfree cache time = 0
	dfree command = 
	copy = 
	preexec = 
	preexec close = No
	postexec = 
	root preexec = 
	root preexec close = No
	root postexec = 
	available = Yes
	volume = 
	fstype = NTFS
	set directory = No
	wide links = No
	follow symlinks = Yes
	dont descend = 
	magic script = 
	magic output = 
	delete readonly = No
	dos filemode = No
	dos filetimes = Yes
	dos filetime resolution = No
	fake directory create times = No
	vfs objects = gpfs, fileid
	msdfs root = No
	msdfs proxy =
Comment 1 Scott Goldman 2014-07-29 13:47:26 UTC
ctdb listvars - mostly defaults (notably: Samba3AvoidDeadlocks=1)
MaxRedirectCount        = 3
SeqnumInterval          = 1000
ControlTimeout          = 60
TraverseTimeout         = 20
KeepaliveInterval       = 5
KeepaliveLimit          = 5
RecoverTimeout          = 120
RecoverInterval         = 1
ElectionTimeout         = 3
TakeoverTimeout         = 9
MonitorInterval         = 15
TickleUpdateInterval    = 20
EventScriptTimeout      = 30
EventScriptTimeoutCount = 20
RecoveryGracePeriod     = 120
RecoveryBanPeriod       = 300
DatabaseHashSize        = 100001
DatabaseMaxDead         = 5
RerecoveryTimeout       = 10
EnableBans              = 1
DeterministicIPs        = 0
LCP2PublicIPs           = 1
ReclockPingPeriod       = 60
NoIPFailback            = 0
DisableIPFailover       = 0
VerboseMemoryNames      = 0
RecdPingTimeout         = 60
RecdFailCount           = 10
LogLatencyMs            = 0
RecLockLatencyMs        = 1000
RecoveryDropAllIPs      = 120
VerifyRecoveryLock      = 1
VacuumInterval          = 10
VacuumMaxRunTime        = 120
RepackLimit             = 10000
VacuumLimit             = 5000
VacuumFastPathCount     = 60
MaxQueueDropMsg         = 1000000
UseStatusEvents         = 0
AllowUnhealthyDBRead    = 0
StatHistoryInterval     = 1
DeferredAttachTO        = 120
AllowClientDBAttach     = 1
RecoverPDBBySeqNum      = 1
DeferredRebalanceOnNodeAdd = 300
FetchCollapse           = 1
HopcountMakeSticky      = 50
StickyDuration          = 600
StickyPindown           = 200
NoIPTakeover            = 0
DBRecordCountWarn       = 100000
DBRecordSizeWarn        = 10000000
DBSizeWarn              = 100000000
PullDBPreallocation     = 10485760
NoIPHostOnAllDisabled   = 0
Samba3AvoidDeadlocks    = 1
Comment 2 Amitay Isaacs 2014-08-20 12:40:00 UTC
Can you attach the output of "ctdb dumpmemory" when there are no users and the memory usage is still high, along with the output of "ctdb statistics"?

Does this happen on all the nodes?
Comment 3 Scott Goldman 2014-08-21 14:03:34 UTC
Created attachment 10212 [details]
A compressed output of the ctdb dumpmemory command after smb was turned off
Comment 4 Scott Goldman 2014-08-21 14:04:01 UTC
Created attachment 10213 [details]
The output of the ctdb statistics command after smb was turned off
Comment 5 Scott Goldman 2014-08-21 14:04:33 UTC
We are only running on one node.
Comment 6 Scott Goldman 2014-08-21 14:07:22 UTC
And in less than 5 hours, memory usage is now:
[cagenas02: SU: / 1001] ctdb statistics|egrep "num_clients|memory_used"
 num_clients                      155
 memory_used                 13110270

UID        PID  PPID  C STIME TTY          TIME CMD
root     15492     1  1 05:05 ?        00:04:52 /usr/sbin/ctdbd --reclock=/cagefs/.rootconfig/ctdb_shared_lockfile --pidfile=/var/run/ctdb/ctdbd.pid --logfile=/var/log/ctdb --socket=/tmp/ctdb.socket --public-addresses=/etc/ctdb/public_addresses -d ERR
root     15684 15492  0 05:05 ?        00:00:19 /usr/sbin/ctdbd --reclock=/cagefs/.rootconfig/ctdb_shared_lockfile --pidfile=/var/run/ctdb/ctdbd.pid --logfile=/var/log/ctdb --socket=/tmp/ctdb.socket --public-addresses=/etc/ctdb/public_addresses -d ERR
[cagenas02: SU: / 1004] date
Thu Aug 21 10:08:37 EDT 2014
[cagenas02: SU: / 1005] 

This system is used for graphics rendering, and many clients open and close many files quickly.

thanks!
Comment 7 Amitay Isaacs 2014-08-25 05:07:12 UTC
Most of the memory CTDB is holding seems to be the packets for the other node "10.76.155.34".  That means you have configured CTDB for 2 nodes, but running CTDB only on a single node.

What's the motivation of running CTDB on a single node?  You don't need CTDB if you are only using a single node.

Can you reproduce this issue with a single entry in the /etc/ctdb/nodes file?
Comment 8 Scott Goldman 2014-08-25 11:20:58 UTC
Will do. The motivation was that the tdb file became bloated and slowed down (under v1). Moving to v2 was the first step to getting back to multiple nodes running.

Comment 9 Scott Goldman 2014-08-25 13:22:52 UTC
Failure. Removing all unused nodes (and leaving just the "up" node) causes:

  init_smb_request: invalid wct number 255 (size 108)
[2014/08/25 09:22:16.838320,  0] smbd/process.c:525(init_smb_request)
  init_smb_request: invalid wct number 255 (size 108)
[2014/08/25 09:22:16.925185,  0] smbd/process.c:525(init_smb_request)
  init_smb_request: invalid wct number 255 (size 108)
[2014/08/25 09:22:16.929985,  0] smbd/process.c:525(init_smb_request)
  init_smb_request: invalid wct number 255 (size 108)
[2014/08/25 09:22:16.934880,  0] smbd/process.c:525(init_smb_request)
  init_smb_request: invalid wct number 255 (size 108)
[2014/08/25 09:22:16.939888,  0] smbd/process.c:525(init_smb_request)
  init_smb_request: invalid wct number 255 (size 108)
[2014/08/25 09:22:17.025798,  0] smbd/process.c:525(init_smb_request)
  init_smb_request: invalid wct number 255 (size 108)
[2014/08/25 09:22:17.031062,  0] smbd/process.c:525(init_smb_request)
  init_smb_request: invalid wct number 255 (size 108)

and connections fail.

What did I do wrong?

/etc/ctdb/nodes file just had the 1 active node listed.


Scott Goldman
ESPN, Inc.
Senior Director, Platform Engineering and Storage & Backup Services
ph: 860-766-4227
fx: 860-766-4502
email: scott.goldman@espn.com 


Comment 10 Amitay Isaacs 2014-08-25 14:06:24 UTC
Looks like the active node was not the first node in the list.  You cannot change the order of the nodes.  Keep the IP address at the same place in the nodes file.

For example, if the node was the third node, then you have to make sure there are two blank lines before the IP address of the active node.
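To illustrate with a hypothetical address (10.0.0.3 is not from this cluster): a nodes file where only the original third node remains, with two blank lines standing in for the two removed nodes so the survivor keeps PNN 2:

```shell
# Sketch: write an example /etc/ctdb/nodes for the case described
# above.  The address is hypothetical; the blank lines are the point,
# since a node's PNN is its (zero-based) line position in this file.
printf '\n\n10.0.0.3\n' > /tmp/nodes.example
cat /tmp/nodes.example
```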
Comment 11 Scott Goldman 2014-08-25 14:22:37 UTC
OK, thanks, good to know. We will implement this tomorrow morning at 5am along with our regular restart.

Comment 12 Scott Goldman 2014-08-26 01:10:37 UTC
Leaving the first line blank also did not work, but I figured out what would:
/etc/ctdb/nodes is now just one line (10.78.155.32), but this time I copied all the ".1" tdb files to ".0". That worked like a charm; I now have a 1-node cluster.
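The copy step, sketched on mock files (the directory and database names here are illustrative, not taken from the actual system): CTDB's per-node databases carry the node's PNN as a filename suffix, so collapsing to a one-node cluster means the surviving node's ".1" copies become ".0".

```shell
# Demonstrated on empty stand-in files in a temp directory, not on a
# real CTDB database directory.
dir=$(mktemp -d)
touch "$dir/locking.tdb.1" "$dir/brlock.tdb.1"   # mock per-node tdbs
for f in "$dir"/*.tdb.1; do
    cp -p "$f" "${f%.1}.0"                       # locking.tdb.1 -> locking.tdb.0
done
ls "$dir"
```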

If that turns out to solve our issue, I would say there is still a memory leak somewhere. You can't say it is impermissible to keep 1 node (of a 2-node cluster) offline for more than a day without blowing up the "good" node.

That said, we'll run for a while and provide the same ctdb dumpmemory and statistics (after turning off smb).

thanks
Comment 13 Amitay Isaacs 2014-09-02 03:56:02 UTC
Has there been any change since you modified the configuration to include only a single node?
Comment 14 Scott Goldman 2014-09-02 13:33:04 UTC
We are still resetting ctdb every night. As I asked, isn't it still a bug that you can't keep a node offline without causing a memory leak on the up node (I assume the recmaster)? I can't stop resetting until I know we are OK - it hurts too much to reset when slow (too much used memory).

Is there an alternative? Does keeping ctdb up but stopped offer a better solution?


thanks
Scott
Comment 15 Amitay Isaacs 2014-09-02 13:49:09 UTC
Are you seeing the memory increase even with a single node in the list?

Strictly speaking, it's not a memory leak. A node can be taken offline for maintenance for a short duration (a few hours), but if you are taking the node offline permanently, then it's better to delete the node rather than running the cluster without it.

Also, I don't understand why you are using CTDB if you are only using a single node.
Comment 16 Scott Goldman 2014-09-02 13:56:57 UTC
Under ctdb 1.0, when more than 1 node was up, the cluster would (at times) crawl to a halt - the fix was to shut down 3 of the 4 nodes and keep them as "warm standbys". This had been fine for 2 years.

When we moved to 2.5.3, we expected that we would be able to bring multiple nodes back online, but we haven't gotten to that point. Now that we have identified this issue and are prepared not to roll back, we will upgrade a second node and move into an active/active config.

As for whether we still have the leak, the only way to tell is by canceling the daily resets - we'll do that if you have confidence in the cause.

Again, would using ctdb stop be better than shutting down ctdb?
Comment 17 Amitay Isaacs 2014-09-02 14:11:41 UTC
Ah, okay.  Now I understand the reason for resets. :-)

(In reply to comment #16)
> 
> as for whether we still have the leak, the only way to tell is by canceling the
> daily resest - we'll do that if you have confidence in the cause.

I would really like to know if there is a memory leak.  However, based on your configuration I am sure that the memory increase you have been seeing is because of the extra nodes in the list.  If you run CTDB with just a single node then you should not see the memory increase.
 
> again, would using ctdb stop be better than shutting down ctdb?

There should not be any difference between running a single node (without configuring multiple nodes) and running with multiple nodes with all-but-one nodes stopped.  So if you want to try multiple nodes configuration, ctdb stop would definitely be better than keeping the nodes shut down.
Comment 18 Scott Goldman 2014-09-03 20:49:35 UTC
No good. By 8:30 this morning, we were slow again. I shut down smb and took the dumps (attached). We are currently a single-node cluster:
Number of nodes:1
pnn:0 xx.xx.xx.xx     OK (THIS NODE)
Generation:1372053624
Size:1
hash:0 lmaster:0
Recovery mode:NORMAL (0)
Recovery master:0
[cagenas02: SU: / 1385]
Comment 19 Scott Goldman 2014-09-03 20:50:29 UTC
Created attachment 10249 [details]
ctdb dumpmemory AFTER shutting down smb
Comment 20 Scott Goldman 2014-09-03 20:50:47 UTC
Created attachment 10250 [details]
ctdb statistics AFTER shutting down smb
Comment 21 Amitay Isaacs 2014-09-04 08:20:39 UTC
I don't see any issues with ctdb daemon memory usage anymore. The additional memory reported is used for deleted records; once vacuuming kicks in, all that memory will be freed.

(In reply to comment #18)
> no good.. by 8:30 this morning, we were slow again.

This seems to be a separate issue from the memory usage. If this is a CTDB issue, maybe open a new defect to track it.