occasionally excessive mon memory footprint
I have 3 mons that share disks with osds. Sometimes, when btrfs gets into a mode in which syncs are delayed, the mons get into a state in which successive elections produce different results, and mons that used to be in the active set end up being kicked out for lagging behind. In these circumstances, mons that had been primary appear to start piling up messages to be relayed to the primary, and their memory use grows, apparently exponentially.
The attached memory profile is from mon.1; it had grown from the baseline memory use of about 120MB to 16GB of virtual memory, 12.5GB of it heap, before I killed it. Over the same period, mon.0 had grown from the same baseline to some 3.5GB of virtual memory, but its heap, which peaked at 2.5GB, had gone back down to 125MB. mon.2 never went past the baseline.
This was collected with 0.35, but I had run into this with many earlier versions of ceph.
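For anyone trying to reproduce this, here is a rough sketch of how the growth could be logged over time on Linux by sampling /proc/<pid>/status for a mon process. It is not a Ceph tool; the function names, the sampling interval and the sample count are all made up for illustration. Roughly constant ratios above 1 between consecutive samples would be consistent with the apparently exponential growth described above.

```python
import re
import time

def parse_status(text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields

def growth_ratios(samples):
    """Ratios between consecutive samples; skips pairs whose first value is 0."""
    return [b / a for a, b in zip(samples, samples[1:]) if a > 0]

def watch(pid, interval=60, count=10):
    """Hypothetical helper: print memory use of `pid` every `interval` seconds.

    Linux only. Returns the growth ratios of the resident set size.
    """
    rss = []
    for _ in range(count):
        with open(f"/proc/{pid}/status") as f:
            fields = parse_status(f.read())
        rss.append(fields.get("VmRSS", 0))
        print(time.strftime("%H:%M:%S"), fields)
        time.sleep(interval)
    return growth_ratios(rss)
```

Calling watch() with the pid of the ceph-mon process while btrfs syncs are stalled should show whether the resident size keeps multiplying between samples or levels off.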
#1 Updated by Alexandre Oliva almost 10 years ago
I've just run into this while only two out of the 3 mons were up: mon.0 was taking several minutes to complete a sync (a btrfs bug I've been looking into), and mon.1's memory use was at almost 16GB when I restarted it. So it doesn't take a third lagging monitor to trigger the problem: perhaps a lagging primary is the trigger.