Bug #1590

closed

occasionally excessive mon memory footprint

Added by Alexandre Oliva over 12 years ago. Updated over 12 years ago.

Status: Duplicate
Priority: Normal
Assignee: -
Category: Monitor
Target version: -
% Done: 0%

Description

I have 3 mons that share disks with osds. Sometimes, when btrfs gets into a mode in which syncs are delayed, the mons get into a state in which successive elections keep yielding different results, and mons that used to be in the active set end up being kicked out for lagging behind. In these circumstances, if a kicked-out mon had been the primary, it appears to start piling up messages to be relayed to the primary, and its memory use grows, apparently exponentially.

The attached memory profile is from mon.1; it had grown from the baseline memory use of about 120MB to 16GB of virtual memory (12.5GB of heap) before I killed it. At the same time, mon.0 had grown from the same baseline to some 3.5GB of virtual memory, but its heap, which peaked at 2.5GB, had gone back down to 125MB. mon.2 never went past the baseline.

This was collected with 0.35, but I had run into this with many earlier versions of ceph.


Files

hugemon.pdf (7.98 KB): peak memory use graph for mon.1, before I killed it. Alexandre Oliva, 10/01/2011 01:23 PM

Related issues 1 (0 open, 1 closed)

Is duplicate of Ceph - Feature #1646: mon: catch up on committed items before attempting to join quorum (Resolved, Sage Weil, 10/21/2011)

Actions #1

Updated by Alexandre Oliva over 12 years ago

I've just run into this while only two out of the 3 mons were up: mon.0 was taking several minutes to complete a sync (a btrfs bug I've been looking into), and mon.1's memory use was at almost 16GB when I restarted it. So it doesn't take a third lagging monitor to trigger the problem: perhaps a lagging primary is the trigger.

Actions #2

Updated by Sage Weil over 12 years ago

  • Category set to Monitor

This will go away with #1646.

Actions #3

Updated by Sage Weil over 12 years ago

  • Status changed from New to Duplicate