Bug #3067

mon: runaway memory

Added by Sage Weil over 11 years ago. Updated over 11 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: Monitor
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

One of the things I notice is that the mon seems to eat up a lot of memory. Here is
some info on another machine:

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND           
 2126 root      20   0 1397m 459m    0 S  13.6  1.4   1208:21 ceph-osd          
 1986 root      20   0 1272m 220m    0 S  11.6  0.7   1217:09 ceph-osd          
 2499 root      20   0 1087m 171m    0 S   1.0  0.5 386:43.58 ceph-osd          
 9705 root      20   0     0    0    0 S   1.0  0.0   0:11.87 kworker/0:0       
 1717 root      20   0 68.0g  25g  456 S   0.7 81.9   9064:40 ceph-mon          
 2056 root      20   0 1051m 155m    0 S   0.7  0.5 402:21.84 ceph-osd          
 2337 root      20   0 1108m 177m    0 S   0.7  0.6 407:13.85 ceph-osd          
 2672 root      20   0 1134m 190m    0 S   0.7  0.6 500:59.67 ceph-osd     

argonaut.  from #2026

History

#1 Updated by Sage Weil over 11 years ago

If you can't ssh in, you probably need to power cycle the machine and restart the daemons. It sounds like there is some bug that is causing memory utilization to run away. If this is fully reproducible (it really happens every couple of days), the ideal thing would be to run ceph-mon through valgrind's massif tool, e.g.

valgrind --tool=massif ceph-mon -i a

and then stop it when you see memory usage starting to go crazy. That should generate a file called massif.out.$pid that will tell us what the memory is being used for/by.
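Once a massif.out.$pid file exists, Valgrind's bundled ms_print tool is the usual way to read it. As a rough complement, here is a minimal Python sketch that pulls the per-snapshot heap totals out of the file and picks the largest one. It assumes the plain-text layout massif normally writes (snapshot=, time=, mem_heap_B=, mem_heap_extra_B=, mem_stacks_B=, heap_tree= lines). The function names and the SAMPLE text are made up for illustration and are not data from the affected machine.

```python
# Sketch: find the peak snapshot in a massif output file.
# Assumes the key=value snapshot layout that massif normally emits.

NUMERIC_KEYS = ("time", "mem_heap_B", "mem_heap_extra_B", "mem_stacks_B")


def parse_massif_snapshots(text):
    """Return one dict per snapshot found in massif output text."""
    snapshots = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("snapshot="):
            current = {"snapshot": int(line.split("=", 1)[1])}
            snapshots.append(current)
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            if key in NUMERIC_KEYS:
                current[key] = int(value)
            elif key == "heap_tree":
                current[key] = value
    return snapshots


def heap_bytes(snapshot):
    """Heap bytes plus allocator overhead for one snapshot."""
    return snapshot.get("mem_heap_B", 0) + snapshot.get("mem_heap_extra_B", 0)


def peak_snapshot(snapshots):
    """Return the snapshot with the largest heap footprint."""
    if not snapshots:
        raise ValueError("no snapshots found")
    return max(snapshots, key=heap_bytes)


# Made-up sample in the assumed format; not real ceph-mon data.
SAMPLE = """\
desc: (none)
cmd: ceph-mon -i a
time_unit: i
#-----------
snapshot=0
#-----------
time=0
mem_heap_B=0
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=1
#-----------
time=1000
mem_heap_B=4096
mem_heap_extra_B=64
mem_stacks_B=0
heap_tree=peak
n0: 4096 0x400000: hypothetical_alloc (example.cc:1)
"""

if __name__ == "__main__":
    peak = peak_snapshot(parse_massif_snapshots(SAMPLE))
    print(peak["snapshot"], heap_bytes(peak))
```

To use it on a real run, read the massif.out.$pid file into a string and pass that in place of SAMPLE. This only reports totals; the allocation call tree that shows what the memory is being used for is still best read with ms_print.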

#2 Updated by Sage Weil over 11 years ago

  • Status changed from New to Resolved
