Bug #1298 (closed)

osd: memory leak in 3f708ee

Added by Sage Weil almost 13 years ago. Updated almost 13 years ago.

Status:
Duplicate
Priority:
Urgent
Assignee:
-
Category:
OSD
Target version:
-
% Done:
0%


Description

The dho OSDs running 3f708ee are leaking memory.

Actions #1

Updated by Sage Weil almost 13 years ago

It occurs to me that there were many degraded PGs, and currently that means the pg logs aren't trimmed and are kept in memory. This may explain the memory usage.
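As a rough illustration of that mechanism (a simplified model with hypothetical names, not Ceph's actual PG code): a per-PG log that is only trimmed once the PG is clean will grow without bound for as long as the PG stays degraded.

// Simplified model of per-PG log retention (hypothetical types, not
// the real Ceph implementation): entries accumulate in memory and
// trimming is skipped while the PG is degraded.
#include <cstdint>
#include <list>
#include <string>

struct LogEntry {
  uint64_t version;
  std::string oid;  // object touched by this update
};

struct PG {
  bool clean = false;        // degraded PGs are never clean
  std::list<LogEntry> log;   // grows with every write to the PG

  void append(const LogEntry& e) { log.push_back(e); }

  // Trimming only happens once the PG is clean, so a long-degraded PG
  // keeps its entire log in memory.
  void trim(uint64_t trim_to) {
    if (!clean)
      return;
    while (!log.empty() && log.front().version < trim_to)
      log.pop_front();
  }
};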

Could it be that the memory utilization jumped on cosd restart (upgrade) because peering loaded more log off disk than was in memory before the restart? Or that the peak heap usage was simply higher (everything peering at once vs. staggered over a couple weeks of node failures) and the process didn't release memory back to the OS?
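On the last point: tcmalloc keeps freed pages in its page heap and only returns them to the kernel lazily. A minimal sketch of forcing the handback, assuming the process is linked against gperftools:

// Minimal sketch, assuming gperftools is linked in (-ltcmalloc).
// Older installs ship the header as <google/malloc_extension.h>.
#include <gperftools/malloc_extension.h>

void release_free_memory() {
  // Asks tcmalloc to return as much of its free page-heap memory to
  // the OS as possible; resident size should drop afterwards.
  MallocExtension::instance()->ReleaseFreeMemory();
}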

Actions #2

Updated by Greg Farnum almost 13 years ago

If you dump the heap stats on, e.g., osd1 (ceph osd tell 1 heap stats), you get:

2011-07-19 08:26:07.228248 osd1 [2607:f298:1:2233::5522]:6800/2629 38 : [INF] osd.1 tcmalloc heap stats:------------------------------------------------
2011-07-19 08:26:07.228260 osd1 [2607:f298:1:2233::5522]:6800/2629 39 : [INF] MALLOC:    618070016 (  589.4 MB) Heap size
2011-07-19 08:26:07.228267 osd1 [2607:f298:1:2233::5522]:6800/2629 40 : [INF] MALLOC:     39679336 (   37.8 MB) Bytes in use by application
2011-07-19 08:26:07.228275 osd1 [2607:f298:1:2233::5522]:6800/2629 41 : [INF] MALLOC:    526712832 (  502.3 MB) Bytes free in page heap
2011-07-19 08:26:07.228282 osd1 [2607:f298:1:2233::5522]:6800/2629 42 : [INF] MALLOC:     45207552 (   43.1 MB) Bytes unmapped in page heap
2011-07-19 08:26:07.228289 osd1 [2607:f298:1:2233::5522]:6800/2629 43 : [INF] MALLOC:      2206376 (    2.1 MB) Bytes free in central cache
2011-07-19 08:26:07.228296 osd1 [2607:f298:1:2233::5522]:6800/2629 44 : [INF] MALLOC:       101888 (    0.1 MB) Bytes free in transfer cache
2011-07-19 08:26:07.228303 osd1 [2607:f298:1:2233::5522]:6800/2629 45 : [INF] MALLOC:      4162032 (    4.0 MB) Bytes free in thread caches
2011-07-19 08:26:07.228311 osd1 [2607:f298:1:2233::5522]:6800/2629 46 : [INF] MALLOC:         9022              Spans in use
2011-07-19 08:26:07.228319 osd1 [2607:f298:1:2233::5522]:6800/2629 47 : [INF] MALLOC:          211              Thread heaps in use
2011-07-19 08:26:07.228326 osd1 [2607:f298:1:2233::5522]:6800/2629 48 : [INF] MALLOC:     12058624 (   11.5 MB) Metadata allocated
2011-07-19 08:26:07.228334 osd1 [2607:f298:1:2233::5522]:6800/2629 49 : [INF] ------------------------------------------------

So at this point some of the extra RAM is definitely being used by the profiler; I'm not sure how much that accounts for. (I find it odd that it hasn't dumped a profile in several days, though; did somebody turn that off again?)
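Worth noting in the stats above: only ~38 MB is live application data, while ~502 MB sits free in tcmalloc's page heap, i.e. already freed by the application but not yet returned to the kernel. The same counters can be read programmatically; a sketch using gperftools' documented property keys:

// Sketch: reading the tcmalloc counters behind "heap stats" via
// gperftools' MallocExtension and its documented property keys.
#include <gperftools/malloc_extension.h>
#include <cstdio>

void print_heap_summary() {
  size_t heap = 0, in_use = 0, pageheap_free = 0, unmapped = 0;
  MallocExtension* m = MallocExtension::instance();
  m->GetNumericProperty("generic.heap_size", &heap);
  m->GetNumericProperty("generic.current_allocated_bytes", &in_use);
  m->GetNumericProperty("tcmalloc.pageheap_free_bytes", &pageheap_free);
  m->GetNumericProperty("tcmalloc.pageheap_unmapped_bytes", &unmapped);
  // A large pageheap_free relative to current_allocated_bytes means the
  // allocator is holding freed memory, not that the app is leaking it.
  std::printf("heap=%zu in_use=%zu free_in_pageheap=%zu unmapped=%zu\n",
              heap, in_use, pageheap_free, unmapped);
}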

Actions #3

Updated by Colin McCabe almost 13 years ago

It might be interesting to attach with gdb and run .capacity() on some of the std::vector members of class PG.
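For context on why capacity matters here: a vector's allocation never shrinks on resize() or clear(), so its memory footprint can dwarf its size. A small self-contained demonstration, including the copy-and-swap idiom that was the standard way to release excess capacity in the pre-C++11 era this code targets:

// Demonstration: capacity (allocated memory) outliving size, and the
// classic copy-and-swap trick for handing the excess back.
#include <cstdio>
#include <vector>

int main() {
  std::vector<int> v;
  for (int i = 0; i < 1000000; ++i)
    v.push_back(i);
  v.resize(10);  // size shrinks; the allocation does not

  std::printf("size=%zu capacity=%zu\n", v.size(), v.capacity());

  // Copy-and-swap releases the excess allocation (pre-C++11 idiom;
  // today v.shrink_to_fit() does the same).
  std::vector<int>(v).swap(v);
  std::printf("after swap: size=%zu capacity=%zu\n", v.size(), v.capacity());
  return 0;
}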

Actions #4

Updated by Sage Weil almost 13 years ago

  • Target version changed from v0.32 to v0.33
Actions #5

Updated by Sage Weil almost 13 years ago

  • Status changed from New to Can't reproduce
Actions #6

Updated by Sage Weil almost 13 years ago

  • Status changed from Can't reproduce to Duplicate
Actions #7

Updated by Sage Weil almost 13 years ago

  • Target version deleted (v0.33)