Bug #8648

closed

Standby MDS leaks memory over time

Added by Milosz Tanski almost 10 years ago. Updated over 9 years ago.

Status:
Resolved
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Community (dev)
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I've discovered that the MDS in my Ceph cluster leaks memory over time. In my case it usually takes a week or two to notice, and after two months the machine eventually kills the MDS server (OOM killed). It looks like this only happens on the standby (standby-active) MDS; I haven't observed such leaks on the primary. If I switch which one is the standby MDS (due to a version bounce, or a restart for OS fixes), then that machine's MDS starts leaking.

I'm currently running Firefly, but this issue has been present for a long time (Dumpling & Emperor).
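For anyone who wants to put numbers on the growth before the OOM killer steps in, here is a rough sketch of sampling the resident set size of the standby MDS process. It assumes a Linux host with `/proc` available. The helper names (`parse_vmrss_kib`, `growth_kib_per_hour`, `sample_rss`) are made up for illustration and are not part of Ceph.

```python
import re
import time
from pathlib import Path


def parse_vmrss_kib(status_text: str) -> int:
    """Return the VmRSS value in KiB from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no VmRSS line found")
    return int(match.group(1))


def growth_kib_per_hour(samples):
    """Average growth between the first and last (timestamp_s, rss_kib) sample."""
    (t0, r0), (t1, r1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need two samples taken at increasing times")
    return (r1 - r0) * 3600.0 / (t1 - t0)


def sample_rss(pid: int, interval_s: float, count: int):
    """Read VmRSS for `pid` `count` times, sleeping `interval_s` between reads."""
    samples = []
    for i in range(count):
        text = Path(f"/proc/{pid}/status").read_text()
        samples.append((time.time(), parse_vmrss_kib(text)))
        if i + 1 < count:
            time.sleep(interval_s)
    return samples
```

Calling `growth_kib_per_hour(sample_rss(pid, 600, 144))` with the PID of the standby ceph-mds would give a one-day average. A steadily positive rate on the standby and a flat one on the primary would match what is described above.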

Actions #1

Updated by Greg Farnum almost 10 years ago

I believe we're leaking CInodes in open_root_inode et al.

Actions #2

Updated by Sage Weil over 9 years ago

  • Project changed from Ceph to CephFS
  • Category deleted (1)

Actions #3

Updated by Sage Weil over 9 years ago

  • Status changed from New to 12
  • Priority changed from Normal to High
  • Source changed from other to Community (dev)

Any chance you can run one of these in standby under massif for a while? That will tell us what is leaking!

Actions #4

Updated by Zheng Yan over 9 years ago

  • Status changed from 12 to Fix Under Review

Actions #5

Updated by Zheng Yan over 9 years ago

  • Status changed from Fix Under Review to Resolved

Fixed by commit eae88dad4c32e4bb5fb255ec4bf1be18b09d498e
