Bug #61702


ceph-mgr process RSS memory usage grows continuously

Added by Gabriel Cabral 11 months ago. Updated 9 months ago.

Status:
Need More Info
Priority:
Normal
Assignee:
-
Category:
ceph-mgr
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

In a Ceph cluster running Nautilus v14.2.22, process memory usage was recorded every 3 hours, and the RSS of the ceph-mgr process was observed to grow continuously (RSS and VSZ in the samples below are in KiB, as reported by ps).
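A loop along the following lines could produce such samples (a minimal sketch; the actual collection method used is not stated in the report, and the log file name is illustrative):

    # Hypothetical sampling loop: append one ps line for the ceph-mgr
    # process to a log file every 3 hours.
    while true; do
        ps -o ppid,pid,nlwp,rss,vsz,cmd --no-headers -p "$(pidof ceph-mgr)" >> mgr-rss.log
        sleep 3h
    done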

PPID PID NLWP RSS VSZ CMD

110888 113509 59 285728 1288728 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 286256 1289752 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 286776 1289752 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 287072 1289752 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 287088 1289752 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 287928 1290776 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 288656 1291544 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 288936 1291544 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 289240 1291544 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290240 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290504 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290504 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290504 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290512 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290512 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290512 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290512 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290512 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290512 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290512 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290524 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290808 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f

In this case, the average growth was about 3 MiB per day.

The average growth scales with the size of the cluster: in another, larger Ceph cluster, it was about 10 MiB per day.
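As a rough cross-check, assuming the rows above are consecutive 3-hour samples (the report does not say so explicitly), the per-day growth can be computed from the RSS column; the input file name is hypothetical:

    # Compute MiB/day from the RSS column (field 4, in KiB) of the data rows,
    # assuming consecutive 3-hour samples.
    awk 'NR == 1 { first = $4 } { last = $4; n++ }
         END { printf "%.1f MiB/day\n", (last - first) / 1024 / ((n - 1) * 3 / 24) }' mgr-rss.log

Over the 22 samples shown, this yields roughly 1.9 MiB/day, in the same ballpark as the reported ~3 MiB/day average (which presumably covers a longer window).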

ENVIRONMENT
Kubernetes 1.24.24
Ceph version: ceph version 14.2.22 (742a9ee1d2a5d5c87096f7c0098035af5c73aa19) nautilus (stable)
Linux version 5.10.0-6-amd64

Actions #1

Updated by Konstantin Shalygin 11 months ago

What exactly is the size of the cluster, and which ceph-mgr modules are enabled?

ceph mgr module ls | jq '.enabled_modules'

Actions #2

Updated by Gabriel Cabral 11 months ago

Konstantin Shalygin wrote:

What exactly is the size of the cluster, and which ceph-mgr modules are enabled?
[...]

In both ceph clusters I mentioned in the description, there was only one ceph-mgr module enabled:

$ ceph mgr module ls | grep -A10 '.enabled_modules'
    "enabled_modules": [
        "restful" 
    ],
    "disabled_modules": [
         ...

I'm running some tests with the StarlingX open-source system. Regarding the exact size: the first Ceph cluster I mentioned was a Simplex system with 1 OSD, and the other cluster, with the average growth of about 10 MiB per day, was a Duplex system with 2 OSDs.

Actions #3

Updated by Konstantin Shalygin 11 months ago

I think this is not ceph-mgr itself, but the 'restful' module.
Try disabling this module via `ceph mgr module disable restful`.
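A simple way to test that hypothesis (a sketch; the comparison is against RSS samples taken before disabling):

    # Hypothetical check: disable the suspected module, then keep sampling
    # RSS and compare the trend against the earlier samples.
    ceph mgr module disable restful
    ps -o rss= -p "$(pidof ceph-mgr)"    # repeat periodically over a few days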

Actions #4

Updated by Radoslaw Zarzynski 9 months ago

  • Status changed from New to Need More Info

This might be a duplicate of #59580.
