Bug #61702
ceph-mgr process RSS memory usage grows continuously
Description
In a Ceph cluster running Nautilus v14.2.22, process memory usage was recorded every 3 hours. The RSS of ceph-mgr grew continuously:
PPID PID NLWP RSS VSZ CMD
110888 113509 59 285728 1288728 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 286256 1289752 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 286776 1289752 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 287072 1289752 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 287088 1289752 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 287928 1290776 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 288656 1291544 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 288936 1291544 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 289240 1291544 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290240 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290504 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290504 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290504 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290512 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290512 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290512 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290512 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290512 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290512 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290512 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290524 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
110888 113509 59 290808 1293848 /usr/bin/ceph-mgr --cluster ceph --conf /etc/ceph/ceph.conf --id controller-0 -f
In this case, RSS grew by about 3 MiB a day on average.
The growth rate follows the size of the cluster. In another Ceph cluster, bigger than the first one, RSS grew by about 10 MiB a day on average.
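A minimal sketch of how an average growth rate can be derived from such samples, assuming the RSS column is in KiB (as `ps` reports it) and that the rows are evenly spaced 3 hours apart. The function name `daily_growth_mib` is hypothetical, and the list holds only the 22 RSS values shown above, so the result for this excerpt need not match the figure reported for the full observation period.

```python
# RSS values (KiB) copied from the 22 ps samples listed in the description.
RSS_KIB = [
    285728, 286256, 286776, 287072, 287088, 287928, 288656, 288936,
    289240, 290240, 290504, 290504, 290504, 290512, 290512, 290512,
    290512, 290512, 290512, 290512, 290524, 290808,
]


def daily_growth_mib(rss_kib, interval_hours=3.0):
    """Average RSS growth in MiB per day for evenly spaced samples.

    rss_kib        -- RSS readings in KiB, oldest first
    interval_hours -- time between consecutive readings (assumed constant)
    """
    if len(rss_kib) < 2:
        raise ValueError("need at least two samples")
    # N samples span N - 1 intervals.
    elapsed_days = (len(rss_kib) - 1) * interval_hours / 24.0
    growth_mib = (rss_kib[-1] - rss_kib[0]) / 1024.0
    return growth_mib / elapsed_days


print(f"{daily_growth_mib(RSS_KIB):.2f} MiB/day over the listed samples")
```

This uses only the first and last reading, so it is insensitive to plateaus in between; a least-squares fit over all samples would be a reasonable alternative for noisier data.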
ENVIRONMENT
Kubernetes 1.24.24
Ceph version: ceph version 14.2.22 (742a9ee1d2a5d5c87096f7c0098035af5c73aa19) nautilus (stable)
Linux version 5.10.0-6-amd64
Updated by Konstantin Shalygin 11 months ago
What exactly is the size of the cluster, and which ceph-mgr modules are enabled?
ceph mgr module ls | jq '.enabled_modules'
Updated by Gabriel Cabral 11 months ago
Konstantin Shalygin wrote:
What exactly is the size of the cluster, and which ceph-mgr modules are enabled?
[...]
In both ceph clusters I mentioned in the description, there was only one ceph-mgr module enabled:
$ ceph mgr module ls | grep -A10 '.enabled_modules'
"enabled_modules": [
"restful"
],
"disabled_modules": [
...
I'm running some tests with the StarlingX open-source system. As for the exact size: the first Ceph cluster I mentioned was a Simplex system with 1 OSD, and the other cluster, with an average growth of about 10 MiB a day, was a Duplex system with 2 OSDs.
Updated by Konstantin Shalygin 11 months ago
I think this is not ceph-mgr itself, but the 'restful' module.
Try disabling this module via `ceph mgr module disable restful`
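To check whether the growth stops with the module disabled, the RSS of the ceph-mgr process can be sampled again over the same interval. The following is a rough sketch, assuming a Linux host where `/proc/<pid>/status` is readable; the helper names `parse_vmrss_kib` and `read_rss_kib` are hypothetical, and the kernel's VmRSS figure may differ slightly from what `ps` reports.

```python
import re


def parse_vmrss_kib(status_text):
    """Extract the VmRSS value (KiB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        # Kernel threads, for example, have no VmRSS line.
        raise ValueError("VmRSS line not found")
    return int(match.group(1))


def read_rss_kib(pid):
    """Read the current RSS (KiB) of a running process on Linux."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vmrss_kib(status_file.read())
```

Calling `read_rss_kib(113509)` on the host from the description would return the current RSS of that ceph-mgr process, provided it is still running under that PID; comparing readings taken before and after disabling the module over a few days would show whether the trend changes.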
Updated by Radoslaw Zarzynski 9 months ago
- Status changed from New to Need More Info
This might be a duplicate of #59580.