Bug #48381
Memory leak ceph-mon in ConfigMonitor::load_config
Status: closed
% Done: 0%
Description
Hello everyone!
I am seeing a memory leak in the ceph-mon process on two clusters (QAS and PROD) after upgrading Ceph from version 13.2.5 to 15.2.5.
Heap statistics after one hour of uptime:
# ceph tell mon.service01-qas heap dump
mon.service01-qas dumping heap profile now.
------------------------------------------------
MALLOC: 222469760 ( 212.2 MiB) Bytes in use by application
MALLOC: + 0 ( 0.0 MiB) Bytes in page heap freelist
MALLOC: + 6777856 ( 6.5 MiB) Bytes in central cache freelist
MALLOC: + 3080704 ( 2.9 MiB) Bytes in transfer cache freelist
MALLOC: + 7320448 ( 7.0 MiB) Bytes in thread cache freelists
MALLOC: + 3014656 ( 2.9 MiB) Bytes in malloc metadata
MALLOC: ------------
MALLOC: = 242663424 ( 231.4 MiB) Actual memory used (physical + swap)
MALLOC: + 2154496 ( 2.1 MiB) Bytes released to OS (aka unmapped)
MALLOC: ------------
MALLOC: = 244817920 ( 233.5 MiB) Virtual address space used
MALLOC:
MALLOC: 10291 Spans in use
MALLOC: 24 Thread heaps in use
MALLOC: 8192 Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.
This is the result after the process has been running for 11 days:
# ceph tell mon.service01-qas heap stats
mon.service01-qas tcmalloc heap stats:
------------------------------------------------
MALLOC: 3582733448 ( 3416.8 MiB) Bytes in use by application
MALLOC: + 16384 ( 0.0 MiB) Bytes in page heap freelist
MALLOC: + 64338336 ( 61.4 MiB) Bytes in central cache freelist
MALLOC: + 8880896 ( 8.5 MiB) Bytes in transfer cache freelist
MALLOC: + 41391832 ( 39.5 MiB) Bytes in thread cache freelists
MALLOC: + 28442624 ( 27.1 MiB) Bytes in malloc metadata
MALLOC: ------------
MALLOC: = 3725803520 ( 3553.2 MiB) Actual memory used (physical + swap)
MALLOC: + 7127040 ( 6.8 MiB) Bytes released to OS (aka unmapped)
MALLOC: ------------
MALLOC: = 3732930560 ( 3560.0 MiB) Virtual address space used
MALLOC:
MALLOC: 437764 Spans in use
MALLOC: 25 Thread heaps in use
MALLOC: 8192 Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.
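To put the two samples above in perspective, the growth rate can be estimated from the "Bytes in use by application" figures. This is only a rough sketch: it assumes both samples come from the same monitor and that growth is roughly linear, neither of which the data here establishes. The function name is hypothetical.

```python
MIB = 1024 * 1024

def leak_rate_mib_per_hour(start_bytes, end_bytes, elapsed_hours):
    """Average growth of in-use heap between two samples, in MiB per hour."""
    return (end_bytes - start_bytes) / MIB / elapsed_hours

# "Bytes in use by application" from the two tcmalloc reports above.
in_use_after_1h = 222469760     # sample taken after about one hour of uptime
in_use_after_11d = 3582733448   # sample taken after about 11 days of uptime

# Assumption: the samples are about (11 days - 1 hour) apart.
elapsed_hours = 11 * 24 - 1

rate = leak_rate_mib_per_hour(in_use_after_1h, in_use_after_11d, elapsed_hours)
print(f"approximate growth: {rate:.1f} MiB/hour, {rate * 24:.0f} MiB/day")
```

Under these assumptions the heap grows by roughly 12 MiB per hour, which fits a steady leak better than a one-time cache warm-up.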
# pprof --text /usr/bin/ceph-mon /var/log/ceph/mon.service01-qas.profile.1900.heap
Using local file /usr/bin/ceph-mon.
Using local file /var/log/ceph/mon.service01-qas.profile.1900.heap.
Total: 3150.8 MB
2702.7 85.8% 85.8% 3078.1 97.7% ConfigMonitor::load_config
375.5 11.9% 97.7% 375.5 11.9% std::string::_Rep::_S_create
60.4 1.9% 99.6% 60.4 1.9% rocksdb::BlockFetcher::ReadBlockContents
8.4 0.3% 99.9% 8.4 0.3% ceph::logging::Log::submit_entry
1.0 0.0% 99.9% 1.0 0.0% rocksdb::WritableFileWriter::Append
0.6 0.0% 99.9% 0.6 0.0% ceph::buffer::v15_2_0::list::refill_append_space
0.5 0.0% 99.9% 0.5 0.0% std::_Rb_tree::_M_copy
0.5 0.0% 100.0% 0.5 0.0% rocksdb_cache::BinnedLRUHandleTable::Resize
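For comparing several heap profiles over time, the `pprof --text` table above can be parsed into records. The following is a minimal sketch, assuming the usual column order of that output (flat MB, flat %, running sum %, cumulative MB, cumulative %, symbol). The helper name `parse_pprof_text` is hypothetical and not part of pprof or Ceph.

```python
import re

# One data row: flat, flat%, sum%, cum, cum%, symbol.
PPROF_ROW = re.compile(
    r"^\s*(?P<flat>[\d.]+)\s+(?P<flat_pct>[\d.]+)%\s+(?P<sum_pct>[\d.]+)%"
    r"\s+(?P<cum>[\d.]+)\s+(?P<cum_pct>[\d.]+)%\s+(?P<symbol>.+?)\s*$"
)

def parse_pprof_text(text):
    """Return one dict per table row; header lines such as 'Total:' are skipped."""
    rows = []
    for line in text.splitlines():
        match = PPROF_ROW.match(line)
        if match:
            rows.append({
                "symbol": match.group("symbol"),
                "flat_mb": float(match.group("flat")),
                "flat_pct": float(match.group("flat_pct")),
                "cum_mb": float(match.group("cum")),
                "cum_pct": float(match.group("cum_pct")),
            })
    return rows

# First rows of the report above, used as sample input.
SAMPLE = """Total: 3150.8 MB
2702.7 85.8% 85.8% 3078.1 97.7% ConfigMonitor::load_config
375.5 11.9% 97.7% 375.5 11.9% std::string::_Rep::_S_create
60.4 1.9% 99.6% 60.4 1.9% rocksdb::BlockFetcher::ReadBlockContents
"""

rows = parse_pprof_text(SAMPLE)
top = max(rows, key=lambda row: row["cum_mb"])
print(top["symbol"], top["cum_mb"], "MB cumulative")
```

Running this against profiles taken at different times would show whether the cumulative figure for `ConfigMonitor::load_config` keeps growing while the other entries stay flat.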
Cluster state:
# ceph status
  cluster:
    id:     7a2137a1-a6c6-401c-ab31-37da3ba074b8
    health: HEALTH_OK

  services:
    mon: 4 daemons, quorum service01-qas,service02-qas,service03-qas,service04-qas (age 10d)
    mgr: service04-qas(active, since 5w), standbys: service02-qas, service03-qas, service01-qas
    mds: remote_files:1 {0=service04-qas=up:active} 3 up:standby
    osd: 4 osds: 4 up (since 4w), 4 in (since 4w); 32 remapped pgs

  task status:
    scrub status:
        mds.service04-qas: idle

  data:
    pools:   8 pools, 256 pgs
    objects: 195.79k objects, 35 GiB
    usage:   91 GiB used, 1.9 TiB / 2.0 TiB avail
    pgs:     224 active+clean
             32  active+clean+remapped
Configuration:
# ceph config show mon.service01-qas
NAME                       VALUE                                                       SOURCE    OVERRIDES  IGNORES
auth_client_required       cephx                                                       file
auth_cluster_required      cephx                                                       file
auth_service_required      cephx                                                       file
daemonize                  false                                                       override
keyring                    $mon_data/keyring                                           default
leveldb_block_size         65536                                                       default
leveldb_cache_size         536870912                                                   default
leveldb_compression        false                                                       default
leveldb_log                                                                            default
leveldb_write_buffer_size  33554432                                                    default
mon_host                   10.189.44.81,10.189.44.82,10.189.44.83,10.189.44.84         file
mon_initial_members        service01-qas, service02-qas, service03-qas, service04-qas  file
no_config_file             false                                                       override
public_network             10.189.44.0/24                                              file
rbd_default_features       61                                                          default
setgroup                   ceph                                                        cmdline
setuser                    ceph                                                        cmdline