Subtask #9890
Bug #9889 (closed): mon: leveldb weirdness
mon: VIRT usage 2.4G larger than tcmalloc's VIRT stats (dumpling, centos6.3)
% Done: 0%
Source: other
Description
- centos 6.3
- ceph version 0.67.11 (bc8b67bef6309a32361be76cd11fb56b057ea9d2)
- Stressing the monitors with qa/workunits/mon/workloadgen.sh without cleanup:
    while [ 1 ]; do LOADGEN_NUM_OSDS=20 DURATION=120 ./mon_workloadgen.sh || break ; done
  which eventually results in lots of osds down/out and very large OSDMaps (a 2.6MB map for 4431 osds).
- Running a loop of 'ceph log foo' with 10 parallel jobs.
- No mon thrashing involved.
'top' reports 2.5GB VIRT and 90MB RES for each of the monitors.
'ceph heap stats':
[ubuntu@vpm090 ~]$ ceph tell mon.a heap stats
mon.atcmalloc heap stats:------------------------------------------------
MALLOC:       10122344 (    9.7 MiB) Bytes in use by application
MALLOC: +      7536640 (    7.2 MiB) Bytes in page heap freelist
MALLOC: +     14149264 (   13.5 MiB) Bytes in central cache freelist
MALLOC: +     20772480 (   19.8 MiB) Bytes in transfer cache freelist
MALLOC: +     13545096 (   12.9 MiB) Bytes in thread cache freelists
MALLOC: +      1264792 (    1.2 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =     67390616 (   64.3 MiB) Actual memory used (physical + swap)
MALLOC: +     62062592 (   59.2 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =    129453208 (  123.5 MiB) Virtual address space used
MALLOC:
MALLOC:           1555              Spans in use
MALLOC:             13              Thread heaps in use
MALLOC:          32768              Tcmalloc page size
------------------------------------------------
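For reference, the gap in the title can be re-derived from the numbers above. This is only a rough sketch: it assumes top's "2.5GB" means 2.5 GiB, and the names parse_heap_stats and TOP_VIRT_BYTES are made up for illustration and are not part of ceph or tcmalloc.

```python
import re

# Byte-count lines copied from the 'ceph tell mon.a heap stats' output above.
HEAP_STATS = """\
MALLOC:       10122344 (    9.7 MiB) Bytes in use by application
MALLOC: +      7536640 (    7.2 MiB) Bytes in page heap freelist
MALLOC: +     14149264 (   13.5 MiB) Bytes in central cache freelist
MALLOC: +     20772480 (   19.8 MiB) Bytes in transfer cache freelist
MALLOC: +     13545096 (   12.9 MiB) Bytes in thread cache freelists
MALLOC: +      1264792 (    1.2 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =     67390616 (   64.3 MiB) Actual memory used (physical + swap)
MALLOC: +     62062592 (   59.2 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =    129453208 (  123.5 MiB) Virtual address space used
"""

# Matches lines of the form "MALLOC: [+|=] <bytes> ( <n> MiB) <label>".
LINE_RE = re.compile(r"^MALLOC:\s+[+=]?\s*(\d+)\s+\(\s*[\d.]+ MiB\)\s+(.+)$")


def parse_heap_stats(text):
    """Return {label: bytes} for every byte-count line in the dump."""
    stats = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            stats[match.group(2).strip()] = int(match.group(1))
    return stats


stats = parse_heap_stats(HEAP_STATS)

# Assumption: top's "2.5GB" VIRT figure is 2.5 GiB.
TOP_VIRT_BYTES = int(2.5 * 1024 ** 3)

tcmalloc_virt = stats["Virtual address space used"]
gap_gib = (TOP_VIRT_BYTES - tcmalloc_virt) / 1024 ** 3

print(f"tcmalloc virtual address space: {tcmalloc_virt / 1024 ** 2:.1f} MiB")
print(f"VIRT not accounted for by tcmalloc: {gap_gib:.1f} GiB")
```

Under that assumption, roughly 2.4 GiB of the VIRT figure is address space that tcmalloc does not account for.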
Updated by Joao Eduardo Luis over 9 years ago
mon.c (in quorum) is acting as the synchronization provider for mon.b (restarted with valgrind memcheck).
mon.c spiked to 2.6GB VIRT, 533MB RES, 470MB SHR.
mon.a stayed the same at 2.5GB VIRT, 81MB RES, 7MB SHR.
mon.b is at 2.8GB VIRT, 246MB RES, 10MB SHR.
Updated by Joao Eduardo Luis over 9 years ago
Forgot to mention that the leveldb stores for all mons are several GB in size, even after compaction:
[ubuntu@vpm090 ~]$ du -chs dev/mon.*
6.7G    dev/mon.a
7.0G    dev/mon.b
6.8G    dev/mon.c
21G     total
Likely due to osdmaps not having been trimmed for ages (the test ran for 2 days in a row without ever being "clean").
Updated by Sage Weil about 7 years ago
- Status changed from New to Can't reproduce