Subtask #9890

Bug #9889: mon: leveldb weirdness

mon: VIRT usage 2.4G larger than tcmalloc's VIRT stats (dumpling, centos6.3)

Added by Joao Eduardo Luis over 8 years ago. Updated almost 6 years ago.

Status:
Can't reproduce
Priority:
Normal
Category:
Monitor
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

  • centos 6.3
  • ceph version 0.67.11 (bc8b67bef6309a32361be76cd11fb56b057ea9d2)
  • Stressing the monitors with qa/workunits/mon/workloadgen.sh without cleanup
while [ 1 ]; do LOADGEN_NUM_OSDS=20 DURATION=120 ./mon_workloadgen.sh || break ; done

This eventually results in many OSDs being marked down/out and in very large OSDMaps (a 2.6MB map for 4431 OSDs).

  • Running a loop of 'ceph log foo' with 10 parallel jobs.
  • No mon thrashing involved.

'top' reports a 2.5GB VIRT usage for any of the monitors, 90MB RES.

'ceph heap stats':

[ubuntu@vpm090 ~]$ ceph tell mon.a heap stats
mon.atcmalloc heap stats:------------------------------------------------
MALLOC:       10122344 (    9.7 MiB) Bytes in use by application
MALLOC: +      7536640 (    7.2 MiB) Bytes in page heap freelist
MALLOC: +     14149264 (   13.5 MiB) Bytes in central cache freelist
MALLOC: +     20772480 (   19.8 MiB) Bytes in transfer cache freelist
MALLOC: +     13545096 (   12.9 MiB) Bytes in thread cache freelists
MALLOC: +      1264792 (    1.2 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =     67390616 (   64.3 MiB) Actual memory used (physical + swap)
MALLOC: +     62062592 (   59.2 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =    129453208 (  123.5 MiB) Virtual address space used
MALLOC:
MALLOC:           1555              Spans in use
MALLOC:             13              Thread heaps in use
MALLOC:          32768              Tcmalloc page size
------------------------------------------------
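For reference, the gap in the title can be cross-checked from the dump above. This is only a rough sketch: it assumes top's "2.5GB" means 2.5 GiB, and the helper names (`parse_heap_stats`, `untracked_virt`) are made up for illustration, not part of Ceph or tcmalloc.

```python
import re

# Matches tcmalloc byte-count lines such as
# "MALLOC: +      7536640 (    7.2 MiB) Bytes in page heap freelist".
# Separator lines and the span/heap/page-size counters do not match.
_LINE = re.compile(r"^MALLOC:\s*[+=]?\s*(\d+)\s+\(\s*[\d.]+\s+MiB\)\s+(.+?)\s*$")

# Byte-count lines copied from the 'ceph tell mon.a heap stats' output.
SAMPLE = """\
MALLOC:       10122344 (    9.7 MiB) Bytes in use by application
MALLOC: +      7536640 (    7.2 MiB) Bytes in page heap freelist
MALLOC: +     14149264 (   13.5 MiB) Bytes in central cache freelist
MALLOC: +     20772480 (   19.8 MiB) Bytes in transfer cache freelist
MALLOC: +     13545096 (   12.9 MiB) Bytes in thread cache freelists
MALLOC: +      1264792 (    1.2 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =     67390616 (   64.3 MiB) Actual memory used (physical + swap)
MALLOC: +     62062592 (   59.2 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =    129453208 (  123.5 MiB) Virtual address space used
"""


def parse_heap_stats(text):
    """Return {label: bytes} for the byte-count lines of a heap stats dump."""
    stats = {}
    for line in text.splitlines():
        match = _LINE.match(line)
        if match:
            stats[match.group(2)] = int(match.group(1))
    return stats


def untracked_virt(top_virt_bytes, stats):
    """VIRT as seen by top minus the address space tcmalloc accounts for."""
    return top_virt_bytes - stats["Virtual address space used"]


if __name__ == "__main__":
    stats = parse_heap_stats(SAMPLE)
    # Assumption: top's 2.5GB VIRT is 2.5 GiB.
    gap = untracked_virt(int(2.5 * 1024**3), stats)
    print("address space not tracked by tcmalloc: %.2f GiB" % (gap / 1024**3))
```

Under that assumption the difference comes out at roughly 2.4 GiB, i.e. nearly all of the VIRT seen by top lies outside what tcmalloc tracks.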

History

#1 Updated by Joao Eduardo Luis over 8 years ago

mon.c (in quorum) is acting as the synchronization provider for mon.b (restarted with valgrind memcheck).

mon.c spiked to 2.6GB VIRT, 533MB RES, 470MB SHR.
mon.a stayed the same at 2.5GB VIRT, 81MB RES, 7MB SHR.
mon.b is at 2.8GB VIRT, 246MB RES, 10MB SHR.
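To track these numbers over time without watching top, something like the following could be used. It is a sketch only: to my understanding top derives VIRT/RES/SHR from the first three fields of /proc/<pid>/statm (counted in pages), and `parse_statm` / `read_statm` are hypothetical helper names. The sample line in the usage note is invented to resemble mon.a, not captured from the test cluster.

```python
import os


def parse_statm(text, page_size=4096):
    """Convert the first three statm fields (size, resident, shared pages) to bytes."""
    size, resident, shared = (int(field) for field in text.split()[:3])
    return {
        "VIRT": size * page_size,
        "RES": resident * page_size,
        "SHR": shared * page_size,
    }


def read_statm(pid):
    """Read /proc/<pid>/statm for a live process (Linux only)."""
    with open("/proc/%d/statm" % pid) as handle:
        return parse_statm(handle.read(), os.sysconf("SC_PAGE_SIZE"))


if __name__ == "__main__":
    # Invented statm line, assuming 4 KiB pages.
    usage = parse_statm("655360 23040 1792 100 0 5000 0")
    for name in ("VIRT", "RES", "SHR"):
        print("%s: %.1f MiB" % (name, usage[name] / 1024.0**2))
```

Calling `read_statm(pid)` with a ceph-mon pid once per minute during sync would show whether the VIRT and SHR growth tracks the synchronization.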

#2 Updated by Joao Eduardo Luis over 8 years ago

Forgot to mention that the leveldb stores for all mons are several GB in size, even after compaction:

[ubuntu@vpm090 ~]$ du -chs dev/mon.*
6.7G    dev/mon.a
7.0G    dev/mon.b
6.8G    dev/mon.c
21G     total

Likely due to osdmaps not having been trimmed for ages (the test ran for 2 days in a row without the cluster ever being "clean").

#3 Updated by Sage Weil almost 6 years ago

  • Status changed from New to Can't reproduce
