Bug #37980
openluminous: OSD memory use very high, and mismatch between RES and heap stats
Description
ceph 12.2.1
3 nodes, 30 osds per node
EC pool: 4+2
After running for 2 months, we find that some OSDs show very high memory use in top (4-5 GB). The top output and the heap stats look like this:
top - 10:45:01 up 73 days, 1:20, 1 user, load average: 10.26, 9.74, 9.95
Tasks: 657 total, 3 running, 654 sleeping, 0 stopped, 0 zombie
%Cpu(s): 6.6 us, 6.8 sy, 0.0 ni, 82.7 id, 3.7 wa, 0.0 hi, 0.2 si, 0.0 st
KiB Mem: 65325552 total, 64448092 used, 877460 free, 90120 buffers
KiB Swap: 0 total, 0 used, 0 free. 446164 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
18385 ceph 20 0 7369196 5.114g 7232 S 6.1 8.2 2873:28 /usr/bin/ceph-osd -f --cluster ceph --id 61 --setuser ceph --setg+
osd.61 tcmalloc heap stats:
MALLOC: 2239198296 ( 2135.5 MiB) Bytes in use by application
MALLOC: + 0 ( 0.0 MiB) Bytes in page heap freelist
MALLOC: + 72369432 ( 69.0 MiB) Bytes in central cache freelist
MALLOC: + 13839792 ( 13.2 MiB) Bytes in transfer cache freelist
MALLOC: + 104315104 ( 99.5 MiB) Bytes in thread cache freelists
MALLOC: + 25096352 ( 23.9 MiB) Bytes in malloc metadata
MALLOC: ------------
MALLOC: = 2454818976 ( 2341.1 MiB) Actual memory used (physical + swap)
MALLOC: + 4095991808 ( 3906.2 MiB) Bytes released to OS (aka unmapped)
MALLOC: ------------
MALLOC: = 6550810784 ( 6247.3 MiB) Virtual address space used
MALLOC:
MALLOC: 136858 Spans in use
MALLOC: 63 Thread heaps in use
MALLOC: 8192 Tcmalloc page size
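As a sanity check on the heap stats above, here is a minimal Python sketch that parses the byte counts and confirms the totals are internally consistent. The parsing function and its regular expression are illustrative assumptions written for this report, not part of Ceph or tcmalloc.

```python
import re

# Byte-count lines copied from the osd.61 heap stats in this report.
HEAP_STATS = """\
MALLOC: 2239198296 ( 2135.5 MiB) Bytes in use by application
MALLOC: + 0 ( 0.0 MiB) Bytes in page heap freelist
MALLOC: + 72369432 ( 69.0 MiB) Bytes in central cache freelist
MALLOC: + 13839792 ( 13.2 MiB) Bytes in transfer cache freelist
MALLOC: + 104315104 ( 99.5 MiB) Bytes in thread cache freelists
MALLOC: + 25096352 ( 23.9 MiB) Bytes in malloc metadata
MALLOC: ------------
MALLOC: = 2454818976 ( 2341.1 MiB) Actual memory used (physical + swap)
MALLOC: + 4095991808 ( 3906.2 MiB) Bytes released to OS (aka unmapped)
MALLOC: ------------
MALLOC: = 6550810784 ( 6247.3 MiB) Virtual address space used
"""

# Hypothetical helper: matches "MALLOC: [+|=] <bytes> ( <n> MiB) <label>".
LINE_RE = re.compile(r"^MALLOC:\s*[=+]?\s*(\d+)\s*\(\s*[\d.]+ MiB\)\s*(.+)$")


def parse_heap_stats(text):
    """Return a dict mapping each label to its byte count."""
    stats = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            stats[match.group(2).strip()] = int(match.group(1))
    return stats


stats = parse_heap_stats(HEAP_STATS)

# The six components that tcmalloc sums into "Actual memory used".
components = [
    "Bytes in use by application",
    "Bytes in page heap freelist",
    "Bytes in central cache freelist",
    "Bytes in transfer cache freelist",
    "Bytes in thread cache freelists",
    "Bytes in malloc metadata",
]
actual = sum(stats[name] for name in components)
virtual = actual + stats["Bytes released to OS (aka unmapped)"]

print("sum of components:", actual)
print("reported actual:  ", stats["Actual memory used (physical + swap)"])
print("actual + released:", virtual)
print("reported virtual: ", stats["Virtual address space used"])
```

Both sums match the reported totals, so the heap stats themselves add up; the discrepancy is between tcmalloc's accounting and what the kernel reports as RES.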
RES shows osd.61 using more than 5 GB of memory, but "Actual memory used" in the heap stats is only a little over 2 GB. We also find that the OSDs with high RES have high "Bytes released to OS" as well.
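To put a number on the mismatch, the following Python sketch compares the RES value from top with the tcmalloc totals for osd.61. It assumes top's "g" suffix means GiB (1024 MiB); the variable names are only for illustration.

```python
MIB = 1024 * 1024

# Values copied from the top output and heap stats for osd.61 (PID 18385).
res_gib = 5.114                   # RES column; "g" assumed to mean GiB
heap_actual_bytes = 2454818976    # "Actual memory used (physical + swap)"
heap_released_bytes = 4095991808  # "Bytes released to OS (aka unmapped)"

res_mib = res_gib * 1024
heap_actual_mib = heap_actual_bytes / MIB
heap_released_mib = heap_released_bytes / MIB

# Resident memory that tcmalloc does not count as in use.
gap_mib = res_mib - heap_actual_mib

print(f"RES:                   {res_mib:8.1f} MiB")
print(f"tcmalloc actual used:  {heap_actual_mib:8.1f} MiB")
print(f"tcmalloc released:     {heap_released_mib:8.1f} MiB")
print(f"RES minus actual used: {gap_mib:8.1f} MiB")
```

Under that assumption, the unexplained resident memory is a bit under 2.9 GiB. That is less than the amount tcmalloc reports as released, so these figures alone do not show where the extra resident memory comes from.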
After we restart the OSD, the memory is released.
Has anyone encountered a similar problem?