Bug #12516
Closed
Is the TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES environment variable's default value of 32 MB enough for Ceph daemons?
Description
While a fio test is running on an RBD image mapped to a VM, mpstat shows high %user on the OSD node, and most of that time is spent in libtcmalloc.so.4.1.2.
Running "perf top" on an OSD shows:
34.37% libtcmalloc.so.4.1.2 [.] tcmalloc::CentralFreeList::FetchFromSpans
18.06% libtcmalloc.so.4.1.2 [.] tcmalloc::ThreadCache::ReleaseToCentralCache
13.76% libtcmalloc.so.4.1.2 [.] tcmalloc::CentralFreeList::ReleaseToSpans
1.45% libtcmalloc.so.4.1.2 [.] tcmalloc::CentralFreeList::RemoveRange
Running "mpstat -P ALL 1" on the Ceph OSD node shows values between 80 and 90 in the %user column.
https://www.mail-archive.com/ceph-devel@vger.kernel.org/msg23575.html
As described in the thread linked above, gperftools-2.1.90 has the fix. The gperftools version installed with Ceph 0.80.9 here is gperftools-libs-2.1-1.el7.
There is also the environment variable TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES, and we would like to know the right value to set it to.
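As its name indicates, the variable is expressed in bytes, so any candidate value has to be converted from megabytes first. A minimal sketch of that arithmetic follows; the helper name mb_to_bytes is hypothetical, and 128 MB is only an illustrative candidate, not a recommended setting.

```shell
# Convert a size given in megabytes (MiB) to bytes.
# mb_to_bytes is a hypothetical helper name used only for illustration.
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# The 32 MB default discussed in this report, expressed in bytes.
mb_to_bytes 32

# An illustrative larger candidate (an assumption, not a tested recommendation).
mb_to_bytes 128
```

The resulting number is what would be substituted for the value of TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES.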
Version-Release number of selected component (if applicable):
rpm -qa | grep ceph
ceph-common-0.80.9-0.el7.x86_64
ceph-0.80.9-0.el7.x86_64
libcephfs1-0.80.9-0.el7.x86_64
python-ceph-0.80.9-0.el7.x86_64
cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.1 (Maipo)
rpm -qa | grep gperftools
gperftools-libs-2.1-1.el7
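Since the linked thread says the fix is in gperftools-2.1.90, the installed version can be compared against that number. The following is a rough sketch, assuming GNU sort with the -V (version sort) option is available; the helper name version_ge is hypothetical.

```shell
# Return success if version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort) option.
# version_ge is a hypothetical helper name used only for illustration.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Possible usage on an RPM-based system (package name taken from this report):
#   installed=$(rpm -q --queryformat '%{VERSION}' gperftools-libs)
#   if version_ge "$installed" 2.1.90; then
#       echo "gperftools $installed should already contain the fix"
#   else
#       echo "gperftools $installed predates 2.1.90"
#   fi
```

With the 2.1 version reported above, the check fails, which matches the observation that the installed library predates the fix.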
$ cat test.fio
[global]
ioengine=libaio
iodepth=32
rw=randwrite
runtime=60
bs=16k
direct=1
buffered=0
size=1024M
numjobs=4
group_reporting
[test]
directory=/mnt/test
To run fio in the VM:
$ fio test.fio
We have started working on this issue and found that, without the patch from the following gperftools issue, setting TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES has no effect:
TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES doesn't affect tcmalloc behavior
https://code.google.com/p/gperftools/issues/detail?id=585
The patch can be found at the link above.
As I understand it, we need to backport this patch to the gperftools shipped with Ceph, and also modify our init script if 32 MB is not enough.
Edit the file /etc/init.d/ceph and add one line:
cmd="TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=<the-right-value> $cmd"
after
[ -n "$max_open_files" ] && files="ulimit -n $max_open_files;"
and before
if [ -n "$SYSTEMD_RUN" ];
The final script would then be:
[ -n "$max_open_files" ] && files="ulimit -n $max_open_files;"
cmd="TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=<the-right-value> $cmd"
if [ -n "$SYSTEMD_RUN" ]; then
    cmd="$SYSTEMD_RUN -r bash -c '$files $cmd --cluster $cluster -f'"
else
    cmd="$files $wrap $cmd --cluster $cluster $runmode"
fi
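After restarting a daemon through the modified init script, one way to confirm that the variable actually reached the process is to read its environment from /proc/<pid>/environ, which Linux exposes as a NUL-separated list. The sketch below assumes a Linux host; the helper name env_value_from_file is hypothetical, and values containing newlines are not handled.

```shell
# Print the value of an environment variable from a NUL-separated environ
# file (the format of /proc/<pid>/environ on Linux).
# Prints nothing if the variable is not present.
# env_value_from_file is a hypothetical helper name used only for illustration.
env_value_from_file() {
    var_name="$1"
    environ_file="$2"
    tr '\0' '\n' < "$environ_file" | sed -n "s/^${var_name}=//p"
}

# Possible usage against every running ceph-osd process (run as root):
#   for pid in $(pgrep -x ceph-osd); do
#       echo "$pid: $(env_value_from_file TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES "/proc/$pid/environ")"
#   done
```

An empty value for a given PID would suggest the init script change did not take effect for that daemon. Note that this only shows the variable was passed to the process; per the gperftools issue above, an unpatched library would still ignore it.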