Bug #12516
Updated by Kefu Chai almost 9 years ago
mpstat shows high %user, attributed to libtcmalloc.so.4.1.2, while a fio test is running on an RBD image mapped to a VM.

Running @perf top@ on an OSD node shows:

<pre>
34.37%  libtcmalloc.so.4.1.2  [.] tcmalloc::CentralFreeList::FetchFromSpans
18.06%  libtcmalloc.so.4.1.2  [.] tcmalloc::ThreadCache::ReleaseToCentralCache
13.76%  libtcmalloc.so.4.1.2  [.] tcmalloc::CentralFreeList::ReleaseToSpans
 1.45%  libtcmalloc.so.4.1.2  [.] tcmalloc::CentralFreeList::RemoveRange
</pre>

@mpstat -P ALL 1@ on the Ceph OSD node shows values between 80 and 90 in the %user column.

https://www.mail-archive.com/ceph-devel@vger.kernel.org/msg23575.html

As described in these notes, gperftools-2.1.90 has the fix. The gperftools version shipped with ceph 0.80.9 is gperftools-libs-2.1-1.el7.

There is also an environment variable, TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES, and we are looking for the right value to set it to.

Version-Release number of selected component (if applicable):

<pre>
rpm -qa | grep ceph
ceph-common-0.80.9-0.el7.x86_64
ceph-0.80.9-0.el7.x86_64
libcephfs1-0.80.9-0.el7.x86_64
python-ceph-0.80.9-0.el7.x86_64

cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.1 (Maipo)

rpm -qa | grep gperftools
gperftools-libs-2.1-1.el7
</pre>

The fio job file used for the test:

<pre>
$ cat test.fio
[global]
ioengine=libaio
iodepth=32
rw=randwrite
runtime=60
bs=16k
direct=1
buffered=0
size=1024M
numjobs=4
group_reporting

[test]
directory=/mnt/test
</pre>

To run fio in the VM:

<pre>
$ fio test.fio
</pre>

We have started working on this issue and found that, without the following patch, setting TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES has no effect:

TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES doesn't affect tcmalloc behavior
https://code.google.com/p/gperftools/issues/detail?id=585

The patch can be found at the link above.

As I understand it, we need to backport this patch to the gperftools shipped with Ceph, and also modify our init script if 32 MB is not enough.
In the file /etc/init.d/ceph, add one line:

@cmd="TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=<the-right-value> $cmd"@

It goes after @[ -n "$max_open_files" ] && files="ulimit -n $max_open_files;"@ and before @if [ -n "$SYSTEMD_RUN" ]; then@, so the final script would be:

<pre>
[ -n "$max_open_files" ] && files="ulimit -n $max_open_files;"
cmd="TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=<the-right-value> $cmd"
if [ -n "$SYSTEMD_RUN" ]; then
    cmd="$SYSTEMD_RUN -r bash -c '$files $cmd --cluster $cluster -f'"
else
    cmd="$files $wrap $cmd --cluster $cluster $runmode"
fi
</pre>
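Since the variable takes a byte count, a small helper may make it easier to try different sizes while looking for the right value. The following is only a sketch: the function names (@mib_to_bytes@, @with_tcmalloc_cache@, @tcmalloc_cache_of_pid@) and the example @ceph-osd@ command line are hypothetical, and 128 MiB is an arbitrary illustration, not a recommended setting. The last helper assumes a Linux /proc filesystem and enough privileges to read the daemon's environment.

```shell
#!/bin/sh
# Hypothetical helpers for experimenting with
# TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES; not part of the Ceph init script.

# Convert a size in MiB to bytes, the unit the variable is given in.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Prefix a command string with the variable, mirroring the
# cmd="TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=... $cmd" line proposed above.
# $1 = cache size in MiB, $2 = command string
with_tcmalloc_cache() {
    echo "TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=$(mib_to_bytes "$1") $2"
}

# Print the value a running process was started with (empty if unset).
# Assumes Linux /proc; reading another user's environ needs root.
# $1 = PID of the daemon, e.g. an OSD
tcmalloc_cache_of_pid() {
    tr '\0' '\n' < "/proc/$1/environ" |
        sed -n 's/^TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=//p'
}

# Example: 128 MiB is illustrative only; the command line is an assumption.
with_tcmalloc_cache 128 "/usr/bin/ceph-osd -i 0"
```

After restarting an OSD through the modified init script, @tcmalloc_cache_of_pid <pid>@ can be used to confirm that the variable actually reached the daemon. Per the gperftools issue linked above, the value only changes tcmalloc behavior once the patch is applied.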