Bug #12513

Is the default TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES value of 32 MB enough for Ceph daemons?

Added by Vikhyat Umrao about 4 years ago. Updated over 3 years ago.

Status: New
Priority: High
Assignee: -
Target version: -
Start date: 07/29/2015
Due date:
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

High %user CPU usage, attributed to libtcmalloc.so.4.1.2, is seen while a fio test is running on an RBD image mapped to a VM.

Running "perf top" on an OSD shows:

34.37%  libtcmalloc.so.4.1.2  [.] tcmalloc::CentralFreeList::FetchFromSpans
18.06%  libtcmalloc.so.4.1.2  [.] tcmalloc::ThreadCache::ReleaseToCentralCache
13.76%  libtcmalloc.so.4.1.2  [.] tcmalloc::CentralFreeList::ReleaseToSpans
 1.45%  libtcmalloc.so.4.1.2  [.] tcmalloc::CentralFreeList::RemoveRange

Running "mpstat -P ALL 1" on the Ceph OSD node shows values between 80 and 90 in the %user column.

https://www.mail-archive.com/ceph-devel@vger.kernel.org/msg23575.html

As described in these notes, gperftools-2.1.90 has the fix. The gperftools version shipped with ceph 0.80.9 is gperftools-libs-2.1-1.el7.

There is also an environment variable, TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES, and we would like to know the right value to set it to.

Version-Release number of selected component (if applicable):

# rpm -qa | grep ceph
ceph-common-0.80.9-0.el7.x86_64
ceph-0.80.9-0.el7.x86_64
libcephfs1-0.80.9-0.el7.x86_64
python-ceph-0.80.9-0.el7.x86_64

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.1 (Maipo)

# rpm -qa | grep gperftools
gperftools-libs-2.1-1.el7

$ cat test.fio
[global]
ioengine=libaio
iodepth=32
rw=randwrite
runtime=60
bs=16k
direct=1
buffered=0
size=1024M
numjobs=4
group_reporting

[test]
directory=/mnt/test

To run fio in the VM:

$ fio test.fio

We have started working on this issue and found that, without the patch from the following gperftools issue, setting TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES has no effect:

TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES doesn't affect tcmalloc behavior
https://code.google.com/p/gperftools/issues/detail?id=585

The patch can be found at the link above.

So, as per my understanding, we need to backport this patch to the gperftools shipped with Ceph, and we also need to modify our init script if 32 MB is not enough.

Edit the file /etc/init.d/ceph and add one line:

cmd="TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=<the-right-value> $cmd"

Place it after [ -n "$max_open_files" ] && files="ulimit -n $max_open_files;" and before if [ -n "$SYSTEMD_RUN" ];.

So the final script would be:

[ -n "$max_open_files" ] && files="ulimit -n $max_open_files;"
cmd="TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=<the-right-value> $cmd"
if [ -n "$SYSTEMD_RUN" ]; then
    cmd="$SYSTEMD_RUN -r bash -c '$files $cmd --cluster $cluster -f'"
else
    cmd="$files $wrap $cmd --cluster $cluster $runmode"
fi
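
Since the variable is given in bytes, a small helper can make the init-script value less error-prone. The following is only a sketch: the function name mb_to_bytes is hypothetical, "MB" is assumed to mean 2^20 bytes, and the 128 MB figure is an arbitrary illustration, not a recommended setting.

```shell
#!/bin/bash
# Hypothetical helper: convert a size in MB (assumed to be 2^20 bytes) to bytes.
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# The 32 MB default discussed in this report, expressed in bytes.
default_bytes=$(mb_to_bytes 32)
echo "default: TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=$default_bytes"

# 128 MB is an illustrative value only; the right value is still the open question here.
example_bytes=$(mb_to_bytes 128)
echo "example: TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=$example_bytes"
```

The resulting number is what would replace <the-right-value> in the cmd= line above once a suitable size has been determined by testing.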

History

#1 Updated by Vikhyat Umrao about 4 years ago

By mistake I created this issue in the ceph-dokan project. Please close or delete it, as I have created a new one, http://tracker.ceph.com/issues/12516, in the ceph project.

#2 Updated by Vikhyat Umrao about 4 years ago

Vikhyat Umrao wrote:

By mistake I created this issue in the ceph-dokan project. Please close or delete it, as I have created a new one, http://tracker.ceph.com/issues/12516, in the ceph project.

#3 Updated by Shengzhi Meng about 4 years ago

OK, got it.

#4 Updated by Kenneth Waegeman almost 4 years ago

When using gperftools 2.4, is setting this environment variable enough to enable a larger thread cache? Can we check this somewhere on running daemons?
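
One partial way to check a running daemon on Linux is to read the environment it was started with from /proc/<pid>/environ. The sketch below assumes the daemon process is named ceph-osd and that the caller may read the file (typically root); the function names are hypothetical. It only shows whether the variable was present when the process started, not whether tcmalloc actually honored it.

```shell
#!/bin/bash
# Print the value of one variable from a NUL-separated environ file.
# Usage: proc_env_value <environ-file> <variable-name>
proc_env_value() {
    tr '\0' '\n' < "$1" | sed -n "s/^$2=//p"
}

# Report the setting for every running process named ceph-osd (assumed name).
report_osd_thread_cache_env() {
    local pid value
    for pid in $(pgrep -x ceph-osd); do
        value=$(proc_env_value "/proc/$pid/environ" TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES)
        echo "ceph-osd pid $pid: ${value:-<not set>}"
    done
}
```

Calling report_osd_thread_cache_env on an OSD node prints one line per OSD process; a "<not set>" result means the init script change did not reach that daemon.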
