Ceph : Issues
https://tracker.ceph.com/
2022-06-30T05:43:40Z
Ceph
Redmine
bluestore - Bug #56424 (Resolved): bluestore_cache_other mempool entry leak
https://tracker.ceph.com/issues/56424
2022-06-30T05:43:40Z
alexandre derumier
aderumier@odiso.Com
<p>Hi,</p>
<p>I have an Octopus cluster (15.2.16)</p>
<p>(it was installed with Octopus from the start, not upgraded from a previous Ceph version)</p>
<p>where all OSDs have their bluestore_cache_other mempool slowly growing<br />over time, consuming all OSD memory and shrinking the other pools<br />(bluestore_cache_onode, bluestore_cache_data, bluestore_cache_meta, ...)<br />to almost zero, with a corresponding performance and latency impact.</p>
<p>I have attached a Grafana screenshot with graphs of the pools over<br />the last 2 months.</p>
<p>Usage of this cluster is 100% RBD, with QEMU VMs using Octopus librbd.<br />Scrubbing is scheduled only at night (and the leak also seems to occur<br />during the day, so I don't think it's related).</p>
<p>I also back up this cluster each night through rbd snap|export-diff|import|trim to<br />another Ceph cluster.</p>
<p>Does anybody know how to debug this, or how to get more info about the<br />content of the bluestore_cache_other pool?</p>
<p>I'm currently using the bitmap allocator:</p>
<pre>
bluestore_allocator = bitmap
bluefs_allocator = bitmap
</pre>
<p>bluefs buffered IO is enabled:<br /><pre>
bluefs_buffered_io = true
</pre></p>
<p>The server has a lot of free memory (no swap):<br /><pre>
#free -m
               total        used        free      shared  buff/cache   available
Mem:          128845       49523        1145        1349       78176       76849
Swap:              0           0           0
</pre></p>
<p>Each OSD is using around 2 GB of memory:</p>
<pre>
#ps -aux
ceph 621071 28.1 1.6 5547460 2141148 ? Ssl mai17 17605:21
/usr/bin/ceph-osd -f --cluster ceph --id 8 --setuser ceph --setgroup
ceph
ceph 1386069 27.7 1.6 5474260 2111700 ? Ssl mai04 22668:44
/usr/bin/ceph-osd -f --cluster ceph --id 7 --setuser ceph --setgroup
ceph
ceph 2220877 27.1 1.6 5630312 2192092 ? Ssl avril21 27339:40
/usr/bin/ceph-osd -f --cluster ceph --id 14 --setuser ceph --setgroup
ceph
ceph 2220886 26.4 1.6 5512988 2184240 ? Ssl avril21 26690:21
/usr/bin/ceph-osd -f --cluster ceph --id 1 --setuser ceph --setgroup
ceph
ceph 2220887 30.5 1.6 5599240 2166672 ? Ssl avril21 30788:21
/usr/bin/ceph-osd -f --cluster ceph --id 18 --setuser ceph --setgroup
ceph
ceph 2220892 26.1 1.5 5463992 2107960 ? Ssl avril21 26341:42
/usr/bin/ceph-osd -f --cluster ceph --id 16 --setuser ceph --setgroup
ceph
ceph 2220976 26.4 1.6 5580952 2152004 ? Ssl avril21 26698:44
/usr/bin/ceph-osd -f --cluster ceph --id 15 --setuser ceph --setgroup
ceph
ceph 2220994 30.0 1.6 5604840 2149032 ? Ssl avril21 30271:50
/usr/bin/ceph-osd -f --cluster ceph --id 2 --setuser ceph --setgroup
ceph
ceph 2221015 28.5 1.6 5613948 2169252 ? Ssl avril21 28783:38
/usr/bin/ceph-osd -f --cluster ceph --id 12 --setuser ceph --setgroup
ceph
ceph 2221080 28.0 1.5 5644560 2086976 ? Ssl avril21 28299:06
/usr/bin/ceph-osd -f --cluster ceph --id 6 --setuser ceph --setgroup
ceph
ceph 2221120 30.2 1.6 5605180 2181240 ? Ssl avril21 30509:06
/usr/bin/ceph-osd -f --cluster ceph --id 17 --setuser ceph --setgroup
ceph
ceph 2221156 28.6 1.5 5613664 2109236 ? Ssl avril21 28891:45
/usr/bin/ceph-osd -f --cluster ceph --id 5 --setuser ceph --setgroup
ceph
ceph 2221189 28.8 1.6 5665824 2188980 ? Ssl avril21 29070:01
/usr/bin/ceph-osd -f --cluster ceph --id 19 --setuser ceph --setgroup
ceph
ceph 2221276 26.8 1.5 5555660 2091880 ? Ssl avril21 27093:56
/usr/bin/ceph-osd -f --cluster ceph --id 0 --setuser ceph --setgroup
ceph
ceph 2221277 27.7 1.5 5609368 2074836 ? Ssl avril21 27987:11
/usr/bin/ceph-osd -f --cluster ceph --id 3 --setuser ceph --setgroup
ceph
ceph 2221278 28.4 1.6 5596020 2147776 ? Ssl avril21 28714:16
/usr/bin/ceph-osd -f --cluster ceph --id 9 --setuser ceph --setgroup
ceph
ceph 2221564 27.0 1.5 5569916 2103536 ? Ssl avril21 27291:15
/usr/bin/ceph-osd -f --cluster ceph --id 13 --setuser ceph --setgroup
ceph
ceph 2221655 32.2 1.6 5680616 2146472 ? Ssl avril21 32443:03
/usr/bin/ceph-osd -f --cluster ceph --id 11 --setuser ceph --setgroup
ceph
</pre>
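As a sanity check, the combined ceph-osd resident memory can be totalled straight from that ps output. A small sketch, shown here on three sample RSS values (KiB, column 6) copied from the listing above; the commented one-liner does the same against all OSD processes on a live host:

```shell
# On a live host, sum the RSS of every ceph-osd process:
#   ps aux | awk '/[c]eph-osd/ {sum+=$6} END {printf "%.1f GiB\n", sum/1024/1024}'

# Demonstrated on three sample RSS values (KiB) from the listing above:
printf '%s\n' 2141148 2111700 2192092 |
    awk '{sum+=$1} END {printf "%.1f GiB\n", sum/1024/1024}'
```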
<p>Here are the stats of one OSD (the other OSDs show the same behaviour):</p>
<pre>
#ceph config set osd.6 mempool_debug true
#ceph daemon osd.5 dump_mempools
{
"mempool": {
"by_pool": {
"bloom_filter": {
"items": 0,
"bytes": 0,
"by_type": {
"unsigned char": {
"items": 0,
"bytes": 0
}
}
},
"bluestore_alloc": {
"items": 13024860,
"bytes": 104198880,
"by_type": {
"range_seg_t": {
"items": 0,
"bytes": 0
}
}
},
"bluestore_cache_data": {
"items": 259,
"bytes": 5077606
},
"bluestore_cache_onode": {
"items": 95,
"bytes": 58520,
"by_type": {
"BlueStore::Onode": {
"items": 95,
"bytes": 58520
}
}
},
"bluestore_cache_meta": {
"items": 2972660,
"bytes": 23797771,
"by_type": {
"BlueStore::ExtentMap::Shard": {
"items": 797,
"bytes": 12752
},
"char": {
"items": 5818,
"bytes": 5818
},
"std::_Rb_tree_node<std::pair<int const,
boost::intrusive_ptr<BlueStore::Blob> > >": {
"items": 68,
"bytes": 3264
},
"std::_Rb_tree_node<std::pair<unsigned int const,
std::unique_ptr<BlueStore::Buffer,
std::default_delete<BlueStore::Buffer> > > >": {
"items": 122,
"bytes": 5856
}
}
},
"bluestore_cache_other": {
"items": 80377013,
"bytes": 2794170136,
"by_type": {
"bluestore_pextent_t": {
"items": 4734,
"bytes": 75744
},
"bluestore_shared_blob_t": {
"items": 1,
"bytes": 72
},
"std::_Rb_tree_node<std::pair<unsigned long const,
bluestore_extent_ref_map_t::record_t> >": {
"items": 4,
"bytes": 192
}
}
},
"bluestore_Buffer": {
"items": 122,
"bytes": 11712,
"by_type": {
"BlueStore::Buffer": {
"items": 122,
"bytes": 11712
}
}
},
"bluestore_Extent": {
"items": 965,
"bytes": 46320,
"by_type": {
"BlueStore::Extent": {
"items": 965,
"bytes": 46320
}
}
},
"bluestore_Blob": {
"items": 942,
"bytes": 97968,
"by_type": {
"BlueStore::Blob": {
"items": 942,
"bytes": 97968
}
}
},
"bluestore_SharedBlob": {
"items": 942,
"bytes": 105504,
"by_type": {
"BlueStore::SharedBlob": {
"items": 942,
"bytes": 105504
}
}
},
"bluestore_inline_bl": {
"items": 2,
"bytes": 1990
},
"bluestore_fsck": {
"items": 0,
"bytes": 0
},
"bluestore_txc": {
"items": 11,
"bytes": 8184,
"by_type": {
"BlueStore::TransContext": {
"items": 11,
"bytes": 8184
}
}
},
"bluestore_writing_deferred": {
"items": 15,
"bytes": 77996
},
"bluestore_writing": {
"items": 52,
"bytes": 333908
},
"bluefs": {
"items": 3449,
"bytes": 59984,
"by_type": {
"BlueFS::Dir": {
"items": 3,
"bytes": 264
},
"BlueFS::File": {
"items": 70,
"bytes": 14560
},
"BlueFS::FileLock": {
"items": 1,
"bytes": 8
}
}
},
"bluefs_file_reader": {
"items": 124,
"bytes": 818688,
"by_type": {
"BlueFS::FileReader": {
"items": 60,
"bytes": 7680
},
"BlueFS::FileReaderBuffer": {
"items": 0,
"bytes": 0
}
}
},
"bluefs_file_writer": {
"items": 4,
"bytes": 896,
"by_type": {
"BlueFS::FileWriter": {
"items": 4,
"bytes": 896
}
}
},
"buffer_anon": {
"items": 8420,
"bytes": 4427920
},
"buffer_meta": {
"items": 715,
"bytes": 62920,
"by_type": {
"ceph::buffer::v15_2_0::raw_char": {
"items": 0,
"bytes": 0
},
"ceph::buffer::v15_2_0::raw_claimed_char": {
"items": 0,
"bytes": 0
},
"ceph::buffer::v15_2_0::raw_malloc": {
"items": 0,
"bytes": 0
},
"ceph::buffer::v15_2_0::raw_posix_aligned": {
"items": 715,
"bytes": 62920
},
"ceph::buffer::v15_2_0::raw_static": {
"items": 0,
"bytes": 0
}
}
},
"osd": {
"items": 101,
"bytes": 1306536,
"by_type": {
"PGPeeringEvent": {
"items": 0,
"bytes": 0
},
"PrimaryLogPG": {
"items": 101,
"bytes": 1306536
}
}
},
"osd_mapbl": {
"items": 0,
"bytes": 0
},
"osd_pglog": {
"items": 403016,
"bytes": 167032496,
"by_type": {
"std::_Rb_tree_node<std::pair<unsigned int const,
int> >": {
"items": 0,
"bytes": 0
},
"std::pair<osd_reqid_t, unsigned long>": {
"items": 0,
"bytes": 0
}
}
},
"osdmap": {
"items": 34875,
"bytes": 1428352,
"by_type": {
"OSDMap": {
"items": 51,
"bytes": 61608
},
"OSDMap::Incremental": {
"items": 0,
"bytes": 0
}
}
},
"osdmap_mapping": {
"items": 0,
"bytes": 0
},
"pgmap": {
"items": 0,
"bytes": 0,
"by_type": {
"PGMap": {
"items": 0,
"bytes": 0
},
"PGMap::Incremental": {
"items": 0,
"bytes": 0
},
"PGMapDigest": {
"items": 0,
"bytes": 0
}
}
},
"mds_co": {
"items": 0,
"bytes": 0
},
"unittest_1": {
"items": 0,
"bytes": 0
},
"unittest_2": {
"items": 0,
"bytes": 0
}
},
"total": {
"items": 96828642,
"bytes": 3103124287
}
}
}
</pre>
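To see at a glance which mempool dominates, the dump can be sorted by bytes. A sketch assuming python3 is available; on a live system you would first capture the dump with <code>ceph daemon osd.5 dump_mempools &gt; mempools.json</code>, while here a trimmed inline sample of the dump above is used:

```shell
# Trimmed sample of the dump above
# (live system: ceph daemon osd.5 dump_mempools > mempools.json)
cat > mempools.json <<'EOF'
{"mempool":{"by_pool":{
  "bluestore_cache_other":{"items":80377013,"bytes":2794170136},
  "osd_pglog":{"items":403016,"bytes":167032496},
  "bluestore_alloc":{"items":13024860,"bytes":104198880},
  "bluestore_cache_meta":{"items":2972660,"bytes":23797771}}}}
EOF

# Rank the pools by bytes, largest first
python3 - <<'PY'
import json
pools = json.load(open("mempools.json"))["mempool"]["by_pool"]
for name, st in sorted(pools.items(), key=lambda kv: -kv[1]["bytes"]):
    print(f'{name:25} {st["bytes"]:>12} bytes {st["items"]:>10} items')
PY
```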
<pre>
#ceph tell osd.5 heap dump
osd.5 dumping heap profile now.
------------------------------------------------
MALLOC: 1935971080 ( 1846.3 MiB) Bytes in use by application
MALLOC: + 131072 ( 0.1 MiB) Bytes in page heap freelist
MALLOC: + 146011104 ( 139.2 MiB) Bytes in central cache freelist
MALLOC: + 8280576 ( 7.9 MiB) Bytes in transfer cache freelist
MALLOC: + 147054360 ( 140.2 MiB) Bytes in thread cache freelists
MALLOC: + 23986176 ( 22.9 MiB) Bytes in malloc metadata
MALLOC: ------------
MALLOC: = 2261434368 ( 2156.7 MiB) Actual memory used (physical + swap)
MALLOC: + 2414272512 ( 2302.4 MiB) Bytes released to OS (aka unmapped)
MALLOC: ------------
MALLOC: = 4675706880 ( 4459.1 MiB) Virtual address space used
MALLOC:
MALLOC: 142742 Spans in use
MALLOC: 86 Thread heaps in use
MALLOC: 8192 Tcmalloc page size
------------------------------------------------
</pre>
<p>Maybe related:</p>
<p><a class="external" href="https://tracker.ceph.com/issues/55761">https://tracker.ceph.com/issues/55761</a></p>
ceph-volume - Bug #38323 (New): ceph-volume -osds-per-device : Unable to convert to integer with ...
https://tracker.ceph.com/issues/38323
2019-02-14T15:10:07Z
alexandre derumier
aderumier@odiso.Com
<pre>
#echo $LANG
fr_FR.UTF-8
</pre>
<p>#ceph-volume lvm batch --dmcrypt --osds-per-device 2 /dev/nvme0n1</p>
<p>Total OSDs: 2</p>
<pre><code>Type Path LV Size % of device<br />----------------------------------------------------------------------------------------------------<br /> [data] /dev/nvme0n1 2.91 TB 50%<br />----------------------------------------------------------------------------------------------------<br /> [data] /dev/nvme0n1 2.91 TB 50%<br />--> The above OSDs would be created if the operation continues<br />--> do you want to proceed? (yes/no) yes<br />Running command: /sbin/vgcreate --force --yes ceph-e7d5283b-befc-420b-aa7e-e0ad6c10f34e /dev/nvme0n1<br /> stdout: Physical volume "/dev/nvme0n1" successfully created.<br /> stdout: Volume group "ceph-e7d5283b-befc-420b-aa7e-e0ad6c10f34e" successfully created<br />--> RuntimeError: Unable to convert to integer: '5961,63'</code></pre>
<p>The "," in "5961,63" is wrong; it needs to be ".".</p>
<p>Fix:<br />#export LANG=en_US.UTF-8</p>
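The failure mode is easy to reproduce outside ceph-volume: a size formatted under a French locale carries a decimal comma, which any dot-expecting numeric parse rejects. A minimal sketch of the symptom and a string-level normalization (the practical fix, as above, is forcing the locale before running ceph-volume); note the substitution syntax is bash-specific:

```shell
# LVM under LANG=fr_FR.UTF-8 reports sizes with a decimal comma:
size="5961,63"

# A dot-expecting parser chokes on the comma; normalizing it first works:
fixed="${size/,/.}"          # bash pattern substitution: first ',' -> '.'
echo "$fixed"                # 5961.63

# The practical workaround is an English/C locale for the ceph-volume run:
#   LANG=en_US.UTF-8 ceph-volume lvm batch --dmcrypt --osds-per-device 2 /dev/nvme0n1
```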
Ceph - Bug #16351 (Resolved): jewel : 60-ceph-partuuid-workaround-rules still needed by debian je...
https://tracker.ceph.com/issues/16351
2016-06-16T15:42:24Z
alexandre derumier
aderumier@odiso.Com
<p><a href="https://www.mail-archive.com/ceph-users@lists.ceph.com/msg28661.html" class="external">associated mail thread</a></p>
<p>Hi,<br />since this commit<br /><a class="external" href="https://github.com/ceph/ceph/commit/9f77244b8e0782921663e52005b725cca58a8753">https://github.com/ceph/ceph/commit/9f77244b8e0782921663e52005b725cca58a8753</a></p>
<p>OSD disks are not mounted anymore at boot on Debian Jessie (udev 215-17).</p>
<p>I have looked at the udev log, and 95-ceph-osd.rules is never triggered.</p>
<p>If I add back the old 60-ceph-partuuid-workaround.rules, it works fine.</p>
devops - Bug #15588 (Resolved): deb: TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES not specified in /etc/...
https://tracker.ceph.com/issues/15588
2016-04-25T03:58:57Z
alexandre derumier
aderumier@odiso.Com
<p>Hi,<br />this PR implements TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=128MB in the Red Hat sysconfig:<br /><a class="external" href="https://github.com/ceph/ceph/pull/7934">https://github.com/ceph/ceph/pull/7934</a></p>
<p>We need the same for debian/ubuntu in /etc/default/ceph</p>
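A sketch of what the debian/ubuntu counterpart could look like, mirroring the 128 MB value from the Red Hat sysconfig change (tcmalloc's default total thread cache is 32 MB):

```shell
# /etc/default/ceph (sketch; sourced by the ceph unit/init scripts)
# Raise tcmalloc's total thread cache; 134217728 bytes = 128 MB
TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728
```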
devops - Bug #15587 (Resolved): deb: wrong /etc/default/ceph/ceph location
https://tracker.ceph.com/issues/15587
2016-04-24T17:08:12Z
alexandre derumier
aderumier@odiso.Com
<p>The debian rules file is wrong: it creates a /etc/default/ceph directory containing a /etc/default/ceph/ceph file.</p>
<p>The correct path (used by systemd and the init script) is the /etc/default/ceph file.</p>
<pre>
--- a/debian/rules
+++ b/debian/rules
-       install -d -m0755 debian/ceph-common/etc/default/ceph
-       install -m0644 etc/default/ceph debian/ceph-common/etc/default/ceph
+       install -d -m0755 debian/ceph-common/etc/default
+       install -m0644 etc/default/ceph debian/ceph-common/etc/default/
</pre>
Ceph - Bug #15585 (Resolved): jewel: debian package: init.d script bug
https://tracker.ceph.com/issues/15585
2016-04-24T12:49:35Z
alexandre derumier
aderumier@odiso.Com
<p>Hi,</p>
<p>since this commit<br /><a class="external" href="https://github.com/ceph/ceph/commit/65963739cd6815b8008282c8f64cd64365662e60">https://github.com/ceph/ceph/commit/65963739cd6815b8008282c8f64cd64365662e60</a></p>
<pre>
if [ -n $CEPH_BIN ] && [ -n $CEPH_ROOT ] && [ -n $CEPH_BUILD_DIR ]; then
    #need second look at all variables, especially ETCDIR
    BINDIR=$CEPH_BIN
    SBINDIR=$CEPH_ROOT/src
    ETCDIR=$CEPH_BUILD_DIR
    LIBEXECDIR=$CEPH_ROOT/src
    SYSTEMD_RUN=""
    ASSUME_DEV=1
fi
</pre>
<p>The debian init.d script is not working anymore:<br />the variables need to be double-quoted, otherwise the test is always true when they are undefined.</p>
<pre>
-if [ -n $CEPH_BIN ] && [ -n $CEPH_ROOT ] && [ -n $CEPH_BUILD_DIR ]; then
+if [ -n "$CEPH_BIN" ] && [ -n "$CEPH_ROOT" ] && [ -n "$CEPH_BUILD_DIR" ]; then
</pre>
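The behaviour is easy to demonstrate: with an unset variable, the unquoted form collapses to <code>[ -n ]</code>, a one-argument test that is always true, while the quoted form correctly tests the empty string:

```shell
unset CEPH_BIN

# Unquoted: expands to `[ -n ]`; a single-argument test is true for any
# non-empty string, including the literal "-n" itself.
if [ -n $CEPH_BIN ]; then echo "unquoted: true"; fi

# Quoted: expands to `[ -n "" ]`, which is correctly false.
if [ -n "$CEPH_BIN" ]; then echo "quoted: true"; else echo "quoted: false"; fi
```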
Ceph - Bug #15584 (Duplicate): jewel : debian package : missing ceph-osd.target, ceph-mon.target,...
https://tracker.ceph.com/issues/15584
2016-04-24T12:45:52Z
alexandre derumier
aderumier@odiso.Com
<p>Hi,</p>
<p>The jewel debian packages are missing some systemd files:</p>
<p>ceph-osd.target, ceph-mon.target, ceph-mds.target</p>
<p>They are needed by ceph.target (so currently "systemctl start ceph.target" does nothing).</p>
rbd - Bug #10139 (New): librbd cpu usage 4x higher than krbd
https://tracker.ceph.com/issues/10139
2014-11-19T05:50:07Z
alexandre derumier
aderumier@odiso.Com
<p>librbd CPU usage is currently quite high, around 4-5x higher than krbd.</p>
<p>(Tested with fio+krbd vs fio+librbd, random read 4K)</p>
<p>A perf report is attached to this tracker issue.</p>
<p>Mailing list discussion about this:<br /><a class="external" href="http://article.gmane.org/gmane.comp.file-systems.ceph.devel/22089/match=client+cpu+usage+kbrd+vs+librbd+perf+report">http://article.gmane.org/gmane.comp.file-systems.ceph.devel/22089/match=client+cpu+usage+kbrd+vs+librbd+perf+report</a></p>
<p>Sage:<br />----</p>
<blockquote>
<p>I'm a bit surprised by some of the items near the top<br />(bufferlist.clear() callers). I'm sure several of those can be<br />streamlined to avoid temporary bufferlists. I don't see any super<br />egregious users of the allocator, though.</p>
<p>The memcpy callers might be a good place to start...</p>
<p>sage</p>
</blockquote>
<p>Mark<br />----</p>
<blockquote>
<p>Wasn't Josh looking into some of this a year ago? Did anything ever<br />come of that work?</p>
</blockquote>
<p>Haomai Wang<br />-----------</p>
<blockquote>
<p>Hmm, I think buffer alloc/dealloc is a good perf topic to discuss.<br />For example, frequently allocated objects could use a memory pool<br />(each pool storing the same kind of objects), but the biggest challenge<br />for this is the STL structures.</p>
</blockquote>