Actions
Bug #38049
closedrandom osds failing in thread_name:bstore_kv_final
% Done:
0%
Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Since upgrading to mimic 13.2.4 single osds are randomly failing with the following assert:
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -67> 2019-01-25 15:57:08.414 7eff6730a700 1 get compressor snappy = 0x7effa0dd9240
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -66> 2019-01-25 15:57:08.414 7eff6730a700 10 bluestore(/var/lib/ceph/osd/ceph-225) read 6.141s3_head 3#6:82b4e9d2:::10003f4713e.00000003:head# 0x
0~1000000 = 524288
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -65> 2019-01-25 15:57:08.414 7eff6730a700 10 bluestore(/var/lib/ceph/osd/ceph-225) stat 6.141s3_head 3#6:82b4e9d2:::10003f4713e.00000003:head#
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -64> 2019-01-25 15:57:08.414 7eff6730a700 10 bluestore(/var/lib/ceph/osd/ceph-225) stat 6.141s3_head 3#6:82b4e9e0:::1000cfcc030.0000000e:head#
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -63> 2019-01-25 15:57:08.414 7eff6730a700 20 bluestore(/var/lib/ceph/osd/ceph-225).collection(6.141s3_head 0x7eff8b6fee00) get_onode oid 3#6:82b4
e9e0:::1000cfcc030.0000000e:head# key 0x83800000000000000682b4e9e0213130'00cfcc030.0000000e!='0xfffffffffffffffeffffffffffffffff'o'
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -62> 2019-01-25 15:57:08.414 7eff6730a700 20 bluestore(/var/lib/ceph/osd/ceph-225).collection(6.141s3_head 0x7eff8b6fee00) r 0 v.len 438
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -61> 2019-01-25 15:57:08.414 7eff6730a700 20 bluestore.blob(0x7eff91796770) get_ref 0x0~80000 Blob(0x7eff91796770 blob([0x525f8f10000~70000] clen
0x80000 -> 0x6da33 compressed) use_tracker(0x0 0x0) SharedBlob(0x7eff917967e0 sbid 0x0))
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -60> 2019-01-25 15:57:08.414 7eff6730a700 20 bluestore.blob(0x7eff91796770) get_ref init 0x80000, 80000
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -59> 2019-01-25 15:57:08.414 7eff6730a700 15 bluestore(/var/lib/ceph/osd/ceph-225) getattrs 6.141s3_head 3#6:82b4e9e0:::1000cfcc030.0000000e:head
#
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -58> 2019-01-25 15:57:08.414 7eff6730a700 10 bluestore(/var/lib/ceph/osd/ceph-225) getattrs 6.141s3_head 3#6:82b4e9e0:::1000cfcc030.0000000e:head
# = 0
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -57> 2019-01-25 15:57:08.414 7eff6730a700 15 bluestore(/var/lib/ceph/osd/ceph-225) read 6.141s3_head 3#6:82b4e9e0:::1000cfcc030.0000000e:head# 0x
0~1000000
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -56> 2019-01-25 15:57:08.414 7eff6730a700 20 bluestore(/var/lib/ceph/osd/ceph-225) _do_read 0x0~1000000 size 0x80000 (524288)
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -55> 2019-01-25 15:57:08.414 7eff6730a700 20 bluestore(/var/lib/ceph/osd/ceph-225) _do_read blob Blob(0x7eff91796770 blob([0x525f8f10000~70000]
clen 0x80000 -> 0x6da33 compressed) use_tracker(0x80000 0x80000) SharedBlob(0x7eff917967e0 sbid 0x0)) need 0x0~80000 cache has 0x[]
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -54> 2019-01-25 15:57:08.414 7eff6730a700 20 bluestore(/var/lib/ceph/osd/ceph-225) _do_read blob Blob(0x7eff91796770 blob([0x525f8f10000~70000]
clen 0x80000 -> 0x6da33 compressed) use_tracker(0x80000 0x80000) SharedBlob(0x7eff917967e0 sbid 0x0)) need {<0x0, 0x80000> : [0x0:0~80000]}
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -53> 2019-01-25 15:57:08.434 7eff65b07700 20 bluestore(/var/lib/ceph/osd/ceph-225) _do_read blob Blob(0x7eff8e8b60e0 blob([0x164bf7e0000~50000]
csum crc32c/0x1000) use_tracker(0x5*0x10000 0x[10000,10000,10000,10000,10000]) SharedBlob(0x7eff8e8b6150 sbid 0x0)) need 0x{<0x0, 0x50000> : [0x0:0~50000]}
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -52> 2019-01-25 15:57:08.434 7eff65b07700 20 bluestore(/var/lib/ceph/osd/ceph-225) _do_read blob Blob(0x7eff8e8b61c0 blob([!~50000,0x164bf830000
~10000] csum+has_unused crc32c/0x1000 unused=0x1fff) use_tracker(0x6*0x10000 0x[0,0,0,0,0,c000]) SharedBlob(0x7eff8e8b6380 sbid 0x0)) need 0x{<0x50000, 0xc000> : [0x50000:50000~c000]}
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -51> 2019-01-25 15:57:08.434 7eff65b07700 10 bluestore(/var/lib/ceph/osd/ceph-225) read 6.36fs6_head 6#6:f6ef61c0:::1000ee9b1a4.0000001b:head# 0x
0~1000000 = 376832
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -50> 2019-01-25 15:57:08.434 7eff65b07700 10 bluestore(/var/lib/ceph/osd/ceph-225) stat 6.36fs6_head 6#6:f6ef61c0:::1000ee9b1a4.0000001b:head#
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -49> 2019-01-25 15:57:08.434 7eff65b07700 10 bluestore(/var/lib/ceph/osd/ceph-225) stat 6.36fs6_head 6#6:f6ef61ec:::10008fb62c4.00000004:head#
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -48> 2019-01-25 15:57:08.434 7eff65b07700 20 bluestore(/var/lib/ceph/osd/ceph-225).collection(6.36fs6_head 0x7eff8b716a00) get_onode oid 6#6:f6ef
61ec:::10008fb62c4.00000004:head# key 0x868000000000000006f6ef61ec213130'008fb62c4.00000004!='0xfffffffffffffffeffffffffffffffff'o'
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -47> 2019-01-25 15:57:08.434 7eff65b07700 20 bluestore(/var/lib/ceph/osd/ceph-225).collection(6.36fs6_head 0x7eff8b716a00) r 0 v.len 438
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -46> 2019-01-25 15:57:08.434 7eff65b07700 20 bluestore.blob(0x7eff8e8b65b0) get_ref 0x0~80000 Blob(0x7eff8e8b65b0 blob([0x164bf9a0000~70000] clen
0x80000 -> 0x69b43 compressed) use_tracker(0x0 0x0) SharedBlob(0x7eff8e8b6620 sbid 0x0))
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -45> 2019-01-25 15:57:08.434 7eff65b07700 20 bluestore.blob(0x7eff8e8b65b0) get_ref init 0x80000, 80000
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -44> 2019-01-25 15:57:08.434 7eff65b07700 15 bluestore(/var/lib/ceph/osd/ceph-225) getattrs 6.36fs6_head 6#6:f6ef61ec:::10008fb62c4.00000004:head
#
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -43> 2019-01-25 15:57:08.434 7eff65b07700 10 bluestore(/var/lib/ceph/osd/ceph-225) getattrs 6.36fs6_head 6#6:f6ef61ec:::10008fb62c4.00000004:head
# = 0
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -42> 2019-01-25 15:57:08.434 7eff65b07700 15 bluestore(/var/lib/ceph/osd/ceph-225) read 6.36fs6_head 6#6:f6ef61ec:::10008fb62c4.00000004:head# 0x
0~1000000
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -41> 2019-01-25 15:57:08.434 7eff65b07700 20 bluestore(/var/lib/ceph/osd/ceph-225) _do_read 0x0~1000000 size 0x80000 (524288)
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -40> 2019-01-25 15:57:08.434 7eff65b07700 20 bluestore(/var/lib/ceph/osd/ceph-225) _do_read blob Blob(0x7eff8e8b65b0 blob([0x164bf9a0000~70000]
clen 0x80000 -> 0x69b43 compressed) use_tracker(0x80000 0x80000) SharedBlob(0x7eff8e8b6620 sbid 0x0)) need 0x0~80000 cache has 0x[]
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -39> 2019-01-25 15:57:08.434 7eff65b07700 20 bluestore(/var/lib/ceph/osd/ceph-225) _do_read blob Blob(0x7eff8e8b65b0 blob([0x164bf9a0000~70000]
clen 0x80000 -> 0x69b43 compressed) use_tracker(0x80000 0x80000) SharedBlob(0x7eff8e8b6620 sbid 0x0)) need {<0x0, 0x80000> : [0x0:0~80000]}
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -38> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.MempoolThread(0x7effa55d1b08) _trim_shards cache_size: 134217728 kv_alloc: 134217728 kv_us
ed: 132811826 meta_alloc: 0 meta_used: 3696326 data_alloc: 0 data_used: 0
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -37> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa730b440) _trim onodes 0 / 0 buffers 0 / 0
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -36> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa57131c0) _trim onodes 2 / 0 buffers 0 / 0
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -35> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa57131c0) _trim considering 0x7efe17fcbb00
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -34> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.sharedblob(0x7eff91796460) put 0x7eff91796460 removing self from set 0x7eff8b6fef00
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -33> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.BufferSpace(0x7eff91796478 in 0x7effa57131c0) _clear
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -32> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa57131c0) _trim considering 0x7efe17fccd00
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -31> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa57131c0) _trim 3#6:82b4e9e0:::1000cfcc030.0000000e:head# has 3 refs; skipp
ing
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -30> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa57132a0) _trim onodes 5 / 0 buffers 0 / 0
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -29> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa57132a0) _trim considering 0x7eff58c1af00
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -28> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa57132a0) _trim 2#6:80a659e9:::1000fbf04ab.0000000a:head# has 2 refs; skipp
ing
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -27> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa57132a0) _trim considering 0x7efe0d738600
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -26> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa57132a0) _trim 2#6:80a63af2:::1000fe25508.00000042:head#b02fa has 2 refs;
skipping
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -25> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa57132a0) _trim considering 0x7efe10372300
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -24> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa57132a0) _trim 2#6:80a659e9:::1000fbf04ab.0000000a:head#b02fc has 2 refs;
skipping
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -23> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa57132a0) _trim considering 0x7efe0d73a100
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -22> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa57132a0) _trim 2#6:80800000::::head# has 6 refs; skipping
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -21> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa57132a0) _trim considering 0x7eff6409e200
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -20> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa57132a0) _trim 2#6:80a67c40:::1000fecf3fa.000071fb:head# has 5 refs; skipp
ing
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -19> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa5713380) _trim onodes 2 / 0 buffers 0 / 0
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -18> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa5713380) _trim considering 0x7efeeb060400
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -17> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa5713380) _trim 4#6:cb400000::::head# has 3 refs; skipping
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -16> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa5713380) _trim considering 0x7efec813bb00
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -15> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa5713380) _trim 4#6:cb5868b1:::1000ff1ac08.0000450d:head# has 5 refs; skipp
ing
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -14> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa5713460) _trim onodes 4 / 0 buffers 0 / 0
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -13> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa5713460) _trim considering 0x7eff2653dc00
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -12> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.sharedblob(0x7eff8e777d60) put 0x7eff8e777d60 removing self from set 0x7eff8b716b00
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -11> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.BufferSpace(0x7eff8e777d78 in 0x7effa5713460) _clear
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -10> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa5713460) _trim considering 0x7eff2653df00
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -9> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.sharedblob(0x7eff8e8b6000) put 0x7eff8e8b6000 removing self from set 0x7eff8b716b00
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -8> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.BufferSpace(0x7eff8e8b6018 in 0x7effa5713460) _clear
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -7> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa5713460) _trim considering 0x7eff24f5b900
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -6> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.sharedblob(0x7eff8e8b6150) put 0x7eff8e8b6150 removing self from set 0x7eff8b716b00
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -5> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.BufferSpace(0x7eff8e8b6168 in 0x7effa5713460) _clear
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -4> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.sharedblob(0x7eff8e8b6380) put 0x7eff8e8b6380 removing self from set 0x7eff8b716b00
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -3> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.BufferSpace(0x7eff8e8b6398 in 0x7effa5713460) _clear
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -2> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa5713460) _trim considering 0x7eff24f5c800
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: -1> 2019-01-25 15:57:08.434 7eff97faf700 20 bluestore.2QCache(0x7effa5713460) _trim 6#6:f6ef61ec:::10008fb62c4.00000004:head# has 3 refs; skipping
Jan 25 15:57:08 kaa-115 ceph-osd[24687]: 0> 2019-01-25 15:57:08.434 7eff8fd23700 -1 *** Caught signal (Aborted) **
in thread 7eff8fd23700 thread_name:bstore_kv_final
ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
1: (()+0x8cc83b) [0x55ebefd5f83b]
2: (()+0x13230) [0x7effa982f230]
3: (gsignal()+0x10b) [0x7effa86cdbcd]
4: (abort()+0x148) [0x7effa86cf298]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x264) [0x7effac667547]
6: (ceph::register_assert_context(CephContext*)+0) [0x7effac6675b0]
7: (StupidAllocator::_insert_free(unsigned long, unsigned long)+0xbc4) [0x55ebefd3a6c0]
8: (StupidAllocator::release(interval_set<unsigned long, std::map<unsigned long, unsigned long, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, unsigned long> > > > const&)+0x6b) [0x55ebefd3a759]
9: (BlueStore::_txc_release_alloc(BlueStore::TransContext*)+0xc4) [0x55ebefc0fd7e]
10: (BlueStore::_txc_finish(BlueStore::TransContext*)+0x447) [0x55ebefc25cb1]
11: (BlueStore::_txc_state_proc(BlueStore::TransContext*)+0x90) [0x55ebefc3472a]
12: (BlueStore::_kv_finalize_thread()+0x47d) [0x55ebefc358a9]
13: (BlueStore::KVFinalizeThread::entry()+0xd) [0x55ebefc8a187]
14: (()+0x75e5) [0x7effa98235e5]
15: (clone()+0x3f) [0x7effa8794e9f]
The affected osds can be restarted manually, but the failures happen quite frequently. Multiple (3-5) osds usually crash in quick succession, and this seems to occur during compression with snappy.
Files
Actions