Bug #49818

os/bluestore/BlueStore.cc: FAILED ceph_assert(o->pinned) in BlueStore::Collection::split_cache()

Added by Neha Ojha 6 months ago. Updated 6 months ago.

Status:
Fix Under Review
Priority:
Immediate
Assignee:
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
pacific, octopus
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2021-03-13T03:25:33.299 INFO:tasks.ceph.osd.2.smithi049.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.1.0-742-gf7a04bff/rpm/el8/BUILD/ceph-16.1.0-742-gf7a04bff/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::Collection::split_cache(BlueStore::Collection*)' thread 7f57716b4700 time 2021-03-13T03:25:33.296503+0000
2021-03-13T03:25:33.299 INFO:tasks.ceph.osd.2.smithi049.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.1.0-742-gf7a04bff/rpm/el8/BUILD/ceph-16.1.0-742-gf7a04bff/src/os/bluestore/BlueStore.cc: 4018: FAILED ceph_assert(o->pinned)
2021-03-13T03:25:33.301 INFO:tasks.ceph.osd.2.smithi049.stderr: ceph version 16.1.0-742-gf7a04bff (f7a04bff39fc5756ebbc8ad096bd00e87357e14c) pacific (rc)
2021-03-13T03:25:33.301 INFO:tasks.ceph.osd.2.smithi049.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x55c37dfd00a2]
2021-03-13T03:25:33.302 INFO:tasks.ceph.osd.2.smithi049.stderr: 2: ceph-osd(+0x5692bc) [0x55c37dfd02bc]
2021-03-13T03:25:33.302 INFO:tasks.ceph.osd.2.smithi049.stderr: 3: (BlueStore::Collection::split_cache(BlueStore::Collection*)+0x13ea) [0x55c37e5f30ea]
2021-03-13T03:25:33.302 INFO:tasks.ceph.osd.2.smithi049.stderr: 4: (BlueStore::_split_collection(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Collection>&, unsigned int, int)+0x308) [0x55c37e5f3548]
2021-03-13T03:25:33.302 INFO:tasks.ceph.osd.2.smithi049.stderr: 5: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0xa21) [0x55c37e614291]
2021-03-13T03:25:33.303 INFO:tasks.ceph.osd.2.smithi049.stderr: 6: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x316) [0x55c37e616c26]
2021-03-13T03:25:33.303 INFO:tasks.ceph.osd.2.smithi049.stderr: 7: (ObjectStore::queue_transaction(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ceph::os::Transaction&&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x85) [0x55c37e13a485]
2021-03-13T03:25:33.303 INFO:tasks.ceph.osd.2.smithi049.stderr: 8: (OSD::dispatch_context(PeeringCtx&, PG*, std::shared_ptr<OSDMap const>, ThreadPool::TPHandle*)+0xf3) [0x55c37e0d0093]
2021-03-13T03:25:33.303 INFO:tasks.ceph.osd.2.smithi049.stderr: 9: (OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&)+0x2d8) [0x55c37e101888]
2021-03-13T03:25:33.304 INFO:tasks.ceph.osd.2.smithi049.stderr: 10: (ceph::osd::scheduler::PGPeeringItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x56) [0x55c37e32fbc6]
2021-03-13T03:25:33.304 INFO:tasks.ceph.osd.2.smithi049.stderr: 11: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xa58) [0x55c37e0f36c8]
2021-03-13T03:25:33.304 INFO:tasks.ceph.osd.2.smithi049.stderr: 12: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5c4) [0x55c37e75a054]
2021-03-13T03:25:33.304 INFO:tasks.ceph.osd.2.smithi049.stderr: 13: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x55c37e75ccf4]
2021-03-13T03:25:33.305 INFO:tasks.ceph.osd.2.smithi049.stderr: 14: /lib64/libpthread.so.0(+0x82de) [0x7f57985282de]
2021-03-13T03:25:33.305 INFO:tasks.ceph.osd.2.smithi049.stderr: 15: clone()

/a/sage-2021-03-13_00:06:23-rados-wip-sage2-testing-2021-03-12-1328-pacific-distro-basic-smithi/5959430 - this includes https://github.com/ceph/ceph/pull/39228


Related issues

Related to bluestore - Bug #48966: FAILED ceph_assert(o->pinned) in BlueStore::Collection::split_cache(BlueStore::Collection*) Resolved
Related to bluestore - Bug #49900: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1, counting from 0) Resolved
Duplicated by bluestore - Bug #49820: FAILED ceph_assert(o->pinned) in BlueStore::Collection::split_cache(BlueStore::Collection*) Duplicate

History

#1 Updated by Neha Ojha 6 months ago

  • Related to Bug #48966: FAILED ceph_assert(o->pinned) in BlueStore::Collection::split_cache(BlueStore::Collection*) added

#2 Updated by Neha Ojha 6 months ago

  • Duplicated by Bug #49820: FAILED ceph_assert(o->pinned) in BlueStore::Collection::split_cache(BlueStore::Collection*) added

#3 Updated by Igor Fedotov 6 months ago

It looks like the Onode's put/get logic still has a flaw:

  void BlueStore::Onode::put() {
    int n = --nref;
    if (n == 2) {
      OnodeCacheShard* ocs = c->get_onode_cache();
      std::lock_guard l(ocs->lock);
      bool need_unpin = pinned;
      pinned = pinned && nref > 2; // intentionally use > not >= as we have
                                   // +1 due to pinned state

Since this check plus assignment isn't atomic, it can race with get():

  if (++nref >= 2 && !pinned) {

pinned gets assigned false (because nref <= 2 at that moment) while get() increments nref but reads the old pinned value. Hence we end up with nref > 2 but pinned == false.

Looking for a solution other than brute force, i.e. taking the locks unconditionally...

#4 Updated by Adam Kupczyk 6 months ago

My compiled x86 shows the increment itself is atomic:

   1b068:    48 8b 45 c8              mov    -0x38(%rbp),%rax
   1b06c:    48 89 c7                 mov    %rax,%rdi
   1b06f:    e8 00 00 00 00           callq  1b074 <_ZN9BlueStore5Onode3getEv+0x28>
            1b070: R_X86_64_PLT32    _ZNSt13__atomic_baseIiEppEv-0x4
   1b074:    83 f8 01                 cmp    $0x1,%eax


nref is incremented and eax holds the previous value.
Here >= is translated into a comparison of the previous value against 1.

#5 Updated by Adam Kupczyk 6 months ago

But the compiled x86 definitely shows that the "&&" operator in ++nref >= 2 && !pinned makes the whole action non-atomic.
Maybe this is the problem?

#6 Updated by Neha Ojha 6 months ago

/a/kchai-2021-03-14_09:40:23-rados-wip-kefu-testing-2021-03-14-1612-distro-basic-smithi/5965024
/a/kchai-2021-03-14_09:40:23-rados-wip-kefu-testing-2021-03-14-1612-distro-basic-smithi/5964672

#7 Updated by Igor Fedotov 6 months ago

Hmm... an interesting finding:

  void BlueStore::Collection::split_cache(
    Collection *dest)
  {
    ldout(store->cct, 10) << __func__ << " to " << dest << dendl;

    // lock (one or both) cache shards
    std::lock(cache->lock, dest->cache->lock);
    std::lock_guard l(cache->lock, std::adopt_lock);
    std::lock_guard l2(dest->cache->lock, std::adopt_lock);

It looks like BlueStore::Collection::split_cache doesn't acquire the OnodeCacheShard lock at all!
The locks above belong to the BufferCacheShard, see:

  struct Collection : public CollectionImpl {
    BlueStore *store;
    OpSequencerRef osr;
    BufferCacheShard *cache;  ///< our cache shard

I'm not sure whether this is relevant to the bug, but we should take care of it anyway...

#8 Updated by Mark Nelson 6 months ago

Excellent catch, Igor. I suspect that was introduced when the onode and buffer caches were split.

#9 Updated by Sage Weil 6 months ago

probably related to the locking problem:

2021-03-21T05:50:39.807 INFO:tasks.ceph.osd.7.smithi111.stderr:*** Caught signal (Segmentation fault) **
2021-03-21T05:50:39.807 INFO:tasks.ceph.osd.7.smithi111.stderr: in thread 7ff00cbd3700 thread_name:tp_osd_tp
2021-03-21T05:50:39.809 INFO:tasks.rados.rados.0.smithi097.stdout:1329:  writing smithi09721916-848 from 605349 to 1177177 tid 1
2021-03-21T05:50:39.813 INFO:tasks.ceph.osd.7.smithi111.stderr: ceph version 16.1.0-1042-g6c172970 (6c1729700146219191135171ce9dd590f98cdb16) pacific (rc)
2021-03-21T05:50:39.813 INFO:tasks.ceph.osd.7.smithi111.stderr: 1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x12980) [0x7ff02d95f980]
2021-03-21T05:50:39.813 INFO:tasks.ceph.osd.7.smithi111.stderr: 2: (std::_Hashtable<ghobject_t, std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, mempool::pool_allocator<(mempool::pool_index_t)4, std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> > >, std::__detail::_Select1st, std::equal_to<ghobject_t>, std::hash<ghobject_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::erase(std::__detail::_Node_const_iterator<std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, false, true>)+0x23) [0x558675d0fdb3]
2021-03-21T05:50:39.813 INFO:tasks.ceph.osd.7.smithi111.stderr: 3: (BlueStore::Collection::split_cache(BlueStore::Collection*)+0x1b2) [0x558675c6e8e2]
2021-03-21T05:50:39.813 INFO:tasks.ceph.osd.7.smithi111.stderr: 4: (BlueStore::_merge_collection(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>*, boost::intrusive_ptr<BlueStore::Collection>&, unsigned int)+0x14e) [0x558675cb051e]
2021-03-21T05:50:39.814 INFO:tasks.ceph.osd.7.smithi111.stderr: 5: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0xc7c) [0x558675ce0bac]
2021-03-21T05:50:39.814 INFO:tasks.ceph.osd.7.smithi111.stderr: 6: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x2f2) [0x558675ce3112]
2021-03-21T05:50:39.814 INFO:tasks.ceph.osd.7.smithi111.stderr: 7: (ObjectStore::queue_transaction(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ceph::os::Transaction&&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x80) [0x5586757e8460]
2021-03-21T05:50:39.814 INFO:tasks.ceph.osd.7.smithi111.stderr: 8: (OSD::dispatch_context(PeeringCtx&, PG*, std::shared_ptr<OSDMap const>, ThreadPool::TPHandle*)+0x1db) [0x5586757822bb]
2021-03-21T05:50:39.814 INFO:tasks.ceph.osd.7.smithi111.stderr: 9: (OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&)+0x1e5) [0x5586757b2235]
2021-03-21T05:50:39.814 INFO:tasks.ceph.osd.7.smithi111.stderr: 10: (ceph::osd::scheduler::PGPeeringItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x50) [0x5586759e8e50]
2021-03-21T05:50:39.814 INFO:tasks.ceph.osd.7.smithi111.stderr: 11: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xcd5) [0x5586757a3f35]
2021-03-21T05:50:39.814 INFO:tasks.ceph.osd.7.smithi111.stderr: 12: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0x558675e1ba3c]
2021-03-21T05:50:39.814 INFO:tasks.ceph.osd.7.smithi111.stderr: 13: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x558675e1ecf0]
2021-03-21T05:50:39.814 INFO:tasks.ceph.osd.7.smithi111.stderr: 14: /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7ff02d9546db]
2021-03-21T05:50:39.814 INFO:tasks.ceph.osd.7.smithi111.stderr: 15: clone()

/a/sage-2021-03-20_15:11:51-rados-wip-sage2-testing-2021-03-20-0832-pacific-distro-basic-smithi/5983949

#10 Updated by Igor Fedotov 6 months ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 40237

#11 Updated by Igor Fedotov 6 months ago

  • Backport changed from pacific to pacific, octopus

#12 Updated by Adam Kupczyk 6 months ago

  • Assignee deleted (Igor Fedotov)
  • Severity deleted (3 - minor)

#13 Updated by Adam Kupczyk 6 months ago

  • Assignee set to Igor Fedotov
  • Severity set to 2 - major

#14 Updated by Adam Kupczyk 6 months ago

In

  OnodeCacheShard* ocs = c->get_onode_cache();
  std::lock_guard l(ocs->lock);

split_cache() might have changed the OnodeCacheShard while put() was waiting for the lock.

I think this has much to do with our problem.
Attempting to verify it:
https://github.com/ceph/ceph/pull/40215/commits/6c53a25b9db1697a152b75a86eb75e1801664020

#15 Updated by Neha Ojha 6 months ago

  • Related to Bug #49900: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1, counting from 0) added
