Project

General

Profile

Actions

Bug #20923

closed

ceph-12.1.1/src/os/bluestore/BlueStore.cc: 2630: FAILED assert(last >= start)

Added by Martin Millnert over 6 years ago. Updated over 6 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
Backfill/Recovery
Target version:
% Done:

0%

Source:
Community (user)
Tags:
Backport:
luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Running 12.1.1 RC1 OSD:s, currently doing inline migration to BlueStore (ceph osd destroy procedure). Getting these assert fails quite often now.

Single-host cluster: 1 MON, 1 MGR, 5 OSD (4 FileStore, 1 BlueStore).

     0> 2017-08-06 11:28:50.154046 7f81daa56700 -1 /build/ceph-12.1.1/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::ExtentMap::fault_range(KeyValueDB*, uint32_
t, uint32_t)' thread 7f81daa56700 time 2017-08-06 11:28:50.142943
/build/ceph-12.1.1/src/os/bluestore/BlueStore.cc: 2630: FAILED assert(last >= start)

 ceph version 12.1.1 (f3e663a190bf2ed12c7e3cda288b9a159572c800) luminous (rc)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x563a7c847e32]
 2: (BlueStore::ExtentMap::fault_range(KeyValueDB*, unsigned int, unsigned int)+0x911) [0x563a7c6bd2c1]
 3: (BlueStore::_do_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, unsigned long, ceph:
:buffer::list&, unsigned int)+0x23b) [0x563a7c70508b]
 4: (BlueStore::_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&, unsigned long, unsigned long, ceph::b
uffer::list&, unsigned int)+0x102) [0x563a7c705ec2]
 5: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x1685) [0x563a7c7148a5]
 6: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp
>, ThreadPool::TPHandle*)+0x378) [0x563a7c715a48]
 7: (PrimaryLogPG::queue_transaction(ObjectStore::Transaction&&, boost::intrusive_ptr<OpRequest>)+0xa7) [0x563a7c47cb17]
 8: (ECBackend::dispatch_recovery_messages(RecoveryMessages&, int)+0x69a) [0x563a7c593a9a]
 9: (ECBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x255) [0x563a7c5a28f5]
 10: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x5e7) [0x563a7c417667]
 11: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f7) [0x563a7c2b7b07]
 12: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x563a7c5120a7]
 13: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x108c) [0x563a7c2e334c]
 14: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x93d) [0x563a7c84dadd]
 15: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x563a7c84fd00]
 16: (()+0x7494) [0x7f81f5e34494]
 17: (clone()+0x3f) [0x7f81f4ebbaff]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.


Following this some other threads dumps as well from abort signals, but I don't include them here.
It happens during backfill.

Actions

Also available in: Atom PDF