Bug #39152

closed

nautilus osd crash: Caught signal (Aborted) tp_osd_tp

Added by Wen Wei about 5 years ago. Updated over 4 years ago.

Status: Duplicate
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The OSD crashed continuously:

-1> 2019-04-08 17:47:06.615 7f3f3ef62700 -1 /build/ceph-14.2.0/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)' thread 7f3f3ef62700 time 2019-04-08 17:47:06.607260
/build/ceph-14.2.0/src/os/bluestore/BlueStore.cc: 11069: abort()
ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)
1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xda) [0x850261]
2: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x296a) [0xe42aaa]
3: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x5e6) [0xe47016]
4: (ObjectStore::queue_transaction(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore::Transaction&&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x7f) [0xa0021f]
5: (PG::_delete_some(ObjectStore::Transaction*)+0x710) [0xa64220]
6: (PG::RecoveryState::Deleting::react(PG::DeleteSome const&)+0x71) [0xa64fe1]
7: (boost::statechart::simple_state<PG::RecoveryState::Deleting, PG::RecoveryState::ToDelete, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x131) [0xaaded1]
8: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x6b) [0xa81a7b]
9: (PG::do_peering_event(std::shared_ptr<PGPeeringEvent>, PG::RecoveryCtx*)+0x122) [0xa71092]
10: (OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&)+0x1b4) [0x9abf74]
11: (OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&)+0xd2) [0x9ac252]
12: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xbed) [0x9a00ad]
13: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0xfc0c1c]
14: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xfc3dd0]
15: (()+0x76ba) [0x7f3f5e4846ba]
16: (clone()+0x6d) [0x7f3f5da8b41d]
0> 2019-04-08 17:47:06.623 7f3f3ef62700 -1 ** Caught signal (Aborted) *
in thread 7f3f3ef62700 thread_name:tp_osd_tp
ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)
1: (()+0x11390) [0x7f3f5e48e390]
2: (gsignal()+0x38) [0x7f3f5d9b9428]
3: (abort()+0x16a) [0x7f3f5d9bb02a]
4: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1a0) [0x850327]
5: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x296a) [0xe42aaa]
6: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x5e6) [0xe47016]
7: (ObjectStore::queue_transaction(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore::Transaction&&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x7f) [0xa0021f]
8: (PG::_delete_some(ObjectStore::Transaction*)+0x710) [0xa64220]
9: (PG::RecoveryState::Deleting::react(PG::DeleteSome const&)+0x71) [0xa64fe1]
10: (boost::statechart::simple_state<PG::RecoveryState::Deleting, PG::RecoveryState::ToDelete, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x131) [0xaaded1]
11: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x6b) [0xa81a7b]
12: (PG::do_peering_event(std::shared_ptr<PGPeeringEvent>, PG::RecoveryCtx*)+0x122) [0xa71092]
13: (OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&)+0x1b4) [0x9abf74]
14: (OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&)+0xd2) [0x9ac252]
15: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xbed) [0x9a00ad]
16: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0xfc0c1c]
17: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xfc3dd0]
18: (()+0x76ba) [0x7f3f5e4846ba]
19: (clone()+0x6d) [0x7f3f5da8b41d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
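
When comparing this trace against the ones in the related issues, it can help to reduce each frame to its function name. The following is a minimal sketch that assumes the frame layout seen in the log above (`N: (symbol+0xOFFSET) [0xADDRESS]`). The names `Frame`, `parse_frames` and `short_name` are hypothetical helpers for illustration and are not part of any Ceph tooling.

```python
import re
from typing import List, NamedTuple


class Frame(NamedTuple):
    index: int      # frame number printed before the colon
    symbol: str     # demangled symbol including its argument list
    offset: int     # offset into the function
    address: int    # return address printed in square brackets


# Assumed layout, taken from the log above: "N: (symbol+0xOFF) [0xADDR]".
# The greedy ".*" makes the match use the last "+0x" before the closing ")".
FRAME_RE = re.compile(
    r"^\s*(\d+):\s+\((.*)\+0x([0-9a-fA-F]+)\)\s+\[0x([0-9a-fA-F]+)\]\s*$"
)


def parse_frames(text: str) -> List[Frame]:
    """Return every backtrace frame found in text; other lines are skipped."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            frames.append(
                Frame(int(m.group(1)), m.group(2),
                      int(m.group(3), 16), int(m.group(4), 16))
            )
    return frames


def short_name(symbol: str) -> str:
    """Cut the argument list off a symbol, ignoring '(' inside <...>.

    This is a heuristic: symbols such as operator< would confuse it.
    """
    depth = 0
    for i, ch in enumerate(symbol):
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        elif ch == "(" and depth == 0:
            # Unnamed frames such as "()" have nothing before the "(".
            return symbol[:i] or symbol
    return symbol
```

Running `parse_frames` over each of the two traces and comparing `[short_name(f.symbol) for f in frames]` shows whether the call chains match once the signal-handler frames are set aside.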

Logs/configs attached

Thanks!


Files

98.config (62.3 KB), Wen Wei, 04/09/2019 01:02 AM
ceph.conf (299 Bytes), Wen Wei, 04/09/2019 01:02 AM
ceph-osd.98.log.zip (375 KB), Wen Wei, 04/09/2019 01:03 AM
ceph-osd.12.log.zip (242 KB), K Jarrett, 05/05/2019 09:05 PM

Related issues: 2 (0 open, 2 closed)

Related to RADOS - Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1, counting from 0) (Resolved)
Related to RADOS - Backport #39693: nautilus: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1, counting from 0) (Resolved, Sage Weil)
