Bug #8346

closed

OSD crashes on master (FAILED assert(ip_op.waiting_for_commit.count(from)))

Added by John Spray almost 10 years ago. Updated almost 10 years ago.

Status: Can't reproduce
Priority: Urgent
Assignee:
Category: OSD
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Revision bfce3d4, home-built packages on Fedora 20. I wasn't actually trying to test Ceph, but since I saw some crashes, I've opened this ticket and attached logs.

Things started going wrong after I ran a couple of spurious "ceph osd down" and "ceph osd out" commands, although some of the crashes happened a few minutes later, possibly in response to the resulting relocation of PGs.

osd.1:

osd/OSD.cc: 4768: FAILED assert(osd->is_active())

 ceph version 0.80-419-gbfce3d4 (bfce3d4facad93ce6528a9595e9b3feb9d0884ca)
 1: (OSDService::share_map(entity_name_t, Connection*, unsigned int, std::tr1::shared_ptr<OSDMap const>&, unsigned int*)+0x5b3) [0x6335a3]
 2: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x187) [0x633827]
 3: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x203) [0x633f93]
 4: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0xac) [0x6689bc]
 5: (ThreadPool::worker(ThreadPool::WorkThread*)+0xb10) [0xa75d30]
 6: (ThreadPool::WorkThread::entry()+0x10) [0xa76c20]
 7: (()+0x7f33) [0x7f58ad293f33]
 8: (clone()+0x6d) [0x7f58abd24ded]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

osd.3:

osd/ReplicatedBackend.cc: 641: FAILED assert(ip_op.waiting_for_commit.count(from))

 ceph version 0.80-419-gbfce3d4 (bfce3d4facad93ce6528a9595e9b3feb9d0884ca)
 1: (ReplicatedBackend::sub_op_modify_reply(std::tr1::shared_ptr<OpRequest>)+0x5ab) [0x92242b]
 2: (ReplicatedBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x1f6) [0x922736]
 3: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x303) [0x7bb403]
 4: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x475) [0x633b15]
 5: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x203) [0x633f93]
 6: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0xac) [0x6689bc]
 7: (ThreadPool::worker(ThreadPool::WorkThread*)+0xb10) [0xa75d30]
 8: (ThreadPool::WorkThread::entry()+0x10) [0xa76c20]
 9: (()+0x7f33) [0x7f4a1600ef33]
 10: (clone()+0x6d) [0x7f4a14a9fead]

osd.4:

osd/ReplicatedBackend.cc: 641: FAILED assert(ip_op.waiting_for_commit.count(from))

 ceph version 0.80-419-gbfce3d4 (bfce3d4facad93ce6528a9595e9b3feb9d0884ca)
 1: (ReplicatedBackend::sub_op_modify_reply(std::tr1::shared_ptr<OpRequest>)+0x5ab) [0x92242b]
 2: (ReplicatedBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x1f6) [0x922736]
 3: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x303) [0x7bb403]
 4: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x475) [0x633b15]
 5: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x203) [0x633f93]
 6: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0xac) [0x6689bc]
 7: (ThreadPool::worker(ThreadPool::WorkThread*)+0xb10) [0xa75d30]
 8: (ThreadPool::WorkThread::entry()+0x10) [0xa76c20]
 9: (()+0x7f33) [0x7f3e79608f33]
 10: (clone()+0x6d) [0x7f3e78099ded]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. 
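
For readers unfamiliar with the second assertion, here is a minimal sketch of the invariant that `assert(ip_op.waiting_for_commit.count(from))` appears to express: a commit reply is only expected from a replica that is still in the op's waiting set. This is not the Ceph implementation. `InProgressOp`, `handle_commit_reply`, `all_committed`, and the use of plain `int` replica ids are hypothetical names chosen for illustration, and the container type is assumed to be set-like only because `.count(from)` is called on it.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for an in-progress replicated write.
struct InProgressOp {
    // Replica ids whose commit acknowledgement is still outstanding
    // (assumed to be a set; the real type is not shown in the trace).
    std::set<int> waiting_for_commit;
};

// Handles a commit reply from replica `from`.
// Returns false where the code in the trace would hit the failed assert,
// i.e. when the reply comes from a replica that is not being waited on.
bool handle_commit_reply(InProgressOp& ip_op, int from) {
    if (ip_op.waiting_for_commit.count(from) == 0) {
        return false;  // unexpected or duplicate reply
    }
    ip_op.waiting_for_commit.erase(from);  // assumption: a reply clears the entry
    return true;
}

// True once every replica has acknowledged the commit.
bool all_committed(const InProgressOp& ip_op) {
    return ip_op.waiting_for_commit.empty();
}
```

In this sketch, a second reply from the same replica, or a reply from a replica that was never in the set, is the case that would trip the assertion. Whether either of those is what happened on osd.3 and osd.4 is not established by the traces above.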


Files

osd_logs.tar.gz (2.06 MB), John Spray, 05/13/2014 02:50 PM
ceph-osd.15.log (16.9 MB), Sahana Lokeshappa, 06/25/2014 02:27 AM
log_osd_678.gz (4.35 MB), Sahana Lokeshappa, 07/18/2014 05:38 AM
log_osd_345.gz (3.39 MB), Sahana Lokeshappa, 07/18/2014 05:38 AM
log_osd_012.gz (4.62 MB), Sahana Lokeshappa, 07/18/2014 05:38 AM
monlogs.gz (3.2 MB), Sahana Lokeshappa, 07/18/2014 05:38 AM