Bug #10138
osd: crash in SnapSet::from_snap_set
Status: Closed
Description
We are running ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578) in a 3-server setup with 18 OSDs (4 HDDs and 2 SSDs per server).
We created two pools: 'data' on the HDD OSDs and 'ssd-cache' on the SSD OSDs.
We tried to use the 'ssd-cache' pool as a caching tier for the 'data' pool with the following configuration:
cache-mode writeback, hit_set_type bloom, hit_set_count = 1, hit_set_period = 3600, target_max_bytes = 700 GB
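For reference, a tier with these parameters is typically set up with commands along the following lines (pool names as above; the exact sequence we used may have differed slightly, and target_max_bytes is 700 GB expressed in bytes):
# attach 'ssd-cache' as a cache tier in front of 'data' and route client I/O through it
ceph osd tier add data ssd-cache
ceph osd tier cache-mode ssd-cache writeback
ceph osd tier set-overlay data ssd-cache
# hit-set and sizing parameters as described above
ceph osd pool set ssd-cache hit_set_type bloom
ceph osd pool set ssd-cache hit_set_count 1
ceph osd pool set ssd-cache hit_set_period 3600
ceph osd pool set ssd-cache target_max_bytes 751619276800   # 700 * 1024^3 bytes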
The setup works as expected for a few hours under moderate load (a few test VMs), but then it crashes with the following OSD log:
6> 2014-11-19 13:00:41.552841 7fe7a890e700 5 - op tracker -- seq: 131507156, time: 2014-11-19 13:00:41.552841, event: write_thread_in_journal_buffer, op: osd_sub_op(client.844955.0:352418449 7.2c 7bd64cec/rbd_data.5797d2ae8944a.0000000000064882/head//7 [] v 3771'221148 snapset=0=[]:[] snapc=0=[])
5> 2014-11-19 13:00:41.553071 7fe7a810d700 5 - op tracker -- seq: 131507156, time: 2014-11-19 13:00:41.553070, event: journaled_completion_queued, op: osd_sub_op(client.844955.0:352418449 7.2c 7bd64cec/rbd_data.5797d2ae8944a.0000000000064882/head//7 [] v 3771'221148 snapset=0=[]:[] snapc=0=[])
4> 2014-11-19 13:00:41.553126 7fe7a5908700 5 - op tracker -- seq: 131507156, time: 2014-11-19 13:00:41.553125, event: commit_sent, op: osd_sub_op(client.844955.0:352418449 7.2c 7bd64cec/rbd_data.5797d2ae8944a.0000000000064882/head//7 [] v 3771'221148 snapset=0=[]:[] snapc=0=[])
3> 2014-11-19 13:00:41.553146 7fe7a5908700 1 - 172.16.200.11:6816/26754 --> 172.16.200.12:6858/14093 -- osd_sub_op_reply(client.844955.0:352418449 7.2c 7bd64cec/rbd_data.5797d2ae8944a.0000000000064882/head//7 [] ondisk, result = 0) v2 -- ?+0 0x12fd0580 con 0xf87a420
2> 2014-11-19 13:00:41.553541 7fe7a6109700 5 - op tracker -- seq: 131507156, time: 2014-11-19 13:00:41.553540, event: sub_op_applied, op: osd_sub_op(client.844955.0:352418449 7.2c 7bd64cec/rbd_data.5797d2ae8944a.0000000000064882/head//7 [] v 3771'221148 snapset=0=[]:[] snapc=0=[])
1> 2014-11-19 13:00:41.553565 7fe7a6109700 5 - op tracker -- seq: 131507156, time: 2014-11-19 13:00:41.553564, event: done, op: osd_sub_op(client.844955.0:352418449 7.2c 7bd64cec/rbd_data.5797d2ae8944a.0000000000064882/head//7 [] v 3771'221148 snapset=0=[]:[] snapc=0=[])
0> 2014-11-19 13:00:41.572467 7fe78f602700 -1 *** Caught signal (Aborted) **
in thread 7fe78f602700
ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
1: /usr/bin/ceph-osd() [0xba6e6c]
2: (()+0xf0a0) [0x7fe7b2db40a0]
3: (gsignal()+0x35) [0x7fe7b171d165]
4: (abort()+0x180) [0x7fe7b17203e0]
5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fe7b1f7489d]
6: (()+0x63996) [0x7fe7b1f72996]
7: (()+0x639c3) [0x7fe7b1f729c3]
8: (()+0x63bee) [0x7fe7b1f72bee]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x220) [0xc87260]
10: (SnapSet::from_snap_set(librados::snap_set_t const&)+0x8c4) [0x8a3d54]
11: (ReplicatedPG::finish_promote(int, std::tr1::shared_ptr<OpRequest>, ReplicatedPG::CopyResults*, std::tr1::shared_ptr<ObjectContext>)+0x670) [0x9796c0]
12: (PromoteCallback::finish(boost::tuples::tuple<int, ReplicatedPG::CopyResults*, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>)+0x75) [0x9d3305]
13: (GenContext<boost::tuples::tuple<int, ReplicatedPG::CopyResults*, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type> >::complete(boost::tuples::tuple<int, ReplicatedPG::CopyResults*, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>)+0x15) [0x9a1d35]
14: (ReplicatedPG::process_copy_chunk(hobject_t, unsigned long, int)+0x451) [0x98bd31]
15: (C_Copyfrom::finish(int)+0x86) [0x9d3f16]
16: (Context::complete(int)+0x9) [0x7cd499]
17: (Finisher::finisher_thread_entry()+0x148) [0xbc7238]
18: (()+0x6b50) [0x7fe7b2dabb50]
19: (clone()+0x6d) [0x7fe7b17c77bd]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
The crashing OSDs are SSD OSDs, one per server (the 'ssd-cache' pool is currently configured with 3 replicas, one per server).
Restarting the affected OSDs causes them to crash again almost immediately after startup.
Changing the cache mode to forward fixes the problem.
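For completeness, and assuming the pool names above, the mode change was of the form:
ceph osd tier cache-mode ssd-cache forward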
Thanks for your help!