Bug #12194
osd crash FAILED assert(!parent->get_log().get_missing().is_missing(soid))
Description
After updating to Hammer 0.94.2 via 0.93 -> 0.94, one OSD crashes with the following assert.
-14> 2015-06-30 23:26:17.970533 7f02d0dd9700 1 -- 10.30.1.128:6805/4022 <== osd.63 10.30.1.125:6815/22138 11 ==== pg_info(1 pgs e28954:12.74) v4 ==== 759+0+0 (2625070229 0 0) 0xfe8d280 con 0xf2ac8e0
-13> 2015-06-30 23:26:17.970549 7f02d0dd9700 5 -- op tracker -- seq: 359, time: 2015-06-30 23:26:17.970359, event: header_read, op: pg_info(1 pgs e28954:12.74)
-12> 2015-06-30 23:26:17.970556 7f02d0dd9700 5 -- op tracker -- seq: 359, time: 2015-06-30 23:26:17.970360, event: throttled, op: pg_info(1 pgs e28954:12.74)
-11> 2015-06-30 23:26:17.970562 7f02d0dd9700 5 -- op tracker -- seq: 359, time: 2015-06-30 23:26:17.970380, event: all_read, op: pg_info(1 pgs e28954:12.74)
-10> 2015-06-30 23:26:17.970622 7f02d0dd9700 5 -- op tracker -- seq: 359, time: 2015-06-30 23:26:17.970548, event: dispatched, op: pg_info(1 pgs e28954:12.74)
-9> 2015-06-30 23:26:17.970636 7f02d0dd9700 5 -- op tracker -- seq: 359, time: 2015-06-30 23:26:17.970636, event: waiting_for_osdmap, op: pg_info(1 pgs e28954:12.74)
-8> 2015-06-30 23:26:17.970650 7f02d0dd9700 5 -- op tracker -- seq: 359, time: 2015-06-30 23:26:17.970650, event: started, op: pg_info(1 pgs e28954:12.74)
-7> 2015-06-30 23:26:17.970673 7f02d0dd9700 5 -- op tracker -- seq: 359, time: 2015-06-30 23:26:17.970673, event: done, op: pg_info(1 pgs e28954:12.74)
-6> 2015-06-30 23:26:17.970696 7f02d0dd9700 1 -- 10.30.1.128:6805/4022 <== osd.58 10.30.1.37:6802/2027655 9 ==== MRecoveryReserve REQUEST pgid: 6.6e, query_epoch: 28954 v2 ==== 26+0+0 (2409593423 0 0) 0x432cb00 con 0xf2af220
-5> 2015-06-30 23:26:17.970719 7f02d0dd9700 5 -- op tracker -- seq: 360, time: 2015-06-30 23:26:17.970049, event: header_read, op: MRecoveryReserve REQUEST pgid: 6.6e, query_epoch: 28954
-4> 2015-06-30 23:26:17.970726 7f02d0dd9700 5 -- op tracker -- seq: 360, time: 2015-06-30 23:26:17.970051, event: throttled, op: MRecoveryReserve REQUEST pgid: 6.6e, query_epoch: 28954
-3> 2015-06-30 23:26:17.970732 7f02d0dd9700 5 -- op tracker -- seq: 360, time: 2015-06-30 23:26:17.970057, event: all_read, op: MRecoveryReserve REQUEST pgid: 6.6e, query_epoch: 28954
-2> 2015-06-30 23:26:17.970737 7f02d0dd9700 5 -- op tracker -- seq: 360, time: 2015-06-30 23:26:17.970716, event: dispatched, op: MRecoveryReserve REQUEST pgid: 6.6e, query_epoch: 28954
-1> 2015-06-30 23:26:17.970744 7f02d0dd9700 5 -- op tracker -- seq: 360, time: 2015-06-30 23:26:17.970743, event: waiting_for_osdmap, op: MRecoveryReserve REQUEST pgid: 6.6e, query_epoch: 28954
0> 2015-06-30 23:26:17.972003 7f02c31c3700 -1 osd/ReplicatedBackend.cc: In function 'void ReplicatedBackend::sub_op_modify_impl(OpRequestRef) [with T = MOSDRepOp, int MSGTYPE = 112]' thread 7f02c31c3700 time 2015-06-30 23:26:17.968440
osd/ReplicatedBackend.cc: 1138: FAILED assert(!parent->get_log().get_missing().is_missing(soid))

ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
1: (void ReplicatedBackend::sub_op_modify_impl<MOSDRepOp, 112>(std::tr1::shared_ptr<OpRequest>)+0x1488) [0x9d7898]
2: (ReplicatedBackend::sub_op_modify(std::tr1::shared_ptr<OpRequest>)+0x56) [0x9bdcb6]
3: (ReplicatedBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x37f) [0x9c0d9f]
4: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x184) [0x873b04]
5: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x178) [0x683318]
6: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x59e) [0x683cae]
7: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x795) [0xafe0f5]
8: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xb019f0]
9: /lib64/libpthread.so.0() [0x37632079d1]
10: (clone()+0x6d) [0x3762ee88fd]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_replay
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
1/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 5 filestore
1/ 3 keyvaluestore
1/ 1 journal
0/ 5 ms
1/ 5 mon
5/20 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 xio
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
max_recent 10000
max_new 1000
log_file /var/log/ceph/ceph-osd.85.log
Each time I launch the OSD it crashes on a different PG, but with the same stack trace. I have attached the full log.
History
#1 Updated by Kefu Chai over 8 years ago
- Description updated (diff)
#2 Updated by Ben Hines over 8 years ago
I continue to encounter this one periodically; it has now happened on a second OSD. btrfs, CentOS 7.1.
#3 Updated by Adolfo Brandes over 8 years ago
I can confirm what looks like this exact same bug. While still on 0.93, the filesystems on the OSDs had to be converted from btrfs to xfs "the hard way" (rsyncing -aX the data out of btrfs and back into recreated xfs OSDs), but everything worked fine. When upgrading to 0.94.2 (Ubuntu 14.04 on kernel 3.19, by the way), I see this exact behavior when restarting OSDs. To fix it, I need to zap and recreate each disk, one node at a time, but the problem "propagates" back to already converted OSDs even if I don't restart a 0.94.2 OSD on old data and simply zap it out of the gate.
The latter is probably due to the 0.93 OSDs elsewhere propagating bogus data.
#4 Updated by Sage Weil over 8 years ago
Does this appear to be related to the 0.93 -> hammer upgrade? I don't think we've seen this with stable users...
#5 Updated by Sage Weil over 8 years ago
- Status changed from New to Need More Info
#6 Updated by Ben Hines over 8 years ago
I'm now on 0.94.3 (and planning to avoid future prereleases ;) -- I have not seen this crash since the last one a month ago.
Fortunately, most of the data in our cluster is deleted and recreated every few weeks, so old data is likely to get purged and repaired.
#7 Updated by Luis Periquito about 8 years ago
I'm seeing the exact same issue in a production cluster that was created as Hammer (0.94.2) and since upgraded to 0.94.3 and 0.94.5.
This cluster has always used an XFS filesystem for the OSDs.
A rack was powered off (power failure). When the nodes came back up, some of the OSDs would crash. One of them crashed repeatedly, so I marked it out. The others I just restarted, and then they kept working. I had hosts with multiple OSDs failing, hosts with only one OSD failing, and hosts without any failures.
The messages are all very similar and look like this:
2016-03-11 14:39:08.393277 7f4ee9681700 -1 osd/ReplicatedBackend.cc: In function 'void ReplicatedBackend::sub_op_modify_impl(OpRequestRef) [with T = MOSDRepOp; int MSGTYPE = 112; OpRequestRef = std::tr1::shared_ptr<OpRequest>]' thread 7f4ee9681700 time 2016-03-11 14:39:08.391222
osd/ReplicatedBackend.cc: 1138: FAILED assert(!parent->get_log().get_missing().is_missing(soid))

ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0xbc60eb]
2: (void ReplicatedBackend::sub_op_modify_impl<MOSDRepOp, 112>(std::tr1::shared_ptr<OpRequest>)+0xe19) [0xa2e0f9]
3: (ReplicatedBackend::sub_op_modify(std::tr1::shared_ptr<OpRequest>)+0x49) [0xa19c09]
4: (ReplicatedBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x38e) [0xa1a03e]
5: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x167) [0x843de7]
6: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3d5) [0x6a1b05]
7: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x331) [0x6a2051]
8: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x875) [0xbb5ab5]
9: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xbb7bd0]
10: (()+0x8182) [0x7f4f09729182]
11: (clone()+0x6d) [0x7f4f07c9447d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
As for the node I had marked out: after the whole cluster recovered and was HEALTH_OK, I just marked it in again; it backfilled and is now working as normal.
The crushmap has a failure domain of rack.
#8 Updated by Sage Weil almost 7 years ago
- Status changed from Need More Info to Won't Fix
Please reopen this ticket if you are still seeing this (on a later hammer version or on jewel+).