Bug #12194
osd crash FAILED assert(!parent->get_log().get_missing().is_missing(soid))
Status: Closed
Description
After updating to Hammer 0.94.2 via 0.93 -> 0.94, one OSD crashes with the following assert:
-14> 2015-06-30 23:26:17.970533 7f02d0dd9700 1 -- 10.30.1.128:6805/4022 <== osd.63 10.30.1.125:6815/22138 11 ==== pg_info(1 pgs e28954:12.74) v4 ==== 759+0+0 (2625070229 0 0) 0xfe8d280 con 0xf2ac8e0
-13> 2015-06-30 23:26:17.970549 7f02d0dd9700 5 -- op tracker -- seq: 359, time: 2015-06-30 23:26:17.970359, event: header_read, op: pg_info(1 pgs e28954:12.74)
-12> 2015-06-30 23:26:17.970556 7f02d0dd9700 5 -- op tracker -- seq: 359, time: 2015-06-30 23:26:17.970360, event: throttled, op: pg_info(1 pgs e28954:12.74)
-11> 2015-06-30 23:26:17.970562 7f02d0dd9700 5 -- op tracker -- seq: 359, time: 2015-06-30 23:26:17.970380, event: all_read, op: pg_info(1 pgs e28954:12.74)
-10> 2015-06-30 23:26:17.970622 7f02d0dd9700 5 -- op tracker -- seq: 359, time: 2015-06-30 23:26:17.970548, event: dispatched, op: pg_info(1 pgs e28954:12.74)
-9> 2015-06-30 23:26:17.970636 7f02d0dd9700 5 -- op tracker -- seq: 359, time: 2015-06-30 23:26:17.970636, event: waiting_for_osdmap, op: pg_info(1 pgs e28954:12.74)
-8> 2015-06-30 23:26:17.970650 7f02d0dd9700 5 -- op tracker -- seq: 359, time: 2015-06-30 23:26:17.970650, event: started, op: pg_info(1 pgs e28954:12.74)
-7> 2015-06-30 23:26:17.970673 7f02d0dd9700 5 -- op tracker -- seq: 359, time: 2015-06-30 23:26:17.970673, event: done, op: pg_info(1 pgs e28954:12.74)
-6> 2015-06-30 23:26:17.970696 7f02d0dd9700 1 -- 10.30.1.128:6805/4022 <== osd.58 10.30.1.37:6802/2027655 9 ==== MRecoveryReserve REQUEST pgid: 6.6e, query_epoch: 28954 v2 ==== 26+0+0 (2409593423 0 0) 0x432cb00 con 0xf2af220
-5> 2015-06-30 23:26:17.970719 7f02d0dd9700 5 -- op tracker -- seq: 360, time: 2015-06-30 23:26:17.970049, event: header_read, op: MRecoveryReserve REQUEST pgid: 6.6e, query_epoch: 28954
-4> 2015-06-30 23:26:17.970726 7f02d0dd9700 5 -- op tracker -- seq: 360, time: 2015-06-30 23:26:17.970051, event: throttled, op: MRecoveryReserve REQUEST pgid: 6.6e, query_epoch: 28954
-3> 2015-06-30 23:26:17.970732 7f02d0dd9700 5 -- op tracker -- seq: 360, time: 2015-06-30 23:26:17.970057, event: all_read, op: MRecoveryReserve REQUEST pgid: 6.6e, query_epoch: 28954
-2> 2015-06-30 23:26:17.970737 7f02d0dd9700 5 -- op tracker -- seq: 360, time: 2015-06-30 23:26:17.970716, event: dispatched, op: MRecoveryReserve REQUEST pgid: 6.6e, query_epoch: 28954
-1> 2015-06-30 23:26:17.970744 7f02d0dd9700 5 -- op tracker -- seq: 360, time: 2015-06-30 23:26:17.970743, event: waiting_for_osdmap, op: MRecoveryReserve REQUEST pgid: 6.6e, query_epoch: 28954
0> 2015-06-30 23:26:17.972003 7f02c31c3700 -1 osd/ReplicatedBackend.cc: In function 'void ReplicatedBackend::sub_op_modify_impl(OpRequestRef) [with T = MOSDRepOp, int MSGTYPE = 112]' thread 7f02c31c3700 time 2015-06-30 23:26:17.968440
osd/ReplicatedBackend.cc: 1138: FAILED assert(!parent->get_log().get_missing().is_missing(soid))
ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
 1: (void ReplicatedBackend::sub_op_modify_impl<MOSDRepOp, 112>(std::tr1::shared_ptr<OpRequest>)+0x1488) [0x9d7898]
 2: (ReplicatedBackend::sub_op_modify(std::tr1::shared_ptr<OpRequest>)+0x56) [0x9bdcb6]
 3: (ReplicatedBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x37f) [0x9c0d9f]
 4: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x184) [0x873b04]
 5: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x178) [0x683318]
 6: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x59e) [0x683cae]
 7: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x795) [0xafe0f5]
 8: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xb019f0]
 9: /lib64/libpthread.so.0() [0x37632079d1]
 10: (clone()+0x6d) [0x3762ee88fd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 rbd_replay
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   1/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 5 filestore
   1/ 3 keyvaluestore
   1/ 1 journal
   0/ 5 ms
   1/ 5 mon
   5/20 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
   1/ 5 xio
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent 10000
  max_new 1000
  log_file /var/log/ceph/ceph-osd.85.log
Each time I launch the OSD it crashes on a different PG, but with the same stack. I attached the full log.
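For context, the failing check is in ReplicatedBackend::sub_op_modify_impl: when a replica OSD receives a replicated write (MOSDRepOp) from the primary, it asserts that the target object (soid) is not in its own PG missing set, since a missing object is supposed to be brought back by recovery before any new writes land on it. The crash therefore means the replica's missing set disagrees with what the primary believes the replica has. A minimal standalone sketch of that invariant, with simplified types and hypothetical object names (not the actual Ceph source):

#include <cassert>
#include <set>
#include <string>

// Hypothetical, simplified stand-in for the replica's pg_missing_t.
struct MissingSet {
  std::set<std::string> objects;
  bool is_missing(const std::string& soid) const {
    return objects.count(soid) != 0;
  }
};

// Shape of the check that fires at osd/ReplicatedBackend.cc:1138: a replica
// applying a replicated write asserts the object is not still "missing".
void apply_rep_op(const MissingSet& missing, const std::string& soid) {
  assert(!missing.is_missing(soid));  // the assert reported in this ticket
  // ... apply the transaction to the object store ...
}

int main() {
  MissingSet missing;
  missing.objects.insert("rbd_data.1234");    // an object still being recovered (hypothetical name)
  apply_rep_op(missing, "rbd_data.5678");     // fine: object is not in the missing set
  // apply_rep_op(missing, "rbd_data.1234");  // would trip the assert, as in the log above
  return 0;
}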
Updated by Ben Hines over 8 years ago
I continue to encounter this one periodically; it has now happened on a second OSD. btrfs, CentOS 7.1.
Updated by Adolfo Brandes over 8 years ago
I can confirm what looks like this exact same bug. While still on 0.93, the filesystem on the OSDs had to be converted from btrfs to xfs "the hard way" (rsyncing the data out of btrfs with -aX and back into recreated xfs OSDs), but everything worked OK. After upgrading to 0.94.2 (Ubuntu 14.04 on kernel 3.19, by the way), I see this exact behavior when restarting OSDs. To fix it, I need to zap and recreate each disk, one node at a time, but the problem "propagates" back to already converted OSDs even if I don't restart a 0.94.2 OSD on old data and simply zap it out of the gate.
The latter is probably due to the 0.93 OSDs elsewhere propagating bogus data.
Updated by Sage Weil over 8 years ago
Does this appear to be related to the 0.93 -> hammer upgrade? I don't think we've seen this with stable users...
Updated by Ben Hines over 8 years ago
I'm now on 0.94.3 (and planning to avoid future prereleases ;) -- I have not seen this crash since the last one a month ago.
Fortunately, most of the data in our cluster is deleted and recreated every few weeks, so old data is likely to get purged and repaired.
Updated by Luis Periquito about 8 years ago
I'm seeing the exact same issue in a production cluster that was created on Hammer (0.94.2) and has since been upgraded to 0.94.3 and 0.94.5.
This cluster has always used an XFS filesystem for the OSDs.
A rack was powered off (power failure). When the nodes came back up, some of the OSDs would crash. One of them crashed repeatedly, so I marked that one out; the others I just restarted and they kept working. I had hosts with multiple OSDs failing, hosts with only one OSD failing, and hosts without any failures.
The messages are all very similar and look like this:
2016-03-11 14:39:08.393277 7f4ee9681700 -1 osd/ReplicatedBackend.cc: In function 'void ReplicatedBackend::sub_op_modify_impl(OpRequestRef) [with T = MOSDRepOp; int MSGTYPE = 112; OpRequestRef = std::tr1::shared_ptr<OpRequest>]' thread 7f4ee9681700 time 2016-03-11 14:39:08.391222
osd/ReplicatedBackend.cc: 1138: FAILED assert(!parent->get_log().get_missing().is_missing(soid))
ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0xbc60eb]
 2: (void ReplicatedBackend::sub_op_modify_impl<MOSDRepOp, 112>(std::tr1::shared_ptr<OpRequest>)+0xe19) [0xa2e0f9]
 3: (ReplicatedBackend::sub_op_modify(std::tr1::shared_ptr<OpRequest>)+0x49) [0xa19c09]
 4: (ReplicatedBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x38e) [0xa1a03e]
 5: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x167) [0x843de7]
 6: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3d5) [0x6a1b05]
 7: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x331) [0x6a2051]
 8: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x875) [0xbb5ab5]
 9: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xbb7bd0]
 10: (()+0x8182) [0x7f4f09729182]
 11: (clone()+0x6d) [0x7f4f07c9447d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
As for the node I had set out: after the whole cluster recovered and was HEALTH_OK, I just set it back in, and it backfilled and is now working as normal.
The crushmap has a failure domain of rack.
Updated by Sage Weil about 7 years ago
- Status changed from Need More Info to Won't Fix
Please reopen this ticket if you are still seeing this (on a later hammer version or on jewel+).