Project

General

Profile

Actions

Bug #40576

closed

src/osd/PrimaryLogPG.cc: 10513: FAILED assert(head_obc)

Added by Kenneth Van Alstyne almost 5 years ago. Updated over 4 years ago.

Status:
Closed
Priority:
Normal
Assignee:
David Zafman
Category:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

It appears that a snapshot containing an inconsistent object was removed at some point, causing ceph-osd to now crash upon startup trying to recover an object that doesn't exist. I'm not sure how to get out of this state, as I've tried placing objects back where ceph-osd expects them, deleting the object with ceph-objectstore-tool, among other things. Logs are below:

   -65> 2019-06-28 02:05:35.605735 7f26841ff700 15 filestore(/var/lib/ceph/osd/ceph-9) getattr(4422): 0.86_head/#0:619c1df5:::rbd_data.c01ee86b8b4567.00000000000078d6:ad30# '_'
   -64> 2019-06-28 02:05:35.605772 7f26841ff700 10 filestore(/var/lib/ceph/osd/ceph-9) getattr(4456): 0.86_head/#0:619c1df5:::rbd_data.c01ee86b8b4567.00000000000078d6:ad30# '_' = 272
   -63> 2019-06-28 02:05:35.605780 7f26841ff700 20 osd.9 pg_epoch: 205161 pg[0.86( v 205161'18470166 (204988'18460094,205161'18470166] lb 0:618b7f49:::rbd_data.bf0de66b8b4567.000000000000c40b:head (bitwise) local-lis/les=204992/204993 n=4563 ec=59/59 lis/c 204992/201096 les/c/f 204993/204442/0 205158/205159/205159) [9,12,16]/[12,16] r=-1 lpr=205159 pi=[201096,205159)/1 luod=0'0 crt=205161'18470166 lcod 205161'18470165 active+remapped]   0:619c1df5:::rbd_data.c01ee86b8b4567.00000000000078d6:ad30 177772'14642335
   -62> 2019-06-28 02:05:35.605793 7f26841ff700 15 filestore(/var/lib/ceph/osd/ceph-9) getattr(4422): 0.86_head/#0:619c1df5:::rbd_data.c01ee86b8b4567.00000000000078d6:b58f# '_'
   -61> 2019-06-28 02:05:35.605831 7f26841ff700 10 filestore(/var/lib/ceph/osd/ceph-9) getattr(4456): 0.86_head/#0:619c1df5:::rbd_data.c01ee86b8b4567.00000000000078d6:b58f# '_' = 272
   -60> 2019-06-28 02:05:35.605842 7f26841ff700 20 osd.9 pg_epoch: 205161 pg[0.86( v 205161'18470166 (204988'18460094,205161'18470166] lb 0:618b7f49:::rbd_data.bf0de66b8b4567.000000000000c40b:head (bitwise) local-lis/les=204992/204993 n=4563 ec=59/59 lis/c 204992/201096 les/c/f 204993/204442/0 205158/205159/205159) [9,12,16]/[12,16] r=-1 lpr=205159 pi=[201096,205159)/1 luod=0'0 crt=205161'18470166 lcod 205161'18470165 active+remapped]   0:619c1df5:::rbd_data.c01ee86b8b4567.00000000000078d6:b58f 194432'17055067
   -59> 2019-06-28 02:05:35.605856 7f26841ff700 15 filestore(/var/lib/ceph/osd/ceph-9) getattr(4422): 0.86_head/#0:619c1df5:::rbd_data.c01ee86b8b4567.00000000000078d6:head# '_'
   -58> 2019-06-28 02:05:35.605896 7f26841ff700 10 filestore(/var/lib/ceph/osd/ceph-9) getattr(4456): 0.86_head/#0:619c1df5:::rbd_data.c01ee86b8b4567.00000000000078d6:head# '_' = 295
   -57> 2019-06-28 02:05:35.605904 7f26841ff700 20 osd.9 pg_epoch: 205161 pg[0.86( v 205161'18470166 (204988'18460094,205161'18470166] lb 0:618b7f49:::rbd_data.bf0de66b8b4567.000000000000c40b:head (bitwise) local-lis/les=204992/204993 n=4563 ec=59/59 lis/c 204992/201096 les/c/f 204993/204442/0 205158/205159/205159) [9,12,16]/[12,16] r=-1 lpr=205159 pi=[201096,205159)/1 luod=0'0 crt=205161'18470166 lcod 205161'18470165 active+remapped]   0:619c1df5:::rbd_data.c01ee86b8b4567.00000000000078d6:head 202012'18086755
   -56> 2019-06-28 02:05:35.605918 7f26841ff700 15 filestore(/var/lib/ceph/osd/ceph-9) getattr(4422): 0.86_head/#0:619c200f:::rbd_data.51b3966b8b4567.000000000000d0b4:head# '_'
   -55> 2019-06-28 02:05:35.605956 7f26841ff700 10 filestore(/var/lib/ceph/osd/ceph-9) getattr(4456): 0.86_head/#0:619c200f:::rbd_data.51b3966b8b4567.000000000000d0b4:head# '_' = 295
   -54> 2019-06-28 02:05:35.605964 7f26841ff700 20 osd.9 pg_epoch: 205161 pg[0.86( v 205161'18470166 (204988'18460094,205161'18470166] lb 0:618b7f49:::rbd_data.bf0de66b8b4567.000000000000c40b:head (bitwise) local-lis/les=204992/204993 n=4563 ec=59/59 lis/c 204992/201096 les/c/f 204993/204442/0 205158/205159/205159) [9,12,16]/[12,16] r=-1 lpr=205159 pi=[201096,205159)/1 luod=0'0 crt=205161'18470166 lcod 205161'18470165 active+remapped]   0:619c200f:::rbd_data.51b3966b8b4567.000000000000d0b4:head 198559'17594081
   -53> 2019-06-28 02:05:35.605977 7f26841ff700 15 filestore(/var/lib/ceph/osd/ceph-9) getattr(4422): 0.86_head/#0:619c39fb:::rbd_data.6e98246b8b4567.0000000000001101:head# '_'
   -52> 2019-06-28 02:05:35.606014 7f26841ff700 10 filestore(/var/lib/ceph/osd/ceph-9) getattr(4456): 0.86_head/#0:619c39fb:::rbd_data.6e98246b8b4567.0000000000001101:head# '_' = 295
   -51> 2019-06-28 02:05:35.606022 7f26841ff700 20 osd.9 pg_epoch: 205161 pg[0.86( v 205161'18470166 (204988'18460094,205161'18470166] lb 0:618b7f49:::rbd_data.bf0de66b8b4567.000000000000c40b:head (bitwise) local-lis/les=204992/204993 n=4563 ec=59/59 lis/c 204992/201096 les/c/f 204993/204442/0 205158/205159/205159) [9,12,16]/[12,16] r=-1 lpr=205159 pi=[201096,205159)/1 luod=0'0 crt=205161'18470166 lcod 205161'18470165 active+remapped]   0:619c39fb:::rbd_data.6e98246b8b4567.0000000000001101:head 64556'6904478
   -50> 2019-06-28 02:05:35.606301 7f26841ff700  1 -- 10.0.0.2:6817/25972 --> 10.0.0.1:6830/2017138 -- pg_scan(digest 0.86 0:618b7f49:::rbd_data.bf0de66b8b4567.000000000000c40b:head-0:619c4ebb:::rbd_data.51a1f46b8b4567.000000000000834d:head e 205161/205161) v2 -- 0x7f267861f100 con 0
   -49> 2019-06-28 02:05:35.606447 7f26841ff700 10 osd.9 205161 dequeue_op 0x7f267f020840 finish
   -48> 2019-06-28 02:05:35.606464 7f26841ff700 20 osd.9 op_wq(4) _process empty q, waiting
   -13> 2019-06-28 02:05:35.633334 7f26841ff700 20 osd.9 op_wq(4) _process 0.86 item PGQueueable(0x7f267f022600 prio 1 cost 1000 e205161) queued
   -12> 2019-06-28 02:05:35.633348 7f26841ff700 20 osd.9 op_wq(4) _process 0.86 item PGQueueable(0x7f267f022600 prio 1 cost 1000 e205161) pg 0x7f26b40ba800
   -11> 2019-06-28 02:05:35.633352 7f26841ff700 10 osd.9 205161 dequeue_op 0x7f267f022600 prio 1 cost 1000 latency 0.000165 MOSDPGPush(0.86 205161/205159 [PushOp(0:618b9825:::rbd_data.554d416b8b4567.000000000000151f:b691, version: 198559'17593798, data_included: [], data_size: 0, omap_header_size: 0, omap_entries_size: 0, attrset_size: 1, recovery_info: ObjectRecoveryInfo(0:618b9825:::rbd_data.554d416b8b4567.000000000000151f:b691@198559'17593798, size: 4194304, copy_subset: [], clone_subset: {}, snapset: 0=[]:[]), after_progress: ObjectRecoveryProgress(!first, data_recovered_to:0, data_complete:true, omap_recovered_to:, omap_complete:true, error:false), before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) v3 pg pg[0.86( v 205161'18470166 (204988'18460094,205161'18470166] lb 0:618b7f49:::rbd_data.bf0de66b8b4567.000000000000c40b:head (bitwise) local-lis/les=204992/204993 n=4563 ec=59/59 lis/c 204992/201096 les/c/f 204993/204442/0 205158/205159/205159) [9,12,16]/[12,16] r=-1 lpr=205159 pi=[201096,205159)/1 luod=0'0 crt=205161'18470166 lcod 205161'18470165 active+remapped]
   -10> 2019-06-28 02:05:35.633384 7f26841ff700 20 osd.9 205161 share_map osd.12 10.0.0.1:6830/2017138 205161
    -9> 2019-06-28 02:05:35.633390 7f26841ff700 20 osd.9 205161 should_share_map osd.12 10.0.0.1:6830/2017138 205161
    -8> 2019-06-28 02:05:35.633399 7f26841ff700 10 osd.9 pg_epoch: 205161 pg[0.86( v 205161'18470166 (204988'18460094,205161'18470166] lb 0:618b7f49:::rbd_data.bf0de66b8b4567.000000000000c40b:head (bitwise) local-lis/les=204992/204993 n=4563 ec=59/59 lis/c 204992/201096 les/c/f 204993/204442/0 205158/205159/205159) [9,12,16]/[12,16] r=-1 lpr=205159 pi=[201096,205159)/1 luod=0'0 crt=205161'18470166 lcod 205161'18470165 active+remapped] _handle_message: 0x7f267f022600
    -7> 2019-06-28 02:05:35.633422 7f26841ff700 10 osd.9 pg_epoch: 205161 pg[0.86( v 205161'18470166 (204988'18460094,205161'18470166] lb 0:618b7f49:::rbd_data.bf0de66b8b4567.000000000000c40b:head (bitwise) local-lis/les=204992/204993 n=4563 ec=59/59 lis/c 204992/201096 les/c/f 204993/204442/0 205158/205159/205159) [9,12,16]/[12,16] r=-1 lpr=205159 pi=[201096,205159)/1 luod=0'0 crt=205161'18470166 lcod 205161'18470165 active+remapped] handle_push ObjectRecoveryInfo(0:618b9825:::rbd_data.554d416b8b4567.000000000000151f:b691@198559'17593798, size: 4194304, copy_subset: [], clone_subset: {}, snapset: 0=[]:[])ObjectRecoveryProgress(!first, data_recovered_to:0, data_complete:true, omap_recovered_to:, omap_complete:true, error:false)
    -6> 2019-06-28 02:05:35.633455 7f26841ff700 10 osd.9 pg_epoch: 205161 pg[0.86( v 205161'18470166 (204988'18460094,205161'18470166] lb 0:618b7f49:::rbd_data.bf0de66b8b4567.000000000000c40b:head (bitwise) local-lis/les=204992/204993 n=4563 ec=59/59 lis/c 204992/201096 les/c/f 204993/204442/0 205158/205159/205159) [9,12,16]/[12,16] r=-1 lpr=205159 pi=[201096,205159)/1 luod=0'0 crt=205161'18470166 lcod 205161'18470165 active+remapped] on_local_recover: 0:618b9825:::rbd_data.554d416b8b4567.000000000000151f:b691
    -5> 2019-06-28 02:05:35.633470 7f26841ff700 20 snap_mapper.remove_oid 0:618b9825:::rbd_data.554d416b8b4567.000000000000151f:b691
    -4> 2019-06-28 02:05:35.633472 7f26841ff700 20 snap_mapper._remove_oid 0:618b9825:::rbd_data.554d416b8b4567.000000000000151f:b691
    -3> 2019-06-28 02:05:35.633483 7f26841ff700 15 filestore(/var/lib/ceph/osd/ceph-9) omap_get_values(5132): meta/#-1:c0371625:::snapmapper:0#
    -2> 2019-06-28 02:05:35.633713 7f26841ff700 15 filestore(/var/lib/ceph/osd/ceph-9) omap_get_values(5158): meta/#-1:c0371625:::snapmapper:0# = 0()
    -1> 2019-06-28 02:05:35.633726 7f26841ff700 20 osd.9 pg_epoch: 205161 pg[0.86( v 205161'18470166 (204988'18460094,205161'18470166] lb 0:618b7f49:::rbd_data.bf0de66b8b4567.000000000000c40b:head (bitwise) local-lis/les=204992/204993 n=4563 ec=59/59 lis/c 204992/201096 les/c/f 204993/204442/0 205158/205159/205159) [9,12,16]/[12,16] r=-1 lpr=205159 pi=[201096,205159)/1 luod=0'0 crt=205161'18470166 lcod 205161'18470165 active+remapped]  snapset 0=[]:[] legacy_snaps []
     0> 2019-06-28 02:05:35.636224 7f26841ff700 -1 /tmp/b-2I_Suq1c6XVP/build/node/root/packages/ceph/workdir-FunpCBst5QV4/src/osd/PrimaryLogPG.cc: In function 'virtual void PrimaryLogPG::on_local_recover(const hobject_t&, const ObjectRecoveryInfo&, ObjectContextRef, bool, ObjectStore::Transaction*)' thread 7f26841ff700 time 2019-06-28 02:05:35.633744
2019-06-28 02:05:35.671180 7f26841ff700 -1 *** Caught signal (Aborted) **
 in thread 7f26841ff700 thread_name:tp_osd_tp
     0> 2019-06-28 02:05:35.671180 7f26841ff700 -1 *** Caught signal (Aborted) **
 in thread 7f26841ff700 thread_name:tp_osd_tp
Actions

Also available in: Atom PDF