Project

General

Profile

Bug #38840

Updated by David Zafman almost 4 years ago


dzafman-2019-03-20_19:53:02-rados-wip-zafman-testing-distro-basic-smithi/3754443

This looks like a cache tier eviction deleting a clone while a simultaneous scrub is happening.

Messages leading to the error with "smithi08716243-128:d8"

<pre>
2019-03-21 09:59:25.959 7f91c26d6700 20 bluestore(/var/lib/ceph/osd/ceph-4) _collection_list oid #3:bbcdb3d6:::smithi08716243-128:d8# end GHMAX
2019-03-21 09:59:25.959 7f91c26d6700 20 bluestore(/var/lib/ceph/osd/ceph-4) _collection_list oid #3:bbcdb3d6:::smithi08716243-128:d8# end #3:bbcdb3d6:::smithi08716243-128:fb#
2019-03-21 09:59:25.959 7f91c26d6700 10 osd.4 pg_epoch: 375 pg[3.1( v 375'2366 (0'0,375'2366] local-lis/les=360/361 n=53 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 crt=375'2366 lcod 375'2365 mlcod 375'2365 active+clean+scrubbing trimq=[107~1,10a~3,10e~2] ps=[da~1,e7~1,ea~1,f1~1,f9~1,fe~1]] be_scan_list (24/25 3:bbcdb3d6:::smithi08716243-128:d8)
2019-03-21 09:59:25.959 7f91c26d6700 10 bluestore(/var/lib/ceph/osd/ceph-4) stat 3.1_head #3:bbcdb3d6:::smithi08716243-128:d8#
2019-03-21 09:59:25.959 7f91c26d6700 15 bluestore(/var/lib/ceph/osd/ceph-4) getattrs 3.1_head #3:bbcdb3d6:::smithi08716243-128:d8#
2019-03-21 09:59:25.959 7f91c26d6700 10 bluestore(/var/lib/ceph/osd/ceph-4) getattrs 3.1_head #3:bbcdb3d6:::smithi08716243-128:d8# = 0
2019-03-21 09:59:25.959 7f91c26d6700 25 osd.4 pg_epoch: 375 pg[3.1( v 375'2366 (0'0,375'2366] local-lis/les=360/361 n=53 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 crt=375'2366 lcod 375'2365 mlcod 375'2365 active+clean+scrubbing trimq=[107~1,10a~3,10e~2] ps=[da~1,e7~1,ea~1,f1~1,f9~1,fe~1]] be_scan_list 3:bbcdb3d6:::smithi08716243-128:d8
2019-03-21 09:59:25.967 7f91be6ce700 10 osd.4 pg_epoch: 375 pg[3.1( v 375'2366 (0'0,375'2366] local-lis/les=360/361 n=53 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 crt=375'2366 lcod 375'2365 mlcod 375'2365 active+clean+scrubbing trimq=[107~1,10a~3,10e~2] ps=[da~1,e7~1,ea~1,f1~1,f9~1,fe~1]] be_select_auth_object: selecting osd 4 for obj 3:bbcdb3d6:::smithi08716243-128:d8 with oi 3:bbcdb3d6:::smithi08716243-128:d8(312'2265 client.4453.0:4316 dirty s 2622057 uv 1627 alloc_hint [0 0 0])
2019-03-21 09:59:25.971 7f91c4edb700 10 osd.4 pg_epoch: 375 pg[3.1( v 375'2367 (0'0,375'2367] local-lis/les=360/361 n=52 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 luod=375'2366 crt=375'2367 lcod 375'2365 mlcod 375'2365 active+clean+scrubbing trimq=[107~1,10a~3,10e~2] ps=[da~1,e7~1,ea~1,f1~1,f9~1,fe~1]] finish_flush 3:bbcdb3d6:::smithi08716243-128:d8 tid 1113 (0) Success
2019-03-21 09:59:25.971 7f91c36d8700 20 osd.4 375 agent_entry oids 3:bbcdb3d6:::smithi08716243-128:d8
2019-03-21 09:59:25.971 7f91c4edb700 10 osd.4 pg_epoch: 375 pg[3.1( v 375'2367 (0'0,375'2367] local-lis/les=360/361 n=52 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 luod=375'2366 crt=375'2367 lcod 375'2365 mlcod 375'2365 active+clean+scrubbing trimq=[107~1,10a~3,10e~2] ps=[da~1,e7~1,ea~1,f1~1,f9~1,fe~1]] agent_maybe_evict evicting 3:bbcdb3d6:::smithi08716243-128:d8(312'2265 client.4453.0:4316 dirty s 2622057 uv 1627 alloc_hint [0 0 0])
2019-03-21 09:59:25.971 7f91c4edb700 20 osd.4 pg_epoch: 375 pg[3.1( v 375'2367 (0'0,375'2367] local-lis/les=360/361 n=52 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 luod=375'2366 crt=375'2367 lcod 375'2365 mlcod 375'2365 active+clean+scrubbing trimq=[107~1,10a~3,10e~2] ps=[da~1,e7~1,ea~1,f1~1,f9~1,fe~1]] simple_opc_create 3:bbcdb3d6:::smithi08716243-128:d8
2019-03-21 09:59:25.971 7f91c4edb700 20 osd.4 pg_epoch: 375 pg[3.1( v 375'2367 (0'0,375'2367] local-lis/les=360/361 n=52 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 luod=375'2366 crt=375'2367 lcod 375'2365 mlcod 375'2365 active+clean+scrubbing trimq=[107~1,10a~3,10e~2] ps=[da~1,e7~1,ea~1,f1~1,f9~1,fe~1]] _delete_oid 3:bbcdb3d6:::smithi08716243-128:d8 whiteout=0 no_whiteout=1 try_no_whiteout=0
2019-03-21 09:59:25.971 7f91c4edb700 20 osd.4 pg_epoch: 375 pg[3.1( v 375'2367 (0'0,375'2367] local-lis/les=360/361 n=52 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 luod=375'2366 crt=375'2367 lcod 375'2365 mlcod 375'2365 active+clean+scrubbing trimq=[107~1,10a~3,10e~2] ps=[da~1,e7~1,ea~1,f1~1,f9~1,fe~1]] finish_ctx 3:bbcdb3d6:::smithi08716243-128:d8 0xfb79600 op delete
2019-03-21 09:59:25.971 7f91c4edb700 7 osd.4 pg_epoch: 375 pg[3.1( v 375'2367 (0'0,375'2367] local-lis/les=360/361 n=52 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 luod=375'2366 crt=375'2367 lcod 375'2365 mlcod 375'2365 active+clean+scrubbing trimq=[107~1,10a~3,10e~2] ps=[da~1,e7~1,ea~1,f1~1,f9~1,fe~1]] issue_repop rep_tid 2009 o 3:bbcdb3d6:::smithi08716243-128:d8
2019-03-21 09:59:25.971 7f91c4edb700 20 osd.4 pg_epoch: 375 pg[3.1( v 375'2367 (0'0,375'2367] local-lis/les=360/361 n=51 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 luod=375'2366 crt=375'2367 lcod 375'2365 mlcod 375'2365 active+clean+scrubbing trimq=[107~1,10a~3,10e~2] ps=[da~1,e7~1,ea~1,f1~1,f9~1,fe~1]] apply_stats 3:bbcdb3d6:::smithi08716243-128:d8 < [3:bbcdb3d6:::smithi08716243-128:fb,MAX)

2019-03-21 09:59:25.971 7f91c4edb700 1 -- [v2:172.21.15.205:6818/13410,v1:172.21.15.205:6819/13410] --> [v2:172.21.15.87:6804/13348,v1:172.21.15.87:6805/13348] -- osd_repop(osd.4.0:2009 3.1 e375/360 3:bbcdb3d6:::smithi08716243-128:d8 v 375'2368) v2 -- 0x11c85000 con 0xf1fe480
2019-03-21 09:59:25.971 7f91c4edb700 20 snap_mapper.remove_oid 3:bbcdb3d6:::smithi08716243-128:d8
2019-03-21 09:59:25.971 7f91c4edb700 20 snap_mapper._remove_oid 3:bbcdb3d6:::smithi08716243-128:d8
2019-03-21 09:59:25.971 7f91c4edb700 20 snap_mapper.get_snaps 3:bbcdb3d6:::smithi08716243-128:d8 ca
2019-03-21 09:59:25.971 7f91c4edb700 20 snap_mapper.clear_snaps 3:bbcdb3d6:::smithi08716243-128:d8
2019-03-21 09:59:25.971 7f91c4edb700 10 osd.4 pg_epoch: 375 pg[3.1( v 375'2367 (0'0,375'2367] local-lis/les=360/361 n=51 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 luod=375'2366 crt=375'2367 lcod 375'2365 mlcod 375'2365 active+clean+scrubbing trimq=[107~1,10a~3,10e~2] ps=[da~1,e7~1,ea~1,f1~1,f9~1,fe~1]] append_log log((0'0,375'2367], crt=375'2367) [375'2368 (312'2265) delete 3:bbcdb3d6:::smithi08716243-128:d8 by osd.4.0:2009 2019-03-21 09:59:25.975993 0]
2019-03-21 09:59:25.971 7f91c4edb700 10 osd.4 pg_epoch: 375 pg[3.1( v 375'2368 (0'0,375'2368] local-lis/les=360/361 n=51 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 luod=375'2366 lua=375'2367 crt=375'2367 lcod 375'2365 mlcod 375'2365 active+clean+scrubbing trimq=[107~1,10a~3,10e~2] ps=[da~1,e7~1,ea~1,f1~1,f9~1,fe~1]] add_log_entry 375'2368 (312'2265) delete 3:bbcdb3d6:::smithi08716243-128:d8 by osd.4.0:2009 2019-03-21 09:59:25.975993 0
2019-03-21 09:59:25.971 7f91c4edb700 20 trim 375'2368 (312'2265) delete 3:bbcdb3d6:::smithi08716243-128:d8 by osd.4.0:2009 2019-03-21 09:59:25.975993 0
2019-03-21 09:59:25.971 7f91c4edb700 20 osd.4 pg_epoch: 375 pg[3.1( v 375'2368 (0'0,375'2368] local-lis/les=360/361 n=51 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 luod=375'2366 lua=375'2367 crt=375'2368 lcod 375'2365 mlcod 375'2365 active+clean+scrubbing trimq=[107~1,10a~3,10e~2] ps=[da~1,e7~1,ea~1,f1~1,f9~1,fe~1]] rollforward: entry=375'2368 (312'2265) delete 3:bbcdb3d6:::smithi08716243-128:d8 by osd.4.0:2009 2019-03-21 09:59:25.975993 0
2019-03-21 09:59:25.971 7f91c4edb700 15 bluestore(/var/lib/ceph/osd/ceph-4) _remove 3.1_head #3:bbcdb3d6:::smithi08716243-128:d8# onode 0x11960000 txc 0xc9c2300
2019-03-21 09:59:25.971 7f91c4edb700 15 bluestore(/var/lib/ceph/osd/ceph-4) _do_truncate 3.1_head #3:bbcdb3d6:::smithi08716243-128:d8# 0x0
2019-03-21 09:59:25.971 7f91c4edb700 10 bluestore(/var/lib/ceph/osd/ceph-4) _remove 3.1_head #3:bbcdb3d6:::smithi08716243-128:d8# = 0
2019-03-21 09:59:25.971 7f91be6ce700 20 osd.4 pg_epoch: 375 pg[3.1( v 375'2368 (0'0,375'2368] local-lis/les=360/361 n=51 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 luod=375'2366 crt=375'2368 lcod 375'2365 mlcod 375'2365 active+clean+scrubbing trimq=[107~1,10a~3,10e~2] ps=[da~1,e7~1,ea~1,f1~1,f9~1,fe~1]] scrub 3:bbcdb3d6:::smithi08716243-128:d8 3:bbcdb3d6:::smithi08716243-128:d8(312'2265 client.4453.0:4316 dirty s 2622057 uv 1627 alloc_hint [0 0 0])
2019-03-21 09:59:25.971 7f91be6ce700 20 osd.4 pg_epoch: 375 pg[3.1( v 375'2368 (0'0,375'2368] local-lis/les=360/361 n=51 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 luod=375'2366 crt=375'2368 lcod 375'2365 mlcod 375'2365 active+clean+scrubbing trimq=[107~1,10a~3,10e~2] ps=[da~1,e7~1,ea~1,f1~1,f9~1,fe~1]] scrub_snapshot_metadata scrub matched clone 3:bbcdb3d6:::smithi08716243-128:d8
2019-03-21 09:59:25.971 7f91be6ce700 20 osd.4 pg_epoch: 375 pg[3.1( v 375'2368 (0'0,375'2368] local-lis/les=360/361 n=51 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 luod=375'2366 crt=375'2368 lcod 375'2365 mlcod 375'2365 active+clean+scrubbing trimq=[107~1,10a~3,10e~2] ps=[da~1,e7~1,ea~1,f1~1,f9~1,fe~1]] _scan_snaps 3:bbcdb3d6:::smithi08716243-128:d8
2019-03-21 09:59:25.971 7f91be6ce700 20 snap_mapper.get_snaps 3:bbcdb3d6:::smithi08716243-128:d8 got.empty()
2019-03-21 09:59:25.971 7f91be6ce700 -1 log_channel(cluster) log [ERR] : osd.4 found snap mapper error on pg 3.1 oid 3:bbcdb3d6:::smithi08716243-128:d8 snaps missing in mapper, should be: ca was r -2...repaired
</pre>

Later the object info is missing and "Can not trim 3:bbcdb3d6:::smithi08716243-128:d8 repair needed" keeps happening.

<pre>
2019-03-21 09:59:25.971 7f91be6ce700 20 snap_mapper.add_oid 3:bbcdb3d6:::smithi08716243-128:d8 ca
2019-03-21 09:59:25.971 7f91be6ce700 20 snap_mapper.get_snaps 3:bbcdb3d6:::smithi08716243-128:d8 got.empty()
2019-03-21 09:59:25.971 7f91be6ce700 20 snap_mapper.set_snaps 3:bbcdb3d6:::smithi08716243-128:d8 ca
2019-03-21 10:00:01.990 7f91c26d6700 10 osd.4 pg_epoch: 411 pg[3.1( v 411'2577 (0'0,411'2577] local-lis/les=360/361 n=51 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 luod=411'2575 crt=411'2577 lcod 411'2574 mlcod 411'2574 active+clean+snaptrim trimq=[ca~1,116~2,11c~1,11f~2,127~2,12f~1,131~1,136~1] ps=[10d~1]] SnapTrimmer state<Trimming/AwaitAsyncWork>: AwaitAsyncWork react trimming 3:bbcdb3d6:::smithi08716243-128:d8
2019-03-21 10:00:01.990 7f91c26d6700 10 osd.4 pg_epoch: 411 pg[3.1( v 411'2577 (0'0,411'2577] local-lis/les=360/361 n=51 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 luod=411'2575 crt=411'2577 lcod 411'2574 mlcod 411'2574 active+clean+snaptrim trimq=[ca~1,116~2,11c~1,11f~2,127~2,12f~1,131~1,136~1] ps=[10d~1]] get_object_context: obc NOT found in cache: 3:bbcdb3d6:::smithi08716243-128:d8
2019-03-21 10:00:01.990 7f91c26d6700 15 bluestore(/var/lib/ceph/osd/ceph-4) getattr 3.1_head #3:bbcdb3d6:::smithi08716243-128:d8# _
2019-03-21 10:00:01.990 7f91c26d6700 10 bluestore(/var/lib/ceph/osd/ceph-4) getattr 3.1_head #3:bbcdb3d6:::smithi08716243-128:d8# _ = -2
2019-03-21 10:00:01.990 7f91c26d6700 10 osd.4 pg_epoch: 411 pg[3.1( v 411'2577 (0'0,411'2577] local-lis/les=360/361 n=51 ec=20/20 lis/c 360/360 les/c/f 361/361/0 360/360/360) [4,0] r=0 lpr=360 luod=411'2575 crt=411'2577 lcod 411'2574 mlcod 411'2574 active+clean+snaptrim trimq=[ca~1,116~2,11c~1,11f~2,127~2,12f~1,131~1,136~1] ps=[10d~1]] get_object_context: no obc for soid 3:bbcdb3d6:::smithi08716243-128:d8 and !can_create
2019-03-21 10:00:01.990 7f91c26d6700 -1 log_channel(cluster) log [ERR] : trim_object: Can not trim 3:bbcdb3d6:::smithi08716243-128:d8 repair needed (no obc)
</pre>

Back