Project

General

Profile

Actions

Bug #42186

closed

"2019-10-04T19:31:51.053283+0000 osd.7 (osd.7) 108 : cluster [ERR] 2.5s0 shard 0(1) 2:a4330255:::smithi02718399-20:c0 : missing" in cluster log

Added by Sage Weil over 4 years ago. Updated over 3 years ago.

Status:
Can't reproduce
Priority:
High
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/sage-2019-10-04_18:20:43-rados-wip-sage-testing-2019-10-04-0923-distro-basic-smithi/4358878


Related issues 1 (0 open1 closed)

Has duplicate RADOS - Bug #42178: scrub errors due to missing objectsDuplicate10/03/2019

Actions
Actions #1

Updated by Sage Weil over 4 years ago

/a/sage-2019-10-06_19:16:50-rados-master-distro-basic-smithi/4364824

"2019-10-06T21:24:06.336207+0000 osd.1 (osd.1) 24 : cluster [ERR] 2.d shard 1 2:b5e6290d:::benchmark_data_smithi095_18505_object14978:head : missing" in cluster log

Actions #2

Updated by Sage Weil over 4 years ago

  • Priority changed from High to Urgent
Actions #3

Updated by Neha Ojha over 4 years ago

  • Has duplicate Bug #42178: scrub errors due to missing objects added
Actions #4

Updated by David Zafman over 4 years ago

From the original description /a/sage-2019-10-04_18:20:43-rados-wip-sage-testing-2019-10-04-0923-distro-basic-smithi/4358878

osd.7 is the primary and it is scrubbing with a trimq. The snapshot c0 is the one that was found missing on osd.0. I don't know why there is a rollforward but


2019-10-04T19:31:51.052+0000 7f6dfa305700 10 osd.0 pg_epoch: 221 pg[2.5s1( v 221'180 (0'0,221'180] local-lis/les=212/213 n=6 ec=20/20 lis/c=212/212 les/c/f=213/213/0 sis=212) [7,0,1]p7(0) r=1 lpr=212 luod=0'0 crt=215'178 lcod 215'178 active mbc={} ps=32] _handle_message: replica_scrub(pg: 2.5s1,from:0'0,to:221'180,epoch:221/212,start:2:a0000000::::0,end:MAX,chunky:1,deep:0,version:9,allow_preemption:1,priority=5) v9
2019-10-04T19:31:51.052+0000 7f6dfa305700  7 osd.0 pg_epoch: 221 pg[2.5s1( v 221'180 (0'0,221'180] local-lis/les=212/213 n=6 ec=20/20 lis/c=212/212 les/c/f=213/213/0 sis=212) [7,0,1]p7(0) r=1 lpr=212 luod=0'0 crt=215'178 lcod 215'178 active mbc={} ps=32] replica_scrub
2019-10-04T19:31:51.052+0000 7f6dfa305700 10 osd.0 pg_epoch: 221 pg[2.5s1( v 221'180 (0'0,221'180] local-lis/les=212/213 n=6 ec=20/20 lis/c=212/212 les/c/f=213/213/0 sis=212) [7,0,1]p7(0) r=1 lpr=212 luod=0'0 crt=215'178 lcod 215'178 active mbc={} ps=32] requeue_scrub: queueing
2019-10-04T19:31:51.052+0000 7f6df62fd700 20 osd.0 pg_epoch: 221 pg[2.5s1( v 221'180 (0'0,221'180] local-lis/les=212/213 n=6 ec=20/20 lis/c=212/212 les/c/f=213/213/0 sis=212) [7,0,1]p7(0) r=1 lpr=212 luod=0'0 crt=215'178 lcod 215'178 active mbc={} ps=32] scrub state BUILD_MAP_REPLICA [2:a0000000::::0,MAX) max_end MAX
2019-10-04T19:31:51.052+0000 7f6df62fd700 10 osd.0 pg_epoch: 221 pg[2.5s1( v 221'180 (0'0,221'180] local-lis/les=212/213 n=6 ec=20/20 lis/c=212/212 les/c/f=213/213/0 sis=212) [7,0,1]p7(0) r=1 lpr=212 luod=0'0 crt=215'178 lcod 215'178 active mbc={} ps=32] build_scrub_map_chunk [2:a0000000::::0,MAX)  pos (0/0)
2019-10-04T19:31:51.052+0000 7f6df62fd700 10 osd.0 pg_epoch: 221 pg[2.5s1( v 221'180 (0'0,221'180] local-lis/les=212/213 n=6 ec=20/20 lis/c=212/212 les/c/f=213/213/0 sis=212) [7,0,1]p7(0) r=1 lpr=212 luod=0'0 crt=215'178 lcod 215'178 active mbc={} ps=32] requeue_scrub: queueing
2019-10-04T19:31:51.052+0000 7f6df62fd700 20 osd.0 pg_epoch: 221 pg[2.5s1( v 221'180 (0'0,221'180] local-lis/les=212/213 n=6 ec=20/20 lis/c=212/212 les/c/f=213/213/0 sis=212) [7,0,1]p7(0) r=1 lpr=212 luod=0'0 crt=215'178 lcod 215'178 active mbc={} ps=32] scrub final state BUILD_MAP_REPLICA [2:a0000000::::0,MAX) max_end MAX
2019-10-04T19:31:51.052+0000 7f6df62fd700 20 osd.0 pg_epoch: 221 pg[2.5s1( v 221'180 (0'0,221'180] local-lis/les=212/213 n=6 ec=20/20 lis/c=212/212 les/c/f=213/213/0 sis=212) [7,0,1]p7(0) r=1 lpr=212 luod=0'0 crt=215'178 lcod 215'178 active mbc={} ps=32] scrub state BUILD_MAP_REPLICA [2:a0000000::::0,MAX) max_end MAX
2019-10-04T19:31:51.052+0000 7f6df62fd700 10 osd.0 pg_epoch: 221 pg[2.5s1( v 221'180 (0'0,221'180] local-lis/les=212/213 n=6 ec=20/20 lis/c=212/212 les/c/f=213/213/0 sis=212) [7,0,1]p7(0) r=1 lpr=212 luod=0'0 crt=215'178 lcod 215'178 active mbc={} ps=32] build_scrub_map_chunk [2:a0000000::::0,MAX)  pos (0/5 2:a13f2802:::smithi02718399-16:9a)

Just showing object smithi02718399-20
2019-10-04T19:31:51.052+0000 7f6df62fd700 20 bluestore(/var/lib/ceph/osd/ceph-0) _collection_list oid 1#2:a4330255:::smithi02718399-20:a6# end 1#MAX#
2019-10-04T19:31:51.052+0000 7f6df62fd700 20 bluestore(/var/lib/ceph/osd/ceph-0) _collection_list oid 1#2:a4330255:::smithi02718399-20:b3# end 1#MAX#
2019-10-04T19:31:51.052+0000 7f6df62fd700 20 bluestore(/var/lib/ceph/osd/ceph-0) _collection_list oid 1#2:a4330255:::smithi02718399-20:head# end 1#MAX#

2019-10-04T19:31:51.052+0000 7f6df62fd700 20 osd.0 pg_epoch: 221 pg[2.5s1( v 221'180 (0'0,221'180] local-lis/les=212/213 n=6 ec=20/20 lis/c=212/212 les/c/f=213/213/0 sis=212) [7,0,1]p7(0) r=1 lpr=212 luod=0'0 crt=215'178 lcod 215'178 active mbc={} ps=32] build_scrub_map_chunk finishing
2019-10-04T19:31:51.053+0000 7f6df62fd700 20 osd.0 pg_epoch: 221 pg[2.5s1( v 221'180 (0'0,221'180] local-lis/les=212/213 n=6 ec=20/20 lis/c=212/212 les/c/f=213/213/0 sis=212) [7,0,1]p7(0) r=1 lpr=212 luod=0'0 crt=215'178 lcod 215'178 active mbc={} ps=32] build_scrub_map_chunk done, got 5 items
2019-10-04T19:31:51.053+0000 7f6df62fd700 20 osd.0 pg_epoch: 221 pg[2.5s1( v 221'180 (0'0,221'180] local-lis/les=212/213 n=6 ec=20/20 lis/c=212/212 les/c/f=213/213/0 sis=212) [7,0,1]p7(0) r=1 lpr=212 luod=0'0 crt=215'178 lcod 215'178 active mbc={} ps=32] scrub final state INACTIVE [MIN,MIN) max_end MIN
2019-10-04T19:31:51.055+0000 7f6df62fd700  7 osd.0 pg_epoch: 221 pg[2.5s1( v 221'180 (0'0,221'180] local-lis/les=212/213 n=6 ec=20/20 lis/c=212/212 les/c/f=213/213/0 sis=212) [7,0,1]p7(0) r=1 lpr=212 luod=0'0 crt=215'178 active mbc={} ps=32] handle_scrub_reserve_release MOSDScrubReserve(2.5s1 RELEASE e221) v1

Primary sees it missing
2019-10-04T19:31:51.053+0000 7f2e00969700  2 osd.7 pg_epoch: 221 pg[2.5s0( v 221'180 (0'0,221'180] local-lis/les=212/213 n=6 ec=20/20 lis/c=212/212 les/c/f=213/213/0 sis=212) [7,0,1]p7(0) r=0 lpr=212 luod=215'178 crt=215'178 lcod 213'176 mlcod 213'176 active+clean+scrubbing trimq=[9c~3,a0~1,a2~3,a6~1,a9~2,ad~2,b3~1,b6~1,b9~1,bf~1,c2~1] ps=32] 2.5s0 shard 0(1) 2:a4330255:::smithi02718399-20:c0 : missing

Replica creates this snapshot here
2019-10-04T19:31:51.056+0000 7f6df62fd700 20 osd.0 pg_epoch: 221 pg[2.5s1( v 221'180 (0'0,221'180] local-lis/les=212/213 n=6 ec=20/20 lis/c=212/212 les/c/f=213/213/0 sis=212) [7,0,1]p7(0) r=1 lpr=212 luod=0'0 crt=221'180 active mbc={} ps=32] rollforward: entry=221'179 (215'178) clone    2:a4330255:::smithi02718399-20:c0 by unknown.0.0:0 2019-10-04T19:31:12.109543+0000 0 snaps [c0,be,b8] ObjectCleanRegions clean_offsets: [0~18446744073709551615], clean_omap: 1, new_object: 0

Snaptrim delete on primary
2019-10-04T19:32:01.079+0000 7f2e04971700 10 osd.7 pg_epoch: 232 pg[2.5s0( v 231'194 (0'0,231'194] local-lis/les=212/213 n=3 ec=20/20 lis/c=212/212 les/c/f=213/213/0 sis=212) [7,0,1]p7(0) r=0 lpr=212 crt=231'194 lcod 231'193 mlcod 231'193 active+clean+inconsistent+snaptrim trimq=[b8~2,be~3,c2~1,cd~1] ps=46] append_log log((0'0,231'194], crt=231'194) [232'195 (221'179) delete   2:a4330255:::smithi02718399-20:c0 by unknown.0.0:0 2019-10-04T19:32:01.079997+0000 0 ObjectCleanRegions clean_offsets: [0~18446744073709551615], clean_omap: 1, new_object: 0,232'196 (230'192) modify   2:a4330255:::smithi02718399-20:head by unknown.0.0:0 2019-10-04T19:32:01.079997+0000 0 ObjectCleanRegions clean_offsets: [0~18446744073709551615], clean_omap: 1, new_object: 0]

Delete on replica
2019-10-04T19:32:01.082+0000 7f6df62fd700 10 osd.0 pg_epoch: 232 pg[2.5s1( v 231'194 (0'0,231'194] local-lis/les=212/213 n=3 ec=20/20 lis/c=212/212 les/c/f=213/213/0 sis=212) [7,0,1]p7(0) r=1 lpr=212 luod=0'0 crt=231'194 active mbc={} ps=46] append_log log((0'0,231'194], crt=231'194) [232'195 (221'179) delete   2:a4330255:::smithi02718399-20:c0 by unknown.0.0:0 2019-10-04T19:32:01.079997+0000 0 ObjectCleanRegions clean_offsets: [0~18446744073709551615], clean_omap: 1, new_object: 0,232'196 (230'192) modify   2:a4330255:::smithi02718399-20:head by unknown.0.0:0 2019-10-04T19:32:01.079997+0000 0 ObjectCleanRegions clean_offsets: [0~18446744073709551615], clean_omap: 1, new_object: 0]
Actions #5

Updated by Neha Ojha over 4 years ago

  • Assignee set to Neha Ojha
Actions #6

Updated by Patrick Donnelly over 4 years ago

  • Status changed from 12 to New
Actions #7

Updated by Sage Weil over 4 years ago

  • Priority changed from Urgent to High
Actions #8

Updated by Neha Ojha over 3 years ago

  • Status changed from New to Can't reproduce
Actions

Also available in: Atom PDF