Bug #49525

closed

found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10121515-14:e4 snaps missing in mapper, should be: dc was r -2...repaired

Added by Sage Weil about 3 years ago. Updated 22 days ago.

Status: Resolved
Priority: Normal
Category: Scrub/Repair
Target version: -
% Done: 100%
Source: Development
Tags: backport_processed
Backport: quincy,pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2021-02-26T22:46:33.060+0000 7fd3f7624700 20 snap_mapper.remove_oid 3:4abe9991:::smithi10121515-14:e4
2021-02-26T22:46:33.060+0000 7fd3f7624700 20 snap_mapper._remove_oid 3:4abe9991:::smithi10121515-14:e4
2021-02-26T22:46:33.060+0000 7fd3f7624700 20 snap_mapper.get_snaps 3:4abe9991:::smithi10121515-14:e4 dc
2021-02-26T22:46:33.060+0000 7fd3f7624700 20 snap_mapper.clear_snaps 3:4abe9991:::smithi10121515-14:e4
2021-02-26T22:46:33.060+0000 7fd3f7624700 20 snap_mapper.clear_snaps rm OBJ_.0_0000000000000003.25D79998.e4.smithi10121515-14..
2021-02-26T22:46:33.060+0000 7fd3f7624700 20 snap_mapper._remove_oid rm SNA_3_00000000000000DC_.0_0000000000000003.25D79998.e4.smithi10121515-14..
...
2021-02-26T22:47:08.268+0000 7fd3fb62c700 20 snap_mapper.remove_oid 3:4abe9991:::smithi10121515-14:e4
2021-02-26T22:47:08.268+0000 7fd3fb62c700 20 snap_mapper._remove_oid 3:4abe9991:::smithi10121515-14:e4
2021-02-26T22:47:08.268+0000 7fd3fb62c700 20 snap_mapper.get_snaps 3:4abe9991:::smithi10121515-14:e4 got.empty()
2021-02-26T22:47:08.268+0000 7fd3fb62c700 20 snap_mapper.add_oid 3:4abe9991:::smithi10121515-14:e4 dc
2021-02-26T22:47:08.268+0000 7fd3fb62c700 20 snap_mapper.get_snaps 3:4abe9991:::smithi10121515-14:e4 got.empty()
2021-02-26T22:47:08.268+0000 7fd3fb62c700 20 snap_mapper.set_snaps 3:4abe9991:::smithi10121515-14:e4 dc
2021-02-26T22:47:08.268+0000 7fd3fb62c700 20 snap_mapper.set_snaps set OBJ_.1_0000000000000003.25D79998.e4.smithi10121515-14..
2021-02-26T22:47:08.268+0000 7fd3fb62c700 20 snap_mapper.add_oid set SNA_3_00000000000000DC_.1_0000000000000003.25D79998.e4.smithi10121515-14..
...
2021-02-26T22:50:00.857+0000 7f9f96517700 20 snap_mapper.get_snaps 3:4abe9991:::smithi10121515-14:1c9 1a2,1a4,1a6,1a9,1ac,1af,1b0,1b6,1b9,1be,1c1,1c4,1c5,1c8,1c9
2021-02-26T22:50:00.857+0000 7f9f96517700 20 snap_mapper.get_snaps 3:4abe9991:::smithi10121515-14:19d 18c,191,192,196,199,19a,19d
2021-02-26T22:50:00.857+0000 7f9f96517700 20 snap_mapper.get_snaps 3:4abe9991:::smithi10121515-14:18b 182,186,187,18a,18b
2021-02-26T22:50:00.857+0000 7f9f96517700 20 snap_mapper.get_snaps 3:4abe9991:::smithi10121515-14:17e 16b,17e
2021-02-26T22:50:00.857+0000 7f9f96517700 20 snap_mapper.get_snaps 3:4abe9991:::smithi10121515-14:e4 got.empty()
2021-02-26T22:50:00.857+0000 7f9f96517700 -1 log_channel(cluster) log [ERR] : osd.4 found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10121515-14:e4 snaps missing in mapper, should be: dc was  r -2...repaired

/a/sage-2021-02-26_22:19:00-rados-wip-sage-testing-2021-02-26-1412-distro-basic-smithi/5916958

It looks like the object was removed from the snap mapper twice, and then we expected it to be there.
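To follow the sequence for a single object across a large OSD log, the `snap_mapper` lines can be grouped per oid. The following is a minimal sketch, assuming the debug line layout shown in the excerpt above (timestamp, thread id, debug level, `snap_mapper.<op>`, oid, optional snap list). The function name `mapper_timeline` and the regular expression are illustrative and not part of Ceph.

```python
import re
from collections import defaultdict

# Matches top-level mapper operations only, e.g.
#   <timestamp> <thread> 20 snap_mapper.add_oid <oid> dc
# Internal lines (snap_mapper._remove_oid, "add_oid set SNA_...") are skipped
# because the oid group must contain the ":::" separator.
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<thread>[0-9a-f]+)\s+\d+\s+"
    r"snap_mapper\.(?P<op>remove_oid|add_oid)\s+"
    r"(?P<oid>\S+:::\S+)(?:\s+(?P<snaps>\S+))?$"
)


def mapper_timeline(lines):
    """Group remove_oid/add_oid events by object id, in log order."""
    timeline = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            timeline[m["oid"]].append(
                (m["ts"], m["thread"], m["op"], m["snaps"])
            )
    return dict(timeline)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python mapper_timeline.py < ceph-osd.4.log
    for oid, events in mapper_timeline(sys.stdin).items():
        for ts, thread, op, snaps in events:
            print(oid, ts, thread, op, snaps or "")
```

Run over the excerpt above, this would list two `remove_oid` events and one `add_oid dc` event for `3:4abe9991:::smithi10121515-14:e4`, which makes the ordering easier to compare against the later scrub error.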


Related issues 4 (1 open, 3 closed)

Related to RADOS - Bug #56438: found snap mapper error on pg 3.bs0> oid 3:d81a0fb3:::smithi10749189-30:137 snaps missing in mapper, should be: 132,137 ...repaired (Need More Info)
Has duplicate RADOS - Bug #55794: scrub: scrub is not prevented from started while snap-trimming is in progress (Duplicate, Ronen Friedman)
Copied to RADOS - Backport #55972: quincy: found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10121515-14:e4 snaps missing in mapper, should be: dc was r -2...repaired (Resolved)
Copied to RADOS - Backport #55973: pacific: found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10121515-14:e4 snaps missing in mapper, should be: dc was r -2...repaired (Rejected, Ronen Friedman)

Actions #1

Updated by Neha Ojha about 3 years ago

  • Assignee set to Ronen Friedman

Ronen, can you check whether this is caused by a race between scrub and snap removal?

Actions #2

Updated by Neha Ojha almost 3 years ago

The sequence looks a bit different this time.

/a/rfriedma-2021-06-26_19:32:15-rados-wip-ronenf-scrubs-config-distro-basic-smithi/6194769

2021-06-27T00:35:13.055+0000 7f0219eda700 -1 log_channel(cluster) log [ERR] : osd.2 found snap mapper error on pg 3.0 oid 3:0343d018:::smithi02739718-47:58 snaps missing in mapper, should be: 4f,55,58 was  r -2...repaired
2021-06-27T00:35:13.062+0000 7f0215ed2700 -1 log_channel(cluster) log [ERR] : osd.2 found snap mapper error on pg 3.0 oid 3:0995216d:::smithi02739718-477:76 snaps missing in mapper, should be: 5e,63,67,69,71,75,76 was  r -2...repaired
2021-06-27T00:35:13.079+0000 7f0215ed2700 -1 log_channel(cluster) log [ERR] : osd.2 found snap mapper error on pg 3.0 oid 3:2a0952a2:::smithi02739718-230:6b snaps missing in mapper, should be: 3f,41,46,49,4c,4f,5e,61,63,67,69,6b was  r -2...repaired
2021-06-27T00:35:13.079+0000 7f0215ed2700 -1 log_channel(cluster) log [ERR] : osd.2 found snap mapper error on pg 3.0 oid 3:2457da21:::smithi02739718-4:78 snaps missing in mapper, should be: 41,46,49,4c,4f,5e,63,67,69,71,75,78 was  r -2...repaired

Looking at what happened to "3:0343d018:::smithi02739718-47:58" just before this, it seems the snaps should have been present after recovery:

2021-06-27T00:34:06.965+0000 7f5acbce5700 10 osd.2 pg_epoch: 243 pg[3.0( v 243'710 lc 168'583 (0'0,243'710] local-lis/les=240/241 n=175 ec=19/19 lis/c=240/203 les/c/f=241/204/0 sis=240) [2,1,6]/[0,2] r=1 lpr=240 pi=[203,240)/2 luod=0'0 lua=233'702 crt=243'710 mlcod 30'13 active+remapped m=85 mbc={}] on_local_recover: 3:0343d018:::smithi02739718-47:58
2021-06-27T00:34:06.965+0000 7f5acbce5700 20 snap_mapper.remove_oid 3:0343d018:::smithi02739718-47:58
2021-06-27T00:34:06.965+0000 7f5acbce5700 20 snap_mapper._remove_oid 3:0343d018:::smithi02739718-47:58
2021-06-27T00:34:06.965+0000 7f5acbce5700 20 snap_mapper.get_snaps 3:0343d018:::smithi02739718-47:58 got.empty()
2021-06-27T00:34:06.965+0000 7f5acbce5700 20 snap_mapper.add_oid 3:0343d018:::smithi02739718-47:58 4f,55,58
2021-06-27T00:34:06.965+0000 7f5acbce5700 20 snap_mapper.get_snaps 3:0343d018:::smithi02739718-47:58 got.empty()
2021-06-27T00:34:06.965+0000 7f5acbce5700 20 snap_mapper.set_snaps 3:0343d018:::smithi02739718-47:58 4f,55,58
...
2021-06-27T00:35:13.055+0000 7f0219eda700 20 osd.2 pg_epoch: 309 pg[3.0( v 308'878 (0'0,308'878] local-lis/les=304/305 n=179 ec=19/19 lis/c=304/304 les/c/f=305/305/0 sis=304 pruub=10.067659378s) [2,1,6] r=0 lpr=304 crt=308'878 lcod 308'876 mlcod 308'876 active+clean+scrubbing+snaptrim+wait pruub 27.799087524s@ [ 3.0:  ]  trimq=72]  scrubber pg(3.0) _scan_snaps 3:0343d018:::smithi02739718-47:58
2021-06-27T00:35:13.055+0000 7f0219eda700 20 snap_mapper.get_snaps 3:0343d018:::smithi02739718-47:58 got.empty()
2021-06-27T00:35:13.055+0000 7f0219eda700 -1 log_channel(cluster) log [ERR] : osd.2 found snap mapper error on pg 3.0 oid 3:0343d018:::smithi02739718-47:58 snaps missing in mapper, should be: 4f,55,58 was  r -2...repaired
Actions #3

Updated by Kamoltat (Junior) Sirivadhna almost 3 years ago

Spotted again: /a/ksirivad-2021-07-11_01:45:00-rados-wip-pg-autoscaler-overlap-distro-basic-smithi/6262966/

Actions #4

Updated by Ronen Friedman almost 2 years ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 46440

This is indeed caused by a scrub starting while the PG is being snap-trimmed.
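The overlap is visible in the PG state printed in the `_scan_snaps` log line in comment #2 (`active+clean+scrubbing+snaptrim+wait`). As a rough aid for spotting other occurrences in OSD logs, the sketch below flags lines whose PG state contains both a scrubbing flag and a snaptrim flag. It assumes the state is printed as a `+`-joined token starting with `active`, as in the lines quoted in this ticket; the helper names are hypothetical, and this is a log filter, not the fix in the pull request.

```python
import re

# PG state as printed in OSD debug lines, e.g.
# "active+clean+scrubbing+snaptrim+wait" or "active+remapped".
# Assumption: the state token always starts with "active".
STATE_RE = re.compile(r"\b(active(?:\+[a-z_]+)+)")


def pg_state_flags(log_line):
    """Return the set of '+'-separated PG state flags found in a log line."""
    m = STATE_RE.search(log_line)
    return set(m.group(1).split("+")) if m else set()


def scrub_overlaps_snaptrim(log_line):
    """True if the line shows a PG that is scrubbing while snap-trimming."""
    flags = pg_state_flags(log_line)
    return "scrubbing" in flags and any(f.startswith("snaptrim") for f in flags)


# Excerpts of the two PG state dumps quoted in comment #2.
during_scrub = "lcod 308'876 mlcod 308'876 active+clean+scrubbing+snaptrim+wait pruub 27.799087524s@"
during_recovery = "crt=243'710 mlcod 30'13 active+remapped m=85 mbc={}] on_local_recover"

print(scrub_overlaps_snaptrim(during_scrub))
print(scrub_overlaps_snaptrim(during_recovery))
```

Only the first excerpt carries both flags, which matches the point at which the scrubber reported the missing mapper entries.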

Actions #5

Updated by Neha Ojha almost 2 years ago

  • Has duplicate Bug #55794: scrub: scrub is not prevented from started while snap-trimming is in progress added
Actions #6

Updated by Laura Flores almost 2 years ago

  • Backport set to quincy,pacific
Actions #7

Updated by Neha Ojha almost 2 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #8

Updated by Backport Bot almost 2 years ago

  • Copied to Backport #55972: quincy: found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10121515-14:e4 snaps missing in mapper, should be: dc was r -2...repaired added
Actions #9

Updated by Backport Bot almost 2 years ago

  • Copied to Backport #55973: pacific: found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10121515-14:e4 snaps missing in mapper, should be: dc was r -2...repaired added
Actions #10

Updated by Radoslaw Zarzynski almost 2 years ago

  • Related to Bug #56438: found snap mapper error on pg 3.bs0> oid 3:d81a0fb3:::smithi10749189-30:137 snaps missing in mapper, should be: 132,137 ...repaired added
Actions #12

Updated by Backport Bot over 1 year ago

  • Tags set to backport_processed
Actions #13

Updated by Laura Flores about 1 year ago

  • Tags set to test-failure

/a/yuriw-2023-03-13_19:57:13-rados-wip-yuri6-testing-2023-03-12-0918-pacific-distro-default-smithi/7205944

Actions #14

Updated by Laura Flores 12 months ago

/a/yuriw-2023-04-25_14:15:40-rados-pacific-release-distro-default-smithi/7251426

Actions #15

Updated by Konstantin Shalygin 4 months ago

  • Category set to Scrub/Repair
  • Component(RADOS) OSD added
Actions #16

Updated by Konstantin Shalygin 22 days ago

  • Status changed from Pending Backport to Resolved
  • % Done changed from 0 to 100
  • Source set to Development