Bug #41317

PeeringState::GoClean will call purge_strays unconditionally

Added by Samuel Just 11 months ago. Updated 4 months ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport: mimic,nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature:

Description

http://pulpito.ceph.com/kchai-2019-08-15_06:56:12-rados-wip-kefu-testing-2019-08-15-1125-distro-basic-mira/4216150/

The primary:

2019-08-15T10:18:10.462+0000 7f53c3afe700 10 osd.1 pg_epoch: 782 pg[2.2as0( v 657'572 (0'0,657'572] local-lis/les=778/779 n=16 ec=637/22 lis/c=778/711 les/c/f=779/712/0 sis=778) [1,2147483647,2]p1(0) r=0 lpr=778 pi=[711,778)/1 crt=657'572 mlcod 0'0 active+undersized+degraded mbc={0={},1={},2={}} ps=102] recovery done, no backfill
...
2019-08-15T10:18:10.462+0000 7f53c3afe700  5 osd.1 pg_epoch: 782 pg[2.2as0( v 657'572 (0'0,657'572] local-lis/les=778/779 n=16 ec=637/22 lis/c=778/711 les/c/f=779/712/0 sis=778) [1,2147483647,2]p1(0) r=0 lpr=778 pi=[711,778)/1 crt=657'572 mlcod 0'0 active+undersized+degraded mbc={0={},1={},2={}} ps=102] enter Started/Primary/Active/Clean
...
2019-08-15T10:18:10.462+0000 7f53c3afe700 10 osd.1 pg_epoch: 782 pg[2.2as0( v 657'572 (0'0,657'572] local-lis/les=778/779 n=16 ec=637/22 lis/c=778/711 les/c/f=779/712/0 sis=778) [1,2147483647,2]p1(0) r=0 lpr=778 pi=[711,778)/1 crt=657'572 mlcod 0'0 active+undersized+degraded mbc={0={},1={},2={}} ps=102] _finish_recovery
2019-08-15T10:18:10.462+0000 7f53c3afe700 10 osd.1 pg_epoch: 782 pg[2.2as0( v 657'572 (0'0,657'572] local-lis/les=778/779 n=16 ec=637/22 lis/c=778/711 les/c/f=779/712/0 sis=778) [1,2147483647,2]p1(0) r=0 lpr=778 pi=[711,778)/1 crt=657'572 mlcod 0'0 active+undersized+degraded mbc={0={},1={},2={}} ps=102] purge_strays 2(0),3(2)
2019-08-15T10:18:10.462+0000 7f53c3afe700 10 osd.1 pg_epoch: 782 pg[2.2as0( v 657'572 (0'0,657'572] local-lis/les=778/779 n=16 ec=637/22 lis/c=778/711 les/c/f=779/712/0 sis=778) [1,2147483647,2]p1(0) r=0 lpr=778 pi=[711,778)/1 crt=657'572 mlcod 0'0 active+undersized+degraded mbc={0={},1={},2={}} ps=102] sending PGRemove to osd.2(0)
...
2019-08-15T10:18:36.369+0000 7f53c3afe700 10 osd.1 pg_epoch: 799 pg[2.2as0( v 657'572 (0'0,657'572] local-lis/les=778/779 n=16 ec=637/22 lis/c=778/711 les/c/f=779/712/0 sis=778) [1,2147483647,2]p1(0) r=0 lpr=778 pi=[711,778)/1 crt=657'572 mlcod 0'0 active+undersized+degraded mbc={0={},1={},2={}} ps=102] discover_all_missing: osd.2(0): requesting pg_missing_t

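The primary enters Clean while the PG is still active+undersized+degraded, and purge_strays sends PGRemove to osd.2(0). The PGRemove message causes osd.2 to transition to ToDelete. The subsequent pg_missing_t query from discover_all_missing (epoch 799) reaches osd.2 while it is in that state and causes it to crash.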
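
The sequence above can be sketched with a small toy model. This is only an illustration of the ordering problem: the types ToyPeer and ToyPrimary, the purged set, and the skip_purged flag are all hypothetical names invented for this sketch. They are not Ceph's PeeringState classes, and the guard shown is one possible approach, not necessarily what the eventual fix does.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy model only: hypothetical types, not Ceph's PeeringState code.
enum class PeerState { Stray, ToDelete };

struct ToyPeer {
    int osd;
    PeerState state;
};

struct ToyPrimary {
    std::vector<ToyPeer> strays;
    std::set<int> purged;  // OSDs that have already been sent a removal

    // Models the purge step: every stray is told to delete its copy,
    // regardless of whether the PG is actually clean.
    void purge_strays() {
        for (auto& p : strays) {
            p.state = PeerState::ToDelete;
            purged.insert(p.osd);
        }
    }

    // Models a later discovery pass and returns the OSDs that would be
    // queried. With skip_purged == false, a peer that is already deleting
    // is still queried, which mirrors the sequence in the log.
    std::vector<int> discover_targets(bool skip_purged) const {
        std::vector<int> out;
        for (const auto& p : strays) {
            if (skip_purged && purged.count(p.osd)) {
                continue;  // assumed guard: do not query a purged peer
            }
            out.push_back(p.osd);
        }
        return out;
    }
};
```

In this model, calling purge_strays() and then discover_targets(false) still returns the purged OSDs, while discover_targets(true) returns none of them.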


Related issues

Related to RADOS - Bug #40963: mimic: MQuery during Deleting state (Resolved)
Copied to RADOS - Backport #43319: nautilus: PeeringState::GoClean will call purge_strays unconditionally (Resolved)
Copied to RADOS - Backport #43320: mimic: PeeringState::GoClean will call purge_strays unconditionally (Resolved)

History

#1 Updated by Samuel Just 11 months ago

  • Assignee set to Samuel Just

#2 Updated by Patrick Donnelly 11 months ago

  • Project changed from Ceph to RADOS
  • Target version set to v15.0.0
  • Start date deleted (08/16/2019)
  • Source set to Q/A

#3 Updated by Neha Ojha 7 months ago

  • Related to Bug #40963: mimic: MQuery during Deleting state added

#4 Updated by Neha Ojha 7 months ago

  • Assignee changed from Samuel Just to Neha Ojha
  • Backport changed from nautilus,others(?) to mimic,nautilus
  • Pull request ID set to 32195

#5 Updated by Neha Ojha 7 months ago

  • Status changed from New to Fix Under Review

#6 Updated by Sage Weil 7 months ago

  • Status changed from Fix Under Review to Pending Backport

#7 Updated by Nathan Cutler 7 months ago

  • Copied to Backport #43319: nautilus: PeeringState::GoClean will call purge_strays unconditionally added

#8 Updated by Nathan Cutler 7 months ago

  • Copied to Backport #43320: mimic: PeeringState::GoClean will call purge_strays unconditionally added

#9 Updated by Nathan Cutler 4 months ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
