Bug #41680

Removed OSDs with outstanding peer failure reports crash the monitor

Added by shuguang wang 7 months ago. Updated 5 months ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: Correctness/Safety
Target version: -
% Done: 0%
Source:
Tags:
Backport: luminous, mimic, nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): Monitor
Pull request ID:
Crash signature:

Description

An OSD was removed from the cluster after it had already reported failure information about a peer OSD. The removed OSD is not deleted from the reporters recorded in the peer's failure_info, so Monitor::check_failure() tries to look it up in the OSDMap and crashes.
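
The failure mode described above can be illustrated with a small, self-contained sketch. This is not the Ceph code and not the fix that was merged; `ToyOSDMap`, `ToyFailureInfo`, `count_live_reporters`, and `prune_removed_reporters` are hypothetical names invented for illustration. It assumes only that reporters are tracked by OSD id and that looking up a removed id in the map is invalid, and it shows two defensive options: skipping removed reporters when counting, or pruning them when the OSD is removed.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the OSDMap: it only tracks which OSD ids exist.
struct ToyOSDMap {
  std::set<int> existing;
  bool exists(int osd) const { return existing.count(osd) != 0; }
};

// Hypothetical stand-in for failure_info: the ids of the OSDs that
// reported one particular peer as failed.
struct ToyFailureInfo {
  std::set<int> reporters;
};

// Option 1: count only reporters that are still present in the map,
// instead of assuming every recorded reporter can be looked up.
inline int count_live_reporters(const ToyOSDMap& osdmap,
                                const ToyFailureInfo& fi) {
  int live = 0;
  for (int reporter : fi.reporters) {
    if (osdmap.exists(reporter)) {
      ++live;
    }
  }
  return live;
}

// Option 2: drop reporters that no longer exist in the map.
// Returns the number of stale reporters removed.
inline int prune_removed_reporters(const ToyOSDMap& osdmap,
                                   ToyFailureInfo& fi) {
  int dropped = 0;
  for (auto it = fi.reporters.begin(); it != fi.reporters.end();) {
    if (!osdmap.exists(*it)) {
      it = fi.reporters.erase(it);  // erase returns the next valid iterator
      ++dropped;
    } else {
      ++it;
    }
  }
  return dropped;
}
```

With OSDs 1 and 2 in the map and reporters {1, 2, 3}, the first helper counts two live reporters and the second drops the stale entry for OSD 3, so no later lookup touches an id that is missing from the map.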


Related issues

Copied to RADOS - Backport #42152: nautilus: Removed OSDs with outstanding peer failure reports crash the monitor Resolved
Copied to RADOS - Backport #42153: luminous: Removed OSDs with outstanding peer failure reports crash the monitor Resolved
Copied to RADOS - Backport #42154: mimic: Removed OSDs with outstanding peer failure reports crash the monitor Resolved

History

#1 Updated by Greg Farnum 7 months ago

  • Status changed from New to Won't Fix

OSD failure reports will die out on their own eventually and there's no general reason to expect a removed OSD was inaccurate in its reports.

#2 Updated by Greg Farnum 6 months ago

  • Subject changed from osd reduced causing dirty data in monitor to Removed OSDs with outstanding peer failure reports crash the monitor
  • Description updated (diff)
  • Category set to Correctness/Safety
  • Status changed from Won't Fix to 17
  • Component(RADOS) Monitor added

#3 Updated by Kefu Chai 6 months ago

  • Status changed from 17 to Pending Backport
  • Backport set to luminous, mimic, nautilus
  • Pull request ID set to 30200

#4 Updated by Nathan Cutler 6 months ago

  • Copied to Backport #42152: nautilus: Removed OSDs with outstanding peer failure reports crash the monitor added

#5 Updated by Nathan Cutler 6 months ago

  • Copied to Backport #42153: luminous: Removed OSDs with outstanding peer failure reports crash the monitor added

#6 Updated by Nathan Cutler 6 months ago

  • Copied to Backport #42154: mimic: Removed OSDs with outstanding peer failure reports crash the monitor added

#7 Updated by Nathan Cutler 5 months ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
