Bug #46914

closed

mon: stuck osd_pgtemp message forwards

Added by Greg Farnum over 3 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Administration/Usability
Target version: -
% Done: 0%
Source:
Tags:
Backport: octopus, nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): Monitor
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

https://bugzilla.redhat.com/show_bug.cgi?id=1866257

We've seen osd_pgtemp messages that are forwarded to the leader never get a reply.


Related issues 2 (0 open, 2 closed)

Copied to RADOS - Backport #47091: octopus: mon: stuck osd_pgtemp message forwards (Resolved, Nathan Cutler)
Copied to RADOS - Backport #47092: nautilus: mon: stuck osd_pgtemp message forwards (Resolved, Neha Ojha)
Actions #1

Updated by Greg Farnum over 3 years ago

Looking through the source, there's one clear way this happens: the leader may decide in preprocess_pgtemp() that a message can be dropped when the peon that forwarded it did not. That will usually involve some kind of race in changing states (e.g., an OSD getting marked down), but it seems possible.

So, just mark the op as no_reply when dropping the message.
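
To make the failure mode concrete, here is a toy model of the forwarding bookkeeping. It is only a sketch and not Ceph code: the names Peon, leader_preprocess, stuck_requests and the send_no_reply flag are all hypothetical, and it assumes the peon keeps an entry for each forwarded message until the leader tells it something about that message.

```python
from dataclasses import dataclass, field


@dataclass
class Peon:
    """Toy peon monitor (hypothetical): tracks forwarded requests by id."""
    routed: dict = field(default_factory=dict)  # tid -> forwarded message
    next_tid: int = 1

    def forward(self, msg):
        # Remember the request until the leader answers or says "no reply".
        tid = self.next_tid
        self.next_tid += 1
        self.routed[tid] = msg
        return tid

    def handle_route(self, tid, reply):
        # reply=None models a "no reply" notice: just forget the request.
        self.routed.pop(tid, None)
        return reply


def leader_preprocess(peon, tid, osd_up, send_no_reply):
    """Toy leader-side check (hypothetical). Returns True if the message is dropped."""
    if not osd_up:
        # The leader's view differs from the peon's (e.g. the OSD was just
        # marked down), so the leader drops the message.
        if send_no_reply:
            # The proposed fix: tell the peon there will be no reply.
            peon.handle_route(tid, None)
        return True
    return False


def stuck_requests(send_no_reply):
    """Count entries the peon is still waiting on after a leader-side drop."""
    peon = Peon()
    tid = peon.forward("osd_pgtemp from osd.3")
    leader_preprocess(peon, tid, osd_up=False, send_no_reply=send_no_reply)
    return len(peon.routed)
```

In this model, stuck_requests(False) leaves one entry waiting on the peon indefinitely, while stuck_requests(True) leaves none, which is the effect the no_reply marking is meant to have.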

Actions #2

Updated by Greg Farnum over 3 years ago

  • Status changed from New to Fix Under Review
  • Backport set to octopus, nautilus
Actions #3

Updated by Kefu Chai over 3 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #4

Updated by Nathan Cutler over 3 years ago

  • Copied to Backport #47091: octopus: mon: stuck osd_pgtemp message forwards added
Actions #5

Updated by Nathan Cutler over 3 years ago

  • Copied to Backport #47092: nautilus: mon: stuck osd_pgtemp message forwards added
Actions #6

Updated by Neha Ojha over 3 years ago

  • Pull request ID set to 36593
Actions #7

Updated by Nathan Cutler over 3 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
