Bug #40468

mon: assert on remote state in Paxos::dispatch can fail

Added by Greg Farnum 2 months ago. Updated 2 months ago.

Status: Rejected
Priority: Normal
Assignee: -
Category: Correctness/Safety
Target version: -
Start date: 06/20/2019
Due date:
% Done: 0%
Source:
Tags:
Backport: nautilus, mimic, luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): Monitor
Pull request ID:

Description

Paxos.cc::dispatch, line 1426 in current master:

ceph_assert(mon->is_leader() || (mon->is_peon() && m->get_source().num() == mon->get_leader()));

In test run gregf-2019-06-20_03:43:56-rados:monthrash-wip-elector-distro-basic-mira/4049829, the leader freezes for 16 seconds (I think bad luck in the mon thrasher?), a new leader is elected, and the old leader then sends out a Paxos propose that triggers this assert in its former peons. We can probably fix it by checking that the sending leader is new enough (I think requiring the same election epoch is safe?).
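
A rough, standalone sketch of that idea follows. It is a toy model, not Ceph code: MonView, PaxosMsgView, Verdict, and classify are hypothetical names invented for illustration, and it assumes each Paxos message carries the election epoch its sender was operating in. The point is only that a message from a different election epoch is dropped before the leader/peon invariant is checked, so a propose from a deposed leader no longer reaches the assert.

```cpp
#include <cstdint>

// Toy stand-ins for monitor and message state. These are NOT Ceph types.
struct MonView {
  bool is_leader;
  bool is_peon;
  int leader_rank;               // rank this monitor currently believes is leader
  std::uint64_t election_epoch;  // epoch of the last election this monitor saw
};

struct PaxosMsgView {
  int source_rank;               // rank of the sending monitor
  std::uint64_t election_epoch;  // assumption: epoch stamped on the message by the sender
};

enum class Verdict {
  Process,     // message is consistent with the current quorum
  DropStale,   // message belongs to a different election; ignore it
  Unexpected   // same epoch but invariant broken; this is where an assert would fire
};

// Decide what to do with an incoming Paxos message.
Verdict classify(const MonView& mon, const PaxosMsgView& msg) {
  // Filter on election epoch first, so a propose sent by an old leader
  // after a new election never reaches the invariant check below.
  if (msg.election_epoch != mon.election_epoch) {
    return Verdict::DropStale;
  }
  // Same condition as the assert quoted in the description.
  if (mon.is_leader || (mon.is_peon && msg.source_rank == mon.leader_rank)) {
    return Verdict::Process;
  }
  return Verdict::Unexpected;
}
```

In the scenario described above, a peon that has moved on to a newer election epoch would classify the old leader's propose as DropStale instead of tripping the assert.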

History

#1 Updated by Greg Farnum 2 months ago

  • Status changed from New to Need Review
  • Backport set to nautilus, mimic, luminous
  • Pull request ID set to 28680

#2 Updated by Greg Farnum 2 months ago

  • Status changed from Need Review to Rejected

Bug in branch, not upstream.
