Bug #40468

closed

mon: assert on remote state in Paxos::dispatch can fail

Added by Greg Farnum almost 5 years ago. Updated almost 5 years ago.

Status: Rejected
Priority: Normal
Assignee: -
Category: Correctness/Safety
Target version: -
% Done: 0%
Source:
Tags:
Backport: nautilus, mimic, luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): Monitor
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Paxos::dispatch in Paxos.cc, line 1426 in current master:

ceph_assert(mon->is_leader() || (mon->is_peon() && m->get_source().num() == mon->get_leader()));

In gregf-2019-06-20_03:43:56-rados:monthrash-wip-elector-distro-basic-mira/4049829 the leader freezes for 16 seconds (bad luck in the mon thrasher, I think), a new leader is elected, and the old leader then sends out a Paxos propose that triggers this assert in its former peons. We can probably fix it by checking that the message comes from a recent enough leader (requiring the same election epoch is safe, I think).
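A minimal sketch of the idea above, for illustration only: drop a message whose election epoch does not match the receiver's, instead of asserting on who sent it. All names here (MonState, PaxosMsg, should_dispatch, the election_epoch field on the message) are hypothetical stand-ins and not Ceph's real types or the contents of any pull request; it assumes each Paxos message carries the election epoch its sender was in.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for monitor and message state; not Ceph's real types.
enum class Role { Leader, Peon, Electing };

struct MonState {
  Role role;                     // this monitor's current role
  int leader_rank;               // rank this monitor believes is the leader
  std::uint64_t election_epoch;  // epoch of the last election this monitor saw
};

struct PaxosMsg {
  int source_rank;               // rank of the sending monitor
  std::uint64_t election_epoch;  // assumed: sender's election epoch at send time
};

// Returns true if the message may be dispatched, false if it should be dropped.
bool should_dispatch(const MonState& mon, const PaxosMsg& m) {
  // A message from a different election epoch (for example, from a deposed
  // leader that froze through an election) is dropped rather than asserted on.
  if (m.election_epoch != mon.election_epoch) {
    return false;
  }
  // Within the same epoch, the original condition is expected to hold.
  if (mon.role == Role::Leader) {
    return true;
  }
  return mon.role == Role::Peon && m.source_rank == mon.leader_rank;
}

// Replays the scenario from this report with made-up ranks and epochs.
void demo_stale_leader() {
  MonState peon{Role::Peon, /*leader_rank=*/1, /*election_epoch=*/8};
  PaxosMsg from_old_leader{/*source_rank=*/0, /*election_epoch=*/6};
  PaxosMsg from_new_leader{/*source_rank=*/1, /*election_epoch=*/8};
  assert(!should_dispatch(peon, from_old_leader));  // stale propose is dropped
  assert(should_dispatch(peon, from_new_leader));   // current leader is accepted
}
```

The sketch returns a boolean rather than asserting, so a stale propose is ignored instead of crashing the peon; whether an exact epoch match is the right test in the real monitor is the open question raised above.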

Actions #1

Updated by Greg Farnum almost 5 years ago

  • Status changed from New to Fix Under Review
  • Backport set to nautilus, mimic, luminous
  • Pull request ID set to 28680
Actions #2

Updated by Greg Farnum almost 5 years ago

  • Status changed from Fix Under Review to Rejected

Bug in branch, not upstream.
