Bug #4879
mon/Paxos.cc: 557: FAILED assert(begin->last_committed == last_committed)
% Done: 0%
Source: Q/A
Regression: No
Severity: 3 - minor
Description
0> 2013-04-30 20:36:23.023960 7ffe2d304700 -1 mon/Paxos.cc: In function 'void Paxos::handle_begin(MMonPaxos*)' thread 7ffe2d304700 time 2013-04-30 20:36:23.022752
mon/Paxos.cc: 557: FAILED assert(begin->last_committed == last_committed)
 ceph version 0.60-773-g8828e9f (8828e9f9b4f96c6cf26cdce64be14db540fb00cf)
 1: (Paxos::handle_begin(MMonPaxos*)+0xaf0) [0x4ea460]
 2: (Paxos::dispatch(PaxosServiceMessage*)+0x25b) [0x4eb4fb]
 3: (Monitor::_ms_dispatch(Message*)+0x10ac) [0x4c39dc]
 4: (Monitor::ms_dispatch(Message*)+0x32) [0x4da702]
 5: (DispatchQueue::entry()+0x3f1) [0x6afeb1]
 6: (DispatchQueue::DispatchThread::entry()+0xd) [0x63e57d]
 7: (()+0x7e9a) [0x7ffe32067e9a]
 8: (clone()+0x6d) [0x7ffe30617ccd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
job
ubuntu@teuthology:/a/sage-2013-04-30_20:06:38-rados-wip-mds-testing-basic/4131$ cat orig.config.yaml
kernel:
  kdb: true
  sha1: 78faa055b2cab19860e4b53867f6951612d63494
machine_type: plana
nuke-on-error: true
overrides:
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
      mds:
        debug mds: 20
        debug ms: 20
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
    log-whitelist:
    - slow request
    sha1: 8828e9f9b4f96c6cf26cdce64be14db540fb00cf
  s3tests:
    branch: next
  workunit:
    sha1: 8828e9f9b4f96c6cf26cdce64be14db540fb00cf
roles:
- - mon.a
  - mon.d
  - mon.g
  - osd.0
- - mon.b
  - mon.e
  - mon.h
  - mds.a
- - mon.c
  - mon.f
  - mon.i
  - osd.1
tasks:
- chef: null
- clock.check: null
- install: null
- ceph: null
- mon_recovery: null
Associated revisions
mon/Paxos: update first_committed when we trim
The Paxos::trim() -> ::trim_to() path trims old states but does not
update first_committed. This misinforms later paxos rounds such that
peers think they can participate and end up with COMMIT messages
following the COLLECT/LAST exchange that are for future commits they
can't do anything with and then crash out when they get the BEGIN:
mon/Paxos.cc: 557: FAILED assert(begin->last_committed == last_committed)
Fixes: #4879
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
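The effect described in the commit message can be illustrated with a toy model. The sketch below is not the Ceph code: ToyPaxos, make_trimmed_leader, and peer_can_catch_up are hypothetical names, and the catch-up check is an assumed simplification of the COLLECT/LAST exchange. It only shows how trimming without updating first_committed leaves a monitor advertising versions it no longer stores, so a lagging peer appears able to participate when it is not.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

using version_t = uint64_t;

// Toy stand-in for one monitor's Paxos state (hypothetical, not Ceph's class).
struct ToyPaxos {
  std::map<version_t, std::string> store;  // committed values keyed by version
  version_t first_committed = 0;
  version_t last_committed = 0;

  void commit(const std::string& value) {
    store[++last_committed] = value;
    if (first_committed == 0) first_committed = last_committed;
  }

  // Erase every version below `first`. With update_first == false this mimics
  // the reported bug: states are removed but first_committed goes stale.
  void trim_to(version_t first, bool update_first) {
    store.erase(store.begin(), store.lower_bound(first));
    if (update_first && first > first_committed) first_committed = first;
  }

  // True if every version in [first_committed, last_committed] is still stored.
  bool advertised_range_is_backed() const {
    for (version_t v = first_committed; v <= last_committed; ++v)
      if (!store.count(v)) return false;
    return true;
  }
};

// Assumed simplification: a peer can be caught up incrementally only if the
// leader still advertises the version right after the peer's last_committed.
bool peer_can_catch_up(const ToyPaxos& leader, version_t peer_last_committed) {
  return peer_last_committed + 1 >= leader.first_committed;
}

// Build a leader with versions 1..10 committed, then trim everything below 6.
ToyPaxos make_trimmed_leader(bool update_first) {
  ToyPaxos p;
  for (int i = 0; i < 10; ++i) p.commit("value");
  p.trim_to(6, update_first);
  return p;
}
```

With make_trimmed_leader(false), first_committed stays at 1 even though versions 1 through 5 are gone, so a peer at last_committed 3 passes the check although the commits it needs cannot be supplied. With make_trimmed_leader(true), first_committed becomes 6 and the same peer is correctly rejected from incremental catch-up.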
History
#1 Updated by Sage Weil almost 11 years ago
- Status changed from New to Fix Under Review
- Assignee set to Greg Farnum
wip-4879
#2 Updated by Sage Weil almost 11 years ago
- Status changed from Fix Under Review to Resolved