Bug #4040
closed
mon: Single-Paxos: on PGMonitor, FAILED assert(0 == "update_from_paxos: error parsing incremental update")
% Done: 0%
Source: Development
Severity: 3 - minor
Description
INFO:teuthology.task.ceph.mon.f.err:mon/PGMonitor.cc: 182: FAILED assert(0 == "update_from_paxos: error parsing incremental update")
INFO:teuthology.task.ceph.mon.f.err: ceph version 0.56-489-g7bcacc4 (7bcacc45e0fca3460c87ac0800edf44382835685)
INFO:teuthology.task.ceph.mon.f.err: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x95) [0x90d2b9]
INFO:teuthology.task.ceph.mon.f.err: 2: (PGMonitor::update_from_paxos()+0x978) [0x7ccde8]
INFO:teuthology.task.ceph.mon.f.err: 3: (Monitor::_ms_dispatch(Message*)+0x13eb) [0x709049]
INFO:teuthology.task.ceph.mon.f.err: 4: (Monitor::ms_dispatch(Message*)+0x38) [0x71f598]
INFO:teuthology.task.ceph.mon.f.err: 5: (Messenger::ms_deliver_dispatch(Message*)+0x9b) [0x971dd5]
INFO:teuthology.task.ceph.mon.f.err: 6: (DispatchQueue::entry()+0x549) [0x971581]
INFO:teuthology.task.ceph.mon.f.err: 7: (DispatchQueue::DispatchThread::entry()+0x1c) [0x8f8a74]
INFO:teuthology.task.ceph.mon.f.err: 8: (Thread::_entry_func(void*)+0x23) [0x900971]
INFO:teuthology.task.ceph.mon.f.err: 9: (()+0x7e9a) [0x7f1788e7ae9a]
INFO:teuthology.task.ceph.mon.f.err: 10: (clone()+0x6d) [0x7f17876334bd]
This yaml is awesome to flush out issues.
Updated by Joao Eduardo Luis about 11 years ago
Something went wrong when updating the 'last_committed' version on mon.f, which has also fallen 8 versions behind the leader (mon.c), at 489 versus 497:
mon.f@3(peon).pg v487 update_from_paxos applying incremental 488
mon.f@3(peon).pg v487 update_from_paxos: error parsing incremental update: buffer::end_of_buffer
mon/PGMonitor.cc: In function 'virtual void PGMonitor::update_from_paxos()' thread 7f1783b20700 time 2013-02-07 03:48:42.048063
mon/PGMonitor.cc: 182: FAILED assert(0 == "update_from_paxos: error parsing incremental update")
$ tst mon.f/store.db get pgmap 488
(pgmap, 488) does not exist
$ tst mon.c/store.db get pgmap 488
(pgmap, 488)
0000 : 05 05 8c 00 00 00 e8 01 00 00 00 00 00 00 00 00 : ................
0010 : 00 00 02 00 00 00 07 00 00 00 02 02 28 00 00 00 : ............(...
0020 : 00 e8 e1 92 a3 0c 14 00 00 64 c0 d2 7e e9 2d 00 : .........d..~.-.
0030 : 00 84 21 c0 24 23 26 00 00 00 00 00 00 00 00 00 : ..!.$#&.........
0040 : 00 00 00 00 00 00 00 00 0c 00 00 00 02 02 28 00 : ..............(.
0050 : 00 00 00 e8 e1 92 a3 0c 14 00 00 64 c0 d2 7e e9 : ...........d..~.
0060 : 2d 00 00 84 21 c0 24 23 26 00 00 00 00 00 00 00 : -...!.$#&.......
0070 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 : ................
0080 : 00 00 00 00 00 00 33 33 73 3f 9a 99 59 3f 00 00 : ......33s?..Y?..
0090 : 00 00 : ..
$ tst mon.c/store.db get pgmap last_committed
(pgmap, last_committed)
0000 : f1 01 00 00 00 00 00 00 : ........
$ tst mon.f/store.db get pgmap last_committed
(pgmap, last_committed)
0000 : e9 01 00 00 00 00 00 00 : ........
$ python -c 'print "{c} {f}".format(c=0x01f1,f=0x01e9)'
497 489
$ tst mon.f/store.db get pgmap 487
(pgmap, 487)
0000 : 05 05 f0 00 00 00 e7 01 00 00 00 00 00 00 00 00 : ................
0010 : 00 00 04 00 00 00 03 00 00 00 02 02 28 00 00 00 : ............(...
0020 : 00 68 e6 d8 d1 e8 3f 00 00 5c 88 96 6d 72 1d 00 : .h....?..\..mr..
0030 : 00 0c 5e 42 64 76 22 00 00 00 00 00 00 00 00 00 : ..^Bdv".........
0040 : 00 00 00 00 00 00 00 00 05 00 00 00 02 02 28 00 : ..............(.
0050 : 00 00 00 68 e6 d8 d1 e8 3f 00 00 5c 88 96 6d 72 : ...h....?..\..mr
0060 : 1d 00 00 0c 5e 42 64 76 22 00 00 00 00 00 00 00 : ....^Bdv".......
0070 : 00 00 00 00 00 00 00 00 00 00 06 00 00 00 02 02 : ................
0080 : 28 00 00 00 00 68 e6 d8 d1 e8 3f 00 00 5c 88 96 : (....h....?..\..
0090 : 6d 72 1d 00 00 0c 5e 42 64 76 22 00 00 00 00 00 : mr....^Bdv".....
00a0 : 00 00 00 00 00 00 00 00 00 00 00 00 0a 00 00 00 : ................
00b0 : 02 02 28 00 00 00 00 68 e6 d8 d1 e8 3f 00 00 5c : ..(....h....?..\
00c0 : 88 96 6d 72 1d 00 00 0c 5e 42 64 76 22 00 00 00 : ..mr....^Bdv"...
00d0 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 : ................
00e0 : 00 00 00 00 00 00 00 00 00 00 33 33 73 3f 9a 99 : ..........33s?..
00f0 : 59 3f 00 00 00 00 : Y?....
$ tst mon.c/store.db get pgmap 487
(pgmap, 487)
0000 : 05 05 f0 00 00 00 e7 01 00 00 00 00 00 00 00 00 : ................
0010 : 00 00 04 00 00 00 03 00 00 00 02 02 28 00 00 00 : ............(...
0020 : 00 68 e6 d8 d1 e8 3f 00 00 5c 88 96 6d 72 1d 00 : .h....?..\..mr..
0030 : 00 0c 5e 42 64 76 22 00 00 00 00 00 00 00 00 00 : ..^Bdv".........
0040 : 00 00 00 00 00 00 00 00 05 00 00 00 02 02 28 00 : ..............(.
0050 : 00 00 00 68 e6 d8 d1 e8 3f 00 00 5c 88 96 6d 72 : ...h....?..\..mr
0060 : 1d 00 00 0c 5e 42 64 76 22 00 00 00 00 00 00 00 : ....^Bdv".......
0070 : 00 00 00 00 00 00 00 00 00 00 06 00 00 00 02 02 : ................
0080 : 28 00 00 00 00 68 e6 d8 d1 e8 3f 00 00 5c 88 96 : (....h....?..\..
0090 : 6d 72 1d 00 00 0c 5e 42 64 76 22 00 00 00 00 00 : mr....^Bdv".....
00a0 : 00 00 00 00 00 00 00 00 00 00 00 00 0a 00 00 00 : ................
00b0 : 02 02 28 00 00 00 00 68 e6 d8 d1 e8 3f 00 00 5c : ..(....h....?..\
00c0 : 88 96 6d 72 1d 00 00 0c 5e 42 64 76 22 00 00 00 : ..mr....^Bdv"...
00d0 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 : ................
00e0 : 00 00 00 00 00 00 00 00 00 00 33 33 73 3f 9a 99 : ..........33s?..
00f0 : 59 3f 00 00 00 00 : Y?....
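For anyone re-checking the 'last_committed' values dumped above, here is a minimal Python 3 sketch that decodes the 8-byte values directly from the hex output. It assumes the value is stored as a little-endian unsigned 64-bit integer, which is consistent with the 0x01f1 / 0x01e9 reading used in the one-liner; the helper name decode_version is hypothetical and not part of any Ceph tool.

```python
import struct


def decode_version(hex_bytes: str) -> int:
    """Decode a space-separated dump of 8 bytes as a little-endian uint64.

    Assumption: the store keeps 'last_committed' as a little-endian
    unsigned 64-bit integer.
    """
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    (value,) = struct.unpack("<Q", raw)
    return value


# Byte strings copied from the last_committed dumps of each monitor.
leader = decode_version("f1 01 00 00 00 00 00 00")  # mon.c
peon = decode_version("e9 01 00 00 00 00 00 00")    # mon.f

print(leader, peon, leader - peon)  # 497 489 8
```

So mon.f's last_committed (489) is already past the version it failed to read (488), even though it trails the leader by 8 versions.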
Updated by Joao Eduardo Luis about 11 years ago
Also, I suspect this might be causing the same problem described in #4026.
Updated by Joao Eduardo Luis about 11 years ago
Triggered again with the same symptoms; the issue appears to be a skipped version in the store:
From the original crash, on mon.f's store, note that there is no version 488:
pgmap:480 pgmap:481 pgmap:482 pgmap:483 pgmap:484 pgmap:485 pgmap:486 pgmap:487 pgmap:489
In the most recent crash, note that version 608 is missing (the latest version is 609):
pgmap:600 pgmap:601 pgmap:602 pgmap:603 pgmap:604 pgmap:605 pgmap:606 pgmap:607 pgmap:609
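A quick way to spot such holes in a key listing is sketched below in Python 3. It assumes keys have the "pgmap:<version>" form shown in the listings above; the function name missing_versions is hypothetical and only illustrates the check.

```python
def missing_versions(keys, prefix="pgmap:"):
    """Return the versions absent between the lowest and highest key seen.

    Assumption: each key looks like '<prefix><integer version>'.
    """
    versions = sorted(int(k[len(prefix):]) for k in keys if k.startswith(prefix))
    if not versions:
        return []
    present = set(versions)
    return [v for v in range(versions[0], versions[-1] + 1) if v not in present]


# Key listings copied from the two crashes described above.
first_crash = ("pgmap:480 pgmap:481 pgmap:482 pgmap:483 pgmap:484 "
               "pgmap:485 pgmap:486 pgmap:487 pgmap:489").split()
latest_crash = ("pgmap:600 pgmap:601 pgmap:602 pgmap:603 pgmap:604 "
                "pgmap:605 pgmap:606 pgmap:607 pgmap:609").split()

print(missing_versions(first_crash))   # [488]
print(missing_versions(latest_crash))  # [608]
```

In both cases the single missing version is the one immediately before the latest key in the listing.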
Updated by Joao Eduardo Luis about 11 years ago
- Status changed from New to Resolved