Bug #1909

closed

Two mons crash after starting the third one

Added by Maciej Galkiewicz over 12 years ago. Updated about 12 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: Monitor
Target version: -
% Done: 0%


Description

I had three mons. One of them was reinstalled without first being removed from the cluster. Now, after starting the reinstalled mon, the other two crash with the following error:

2012-01-09 16:52:32.251857 7f28a7231700 -- 1.1.1.1:6789/0 >> 2.2.2.2:6800/0 pipe(0x2709780 sd=41 pgs=1 cs=1 l=0).fault with nothing to send, going to standby
2012-01-09 16:52:37.276071 7f28a8839700 log [INF] : mon.n4c1 calling new monitor election
mon/MonMap.h: In function 'entity_inst_t MonMap::get_inst(unsigned int)', in thread '7f28a8839700'
mon/MonMap.h: 162: FAILED assert(m < rank_addr.size())
 ceph version 0.39-195-ge18b1c9 (commit:e18b1c9734e88e3b779ba2d70cdd54f8fb94743d)
 1: (Elector::defer(int)+0x29a) [0x5050aa]
 2: (Elector::handle_propose(MMonElection*)+0x30b) [0x5053eb]
 3: (Elector::dispatch(Message*)+0x7cb) [0x506d8b]
 4: (Monitor::_ms_dispatch(Message*)+0xcf4) [0x47e7f4]
 5: (Monitor::ms_dispatch(Message*)+0x90) [0x48c720]
 6: (SimpleMessenger::dispatch_entry()+0x869) [0x582ef9]
 7: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x4664bc]
 8: (()+0x68ba) [0x7f28ac2c98ba]
 9: (clone()+0x6d) [0x7f28aab2502d]
 ceph version 0.39-195-ge18b1c9 (commit:e18b1c9734e88e3b779ba2d70cdd54f8fb94743d)
 1: (Elector::defer(int)+0x29a) [0x5050aa]
 2: (Elector::handle_propose(MMonElection*)+0x30b) [0x5053eb]
 3: (Elector::dispatch(Message*)+0x7cb) [0x506d8b]
 4: (Monitor::_ms_dispatch(Message*)+0xcf4) [0x47e7f4]
 5: (Monitor::ms_dispatch(Message*)+0x90) [0x48c720]
 6: (SimpleMessenger::dispatch_entry()+0x869) [0x582ef9]
 7: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x4664bc]
 8: (()+0x68ba) [0x7f28ac2c98ba]
 9: (clone()+0x6d) [0x7f28aab2502d]
*** Caught signal (Aborted) **
 in thread 7f28a8839700
 ceph version 0.39-195-ge18b1c9 (commit:e18b1c9734e88e3b779ba2d70cdd54f8fb94743d)
 1: /usr/bin/ceph-mon() [0x5cfc89]
 2: (()+0xef60) [0x7f28ac2d1f60]
 3: (gsignal()+0x35) [0x7f28aaa88165]
 4: (abort()+0x180) [0x7f28aaa8af70]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f28ab309c2d]
 6: (()+0xb8dd6) [0x7f28ab307dd6]
 7: (()+0xb8e03) [0x7f28ab307e03]
 8: (()+0xb8efe) [0x7f28ab307efe]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x3a7) [0x5a1e17]
 10: (Elector::defer(int)+0x29a) [0x5050aa]
 11: (Elector::handle_propose(MMonElection*)+0x30b) [0x5053eb]
 12: (Elector::dispatch(Message*)+0x7cb) [0x506d8b]
 13: (Monitor::_ms_dispatch(Message*)+0xcf4) [0x47e7f4]
 14: (Monitor::ms_dispatch(Message*)+0x90) [0x48c720]
 15: (SimpleMessenger::dispatch_entry()+0x869) [0x582ef9]
 16: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x4664bc]
 17: (()+0x68ba) [0x7f28ac2c98ba]
 18: (clone()+0x6d) [0x7f28aab2502d]

Is it necessary to remove the reinstalled mon from the cluster and add it again?
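
For readers unfamiliar with the failed assertion, `assert(m < rank_addr.size())` in the trace above reads as a bounds check: the rank `m` passed to `MonMap::get_inst` must index an existing entry in `rank_addr`. One plausible reading of the trace, not confirmed here, is that the elector was asked to defer to a rank its monmap holds no address for. The sketch below only illustrates that kind of check. `ToyMonMap`, `contains_rank`, and `get_addr` are hypothetical names, and the assumption that `rank_addr` is a vector indexed by rank is mine; none of this is Ceph's actual code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a monitor map; NOT Ceph's real MonMap.
struct ToyMonMap {
    // Assumption: one address per monitor, indexed by rank.
    std::vector<std::string> rank_addr;

    // True if rank m refers to a monitor present in this map.
    bool contains_rank(unsigned m) const {
        return m < rank_addr.size();
    }

    // Same shape as the failing check: aborts if m is out of range.
    const std::string& get_addr(unsigned m) const {
        assert(m < rank_addr.size());
        return rank_addr[m];
    }
};
```

With a map holding two entries, ranks 0 and 1 pass the check, while a lookup for rank 2 would trip the assertion and abort the process, which matches the `Caught signal (Aborted)` line in the log.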
