Bug #6041

closed

Failing to add 3rd monitor

Added by Bram Pieters over 10 years ago. Updated over 10 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
Monitor
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

After adding a third monitor to the cluster, the new monitor crashes during its first sync.

Ceph version: 0.64

2013-08-16 13:03:49.113593 7ff246803780 0 ceph version 0.64 (42e06c12db63bae292acc074548c06478fa92ea2), process ceph-mon, pid 23753
2013-08-16 13:03:53.699706 7ff246803780 0 mon.1 does not exist in monmap, will attempt to join an existing cluster
2013-08-16 13:03:53.700401 7ff246803780 1 mon.1@-1(probing) e0 preinit fsid 527ae0c2-4d1d-4262-8a70-2ef36b41f63d
2013-08-16 13:03:58.595346 7ff1b653d700 0 mon.1@-1(probing) e7 my rank is now 0 (was 1)
2013-08-16 13:03:58.596415 7ff2233e6700 0 -- 192.168.135.200:6789/0 >> 192.168.135.201:6789/0 pipe(0x28d0000 sd=24 :6789 s=0 pgs=0 cs=0 l=0).accept connect_seq 2 vs existing 0 state connecting
2013-08-16 13:03:58.596509 7ff2233e6700 0 -- 192.168.135.200:6789/0 >> 192.168.135.201:6789/0 pipe(0x28d0000 sd=24 :6789 s=0 pgs=0 cs=0 l=0).accept we reset (peer sent cseq 2, 0x2b0e780.cseq = 0), sending RESETSESSION
2013-08-16 13:03:58.596927 7ff2233e6700 0 -- 192.168.135.200:6789/0 >> 192.168.135.201:6789/0 pipe(0x28d0000 sd=24 :6789 s=0 pgs=0 cs=0 l=0).accept connect_seq 0 vs existing 0 state connecting
2013-08-16 13:04:58.605749 7ff1b6d3e700 0 mon.1@0(synchronizing sync( requester state chunks )).data_health(0) update_stats avail 72% total 19518392 used 5412656 avail 14105736
2013-08-16 13:05:58.620107 7ff1b6d3e700 0 mon.1@0(synchronizing sync( requester state chunks )).data_health(0) update_stats avail 70% total 19518392 used 5818504 avail 13699888
2013-08-16 13:06:58.699185 7ff1b6d3e700 0 mon.1@0(synchronizing sync( requester state chunks )).data_health(0) update_stats avail 67% total 19518392 used 6314288 avail 13204104
2013-08-16 13:07:41.480078 7ff1b6d3e700 1 mon.1@0(synchronizing sync( requester state chunks )) e7 discarding message auth(proto 0 30 bytes epoch 0) v1 and sending client elsewhere
2013-08-16 13:07:58.740044 7ff1b6d3e700 0 mon.1@0(synchronizing sync( requester state chunks )).data_health(0) update_stats avail 65% total 19518392 used 6774000 avail 12744392
2013-08-16 13:08:58.836225 7ff1b6d3e700 0 mon.1@0(synchronizing sync( requester state chunks )).data_health(0) update_stats avail 63% total 19518392 used 7209132 avail 12309260
2013-08-16 13:09:07.121687 7ff1b6d3e700 1 mon.1@0(synchronizing sync( requester state chunks )) e7 discarding message auth(proto 0 30 bytes epoch 0) v1 and sending client elsewhere
2013-08-16 13:09:58.836459 7ff1b6d3e700 0 mon.1@0(synchronizing sync( requester state chunks )).data_health(0) update_stats avail 60% total 19518392 used 7619188 avail 11899204
2013-08-16 13:10:58.912055 7ff1b6d3e700 0 mon.1@0(synchronizing sync( requester state chunks )).data_health(0) update_stats avail 58% total 19518392 used 8024784 avail 11493608
2013-08-16 13:11:58.947890 7ff1b6d3e700 0 mon.1@0(synchronizing sync( requester state chunks )).data_health(0) update_stats avail 57% total 19518392 used 8239480 avail 11278912
2013-08-16 13:13:11.413905 7ff1b6d3e700 1 mon.1@0(synchronizing sync( requester state chunks )) e7 sync_timeout mon.2 192.168.135.202:6789/0
2013-08-16 13:13:11.413967 7ff1b6d3e700 0 mon.1@0(synchronizing sync( requester state chunks )).data_health(0) update_stats avail 61% total 19518392 used 7480576 avail 12037816
2013-08-16 13:13:11.414004 7ff1b653d700 1 mon.1@0(synchronizing sync( requester state chunks )) e7 handle_sync_chunk stray message -- drop it.
2013-08-16 13:16:08.445530 7ff1b6d3e700 1 mon.1@0(synchronizing sync( requester state chunks )) e7 sync_requester_abort no longer a sync requester
2013-08-16 13:16:08.445760 7ff1b653d700 1 mon.1@0(probing) e7 handle_sync_chunk stray message -- drop it.
2013-08-16 13:18:00.180372 7ff1b6d3e700 -1 mon/Monitor.cc: In function 'void Monitor::sync_timeout(entity_inst_t&)' thread 7ff1b6d3e700 time 2013-08-16 13:18:00.138301
mon/Monitor.cc: 1171: FAILED assert(0 == "We should never reach this")

ceph version 0.64 (42e06c12db63bae292acc074548c06478fa92ea2)
1: (Monitor::sync_timeout(entity_inst_t&)+0xa67) [0x4b5797]
2: (Context::complete(int)+0xa) [0x4c384a]
3: (SafeTimer::timer_thread()+0x453) [0x5a79d3]
4: (SafeTimerThread::entry()+0xd) [0x5a9b9d]
5: (()+0x68ca) [0x7ff2463e88ca]
6: (clone()+0x6d) [0x7ff244a1fb6d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
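
To make the store growth before the sync_timeout easier to see, the data_health lines in the log above can be pulled out with a short script. This is only a sketch for reading the attached log, not part of Ceph: the helper names (parse_update_stats, used_deltas) are made up for illustration, the regular expression is derived solely from the line format shown in this report, and the unit of the "used" figures is not stated in the log, so the deltas are in whatever unit the monitor reports.

```python
import re

# Pattern for the data_health lines as they appear in this report, e.g.
# "2013-08-16 13:04:58.605749 ... update_stats avail 72% total 19518392 used 5412656 avail 14105736"
# (assumption: other Ceph versions may format this line differently)
STATS_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ .*"
    r"update_stats avail (?P<pct>\d+)% total (?P<total>\d+) "
    r"used (?P<used>\d+) avail (?P<avail>\d+)"
)


def parse_update_stats(lines):
    """Return (timestamp, avail_percent, used) for every update_stats line."""
    samples = []
    for line in lines:
        m = STATS_RE.search(line)
        if m:
            samples.append((m.group("ts"), int(m.group("pct")), int(m.group("used"))))
    return samples


def used_deltas(samples):
    """Return (timestamp, change in 'used' since the previous sample)."""
    return [(cur[0], cur[2] - prev[2]) for prev, cur in zip(samples, samples[1:])]


if __name__ == "__main__":
    # Two lines copied from the log in this report.
    sample_log = [
        "2013-08-16 13:04:58.605749 7ff1b6d3e700 0 mon.1@0(synchronizing sync( requester state chunks )).data_health(0) update_stats avail 72% total 19518392 used 5412656 avail 14105736",
        "2013-08-16 13:05:58.620107 7ff1b6d3e700 0 mon.1@0(synchronizing sync( requester state chunks )).data_health(0) update_stats avail 70% total 19518392 used 5818504 avail 13699888",
    ]
    for ts, delta in used_deltas(parse_update_stats(sample_log)):
        print(ts, delta)
```

To run it over the attached ceph-mon.1.log, pass the open file object to parse_update_stats in place of sample_log. Lines that do not match the pattern (connection messages, the backtrace) are skipped.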

Files

ceph-mon.1.log (1.93 MB), Mon log file, Bram Pieters, 08/17/2013 12:00 AM