Project

General

Profile

Bug #2338

Bug #2221: Monitor setup bugs

mon: adding new monitors simultaneously can allow a new mon to become leader

Added by Greg Farnum over 8 years ago. Updated over 8 years ago.

Status:
Rejected
Priority:
Urgent
Assignee:
Category:
Monitor
Target version:
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

When you add two new monitors (out of 3 total) to a cluster you can end up with one of the new monitors being the leader. This causes crashes on asserts because the new monitor doesn't have in-memory maps for a lot of things but does have different values for the pending.
Obviously that's bad. We should diagnose more precise causes and figure out how to prevent them.

History

#1 Updated by Greg Farnum over 8 years ago

2012-04-24 00:18:58.821381 7f00b20ef780 store(/srv/ceph/mon.peon5753) mount
2012-04-24 00:18:58.821593 7f00b20ef780 ceph version 0.45-11-gb5635ee
(commit:b5635eef3515bccfed86e457d8ca4b0ff2742c40)
, process ceph-mon, pid 9327
2012-04-24 00:18:58.821724 7f00b20ef780 --
[2607:f298:4:2243::5753]:6789/0 accepter.bind my_inst.addr is [2607:f298:4:2
243::5753]:6789/0 need_addr=0
2012-04-24 00:18:58.822687 7f00b20ef780 --
[2607:f298:4:2243::5753]:6789/0 messenger.start
2012-04-24 00:18:58.822712 7f00b20ef780 --
[2607:f298:4:2243::5753]:6789/0 accepter.start
2012-04-24 00:18:58.822827 7f00b20ef780 mon.peon5753@1(probing) e3 init
fsid 49794861-9482-4491-87ee-f43955a29181
2012-04-24 00:18:58.823171 7f00b20ef780 --
[2607:f298:4:2243::5753]:6789/0 --> mon.0 [2607:f298:4:2243::5752]:6789/0 --
mon_probe(probe 49794861-9482-4491-87ee-f43955a29181 name peon5753) v1
-- ?+0 0x26ca000
2012-04-24 00:18:58.823242 7f00b20ef780 --
[2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 --
mon_probe(probe 49794861-9482-4491-87ee-f43955a29181 name peon5753) v1
-- ?+0 0x26ca600
2012-04-24 00:18:58.825559 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 <== mon.0 [2607:f298:4:2243::5752]:6789/0 1
==== mon_probe(reply 49794861-9482-4491-87ee-f43955a29181 name peon5752
quorum 0 versions {auth=5,logm=3,mdsmap=1,monma
p=1,osdmap=2,pgmap=3}) v1 ==== 377+0+0 (1156773136 0 0) 0x26ca000 con
0x26bc780
2012-04-24 00:18:58.825639 7f00ad96e700 log [INF] : mon.peon5753 calling
new monitor election
2012-04-24 00:18:58.825672 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.1 [2607:f298:4:2243::5753]:6789/0 --
log(1 entries) v1 -- ?+0 0x26b5a00
2012-04-24 00:18:58.841045 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.0 [2607:f298:4:2243::5752]:6789/0 --
election(49794861-9482-4491-87ee-f43955a29181 propose 1) v2 -- ?+0
0x26a46c0
2012-04-24 00:18:58.841072 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 --
election(49794861-9482-4491-87ee-f43955a29181 propose 1) v2 -- ?+0
0x26a4b40
2012-04-24 00:18:58.841115 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 <== mon.1 [2607:f298:4:2243::5753]:6789/0 0
==== log(1 entries) v1 ==== 0+0+0 (0 0 0) 0x26b5a00 con 0x26bc000
2012-04-24 00:18:58.841170 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 <== mon.2 [2607:f298:4:2243::5754]:6789/0 1
==== mon_probe(reply 49794861-9482-4491-87ee-f43955a29181 name peon5754
versions {auth=0,logm=0,mdsmap=0,monmap=0,osdma
p=0,pgmap=0}) v1 ==== 669+0+0 (4255925484 0 0) 0x26ca600 con 0x26bcc80
2012-04-24 00:19:01.739838 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 <== mon.2 [2607:f298:4:2243::5754]:6789/0 2
==== election(49794861-9482-4491-87ee-f43955a29181 propose 5) v2 ====
518+0+0 (1200745803 0 0) 0x26a4d80 con 0x26bcc80
2012-04-24 00:19:01.758223 7f00ad96e700 log [INF] : mon.peon5753 calling
new monitor election
2012-04-24 00:19:01.758250 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.1 [2607:f298:4:2243::5753]:6789/0 --
log(1 entries) v1 -- ?+0 0x26b5c00
2012-04-24 00:19:01.758273 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.0 [2607:f298:4:2243::5752]:6789/0 --
election(49794861-9482-4491-87ee-f43955a29181 propose 5) v2 -- ?+0
0x26a4900
2012-04-24 00:19:01.758302 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 --
election(49794861-9482-4491-87ee-f43955a29181 propose 5) v2 -- ?+0
0x26d1480
2012-04-24 00:19:01.758353 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 <== mon.1 [2607:f298:4:2243::5753]:6789/0 0
==== log(1 entries) v1 ==== 0+0+0 (0 0 0) 0x26b5c00 con 0x26bc000
2012-04-24 00:19:01.759005 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 <== mon.2 [2607:f298:4:2243::5754]:6789/0 3
==== election(49794861-9482-4491-87ee-f43955a29181 ack 5) v2 ====
518+0+0 (2273960751 0 0) 0x26d1900 con 0x26bcc80
2012-04-24 00:19:06.771883 7f00ad16d700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 --
election(49794861-9482-4491-87ee-f43955a29181 victory 6) v2 -- ?+0
0x26a46c0
2012-04-24 00:19:06.771911 7f00ad16d700 log [INF] : mon.peon5753@1 won
leader election with quorum 1,2
2012-04-24 00:19:06.771928 7f00ad16d700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.1 [2607:f298:4:2243::5753]:6789/0 --
log(1 entries) v1 -- ?+0 0x26d5000
2012-04-24 00:19:06.772011 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 <== mon.1 [2607:f298:4:2243::5753]:6789/0 0
==== log(1 entries) v1 ==== 0+0+0 (0 0 0) 0x26d5000 con 0x26bc000
2012-04-24 00:19:06.784258 7f00ad16d700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 --
paxos(pgmap collect lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26bb780
2012-04-24 00:19:06.793476 7f00ad16d700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 --
paxos(mdsmap collect lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26bbc80
2012-04-24 00:19:06.801906 7f00ad16d700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 --
paxos(osdmap collect lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26bba00
2012-04-24 00:19:06.810907 7f00ad16d700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 --
paxos(logm collect lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26d6500
2012-04-24 00:19:06.822719 7f00ad16d700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 --
paxos(monmap collect lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26d6280
2012-04-24 00:19:06.831813 7f00ad16d700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 --
paxos(auth collect lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26d6000
2012-04-24 00:19:06.831940 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 <== mon.2 [2607:f298:4:2243::5754]:6789/0 4
==== paxos(pgmap last lc 0 fc 0 pn 301 opn 301) v1 ==== 137+0+0
(1126231997 0 0) 0x26bbc80 con 0x26bcc80
2012-04-24 00:19:06.840234 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 --
paxos(pgmap begin lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26d6780
2012-04-24 00:19:06.840265 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 <== mon.2 [2607:f298:4:2243::5754]:6789/0 5
==== paxos(mdsmap last lc 0 fc 0 pn 301 opn 301) v1 ==== 385+0+0
(4010801635 0 0) 0x26bba00 con 0x26bcc80
2012-04-24 00:19:06.848696 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 --
paxos(mdsmap begin lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26bbc80
2012-04-24 00:19:06.848782 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 <== mon.2 [2607:f298:4:2243::5754]:6789/0 6
==== paxos(osdmap last lc 0 fc 0 pn 301 opn 301) v1 ==== 922+0+0
(1223976345 0 0) 0x26d6500 con 0x26bcc80
2012-04-24 00:19:06.861139 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 --
paxos(osdmap begin lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26bba00
2012-04-24 00:19:06.861168 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 <== mon.2 [2607:f298:4:2243::5754]:6789/0 7
==== paxos(logm last lc 0 fc 0 pn 301 opn 301) v1 ==== 311+0+0
(2075763455 0 0) 0x26d6280 con 0x26bcc80
2012-04-24 00:19:06.873395 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 --
paxos(logm begin lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26d6500
2012-04-24 00:19:06.873425 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 <== mon.2 [2607:f298:4:2243::5754]:6789/0 8
==== paxos(monmap last lc 0 fc 0 pn 301 opn 0) v1 ==== 84+0+0
(1941942587 0 0) 0x26d6000 con 0x26bcc80
2012-04-24 00:19:06.873449 7f00ad96e700 --
[2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 --
paxos(monmap lease lc 0 fc 0 pn 0 opn 0) v1 -- ?+0 0x26d6280
*** Caught signal (Aborted) **
in thread 7f00ad96e700
ceph version 0.45-11-gb5635ee
(commit:b5635eef3515bccfed86e457d8ca4b0ff2742c40)
1: /usr/bin/ceph-mon() [0x5f5366]
2: (()+0xfcb0) [0x7f00b1ccecb0]
3: (gsignal()+0x35) [0x7f00b03c3445]
4: (abort()+0x17b) [0x7f00b03c6bab]
5: (()+0x2f10e) [0x7f00b03bc10e]
6: (()+0x2f1b2) [0x7f00b03bc1b2]
7: (MonmapMonitor::create_initial()+0x1f7) [0x4ce5b7]
8: (PaxosService::_active()+0x36e) [0x49a65e]
9: (Context::complete(int)+0xa) [0x4829da]
10: (finish_contexts(CephContext*, std::list<Context*,
std::allocator<Context*> >&, int)+0x6d) [0x48465d]
11: (Paxos::handle_last(MMonPaxos*)+0x916) [0x495fe6]
12: (Paxos::dispatch(PaxosServiceMessage*)+0x21b) [0x4963ab]
13: (Monitor::_ms_dispatch(Message*)+0x119e) [0x481dbe]
14: (Monitor::ms_dispatch(Message*)+0x32) [0x48cf22]
15: (SimpleMessenger::dispatch_entry()+0x79b) [0x59a75b]
16: (SimpleMessenger::DispatchThread::entry()+0xd) [0x56c5ed]
17: (()+0x7e9a) [0x7f00b1cc6e9a]
18: (clone()+0x6d) [0x7f00b047f4bd]

#2 Updated by Greg Farnum over 8 years ago

  • Due date set to 04/24/2012
  • Assignee set to Greg Farnum
  • Parent task set to #2221

Looks like the original monitor doesn't believe in the existence of the new monitors; I'll need to check out why.

2012-04-24 01:18:43.778953 7f118dbae700 -- [2607:f298:4:2243::5752]:6789/0 <== mon.1 [2607:f298:4:2243::5753]:6789/0 8 ==== election(31b8be2f-ac05-4e56-96b7-e702df166e29 propose 1
) v2 ==== 518+0+0 (3164389216 0 0) 0x1347240 con 0x133d140
2012-04-24 01:18:43.778968 7f118dbae700 mon.peon5752@0(leader).elector(0)  ignoring bogus election message with bad mon rank mon.1

The crash in create_initial() is almost certainly the cause of (2) in #2221.

#3 Updated by Sage Weil over 8 years ago

  • Priority changed from Normal to Urgent

#4 Updated by Greg Farnum over 8 years ago

Okay, looks like the new Monitor isn't joining properly because in this case it's been fed a monmap that contains itself already.
But I don't know that it's completely user error because I suspect that if you don't start it up that way it will fail in other places — but even if it doesn't this sort of setup is the kind of thing we should be a little more graceful about. :/

#5 Updated by Greg Farnum over 8 years ago

  • Status changed from New to Rejected

Everything does work properly if you aren't located in the monmap. I would love to come up with a way of providing user feedback to warn against this error, but I just don't see a good way to make it happen. Anything to do it would have to start in Monitor::handle_probe_reply and be special-cased based on existing versions of PaxosSystems or something, which is...ugly...

Also available in: Atom PDF