Bug #2338
closedBug #2221: Monitor setup bugs
mon: adding new monitors simultaneously can allow a new mon to become leader
0%
Description
When you add two new monitors (out of 3 total) to a cluster you can end up with one of the new monitors being the leader. This causes crashes on asserts because the new monitor doesn't have in-memory maps for a lot of things but does have different values for the pending.
Obviously that's bad. We should diagnose more precise causes and figure out how to prevent them.
Updated by Greg Farnum about 12 years ago
2012-04-24 00:18:58.821381 7f00b20ef780 store(/srv/ceph/mon.peon5753) mount 2012-04-24 00:18:58.821593 7f00b20ef780 ceph version 0.45-11-gb5635ee (commit:b5635eef3515bccfed86e457d8ca4b0ff2742c40) , process ceph-mon, pid 9327 2012-04-24 00:18:58.821724 7f00b20ef780 -- [2607:f298:4:2243::5753]:6789/0 accepter.bind my_inst.addr is [2607:f298:4:2 243::5753]:6789/0 need_addr=0 2012-04-24 00:18:58.822687 7f00b20ef780 -- [2607:f298:4:2243::5753]:6789/0 messenger.start 2012-04-24 00:18:58.822712 7f00b20ef780 -- [2607:f298:4:2243::5753]:6789/0 accepter.start 2012-04-24 00:18:58.822827 7f00b20ef780 mon.peon5753@1(probing) e3 init fsid 49794861-9482-4491-87ee-f43955a29181 2012-04-24 00:18:58.823171 7f00b20ef780 -- [2607:f298:4:2243::5753]:6789/0 --> mon.0 [2607:f298:4:2243::5752]:6789/0 -- mon_probe(probe 49794861-9482-4491-87ee-f43955a29181 name peon5753) v1 -- ?+0 0x26ca000 2012-04-24 00:18:58.823242 7f00b20ef780 -- [2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 -- mon_probe(probe 49794861-9482-4491-87ee-f43955a29181 name peon5753) v1 -- ?+0 0x26ca600 2012-04-24 00:18:58.825559 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 <== mon.0 [2607:f298:4:2243::5752]:6789/0 1 ==== mon_probe(reply 49794861-9482-4491-87ee-f43955a29181 name peon5752 quorum 0 versions {auth=5,logm=3,mdsmap=1,monma p=1,osdmap=2,pgmap=3}) v1 ==== 377+0+0 (1156773136 0 0) 0x26ca000 con 0x26bc780 2012-04-24 00:18:58.825639 7f00ad96e700 log [INF] : mon.peon5753 calling new monitor election 2012-04-24 00:18:58.825672 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.1 [2607:f298:4:2243::5753]:6789/0 -- log(1 entries) v1 -- ?+0 0x26b5a00 2012-04-24 00:18:58.841045 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.0 [2607:f298:4:2243::5752]:6789/0 -- election(49794861-9482-4491-87ee-f43955a29181 propose 1) v2 -- ?+0 0x26a46c0 2012-04-24 00:18:58.841072 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 -- election(49794861-9482-4491-87ee-f43955a29181 propose 1) v2 -- ?+0 0x26a4b40 2012-04-24 00:18:58.841115 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 <== mon.1 [2607:f298:4:2243::5753]:6789/0 0 ==== log(1 entries) v1 ==== 0+0+0 (0 0 0) 0x26b5a00 con 0x26bc000 2012-04-24 00:18:58.841170 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 <== mon.2 [2607:f298:4:2243::5754]:6789/0 1 ==== mon_probe(reply 49794861-9482-4491-87ee-f43955a29181 name peon5754 versions {auth=0,logm=0,mdsmap=0,monmap=0,osdma p=0,pgmap=0}) v1 ==== 669+0+0 (4255925484 0 0) 0x26ca600 con 0x26bcc80 2012-04-24 00:19:01.739838 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 <== mon.2 [2607:f298:4:2243::5754]:6789/0 2 ==== election(49794861-9482-4491-87ee-f43955a29181 propose 5) v2 ==== 518+0+0 (1200745803 0 0) 0x26a4d80 con 0x26bcc80 2012-04-24 00:19:01.758223 7f00ad96e700 log [INF] : mon.peon5753 calling new monitor election 2012-04-24 00:19:01.758250 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.1 [2607:f298:4:2243::5753]:6789/0 -- log(1 entries) v1 -- ?+0 0x26b5c00 2012-04-24 00:19:01.758273 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.0 [2607:f298:4:2243::5752]:6789/0 -- election(49794861-9482-4491-87ee-f43955a29181 propose 5) v2 -- ?+0 0x26a4900 2012-04-24 00:19:01.758302 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 -- election(49794861-9482-4491-87ee-f43955a29181 propose 5) v2 -- ?+0 0x26d1480 2012-04-24 00:19:01.758353 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 <== mon.1 [2607:f298:4:2243::5753]:6789/0 0 ==== log(1 entries) v1 ==== 0+0+0 (0 0 0) 0x26b5c00 con 0x26bc000 2012-04-24 00:19:01.759005 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 <== mon.2 [2607:f298:4:2243::5754]:6789/0 3 ==== election(49794861-9482-4491-87ee-f43955a29181 ack 5) v2 ==== 518+0+0 (2273960751 0 0) 0x26d1900 con 0x26bcc80 2012-04-24 00:19:06.771883 7f00ad16d700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 -- election(49794861-9482-4491-87ee-f43955a29181 victory 6) v2 -- ?+0 0x26a46c0 2012-04-24 00:19:06.771911 7f00ad16d700 log [INF] : mon.peon5753@1 won leader election with quorum 1,2 2012-04-24 00:19:06.771928 7f00ad16d700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.1 [2607:f298:4:2243::5753]:6789/0 -- log(1 entries) v1 -- ?+0 0x26d5000 2012-04-24 00:19:06.772011 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 <== mon.1 [2607:f298:4:2243::5753]:6789/0 0 ==== log(1 entries) v1 ==== 0+0+0 (0 0 0) 0x26d5000 con 0x26bc000 2012-04-24 00:19:06.784258 7f00ad16d700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 -- paxos(pgmap collect lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26bb780 2012-04-24 00:19:06.793476 7f00ad16d700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 -- paxos(mdsmap collect lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26bbc80 2012-04-24 00:19:06.801906 7f00ad16d700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 -- paxos(osdmap collect lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26bba00 2012-04-24 00:19:06.810907 7f00ad16d700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 -- paxos(logm collect lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26d6500 2012-04-24 00:19:06.822719 7f00ad16d700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 -- paxos(monmap collect lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26d6280 2012-04-24 00:19:06.831813 7f00ad16d700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 -- paxos(auth collect lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26d6000 2012-04-24 00:19:06.831940 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 <== mon.2 [2607:f298:4:2243::5754]:6789/0 4 ==== paxos(pgmap last lc 0 fc 0 pn 301 opn 301) v1 ==== 137+0+0 (1126231997 0 0) 0x26bbc80 con 0x26bcc80 2012-04-24 00:19:06.840234 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 -- paxos(pgmap begin lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26d6780 2012-04-24 00:19:06.840265 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 <== mon.2 [2607:f298:4:2243::5754]:6789/0 5 ==== paxos(mdsmap last lc 0 fc 0 pn 301 opn 301) v1 ==== 385+0+0 (4010801635 0 0) 0x26bba00 con 0x26bcc80 2012-04-24 00:19:06.848696 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 -- paxos(mdsmap begin lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26bbc80 2012-04-24 00:19:06.848782 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 <== mon.2 [2607:f298:4:2243::5754]:6789/0 6 ==== paxos(osdmap last lc 0 fc 0 pn 301 opn 301) v1 ==== 922+0+0 (1223976345 0 0) 0x26d6500 con 0x26bcc80 2012-04-24 00:19:06.861139 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 -- paxos(osdmap begin lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26bba00 2012-04-24 00:19:06.861168 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 <== mon.2 [2607:f298:4:2243::5754]:6789/0 7 ==== paxos(logm last lc 0 fc 0 pn 301 opn 301) v1 ==== 311+0+0 (2075763455 0 0) 0x26d6280 con 0x26bcc80 2012-04-24 00:19:06.873395 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 -- paxos(logm begin lc 0 fc 0 pn 301 opn 0) v1 -- ?+0 0x26d6500 2012-04-24 00:19:06.873425 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 <== mon.2 [2607:f298:4:2243::5754]:6789/0 8 ==== paxos(monmap last lc 0 fc 0 pn 301 opn 0) v1 ==== 84+0+0 (1941942587 0 0) 0x26d6000 con 0x26bcc80 2012-04-24 00:19:06.873449 7f00ad96e700 -- [2607:f298:4:2243::5753]:6789/0 --> mon.2 [2607:f298:4:2243::5754]:6789/0 -- paxos(monmap lease lc 0 fc 0 pn 0 opn 0) v1 -- ?+0 0x26d6280 *** Caught signal (Aborted) ** in thread 7f00ad96e700 ceph version 0.45-11-gb5635ee (commit:b5635eef3515bccfed86e457d8ca4b0ff2742c40) 1: /usr/bin/ceph-mon() [0x5f5366] 2: (()+0xfcb0) [0x7f00b1ccecb0] 3: (gsignal()+0x35) [0x7f00b03c3445] 4: (abort()+0x17b) [0x7f00b03c6bab] 5: (()+0x2f10e) [0x7f00b03bc10e] 6: (()+0x2f1b2) [0x7f00b03bc1b2] 7: (MonmapMonitor::create_initial()+0x1f7) [0x4ce5b7] 8: (PaxosService::_active()+0x36e) [0x49a65e] 9: (Context::complete(int)+0xa) [0x4829da] 10: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x6d) [0x48465d] 11: (Paxos::handle_last(MMonPaxos*)+0x916) [0x495fe6] 12: (Paxos::dispatch(PaxosServiceMessage*)+0x21b) [0x4963ab] 13: (Monitor::_ms_dispatch(Message*)+0x119e) [0x481dbe] 14: (Monitor::ms_dispatch(Message*)+0x32) [0x48cf22] 15: (SimpleMessenger::dispatch_entry()+0x79b) [0x59a75b] 16: (SimpleMessenger::DispatchThread::entry()+0xd) [0x56c5ed] 17: (()+0x7e9a) [0x7f00b1cc6e9a] 18: (clone()+0x6d) [0x7f00b047f4bd]
Updated by Greg Farnum about 12 years ago
- Due date set to 04/24/2012
- Assignee set to Greg Farnum
- Parent task set to #2221
Looks like the original monitor doesn't believe in the existence of the new monitors; I'll need to check out why.
2012-04-24 01:18:43.778953 7f118dbae700 -- [2607:f298:4:2243::5752]:6789/0 <== mon.1 [2607:f298:4:2243::5753]:6789/0 8 ==== election(31b8be2f-ac05-4e56-96b7-e702df166e29 propose 1 ) v2 ==== 518+0+0 (3164389216 0 0) 0x1347240 con 0x133d140 2012-04-24 01:18:43.778968 7f118dbae700 mon.peon5752@0(leader).elector(0) ignoring bogus election message with bad mon rank mon.1
The crash in create_initial() is almost certainly the cause of (2) in #2221.
Updated by Greg Farnum about 12 years ago
Okay, looks like the new Monitor isn't joining properly because in this case it's been fed a monmap that contains itself already.
But I don't know that it's completely user error because I suspect that if you don't start it up that way it will fail in other places — but even if it doesn't this sort of setup is the kind of thing we should be a little more graceful about. :/
Updated by Greg Farnum almost 12 years ago
- Status changed from New to Rejected
Everything does work properly if you aren't located in the monmap. I would love to come up with a way of providing user feedback to warn against this error, but I just don't see a good way to make it happen. Anything to do it would have to start in Monitor::handle_probe_reply and be special-cased based on existing versions of PaxosSystems or something, which is...ugly...