Bug #12941
closed
mon/OSDMonitor.cc: 204: FAILED assert(err == 0) 0.94
0%
Description
Yesterday I found that my cluster was broken, and later found that two of the three monitors were down. To repair it, I ran the command:
ceph-mon -i vm13__
But I got the following error:
mon/OSDMonitor.cc: In function 'virtual void OSDMonitor::update_from_paxos(bool*)' thread 7f7fa64248c0 time 2015-09-04 13:31:48.448126
mon/OSDMonitor.cc: 204: FAILED assert(err == 0)
ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7e708b]
2: (OSDMonitor::update_from_paxos(bool*)+0x21eb) [0x62d04b]
3: (PaxosService::refresh(bool*)+0x19a) [0x60d64a]
4: (Monitor::refresh_from_paxos(bool*)+0x183) [0x5ba4a3]
5: (Monitor::init_paxos()+0x85) [0x5ba7e5]
6: (Monitor::preinit()+0x7d7) [0x5bf447]
7: (main()+0x22dd) [0x5819ed]
8: (__libc_start_main()+0xf5) [0x7f7fa385cec5]
9: ceph-mon() [0x5a3607]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2015-09-04 13:31:48.449369 7f7fa64248c0 -1 mon/OSDMonitor.cc: In function 'virtual void OSDMonitor::update_from_paxos(bool*)' thread 7f7fa64248c0 time 2015-09-04 13:31:48.448126
mon/OSDMonitor.cc: 204: FAILED assert(err == 0)
ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7e708b]
2: (OSDMonitor::update_from_paxos(bool*)+0x21eb) [0x62d04b]
3: (PaxosService::refresh(bool*)+0x19a) [0x60d64a]
4: (Monitor::refresh_from_paxos(bool*)+0x183) [0x5ba4a3]
5: (Monitor::init_paxos()+0x85) [0x5ba7e5]
6: (Monitor::preinit()+0x7d7) [0x5bf447]
7: (main()+0x22dd) [0x5819ed]
8: (__libc_start_main()+0xf5) [0x7f7fa385cec5]
9: ceph-mon() [0x5a3607]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
0> 2015-09-04 13:31:48.449369 7f7fa64248c0 -1 mon/OSDMonitor.cc: In function 'virtual void OSDMonitor::update_from_paxos(bool*)' thread 7f7fa64248c0 time 2015-09-04 13:31:48.448126
mon/OSDMonitor.cc: 204: FAILED assert(err == 0)
ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7e708b]
2: (OSDMonitor::update_from_paxos(bool*)+0x21eb) [0x62d04b]
3: (PaxosService::refresh(bool*)+0x19a) [0x60d64a]
4: (Monitor::refresh_from_paxos(bool*)+0x183) [0x5ba4a3]
5: (Monitor::init_paxos()+0x85) [0x5ba7e5]
6: (Monitor::preinit()+0x7d7) [0x5bf447]
7: (main()+0x22dd) [0x5819ed]
8: (__libc_start_main()+0xf5) [0x7f7fa385cec5]
9: ceph-mon() [0x5a3607]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
*** Caught signal (Aborted) **
in thread 7f7fa64248c0
ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
1: ceph-mon() [0x9b050a]
2: (()+0x10340) [0x7f7fa5526340]
3: (gsignal()+0x39) [0x7f7fa3871cc9]
4: (abort()+0x148) [0x7f7fa38750d8]
5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f7fa417c535]
6: (()+0x5e6d6) [0x7f7fa417a6d6]
7: (()+0x5e703) [0x7f7fa417a703]
8: (()+0x5e922) [0x7f7fa417a922]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x278) [0x7e7278]
10: (OSDMonitor::update_from_paxos(bool*)+0x21eb) [0x62d04b]
11: (PaxosService::refresh(bool*)+0x19a) [0x60d64a]
12: (Monitor::refresh_from_paxos(bool*)+0x183) [0x5ba4a3]
13: (Monitor::init_paxos()+0x85) [0x5ba7e5]
14: (Monitor::preinit()+0x7d7) [0x5bf447]
15: (main()+0x22dd) [0x5819ed]
16: (__libc_start_main()+0xf5) [0x7f7fa385cec5]
17: ceph-mon() [0x5a3607]
2015-09-04 13:31:48.451603 7f7fa64248c0 -1 *** Caught signal (Aborted) **
in thread 7f7fa64248c0
ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
1: ceph-mon() [0x9b050a]
2: (()+0x10340) [0x7f7fa5526340]
3: (gsignal()+0x39) [0x7f7fa3871cc9]
4: (abort()+0x148) [0x7f7fa38750d8]
5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f7fa417c535]
6: (()+0x5e6d6) [0x7f7fa417a6d6]
7: (()+0x5e703) [0x7f7fa417a703]
8: (()+0x5e922) [0x7f7fa417a922]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x278) [0x7e7278]
10: (OSDMonitor::update_from_paxos(bool*)+0x21eb) [0x62d04b]
11: (PaxosService::refresh(bool*)+0x19a) [0x60d64a]
12: (Monitor::refresh_from_paxos(bool*)+0x183) [0x5ba4a3]
13: (Monitor::init_paxos()+0x85) [0x5ba7e5]
14: (Monitor::preinit()+0x7d7) [0x5bf447]
15: (main()+0x22dd) [0x5819ed]
16: (__libc_start_main()+0xf5) [0x7f7fa385cec5]
17: ceph-mon() [0x5a3607]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
0> 2015-09-04 13:31:48.451603 7f7fa64248c0 -1 *** Caught signal (Aborted) **
in thread 7f7fa64248c0
ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
1: ceph-mon() [0x9b050a]
2: (()+0x10340) [0x7f7fa5526340]
3: (gsignal()+0x39) [0x7f7fa3871cc9]
4: (abort()+0x148) [0x7f7fa38750d8]
5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f7fa417c535]
6: (()+0x5e6d6) [0x7f7fa417a6d6]
7: (()+0x5e703) [0x7f7fa417a703]
8: (()+0x5e922) [0x7f7fa417a922]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x278) [0x7e7278]
10: (OSDMonitor::update_from_paxos(bool*)+0x21eb) [0x62d04b]
11: (PaxosService::refresh(bool*)+0x19a) [0x60d64a]
12: (Monitor::refresh_from_paxos(bool*)+0x183) [0x5ba4a3]
13: (Monitor::init_paxos()+0x85) [0x5ba7e5]
14: (Monitor::preinit()+0x7d7) [0x5bf447]
15: (main()+0x22dd) [0x5819ed]
16: (__libc_start_main()+0xf5) [0x7f7fa385cec5]
17: ceph-mon() [0x5a3607]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
[27101]: (33) Numerical argument out of domain
Is this caused by corruption of the monitor's files?
Files
Updated by bo cai over 8 years ago
It looks like the Redmine formatting has a small problem.
Updated by Loïc Dachary over 8 years ago
- Description updated (diff)
- Target version deleted (v0.94.4)
Updated by Loïc Dachary over 8 years ago
mon/OSDMonitor.cc: 204: FAILED assert(err == 0) is at https://github.com/ceph/ceph/blob/v0.94.2/src/mon/OSDMonitor.cc#L204
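The link above can be derived from the crash output itself. The following is only a minimal sketch: the helper name `assert_source_url` is hypothetical, and it assumes that source files live under `src/` and that release tags are named `v<version>`, as they are in the URL quoted in this comment.

```python
import re


def assert_source_url(log_text):
    """Build a GitHub link to the failed assertion, or return None."""
    # Matches e.g. "mon/OSDMonitor.cc: 204: FAILED assert(err == 0)"
    failed = re.search(r"(\S+\.(?:cc|h)): (\d+): FAILED assert", log_text)
    # Matches e.g. "ceph version 0.94.2 (5fb8...)"
    version = re.search(r"ceph version (\d+(?:\.\d+)+)", log_text)
    if not failed or not version:
        return None
    path, line = failed.group(1), failed.group(2)
    # Assumption: sources are under src/ and the tag is "v" + version.
    return "https://github.com/ceph/ceph/blob/v%s/src/%s#L%s" % (
        version.group(1), path, line)


sample = ("mon/OSDMonitor.cc: 204: FAILED assert(err == 0) "
          "ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)")
print(assert_source_url(sample))
```

For the sample line taken from this report, the function reproduces the URL given above.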
Updated by Loïc Dachary over 8 years ago
- Status changed from New to Need More Info
It would help a lot if you could include more information from the log files. Is the file system on which the mon resides OK? Or did it suffer a failure or repair of some sort?
Updated by bo cai over 8 years ago
- File ceph-mon.vm14.log added
- File syslog added
I have three monitors in total, one of which (vm14) cannot be started. There is no problem with the file system.
I can even extract the monmap with the command: ceph-mon -i vm14 --extract-monmap myfile && monmaptool --print myfile
which prints:
epoch 1
fsid dc4f5635-5454-4d2c-94b7-ec05254596fe
last_changed 0.000000
created 0.000000
0: 172.16.31.212:6789/0 mon.vm13
1: 172.16.31.213:6789/0 mon.vm14
2: 172.16.31.214:6789/0 mon.vm15
So I think the file system is OK.
I have also uploaded the logs from the three hosts; I hope that helps.
If you need more information, please tell me.
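The monmap listing in this comment can also be checked programmatically. The sketch below is illustrative only: `parse_monmap` is a hypothetical helper, and the line format it expects is assumed from the output pasted above rather than from any documented format. It only confirms that the printed map lists the expected monitors; it says nothing about the rest of the monitor store.

```python
import re

# Output of "monmaptool --print myfile" as pasted in this comment.
MONMAP_TEXT = """\
epoch 1
fsid dc4f5635-5454-4d2c-94b7-ec05254596fe
last_changed 0.000000
created 0.000000
0: 172.16.31.212:6789/0 mon.vm13
1: 172.16.31.213:6789/0 mon.vm14
2: 172.16.31.214:6789/0 mon.vm15
"""


def parse_monmap(text):
    """Return (header fields, [(rank, address, name), ...]) from printed monmap text."""
    fields, mons = {}, []
    for line in text.splitlines():
        line = line.strip()
        # Monitor entries look like "1: 172.16.31.213:6789/0 mon.vm14"
        m = re.match(r"(\d+): (\S+) mon\.(\S+)$", line)
        if m:
            mons.append((int(m.group(1)), m.group(2), m.group(3)))
        elif " " in line:
            # Header entries look like "epoch 1"
            key, value = line.split(None, 1)
            fields[key] = value
    return fields, mons


fields, mons = parse_monmap(MONMAP_TEXT)
print(fields["fsid"])
print([name for _, _, name in mons])
```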
Updated by Loïc Dachary over 8 years ago
- Status changed from Need More Info to New
Updated by Loïc Dachary over 8 years ago
- Subject changed from can not start monitor to mon/OSDMonitor.cc: 204: FAILED assert(err == 0) 0.94
Updated by Sage Weil about 7 years ago
- Status changed from New to Can't reproduce
Please reopen if this is still a problem.