Bug #3790
Mon crash after update to ceph version 0.56-209-g310112f (closed)
Description
I have a single-node cluster on burnupi60 that is updated each morning to the latest master branch. After this morning's update and a restart of ceph, the mon crashed with: FAILED assert(0 == "We are alone; this shouldn't have been scheduled!")
--- end dump of recent events ---
2013-01-11 08:25:06.054340 7ff7c8e36780 -1 *** Caught signal (Aborted) **
in thread 7ff7c8e36780
ceph version 0.56-209-g310112f (310112f702d14294e6ba48f8af41a306288cba65)
1: /usr/bin/ceph-mon() [0x5461ca]
2: (()+0xfcb0) [0x7ff7c8a1acb0]
3: (gsignal()+0x35) [0x7ff7c710b425]
4: (abort()+0x17b) [0x7ff7c710eb8b]
5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7ff7c7a5d69d]
6: (()+0xb5846) [0x7ff7c7a5b846]
7: (()+0xb5873) [0x7ff7c7a5b873]
8: (()+0xb596e) [0x7ff7c7a5b96e]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1df) [0x5f5ebf]
10: (Monitor::timecheck()+0xa30) [0x48d320]
11: (Monitor::win_election(unsigned int, std::set<int, std::less<int>, std::allocator<int> >&, unsigned long)+0x31d) [0x48d6bd]
12: (Monitor::win_standalone_election()+0x1af) [0x48d8af]
13: (Monitor::bootstrap()+0xc52) [0x48e582]
14: (Monitor::init()+0x11a) [0x490eba]
15: (main()+0x143f) [0x47106f]
16: (__libc_start_main()+0xed) [0x7ff7c70f676d]
17: /usr/bin/ceph-mon() [0x4730b9]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- begin dump of recent events ---
0> 2013-01-11 08:25:06.054340 7ff7c8e36780 -1 *** Caught signal (Aborted) **
in thread 7ff7c8e36780
ceph version 0.56-209-g310112f (310112f702d14294e6ba48f8af41a306288cba65)
1: /usr/bin/ceph-mon() [0x5461ca]
2: (()+0xfcb0) [0x7ff7c8a1acb0]
3: (gsignal()+0x35) [0x7ff7c710b425]
4: (abort()+0x17b) [0x7ff7c710eb8b]
5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7ff7c7a5d69d]
6: (()+0xb5846) [0x7ff7c7a5b846]
7: (()+0xb5873) [0x7ff7c7a5b873]
8: (()+0xb596e) [0x7ff7c7a5b96e]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1df) [0x5f5ebf]
10: (Monitor::timecheck()+0xa30) [0x48d320]
11: (Monitor::win_election(unsigned int, std::set<int, std::less<int>, std::allocator<int> >&, unsigned long)+0x31d) [0x48d6bd]
12: (Monitor::win_standalone_election()+0x1af) [0x48d8af]
13: (Monitor::bootstrap()+0xc52) [0x48e582]
14: (Monitor::init()+0x11a) [0x490eba]
15: (main()+0x143f) [0x47106f]
16: (__libc_start_main()+0xed) [0x7ff7c70f676d]
17: /usr/bin/ceph-mon() [0x4730b9]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
0/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 5 ms
1/ 5 mon
0/10 monc
0/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/ 5 hadoop
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
max_recent 100000
max_new 1000
log_file /var/log/ceph/ceph-mon.a.log
Updated by Ken Franklin over 11 years ago
The previously installed version was 0.56-193.
Updated by Joao Eduardo Luis over 11 years ago
- Status changed from New to In Progress
- Assignee set to Joao Eduardo Luis
My fault. I forgot a check in win_election().
Any chance you can test 6104629d95207f3dfd3a744d81b011b6a714070e on wip-3790?
I'm about to do so as well.
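For illustration only, here is a minimal sketch of the kind of guard a missing check in win_election() suggests, assuming that a time check only makes sense when the quorum holds more than one monitor. MiniMonitor, its members, and the timechecks_started counter are hypothetical stand-ins written for this example; this is not the actual Ceph Monitor class and not the contents of the wip-3790 commit.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the monitor; NOT the real Ceph Monitor class.
struct MiniMonitor {
  std::set<int> quorum;        // ranks of the monitors in the current quorum
  int timechecks_started = 0;  // how many time check rounds were started

  // A time check compares clocks with peers, so it needs at least one peer.
  void timecheck() {
    assert(quorum.size() > 1 &&
           "We are alone; this shouldn't have been scheduled!");
    ++timechecks_started;
  }

  void win_election(const std::set<int>& active) {
    quorum = active;
    // Assumed guard: skip the time check when we are the only monitor.
    if (quorum.size() > 1) {
      timecheck();
    }
  }

  // Mirrors the call chain in the backtrace: a lone monitor wins by default.
  void win_standalone_election() {
    win_election(std::set<int>{0});
  }
};
```

With the guard in place, calling win_standalone_election() on a MiniMonitor leaves timechecks_started at 0 and never reaches the assert, while win_election() with three ranks starts one round. Without the guard, the standalone path would hit the assert, which matches the frames Monitor::win_standalone_election() -> Monitor::win_election() -> Monitor::timecheck() in the trace above.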
Updated by Joao Eduardo Luis over 11 years ago
- Status changed from In Progress to 4
The previous commit had a redundant check. I fixed and rebased it; the new commit is on wip-3790, commit:a3b3a09a3a22c29fb4a6a0131ab4e3f2e54fcad1
Updated by Sage Weil over 11 years ago
- Status changed from 4 to Resolved
Looks good, merged into master: 8d0fa15e6aa3847e89de5d5adfca0a863e8da976