Actions
Bug #5256
closedUpgraded bobtail->cuttlefish mon crashes, then can't resume the conversion
% Done:
0%
Source:
Community (user)
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
-6> 2013-06-05 18:25:33.071576 7fcd873a7700 1 mon.ms-fe1003@1(synchronizing sync( requester state chunks )) e17 sync_requester_abort no longer a sync requester -5> 2013-06-05 18:25:33.071600 7fcd873a7700 1 -- 10.64.16.150:6789/0 --> mon.0 10.64.0.167:6789/0 -- mon_probe(probe c9da36e1-694a-4166-b346-9d8d4d1d1ac1 name ms-fe1003) v4 -- ?+0 0x4788600 -4> 2013-06-05 18:25:33.071634 7fcd873a7700 1 -- 10.64.16.150:6789/0 --> mon.2 10.64.32.10:6789/0 -- mon_probe(probe c9da36e1-694a-4166-b346-9d8d4d1d1ac1 name ms-fe1003) v4 -- ?+0 0x4788000 -3> 2013-06-05 18:25:33.071714 7fcd873a7700 1 -- 10.64.16.150:6789/0 <== mon.2 10.64.32.10:6789/0 359 ==== mon_sync( chunk bl 987152 bytes last_key ( osdmap,full_182463 ) ) v1 ==== 987343+0+0 (2285625880 0 0) 0x4f02840 con 0x1b3e6e0 -2> 2013-06-05 18:25:33.071754 7fcd873a7700 1 mon.ms-fe1003@1(probing) e17 handle_sync_chunk stray message -- drop it. -1> 2013-06-05 18:25:33.071758 7fcd873a7700 1 -- 10.64.16.150:6789/0 <== osd.132 10.64.0.178:6805/27958 1 ==== auth(proto 0 28 bytes epoch 17) v1 ==== 58+0+0 (2383461374 0 0) 0x3732900 con 0x231f580 0> 2013-06-05 18:25:33.072854 7fcd87ba8700 -1 *** Caught signal (Segmentation fault) ** in thread 7fcd87ba8700 ceph version 0.61.2-58-g7d549cb (7d549cb82ab8ebcf1cc104fc557d601b486c7635) 1: /usr/bin/ceph-mon() [0x599a3a] 2: (()+0xfcb0) [0x7fcd8c112cb0] 3: (Monitor::C_HeartbeatInterval::finish(int)+0x53) [0x4d06a3] 4: (Context::complete(int)+0xa) [0x4cb61a] 5: (SafeTimer::timer_thread()+0x425) [0x649695] 6: (SafeTimerThread::entry()+0xd) [0x64a2cd] 7: (()+0x7e9a) [0x7fcd8c10ae9a] 8: (clone()+0x6d) [0x7fcd8a6baccd] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. --- logging levels --- 0/ 5 none 0/ 1 lockdep 0/ 1 context 1/ 1 crush 1/ 5 mds 1/ 5 mds_balancer 1/ 5 mds_locker 1/ 5 mds_log 1/ 5 mds_log_expire 1/ 5 mds_migrator 0/ 1 buffer 0/ 1 timer 0/ 1 filer 0/ 1 striper 0/ 1 objecter 0/ 5 rados 0/ 5 rbd 0/ 5 journaler 0/ 5 objectcacher 0/ 5 client 0/ 5 osd 0/ 5 optracker 0/ 5 objclass 1/ 3 filestore 1/ 3 journal 0/ 5 ms 1/ 5 mon 0/10 monc 0/ 5 paxos 0/ 5 tp 1/ 5 auth 1/ 5 crypto 1/ 1 finisher 1/ 5 heartbeatmap 1/ 5 perfcounter 1/ 5 rgw 1/ 5 hadoop 1/ 5 javaclient 1/ 5 asok 1/ 1 throttle -2/-2 (syslog threshold) -1/-1 (stderr threshold) max_recent 10000 max_new 1000 log_file /var/log/ceph/ceph-mon.ms-fe1003.log --- end dump of recent events --- 2013-06-05 18:25:33.245545 7ff10f703780 0 ceph version 0.61.2-58-g7d549cb (7d549cb82ab8ebcf1cc104fc557d601b486c7635), process ceph-mon, pid 11154 2013-06-05 18:25:33.281636 7ff10c57b700 -1 asok(0x2463000) AdminSocket: request 'mon_status' not defined 2013-06-05 18:25:33.459061 7ff10f703780 -1 there is an on-going (maybe aborted?) conversion. 2013-06-05 18:25:33.459073 7ff10f703780 -1 you should check what happened 2013-06-05 18:25:33.459771 7ff10f703780 -1 found errors while attempting to convert the monitor store: (17) File exists 2013-06-05 18:25:33.549295 7ff10f703780 -1 ERROR: on disk data includes unsupported features: compat={},rocompat={},incompat={4=} 2013-06-05 18:25:33.761693 7ff10f703780 -1 obtain_monmap unable to find a monmap 2013-06-05 18:25:33.761711 7ff10f703780 -1 unable to obtain a monmap: (2) No such file or directory 2013-06-05 18:25:33.761725 7ff10f703780 0 mon.ms-fe1003 does not exist in monmap, will attempt to join an existing cluster 2013-06-05 18:25:33.761943 7ff10f703780 -1 no public_addr or public_network specified, and mon.ms-fe1003 not present in monmap or ceph.conf 2013-06-05 18:25:33.762389 7ff10f703780 1 mon.ms-fe1003@-1(probing) e0 preinit fsid 00000000-0000-0000-0000-000000000000 2013-06-05 18:25:33.762455 7ff10f703780 -1 mon.ms-fe1003@-1(probing) e0 error: cluster_uuid file exists with value 'c9da36e1-694a-4166-b346-9d8d4d1d1ac1', != our uuid 00000000-0000-0000-0000-000000000000 2013-06-05 18:25:34.297132 7ff10c57b700 -1 asok(0x2463000) AdminSocket: request 'mon_status' not defined 2013-06-05 18:25:35.313140 7ff10c57b700 -1 asok(0x2463000) AdminSocket: request 'mon_status' not defined 2013-06-05 18:25:36.327476 7ff10c57b700 -1 asok(0x2463000) AdminSocket: request 'mon_status' not defined 2013-06-05 18:25:37.339902 7ff10c57b700 -1 asok(0x2463000) AdminSocket: request 'mon_status' not defined 2013-06-05 18:25:38.354432 7ff10c57b700 -1 asok(0x2463000) AdminSocket: request 'mon_status' not defined 2013-06-05 18:25:39.369904 7ff10c57b700 -1 asok(0x2463000) AdminSocket: request 'mon_status' not defined 2013-06-05 18:25:40.384275 7ff10c57b700 -1 asok(0x2463000) AdminSocket: request 'mon_status' not defined
Actions