Bug #5256

closed

Upgraded bobtail->cuttlefish mon crashes, then can't resume the conversion

Added by Faidon Liambotis almost 11 years ago. Updated almost 11 years ago.

Status: Resolved
Priority: Urgent
Category: Monitor
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description


    -6> 2013-06-05 18:25:33.071576 7fcd873a7700  1 mon.ms-fe1003@1(synchronizing sync( requester state chunks )) e17 sync_requester_abort no longer a sync requester
    -5> 2013-06-05 18:25:33.071600 7fcd873a7700  1 -- 10.64.16.150:6789/0 --> mon.0 10.64.0.167:6789/0 -- mon_probe(probe c9da36e1-694a-4166-b346-9d8d4d1d1ac1 name ms-fe1003) v4 -- ?+0 0x4788600
    -4> 2013-06-05 18:25:33.071634 7fcd873a7700  1 -- 10.64.16.150:6789/0 --> mon.2 10.64.32.10:6789/0 -- mon_probe(probe c9da36e1-694a-4166-b346-9d8d4d1d1ac1 name ms-fe1003) v4 -- ?+0 0x4788000
    -3> 2013-06-05 18:25:33.071714 7fcd873a7700  1 -- 10.64.16.150:6789/0 <== mon.2 10.64.32.10:6789/0 359 ==== mon_sync( chunk bl 987152 bytes last_key ( osdmap,full_182463 ) ) v1 ==== 987343+0+0 (2285625880 0 0) 0x4f02840 con 0x1b3e6e0
    -2> 2013-06-05 18:25:33.071754 7fcd873a7700  1 mon.ms-fe1003@1(probing) e17 handle_sync_chunk stray message -- drop it.
    -1> 2013-06-05 18:25:33.071758 7fcd873a7700  1 -- 10.64.16.150:6789/0 <== osd.132 10.64.0.178:6805/27958 1 ==== auth(proto 0 28 bytes epoch 17) v1 ==== 58+0+0 (2383461374 0 0) 0x3732900 con 0x231f580
     0> 2013-06-05 18:25:33.072854 7fcd87ba8700 -1 *** Caught signal (Segmentation fault) **
 in thread 7fcd87ba8700

 ceph version 0.61.2-58-g7d549cb (7d549cb82ab8ebcf1cc104fc557d601b486c7635)
 1: /usr/bin/ceph-mon() [0x599a3a]
 2: (()+0xfcb0) [0x7fcd8c112cb0]
 3: (Monitor::C_HeartbeatInterval::finish(int)+0x53) [0x4d06a3]
 4: (Context::complete(int)+0xa) [0x4cb61a]
 5: (SafeTimer::timer_thread()+0x425) [0x649695]
 6: (SafeTimerThread::entry()+0xd) [0x64a2cd]
 7: (()+0x7e9a) [0x7fcd8c10ae9a]
 8: (clone()+0x6d) [0x7fcd8a6baccd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   0/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/ 5 hadoop
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-mon.ms-fe1003.log
--- end dump of recent events ---
2013-06-05 18:25:33.245545 7ff10f703780  0 ceph version 0.61.2-58-g7d549cb (7d549cb82ab8ebcf1cc104fc557d601b486c7635), process ceph-mon, pid 11154
2013-06-05 18:25:33.281636 7ff10c57b700 -1 asok(0x2463000) AdminSocket: request 'mon_status' not defined
2013-06-05 18:25:33.459061 7ff10f703780 -1 there is an on-going (maybe aborted?) conversion.
2013-06-05 18:25:33.459073 7ff10f703780 -1 you should check what happened
2013-06-05 18:25:33.459771 7ff10f703780 -1 found errors while attempting to convert the monitor store: (17) File exists
2013-06-05 18:25:33.549295 7ff10f703780 -1 ERROR: on disk data includes unsupported features: compat={},rocompat={},incompat={4=}
2013-06-05 18:25:33.761693 7ff10f703780 -1 obtain_monmap unable to find a monmap
2013-06-05 18:25:33.761711 7ff10f703780 -1 unable to obtain a monmap: (2) No such file or directory
2013-06-05 18:25:33.761725 7ff10f703780  0 mon.ms-fe1003 does not exist in monmap, will attempt to join an existing cluster
2013-06-05 18:25:33.761943 7ff10f703780 -1 no public_addr or public_network specified, and mon.ms-fe1003 not present in monmap or ceph.conf
2013-06-05 18:25:33.762389 7ff10f703780  1 mon.ms-fe1003@-1(probing) e0 preinit fsid 00000000-0000-0000-0000-000000000000
2013-06-05 18:25:33.762455 7ff10f703780 -1 mon.ms-fe1003@-1(probing) e0 error: cluster_uuid file exists with value 'c9da36e1-694a-4166-b346-9d8d4d1d1ac1', != our uuid 00000000-0000-0000-0000-000000000000
2013-06-05 18:25:34.297132 7ff10c57b700 -1 asok(0x2463000) AdminSocket: request 'mon_status' not defined
2013-06-05 18:25:35.313140 7ff10c57b700 -1 asok(0x2463000) AdminSocket: request 'mon_status' not defined
2013-06-05 18:25:36.327476 7ff10c57b700 -1 asok(0x2463000) AdminSocket: request 'mon_status' not defined
2013-06-05 18:25:37.339902 7ff10c57b700 -1 asok(0x2463000) AdminSocket: request 'mon_status' not defined
2013-06-05 18:25:38.354432 7ff10c57b700 -1 asok(0x2463000) AdminSocket: request 'mon_status' not defined
2013-06-05 18:25:39.369904 7ff10c57b700 -1 asok(0x2463000) AdminSocket: request 'mon_status' not defined
2013-06-05 18:25:40.384275 7ff10c57b700 -1 asok(0x2463000) AdminSocket: request 'mon_status' not defined
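
The restart log above repeats the same `mon_status` message many times, which makes the startup errors harder to spot. As a reading aid only, here is a minimal sketch that lists each distinct message logged at level -1, in the order it first appears. It assumes the field after the thread id is the log level, as the lines above suggest. The names `LINE_RE` and `error_lines` are hypothetical helpers written for this example and are not part of Ceph.

```python
import re

# Matches lines such as:
#   2013-06-05 18:25:33.459061 7ff10f703780 -1 there is an on-going ...
# and also the "recent events" form prefixed with an index such as "-6> ".
# Assumption: the field after the thread id is the log level.
LINE_RE = re.compile(
    r"^(?:-?\d+>\s+)?"
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<thread>[0-9a-f]+)\s+(?P<level>-?\d+) (?P<msg>.*)$"
)


def error_lines(log_text):
    """Return the distinct messages logged at level -1, in first-seen order."""
    seen = set()
    messages = []
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match is None or match.group("level") != "-1":
            continue
        msg = match.group("msg")
        if msg not in seen:
            seen.add(msg)
            messages.append(msg)
    return messages


# A few lines copied from the log above.
sample = """\
2013-06-05 18:25:33.245545 7ff10f703780  0 ceph version 0.61.2-58-g7d549cb (7d549cb82ab8ebcf1cc104fc557d601b486c7635), process ceph-mon, pid 11154
2013-06-05 18:25:33.281636 7ff10c57b700 -1 asok(0x2463000) AdminSocket: request 'mon_status' not defined
2013-06-05 18:25:33.459061 7ff10f703780 -1 there is an on-going (maybe aborted?) conversion.
2013-06-05 18:25:34.297132 7ff10c57b700 -1 asok(0x2463000) AdminSocket: request 'mon_status' not defined
"""

for message in error_lines(sample):
    print(message)
```

Run against the full log, this kind of filter collapses the repeated admin-socket lines into one entry, leaving the conversion, feature-set, monmap, and cluster_uuid errors in their original order.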

Related issues 1 (0 open, 1 closed)

Related to Ceph - Bug #5292: mon: monitor crashing due to not being in the monmap (no monmap to be in) (Resolved, Joao Eduardo Luis, 06/10/2013)
