Bug #4723

closed

FAILED assert(!db->create_and_open(std::cerr)) after IO Error.

Added by Matthew Roy about 11 years ago. Updated almost 11 years ago.

Status:
Can't reproduce
Priority:
Low
Category:
Monitor
Target version:
-
% Done:
0%

Source:
Community (user)
Tags:
Backport:
Regression:
Severity:
5 - suggestion
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

VERY low priority.

Top of console output is below:

=== mon.a === 
Starting Ceph mon.a on nut...
IO error: lock /var/lib/ceph/mon/ceph-a/store.db/LOCK: Resource temporarily unavailable
IO error: lock /var/lib/ceph/mon/ceph-a/store.db/LOCK: Resource temporarily unavailable
mon/Monitor.cc: In function 'int Monitor::StoreConverter::convert()' thread 7fc87e2fb780 time 2013-04-13 14:06:02.837079
mon/Monitor.cc: 4151: FAILED assert(!db->create_and_open(std::cerr))
 ceph version 0.60 (f26f7a39021dbf440c28d6375222e21c94fe8e5c)
 1: (Monitor::StoreConverter::convert()+0x42c) [0x49c8bc]
 2: (main()+0x7b4) [0x47cfe4]
 3: (__libc_start_main()+0xed) [0x7fc87c55776d]
 4: /usr/bin/ceph-mon() [0x4804ad]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2013-04-13 14:06:02.837645 7fc87e2fb780 -1 mon/Monitor.cc: In function 'int Monitor::StoreConverter::convert()' thread 7fc87e2fb780 time 2013-04-13 14:06:02.837079
mon/Monitor.cc: 4151: FAILED assert(!db->create_and_open(std::cerr))

 ceph version 0.60 (f26f7a39021dbf440c28d6375222e21c94fe8e5c)
 1: (Monitor::StoreConverter::convert()+0x42c) [0x49c8bc]
 2: (main()+0x7b4) [0x47cfe4]
 3: (__libc_start_main()+0xed) [0x7fc87c55776d]
 4: /usr/bin/ceph-mon() [0x4804ad]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

This came from a cluster with a lot of mon turnover due to #3495 and a watchdog timer to work around that issue. The console output is attached; the monitor log is at http://goo.gl/na9GI


Files

ceph-mon-dbassert.console.log (4.46 KB), Matthew Roy, 04/13/2013 11:24 AM
Actions #1

Updated by Joao Eduardo Luis about 11 years ago

  • Status changed from New to 4
  • Assignee set to Joao Eduardo Luis

Missed this bug completely.

It looks as if you had another monitor running when the new monitor was started.

The message could be slightly improved, though I honestly thought I had already taken care of that.
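To illustrate the diagnosis above, here is a minimal sketch of how a second process runs into a lock that a first process still holds. It assumes, to my understanding, that the store takes an advisory fcntl-style lock on the LOCK file; the helper names (`try_lock`, `demo`) and the temporary path are made up for illustration and are not Ceph or LevelDB code. On Linux, the EAGAIN errno is what is reported as "Resource temporarily unavailable".

```python
import errno
import fcntl
import os
import tempfile


def try_lock(path):
    """Try to take an exclusive, non-blocking advisory lock on `path`.

    Returns (fd, None) on success, or (None, errno) if another
    process already holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        os.close(fd)
        if exc.errno in (errno.EAGAIN, errno.EACCES):
            return None, exc.errno
        raise
    return fd, None


def demo():
    """Hold the lock in a child process, then try to take it from the parent.

    Returns (errno seen while the child holds the lock,
             errno seen after the child has exited).
    """
    lock_path = os.path.join(tempfile.mkdtemp(), "LOCK")  # hypothetical path
    ready_r, ready_w = os.pipe()
    done_r, done_w = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: plays the role of the monitor that is still running.
        try:
            os.close(ready_r)
            os.close(done_w)
            try_lock(lock_path)
            os.write(ready_w, b"1")   # tell the parent the lock is held
            os.read(done_r, 1)        # wait until the parent has tried
        finally:
            os._exit(0)

    # Parent: plays the role of the newly started monitor.
    os.close(ready_w)
    os.close(done_r)
    os.read(ready_r, 1)
    _, err_while_held = try_lock(lock_path)
    os.write(done_w, b"1")
    os.waitpid(pid, 0)

    # Once the holder has exited, the lock is free again.
    fd, err_after_exit = try_lock(lock_path)
    if fd is not None:
        os.close(fd)
    return err_while_held, err_after_exit
```

If this sketch matches what happened here, the lock error would clear as soon as the first process exits, which is consistent with the problem going away once the cluster was healthy.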

Actions #2

Updated by Matthew Roy about 11 years ago

In that case, maybe the real problem is that the init script didn't kill the other process. This output came from running "# service ceph restart mon" on a system with a dead mon that wasn't joining quorum. The next time it happens, I'll check whether the dead mon process is still half-running.
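One way to do that check is to look for a leftover process by name before and after the restart. This is only a rough sketch for Linux: it assumes the process name is `ceph-mon` (as in the backtrace above), reads `/proc/<pid>/comm`, and uses a helper name, `pids_named`, that is invented here.

```python
import os


def pids_named(name, proc_root="/proc"):
    """Return the sorted PIDs whose command name equals `name`.

    Reads <proc_root>/<pid>/comm, so it only works on Linux-style procfs.
    """
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            continue  # process exited in the meantime, or is not readable
        if comm == name:
            pids.append(int(entry))
    return sorted(pids)
```

Calling `pids_named("ceph-mon")` right after the init script reports the mon as stopped would show whether an old process is still around and might still be holding the store lock.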

Actions #3

Updated by Matthew Roy about 11 years ago

This should probably be closed as "Can't reproduce". Now that the cluster is healthy, I can't reproduce the bug to check whether the mon process is actually dead.

Actions #4

Updated by Ian Colle almost 11 years ago

  • Status changed from 4 to Can't reproduce