Actions
Bug #5640
closedmon: FAILED assert(!db->create_and_open(std::cerr)) on 0.61.4
% Done:
0%
Source:
Community (user)
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
I've been unable to reproduce this, but it kept happening on a user's cluster while trying to upgrade from bobtail to cuttlefish 0.61.4.
2013-07-11 12:35:17.446679 7fefea7ff780 0 ceph version 0.61.4-42-gb684fcf (b684fcfd7a47001286548e28d99e3f920e9bab5b), process ceph-mon, pid 14273 2013-07-11 12:41:23.970419 7f84dc80b780 0 ceph version 0.61.4-42-gb684fcf (b684fcfd7a47001286548e28d99e3f920e9bab5b), process ceph-mon, pid 14865 2013-07-11 12:41:23.982740 7f84dc80b780 10 needs_conversion 2013-07-11 12:41:23.982764 7f84dc80b780 1 store(/etc/ceph/mon.a) mount 2013-07-11 12:41:23.982797 7f84dc80b780 15 store(/etc/ceph/mon.a) exists_bl magic 2013-07-11 12:41:23.982809 7f84dc80b780 20 _check_gv_store 2013-07-11 12:41:23.982811 7f84dc80b780 15 store(/etc/ceph/mon.a) exists_bl feature_set 2013-07-11 12:41:23.982827 7f84dc80b780 20 store(/etc/ceph/mon.a) reading at off 0 of 120 2013-07-11 12:41:23.982869 7f84dc80b780 15 store(/etc/ceph/mon.a) get_bl feature_set = 120 bytes 2013-07-11 12:41:23.982892 7f84dc80b780 1 found old GV monitor store format -- should convert! 2013-07-11 12:41:23.983958 7f84dc80b780 -1 mon/Monitor.cc: In function 'int Monitor::StoreConverter::convert()' thread 7f84dc80b780 time 2013-07-11 12:41:23.983152 mon/Monitor.cc: 4329: FAILED assert(!db->create_and_open(std::cerr)) ceph version 0.61.4-42-gb684fcf (b684fcfd7a47001286548e28d99e3f920e9bab5b) 1: (Monitor::StoreConverter::convert()+0x3d6) [0x4b2846] 2: (main()+0x16b4) [0x4750e4] 3: (__libc_start_main()+0xfd) [0x325a61ecdd] 4: /usr/bin/ceph-mon() [0x472769] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Running strace (attached) shows that store.db/LOCK was already being held when 'Monitor::StoreConvert::convert()' attempted to open the database:
[pid 15630] futex(0x2ed00ac, FUTEX_WAIT_PRIVATE, 13, NULL <unfinished ...> [pid 15626] munmap(0x7f73b8cd3000, 4096) = 0 [pid 15626] write(2, "IO error: lock /etc/ceph/mon.a/s"..., 69IO error: lock /etc/ceph/mon.a/store.db/LOCK: already held by process) = 69 [pid 15626] write(2, "\n", 1 ) = 1 [pid 15626] futex(0x325a990600, FUTEX_WAKE_PRIVATE, 2147483647) = 0 [pid 15626] futex(0x325c015a90, FUTEX_WAKE_PRIVATE, 2147483647) = 0 [pid 15626] write(2, "mon/Monitor.cc: In function 'int"..., 190mon/Monitor.cc: In function 'int Monitor::StoreConverter::convert()' thread 7f73b8cbf780 time 2013-07-11 13:00:13.261293 mon/Monitor.cc: 4329: FAILED assert(!db->create_and_open(std::cerr)) ) = 190
We're now going to resort to analysing the code.
Files
Actions