Bug #151

closed

cmon crash in PGMonitor::update_from_paxos at mon/PGMonitor.cc:90

Added by ar Fred almost 14 years ago. Updated over 13 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
Monitor
Target version:
-
% Done:

0%


Description

One of my 3 monitors crashed today; the whole Ceph cluster was idle at the time.

cmon was compiled at f7708dea1f. The analysis of the core dump follows:

(gdb) bt
#0 0x00007f262f55ca75 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x00007f262f5605c0 in *__GI_abort () at abort.c:92
#2 0x00007f262f555941 in *__GI___assert_fail (assertion=0x55b979 "success", file=<value optimized out>, line=90, function=0x55f220 "virtual bool PGMonitor::update_from_paxos()") at assert.c:81
#3 0x00000000004b432e in PGMonitor::update_from_paxos (this=<value optimized out>) at mon/PGMonitor.cc:90
#4 0x000000000047ca06 in PaxosService::_active (this=0xcb54c0) at mon/PaxosService.cc:174
#5 0x000000000047a54c in finish_contexts(std::list<Context*, std::allocator<Context*> >&, int) ()
#6 0x00000000004791d0 in Paxos::handle_last (this=0xcb52a0, last=0x7f2628004a20) at mon/Paxos.cc:281
#7 0x00000000004795fb in Paxos::dispatch (this=0xcb52a0, m=0x7f2628004a20) at mon/Paxos.cc:809
#8 0x00000000004684ac in Monitor::_ms_dispatch (this=<value optimized out>, m=0x7f2628004a20) at mon/Monitor.cc:716
#9 0x000000000047333d in Monitor::ms_dispatch(Message*) ()
#10 0x00000000004522e9 in Messenger::ms_deliver_dispatch (this=<value optimized out>) at msg/Messenger.h:97
#11 SimpleMessenger::dispatch_entry (this=<value optimized out>) at msg/SimpleMessenger.cc:323
#12 0x000000000044534c in SimpleMessenger::DispatchThread::entry (this=0xcac850) at msg/SimpleMessenger.h:494
#13 0x0000000000456d9a in Thread::_entry_func (arg=0x5c6) at ./common/Thread.h:39
#14 0x00007f26303ef9ca in start_thread (arg=<value optimized out>) at pthread_create.c:300
#15 0x00007f262f60f69d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#16 0x0000000000000000 in ?? ()

(gdb) info threads
10 Thread 1529 0x00007f262f602f53 in *__GI___poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:87
9 Thread 1526 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
8 Thread 1543 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
7 Thread 1481 0x00007f262f602f53 in *__GI___poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:87
6 Thread 1527 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
5 Thread 1525 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:211
4 Thread 1542 0x00007f262f602f53 in *__GI___poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:87
3 Thread 1528 0x00007f262f602f53 in *__GI___poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:87
2 Thread 1478 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
* 1 Thread 1524 0x00007f262f55ca75 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64

I'm keeping the core dump in case you need more information.


Files

mon0 (32.3 KB): logs for mon0 (ar Fred, 05/26/2010 10:08 AM)
mon0_pgmap.txt (2.42 KB): find -ls on the mon0 pgmap dir (ar Fred, 05/26/2010 10:19 AM)
mon1_pgmap.txt (1.44 KB): find -ls on the mon1 pgmap dir (ar Fred, 05/26/2010 10:19 AM)
mon1.gz (42.4 KB): mon1 log (ar Fred, 05/26/2010 10:23 AM)
Actions #1

Updated by ar Fred almost 14 years ago

Actions #4

Updated by ar Fred almost 14 years ago

Actions #5

Updated by Sage Weil almost 14 years ago

  • Status changed from New to Resolved