Bug #2716
crash when cluster goes down and new one comes up
Status: Resolved
Priority: High
Assignee: -
Category: libceph
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):
Description
- vstart cluster
- mount uml
- do some stuff (dbench, control-c, sync)
- stop vstart cluster
- (waited a while)
- start new vstart cluster
-> crash
uml log attached
History
#1 Updated by Sage Weil about 11 years ago
- File c.gz added
#2 Updated by Sage Weil about 11 years ago
#0 0x00007f25855c2757 in kill () at ../sysdeps/unix/syscall-template.S:82
#1 0x0000000060031fe6 in uml_abort () at arch/um/os-Linux/util.c:93
#2 0x000000006003224a in os_dump_core () at arch/um/os-Linux/util.c:138
#3 0x000000006001fb9d in panic_exit (self=<optimized out>, unused1=<optimized out>, unused2=<optimized out>) at arch/um/kernel/um_arch.c:240
#4 0x000000006005c700 in notifier_call_chain (nl=<optimized out>, val=0, v=0x605469a0, nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
#5 0x000000006005c78f in __atomic_notifier_call_chain (nr_calls=0x0, nr_to_call=-1, v=<optimized out>, val=<optimized out>, nh=<optimized out>) at kernel/notifier.c:182
#6 atomic_notifier_call_chain (nh=<optimized out>, val=<optimized out>, v=<optimized out>) at kernel/notifier.c:191
#7 0x0000000060362451 in panic (fmt=0x603e8022 "Segfault with no mm") at kernel/panic.c:120
#8 0x000000006001f709 in segv (fi=..., ip=1614078625, is_user=0, regs=0x6046f810) at arch/um/kernel/trap.c:232
#9 0x000000006001f90f in segv_handler (sig=<optimized out>, regs=<optimized out>) at arch/um/kernel/trap.c:184
#10 0x0000000060030f1c in sig_handler_common (sig=11, mc=0x6046fbe8) at arch/um/os-Linux/signal.c:43
#11 0x0000000060031074 in sig_handler (sig=<optimized out>, mc=<optimized out>) at arch/um/os-Linux/signal.c:230
#12 0x0000000060030b57 in hard_handler (sig=<optimized out>, info=<optimized out>, p=0x6046fbc0) at arch/um/os-Linux/signal.c:164
#13 <signal handler called>
#14 mon_alloc_msg (con=0x7d8eb2e8, hdr=0x7d8eb698, skip=0x800bfdcc) at net/ceph/mon_client.c:988
#15 0x000000006034aeed in ceph_con_in_msg_alloc (hdr=<optimized out>, con=0x7d8eb2e8) at net/ceph/messenger.c:2702
#16 read_partial_message (con=0x7d8eb2e8) at net/ceph/messenger.c:1842
#17 try_read (con=0x7d8eb2e8) at net/ceph/messenger.c:2179
#18 con_work (work=0x7d8eb708) at net/ceph/messenger.c:2288
#19 0x00000000600527d7 in process_one_work (worker=0x8004e0a0, work=0x7d8eb708) at kernel/workqueue.c:1876
#20 0x0000000060053b3a in worker_thread (__worker=0x8004e0a0) at kernel/workqueue.c:1987
#21 0x0000000060057620 in kthread (_create=0x800b9ca8) at kernel/kthread.c:121
#22 0x000000006002fdda in run_kernel_thread (fn=0x60057555 <kthread>, arg=0x800b9ca8, jmp_ptr=<optimized out>) at arch/um/os-Linux/process.c:257
#23 0x000000006001d0c2 in new_thread_handler () at arch/um/kernel/process.c:153
(gdb) f 14
#14 mon_alloc_msg (con=0x7d8eb2e8, hdr=0x7d8eb698, skip=0x800bfdcc) at net/ceph/mon_client.c:988
988 m = ceph_msg_get(monc->m_auth_reply);
(gdb) p monc
$1 = (struct ceph_mon_client *) 0x0
#3 Updated by Sage Weil about 11 years ago
- Status changed from New to Resolved
bad con->private = NULL in monc __close_session
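
For readers following the backtrace, here is a minimal user-space sketch of the failure mode described above: a close path clears the connection's back-pointer, and a later message-allocation callback finds a NULL owner. The struct layouts, field names, and function names below are simplified, hypothetical stand-ins and not the kernel code; the NULL check in alloc_msg() exists only so the sketch can report the condition instead of crashing, and is not meant to represent the actual fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct ceph_mon_client. */
struct mon_client {
    int auth_reply;              /* stand-in for monc->m_auth_reply */
};

/* Hypothetical, simplified stand-in for struct ceph_connection. */
struct connection {
    void *private;               /* back-pointer to the owning mon_client */
};

/* Problem pattern: the close path clears the back-pointer even though
 * the connection may still deliver incoming messages later. */
static void close_session_buggy(struct connection *con)
{
    con->private = NULL;
}

/* Alternative pattern: the close path leaves the back-pointer intact. */
static void close_session_keep_private(struct connection *con)
{
    (void)con;                   /* nothing to clear in this sketch */
}

/* Stand-in for the alloc callback in frame #14. In the report, the
 * equivalent of this function dereferenced monc while it was NULL.
 * Here a NULL owner is reported by returning NULL. */
static int *alloc_msg(struct connection *con)
{
    struct mon_client *monc = con->private;

    if (monc == NULL)
        return NULL;             /* the reported crash happened at this point */
    return &monc->auth_reply;
}
```

Calling alloc_msg() after close_session_buggy() yields NULL, which corresponds to the `p monc` output of 0x0 in frame 14; after close_session_keep_private() the owner is still reachable.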