Bug #1549
mds: zeroed root CDir* vtable in scatter_writebehind_finish
Status: Closed
% Done: 0%
Description
Logs are in teuthology:~teuthworker/archive/nightly_coverage_2011-09-20/342/
2011-09-20T11:47:50.064 INFO:teuthology.task.ceph.mds.0.err:*** Caught signal (Segmentation fault) **
2011-09-20T11:47:50.064 INFO:teuthology.task.ceph.mds.0.err: in thread 0x7f089d459700
2011-09-20T11:47:50.066 INFO:teuthology.task.ceph.mds.0.err: ceph version 0.34-549-gd64237a (commit:d64237a6a555944d6d35676490bc4fb7c7db965d)
2011-09-20T11:47:50.066 INFO:teuthology.task.ceph.mds.0.err: 1: /tmp/cephtest/binary/usr/local/bin/cmds() [0x8ec204]
2011-09-20T11:47:50.066 INFO:teuthology.task.ceph.mds.0.err: 2: (()+0xfb40) [0x7f08a0cd1b40]
2011-09-20T11:47:50.066 INFO:teuthology.task.ceph.mds.0.err: 3: (Mutation::drop_local_auth_pins()+0x39) [0x5920a9]
2011-09-20T11:47:50.066 INFO:teuthology.task.ceph.mds.0.err: 4: (Mutation::cleanup()+0x11) [0x592fa1]
2011-09-20T11:47:50.067 INFO:teuthology.task.ceph.mds.0.err: 5: (Locker::scatter_writebehind_finish(ScatterLock*, Mutation*)+0x1f5) [0x68d785]
2011-09-20T11:47:50.067 INFO:teuthology.task.ceph.mds.0.err: 6: (Locker::C_Locker_ScatterWB::finish(int)+0x1d) [0x69a23d]
2011-09-20T11:47:50.067 INFO:teuthology.task.ceph.mds.0.err: 7: (Context::complete(int)+0x12) [0x49b862]
2011-09-20T11:47:50.067 INFO:teuthology.task.ceph.mds.0.err: 8: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x14e) [0x7eadbe]
2011-09-20T11:47:50.067 INFO:teuthology.task.ceph.mds.0.err: 9: (Journaler::_finish_flush(int, unsigned long, utime_t)+0x206) [0x7e2a16]
2011-09-20T11:47:50.068 INFO:teuthology.task.ceph.mds.0.err: 10: (Journaler::C_Flush::finish(int)+0x1d) [0x7eafcd]
2011-09-20T11:47:50.068 INFO:teuthology.task.ceph.mds.0.err: 11: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xd8a) [0x7b160a]
2011-09-20T11:47:50.068 INFO:teuthology.task.ceph.mds.0.err: 12: (MDS::handle_core_message(Message*)+0xedf) [0x4c5fff]
2011-09-20T11:47:50.068 INFO:teuthology.task.ceph.mds.0.err: 13: (MDS::_dispatch(Message*)+0x3c) [0x4c615c]
2011-09-20T11:47:50.068 INFO:teuthology.task.ceph.mds.0.err: 14: (MDS::ms_dispatch(Message*)+0x97) [0x4c8697]
2011-09-20T11:47:50.069 INFO:teuthology.task.ceph.mds.0.err: 15: (SimpleMessenger::dispatch_entry()+0x9d2) [0x822012]
2011-09-20T11:47:50.069 INFO:teuthology.task.ceph.mds.0.err: 16: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x492b2c]
2011-09-20T11:47:50.069 INFO:teuthology.task.ceph.mds.0.err: 17: (Thread::_entry_func(void*)+0x12) [0x816052]
2011-09-20T11:47:50.069 INFO:teuthology.task.ceph.mds.0.err: 18: (()+0x7971) [0x7f08a0cc9971]
2011-09-20T11:47:50.069 INFO:teuthology.task.ceph.mds.0.err: 19: (clone()+0x6d) [0x7f089f75d92d]
Updated by Sage Weil over 12 years ago
Grr, I ran a loop on #1464 for days and wasn't able to hit this. Want to see the mds log to see how we got into this corner.
Updated by Sage Weil over 12 years ago
{CDentry,CInode,CDir}::auth_pin() pin the object too, so i'm not sure how we can have a use-after-free in the code that is dropping auth pins.
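For readers unfamiliar with the pattern being discussed: the pinning scheme can be sketched roughly as below. This is a hypothetical simplification, not the real `Mutation`/`MDSCacheObject` code — each `auth_pin()` bumps a count on the cache object (which is supposed to keep it alive) and records the pointer in the mutation, and `drop_local_auth_pins()` walks that set and unpins. A use-after-free at frame 3 would mean some path destroyed an object without removing it from the mutation's pin set, despite the pin count.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-ins for MDSCacheObject and Mutation -- a sketch of the
// auth-pin pattern under discussion, not the actual Ceph implementation.
struct CacheObject {
  int auth_pin_count = 0;
  void auth_pin()   { ++auth_pin_count; }  // pinned objects must not be freed
  void auth_unpin() { --auth_pin_count; }
};

struct Mutation {
  std::set<CacheObject*> auth_pins;  // objects this operation has pinned

  void auth_pin(CacheObject* o) {
    if (auth_pins.insert(o).second)  // pin each object at most once
      o->auth_pin();
  }

  // The crash site: if any pointer in auth_pins has gone stale (the object
  // was destroyed without being removed from this set), this deref faults.
  void drop_local_auth_pins() {
    for (CacheObject* o : auth_pins)
      o->auth_unpin();
    auth_pins.clear();
  }
};
```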
Updated by Sage Weil over 12 years ago
- Target version changed from v0.37 to v0.38
Updated by Josh Durgin over 12 years ago
This happened again today after fsstress. From teuthology:~teuthworker/archive/nightly_coverage_2011-10-27/1083/teuthology.log:
2011-10-27T01:36:40.407 INFO:teuthology.task.ceph:Shutting down mds daemons...
2011-10-27T01:36:40.412 INFO:teuthology.task.ceph.mds.0.err:*** Caught signal (Segmentation fault) **
2011-10-27T01:36:40.412 INFO:teuthology.task.ceph.mds.0.err: in thread 7fe7a7ffc700
2011-10-27T01:36:40.414 INFO:teuthology.task.ceph.mds.0.err: ceph version 0.37-190-g11691a7 (commit:11691a7111d7329a6d11e25ad19005e3824e9dbb)
2011-10-27T01:36:40.414 INFO:teuthology.task.ceph.mds.0.err: 1: /tmp/cephtest/binary/usr/local/bin/ceph-mds() [0x90d164]
2011-10-27T01:36:40.414 INFO:teuthology.task.ceph.mds.0.err: 2: (()+0xfb40) [0x7fe7ab874b40]
2011-10-27T01:36:40.414 INFO:teuthology.task.ceph.mds.0.err: 3: (Mutation::drop_local_auth_pins()+0x39) [0x5952c9]
2011-10-27T01:36:40.415 INFO:teuthology.task.ceph.mds.0.err: 4: (Mutation::cleanup()+0x11) [0x5961c1]
2011-10-27T01:36:40.415 INFO:teuthology.task.ceph.mds.0.err: 5: (Locker::scatter_writebehind_finish(ScatterLock*, Mutation*)+0x1f5) [0x690af5]
2011-10-27T01:36:40.415 INFO:teuthology.task.ceph.mds.0.err: 6: (Locker::C_Locker_ScatterWB::finish(int)+0x1d) [0x69d50d]
2011-10-27T01:36:40.415 INFO:teuthology.task.ceph.mds.0.err: 7: (Context::complete(int)+0x12) [0x49e402]
2011-10-27T01:36:40.415 INFO:teuthology.task.ceph.mds.0.err: 8: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x14e) [0x7e2a6e]
2011-10-27T01:36:40.415 INFO:teuthology.task.ceph.mds.0.err: 9: (Journaler::_finish_flush(int, unsigned long, utime_t)+0x206) [0x7d6ac6]
2011-10-27T01:36:40.415 INFO:teuthology.task.ceph.mds.0.err: 10: (Journaler::C_Flush::finish(int)+0x1d) [0x7e2c9d]
2011-10-27T01:36:40.415 INFO:teuthology.task.ceph.mds.0.err: 11: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xf39) [0x7b2549]
2011-10-27T01:36:40.416 INFO:teuthology.task.ceph.mds.0.err: 12: (MDS::handle_core_message(Message*)+0xecf) [0x4ca93f]
2011-10-27T01:36:40.416 INFO:teuthology.task.ceph.mds.0.err: 13: (MDS::_dispatch(Message*)+0x3c) [0x4caa9c]
2011-10-27T01:36:40.416 INFO:teuthology.task.ceph.mds.0.err: 14: (MDS::ms_dispatch(Message*)+0xa9) [0x4cd089]
2011-10-27T01:36:40.416 INFO:teuthology.task.ceph.mds.0.err: 15: (SimpleMessenger::dispatch_entry()+0x9c2) [0x817ad2]
2011-10-27T01:36:40.416 INFO:teuthology.task.ceph.mds.0.err: 16: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x49551c]
2011-10-27T01:36:40.416 INFO:teuthology.task.ceph.mds.0.err: 17: (Thread::_entry_func(void*)+0x12) [0x8126a2]
2011-10-27T01:36:40.416 INFO:teuthology.task.ceph.mds.0.err: 18: (()+0x7971) [0x7fe7ab86c971]
2011-10-27T01:36:40.417 INFO:teuthology.task.ceph.mds.0.err: 19: (clone()+0x6d) [0x7fe7aa30092d]
Updated by Sage Weil over 12 years ago
- Status changed from New to Need More Info
- Assignee set to Sage Weil
bleh. need logs... i'll start this up in a loop again.
Updated by Sage Weil over 12 years ago
- Target version changed from v0.38 to v0.39
Updated by Josh Durgin over 12 years ago
This happened after the misc workunit today.
Updated by Sage Weil over 12 years ago
- Assignee deleted (Sage Weil)
Someone needs to try to reproduce this with logs. fwiw metropolis:~sage/src/teuthology/hammer.sh is what i've been using.
Updated by Anonymous over 12 years ago
This happened again on 11/16 in job 2056, kclient_workunit_kernel_untar_build.
2011-11-16T00:36:30.996 INFO:teuthology.task.ceph.mds.0.err:*** Caught signal (Segmentation fault) **
2011-11-16T00:36:30.997 INFO:teuthology.task.ceph.mds.0.err: in thread 7fbe995ef700
2011-11-16T00:36:30.998 INFO:teuthology.task.ceph.mds.0.err: ceph version 0.38-181-g2e19550 (commit:2e195500b5d3a8ab8512bcf2a219a6b7ff922c97)
2011-11-16T00:36:30.999 INFO:teuthology.task.ceph.mds.0.err: 1: /tmp/cephtest/binary/usr/local/bin/ceph-mds() [0x913774]
2011-11-16T00:36:30.999 INFO:teuthology.task.ceph.mds.0.err: 2: (()+0xfb40) [0x7fbe9d06cb40]
2011-11-16T00:36:30.999 INFO:teuthology.task.ceph.mds.0.err: 3: (Mutation::drop_local_auth_pins()+0x39) [0x5962a9]
2011-11-16T00:36:30.999 INFO:teuthology.task.ceph.mds.0.err: 4: (Mutation::cleanup()+0x11) [0x5971a1]
2011-11-16T00:36:30.999 INFO:teuthology.task.ceph.mds.0.err: 5: (Locker::scatter_writebehind_finish(ScatterLock*, Mutation*)+0x1f5) [0x691aa5]
2011-11-16T00:36:31.000 INFO:teuthology.task.ceph.mds.0.err: 6: (Locker::C_Locker_ScatterWB::finish(int)+0x1d) [0x69e4cd]
2011-11-16T00:36:31.000 INFO:teuthology.task.ceph.mds.0.err: 7: (Context::complete(int)+0x12) [0x49f1f2]
2011-11-16T00:36:31.000 INFO:teuthology.task.ceph.mds.0.err: 8: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x14e) [0x7e23ae]
2011-11-16T00:36:31.000 INFO:teuthology.task.ceph.mds.0.err: 9: (Journaler::_finish_flush(int, unsigned long, utime_t)+0x206) [0x7d6406]
2011-11-16T00:36:31.001 INFO:teuthology.task.ceph.mds.0.err: 10: (Journaler::C_Flush::finish(int)+0x1d) [0x7e25dd]
2011-11-16T00:36:31.001 INFO:teuthology.task.ceph.mds.0.err: 11: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x100c) [0x7b43ac]
2011-11-16T00:36:31.001 INFO:teuthology.task.ceph.mds.0.err: 12: (MDS::handle_core_message(Message*)+0xebf) [0x4cb72f]
2011-11-16T00:36:31.001 INFO:teuthology.task.ceph.mds.0.err: 13: (MDS::_dispatch(Message*)+0x3c) [0x4cb88c]
2011-11-16T00:36:31.001 INFO:teuthology.task.ceph.mds.0.err: 14: (MDS::ms_dispatch(Message*)+0xa5) [0x4cde85]
2011-11-16T00:36:31.002 INFO:teuthology.task.ceph.mds.0.err: 15: (SimpleMessenger::dispatch_entry()+0x99a) [0x8178fa]
2011-11-16T00:36:31.002 INFO:teuthology.task.ceph.mds.0.err: 16: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x49630c]
2011-11-16T00:36:31.002 INFO:teuthology.task.ceph.mds.0.err: 17: (Thread::_entry_func(void*)+0x12) [0x812442]
2011-11-16T00:36:31.002 INFO:teuthology.task.ceph.mds.0.err: 18: (()+0x7971) [0x7fbe9d064971]
2011-11-16T00:36:31.002 INFO:teuthology.task.ceph.mds.0.err: 19: (clone()+0x6d) [0x7fbe9b8f392d]
2011-11-16T00:36:31.366 INFO:teuthology.task.ceph.mds.0.err:daemon-helper: command crashed with signal 11
Updated by Sage Weil over 12 years ago
- Status changed from Need More Info to In Progress
- Assignee set to Sage Weil
- Priority changed from Normal to High
Updated by Sage Weil over 12 years ago
- Target version changed from v0.39 to v0.40
Updated by Sage Weil over 12 years ago
Happened twice today:
#0 0x00007f20be7fba0b in raise () from /lib/libpthread.so.0
#1 0x0000000000916a4b in reraise_fatal (signum=3392) at global/signal_handler.cc:59
#2 0x000000000091722c in handle_fatal_signal (signum=<value optimized out>) at global/signal_handler.cc:106
#3 <signal handler called>
#4 0x0000000000596929 in Mutation::drop_local_auth_pins (this=0x2df12a00) at mds/Mutation.cc:91
#5 0x0000000000597821 in Mutation::cleanup (this=0x225a000) at mds/Mutation.cc:163
#6 0x0000000000682765 in Locker::scatter_writebehind_finish (this=0x2214a00, lock=0x22497d0, mut=0x2df12a00) at mds/Locker.cc:3625
#7 0x000000000069e64d in Locker::C_Locker_ScatterWB::finish(int) ()
#8 0x000000000049f7d2 in Context::complete (this=0x225a000, r=770779648) at ./include/Context.h:41
#9 0x00000000007e28ae in finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int) ()
#10 0x00000000007d6906 in Journaler::_finish_flush (this=0x2243000, r=<value optimized out>, start=788596852, stamp=<value optimized out>) at osdc/Journaler.cc:419
#11 0x00000000007e2add in Journaler::C_Flush::finish(int) ()
#12 0x00000000007b780e in Objecter::handle_osd_op_reply (this=0x2232000, m=0x538a1c0) at osdc/Objecter.cc:1205
#13 0x00000000004cbd0f in MDS::handle_core_message (this=0x2225a00, m=0x538a1c0) at mds/MDS.cc:1695
#14 0x00000000004cbe6c in MDS::_dispatch (this=0x2225a00, m=0x538a1c0) at mds/MDS.cc:1818
#15 0x00000000004ce465 in MDS::ms_dispatch (this=0x2225a00, m=0x538a1c0) at mds/MDS.cc:1631
#16 0x000000000081ab5a in ms_deliver_dispatch (this=0x2225000) at msg/Messenger.h:102
#17 SimpleMessenger::dispatch_entry (this=0x2225000) at msg/SimpleMessenger.cc:358
#18 0x000000000049684c in SimpleMessenger::DispatchThread::entry (this=0x2225488) at ./msg/SimpleMessenger.h:549
#19 0x00000000008156a2 in Thread::_entry_func (arg=0x225a000) at common/Thread.cc:41
#20 0x00007f20be7f3971 in start_thread () from /lib/libpthread.so.0
#21 0x00007f20bd08292d in clone () from /lib/libc.so.6
#22 0x0000000000000000 in ?? ()
or
#0 0x00007f539323ea0b in raise () from /lib/libpthread.so.0
#1 0x0000000000916a4b in reraise_fatal (signum=11357) at global/signal_handler.cc:59
#2 0x000000000091722c in handle_fatal_signal (signum=<value optimized out>) at global/signal_handler.cc:106
#3 <signal handler called>
#4 0x000000000074046f in CInode::finish_scatter_gather_update (this=0x1082000, type=1024) at mds/CInode.cc:1724
#5 0x0000000000670ebe in Locker::scatter_writebehind (this=0x104ea00, lock=0x10827d0) at mds/Locker.cc:3588
#6 0x0000000000671952 in Locker::simple_lock (this=0x104ea00, lock=0x10827d0, need_issue=0x7f538f7bf9ff) at mds/Locker.cc:3471
#7 0x0000000000676d3f in Locker::scatter_eval (this=0x104ea00, lock=0x10827d0, need_issue=0x7f538f7bf9ff) at mds/Locker.cc:3665
#8 0x000000000067782d in Locker::eval (this=0x1093000, lock=0x400, need_issue=0x0) at mds/Locker.cc:971
#9 0x0000000000678070 in Locker::try_eval (this=0x104ea00, lock=0x10827d0, pneed_issue=0x7f538f7bf9ff) at mds/Locker.cc:915
#10 0x000000000067c431 in Locker::eval_gather (this=0x104ea00, lock=0x10827d0, first=<value optimized out>, pneed_issue=<value optimized out>, pfinishers=<value optimized out>) at mds/Locker.cc:751
#11 0x000000000068140d in Locker::wrlock_finish (this=0x104ea00, lock=0x10827d0, mut=0x19e7a00, pneed_issue=<value optimized out>) at mds/Locker.cc:1259
#12 0x00000000006819bd in Locker::_drop_non_rdlocks (this=0x104ea00, mut=0x19e7a00, pneed_issue=0x7f538f7bfb20) at mds/Locker.cc:491
#13 0x00000000006824a4 in Locker::drop_locks (this=0x104ea00, mut=0x19e7a00, pneed_issue=0x7f538f7bfb20) at mds/Locker.cc:524
#14 0x0000000000682755 in Locker::scatter_writebehind_finish (this=0x104ea00, lock=0x10827d0, mut=0x19e7a00) at mds/Locker.cc:3624
#15 0x000000000069e64d in Locker::C_Locker_ScatterWB::finish(int) ()
#16 0x000000000049f7d2 in Context::complete (this=0x1093000, r=1024) at ./include/Context.h:41
#17 0x00000000007e28ae in finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int) ()
#18 0x00000000007d6906 in Journaler::_finish_flush (this=0x107b000, r=<value optimized out>, start=12757588, stamp=<value optimized out>) at osdc/Journaler.cc:419
#19 0x00000000007e2add in Journaler::C_Flush::finish(int) ()
#20 0x00000000007b780e in Objecter::handle_osd_op_reply (this=0x106c000, m=0x1075a80) at osdc/Objecter.cc:1205
#21 0x00000000004cbd0f in MDS::handle_core_message (this=0x105fa00, m=0x1075a80) at mds/MDS.cc:1695
#22 0x00000000004cbe6c in MDS::_dispatch (this=0x105fa00, m=0x1075a80) at mds/MDS.cc:1818
#23 0x00000000004ce465 in MDS::ms_dispatch (this=0x105fa00, m=0x1075a80) at mds/MDS.cc:1631
#24 0x000000000081ab5a in ms_deliver_dispatch (this=0x105f000) at msg/Messenger.h:102
#25 SimpleMessenger::dispatch_entry (this=0x105f000) at msg/SimpleMessenger.cc:358
#26 0x000000000049684c in SimpleMessenger::DispatchThread::entry (this=0x105f488) at ./msg/SimpleMessenger.h:549
#27 0x00000000008156a2 in Thread::_entry_func (arg=0x1093000) at common/Thread.cc:41
#28 0x00007f5393236971 in start_thread () from /lib/libpthread.so.0
#29 0x00007f5391ac592d in clone () from /lib/libc.so.6
In both cases, we crash calling a method on a CDir* that has a zeroed vtable. In both cases, dir->inode->inode.ino == 1.
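For context on the "zeroed vtable" observation: on typical C++ ABIs the first word of a polymorphic object is its vtable pointer, so a virtual call through an object whose storage has been scrubbed to zero after free (as some debug heaps do) jumps through address 0 and segfaults — the classic use-after-free signature. A minimal illustration of the mechanism (the `memset` stands in for whatever zeroed the freed CDir's memory; it reads the raw bytes rather than making the ill-formed virtual call):

```cpp
#include <cstring>
#include <new>
#include <utility>

struct Base {
  virtual ~Base() = default;
  virtual int frozen() const { return 0; }  // any virtual call goes via the vptr
};

// Construct a Base in local storage, read its first pointer-sized word (the
// vptr on typical ABIs such as Itanium/x86-64), destroy it, zero the storage
// the way a scrubbing allocator might, and read the first word again.
std::pair<void*, void*> vptr_before_and_after_scrub() {
  alignas(Base) unsigned char storage[sizeof(Base)] = {};
  Base* obj = new (storage) Base;

  void* before;
  std::memcpy(&before, storage, sizeof(void*));  // live object: non-null vptr

  obj->~Base();
  std::memset(storage, 0, sizeof(storage));      // simulate free-time poisoning

  void* after;
  std::memcpy(&after, storage, sizeof(void*));   // "zeroed vtable"
  return {before, after};
}
```

A virtual call through a stale pointer to such storage dereferences address 0, which matches the faults seen in frames like `CInode::finish_scatter_gather_update`.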
Updated by Sage Weil over 12 years ago
- Subject changed from mds segfault after trivial sync workunit on cfuse to mds: zeroed root CDir* vtable in scatter_writebehind_finish
Updated by Sage Weil over 12 years ago
the tasks were in nightly_coverage_2011-11-30-a
3433: collection:basic clusters:fixed-3.yaml tasks:kclient_workunit_kernel_untar_build.yaml
3435: collection:basic clusters:fixed-3.yaml tasks:kclient_workunit_suites_ffsb.yaml
Updated by Sage Weil over 12 years ago
- Assignee deleted (Sage Weil)
I think the next step here is to run the mds under valgrind.
Updated by Josh Durgin over 12 years ago
Happened again in teuthology:~teuthworker/archive/nightly_coverage_2011-12-13-a/4183/remote/ubuntu@sepia74.ceph.dreamhost.com/log/mds.0.log.gz
Updated by Sage Weil over 12 years ago
- Status changed from In Progress to Need More Info
Updated by Josh Durgin over 12 years ago
Happened again in teuthology:~teuthworker/archive/nightly_coverage_2011-12-29-a/5318/remote/ubuntu@sepia60.ceph.dreamhost.com/log/mds.0.log.gz
Updated by Sage Weil over 12 years ago
hit this again, nightly_coverage_2011-12-29-b/5388
- kclient: null
- locktest:
  - client.0
  - client.1
Updated by Sage Weil over 12 years ago
- Target version deleted (v0.40)
Updated by Josh Durgin over 12 years ago
Happened again in teuthology:~teuthworker/archive/nightly_coverage_2012-01-15-b/7721/remote/ubuntu@sepia6.ceph.dreamhost.com/log/mds.a.log.gz
Updated by Sage Weil about 12 years ago
again,
2012-01-27T15:46:22.731 INFO:teuthology.task.ceph:Shutting down mds daemons...
2012-01-27T15:46:22.733 INFO:teuthology.task.ceph.mds.a.err:*** Caught signal (Terminated) **
2012-01-27T15:46:22.733 INFO:teuthology.task.ceph.mds.a.err: in thread 7fe843b50780. Shutting down.
2012-01-27T15:46:22.749 INFO:teuthology.task.ceph.mds.a.err:*** Caught signal (Segmentation fault) **
2012-01-27T15:46:22.749 INFO:teuthology.task.ceph.mds.a.err: in thread 7fe83fcb2700
2012-01-27T15:46:22.755 INFO:teuthology.task.ceph.mds.a.err: ceph version 0.40-242-g374fec4 (commit:374fec47253bad511eee52d372f182402fb17b1a)
2012-01-27T15:46:22.755 INFO:teuthology.task.ceph.mds.a.err: 1: /tmp/cephtest/binary/usr/local/bin/ceph-mds() [0x9219c4]
2012-01-27T15:46:22.756 INFO:teuthology.task.ceph.mds.a.err: 2: (()+0xfb40) [0x7fe84372fb40]
2012-01-27T15:46:22.756 INFO:teuthology.task.ceph.mds.a.err: 3: (CInode::finish_scatter_gather_update(int)+0x128f) [0x7433ff]
2012-01-27T15:46:22.756 INFO:teuthology.task.ceph.mds.a.err: 4: (Locker::scatter_writebehind(ScatterLock*)+0x5ce) [0x67389e]
2012-01-27T15:46:22.756 INFO:teuthology.task.ceph.mds.a.err: 5: (Locker::simple_lock(SimpleLock*, bool*)+0x5e2) [0x674332]
2012-01-27T15:46:22.757 INFO:teuthology.task.ceph.mds.a.err: 6: (Locker::scatter_eval(ScatterLock*, bool*)+0x58f) [0x67971f]
2012-01-27T15:46:22.757 INFO:teuthology.task.ceph.mds.a.err: 7: (Locker::eval(SimpleLock*, bool*)+0x6d) [0x67a20d]
2012-01-27T15:46:22.757 INFO:teuthology.task.ceph.mds.a.err: 8: (Locker::try_eval(SimpleLock*, bool*)+0x830) [0x67aa50]
2012-01-27T15:46:22.757 INFO:teuthology.task.ceph.mds.a.err: 9: (Locker::eval_gather(SimpleLock*, bool, bool*, std::list<Context*, std::allocator<Context*> >*)+0x1c31) [0x67ee11]
2012-01-27T15:46:22.757 INFO:teuthology.task.ceph.mds.a.err: 10: (Locker::wrlock_finish(SimpleLock*, Mutation*, bool*)+0x45d) [0x683ded]
2012-01-27T15:46:22.757 INFO:teuthology.task.ceph.mds.a.err: 11: (Locker::_drop_non_rdlocks(Mutation*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x19d) [0x68439d]
2012-01-27T15:46:22.758 INFO:teuthology.task.ceph.mds.a.err: 12: (Locker::drop_locks(Mutation*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x94) [0x684e84]
2012-01-27T15:46:22.758 INFO:teuthology.task.ceph.mds.a.err: 13: (Locker::scatter_writebehind_finish(ScatterLock*, Mutation*)+0x1e5) [0x685135]
2012-01-27T15:46:22.758 INFO:teuthology.task.ceph.mds.a.err: 14: (Locker::C_Locker_ScatterWB::finish(int)+0x1d) [0x6a102d]
2012-01-27T15:46:22.758 INFO:teuthology.task.ceph.mds.a.err: 15: (Context::complete(int)+0x12) [0x4a0cc2]
2012-01-27T15:46:22.758 INFO:teuthology.task.ceph.mds.a.err: 16: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x14e) [0x7e765e]
2012-01-27T15:46:22.759 INFO:teuthology.task.ceph.mds.a.err: 17: (Journaler::_finish_flush(int, unsigned long, utime_t)+0x1fd) [0x7d9a4d]
2012-01-27T15:46:22.759 INFO:teuthology.task.ceph.mds.a.err: 18: (Journaler::C_Flush::finish(int)+0x1d) [0x7e788d]
2012-01-27T15:46:22.759 INFO:teuthology.task.ceph.mds.a.err: 19: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x173f) [0x7bb7bf]
2012-01-27T15:46:22.759 INFO:teuthology.task.ceph.mds.a.err: 20: (MDS::handle_core_message(Message*)+0xecf) [0x4c8adf]
2012-01-27T15:46:22.759 INFO:teuthology.task.ceph.mds.a.err: 21: (MDS::_dispatch(Message*)+0x3c) [0x4c8c3c]
2012-01-27T15:46:22.760 INFO:teuthology.task.ceph.mds.a.err: 22: (MDS::ms_dispatch(Message*)+0xa9) [0x4cb229]
2012-01-27T15:46:22.760 INFO:teuthology.task.ceph.mds.a.err: 23: (SimpleMessenger::dispatch_entry()+0xa1a) [0x8206ea]
2012-01-27T15:46:22.760 INFO:teuthology.task.ceph.mds.a.err: 24: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x497b9c]
2012-01-27T15:46:22.760 INFO:teuthology.task.ceph.mds.a.err: 25: (Thread::_entry_func(void*)+0x12) [0x81ac22]
2012-01-27T15:46:22.762 INFO:teuthology.task.ceph.mds.a.err: 26: (()+0x7971) [0x7fe843727971]
2012-01-27T15:46:22.763 INFO:teuthology.task.ceph.mds.a.err: 27: (clone()+0x6d) [0x7fe841fb692d]
2012-01-27T15:46:22.958 INFO:teuthology.task.ceph.mds.a.err:daemon-helper: command crashed with signal 11
i wonder if this is just an issue with the signal handler racing with the other threads? i think most (all?) of these crashes are when daemon-helper sends a signal to the process....
Updated by Sage Weil about 12 years ago
- Status changed from Need More Info to Resolved
using clean shutdown now, yay
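The "clean shutdown" resolution amounts to the standard pattern of catching the termination signal, setting a flag, and letting the dispatch loop exit on its own rather than dying while a thread is mid-callback. A generic sketch of that pattern (hypothetical, not the actual ceph-mds or daemon-helper code):

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

// Generic clean-shutdown sketch: the SIGTERM handler only flips a flag (an
// async-signal-safe operation); the dispatch loop checks it between messages,
// so no thread is torn down while holding locks or running a callback.
std::atomic<bool> shutdown_requested{false};

extern "C" void handle_term(int) {
  shutdown_requested.store(true);
}

// Stand-in for a messenger dispatch loop; returns how many "messages" it
// processed before shutting down.
int run_dispatch_loop(int max_iterations) {
  std::signal(SIGTERM, handle_term);
  int dispatched = 0;
  while (!shutdown_requested.load() && dispatched < max_iterations) {
    ++dispatched;  // stand-in for handling one message
  }
  return dispatched;  // loop exits cleanly, state left consistent
}
```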
Updated by John Spray over 7 years ago
- Project changed from Ceph to CephFS
- Category deleted (1)
Bulk updating project=ceph category=mds bugs so that I can remove the MDS category from the Ceph project to avoid confusion.