Bug #1549
mds: zeroed root CDir* vtable in scatter_writebehind_finish
Description
Logs are in teuthology:~teuthworker/archive/nightly_coverage_2011-09-20/342/
2011-09-20T11:47:50.064 INFO:teuthology.task.ceph.mds.0.err:*** Caught signal (Segmentation fault) **
2011-09-20T11:47:50.064 INFO:teuthology.task.ceph.mds.0.err: in thread 0x7f089d459700
2011-09-20T11:47:50.066 INFO:teuthology.task.ceph.mds.0.err: ceph version 0.34-549-gd64237a (commit:d64237a6a555944d6d35676490bc4fb7c7db965d)
2011-09-20T11:47:50.066 INFO:teuthology.task.ceph.mds.0.err: 1: /tmp/cephtest/binary/usr/local/bin/cmds() [0x8ec204]
2011-09-20T11:47:50.066 INFO:teuthology.task.ceph.mds.0.err: 2: (()+0xfb40) [0x7f08a0cd1b40]
2011-09-20T11:47:50.066 INFO:teuthology.task.ceph.mds.0.err: 3: (Mutation::drop_local_auth_pins()+0x39) [0x5920a9]
2011-09-20T11:47:50.066 INFO:teuthology.task.ceph.mds.0.err: 4: (Mutation::cleanup()+0x11) [0x592fa1]
2011-09-20T11:47:50.067 INFO:teuthology.task.ceph.mds.0.err: 5: (Locker::scatter_writebehind_finish(ScatterLock*, Mutation*)+0x1f5) [0x68d785]
2011-09-20T11:47:50.067 INFO:teuthology.task.ceph.mds.0.err: 6: (Locker::C_Locker_ScatterWB::finish(int)+0x1d) [0x69a23d]
2011-09-20T11:47:50.067 INFO:teuthology.task.ceph.mds.0.err: 7: (Context::complete(int)+0x12) [0x49b862]
2011-09-20T11:47:50.067 INFO:teuthology.task.ceph.mds.0.err: 8: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x14e) [0x7eadbe]
2011-09-20T11:47:50.067 INFO:teuthology.task.ceph.mds.0.err: 9: (Journaler::_finish_flush(int, unsigned long, utime_t)+0x206) [0x7e2a16]
2011-09-20T11:47:50.068 INFO:teuthology.task.ceph.mds.0.err: 10: (Journaler::C_Flush::finish(int)+0x1d) [0x7eafcd]
2011-09-20T11:47:50.068 INFO:teuthology.task.ceph.mds.0.err: 11: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xd8a) [0x7b160a]
2011-09-20T11:47:50.068 INFO:teuthology.task.ceph.mds.0.err: 12: (MDS::handle_core_message(Message*)+0xedf) [0x4c5fff]
2011-09-20T11:47:50.068 INFO:teuthology.task.ceph.mds.0.err: 13: (MDS::_dispatch(Message*)+0x3c) [0x4c615c]
2011-09-20T11:47:50.068 INFO:teuthology.task.ceph.mds.0.err: 14: (MDS::ms_dispatch(Message*)+0x97) [0x4c8697]
2011-09-20T11:47:50.069 INFO:teuthology.task.ceph.mds.0.err: 15: (SimpleMessenger::dispatch_entry()+0x9d2) [0x822012]
2011-09-20T11:47:50.069 INFO:teuthology.task.ceph.mds.0.err: 16: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x492b2c]
2011-09-20T11:47:50.069 INFO:teuthology.task.ceph.mds.0.err: 17: (Thread::_entry_func(void*)+0x12) [0x816052]
2011-09-20T11:47:50.069 INFO:teuthology.task.ceph.mds.0.err: 18: (()+0x7971) [0x7f08a0cc9971]
2011-09-20T11:47:50.069 INFO:teuthology.task.ceph.mds.0.err: 19: (clone()+0x6d) [0x7f089f75d92d]
History
#1 Updated by Sage Weil about 12 years ago
Grr, I ran a loop on #1464 for days and wasn't able to hit this. I want to see the mds log to see how we got into this corner.
#2 Updated by Sage Weil about 12 years ago
{CDentry,CInode,CDir}::auth_pin() pins the object too, so I'm not sure how we can have a use-after-free in the code that is dropping auth pins.
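For reference, the invariant being described looks roughly like the sketch below. This is simplified, illustrative C++ only (CacheObject and its members are hypothetical stand-ins, not the real CDentry/CInode/CDir code); the key assumption is that taking an auth pin also takes a reference on the object.

```cpp
#include <cassert>

// Illustrative only -- not the real Ceph cache classes.
struct CacheObject {
    int ref = 0;        // plain refcount; object is freed when it hits 0
    int auth_pins = 0;  // auth pins keep authority from migrating away

    void get() { ++ref; }
    void put() { assert(ref > 0); --ref; }

    void auth_pin() {
        ++auth_pins;
        get();          // the pin itself also holds a reference...
    }
    void auth_unpin() {
        assert(auth_pins > 0);
        --auth_pins;
        put();          // ...released only when the pin is dropped
    }
};

int main() {
    CacheObject o;
    o.auth_pin();
    // drop_local_auth_pins() would walk a Mutation's pinned objects and call:
    o.auth_unpin();
    return o.ref;       // 0: every pin released exactly one reference
}
```

Under that scheme, every object still holding an auth pin should also still hold a reference, so drop_local_auth_pins() should never touch freed memory.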
#3 Updated by Sage Weil about 12 years ago
- Target version changed from v0.37 to v0.38
#4 Updated by Josh Durgin about 12 years ago
This happened again today after fsstress. From teuthology:~teuthworker/archive/nightly_coverage_2011-10-27/1083/teuthology.log:
2011-10-27T01:36:40.407 INFO:teuthology.task.ceph:Shutting down mds daemons...
2011-10-27T01:36:40.412 INFO:teuthology.task.ceph.mds.0.err:*** Caught signal (Segmentation fault) **
2011-10-27T01:36:40.412 INFO:teuthology.task.ceph.mds.0.err: in thread 7fe7a7ffc700
2011-10-27T01:36:40.414 INFO:teuthology.task.ceph.mds.0.err: ceph version 0.37-190-g11691a7 (commit:11691a7111d7329a6d11e25ad19005e3824e9dbb)
2011-10-27T01:36:40.414 INFO:teuthology.task.ceph.mds.0.err: 1: /tmp/cephtest/binary/usr/local/bin/ceph-mds() [0x90d164]
2011-10-27T01:36:40.414 INFO:teuthology.task.ceph.mds.0.err: 2: (()+0xfb40) [0x7fe7ab874b40]
2011-10-27T01:36:40.414 INFO:teuthology.task.ceph.mds.0.err: 3: (Mutation::drop_local_auth_pins()+0x39) [0x5952c9]
2011-10-27T01:36:40.415 INFO:teuthology.task.ceph.mds.0.err: 4: (Mutation::cleanup()+0x11) [0x5961c1]
2011-10-27T01:36:40.415 INFO:teuthology.task.ceph.mds.0.err: 5: (Locker::scatter_writebehind_finish(ScatterLock*, Mutation*)+0x1f5) [0x690af5]
2011-10-27T01:36:40.415 INFO:teuthology.task.ceph.mds.0.err: 6: (Locker::C_Locker_ScatterWB::finish(int)+0x1d) [0x69d50d]
2011-10-27T01:36:40.415 INFO:teuthology.task.ceph.mds.0.err: 7: (Context::complete(int)+0x12) [0x49e402]
2011-10-27T01:36:40.415 INFO:teuthology.task.ceph.mds.0.err: 8: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x14e) [0x7e2a6e]
2011-10-27T01:36:40.415 INFO:teuthology.task.ceph.mds.0.err: 9: (Journaler::_finish_flush(int, unsigned long, utime_t)+0x206) [0x7d6ac6]
2011-10-27T01:36:40.415 INFO:teuthology.task.ceph.mds.0.err: 10: (Journaler::C_Flush::finish(int)+0x1d) [0x7e2c9d]
2011-10-27T01:36:40.415 INFO:teuthology.task.ceph.mds.0.err: 11: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xf39) [0x7b2549]
2011-10-27T01:36:40.416 INFO:teuthology.task.ceph.mds.0.err: 12: (MDS::handle_core_message(Message*)+0xecf) [0x4ca93f]
2011-10-27T01:36:40.416 INFO:teuthology.task.ceph.mds.0.err: 13: (MDS::_dispatch(Message*)+0x3c) [0x4caa9c]
2011-10-27T01:36:40.416 INFO:teuthology.task.ceph.mds.0.err: 14: (MDS::ms_dispatch(Message*)+0xa9) [0x4cd089]
2011-10-27T01:36:40.416 INFO:teuthology.task.ceph.mds.0.err: 15: (SimpleMessenger::dispatch_entry()+0x9c2) [0x817ad2]
2011-10-27T01:36:40.416 INFO:teuthology.task.ceph.mds.0.err: 16: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x49551c]
2011-10-27T01:36:40.416 INFO:teuthology.task.ceph.mds.0.err: 17: (Thread::_entry_func(void*)+0x12) [0x8126a2]
2011-10-27T01:36:40.416 INFO:teuthology.task.ceph.mds.0.err: 18: (()+0x7971) [0x7fe7ab86c971]
2011-10-27T01:36:40.417 INFO:teuthology.task.ceph.mds.0.err: 19: (clone()+0x6d) [0x7fe7aa30092d]
#5 Updated by Sage Weil about 12 years ago
- Status changed from New to Need More Info
- Assignee set to Sage Weil
Bleh. Need logs... I'll start this up in a loop again.
#6 Updated by Sage Weil about 12 years ago
- Target version changed from v0.38 to v0.39
#7 Updated by Josh Durgin about 12 years ago
This happened after the misc workunit today.
#8 Updated by Sage Weil about 12 years ago
- Assignee deleted (Sage Weil)
Someone needs to try to reproduce this with logs. FWIW, metropolis:~sage/src/teuthology/hammer.sh is what I've been using.
#9 Updated by Anonymous about 12 years ago
This happened again on 11/16, in run 2056 (kclient_workunit_kernel_untar_build):
2011-11-16T00:36:30.996 INFO:teuthology.task.ceph.mds.0.err:*** Caught signal (Segmentation fault) **
2011-11-16T00:36:30.997 INFO:teuthology.task.ceph.mds.0.err: in thread 7fbe995ef700
2011-11-16T00:36:30.998 INFO:teuthology.task.ceph.mds.0.err: ceph version 0.38-181-g2e19550 (commit:2e195500b5d3a8ab8512bcf2a219a6b7ff922c97)
2011-11-16T00:36:30.999 INFO:teuthology.task.ceph.mds.0.err: 1: /tmp/cephtest/binary/usr/local/bin/ceph-mds() [0x913774]
2011-11-16T00:36:30.999 INFO:teuthology.task.ceph.mds.0.err: 2: (()+0xfb40) [0x7fbe9d06cb40]
2011-11-16T00:36:30.999 INFO:teuthology.task.ceph.mds.0.err: 3: (Mutation::drop_local_auth_pins()+0x39) [0x5962a9]
2011-11-16T00:36:30.999 INFO:teuthology.task.ceph.mds.0.err: 4: (Mutation::cleanup()+0x11) [0x5971a1]
2011-11-16T00:36:30.999 INFO:teuthology.task.ceph.mds.0.err: 5: (Locker::scatter_writebehind_finish(ScatterLock*, Mutation*)+0x1f5) [0x691aa5]
2011-11-16T00:36:31.000 INFO:teuthology.task.ceph.mds.0.err: 6: (Locker::C_Locker_ScatterWB::finish(int)+0x1d) [0x69e4cd]
2011-11-16T00:36:31.000 INFO:teuthology.task.ceph.mds.0.err: 7: (Context::complete(int)+0x12) [0x49f1f2]
2011-11-16T00:36:31.000 INFO:teuthology.task.ceph.mds.0.err: 8: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x14e) [0x7e23ae]
2011-11-16T00:36:31.000 INFO:teuthology.task.ceph.mds.0.err: 9: (Journaler::_finish_flush(int, unsigned long, utime_t)+0x206) [0x7d6406]
2011-11-16T00:36:31.001 INFO:teuthology.task.ceph.mds.0.err: 10: (Journaler::C_Flush::finish(int)+0x1d) [0x7e25dd]
2011-11-16T00:36:31.001 INFO:teuthology.task.ceph.mds.0.err: 11: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x100c) [0x7b43ac]
2011-11-16T00:36:31.001 INFO:teuthology.task.ceph.mds.0.err: 12: (MDS::handle_core_message(Message*)+0xebf) [0x4cb72f]
2011-11-16T00:36:31.001 INFO:teuthology.task.ceph.mds.0.err: 13: (MDS::_dispatch(Message*)+0x3c) [0x4cb88c]
2011-11-16T00:36:31.001 INFO:teuthology.task.ceph.mds.0.err: 14: (MDS::ms_dispatch(Message*)+0xa5) [0x4cde85]
2011-11-16T00:36:31.002 INFO:teuthology.task.ceph.mds.0.err: 15: (SimpleMessenger::dispatch_entry()+0x99a) [0x8178fa]
2011-11-16T00:36:31.002 INFO:teuthology.task.ceph.mds.0.err: 16: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x49630c]
2011-11-16T00:36:31.002 INFO:teuthology.task.ceph.mds.0.err: 17: (Thread::_entry_func(void*)+0x12) [0x812442]
2011-11-16T00:36:31.002 INFO:teuthology.task.ceph.mds.0.err: 18: (()+0x7971) [0x7fbe9d064971]
2011-11-16T00:36:31.002 INFO:teuthology.task.ceph.mds.0.err: 19: (clone()+0x6d) [0x7fbe9b8f392d]
2011-11-16T00:36:31.366 INFO:teuthology.task.ceph.mds.0.err:daemon-helper: command crashed with signal 11
#10 Updated by Sage Weil about 12 years ago
- Status changed from Need More Info to In Progress
- Assignee set to Sage Weil
- Priority changed from Normal to High
#11 Updated by Sage Weil about 12 years ago
- Position set to 6
#12 Updated by Sage Weil about 12 years ago
- Target version changed from v0.39 to v0.40
#13 Updated by Sage Weil about 12 years ago
Happened twice today:
#0 0x00007f20be7fba0b in raise () from /lib/libpthread.so.0
#1 0x0000000000916a4b in reraise_fatal (signum=3392) at global/signal_handler.cc:59
#2 0x000000000091722c in handle_fatal_signal (signum=<value optimized out>) at global/signal_handler.cc:106
#3 <signal handler called>
#4 0x0000000000596929 in Mutation::drop_local_auth_pins (this=0x2df12a00) at mds/Mutation.cc:91
#5 0x0000000000597821 in Mutation::cleanup (this=0x225a000) at mds/Mutation.cc:163
#6 0x0000000000682765 in Locker::scatter_writebehind_finish (this=0x2214a00, lock=0x22497d0, mut=0x2df12a00) at mds/Locker.cc:3625
#7 0x000000000069e64d in Locker::C_Locker_ScatterWB::finish(int) ()
#8 0x000000000049f7d2 in Context::complete (this=0x225a000, r=770779648) at ./include/Context.h:41
#9 0x00000000007e28ae in finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int) ()
#10 0x00000000007d6906 in Journaler::_finish_flush (this=0x2243000, r=<value optimized out>, start=788596852, stamp=<value optimized out>) at osdc/Journaler.cc:419
#11 0x00000000007e2add in Journaler::C_Flush::finish(int) ()
#12 0x00000000007b780e in Objecter::handle_osd_op_reply (this=0x2232000, m=0x538a1c0) at osdc/Objecter.cc:1205
#13 0x00000000004cbd0f in MDS::handle_core_message (this=0x2225a00, m=0x538a1c0) at mds/MDS.cc:1695
#14 0x00000000004cbe6c in MDS::_dispatch (this=0x2225a00, m=0x538a1c0) at mds/MDS.cc:1818
#15 0x00000000004ce465 in MDS::ms_dispatch (this=0x2225a00, m=0x538a1c0) at mds/MDS.cc:1631
#16 0x000000000081ab5a in ms_deliver_dispatch (this=0x2225000) at msg/Messenger.h:102
#17 SimpleMessenger::dispatch_entry (this=0x2225000) at msg/SimpleMessenger.cc:358
#18 0x000000000049684c in SimpleMessenger::DispatchThread::entry (this=0x2225488) at ./msg/SimpleMessenger.h:549
#19 0x00000000008156a2 in Thread::_entry_func (arg=0x225a000) at common/Thread.cc:41
#20 0x00007f20be7f3971 in start_thread () from /lib/libpthread.so.0
#21 0x00007f20bd08292d in clone () from /lib/libc.so.6
#22 0x0000000000000000 in ?? ()
or
#0 0x00007f539323ea0b in raise () from /lib/libpthread.so.0
#1 0x0000000000916a4b in reraise_fatal (signum=11357) at global/signal_handler.cc:59
#2 0x000000000091722c in handle_fatal_signal (signum=<value optimized out>) at global/signal_handler.cc:106
#3 <signal handler called>
#4 0x000000000074046f in CInode::finish_scatter_gather_update (this=0x1082000, type=1024) at mds/CInode.cc:1724
#5 0x0000000000670ebe in Locker::scatter_writebehind (this=0x104ea00, lock=0x10827d0) at mds/Locker.cc:3588
#6 0x0000000000671952 in Locker::simple_lock (this=0x104ea00, lock=0x10827d0, need_issue=0x7f538f7bf9ff) at mds/Locker.cc:3471
#7 0x0000000000676d3f in Locker::scatter_eval (this=0x104ea00, lock=0x10827d0, need_issue=0x7f538f7bf9ff) at mds/Locker.cc:3665
#8 0x000000000067782d in Locker::eval (this=0x1093000, lock=0x400, need_issue=0x0) at mds/Locker.cc:971
#9 0x0000000000678070 in Locker::try_eval (this=0x104ea00, lock=0x10827d0, pneed_issue=0x7f538f7bf9ff) at mds/Locker.cc:915
#10 0x000000000067c431 in Locker::eval_gather (this=0x104ea00, lock=0x10827d0, first=<value optimized out>, pneed_issue=<value optimized out>, pfinishers=<value optimized out>) at mds/Locker.cc:751
#11 0x000000000068140d in Locker::wrlock_finish (this=0x104ea00, lock=0x10827d0, mut=0x19e7a00, pneed_issue=<value optimized out>) at mds/Locker.cc:1259
#12 0x00000000006819bd in Locker::_drop_non_rdlocks (this=0x104ea00, mut=0x19e7a00, pneed_issue=0x7f538f7bfb20) at mds/Locker.cc:491
#13 0x00000000006824a4 in Locker::drop_locks (this=0x104ea00, mut=0x19e7a00, pneed_issue=0x7f538f7bfb20) at mds/Locker.cc:524
#14 0x0000000000682755 in Locker::scatter_writebehind_finish (this=0x104ea00, lock=0x10827d0, mut=0x19e7a00) at mds/Locker.cc:3624
#15 0x000000000069e64d in Locker::C_Locker_ScatterWB::finish(int) ()
#16 0x000000000049f7d2 in Context::complete (this=0x1093000, r=1024) at ./include/Context.h:41
#17 0x00000000007e28ae in finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int) ()
#18 0x00000000007d6906 in Journaler::_finish_flush (this=0x107b000, r=<value optimized out>, start=12757588, stamp=<value optimized out>) at osdc/Journaler.cc:419
#19 0x00000000007e2add in Journaler::C_Flush::finish(int) ()
#20 0x00000000007b780e in Objecter::handle_osd_op_reply (this=0x106c000, m=0x1075a80) at osdc/Objecter.cc:1205
#21 0x00000000004cbd0f in MDS::handle_core_message (this=0x105fa00, m=0x1075a80) at mds/MDS.cc:1695
#22 0x00000000004cbe6c in MDS::_dispatch (this=0x105fa00, m=0x1075a80) at mds/MDS.cc:1818
#23 0x00000000004ce465 in MDS::ms_dispatch (this=0x105fa00, m=0x1075a80) at mds/MDS.cc:1631
#24 0x000000000081ab5a in ms_deliver_dispatch (this=0x105f000) at msg/Messenger.h:102
#25 SimpleMessenger::dispatch_entry (this=0x105f000) at msg/SimpleMessenger.cc:358
#26 0x000000000049684c in SimpleMessenger::DispatchThread::entry (this=0x105f488) at ./msg/SimpleMessenger.h:549
#27 0x00000000008156a2 in Thread::_entry_func (arg=0x1093000) at common/Thread.cc:41
#28 0x00007f5393236971 in start_thread () from /lib/libpthread.so.0
#29 0x00007f5391ac592d in clone () from /lib/libc.so.6
In both cases we crash calling a method on a CDir* whose vtable has been zeroed, and in both cases dir->inode->inode.ino == 1 (the root inode).
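To make that failure mode concrete: a virtual call loads the vtable pointer from the object and then jumps through one of its slots, so an object whose memory has been freed and scrubbed to zero faults in exactly this way. The snippet below is a deliberate, minimal illustration of the mechanism, not Ceph code; it intentionally invokes undefined behavior to reproduce the crash shape.

```cpp
#include <cstring>
#include <new>

struct Dir {
    virtual ~Dir() {}
    virtual bool is_auth() const { return true; }
};

int main() {
    alignas(Dir) unsigned char buf[sizeof(Dir)];
    Dir* d = new (buf) Dir();          // object with a valid vtable pointer
    std::memset(buf, 0, sizeof(buf));  // simulate freed-and-zeroed memory
    return d->is_auth();               // dispatch through a zeroed vtable: SIGSEGV
}
```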
#14 Updated by Sage Weil about 12 years ago
- Subject changed from mds segfault after trivial sync workunit on cfuse to mds: zeroed root CDir* vtable in scatter_writebehind_finish
#15 Updated by Sage Weil about 12 years ago
The tasks were in nightly_coverage_2011-11-30-a:
3433: collection:basic clusters:fixed-3.yaml tasks:kclient_workunit_kernel_untar_build.yaml
3435: collection:basic clusters:fixed-3.yaml tasks:kclient_workunit_suites_ffsb.yaml
#16 Updated by Sage Weil almost 12 years ago
- Position deleted (28)
- Position set to 1039
#17 Updated by Sage Weil almost 12 years ago
- Assignee deleted (Sage Weil)
I think the next step here is to run the mds under valgrind.
#18 Updated by Josh Durgin almost 12 years ago
Happened again in teuthology:~teuthworker/archive/nightly_coverage_2011-12-13-a/4183/remote/ubuntu@sepia74.ceph.dreamhost.com/log/mds.0.log.gz
#19 Updated by Sage Weil almost 12 years ago
- Status changed from In Progress to Need More Info
#20 Updated by Josh Durgin almost 12 years ago
Happened again in teuthology:~teuthworker/archive/nightly_coverage_2011-12-29-a/5318/remote/ubuntu@sepia60.ceph.dreamhost.com/log/mds.0.log.gz
#21 Updated by Sage Weil almost 12 years ago
Hit this again in nightly_coverage_2011-12-29-b/5388:
- kclient: null
- locktest:
  - client.0
  - client.1
#22 Updated by Sage Weil almost 12 years ago
- Priority changed from High to Normal
#23 Updated by Sage Weil almost 12 years ago
- Target version deleted (v0.40)
- Position deleted (1087)
- Position set to 101
#24 Updated by Josh Durgin almost 12 years ago
Happened again in teuthology:~teuthworker/archive/nightly_coverage_2012-01-15-b/7721/remote/ubuntu@sepia6.ceph.dreamhost.com/log/mds.a.log.gz
#25 Updated by Sage Weil almost 12 years ago
Again:
2012-01-27T15:46:22.731 INFO:teuthology.task.ceph:Shutting down mds daemons...
2012-01-27T15:46:22.733 INFO:teuthology.task.ceph.mds.a.err:*** Caught signal (Terminated) **
2012-01-27T15:46:22.733 INFO:teuthology.task.ceph.mds.a.err: in thread 7fe843b50780. Shutting down.
2012-01-27T15:46:22.749 INFO:teuthology.task.ceph.mds.a.err:*** Caught signal (Segmentation fault) **
2012-01-27T15:46:22.749 INFO:teuthology.task.ceph.mds.a.err: in thread 7fe83fcb2700
2012-01-27T15:46:22.755 INFO:teuthology.task.ceph.mds.a.err: ceph version 0.40-242-g374fec4 (commit:374fec47253bad511eee52d372f182402fb17b1a)
2012-01-27T15:46:22.755 INFO:teuthology.task.ceph.mds.a.err: 1: /tmp/cephtest/binary/usr/local/bin/ceph-mds() [0x9219c4]
2012-01-27T15:46:22.756 INFO:teuthology.task.ceph.mds.a.err: 2: (()+0xfb40) [0x7fe84372fb40]
2012-01-27T15:46:22.756 INFO:teuthology.task.ceph.mds.a.err: 3: (CInode::finish_scatter_gather_update(int)+0x128f) [0x7433ff]
2012-01-27T15:46:22.756 INFO:teuthology.task.ceph.mds.a.err: 4: (Locker::scatter_writebehind(ScatterLock*)+0x5ce) [0x67389e]
2012-01-27T15:46:22.756 INFO:teuthology.task.ceph.mds.a.err: 5: (Locker::simple_lock(SimpleLock*, bool*)+0x5e2) [0x674332]
2012-01-27T15:46:22.757 INFO:teuthology.task.ceph.mds.a.err: 6: (Locker::scatter_eval(ScatterLock*, bool*)+0x58f) [0x67971f]
2012-01-27T15:46:22.757 INFO:teuthology.task.ceph.mds.a.err: 7: (Locker::eval(SimpleLock*, bool*)+0x6d) [0x67a20d]
2012-01-27T15:46:22.757 INFO:teuthology.task.ceph.mds.a.err: 8: (Locker::try_eval(SimpleLock*, bool*)+0x830) [0x67aa50]
2012-01-27T15:46:22.757 INFO:teuthology.task.ceph.mds.a.err: 9: (Locker::eval_gather(SimpleLock*, bool, bool*, std::list<Context*, std::allocator<Context*> >*)+0x1c31) [0x67ee11]
2012-01-27T15:46:22.757 INFO:teuthology.task.ceph.mds.a.err: 10: (Locker::wrlock_finish(SimpleLock*, Mutation*, bool*)+0x45d) [0x683ded]
2012-01-27T15:46:22.758 INFO:teuthology.task.ceph.mds.a.err: 11: (Locker::_drop_non_rdlocks(Mutation*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x19d) [0x68439d]
2012-01-27T15:46:22.758 INFO:teuthology.task.ceph.mds.a.err: 12: (Locker::drop_locks(Mutation*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x94) [0x684e84]
2012-01-27T15:46:22.758 INFO:teuthology.task.ceph.mds.a.err: 13: (Locker::scatter_writebehind_finish(ScatterLock*, Mutation*)+0x1e5) [0x685135]
2012-01-27T15:46:22.758 INFO:teuthology.task.ceph.mds.a.err: 14: (Locker::C_Locker_ScatterWB::finish(int)+0x1d) [0x6a102d]
2012-01-27T15:46:22.758 INFO:teuthology.task.ceph.mds.a.err: 15: (Context::complete(int)+0x12) [0x4a0cc2]
2012-01-27T15:46:22.759 INFO:teuthology.task.ceph.mds.a.err: 16: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x14e) [0x7e765e]
2012-01-27T15:46:22.759 INFO:teuthology.task.ceph.mds.a.err: 17: (Journaler::_finish_flush(int, unsigned long, utime_t)+0x1fd) [0x7d9a4d]
2012-01-27T15:46:22.759 INFO:teuthology.task.ceph.mds.a.err: 18: (Journaler::C_Flush::finish(int)+0x1d) [0x7e788d]
2012-01-27T15:46:22.759 INFO:teuthology.task.ceph.mds.a.err: 19: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x173f) [0x7bb7bf]
2012-01-27T15:46:22.759 INFO:teuthology.task.ceph.mds.a.err: 20: (MDS::handle_core_message(Message*)+0xecf) [0x4c8adf]
2012-01-27T15:46:22.759 INFO:teuthology.task.ceph.mds.a.err: 21: (MDS::_dispatch(Message*)+0x3c) [0x4c8c3c]
2012-01-27T15:46:22.760 INFO:teuthology.task.ceph.mds.a.err: 22: (MDS::ms_dispatch(Message*)+0xa9) [0x4cb229]
2012-01-27T15:46:22.760 INFO:teuthology.task.ceph.mds.a.err: 23: (SimpleMessenger::dispatch_entry()+0xa1a) [0x8206ea]
2012-01-27T15:46:22.760 INFO:teuthology.task.ceph.mds.a.err: 24: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x497b9c]
2012-01-27T15:46:22.760 INFO:teuthology.task.ceph.mds.a.err: 25: (Thread::_entry_func(void*)+0x12) [0x81ac22]
2012-01-27T15:46:22.762 INFO:teuthology.task.ceph.mds.a.err: 26: (()+0x7971) [0x7fe843727971]
2012-01-27T15:46:22.763 INFO:teuthology.task.ceph.mds.a.err: 27: (clone()+0x6d) [0x7fe841fb692d]
2012-01-27T15:46:22.958 INFO:teuthology.task.ceph.mds.a.err:daemon-helper: command crashed with signal 11
I wonder if this is just an issue with the signal handler racing with the other threads? I think most (all?) of these crashes happen when daemon-helper sends a signal to the process...
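If that theory is right, the shape of the race would be roughly the hypothetical sketch below (simplified stand-in names, not the actual teuthology or ceph-mds code): teardown triggered by the signal frees cache objects while the dispatch thread is still running flush callbacks against them.

```cpp
#include <chrono>
#include <csignal>
#include <thread>

struct Cache { bool alive = true; };
Cache* g_cache = new Cache;

// UNSAFE pattern under suspicion: real teardown done in response to the
// signal while another thread still holds pointers into the cache.
// (delete is also not async-signal-safe, which compounds the problem.)
void on_term(int) {
    delete g_cache;     // frees memory the dispatch thread may still be using
    g_cache = nullptr;
}

void dispatch_thread() {
    // stands in for handle_osd_op_reply -> Journaler::_finish_flush ->
    // scatter_writebehind_finish, which dereferences cache objects
    for (int i = 0; i < 1000000; ++i) {
        Cache* c = g_cache;
        if (!c) break;
        volatile bool b = c->alive;  // may read freed memory once on_term runs
        (void)b;
    }
}

int main() {
    std::signal(SIGTERM, on_term);
    std::thread t(dispatch_thread);
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
    std::raise(SIGTERM);  // stands in for daemon-helper's kill
    t.join();
}
```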
#26 Updated by Sage Weil almost 12 years ago
- Status changed from Need More Info to Resolved
Using clean shutdown now, yay.
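For the record, the safe pattern a clean shutdown gives you looks roughly like this (again a hypothetical sketch, not the actual ceph-mds shutdown path): the signal handler only records the request, the dispatch loop drains, and teardown runs single-threaded afterwards.

```cpp
#include <atomic>
#include <csignal>
#include <thread>

std::atomic<bool> g_stop{false};

// Async-signal-safe: the handler only sets a lock-free flag.
void on_term(int) { g_stop.store(true); }

void dispatch_thread() {
    while (!g_stop.load()) {
        // process one message; any in-flight flush callbacks finish
        // before the loop exits
    }
}

int main() {
    std::signal(SIGTERM, on_term);
    std::thread t(dispatch_thread);
    std::raise(SIGTERM);  // stands in for daemon-helper's kill
    t.join();             // dispatch drains completely...
    // ...and cache teardown can now run with no other threads alive
    return 0;
}
```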
#27 Updated by John Spray about 7 years ago
- Project changed from Ceph to CephFS
- Category deleted (1)
Bulk updating project=ceph category=mds bugs so that I can remove the MDS category from the Ceph project to avoid confusion.