Bug #1682 (Closed)
mds: segfault in CInode::authority

Description
From teuthology:~teuthworker/archive/nightly_coverage_2011-11-04/1469/teuthology.log:
2011-11-04T00:44:12.235 INFO:teuthology.task.ceph.mds.0.err:*** Caught signal (Segmentation fault) **
2011-11-04T00:44:12.236 INFO:teuthology.task.ceph.mds.0.err: in thread 7fc289bc7700
2011-11-04T00:44:12.238 INFO:teuthology.task.ceph.mds.0.err: ceph version 0.37-299-g256ac72 (commit:256ac72abc54504d613f2513fd8ac0a6a1e722fa)
2011-11-04T00:44:12.238 INFO:teuthology.task.ceph.mds.0.err: 1: /tmp/cephtest/binary/usr/local/bin/ceph-mds() [0x9102a4]
2011-11-04T00:44:12.238 INFO:teuthology.task.ceph.mds.0.err: 2: (()+0xfb40) [0x7fc28d43fb40]
2011-11-04T00:44:12.238 INFO:teuthology.task.ceph.mds.0.err: 3: (CInode::authority()+0x46) [0x71c2e6]
2011-11-04T00:44:12.239 INFO:teuthology.task.ceph.mds.0.err: 4: (CDir::authority()+0x56) [0x6ef6d6]
2011-11-04T00:44:12.239 INFO:teuthology.task.ceph.mds.0.err: 5: (CInode::authority()+0x49) [0x71c2e9]
2011-11-04T00:44:12.239 INFO:teuthology.task.ceph.mds.0.err: 6: (CDir::authority()+0x56) [0x6ef6d6]
2011-11-04T00:44:12.239 INFO:teuthology.task.ceph.mds.0.err: 7: (CInode::authority()+0x49) [0x71c2e9]
2011-11-04T00:44:12.239 INFO:teuthology.task.ceph.mds.0.err: 8: (CDir::authority()+0x56) [0x6ef6d6]
2011-11-04T00:44:12.240 INFO:teuthology.task.ceph.mds.0.err: 9: (CInode::authority()+0x49) [0x71c2e9]
2011-11-04T00:44:12.240 INFO:teuthology.task.ceph.mds.0.err: 10: (CDir::authority()+0x56) [0x6ef6d6]
2011-11-04T00:44:12.240 INFO:teuthology.task.ceph.mds.0.err: 11: (CInode::authority()+0x49) [0x71c2e9]
2011-11-04T00:44:12.240 INFO:teuthology.task.ceph.mds.0.err: 12: (CDir::authority()+0x56) [0x6ef6d6]
2011-11-04T00:44:12.241 INFO:teuthology.task.ceph.mds.0.err: 13: (CInode::authority()+0x49) [0x71c2e9]
2011-11-04T00:44:12.241 INFO:teuthology.task.ceph.mds.0.err: 14: (MDCache::predirty_journal_parents(Mutation*, EMetaBlob*, CInode*, CDir*, int, int, snapid_t)+0xdb7) [0x5fa2c7]
2011-11-04T00:44:12.241 INFO:teuthology.task.ceph.mds.0.err: 15: (Locker::_do_cap_update(CInode*, Capability*, int, snapid_t, MClientCaps*, MClientCaps*)+0xca3) [0x686213]
2011-11-04T00:44:12.241 INFO:teuthology.task.ceph.mds.0.err: 16: (Locker::handle_client_caps(MClientCaps*)+0x1ebd) [0x68c37d]
2011-11-04T00:44:12.241 INFO:teuthology.task.ceph.mds.0.err: 17: (Locker::dispatch(Message*)+0xb5) [0x690045]
2011-11-04T00:44:12.242 INFO:teuthology.task.ceph.mds.0.err: 18: (MDS::handle_deferrable_message(Message*)+0x13df) [0x4ac80f]
2011-11-04T00:44:12.242 INFO:teuthology.task.ceph.mds.0.err: 19: (MDS::_dispatch(Message*)+0xe9a) [0x4cba9a]
2011-11-04T00:44:12.242 INFO:teuthology.task.ceph.mds.0.err: 20: (MDS::ms_dispatch(Message*)+0xa9) [0x4cd229]
2011-11-04T00:44:12.242 INFO:teuthology.task.ceph.mds.0.err: 21: (SimpleMessenger::dispatch_entry()+0x99a) [0x81854a]
2011-11-04T00:44:12.242 INFO:teuthology.task.ceph.mds.0.err: 22: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x4956cc]
2011-11-04T00:44:12.243 INFO:teuthology.task.ceph.mds.0.err: 23: (Thread::_entry_func(void*)+0x12) [0x813092]
2011-11-04T00:44:12.243 INFO:teuthology.task.ceph.mds.0.err: 24: (()+0x7971) [0x7fc28d437971]
2011-11-04T00:44:12.243 INFO:teuthology.task.ceph.mds.0.err: 25: (clone()+0x6d) [0x7fc28becb92d]
2011-11-04T00:44:26.932 INFO:teuthology.task.ceph.mds.0.err:daemon-helper: command crashed with signal 11
Files
Updated by Josh Durgin over 12 years ago
Another crash in CInode::authority happened today, albeit with a different backtrace.
From teuthology:~teuthworker/archive/nightly_coverage_2011-11-18-2/2660/remote/ubuntu@sepia68.ceph.dreamhost.com/log/mds.0.log.gz
2011-11-18 12:38:08.714034 7fe54d18d700 mds.0.1 beacon_kill last_acked_stamp 2011-11-18 12:37:36.812900, we are laggy!
*** Caught signal (Segmentation fault) **
 in thread 7fe54eb92700
 ceph version 0.38-199-gdedf2c4 (commit:dedf2c4a066876bdab9a0b0154196194cefc1340)
 1: /tmp/cephtest/binary/usr/local/bin/ceph-mds() [0x913614]
 2: (()+0xfb40) [0x7fe55260fb40]
 3: (CInode::authority()+0x46) [0x71cfa6]
 4: (CDir::authority()+0x56) [0x6f0396]
 5: (CInode::authority()+0x49) [0x71cfa9]
 6: (Locker::try_eval(SimpleLock*, bool*)+0x2a) [0x6771ea]
 7: (Locker::wrlock_finish(SimpleLock*, Mutation*, bool*)+0x384) [0x680cb4]
 8: (Locker::_drop_non_rdlocks(Mutation*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x19d) [0x68133d]
 9: (Locker::drop_locks(Mutation*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x94) [0x6917e4]
 10: (Locker::file_update_finish(CInode*, Mutation*, bool, client_t, Capability*, MClientCaps*)+0x1ce) [0x691e0e]
 11: (C_Locker_FileUpdate_finish::finish(int)+0x34) [0x69e514]
 12: (Context::complete(int)+0x12) [0x49f1f2]
 13: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x14e) [0x7e222e]
 14: (Journaler::_finish_flush(int, unsigned long, utime_t)+0x206) [0x7d6286]
 15: (Journaler::C_Flush::finish(int)+0x1d) [0x7e245d]
 16: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xe2e) [0x7b72de]
 17: (MDS::handle_core_message(Message*)+0xebf) [0x4cb72f]
 18: (MDS::_dispatch(Message*)+0x3c) [0x4cb88c]
 19: (MDS::ms_dispatch(Message*)+0xa5) [0x4cde85]
 20: (SimpleMessenger::dispatch_entry()+0x99a) [0x81777a]
 21: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x49630c]
 22: (Thread::_entry_func(void*)+0x12) [0x8122c2]
 23: (()+0x7971) [0x7fe552607971]
 24: (clone()+0x6d) [0x7fe550e9692d]
Updated by Sage Weil over 12 years ago
Updated by Sage Weil over 12 years ago
Hrm, this has me stumped.
The log leading up to the crash is:
2011-11-19 21:02:18.741494 7fa5a5d45700 mds.0.locker revoking pAsxLsXsxFsxcrwb on [inode 1000000c1a8 [2,head] /client.0/tmp/usr.2/include/c++/4.4/ext/vstring_fwd.h auth v382 s=3216 n(v0 b3216 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4103=0-4194304@1} caps={4103=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@2},l=4103 | caps 0xe66dd40]
2011-11-19 21:02:18.741515 7fa5a5d45700 mds.0.locker eval 2496 [inode 1000000c1a8 [2,head] /client.0/tmp/usr.2/include/c++/4.4/ext/vstring_fwd.h auth v382 needsrecover s=3216 n(v0 b3216 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4103=0-4194304@1} caps={4103=-/pAsxXsxFsxcrwb@3},l=4103 | caps 0xe66dd40]
2011-11-19 21:02:18.741524 7fa5a5d45700 mds.0.locker eval doesn't want loner
2011-11-19 21:02:18.741544 7fa5a5d45700 mds.0.locker file_eval wanted= loner_wanted= other_wanted= filelock=(ifile excl) on [inode 1000000c1a8 [2,head] /client.0/tmp/usr.2/include/c++/4.4/ext/vstring_fwd.h auth v382 needsrecover s=3216 n(v0 b3216 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4103=0-4194304@1} caps={4103=-/pAsxXsxFsxcrwb@3},l=4103(-1) | caps 0xe66dd40]
2011-11-19 21:02:18.741565 7fa5a5d45700 mds.0.locker simple_sync on (ifile excl) on [inode 1000000c1a8 [2,head] /client.0/tmp/usr.2/include/c++/4.4/ext/vstring_fwd.h auth v382 needsrecover s=3216 n(v0 b3216 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4103=0-4194304@1} caps={4103=-/pAsxXsxFsxcrwb@3},l=4103(-1) | caps 0xe66dd40]
2011-11-19 21:02:18.741586 7fa5a5d45700 mds.0.cache queue_file_recover [inode 1000000c1a8 [2,head] /client.0/tmp/usr.2/include/c++/4.4/ext/vstring_fwd.h auth v382 needsrecover s=3216 n(v0 b3216 1=1+0) (iauth excl) (ifile excl->sync) (ixattr excl) (iversion lock) cr={4103=0-4194304@1} caps={4103=-/pAsxXsxFsxcrwb@3},l=4103(-1) | caps 0xe66dd40]
2011-11-19 21:02:18.741597 7fa5a5d45700 mds.0.cache.snaprealm(1 seq 1 0x1ca3b40) get_snaps (seq 1 cached_seq 1)
2011-11-19 21:02:18.741605 7fa5a5d45700 mds.0.cache snaps in [2,head] are
2011-11-19 21:02:18.741625 7fa5a5d45700 mds.0.cache.ino(1000000c1a8) auth_pin by 0x1ca6200 on [inode 1000000c1a8 [2,head] /client.0/tmp/usr.2/include/c++/4.4/ext/vstring_fwd.h auth v382 ap=1+0 recovering s=3216 n(v0 b3216 1=1+0) (iauth excl) (ifile excl->sync) (ixattr excl) (iversion lock) cr={4103=0-4194304@1} caps={4103=-/pAsxXsxFsxcrwb@3},l=4103(-1) | caps authpin 0xe66dd40] now 1+0
2011-11-19 21:02:18.741636 7fa5a5d45700 mds.0.cache do_file_recover 1186 queued, 5 recovering
2011-11-19 21:02:18.741656 7fa5a5d45700 mds.0.cache.ino(1000000c1a8) auth_pin by 0xe66e490 on [inode 1000000c1a8 [2,head] /client.0/tmp/usr.2/include/c++/4.4/ext/vstring_fwd.h auth v382 ap=2+0 recovering s=3216 n(v0 b3216 1=1+0) (iauth excl) (ifile excl->sync) (ixattr excl) (iversion lock) cr={4103=0-4194304@1} caps={4103=-/pAsxXsxFsxcrwb@3},l=4103(-1) | caps authpin 0xe66dd40] now 2+0
2011-11-19 21:02:18.741677 7fa5a5d45700 mds.0.locker simple_eval (iauth excl) on [inode 1000000c1a8 [2,head] /client.0/tmp/usr.2/include/c++/4.4/ext/vstring_fwd.h auth v382 ap=2+0 recovering s=3216 n(v0 b3216 1=1+0) (iauth excl) (ifile excl->sync) (ixattr excl) (iversion lock) cr={4103=0-4194304@1} caps={4103=-/pAsxXsxFsxcrwb@3},l=4103(-1) | caps authpin 0xe66dd40]
2011-11-19 21:02:18.741698 7fa5a5d45700 mds.0.locker simple_eval stable, syncing (iauth excl) on [inode 1000000c1a8 [2,head] /client.0/tmp/usr.2/include/c++/4.4/ext/vstring_fwd.h auth v382 ap=2+0 recovering s=3216 n(v0 b3216 1=1+0) (iauth excl) (ifile excl->sync) (ixattr excl) (iversion lock) cr={4103=0-4194304@1} caps={4103=-/pAsxXsxFsxcrwb@3},l=4103(-1) | caps authpin 0xe66dd40]
2011-11-19 21:02:18.741729 7fa5a5d45700 mds.0.locker simple_sync on (iauth excl) on [inode 1000000c1a8 [2,head] /client.0/tmp/usr.2/include/c++/4.4/ext/vstring_fwd.h auth v382 ap=2+0 recovering s=3216 n(v0 b3216 1=1+0) (iauth excl) (ifile excl->sync) (ixattr excl) (iversion lock) cr={4103=0-4194304@1} caps={4103=-/pAsxXsxFsxcrwb@3},l=4103(-1) | caps authpin 0xe66dd40]
*** Caught signal (Segmentation fault) **
 in thread 7fa5a5d45700
 ceph version 0.38-204-g9920a16 (commit:9920a168c59807083019c62fdf381434edea12e5)
 1: /tmp/cephtest/binary/usr/local/bin/ceph-mds() [0x913894]
 2: (()+0xfb40) [0x7fa5ab1c7b40]
 3: (CInode::make_path_string(std::string&, bool, CDentry*)+0x1d) [0x726f0d]
 4: (CDentry::make_path_string(std::string&)+0x30) [0x6e9130]
 5: (CInode::make_path_string(std::string&, bool, CDentry*)+0x44) [0x726f34]
 6: (CDentry::make_path_string(std::string&)+0x30) [0x6e9130]
 7: (CInode::make_path_string(std::string&, bool, CDentry*)+0x44) [0x726f34]
 8: (CDentry::make_path_string(std::string&)+0x30) [0x6e9130]
 9: (CInode::make_path_string(std::string&, bool, CDentry*)+0x44) [0x726f34]
 10: (CDentry::make_path_string(std::string&)+0x30) [0x6e9130]
 11: (CInode::make_path_string(std::string&, bool, CDentry*)+0x44) [0x726f34]
 12: (CDentry::make_path_string(std::string&)+0x30) [0x6e9130]
 13: (CInode::make_path_string(std::string&, bool, CDentry*)+0x44) [0x726f34]
 14: (CDentry::make_path_string(std::string&)+0x30) [0x6e9130]
 15: (CInode::make_path_string(std::string&, bool, CDentry*)+0x44) [0x726f34]
 16: (CInode::make_path_string_projected(std::string&)+0x2c) [0x73854c]
 17: (operator<<(std::ostream&, CInode&)+0x32) [0x738822]
 18: (CInode::print(std::ostream&)+0x1a) [0x73fdea]
 19: (Locker::simple_eval(SimpleLock*, bool*)+0x10b) [0x6769db]
 20: (Locker::eval(SimpleLock*, bool*)+0x2a) [0x67716a]
 21: (Locker::eval(CInode*, int)+0x8ed) [0x682add]
 22: (Locker::try_eval(MDSCacheObject*, int)+0x50b) [0x684f9b]
 23: (Locker::revoke_stale_caps(Session*)+0x35d) [0x685c0d]
 24: (Server::find_idle_sessions()+0x891) [0x528c01]
 25: (MDS::tick()+0x470) [0x4abfc0]
 26: (MDS::C_MDS_Tick::finish(int)+0x24) [0x4df9f4]
 27: (SafeTimer::timer_thread()+0x4b0) [0x886510]
 28: (SafeTimerThread::entry()+0x15) [0x88a7b5]
 29: (Thread::_entry_func(void*)+0x12) [0x812542]
 30: (()+0x7971) [0x7fa5ab1bf971]
 31: (clone()+0x6d) [0x7fa5a9a4e92d]
We crash because a CDir is zeroed out in memory:
$38 = (CDentry * const) 0x11bc1720
(gdb) p this->dir
$39 = (CDir *) 0x1ce6f80
(gdb) p this->dir->inode
$40 = (CInode *) 0x0
(In fact, all of *this->dir is zeros.)
The dentry is #1/client.0/tmp/usr.2 and the dir is /client.0/tmp, which you'll notice was just successfully printed in the previous line of the log.
Looking through the prior simple_sync() call, and the call chain leading up to the crash (we just finished an eval on authlock and are doing linklock now), I don't see anything that could trigger a close_dirfrag.
Going to bump the logging up to 20 in the hope that it will have a bit more info, but I suspect something else ugly is going on. May need to run this workload through valgrind?
Updated by Sage Weil over 12 years ago
- Target version changed from v0.39 to v0.40
Updated by Sage Weil over 12 years ago
Updated by Josh Durgin over 12 years ago
Not sure if this is the same underlying problem, but here's another CInode::authority crash from teuthology:~teuthworker/archive/nightly_coverage_2011-12-29-b/5388/remote/ubuntu@sepia36.ceph.dreamhost.com/log/mds.0.log.gz during the locking test:
2011-12-29 13:21:51.535061
2011-12-29 13:21:51.689910 7fd8f6b61700 mds.0.1 ms_handle_reset on 10.3.14.170:0/923031963
2011-12-29 13:21:54.617497 7fd8f6b61700 mds.0.1 ms_handle_reset on 10.3.14.174:0/113717421
*** Caught signal (Segmentation fault) **
 in thread 7fd8f6b61700
 ceph version 0.39-171-gdcedda8 (commit:dcedda84d0e1f69af985c301276c67c1b11e7efc)
 1: /tmp/cephtest/binary/usr/local/bin/ceph-mds() [0x917f54]
 2: (()+0xfb40) [0x7fd8fa5deb40]
 3: (CInode::authority()+0x46) [0x71dac6]
 4: (Locker::try_eval(SimpleLock*, bool*)+0x2a) [0x677b9a]
 5: (Locker::wrlock_finish(SimpleLock*, Mutation*, bool*)+0x384) [0x681664]
 6: (Locker::_drop_non_rdlocks(Mutation*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x19d) [0x681ced]
 7: (Locker::drop_locks(Mutation*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x94) [0x6827d4]
 8: (Locker::file_update_finish(CInode*, Mutation*, bool, client_t, Capability*, MClientCaps*)+0x1ce) [0x688c9e]
 9: (C_Locker_FileUpdate_finish::finish(int)+0x34) [0x69e9c4]
 10: (Context::complete(int)+0x12) [0x49f962]
 11: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x14e) [0x7e34be]
 12: (Journaler::_finish_flush(int, unsigned long, utime_t)+0x206) [0x7d74e6]
 13: (Journaler::C_Flush::finish(int)+0x1d) [0x7e36ed]
 14: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xd6d) [0x7b35dd]
 15: (MDS::handle_core_message(Message*)+0xebf) [0x4cbeaf]
 16: (MDS::_dispatch(Message*)+0x3c) [0x4cc00c]
 17: (MDS::ms_dispatch(Message*)+0xa5) [0x4ce605]
 18: (SimpleMessenger::dispatch_entry()+0x99a) [0x81b77a]
 19: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x4969bc]
 20: (Thread::_entry_func(void*)+0x12) [0x8162c2]
 21: (()+0x7971) [0x7fd8fa5d6971]
 22: (clone()+0x6d) [0x7fd8f8e6592d]
Updated by Mark Nelson over 12 years ago
- File ceph_1682_debug.txt added
This probably isn't all that useful for anyone who knows the code well, but I threw together a quick rundown of the places where close_dirfrags gets called while browsing through the code. Like Sage said, it might be best to just try running the mds through valgrind and see if anything turns up.
Updated by Sage Weil over 12 years ago
Hit this again:
2012-01-06T20:22:15.808 DEBUG:teuthology.run_tasks:Unwinding manager <contextlib.GeneratorContextManager object at 0x26fc050>
2012-01-06T20:22:15.808 INFO:teuthology.task.ceph:Shutting down mds daemons...
2012-01-06T20:22:15.810 INFO:teuthology.task.ceph.mds.0.err:*** Caught signal (Terminated) **
2012-01-06T20:22:15.810 INFO:teuthology.task.ceph.mds.0.err: in thread 7f4e88892780. Shutting down.
2012-01-06T20:22:15.815 INFO:teuthology.task.ceph.mds.0.err:*** Caught signal (Segmentation fault) **
2012-01-06T20:22:15.815 INFO:teuthology.task.ceph.mds.0.err: in thread 7f4e849f5700
2012-01-06T20:22:15.817 INFO:teuthology.task.ceph.mds.0.err: ceph version 0.39-263-g3c60e80 (commit:3c60e8046d0e64c0df01a6fced0d65f9788da8d8)
2012-01-06T20:22:15.817 INFO:teuthology.task.ceph.mds.0.err: 1: /tmp/cephtest/binary/usr/local/bin/ceph-mds() [0x918ef4]
2012-01-06T20:22:15.817 INFO:teuthology.task.ceph.mds.0.err: 2: (()+0xfb40) [0x7f4e88472b40]
2012-01-06T20:22:15.818 INFO:teuthology.task.ceph.mds.0.err: 3: (CInode::authority()+0x46) [0x71dbc6]
2012-01-06T20:22:15.818 INFO:teuthology.task.ceph.mds.0.err: 4: (Locker::try_eval(SimpleLock*, bool*)+0x2a) [0x677c9a]
2012-01-06T20:22:15.818 INFO:teuthology.task.ceph.mds.0.err: 5: (Locker::wrlock_finish(SimpleLock*, Mutation*, bool*)+0x384) [0x681764]
2012-01-06T20:22:15.818 INFO:teuthology.task.ceph.mds.0.err: 6: (Locker::_drop_non_rdlocks(Mutation*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x19d) [0x681ded]
2012-01-06T20:22:15.819 INFO:teuthology.task.ceph.mds.0.err: 7: (Locker::drop_locks(Mutation*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x94) [0x6828d4]
2012-01-06T20:22:15.819 INFO:teuthology.task.ceph.mds.0.err: 8: (Locker::file_update_finish(CInode*, Mutation*, bool, client_t, Capability*, MClientCaps*)+0x1ce) [0x688d9e]
2012-01-06T20:22:15.819 INFO:teuthology.task.ceph.mds.0.err: 9: (C_Locker_FileUpdate_finish::finish(int)+0x34) [0x69eac4]
2012-01-06T20:22:15.819 INFO:teuthology.task.ceph.mds.0.err: 10: (Context::complete(int)+0x12) [0x49fa72]
2012-01-06T20:22:15.819 INFO:teuthology.task.ceph.mds.0.err: 11: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x14e) [0x7e35de]
2012-01-06T20:22:15.819 INFO:teuthology.task.ceph.mds.0.err: 12: (Journaler::_finish_flush(int, unsigned long, utime_t)+0x206) [0x7d7606]
2012-01-06T20:22:15.819 INFO:teuthology.task.ceph.mds.0.err: 13: (Journaler::C_Flush::finish(int)+0x1d) [0x7e380d]
2012-01-06T20:22:15.820 INFO:teuthology.task.ceph.mds.0.err: 14: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xd6f) [0x7b36ff]
2012-01-06T20:22:15.820 INFO:teuthology.task.ceph.mds.0.err: 15: (MDS::handle_core_message(Message*)+0xebf) [0x4cbf8f]
2012-01-06T20:22:15.820 INFO:teuthology.task.ceph.mds.0.err: 16: (MDS::_dispatch(Message*)+0x3c) [0x4cc0ec]
2012-01-06T20:22:15.820 INFO:teuthology.task.ceph.mds.0.err: 17: (MDS::ms_dispatch(Message*)+0xa5) [0x4ce6e5]
2012-01-06T20:22:15.834 INFO:teuthology.task.ceph.mds.0.err: 18: (SimpleMessenger::dispatch_entry()+0x99a) [0x81c2da]
2012-01-06T20:22:15.835 INFO:teuthology.task.ceph.mds.0.err: 19: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x496acc]
2012-01-06T20:22:15.835 INFO:teuthology.task.ceph.mds.0.err: 20: (Thread::_entry_func(void*)+0x12) [0x816e22]
2012-01-06T20:22:15.835 INFO:teuthology.task.ceph.mds.0.err: 21: (()+0x7971) [0x7f4e8846a971]
2012-01-06T20:22:15.835 INFO:teuthology.task.ceph.mds.0.err: 22: (clone()+0x6d) [0x7f4e86cf992d]
2012-01-06T20:22:15.841 INFO:teuthology.task.ceph.mds.0.err:daemon-helper: command crashed with signal 11
on job
kernel:
  branch: master
nuke-on-error: true
overrides:
  ceph:
    branch: testing
    btrfs: 1
    coverage: true
    log-whitelist:
    - clocks not synchronized
roles:
- - mon.0
  - mds.0
  - osd.0
  - osd.1
- - mon.1
  - client.1
- - mon.2
  - client.0
tasks:
- ceph: null
- kclient: null
- locktest:
  - client.0
  - client.1
ubuntu@teuthology:/var/lib/teuthworker/archive/testing-2012-01-06/6533
Updated by Sage Weil over 12 years ago
- Target version deleted (v0.40)
Updated by Sage Weil over 12 years ago
Happened again on /var/lib/teuthworker/archive/nightly_coverage_2012-01-13-a/7335:
2012-01-13T02:51:06.298 INFO:teuthology.task.ceph:Shutting down mds daemons...
2012-01-13T02:51:06.300 INFO:teuthology.task.ceph.mds.0.err:*** Caught signal (Terminated) **
2012-01-13T02:51:06.300 INFO:teuthology.task.ceph.mds.0.err: in thread 7fe6353cc780. Shutting down.
2012-01-13T02:51:06.311 INFO:teuthology.task.ceph.mds.0.err:*** Caught signal (Segmentation fault) **
2012-01-13T02:51:06.311 INFO:teuthology.task.ceph.mds.0.err: in thread 7fe63152f700
2012-01-13T02:51:06.313 INFO:teuthology.task.ceph.mds.0.err: ceph version 0.39-323-g845aa53 (commit:845aa534e3e0ddc4f652879c473f011fff9c573b)
2012-01-13T02:51:06.313 INFO:teuthology.task.ceph.mds.0.err: 1: /tmp/cephtest/binary/usr/local/bin/ceph-mds() [0x91cfe4]
2012-01-13T02:51:06.313 INFO:teuthology.task.ceph.mds.0.err: 2: (()+0xfb40) [0x7fe634facb40]
2012-01-13T02:51:06.313 INFO:teuthology.task.ceph.mds.0.err: 3: (CInode::authority()+0x46) [0x71e706]
2012-01-13T02:51:06.313 INFO:teuthology.task.ceph.mds.0.err: 4: (Locker::try_eval(SimpleLock*, bool*)+0x2a) [0x6787da]
2012-01-13T02:51:06.314 INFO:teuthology.task.ceph.mds.0.err: 5: (Locker::wrlock_finish(SimpleLock*, Mutation*, bool*)+0x384) [0x6822a4]
2012-01-13T02:51:06.314 INFO:teuthology.task.ceph.mds.0.err: 6: (Locker::_drop_non_rdlocks(Mutation*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x19d) [0x68292d]
2012-01-13T02:51:06.314 INFO:teuthology.task.ceph.mds.0.err: 7: (Locker::drop_locks(Mutation*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x94) [0x683414]
2012-01-13T02:51:06.314 INFO:teuthology.task.ceph.mds.0.err: 8: (Locker::file_update_finish(CInode*, Mutation*, bool, client_t, Capability*, MClientCaps*)+0x1ce) [0x6898de]
2012-01-13T02:51:06.314 INFO:teuthology.task.ceph.mds.0.err: 9: (C_Locker_FileUpdate_finish::finish(int)+0x34) [0x69f604]
2012-01-13T02:51:06.315 INFO:teuthology.task.ceph.mds.0.err: 10: (Context::complete(int)+0x12) [0x4a0252]
2012-01-13T02:51:06.315 INFO:teuthology.task.ceph.mds.0.err: 11: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x14e) [0x7e42ce]
2012-01-13T02:51:06.315 INFO:teuthology.task.ceph.mds.0.err: 12: (Journaler::_finish_flush(int, unsigned long, utime_t)+0x206) [0x7d82f6]
2012-01-13T02:51:06.315 INFO:teuthology.task.ceph.mds.0.err: 13: (Journaler::C_Flush::finish(int)+0x1d) [0x7e44fd]
2012-01-13T02:51:06.315 INFO:teuthology.task.ceph.mds.0.err: 14: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xeb3) [0x7b4393]
2012-01-13T02:51:06.316 INFO:teuthology.task.ceph.mds.0.err: 15: (MDS::handle_core_message(Message*)+0xedf) [0x4c418f]
2012-01-13T02:51:06.316 INFO:teuthology.task.ceph.mds.0.err: 16: (MDS::_dispatch(Message*)+0x3c) [0x4c6cdc]
2012-01-13T02:51:06.316 INFO:teuthology.task.ceph.mds.0.err: 17: (MDS::ms_dispatch(Message*)+0xa9) [0x4c92c9]
2012-01-13T02:51:06.316 INFO:teuthology.task.ceph.mds.0.err: 18: (SimpleMessenger::dispatch_entry()+0x99a) [0x81ceba]
2012-01-13T02:51:06.316 INFO:teuthology.task.ceph.mds.0.err: 19: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x4972ac]
2012-01-13T02:51:06.316 INFO:teuthology.task.ceph.mds.0.err: 20: (Thread::_entry_func(void*)+0x12) [0x817a02]
2012-01-13T02:51:06.317 INFO:teuthology.task.ceph.mds.0.err: 21: (()+0x7971) [0x7fe634fa4971]
2012-01-13T02:51:06.317 INFO:teuthology.task.ceph.mds.0.err: 22: (clone()+0x6d) [0x7fe63383392d]
2012-01-13T02:51:06.464 INFO:teuthology.task.ceph.mds.0.err:daemon-helper: command crashed with signal 11

kernel:
  sha1: 28fe722b3fbdd8f891ef7c07151b1272f8e936f2
nuke-on-error: true
overrides:
  ceph:
    btrfs: 1
    coverage: true
    log-whitelist:
    - clocks not synchronized
    sha1: 845aa534e3e0ddc4f652879c473f011fff9c573b
roles:
- - mon.0
  - mon.1
  - mon.2
  - mds.0
  - osd.0
  - osd.1
- - client.1
- - client.0
tasks:
- chef: null
- ceph: null
- kclient: null
- locktest:
  - client.0
  - client.1
Updated by Sage Weil about 12 years ago
- Status changed from New to Resolved
Calling this resolved too.
Updated by John Spray over 7 years ago
- Project changed from Ceph to CephFS
- Category deleted (1)
Bulk updating project=ceph category=mds bugs so that I can remove the MDS category from the Ceph project to avoid confusion.