Bug #259: MDS crash during log initialize
Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Description
Running the latest unstable (83d1ea6636dd432dcbb6a0c6046d551bee7be5c6), my MDSes crash while initializing their logfile.
The crash occurs on both MDSes, and there is enough free space available to create a new logfile.
The log rotation itself goes fine, that is, mds.1.log is created and so is the symlink, but then the MDS crashes with the following log lines:
root@node14:/var/log/ceph# cat mds.1.log
10.07.07_11:01:59.300353 --- 6627 opened log /var/log/ceph/mds.1.log ---
ceph version 0.21~rc (83d1ea6636dd432dcbb6a0c6046d551bee7be5c6)
10.07.07_11:01:59.301495 7fa6f6058720 mds-1.0 168 MDSCacheObject
10.07.07_11:01:59.301579 7fa6f6058720 mds-1.0 1616 CInode
10.07.07_11:01:59.301587 7fa6f6058720 mds-1.0 16 elist<>::item *7=112
10.07.07_11:01:59.301594 7fa6f6058720 mds-1.0 352 inode_t
10.07.07_11:01:59.301601 7fa6f6058720 mds-1.0 56 nest_info_t
10.07.07_11:01:59.301608 7fa6f6058720 mds-1.0 32 frag_info_t
10.07.07_11:01:59.301614 7fa6f6058720 mds-1.0 40 SimpleLock *5=200
10.07.07_11:01:59.301620 7fa6f6058720 mds-1.0 48 ScatterLock *3=144
10.07.07_11:01:59.301627 7fa6f6058720 mds-1.0 464 CDentry
10.07.07_11:01:59.301634 7fa6f6058720 mds-1.0 16 elist<>::item
10.07.07_11:01:59.301640 7fa6f6058720 mds-1.0 40 SimpleLock
10.07.07_11:01:59.301646 7fa6f6058720 mds-1.0 1528 CDir
10.07.07_11:01:59.301653 7fa6f6058720 mds-1.0 16 elist<>::item *2=32
10.07.07_11:01:59.301659 7fa6f6058720 mds-1.0 192 fnode_t
10.07.07_11:01:59.301666 7fa6f6058720 mds-1.0 56 nest_info_t *2
10.07.07_11:01:59.301672 7fa6f6058720 mds-1.0 32 frag_info_t *2
10.07.07_11:01:59.301681 7fa6f6058720 mds-1.0 168 Capability
10.07.07_11:01:59.301688 7fa6f6058720 mds-1.0 32 xlist<>::item *2=64
10.07.07_11:01:59.302113 7fa6f6057710 mds-1.0 MDS::ms_get_authorizer type=mon
10.07.07_11:01:59.302203 7fa6f3949710 mds-1.0 ms_handle_connect on 213.189.18.214:6789/0
10.07.07_11:01:59.303193 7fa6f3949710 monclient(hunting): found mon1
10.07.07_11:01:59.303528 7fa6f6058720 mds-1.0 beacon_send up:boot seq 1 (currently up:boot)
10.07.07_11:01:59.304045 7fa6f3949710 mds-1.0 handle_mds_map epoch 6 from mon1
10.07.07_11:01:59.304077 7fa6f3949710 mds-1.0 my compat compat={},rocompat={},incompat={1=base v0.20}
10.07.07_11:01:59.304095 7fa6f3949710 mds-1.0 mdsmap compat compat={},rocompat={},incompat={1=base v0.20}
10.07.07_11:01:59.304104 7fa6f3949710 mds-1.0 map says i am 213.189.18.214:6800/6627 mds-1 state down:dne
10.07.07_11:01:59.304122 7fa6f3949710 mds-1.0 not in map yet
10.07.07_11:01:59.369430 7fa6f3949710 mds-1.0 handle_mds_map epoch 7 from mon1
10.07.07_11:01:59.369482 7fa6f3949710 mds-1.0 my compat compat={},rocompat={},incompat={1=base v0.20}
10.07.07_11:01:59.369492 7fa6f3949710 mds-1.0 mdsmap compat compat={},rocompat={},incompat={1=base v0.20}
10.07.07_11:01:59.369501 7fa6f3949710 mds-1.0 map says i am 213.189.18.214:6800/6627 mds-1 state up:standby
10.07.07_11:01:59.369514 7fa6f3949710 mds-1.0 handle_mds_map standby
10.07.07_11:02:00.221097 7fa6f3949710 mds-1.0 handle_mds_map epoch 8 from mon1
10.07.07_11:02:00.221156 7fa6f3949710 mds-1.0 my compat compat={},rocompat={},incompat={1=base v0.20}
10.07.07_11:02:00.221166 7fa6f3949710 mds-1.0 mdsmap compat compat={},rocompat={},incompat={1=base v0.20}
10.07.07_11:02:00.221175 7fa6f3949710 mds0.0 map says i am 213.189.18.214:6800/6627 mds0 state up:creating
10.07.07_11:02:00.221250 7fa6f3949710 mds0.3 handle_mds_map i am now mds0.3
10.07.07_11:02:00.221260 7fa6f3949710 mds0.3 handle_mds_map state change up:standby --> up:creating
10.07.07_11:02:00.221267 7fa6f3949710 mds0.3 boot_create
10.07.07_11:02:00.221279 7fa6f3949710 mds0.3 boot_create creating fresh journal
10.07.07_11:02:00.221290 7fa6f3949710 mds0.log create empty log
root@node14:/var/log/ceph#
debug mds was set to 20.
The backtrace for both MDSes is the same:
Core was generated by `/usr/bin/cmds -i 0 -c /etc/ceph/ceph.conf'.
Program terminated with signal 11, Segmentation fault.
#0  Logger::set (this=0x0, key=5013, v=4194304) at common/Logger.cc:347
347     common/Logger.cc: No such file or directory.
        in common/Logger.cc
(gdb) bt
#0  Logger::set (this=0x0, key=5013, v=4194304) at common/Logger.cc:347
#1  0x0000000000617235 in MDLog::create (this=0x14a7580, c=0x14b3eb0) at mds/MDLog.cc:128
#2  0x000000000049523b in MDS::boot_create (this=0x14a6010) at mds/MDS.cc:957
#3  0x000000000049f7b9 in MDS::handle_mds_map (this=0x14a6010, m=0x7f1e200010d0) at mds/MDS.cc:829
#4  0x00000000004a07d5 in MDS::_dispatch (this=0x14a6010, m=0x7f1e200010d0) at mds/MDS.cc:1391
#5  0x00000000004a206d in MDS::ms_dispatch (this=0x14a6010, m=0x7f1e200010d0) at mds/MDS.cc:1319
#6  0x000000000047e949 in Messenger::ms_deliver_dispatch (this=0x1493be0) at msg/Messenger.h:97
#7  SimpleMessenger::dispatch_entry (this=0x1493be0) at msg/SimpleMessenger.cc:342
#8  0x0000000000474bdc in SimpleMessenger::DispatchThread::entry (this=0x1494068) at msg/SimpleMessenger.h:534
#9  0x0000000000487aba in Thread::_entry_func (arg=0x1) at ./common/Thread.h:39
#10 0x00007f1e279029ca in start_thread () from /lib/libpthread.so.0
#11 0x00007f1e26b226cd in clone () from /lib/libc.so.6
#12 0x0000000000000000 in ?? ()
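Judging from frame #0, Logger::set is entered with this == 0x0, i.e. MDLog::create calls set() on a logger pointer that was never initialized. Below is a minimal sketch of what that failure mode looks like; the class layout, member names and the null guard are my assumptions (only the key/value come from the backtrace), not the actual Ceph source:

#include <cstdint>
#include <iostream>

// Hypothetical stand-in for the Logger class -- not the real common/Logger.h.
struct Logger {
  void set(int key, uint64_t v) {
    // In the real crash this member function is entered with this == 0x0,
    // so the first member access faults (frame #0 in the backtrace).
    std::cout << "logger: key " << key << " = " << v << "\n";
  }
};

// Hypothetical stand-in for MDLog: the logger is held by pointer and
// stays NULL unless something initializes it before create() runs.
struct MDLog {
  Logger *logger = nullptr;

  void create() {
    // Unguarded, this would mirror the crash: logger->set(5013, 4194304)
    // with logger == NULL. Key/value below are the ones from frame #0.
    if (logger)
      logger->set(5013, 4194304);
    else
      std::cerr << "MDLog::create: logger not initialized\n";
  }
};

int main() {
  MDLog mdlog;      // nothing ever set up mdlog.logger -> it is still NULL
  mdlog.create();   // the unguarded version would segfault right here
  return 0;
}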
The core file is attached.
I tried another fresh mkcephfs, but that didn't help either; the MDSes keep crashing.