Bug #274

closed

OSD crash during rsync

Added by Wido den Hollander almost 14 years ago. Updated over 7 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%


Description

While trying to replicate issues #272 and #273, I started an rsync to sync kernel.org and the Ubuntu releases (both running at the same time).

After about 30 minutes of running, mds0 crashed. Some time later the client connected to mds1, which then also crashed:

[ 1054.417160] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 socket closed
[ 1055.219164] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 connection failed
[ 1056.040417] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 connection failed
[ 1058.040212] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 connection failed
[ 1062.050203] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 connection failed
[ 1091.172976] ceph: mds0 reconnect start
[ 1091.191737] ceph: mds0 reconnect success
[ 1115.162053] ceph: mds0 recovery completed
[ 1615.304132] ceph: mds0 [2001:16f8:10:2::c3c3:2e5c]:6800 socket closed
[ 1616.076977] ceph: mds0 [2001:16f8:10:2::c3c3:2e5c]:6800 connection failed
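
For anyone triaging the excerpt above, the following is a minimal sketch of how the "socket closed" events could be pulled out of the kernel client log to compare peer addresses and timing. The regular expression assumes the exact line format shown in this report, and the names `LINE_RE` and `socket_closed_events` are hypothetical helpers introduced here for illustration only; they are not part of Ceph.

```python
import re

# Kernel client log excerpt copied verbatim from this report.
LOG = """\
[ 1054.417160] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 socket closed
[ 1055.219164] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 connection failed
[ 1056.040417] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 connection failed
[ 1058.040212] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 connection failed
[ 1062.050203] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 connection failed
[ 1091.172976] ceph: mds0 reconnect start
[ 1091.191737] ceph: mds0 reconnect success
[ 1115.162053] ceph: mds0 recovery completed
[ 1615.304132] ceph: mds0 [2001:16f8:10:2::c3c3:2e5c]:6800 socket closed
[ 1616.076977] ceph: mds0 [2001:16f8:10:2::c3c3:2e5c]:6800 connection failed
"""

# Assumed line format: "[ <seconds>] ceph: mdsN [<ipv6>]:<port> <event>".
# Lines without a bracketed address (e.g. "reconnect start") do not match.
LINE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+ceph:\s+(?P<mds>mds\d+)\s+"
    r"\[(?P<addr>[0-9a-f:]+)\]:(?P<port>\d+)\s+(?P<event>.+)"
)


def socket_closed_events(log):
    """Return (timestamp, peer address) for every 'socket closed' line."""
    events = []
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if m and m.group("event") == "socket closed":
            events.append((float(m.group("ts")), m.group("addr")))
    return events


events = socket_closed_events(LOG)
for ts, addr in events:
    print(f"{ts:12.6f}s  {addr}")

# Time between the first and the second socket close.
gap = events[1][0] - events[0][0]
print(f"gap between socket closes: {gap:.1f}s")
```

Run against this excerpt, the sketch finds two "socket closed" events roughly 561 seconds apart, on two different peer addresses (ending in `3f9b` and `2e5c`), which matches the description of two separate daemons going down one after the other.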

Backtrace of mds0:

Core was generated by `/usr/bin/cmds -i 0 -c /etc/ceph/ceph.conf'.
Program terminated with signal 11, Segmentation fault.
#0  std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_Rb_tree_impl<std::less<int>, false>::_M_initialize (this=0x0, c=0x1970300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_tree.h:448
448    /usr/include/c++/4.4/bits/stl_tree.h: No such file or directory.
    in /usr/include/c++/4.4/bits/stl_tree.h
(gdb) bt
#0  std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_Rb_tree_impl<std::less<int>, false>::_M_initialize (this=0x0, c=0x1970300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_tree.h:448
#1  _Rb_tree_impl (this=0x0, c=0x1970300, f=..., l=..., auth=<value optimized out>)
    at /usr/include/c++/4.4/bits/stl_tree.h:435
#2  _Rb_tree (this=0x0, c=0x1970300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_tree.h:591
#3  multiset (this=0x0, c=0x1970300, f=..., l=..., auth=<value optimized out>)
    at /usr/include/c++/4.4/bits/stl_multiset.h:130
#4  MDSCacheObject (this=0x0, c=0x1970300, f=..., l=..., auth=<value optimized out>) at mds/mdstypes.h:1141
#5  CInode (this=0x0, c=0x1970300, f=..., l=..., auth=<value optimized out>) at mds/CInode.h:333
#6  0x00000000004d8b82 in Server::prepare_new_inode (this=<value optimized out>, mdr=0x7ff58c71d340, 
    dir=<value optimized out>, useino=..., mode=4294967295, layout=0x0) at mds/Server.cc:1572
#7  0x00000000004dbceb in Server::handle_client_openc (this=0x198fc70, mdr=0x7ff58c71d340) at mds/Server.cc:2255
#8  0x00000000004df177 in Server::handle_client_request (this=0x198fc70, req=0x7ff584767470) at mds/Server.cc:1075
#9  0x00000000004a1d01 in MDS::_dispatch (this=0x1985660, m=0x7ff584767470) at mds/MDS.cc:1423
#10 0x00000000004a22a1 in MDS::ms_dispatch (this=0x1985660, m=0x7ff584767470) at mds/MDS.cc:1309
#11 0x000000000047ebb9 in Messenger::ms_deliver_dispatch (this=0x1984ac0) at msg/Messenger.h:97
#12 SimpleMessenger::dispatch_entry (this=0x1984ac0) at msg/SimpleMessenger.cc:342
#13 0x0000000000474e3c in SimpleMessenger::DispatchThread::entry (this=0x1984f48) at msg/SimpleMessenger.h:534
#14 0x0000000000487d2a in Thread::_entry_func (arg=0x0) at ./common/Thread.h:39
#15 0x00007ff5961d79ca in start_thread () from /lib/libpthread.so.0
#16 0x00007ff5953f76cd in clone () from /lib/libc.so.6
#17 0x0000000000000000 in ?? ()
(gdb) 

Backtrace of mds1:

Core was generated by `/usr/bin/cmds -i 1 -c /etc/ceph/ceph.conf'.
Program terminated with signal 11, Segmentation fault.
#0  std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_Rb_tree_impl<std::less<int>, false>::_M_initialize (this=0x0, c=0x18b8300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_tree.h:448
448            this->_M_header._M_left = &this->_M_header;
(gdb) bt
#0  std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_Rb_tree_impl<std::less<int>, false>::_M_initialize (this=0x0, c=0x18b8300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_tree.h:448
#1  _Rb_tree_impl (this=0x0, c=0x18b8300, f=..., l=..., auth=<value optimized out>)
    at /usr/include/c++/4.4/bits/stl_tree.h:435
#2  _Rb_tree (this=0x0, c=0x18b8300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_tree.h:591
#3  multiset (this=0x0, c=0x18b8300, f=..., l=..., auth=<value optimized out>)
    at /usr/include/c++/4.4/bits/stl_multiset.h:130
#4  MDSCacheObject (this=0x0, c=0x18b8300, f=..., l=..., auth=<value optimized out>) at mds/mdstypes.h:1141
#5  CInode (this=0x0, c=0x18b8300, f=..., l=..., auth=<value optimized out>) at mds/CInode.h:333
#6  0x00000000004d8b82 in Server::prepare_new_inode (this=<value optimized out>, mdr=0x7f5dd4aa19c0, 
    dir=<value optimized out>, useino=..., mode=4294967295, layout=0x0) at mds/Server.cc:1572
#7  0x00000000004dbceb in Server::handle_client_openc (this=0x18d7c70, mdr=0x7f5dd4aa19c0) at mds/Server.cc:2255
#8  0x00000000004df177 in Server::handle_client_request (this=0x18d7c70, req=0x1c70560) at mds/Server.cc:1075
#9  0x00000000004a1d01 in MDS::_dispatch (this=0x18cd660, m=0x1c70560) at mds/MDS.cc:1423
#10 0x00000000004a22a1 in MDS::ms_dispatch (this=0x18cd660, m=0x1c70560) at mds/MDS.cc:1309
#11 0x000000000047ebb9 in Messenger::ms_deliver_dispatch (this=0x18ccac0) at msg/Messenger.h:97
#12 SimpleMessenger::dispatch_entry (this=0x18ccac0) at msg/SimpleMessenger.cc:342
#13 0x0000000000474e3c in SimpleMessenger::DispatchThread::entry (this=0x18ccf48) at msg/SimpleMessenger.h:534
#14 0x0000000000487d2a in Thread::_entry_func (arg=0x0) at ./common/Thread.h:39
#15 0x00007f5de76709ca in start_thread () from /lib/libpthread.so.0
#16 0x00007f5de688f6fd in clone () from /lib/libc.so.6
#17 0x0000000000000000 in ?? ()
(gdb) 

The log files and the core files are uploaded at: http://zooi.widodh.nl/ceph/mds_crash_during_rsync.tar
