Bug #274
OSD crash during rsync
Status: Closed
% Done: 0%
Description
Trying to replicate issues #272 and #273, I started an rsync to sync kernel.org and the Ubuntu releases (both running at the same time).
After running for about 30 minutes, mds0 crashed. After some time the client connected to mds1, which then also crashed:
[ 1054.417160] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 socket closed
[ 1055.219164] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 connection failed
[ 1056.040417] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 connection failed
[ 1058.040212] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 connection failed
[ 1062.050203] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 connection failed
[ 1091.172976] ceph: mds0 reconnect start
[ 1091.191737] ceph: mds0 reconnect success
[ 1115.162053] ceph: mds0 recovery completed
[ 1615.304132] ceph: mds0 [2001:16f8:10:2::c3c3:2e5c]:6800 socket closed
[ 1616.076977] ceph: mds0 [2001:16f8:10:2::c3c3:2e5c]:6800 connection failed
Backtrace of mds0:
Core was generated by `/usr/bin/cmds -i 0 -c /etc/ceph/ceph.conf'.
Program terminated with signal 11, Segmentation fault.
#0  std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_Rb_tree_impl<std::less<int>, false>::_M_initialize (this=0x0, c=0x1970300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_tree.h:448
448     /usr/include/c++/4.4/bits/stl_tree.h: No such file or directory.
        in /usr/include/c++/4.4/bits/stl_tree.h
(gdb) bt
#0  std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_Rb_tree_impl<std::less<int>, false>::_M_initialize (this=0x0, c=0x1970300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_tree.h:448
#1  _Rb_tree_impl (this=0x0, c=0x1970300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_tree.h:435
#2  _Rb_tree (this=0x0, c=0x1970300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_tree.h:591
#3  multiset (this=0x0, c=0x1970300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_multiset.h:130
#4  MDSCacheObject (this=0x0, c=0x1970300, f=..., l=..., auth=<value optimized out>) at mds/mdstypes.h:1141
#5  CInode (this=0x0, c=0x1970300, f=..., l=..., auth=<value optimized out>) at mds/CInode.h:333
#6  0x00000000004d8b82 in Server::prepare_new_inode (this=<value optimized out>, mdr=0x7ff58c71d340, dir=<value optimized out>, useino=..., mode=4294967295, layout=0x0) at mds/Server.cc:1572
#7  0x00000000004dbceb in Server::handle_client_openc (this=0x198fc70, mdr=0x7ff58c71d340) at mds/Server.cc:2255
#8  0x00000000004df177 in Server::handle_client_request (this=0x198fc70, req=0x7ff584767470) at mds/Server.cc:1075
#9  0x00000000004a1d01 in MDS::_dispatch (this=0x1985660, m=0x7ff584767470) at mds/MDS.cc:1423
#10 0x00000000004a22a1 in MDS::ms_dispatch (this=0x1985660, m=0x7ff584767470) at mds/MDS.cc:1309
#11 0x000000000047ebb9 in Messenger::ms_deliver_dispatch (this=0x1984ac0) at msg/Messenger.h:97
#12 SimpleMessenger::dispatch_entry (this=0x1984ac0) at msg/SimpleMessenger.cc:342
#13 0x0000000000474e3c in SimpleMessenger::DispatchThread::entry (this=0x1984f48) at msg/SimpleMessenger.h:534
#14 0x0000000000487d2a in Thread::_entry_func (arg=0x0) at ./common/Thread.h:39
#15 0x00007ff5961d79ca in start_thread () from /lib/libpthread.so.0
#16 0x00007ff5953f76cd in clone () from /lib/libc.so.6
#17 0x0000000000000000 in ?? ()
(gdb)
The backtrace of mds1:
Core was generated by `/usr/bin/cmds -i 1 -c /etc/ceph/ceph.conf'.
Program terminated with signal 11, Segmentation fault.
#0  std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_Rb_tree_impl<std::less<int>, false>::_M_initialize (this=0x0, c=0x18b8300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_tree.h:448
448     this->_M_header._M_left = &this->_M_header;
(gdb) bt
#0  std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_Rb_tree_impl<std::less<int>, false>::_M_initialize (this=0x0, c=0x18b8300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_tree.h:448
#1  _Rb_tree_impl (this=0x0, c=0x18b8300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_tree.h:435
#2  _Rb_tree (this=0x0, c=0x18b8300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_tree.h:591
#3  multiset (this=0x0, c=0x18b8300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_multiset.h:130
#4  MDSCacheObject (this=0x0, c=0x18b8300, f=..., l=..., auth=<value optimized out>) at mds/mdstypes.h:1141
#5  CInode (this=0x0, c=0x18b8300, f=..., l=..., auth=<value optimized out>) at mds/CInode.h:333
#6  0x00000000004d8b82 in Server::prepare_new_inode (this=<value optimized out>, mdr=0x7f5dd4aa19c0, dir=<value optimized out>, useino=..., mode=4294967295, layout=0x0) at mds/Server.cc:1572
#7  0x00000000004dbceb in Server::handle_client_openc (this=0x18d7c70, mdr=0x7f5dd4aa19c0) at mds/Server.cc:2255
#8  0x00000000004df177 in Server::handle_client_request (this=0x18d7c70, req=0x1c70560) at mds/Server.cc:1075
#9  0x00000000004a1d01 in MDS::_dispatch (this=0x18cd660, m=0x1c70560) at mds/MDS.cc:1423
#10 0x00000000004a22a1 in MDS::ms_dispatch (this=0x18cd660, m=0x1c70560) at mds/MDS.cc:1309
#11 0x000000000047ebb9 in Messenger::ms_deliver_dispatch (this=0x18ccac0) at msg/Messenger.h:97
#12 SimpleMessenger::dispatch_entry (this=0x18ccac0) at msg/SimpleMessenger.cc:342
#13 0x0000000000474e3c in SimpleMessenger::DispatchThread::entry (this=0x18ccf48) at msg/SimpleMessenger.h:534
#14 0x0000000000487d2a in Thread::_entry_func (arg=0x0) at ./common/Thread.h:39
#15 0x00007f5de76709ca in start_thread () from /lib/libpthread.so.0
#16 0x00007f5de688f6fd in clone () from /lib/libc.so.6
#17 0x0000000000000000 in ?? ()
(gdb)
The logfiles and the core files are uploaded at: http://zooi.widodh.nl/ceph/mds_crash_during_rsync.tar
Updated by Wido den Hollander almost 14 years ago
Starting the MDSes again also fails. When doing so, the client reconnects to mds1, which crashes after a few seconds. Then the client connects to mds0, which also crashes.
The rsync processes were still running at that moment.
The backtraces:
mds0
Core was generated by `/usr/bin/cmds -i 0 -c /etc/ceph/ceph.conf'.
Program terminated with signal 6, Aborted.
#0  0x00007f8011559a75 in raise () from /lib/libc.so.6
(gdb) bt
#0  0x00007f8011559a75 in raise () from /lib/libc.so.6
#1  0x00007f801155d5c0 in abort () from /lib/libc.so.6
#2  0x00007f8011e0e8e5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
#3  0x00007f8011e0cd16 in ?? () from /usr/lib/libstdc++.so.6
#4  0x00007f8011e0cd43 in std::terminate() () from /usr/lib/libstdc++.so.6
#5  0x00007f8011e0ce3e in __cxa_throw () from /usr/lib/libstdc++.so.6
#6  0x000000000047e220 in raw_posix_aligned (this=<value optimized out>, pm=<value optimized out>) at ./include/buffer.h:236
#7  ceph::buffer::create_page_aligned (this=<value optimized out>, pm=<value optimized out>) at ./include/buffer.h:311
#8  SimpleMessenger::Pipe::read_message (this=<value optimized out>, pm=<value optimized out>) at msg/SimpleMessenger.cc:1760
#9  0x0000000000482af3 in SimpleMessenger::Pipe::reader (this=0x7f80080028c0) at msg/SimpleMessenger.cc:1450
#10 0x0000000000474e8d in SimpleMessenger::Pipe::Reader::entry (this=<value optimized out>) at msg/SimpleMessenger.h:171
#11 0x0000000000487d2a in Thread::_entry_func (arg=0x66b0) at ./common/Thread.h:39
#12 0x00007f80123ec9ca in start_thread () from /lib/libpthread.so.0
#13 0x00007f801160c6cd in clone () from /lib/libc.so.6
#14 0x0000000000000000 in ?? ()
(gdb)
mds1
Core was generated by `/usr/bin/cmds -i 1 -c /etc/ceph/ceph.conf'.
Program terminated with signal 11, Segmentation fault.
#0  std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_Rb_tree_impl<std::less<int>, false>::_M_initialize (this=0x0, c=0x1e1a300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_tree.h:448
448     this->_M_header._M_left = &this->_M_header;
(gdb) bt
#0  std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_Rb_tree_impl<std::less<int>, false>::_M_initialize (this=0x0, c=0x1e1a300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_tree.h:448
#1  _Rb_tree_impl (this=0x0, c=0x1e1a300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_tree.h:435
#2  _Rb_tree (this=0x0, c=0x1e1a300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_tree.h:591
#3  multiset (this=0x0, c=0x1e1a300, f=..., l=..., auth=<value optimized out>) at /usr/include/c++/4.4/bits/stl_multiset.h:130
#4  MDSCacheObject (this=0x0, c=0x1e1a300, f=..., l=..., auth=<value optimized out>) at mds/mdstypes.h:1141
#5  CInode (this=0x0, c=0x1e1a300, f=..., l=..., auth=<value optimized out>) at mds/CInode.h:333
#6  0x00000000004d8b82 in Server::prepare_new_inode (this=<value optimized out>, mdr=0x2117010, dir=<value optimized out>, useino=..., mode=4294967295, layout=0x0) at mds/Server.cc:1572
#7  0x00000000004dbceb in Server::handle_client_openc (this=0x1e39c70, mdr=0x2117010) at mds/Server.cc:2255
#8  0x00000000004df177 in Server::handle_client_request (this=0x1e39c70, req=0x7f05cd7e9650) at mds/Server.cc:1075
#9  0x00000000004a1d01 in MDS::_dispatch (this=0x1e2f660, m=0x7f05cd7e9650) at mds/MDS.cc:1423
#10 0x00000000004a22a1 in MDS::ms_dispatch (this=0x1e2f660, m=0x7f05cd7e9650) at mds/MDS.cc:1309
#11 0x000000000047ebb9 in Messenger::ms_deliver_dispatch (this=0x1e2eac0) at msg/Messenger.h:97
#12 SimpleMessenger::dispatch_entry (this=0x1e2eac0) at msg/SimpleMessenger.cc:342
#13 0x0000000000474e3c in SimpleMessenger::DispatchThread::entry (this=0x1e2ef48) at msg/SimpleMessenger.h:534
#14 0x0000000000487d2a in Thread::_entry_func (arg=0x0) at ./common/Thread.h:39
#15 0x00007f05de6839ca in start_thread () from /lib/libpthread.so.0
#16 0x00007f05dd8a26fd in clone () from /lib/libc.so.6
#17 0x0000000000000000 in ?? ()
(gdb)
The logfiles and corefiles are uploaded at: http://zooi.widodh.nl/ceph/mds_crash_during_rsync.1.tar
Updated by Wido den Hollander almost 14 years ago
Of course, the title should have been "MDS crash during rsync".
I just tried to sync kernel.org again (after a mkcephfs), but it killed the MDSes again, this time with a different backtrace:
mds0
Core was generated by `/usr/bin/cmds -i 0 -c /etc/ceph/ceph.conf'.
Program terminated with signal 11, Segmentation fault.
#0  MDSCacheObject (this=0x1976230, dname=..., first=..., last=...) at mds/mdstypes.h:1141
1141    mds/mdstypes.h: No such file or directory.
        in mds/mdstypes.h
(gdb) bt
#0  MDSCacheObject (this=0x1976230, dname=..., first=..., last=...) at mds/mdstypes.h:1141
#1  CDentry (this=0x1976230, dname=..., first=..., last=...) at mds/CDentry.h:171
#2  CDir::add_null_dentry (this=0x1976230, dname=..., first=..., last=...) at mds/CDir.cc:228
#3  0x00000000004cbde1 in Server::prepare_null_dentry (this=0xe29c70, mdr=0x7f9828410050, dir=0x1976230, dname=..., okexist=<value optimized out>) at mds/Server.cc:1558
#4  0x00000000004cd139 in Server::rdlock_path_xlock_dentry (this=<value optimized out>, mdr=0x7f9828410050, n=0, rdlocks=..., wrlocks=<value optimized out>, xlocks=<value optimized out>, okexist=true, mustexist=false, alwaysxlock=true) at mds/Server.cc:1841
#5  0x00000000004d091a in Server::handle_client_rename (this=0xe29c70, mdr=0x7f9828410050) at mds/Server.cc:4038
#6  0x00000000004df177 in Server::handle_client_request (this=0xe29c70, req=0x7f982053c5c0) at mds/Server.cc:1075
#7  0x00000000004a1d01 in MDS::_dispatch (this=0xe1f660, m=0x7f982053c5c0) at mds/MDS.cc:1423
#8  0x00000000004a22a1 in MDS::ms_dispatch (this=0xe1f660, m=0x7f982053c5c0) at mds/MDS.cc:1309
#9  0x000000000047ebb9 in Messenger::ms_deliver_dispatch (this=0xe1eac0) at msg/Messenger.h:97
#10 SimpleMessenger::dispatch_entry (this=0xe1eac0) at msg/SimpleMessenger.cc:342
#11 0x0000000000474e3c in SimpleMessenger::DispatchThread::entry (this=0xe1ef48) at msg/SimpleMessenger.h:534
#12 0x0000000000487d2a in Thread::_entry_func (arg=0xd0) at ./common/Thread.h:39
#13 0x00007f98393049ca in start_thread () from /lib/libpthread.so.0
#14 0x00007f98385246cd in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
(gdb)
The client gave these messages in its dmesg:
root@client02:~# dmesg
[ 1276.437351] ceph: msg_new can't allocate 4096 bytes
[ 1276.437821] ceph: msg_new can't create type 0 front 4096
[ 1276.437878] ceph: msgpool osd_op alloc failed
[ 1276.437958] ceph: msg_new can't allocate 4096 bytes
[ 1276.438003] ceph: msg_new can't create type 0 front 4096
[ 1276.438526] ceph: msgpool osd_op alloc failed
[ 1276.438692] ceph: msg_new can't create type 0 front 4096
[ 1276.438747] ceph: msgpool osd_op alloc failed
[ 1276.438804] ceph: msg_new can't create type 0 front 512
[ 1276.439442] ceph: msgpool osd_op_reply alloc failed
[ 1276.439624] ceph: msg_new can't create type 0 front 4096
[ 1276.439678] ceph: msgpool osd_op alloc failed
[ 1330.238165] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 socket closed
[ 1331.040228] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 connection failed
[ 1332.040365] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 connection failed
[ 1334.040193] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 connection failed
[ 1338.050220] ceph: mds0 [2001:16f8:10:2::c3c3:3f9b]:6800 connection failed
[ 1363.810614] ceph: mds0 reconnect start
[ 1363.837346] ceph: mds0 reconnect success
[ 1372.627841] ceph: mds0 recovery completed
root@client02:~#
At this point I killed the rsync, so it didn't fail over to mds1.
I uploaded the core-files and log to: http://zooi.widodh.nl/ceph/mds_crash_during_rsync.2.tar
Since my MDSes and MONs were driving my system into OOM, I had been tuning the overcommit settings:
vm.overcommit_memory = 2
vm.overcommit_ratio = 10
vm.oom_kill_allocating_task = 1
When reverting this to:
vm.overcommit_memory = 0
vm.overcommit_ratio = 50
vm.oom_kill_allocating_task = 0
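For reference, a revert like the one above can be applied at runtime with sysctl (run as root; to persist across reboots, the usual approach is to put the same keys in /etc/sysctl.conf):

```shell
# Restore the kernel's default overcommit behaviour.
sysctl -w vm.overcommit_memory=0          # heuristic overcommit (the default)
sysctl -w vm.overcommit_ratio=50          # default ratio; only consulted in mode 2
sysctl -w vm.oom_kill_allocating_task=0   # let the OOM killer pick a victim

# Verify the current values:
sysctl vm.overcommit_memory vm.overcommit_ratio vm.oom_kill_allocating_task
```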
the MDSes seem to run more stably. I just resumed the rsync and the MDSes are still online.
It seems these settings have a large effect on the MDSes, is this true?
Updated by Sage Weil almost 14 years ago
Most of the crashes are definitely OOM. (The prepare_null_dentry one I'm not 100% sure about, but it does look that way.) I'm making cmds throw a more informative exception.
The mds uses lots of memory, and the default allocator can fragment quite a bit. I don't think overcommit is a good idea, at least currently. We may switch to tcmalloc (soon!), which is way more efficient and faster, but that'll happen after 0.21.
Were you overcommitting on the client side as well? I'm surprised to see OOM errors there too.
Updated by Wido den Hollander almost 14 years ago
Yes, I was overcommitting on the client side too, since it was also running OOM. But it seems these values were a bit too strict (for the moment, at least).
I'll leave it for now.
Updated by John Spray over 7 years ago
- Project changed from Ceph to CephFS
- Category deleted (1)
Bulk updating project=ceph category=mds bugs so that I can remove the MDS category from the Ceph project to avoid confusion.