Bug #24835

osd daemon spontaneous segfault

Added by Christian Schlittchen over 5 years ago. Updated almost 4 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We are experiencing spontaneous segmentation faults of OSD daemons in our Mimic production cluster:

Jul  9 14:54:13 ceph-osd-a-02 ceph-osd[34879]: *** Caught signal (Segmentation fault) **
Jul  9 14:54:13 ceph-osd-a-02 kernel: [1211417.797932] show_signal_msg: 5 callbacks suppressed
Jul  9 14:54:13 ceph-osd-a-02 kernel: [1211417.797938] msgr-worker-2[34887]: segfault at 0 ip 00007ff3d231dbcb sp 00007ff3cc5a5680 error 4 in libtcmalloc.so.4.3.0[7ff3d22f9000+45000]
Jul  9 14:54:13 ceph-osd-a-02 ceph-osd[34879]:  in thread 7ff3cc5b0700 thread_name:msgr-worker-2
Jul  9 14:54:13 ceph-osd-a-02 systemd[1]: ceph-osd@20.service: Main process exited, code=killed, status=11/SEGV
Jul  9 14:54:13 ceph-osd-a-02 systemd[1]: ceph-osd@20.service: Failed with result 'signal'.
Jul  9 14:54:33 ceph-osd-a-02 systemd[1]: ceph-osd@20.service: Service hold-off time over, scheduling restart.
Jul  9 14:54:33 ceph-osd-a-02 systemd[1]: ceph-osd@20.service: Scheduled restart job, restart counter is at 1.
Jul  9 14:54:33 ceph-osd-a-02 systemd[1]: Stopped Ceph object storage daemon osd.20.

ceph-osd.20.log.1 (101 KB) Christian Schlittchen, 07/10/2018 12:13 PM

History

#1 Updated by Greg Farnum over 5 years ago

  • Project changed from Ceph to RADOS

Can you provide the backtrace out of the OSD log? Or even the whole log?

#2 Updated by Christian Schlittchen over 5 years ago

The log (attached) does not contain any information on the crash. It shows only the automatic restart of the crashed OSD daemon at 14:54.

#3 Updated by Josh Durgin over 5 years ago

  • Status changed from New to Need More Info

Unfortunately there's not much to go on. If this happens again, perhaps you can grab a core file, or a crash dump will show up in the log.

#4 Updated by Christian Schlittchen over 5 years ago

After spending a week trying to get Ubuntu/systemd to allow a core dump to be created, we finally have two different stack traces:
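For anyone hitting the same wall: on Ubuntu, getting systemd-managed OSDs to leave usable core files typically involves something like the following sketch (paths and the core_pattern value are our assumptions, not part of this report; adjust for your distribution):

```shell
# Let the ceph-osd units write unlimited-size core files (systemd drop-in).
mkdir -p /etc/systemd/system/ceph-osd@.service.d
cat > /etc/systemd/system/ceph-osd@.service.d/coredump.conf <<'EOF'
[Service]
LimitCORE=infinity
EOF

# Write cores to a known location instead of piping them to apport.
sysctl -w kernel.core_pattern=/var/crash/core.%e.%p

systemctl daemon-reload
```
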

===

Core was generated by `/usr/bin/ceph-osd -f --cluster ceph --id 93 
--setuser ceph --setgroup ceph'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f63f7aea727 in raise () from 
/lib/x86_64-linux-gnu/libpthread.so.0
[Current thread is 1 (Thread 0x7f63f2f17700 (LWP 723488))]
(gdb) bt
#0  0x00007f63f7aea727 in raise () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#1  0x000056472671f1ce in reraise_fatal (signum=11) at 
./src/global/signal_handler.cc:74
#2  handle_fatal_signal (signum=11) at ./src/global/signal_handler.cc:138
#3  <signal handler called>
#4  0x00007f63f9452780 in MMgrReport::encode_payload(unsigned long) () 
from /usr/lib/ceph/libceph-common.so.0
#5  0x00007f63f94600f9 in Message::encode(unsigned long, int) () from 
/usr/lib/ceph/libceph-common.so.0
#6  0x00007f63f94ff29e in AsyncConnection::prepare_send_message(unsigned 
long, Message*, ceph::buffer::list&) () from 
/usr/lib/ceph/libceph-common.so.0
#7  0x00007f63f9504f67 in AsyncConnection::handle_write() () from 
/usr/lib/ceph/libceph-common.so.0
#8  0x00007f63f951f777 in EventCenter::process_events(unsigned int, 
std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*) ()
    from /usr/lib/ceph/libceph-common.so.0
#9  0x00007f63f95220dc in ?? () from /usr/lib/ceph/libceph-common.so.0
#10 0x00007f63f73e7733 in ?? () from 
/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#11 0x00007f63f7adf6db in start_thread () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#12 0x00007f63f6aa388f in clone () from /lib/x86_64-linux-gnu/libc.so.6

and


Core was generated by `/usr/bin/ceph-osd -f --cluster ceph --id 105 
--setuser ceph --setgroup ceph'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f3c1349bbcb in 
tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, 
unsigned long, int) ()
    from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
[Current thread is 1 (Thread 0x7f3c0df2f700 (LWP 723833))]
(gdb) bt
#0  0x00007f3c1349bbcb in 
tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, 
unsigned long, int) ()
    from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#1  0x00007f3c1349bc9b in 
tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, 
unsigned long) () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#2  0x00007f3c134aa6f5 in tc_free () from 
/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#3  0x00007f3c143cb466 in ceph::BackTrace::print(std::ostream&) const () 
from /usr/lib/ceph/libceph-common.so.0
#4  0x000055c84f15f17a in handle_fatal_signal (signum=11) at 
./src/global/signal_handler.cc:119
#5  <signal handler called>
#6  0x00007f3c1349bbcb in 
tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, 
unsigned long, int) ()
    from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#7  0x00007f3c1349bc9b in 
tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, 
unsigned long) () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#8  0x00007f3c134aa6f5 in tc_free () from 
/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#9  0x00007f3c143b3b17 in std::__cxx11::_List_base<ceph::buffer::ptr, 
std::allocator<ceph::buffer::ptr> >::_M_clear() () from 
/usr/lib/ceph/libceph-common.so.0
#10 0x00007f3c1449cec1 in ceph::buffer::list::clear() () from 
/usr/lib/ceph/libceph-common.so.0
#11 0x00007f3c14527099 in AsyncConnection::process() () from 
/usr/lib/ceph/libceph-common.so.0
#12 0x00007f3c14537415 in EventCenter::process_events(unsigned int, 
std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*) ()
    from /usr/lib/ceph/libceph-common.so.0
#13 0x00007f3c1453a0dc in ?? () from /usr/lib/ceph/libceph-common.so.0
#14 0x00007f3c123ff733 in ?? () from 
/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#15 0x00007f3c12af76db in start_thread () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#16 0x00007f3c11abb88f in clone () from /lib/x86_64-linux-gnu/libc.so.6


Hope that helps.

#5 Updated by Soenke Schippmann over 5 years ago

The problem still persists with Mimic 13.2.1 (on the same cluster as above). Errors in ceph::buffer::list appear to happen at least every second day on one of our OSDs, which is a little annoying.

Core was generated by `/usr/bin/ceph-osd -f --cluster ceph --id 18 --setuser ceph --setgroup ceph'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f991b068bcb in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) ()
   from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
[Current thread is 1 (Thread 0x7f9915afc700 (LWP 5385))]
(gdb) bt
#0  0x00007f991b068bcb in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) ()
   from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#1  0x00007f991b068c9b in tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, unsigned long) () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#2  0x00007f991b0776f5 in tc_free () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#3  0x00007f9919facac9 in __cxa_demangle () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007f991bf98f50 in ceph::BackTrace::print(std::ostream&) const () from /usr/lib/ceph/libceph-common.so.0
#5  0x000055e97a6fdcfa in handle_fatal_signal (signum=11) at ./src/global/signal_handler.cc:119
#6  <signal handler called>
#7  0x00007f991b068bcb in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) ()
   from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#8  0x00007f991b068c9b in tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, unsigned long) () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#9  0x00007f991b0776f5 in tc_free () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#10 0x000055e97a173c47 in __gnu_cxx::new_allocator<std::_List_node<ceph::buffer::ptr> >::deallocate (this=0x55e9868af668, __p=<optimized out>)
    at /usr/include/c++/7/ext/new_allocator.h:125
#11 std::allocator_traits<std::allocator<std::_List_node<ceph::buffer::ptr> > >::deallocate (__a=..., __n=1, __p=<optimized out>)
    at /usr/include/c++/7/bits/alloc_traits.h:462
#12 std::__cxx11::_List_base<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >::_M_put_node (this=0x55e9868af668, __p=<optimized out>)
    at /usr/include/c++/7/bits/stl_list.h:387
#13 std::__cxx11::_List_base<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >::_M_clear (this=this@entry=0x55e9868af668) at /usr/include/c++/7/bits/list.tcc:80
#14 0x000055e97a1e7868 in std::__cxx11::_List_base<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >::~_List_base (this=0x55e9868af668, __in_chrg=<optimized out>)
    at /usr/include/c++/7/bits/stl_list.h:442
#15 std::__cxx11::list<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >::~list (this=0x55e9868af668, __in_chrg=<optimized out>)
    at /usr/include/c++/7/bits/stl_list.h:733
#16 ceph::buffer::list::~list (this=0x55e9868af668, __in_chrg=<optimized out>) at ./src/include/buffer.h:360
#17 Message::~Message (this=this@entry=0x55e9868af600, __in_chrg=<optimized out>) at ./src/msg/Message.h:319
#18 0x000055e97a1e7947 in MOSDPing::~MOSDPing (this=0x55e9868af600, __in_chrg=<optimized out>) at ./src/messages/MOSDPing.h:75
#19 MOSDPing::~MOSDPing (this=0x55e9868af600, __in_chrg=<optimized out>) at ./src/messages/MOSDPing.h:75
#20 0x00007f991c0259b2 in RefCountedObject::put() const () from /usr/lib/ceph/libceph-common.so.0
#21 0x00007f991c0e56af in AsyncConnection::write_message(Message*, ceph::buffer::list&, bool) () from /usr/lib/ceph/libceph-common.so.0
#22 0x00007f991c0eadd9 in AsyncConnection::handle_write() () from /usr/lib/ceph/libceph-common.so.0
#23 0x00007f991c105ac7 in EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*) ()
   from /usr/lib/ceph/libceph-common.so.0
#24 0x00007f991c10842c in ?? () from /usr/lib/ceph/libceph-common.so.0
#25 0x00007f9919fcc733 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#26 0x00007f991a6c46db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#27 0x00007f991968888f in clone () from /lib/x86_64-linux-gnu/libc.so.6

Today we got another segfault we have not seen so far:

Core was generated by `/usr/bin/ceph-osd -f --cluster ceph --id 22 --setuser ceph --setgroup ceph'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f3a817a40ad in tc_malloc () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
[Current thread is 1 (Thread 0x7f3a76fe5700 (LWP 5458))]
(gdb) bt
#0  0x00007f3a817a40ad in tc_malloc () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#1  0x00007f3a826c5c7d in ceph::BackTrace::print(std::ostream&) const () from /usr/lib/ceph/libceph-common.so.0
#2  0x000055797bebdcfa in handle_fatal_signal (signum=11) at ./src/global/signal_handler.cc:119
#3  <signal handler called>
#4  0x00007f3a817a674d in tc_newarray () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#5  0x00007f3a826fe022 in ?? () from /usr/lib/ceph/libceph-common.so.0
#6  0x00007f3a826fe60a in ConfFile::normalize_key_name(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
   from /usr/lib/ceph/libceph-common.so.0
#7  0x00007f3a8297bf34 in md_config_t::get_val_generic(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const ()
   from /usr/lib/ceph/libceph-common.so.0
#8  0x000055797b9d33bb in md_config_t::get_val<double> (this=this@entry=0x55797d672000, key="osd_max_pg_per_osd_hard_ratio") at ./src/common/config.h:398
#9  0x000055797b986096 in OSD::resume_creating_pg (this=this@entry=0x55797d89c000) at ./src/osd/OSD.cc:4044
#10 0x000055797b9878cc in OSD::tick_without_osd_lock (this=0x55797d89c000) at ./src/osd/OSD.cc:4866
#11 0x000055797b998ed9 in Context::complete (this=0x55797d31f330, r=<optimized out>) at ./src/include/Context.h:77
#12 0x00007f3a826f38ab in SafeTimer::timer_thread() () from /usr/lib/ceph/libceph-common.so.0
#13 0x00007f3a826f4e6d in SafeTimerThread::entry() () from /usr/lib/ceph/libceph-common.so.0
#14 0x00007f3a80df16db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#15 0x00007f3a7fdb588f in clone () from /lib/x86_64-linux-gnu/libc.so.6

#6 Updated by Brad Hubbard over 5 years ago

Soenke,

Could you upload a coredump for each of the different backtraces as well as details of your environment (this is easily achieved by uploading an sosreport)?

You can use ceph-post-file and post the ID here if you like.
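For reference, ceph-post-file uploads a file to the Ceph developers' drop point and prints back an ID that can be pasted into the tracker; a typical invocation (the path is an example) is:

```shell
# Upload a coredump to drop.ceph.com; the command prints an ID
# which identifies the upload and can be shared on the tracker.
ceph-post-file /var/crash/core.ceph-osd.34879
```
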

#7 Updated by Brad Hubbard over 5 years ago

  • Assignee set to Brad Hubbard

#8 Updated by Soenke Schippmann over 5 years ago

Hi Brad,

thanks for investigating this issue and sorry for my late response, I was on holidays.

The file IDs as requested:

  • core.msgr-worker (my first backtrace): 90e0a562-bd36-413c-b9c5-28450e3fcd2c
  • core.safe_timer (second backtrace): 67717cd0-e2f1-4b71-9ac7-1c6a626fb57a
  • sosreport: 679be828-1483-4c78-af49-63958eff80b9

#9 Updated by Brad Hubbard over 5 years ago

  • Status changed from Need More Info to New

Looking into this. Will update when I have analysed these coredumps.

In the meantime, if you get any that have a different stack trace to these, please upload them. This looks like some sort of memory corruption, and in those cases you can quite often get multiple different stacks, so any further information helps.

#10 Updated by Soenke Schippmann over 5 years ago

We do have some more core dumps with different stack traces.

Coredump: ebb8eff9-b0d6-4321-b85b-d31be87ed7c2

Core was generated by `/usr/bin/ceph-osd -f --cluster ceph --id 195 --setuser ceph --setgroup ceph'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f4a2b887bcb in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) ()
   from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
[Current thread is 1 (Thread 0x7f4a0489e700 (LWP 2306126))]
(gdb) bt
#0  0x00007f4a2b887bcb in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) ()
   from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#1  0x00007f4a2b887c9b in tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, unsigned long) () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#2  0x00007f4a2b8966f5 in tc_free () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#3  0x00007f4a2c7b7db6 in ceph::BackTrace::print(std::ostream&) const () from /usr/lib/ceph/libceph-common.so.0
#4  0x000055f5c4806cfa in handle_fatal_signal (signum=11) at ./src/global/signal_handler.cc:119
#5  <signal handler called>
#6  0x00007f4a2b887bcb in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) ()
   from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#7  0x00007f4a2b887c9b in tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, unsigned long) () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#8  0x00007f4a2b8966f5 in tc_free () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#9  0x00007f4a2c7a1bbf in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) () from /usr/lib/ceph/libceph-common.so.0
#10 0x00007f4a2c942ae6 in ?? () from /usr/lib/ceph/libceph-common.so.0
#11 0x00007f4a2c942de5 in operator<<(std::ostream&, hobject_t const&) () from /usr/lib/ceph/libceph-common.so.0
#12 0x000055f5c45926f5 in MOSDRepOp::print (this=0x55f5ccefb000, out=...) at ./src/messages/MOSDRepOp.h:169
#13 0x00007f4a2c9075cb in AsyncConnection::send_message(Message*) () from /usr/lib/ceph/libceph-common.so.0
#14 0x000055f5c42ad74b in OSDService::send_message_osd_cluster (this=0x55f5c62e76b0, peer=170, m=0x55f5ccefb000, from_epoch=<optimized out>) at ./src/osd/OSD.cc:873
#15 0x000055f5c4580ded in ReplicatedBackend::issue_op (this=this@entry=0x55f5ca25ad80, soid=..., at_version=..., tid=tid@entry=6962, reqid=..., pg_trim_to=..., 
    pg_roll_forward_to=..., new_temp_oid=..., discard_temp_oid=..., log_entries=..., hset_hist=..., op=<optimized out>, op_t=...) at ./src/osd/ReplicatedBackend.cc:983
#16 0x000055f5c4591275 in ReplicatedBackend::submit_transaction (this=0x55f5ca25ad80, soid=..., delta_stats=..., at_version=..., _t=..., trim_to=..., roll_forward_to=..., 
    _log_entries=std::vector of length 1, capacity 1 = {...}, hset_history=..., on_all_commit=0x55f5caae34c0, tid=6962, reqid=..., orig_op=...)
    at ./src/osd/ReplicatedBackend.cc:469
#17 0x000055f5c4408a93 in PrimaryLogPG::issue_repop (this=this@entry=0x55f5ca0e4800, repop=repop@entry=0x55f5d59f1800, ctx=ctx@entry=0x55f5ce9f5800)
    at ./src/osd/PrimaryLogPG.cc:10359
#18 0x000055f5c444ad2f in PrimaryLogPG::execute_ctx (this=this@entry=0x55f5ca0e4800, ctx=ctx@entry=0x55f5ce9f5800) at ./src/osd/PrimaryLogPG.cc:4014
#19 0x000055f5c444eb05 in PrimaryLogPG::do_op (this=this@entry=0x55f5ca0e4800, op=...) at ./src/osd/PrimaryLogPG.cc:2385
#20 0x000055f5c4452d13 in PrimaryLogPG::do_request (this=0x55f5ca0e4800, op=..., handle=...) at ./src/osd/PrimaryLogPG.cc:1817
#21 0x000055f5c42acf88 in OSD::dequeue_op (this=this@entry=0x55f5c62e6000, pg=..., op=..., handle=...) at ./src/osd/OSD.cc:8895
#22 0x000055f5c4526d92 in PGOpItem::run (this=<optimized out>, osd=0x55f5c62e6000, sdata=<optimized out>, pg=..., handle=...) at ./src/osd/OpQueueItem.cc:24
#23 0x000055f5c42cc7f8 in OpQueueItem::run (handle=..., pg=..., sdata=<optimized out>, osd=<optimized out>, this=0x7f4a04899080) at ./src/osd/OpQueueItem.h:134
#24 OSD::ShardedOpWQ::_process (this=0x55f5c62e7290, thread_index=<optimized out>, hb=<optimized out>) at ./src/osd/OSD.cc:9899
#25 0x00007f4a2c7ede6e in ShardedThreadPool::shardedthreadpool_worker(unsigned int) () from /usr/lib/ceph/libceph-common.so.0
#26 0x00007f4a2c7efef0 in ShardedThreadPool::WorkThreadSharded::entry() () from /usr/lib/ceph/libceph-common.so.0
#27 0x00007f4a2aee36db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#28 0x00007f4a29ea788f in clone () from /lib/x86_64-linux-gnu/libc.so.6

#11 Updated by Soenke Schippmann over 5 years ago

Coredump: bf9b2d5c-96f5-4d30-b852-3888dda66a6b

Core was generated by `/usr/bin/ceph-osd -f --cluster ceph --id 88 --setuser ceph --setgroup ceph'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fcab5f39e97 in raise () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fcab148e700 (LWP 5071))]
(gdb) bt
#0  0x00007fcab5f39e97 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fcab5f3b801 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fcab895e080 in ceph::__ceph_assert_fail(char const*, char const*, int, char const*) () from /usr/lib/ceph/libceph-common.so.0
#3  0x00007fcab895e0f7 in ceph::__ceph_assert_fail(ceph::assert_data const&) () from /usr/lib/ceph/libceph-common.so.0
#4  0x00007fcab8b751f1 in ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d const*, char const*, unsigned long) () from /usr/lib/ceph/libceph-common.so.0
#5  0x00007fcab8b75606 in ceph::HeartbeatMap::is_healthy() () from /usr/lib/ceph/libceph-common.so.0
#6  0x00007fcab8b75a74 in ceph::HeartbeatMap::check_touch_file() () from /usr/lib/ceph/libceph-common.so.0
#7  0x00007fcab8b60151 in CephContextServiceThread::entry() () from /usr/lib/ceph/libceph-common.so.0
#8  0x00007fcab70586db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#9  0x00007fcab601c88f in clone () from /lib/x86_64-linux-gnu/libc.so.6

#12 Updated by Soenke Schippmann over 5 years ago

Coredump: 258b1ec0-ebc6-43df-b35e-f16a780148b5

Core was generated by `/usr/bin/ceph-osd -f --cluster ceph --id 134 --setuser ceph --setgroup ceph'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f3370d98727 in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
[Current thread is 1 (Thread 0x7f335b76a700 (LWP 6082))]
(gdb) bt
#0  0x00007f3370d98727 in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x000056547bf36d4e in reraise_fatal (signum=11) at ./src/global/signal_handler.cc:74
#2  handle_fatal_signal (signum=11) at ./src/global/signal_handler.cc:138
#3  <signal handler called>
#4  std::__cxx11::_List_base<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >::_M_clear (this=this@entry=0x56549827f1a0) at /usr/include/c++/7/bits/list.tcc:73
#5  0x000056547be52f7a in std::__cxx11::_List_base<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >::~_List_base (this=0x56549827f1a0, __in_chrg=<optimized out>)
    at /usr/include/c++/7/bits/stl_list.h:442
#6  std::__cxx11::list<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >::~list (this=0x56549827f1a0, __in_chrg=<optimized out>)
    at /usr/include/c++/7/bits/stl_list.h:733
#7  ceph::buffer::list::~list (this=0x56549827f1a0, __in_chrg=<optimized out>) at ./src/include/buffer.h:360
#8  BlueStore::Buffer::~Buffer (this=0x56549827f180, __in_chrg=<optimized out>) at ./src/os/bluestore/BlueStore.h:158
#9  std::default_delete<BlueStore::Buffer>::operator() (this=<optimized out>, __ptr=0x56549827f180) at /usr/include/c++/7/bits/unique_ptr.h:78
#10 std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> >::~unique_ptr (this=<optimized out>, __in_chrg=<optimized out>)
    at /usr/include/c++/7/bits/unique_ptr.h:268
#11 std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > >::~pair (this=<optimized out>, __in_chrg=<optimized out>)
    at /usr/include/c++/7/bits/stl_pair.h:198
#12 mempool::pool_allocator<(mempool::pool_index_t)4, std::_Rb_tree_node<std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > > >::destroy<std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > > (this=<optimized out>, 
    p=<optimized out>) at ./src/include/mempool.h:373
#13 std::allocator_traits<mempool::pool_allocator<(mempool::pool_index_t)4, std::_Rb_tree_node<std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > > > >::_S_destroy<mempool::pool_allocator<(mempool::pool_index_t)4, std::_Rb_tree_node<std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > > >, std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > > (
    __a=..., __p=<optimized out>) at /usr/include/c++/7/bits/alloc_traits.h:256
#14 std::allocator_traits<mempool::pool_allocator<(mempool::pool_index_t)4, std::_Rb_tree_node<std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > > > >::destroy<std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > > (__a=..., 
    __p=<optimized out>) at /usr/include/c++/7/bits/alloc_traits.h:356
#15 std::_Rb_tree<unsigned int, std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > >, std::_Select1st<std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > >, std::less<unsigned int>, mempool::pool_allocator<(mempool::pool_index_t)4, std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > > >::_M_destroy_node (this=0x56548f09d438, __p=0x5654c2b2d5f0)
    at /usr/include/c++/7/bits/stl_tree.h:650
#16 std::_Rb_tree<unsigned int, std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > >, std::_Select1st<std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > >, std::less<unsigned int>, mempool::pool_allocator<(mempool::pool_index_t)4, std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > > >::_M_drop_node (__p=0x5654c2b2d5f0, this=0x56548f09d438)
    at /usr/include/c++/7/bits/stl_tree.h:658
#17 std::_Rb_tree<unsigned int, std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > >, std::_Select1st<std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > >, std::less<unsigned int>, mempool::pool_allocator<(mempool::pool_index_t)4, std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > > >::_M_erase_aux (this=this@entry=0x56548f09d438, 
    __position=..., __position@entry={first = 176128, second = std::unique_ptr<BlueStore::Buffer> = {get() = 0x56549827f180}}) at /usr/include/c++/7/bits/stl_tree.h:2477
#18 0x000056547be53166 in std::_Rb_tree<unsigned int, std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > >, std::_Select1st<std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > >, std::less<unsigned int>, mempool::pool_allocator<(mempool::pool_index_t)4, std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > > >::_M_erase_aux (__last=
    {first = 192512, second = std::unique_ptr<BlueStore::Buffer> = {get() = 0x565488c22dc0}}, __first=
    {first = 192512, second = std::unique_ptr<BlueStore::Buffer> = {get() = 0x565488c22dc0}}, this=0x56548f09d438) at /usr/include/c++/7/bits/stl_tree.h:2491
#19 std::_Rb_tree<unsigned int, std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > >, std::_Select1st<std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > >, std::less<unsigned int>, mempool::pool_allocator<(mempool::pool_index_t)4, std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > > >::erase (this=this@entry=0x56548f09d438, 
    __x=@0x56549827f198: 176128) at /usr/include/c++/7/bits/stl_tree.h:2502
#20 0x000056547bdea2cd in std::map<unsigned int, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> >, std::less<unsigned int>, mempool::pool_allocator<(mempool::pool_index_t)4, std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > > >::erase (
    __x=@0x56549827f198: 176128, this=0x56548f09d438) at /usr/include/c++/7/bits/stl_map.h:1063
#21 BlueStore::BufferSpace::finish_write (this=0x56548f09d438, cache=0x56547e344d80, seq=581695) at ./src/os/bluestore/BlueStore.cc:1445
#22 0x000056547bdf4236 in BlueStore::_txc_finish (this=this@entry=0x56547e6f4000, txc=txc@entry=0x5654d719b000) at ./src/os/bluestore/BlueStore.cc:8675
#23 0x000056547be033ba in BlueStore::_txc_state_proc (this=this@entry=0x56547e6f4000, txc=0x5654d719b000) at ./src/os/bluestore/BlueStore.cc:8489
#24 0x000056547be045e4 in BlueStore::_kv_finalize_thread (this=0x56547e6f4000) at ./src/os/bluestore/BlueStore.cc:9211
#25 0x000056547be5c3ad in BlueStore::KVFinalizeThread::entry (this=<optimized out>) at ./src/os/bluestore/BlueStore.h:1784
#26 0x00007f3370d8d6db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#27 0x00007f336fd5188f in clone () from /lib/x86_64-linux-gnu/libc.so.6

#13 Updated by Brad Hubbard over 5 years ago

Thanks Soenke,

These should help to isolate the problem.

#14 Updated by Brad Hubbard over 5 years ago

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007fcab5f3b801 in __GI_abort () at abort.c:79
#2  0x00007fcab895e080 in ceph::__ceph_assert_fail (assertion=<optimized out>, file=<optimized out>, line=<optimized out>, func=<optimized out>) at ./src/common/assert.cc:66
#3  0x00007fcab895e0f7 in ceph::__ceph_assert_fail (ctx=...) at ./src/common/assert.cc:71
#4  0x00007fcab8b751f1 in ceph::HeartbeatMap::_check (this=this@entry=0x55aae7faa640, h=0x55aaf169c190, who=who@entry=0x7fcab8dd8639 "is_healthy", now=now@entry=2953111) at ./src/common/HeartbeatMap.cc:82
#5  0x00007fcab8b75606 in ceph::HeartbeatMap::is_healthy (this=this@entry=0x55aae7faa640) at ./src/common/HeartbeatMap.cc:142
#6  0x00007fcab8b75a74 in ceph::HeartbeatMap::check_touch_file (this=0x55aae7faa640) at ./src/common/HeartbeatMap.cc:170
#7  0x00007fcab8b60151 in CephContextServiceThread::entry (this=0x55aae7faf1d0) at ./src/common/ceph_context.cc:159
#8  0x00007fcab70586db in start_thread (arg=0x7fcab148e700) at pthread_create.c:463
#9  0x00007fcab601c88f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) f 4
#4  0x00007fcab8b751f1 in ceph::HeartbeatMap::_check (this=this@entry=0x55aae7faa640, h=0x55aaf169c190, who=who@entry=0x7fcab8dd8639 "is_healthy", now=now@entry=2953111) at ./src/common/HeartbeatMap.cc:82
82          assert(0 == "hit suicide timeout");

This one hit the suicide timeout. That's expected if a thread can't flush IO to the device. This may have happened because of the bug you are seeing but is not representative of the bug itself.

#15 Updated by Brad Hubbard over 5 years ago

Starting with the bluestore bufferlist destructor crash.

(gdb) bt 7
#0  raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x000056547bf36d4e in reraise_fatal (signum=11) at ./src/global/signal_handler.cc:74
#2  handle_fatal_signal (signum=11) at ./src/global/signal_handler.cc:138
#3  <signal handler called>
#4  std::__cxx11::_List_base<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >::_M_clear (this=this@entry=0x56549827f1a0) at /usr/include/c++/7/bits/list.tcc:73
#5  0x000056547be52f7a in std::__cxx11::_List_base<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >::~_List_base (this=0x56549827f1a0, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/stl_list.h:442
#6  std::__cxx11::list<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >::~list (this=0x56549827f1a0, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/stl_list.h:733
(More stack frames follow...)

So we've crashed while deleting a list of ceph buffer pointers.

(gdb) f 4
#4  std::__cxx11::_List_base<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >::_M_clear (this=this@entry=0x56549827f1a0) at /usr/include/c++/7/bits/list.tcc:73
73      in /usr/include/c++/7/bits/list.tcc
(gdb) x/i $pc
=> 0x56547b9acc30 <std::__cxx11::_List_base<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >::_M_clear()+16>:     mov    (%rbx),%rbp
(gdb) inf reg rbx
rbx            0x0      0

We crashed because we tried to dereference an address which was 0x0. How did that happen?

(gdb) disass
Dump of assembler code for function std::__cxx11::_List_base<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >::_M_clear():
   0x000056547b9acc20 <+0>:     push   %r12
   0x000056547b9acc22 <+2>:     push   %rbp
   0x000056547b9acc23 <+3>:     push   %rbx
   0x000056547b9acc24 <+4>:     mov    (%rdi),%rbx
   0x000056547b9acc27 <+7>:     cmp    %rdi,%rbx
   0x000056547b9acc2a <+10>:    je     0x56547b9acc4c <std::__cxx11::_List_base<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >::_M_clear()+44>
   0x000056547b9acc2c <+12>:    mov    %rdi,%r12
   0x000056547b9acc2f <+15>:    nop
=> 0x000056547b9acc30 <+16>:    mov    (%rbx),%rbp
   0x000056547b9acc33 <+19>:    lea    0x10(%rbx),%rdi
   0x000056547b9acc37 <+23>:    callq  0x56547b8dba60 <_ZN4ceph6buffer3ptr7releaseEv@plt>
   0x000056547b9acc3c <+28>:    mov    %rbx,%rdi
   0x000056547b9acc3f <+31>:    mov    %rbp,%rbx
   0x000056547b9acc42 <+34>:    callq  0x56547b8dd110 <_ZdlPv@plt>
   0x000056547b9acc47 <+39>:    cmp    %r12,%rbp
   0x000056547b9acc4a <+42>:    jne    0x56547b9acc30 <std::__cxx11::_List_base<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >::_M_clear()+16>
   0x000056547b9acc4c <+44>:    pop    %rbx
   0x000056547b9acc4d <+45>:    pop    %rbp
   0x000056547b9acc4e <+46>:    pop    %r12
   0x000056547b9acc50 <+48>:    retq   
End of assembler dump.

The above code is basically a loop: at offset '+31' rbx is assigned the next
node in the list, the jump instruction at '+42' takes us back to '+16', and the
dereference of the 'rbx' register there is where we crash, after multiple
iterations of the loop. How many? 88.
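In source terms, this loop is libstdc++'s `_List_base::_M_clear` walking the doubly linked node chain and freeing each node. The following is a simplified sketch of that walk; the `Node` layout and `clear` function are illustrative stand-ins, not the real libstdc++ code:

```cpp
// Minimal stand-in for libstdc++'s list node header. The payload
// (the ceph::buffer::ptr) would live right after these two pointers,
// at offset 0x10, matching the 'lea 0x10(%rbx),%rdi' above.
struct Node {
    Node* next;
    Node* prev;
};

// Sketch of _M_clear(): 'sentinel' plays the role of %rdi/%r12,
// 'cur' is %rbx and 'nxt' is %rbp in the disassembly.
// Returns the number of nodes freed.
int clear(Node* sentinel) {
    int freed = 0;
    Node* cur = sentinel->next;      // mov (%rdi),%rbx
    while (cur != sentinel) {        // cmp %rdi,%rbx / jne
        Node* nxt = cur->next;       // mov (%rbx),%rbp  <- crash site when cur == nullptr
        delete cur;                  // ptr::release + operator delete in the real code
        ++freed;
        cur = nxt;
    }
    return freed;
}
```

If any node's `next` field has been corrupted to 0, the `cur->next` load on the following iteration dereferences a null pointer, which is exactly the faulting instruction at '+16'.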

We can simulate that in gdb with the following.

(gdb) p
*this->_M_impl->_M_node->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next->_M_next
$98 = {
    _M_next = 0x0,
    _M_prev = 0x5654b7ae9fe8
}

How did I know it was 88 iterations?

(gdb) f 6
#6  std::__cxx11::list<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >::~list (this=0x56549827f1a0, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/stl_list.h:733
733     in /usr/include/c++/7/bits/stl_list.h
(gdb) whatis this
type = std::list<ceph::buffer::ptr> * const
(gdb) p *this         
$105 = std::__cxx11::list = {
  [0] = {          
    _raw = 0x0,    
    _off = 1805377952,
    _len = 32563      
  },
...
  [87] = {
    _raw = 0x1,
    _off = 1927580928,
    _len = 32563
  },
  [88] = Cannot access memory at address 0x10

It's complaining about address 0x10 because it's trying to access the
'_M_storage' member of a null node pointer; '_M_storage' sits at offset 0x10,
as per '+19' in the assembly above and the following.

(gdb) whatis this->_M_impl->_M_node
type = std::_List_node<unsigned long>
(gdb) p ((std::_List_node<unsigned long>*)0x0)->_M_storage
Cannot access memory at address 0x10

Looking at the memory around the corruption there don't appear to be any hints
as to how the corruption got there.

#16 Updated by Brad Hubbard over 5 years ago

For the rest of the coredumps, adding the debuginfo for libtcmalloc really
helps in understanding the problem, as we end up with stacks like the following
for the 'tp_osd_tp' coredump.

(gdb) bt
#0  tcmalloc::SLL_Next (t=0x0) at src/linked_list.h:45
#1  tcmalloc::SLL_PopRange (end=<synthetic pointer>, start=<synthetic pointer>, N=512, head=0x55f5c565b4b8) at src/linked_list.h:76
#2  tcmalloc::ThreadCache::FreeList::PopRange (end=<synthetic pointer>, start=<synthetic pointer>, N=512, this=0x55f5c565b4b8) at src/thread_cache.h:225
#3  tcmalloc::ThreadCache::ReleaseToCentralCache (this=this@entry=0x55f5c565b440, src=src@entry=0x55f5c565b4b8, cl=<optimized out>, N=N@entry=512) at src/thread_cache.cc:195
#4  0x00007f4a2b887c9b in tcmalloc::ThreadCache::ListTooLong (this=this@entry=0x55f5c565b440, list=0x55f5c565b4b8, cl=<optimized out>) at src/thread_cache.cc:157
#5  0x00007f4a2b8966f5 in tcmalloc::ThreadCache::Deallocate (cl=<optimized out>, ptr=0x55f5cebf4640, this=0x55f5c565b440) at src/thread_cache.h:387
#6  (anonymous namespace)::do_free_helper (invalid_free_fn=0x7f4a2b876ce0 <(anonymous namespace)::InvalidFree(void*)>, size_hint=0, use_hint=false, heap_must_be_valid=true, heap=0x55f5c565b440, ptr=0x55f5cebf4640)
    at src/tcmalloc.cc:1305
#7  (anonymous namespace)::do_free_with_callback (invalid_free_fn=0x7f4a2b876ce0 <(anonymous namespace)::InvalidFree(void*)>, size_hint=0, use_hint=false, ptr=0x55f5cebf4640) at src/tcmalloc.cc:1337
#8  (anonymous namespace)::do_free (ptr=0x55f5cebf4640) at src/tcmalloc.cc:1345
#9  tc_free (ptr=0x55f5cebf4640) at src/tcmalloc.cc:1610
#10 0x00007f4a2c7b7db6 in ceph::BackTrace::print (this=this@entry=0x7f4a048967a0, out=...) at ./src/common/BackTrace.cc:64
#11 0x000055f5c4806cfa in handle_fatal_signal (signum=11) at ./src/global/signal_handler.cc:119
#12 <signal handler called>
#13 tcmalloc::SLL_Next (t=0x0) at src/linked_list.h:45
#14 tcmalloc::SLL_PopRange (end=<synthetic pointer>, start=<synthetic pointer>, N=512, head=0x55f5c565b4b8) at src/linked_list.h:76
#15 tcmalloc::ThreadCache::FreeList::PopRange (end=<synthetic pointer>, start=<synthetic pointer>, N=512, this=0x55f5c565b4b8) at src/thread_cache.h:225
#16 tcmalloc::ThreadCache::ReleaseToCentralCache (this=this@entry=0x55f5c565b440, src=src@entry=0x55f5c565b4b8, cl=<optimized out>, N=N@entry=512) at src/thread_cache.cc:195
#17 0x00007f4a2b887c9b in tcmalloc::ThreadCache::ListTooLong (this=this@entry=0x55f5c565b440, list=0x55f5c565b4b8, cl=<optimized out>) at src/thread_cache.cc:157
#18 0x00007f4a2b8966f5 in tcmalloc::ThreadCache::Deallocate (cl=<optimized out>, ptr=0x55f5cebf4640, this=0x55f5c565b440) at src/thread_cache.h:387
#19 (anonymous namespace)::do_free_helper (invalid_free_fn=0x7f4a2b876ce0 <(anonymous namespace)::InvalidFree(void*)>, size_hint=0, use_hint=false, heap_must_be_valid=true, heap=0x55f5c565b440, ptr=0x55f5cebf4640)
    at src/tcmalloc.cc:1305
#20 (anonymous namespace)::do_free_with_callback (invalid_free_fn=0x7f4a2b876ce0 <(anonymous namespace)::InvalidFree(void*)>, size_hint=0, use_hint=false, ptr=0x55f5cebf4640) at src/tcmalloc.cc:1337
#21 (anonymous namespace)::do_free (ptr=0x55f5cebf4640) at src/tcmalloc.cc:1345
#22 tc_free (ptr=0x55f5cebf4640) at src/tcmalloc.cc:1610
#23 0x00007f4a2c7a1bbf in __gnu_cxx::new_allocator<char>::deallocate (this=<optimized out>, __p=<optimized out>) at /usr/include/c++/7/ext/new_allocator.h:125
#24 std::allocator_traits<std::allocator<char> >::deallocate (__a=..., __n=<optimized out>, __p=<optimized out>) at /usr/include/c++/7/bits/alloc_traits.h:462
#25 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_destroy (__size=<optimized out>, this=0x7f4a04897a40) at /usr/include/c++/7/bits/basic_string.h:226
#26 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_dispose (this=0x7f4a04897a40) at /usr/include/c++/7/bits/basic_string.h:221
#27 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate (this=this@entry=0x7f4a04897a40, __pos=__pos@entry=30, __len1=__len1@entry=0, __s=__s@entry=0x0, __len2=__len2@entry=1)
    at /usr/include/c++/7/bits/basic_string.tcc:327
#28 0x00007f4a2c942ae6 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::push_back (__c=48 '0', this=0x7f4a04897a40) at /usr/include/c++/7/bits/basic_string.h:1334
#29 append_out_escaped (in="rbd_data.1c95c36b8b4567.", '0' <repeats 12 times>, "cf53", out=out@entry=0x7f4a04897a40) at ./src/common/hobject.cc:212
#30 0x00007f4a2c942de5 in operator<< (out=..., o=...) at ./src/common/hobject.cc:257
#31 0x000055f5c45926f5 in MOSDRepOp::print (this=0x55f5ccefb000, out=...) at ./src/messages/MOSDRepOp.h:169
#32 0x00007f4a2c9075cb in operator<< (m=..., out=...) at ./src/msg/Message.h:501
#33 AsyncConnection::send_message (this=0x55f5cc01ae00, m=0x55f5ccefb000) at ./src/msg/async/AsyncConnection.cc:1925
#34 0x000055f5c42ad74b in OSDService::send_message_osd_cluster (this=0x55f5c62e76b0, peer=170, m=0x55f5ccefb000, from_epoch=<optimized out>) at ./src/osd/OSD.cc:873
#35 0x000055f5c4580ded in ReplicatedBackend::issue_op (this=this@entry=0x55f5ca25ad80, soid=..., at_version=..., tid=tid@entry=6962, reqid=..., pg_trim_to=..., pg_roll_forward_to=..., new_temp_oid=..., discard_temp_oid=...,
    log_entries=..., hset_hist=..., op=<optimized out>, op_t=...) at ./src/osd/ReplicatedBackend.cc:983
#36 0x000055f5c4591275 in ReplicatedBackend::submit_transaction (this=0x55f5ca25ad80, soid=..., delta_stats=..., at_version=..., _t=..., trim_to=..., roll_forward_to=..., _log_entries=std::vector of length 1, capacity 1 = {...},
    hset_history=..., on_all_commit=0x55f5caae34c0, tid=6962, reqid=..., orig_op=...) at ./src/osd/ReplicatedBackend.cc:469
#37 0x000055f5c4408a93 in PrimaryLogPG::issue_repop (this=this@entry=0x55f5ca0e4800, repop=repop@entry=0x55f5d59f1800, ctx=ctx@entry=0x55f5ce9f5800) at ./src/osd/PrimaryLogPG.cc:10359
#38 0x000055f5c444ad2f in PrimaryLogPG::execute_ctx (this=this@entry=0x55f5ca0e4800, ctx=ctx@entry=0x55f5ce9f5800) at ./src/osd/PrimaryLogPG.cc:4014
#39 0x000055f5c444eb05 in PrimaryLogPG::do_op (this=this@entry=0x55f5ca0e4800, op=...) at ./src/osd/PrimaryLogPG.cc:2385
#40 0x000055f5c4452d13 in PrimaryLogPG::do_request (this=0x55f5ca0e4800, op=..., handle=...) at ./src/osd/PrimaryLogPG.cc:1817
#41 0x000055f5c42acf88 in OSD::dequeue_op (this=this@entry=0x55f5c62e6000, pg=..., op=..., handle=...) at ./src/osd/OSD.cc:8895
#42 0x000055f5c4526d92 in PGOpItem::run (this=<optimized out>, osd=0x55f5c62e6000, sdata=<optimized out>, pg=..., handle=...) at ./src/osd/OpQueueItem.cc:24
#43 0x000055f5c42cc7f8 in OpQueueItem::run (handle=..., pg=..., sdata=<optimized out>, osd=<optimized out>, this=0x7f4a04899080) at ./src/osd/OpQueueItem.h:134
#44 OSD::ShardedOpWQ::_process (this=0x55f5c62e7290, thread_index=<optimized out>, hb=<optimized out>) at ./src/osd/OSD.cc:9899
#45 0x00007f4a2c7ede6e in ShardedThreadPool::shardedthreadpool_worker (this=0x55f5c62e6a10, thread_index=14) at ./src/common/WorkQueue.cc:339
#46 0x00007f4a2c7efef0 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>) at ./src/common/WorkQueue.h:690
#47 0x00007f4a2aee36db in start_thread (arg=0x7f4a0489e700) at pthread_create.c:463
#48 0x00007f4a29ea788f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

So we can see tcmalloc is walking through a range of freelist entries in order
to release memory. Interestingly, this thread segfaulted twice: once before the
signal handler (which caught the segfault signal), and then again inside the
signal handler, when we tried to free some memory and hit the same issue.

(gdb) f 14
#14 tcmalloc::SLL_PopRange (end=<synthetic pointer>, start=<synthetic pointer>, N=512, head=0x55f5c565b4b8) at src/linked_list.h:76
76          tmp = SLL_Next(tmp);
(gdb) down
#13 tcmalloc::SLL_Next (t=0x0) at src/linked_list.h:45
45        return *(reinterpret_cast<void**>(t));
(gdb) x/i $pc
=> 0x7f4a2b887bcb <tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int)+219>:     mov    (%rdx),%rdx
(gdb) i reg rdx
rdx            0x0      0
(gdb) p t
$1 = (void *) 0x0

So that address is not expected to ever be zero.

(gdb) f 14
#14 tcmalloc::SLL_PopRange (end=<synthetic pointer>, start=<synthetic pointer>, N=512, head=0x55f5c565b4b8) at src/linked_list.h:76
76          tmp = SLL_Next(tmp);
(gdb) l
71          return;
72        }
73
74        void *tmp = *head;
75        for (int i = 1; i < N; ++i) {
76          tmp = SLL_Next(tmp);
77        }
78
79        *start = *head;
80        *end = tmp;
(gdb) p i
$5 = 71

Basically this code just walks through a list of pointers, dereferencing each
in turn, and it encountered the null pointer on the 71st iteration. We can
simulate that by hand (I'll spare you the entire sequence).

(gdb) x head
0x55f5c565b4b8: 0x55f5cebf4640
(gdb) x 0x55f5cebf4640
0x55f5cebf4640: 0x55f5cbeae360
...
(gdb) x 0x55f5ca2519a0
0x55f5ca2519a0: 0x55f5c8d62300
(gdb) x 0x55f5c8d62300
0x55f5c8d62300: 0x0

Once again, looking at the memory around this corrupted value does not give any
clues to what caused the corruption.
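For orientation, tcmalloc's freelist is an intrusive singly linked list: the first word of each free object holds the address of the next one. Here is a hedged sketch of the `SLL_Next`/`SLL_PopRange` traversal, simplified from tcmalloc's src/linked_list.h (names kept from the backtrace; the `walk` helper and its null check are my additions, since the real code has no such check and simply segfaults):

```cpp
// Each free object stores the pointer to the next free object in its
// own first word (tcmalloc's embedded freelist).
static void* SLL_Next(void* t) {
    return *reinterpret_cast<void**>(t);   // the 'mov (%rdx),%rdx' above
}

// Walk N-1 links starting from *head, as SLL_PopRange does before
// splicing the range out. Returns the last element reached, or
// nullptr if a corrupted (zeroed) link was encountered.
static void* walk(void** head, int N) {
    void* tmp = *head;
    for (int i = 1; i < N; ++i) {
        if (tmp == nullptr)
            return nullptr;                // real code dereferences 0 here and crashes
        tmp = SLL_Next(tmp);
    }
    return tmp;
}
```

A single zeroed word anywhere in the chain is enough to bring the whole traversal down on a later iteration, which is why the crash site is far from the corruption.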

#17 Updated by Brad Hubbard over 5 years ago

The 'msgr-worker-1.5278' coredump is almost identical to 'tp_osd_tp', except this time 'i' = 499. Walking that many links by hand is impractical, so I wrote
the following gdb script.

(gdb) show user
User command "walk_derefs":
  set $t = *head
  set $i = 1
  while ($i)
    printf "%i\t:\t%p\n", $i, $t
    set $t = *((void **)$t)
    set $i = $i + 1
  end

(gdb) walk_derefs
1       :       0x55e9973a3e20
2       :       0x55e998935ee0
3       :       0x55e99f87c1a0
4       :       0x55e9a23b06e0
...
495     :       0x55e9ad2dfa60
496     :       0x55e98c399c40
497     :       0x55e9ac065fe0
498     :       0x55e9a3b44a00
499     :       (nil)
Cannot access memory at address 0x0
(gdb) x 0x55e9a3b44a00
0x55e9a3b44a00: 0x00000000

Once again no luck in exposing a culprit.

#18 Updated by Brad Hubbard over 5 years ago

The 'safe_timer.5246' coredump is again similar, but this time tcmalloc is
'popping' a single value rather than a range.

(gdb) f 11
#11 tcmalloc::SLL_Next (t=0x5f78616d5f6e6f6d) at src/linked_list.h:45
45        return *(reinterpret_cast<void**>(t));
(gdb) p t
$7 = (void *) 0x5f78616d5f6e6f6d
(gdb) x/i $pc
=> 0x7f3a817a674d <tc_new(size_t)+141>: mov    (%rbx),%rax
(gdb) i r rbx
rbx            0x5f78616d5f6e6f6d       6879355553188835181
(gdb) x 0x5f78616d5f6e6f6d
   0x5f78616d5f6e6f6d:  Cannot access memory at address 0x5f78616d5f6e6f6d

So we segfaulted because we tried to access the memory at 0x5f78616d5f6e6f6d,
which looks like it could be ASCII text.

$ echo 0x5f78616d5f6e6f6d|xxd -r|rev
mon_max_
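The pipe above simply reinterprets the 64-bit value as the bytes it occupies in memory. On a little-endian host such as x86-64, the same decode can be done directly (a minimal sketch):

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Reinterpret a 64-bit value as the 8 bytes it occupies in memory.
// On a little-endian machine the low-order byte comes first, so
// 0x..6d is the first character of the recovered string.
std::string bytes_of(uint64_t v) {
    char buf[8];
    std::memcpy(buf, &v, sizeof(buf));
    return std::string(buf, sizeof(buf));
}
```

Feeding in the faulting pointer value recovers the string "mon_max_", the same result as the xxd/rev pipeline.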

That looks like a bit of a smoking gun: we should be looking for a problem in
code that manipulates config variable names.

If we trace that variable backwards a few frames we can find where the value
came from.

(gdb) f 14
#14 tcmalloc::ThreadCache::Allocate (cl=3, size=32, this=<optimized out>) at src/thread_cache.h:365
365       return list->Pop();
(gdb) p *list
$11 = {list_ = 0x5f78616d5f6e6f6d, length_ = 143, lowater_ = 0, max_length_ = 336, length_overages_ = 0}
(gdb) p &list
Address requested for identifier "list" which is in register $rcx
(gdb) i r rcx
rcx            0x55797cc12e38   93980272438840
(gdb) x/10s 0x55797cc12e30
0x55797cc12e30: "\002" 
0x55797cc12e32: "" 
0x55797cc12e33: "" 
0x55797cc12e34: "" 
0x55797cc12e35: "" 
0x55797cc12e36: "" 
0x55797cc12e37: "" 
0x55797cc12e38: "mon_max_\217" 
0x55797cc12e42: "" 
0x55797cc12e43: "" 

So something has written at least part of a configuration variable name over
the top of tcmalloc's freelist structure. It's not clear that
ConfFile::normalize_key_name is the actual culprit here (since we could be
hitting corruption left by another function or thread), but the evidence points
to that code, or to code doing similar config variable name manipulation, and
that's where I'll be turning my attention now.
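For context, Ceph normalizes option names by mapping spaces and dashes to underscores, so a key like "mon max pg per osd" becomes "mon_max_pg_per_osd". The sketch below is an illustrative approximation of that kind of key manipulation, not Ceph's actual ConfFile::normalize_key_name:

```cpp
#include <string>

// Illustrative normalization: trim surrounding whitespace and map
// spaces and dashes to underscores, the way Ceph treats option names.
// This is a sketch of the *kind* of string handling under suspicion,
// not the real Ceph implementation.
std::string normalize_key(const std::string& in) {
    std::size_t b = in.find_first_not_of(" \t");
    if (b == std::string::npos)
        return "";
    std::size_t e = in.find_last_not_of(" \t");
    std::string out = in.substr(b, e - b + 1);
    for (char& c : out)
        if (c == ' ' || c == '-')
            c = '_';
    return out;
}
```

String manipulation like this is harmless in itself; the suspicion is that some caller writes the result through a stale or wild pointer, landing bytes such as "mon_max_" on top of allocator metadata.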

#19 Updated by Brad Hubbard over 5 years ago

I'd say the cause of most, if not all, of these crashes is memory corruption caused by code responsible for manipulating config variable names. Could you upload your ceph.conf so we can see which variables are not at their default values, and can you let me know if you are setting or manipulating configuration values in any other way?

#20 Updated by Christian Schlittchen over 5 years ago

Our ceph.conf:

[global]
fsid = 1f3b3198-08b1-418c-a279-7050a2eb1ce3
mon_initial_members = ceph-mon-a, ceph-mon-b, ceph-mon-m
mon_host = 10.120.3.180,10.120.3.181,10.120.3.182
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
cluster network = 10.119.1.0/24
public network = 10.120.0.0/22

mon osd full ratio = .80
mon osd backfillfull ratio = .70
mon osd nearfull ratio = .60

osd journal size = 10000
osd pool default min size = 1
osd pool default size = 3
osd pool default pg num = 128
osd pool default pgp num = 128

# Unterstuetzte rbd-features Stand 06/2018: layering=1, striping=2, exclusive-lock=4
# see https://tracker.ceph.com/issues/15000
# http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-August/020089.html
rbd_default_features = 7

[client.rgw.ceph-mon-a]
rgw_frontends = civetweb port=7480s ssl_certificate=/etc/ceph/private/ceph-mon.manage.zfn.uni-bremen.de.keyandcert.pem

[client.rgw.ceph-mon-b]
rgw_frontends = civetweb port=7480s ssl_certificate=/etc/ceph/private/ceph-mon.manage.zfn.uni-bremen.de.keyandcert.pem

[client.rgw.ceph-mon-m]
rgw_frontends = civetweb port=7480s ssl_certificate=/etc/ceph/private/ceph-mon.manage.zfn.uni-bremen.de.keyandcert.pem

#21 Updated by Soenke Schippmann over 5 years ago

We do use some configuration set by "ceph config set" or "ceph config-key set":

# ceph config-key ls
    "config/mgr/mgr/dashboard/RGW_API_ACCESS_KEY",
    "config/mgr/mgr/dashboard/RGW_API_HOST",
    "config/mgr/mgr/dashboard/RGW_API_PORT",
    "config/mgr/mgr/dashboard/RGW_API_SCHEME",
    "config/mgr/mgr/dashboard/RGW_API_SECRET_KEY",
    "config/mgr/mgr/dashboard/RGW_API_USER_ID",
    "config/mgr/mgr/dashboard/ceph-mon-a/server_addr",
    "config/mgr/mgr/dashboard/ceph-mon-b/server_addr",
    "config/mgr/mgr/dashboard/ceph-mon-m/server_addr",
    "config/mgr/mgr/dashboard/password",
    "config/mgr/mgr/dashboard/server_port",
    "config/mgr/mgr/dashboard/username",
    "config/mgr/mgr/influx/hostname",
    "config/mgr/mgr/influx/password",
    "config/mgr/mgr/influx/username",
    "config/mgr/mgr/restful/ceph-mon-a/server_addr",
    "config/mgr/mgr/restful/ceph-mon-b/server_addr",
    "config/mgr/mgr/restful/ceph-mon-m/server_addr",
    "config/mgr/mgr/restful/server_port",
    "config/mgr/mgr/selftest/ceph-mon-a/testkey",
    "config/mgr/mgr/selftest/testkey",
    "mgr/dashboard/crt",
    "mgr/dashboard/key",
    "mgr/restful/crt",
    "mgr/restful/key",
    "mgr/selftest/testjsonkey",
    "mgr/selftest/testkey" 

Hope this helps.

#22 Updated by Brad Hubbard over 4 years ago

Soenke or Christian,

Are you still seeing this issue?

#23 Updated by Soenke Schippmann over 4 years ago

We haven't seen this bug for 6 weeks now after updating to Nautilus (and changing configuration from ceph.conf to ceph config).

We last had this issue on June 6, with Mimic 13.2.5.

#24 Updated by Brad Hubbard over 4 years ago

Thanks for the response, Soenke. If we haven't seen it again in a few months' time I guess we can close this.

#25 Updated by Brad Hubbard almost 4 years ago

  • Status changed from New to Can't reproduce
