Bug #460
closed
OSD crash: ReplicatedPG::push_to_replica / Rb_tree
Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Category:
OSD
Target version:
-
% Done:
0%
Description
After my cluster recovered from the latest crashes, I wanted to check whether my RBD data was still intact.
This caused osd0 to crash:
Core was generated by `/usr/bin/cosd -i 0 -c /etc/ceph/ceph.conf'.
Program terminated with signal 11, Segmentation fault.
#0 std::_Rb_tree<snapid_t, std::pair<snapid_t const, unsigned long>, std::_Select1st<std::pair<snapid_t const, unsigned long> >, std::less<snapid_t>, std::allocator<std::pair<snapid_t const, unsigned long> > >::_M_begin (this=0x2218000, snapset=..., soid=<value optimized out>, missing=..., data_subset=..., clone_subsets=...) at /usr/include/c++/4.4/bits/stl_tree.h:482
482 { return static_cast<_Link_type>(this->_M_impl._M_header._M_parent); }
(gdb) bt
#0 std::_Rb_tree<snapid_t, std::pair<snapid_t const, unsigned long>, std::_Select1st<std::pair<snapid_t const, unsigned long> >, std::less<snapid_t>, std::allocator<std::pair<snapid_t const, unsigned long> > >::_M_begin (this=0x2218000, snapset=..., soid=<value optimized out>, missing=..., data_subset=..., clone_subsets=...) at /usr/include/c++/4.4/bits/stl_tree.h:482
#1 std::_Rb_tree<snapid_t, std::pair<snapid_t const, unsigned long>, std::_Select1st<std::pair<snapid_t const, unsigned long> >, std::less<snapid_t>, std::allocator<std::pair<snapid_t const, unsigned long> > >::lower_bound (this=0x2218000, snapset=..., soid=<value optimized out>, missing=..., data_subset=..., clone_subsets=...) at /usr/include/c++/4.4/bits/stl_tree.h:745
#2 std::map<snapid_t, unsigned long, std::less<snapid_t>, std::allocator<std::pair<snapid_t const, unsigned long> > >::lower_bound (this=0x2218000, snapset=..., soid=<value optimized out>, missing=..., data_subset=..., clone_subsets=...) at /usr/include/c++/4.4/bits/stl_map.h:701
#3 std::map<snapid_t, unsigned long, std::less<snapid_t>, std::allocator<std::pair<snapid_t const, unsigned long> > >::operator[] (this=0x2218000, snapset=..., soid=<value optimized out>, missing=..., data_subset=..., clone_subsets=...) at /usr/include/c++/4.4/bits/stl_map.h:447
#4 ReplicatedPG::calc_clone_subsets (this=0x2218000, snapset=..., soid=<value optimized out>, missing=..., data_subset=..., clone_subsets=...) at osd/ReplicatedPG.cc:2613
#5 0x000000000049571e in ReplicatedPG::push_to_replica (this=0x2218000, obc=<value optimized out>, soid=..., peer=8) at osd/ReplicatedPG.cc:2831
#6 0x0000000000496083 in ReplicatedPG::recover_object_replicas (this=0x2218000, soid=...) at osd/ReplicatedPG.cc:3682
#7 0x00000000004964ab in ReplicatedPG::recover_replicas (this=0x2218000, max=<value optimized out>) at osd/ReplicatedPG.cc:3715
#8 0x000000000049f0ba in ReplicatedPG::start_recovery_ops (this=0x2218000, max=1) at osd/ReplicatedPG.cc:3524
#9 0x00000000004d7c6c in OSD::do_recovery (this=0x1332000, pg=0x2218000) at osd/OSD.cc:4332
#10 0x00000000005c6c0f in ThreadPool::worker (this=0x13325f8) at common/WorkQueue.cc:44
#11 0x00000000004fd9ed in ThreadPool::WorkThread::entry() ()
#12 0x000000000046e82a in Thread::_entry_func (arg=0x2218000) at ./common/Thread.h:39
#13 0x00007fcfa13459ca in start_thread () from /lib/libpthread.so.0
#14 0x00007fcfa02fd6fd in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Restarting the OSD caused it to crash again within a few seconds.
The core, binary and logs are available on logger.pcextreme.nl:/srv/ceph/issues/osd_crash_rb_tree