Bug #21834
Filestore OSD Segfault in thread 7f084dffc700 thread_name:tp_fstore_op
Status: Duplicate
Priority: Normal
Assignee: -
Category: OSD
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Description
I just zapped and deleted one of my OSDs and added it again, because of the segfault described here: http://tracker.ceph.com/issues/21826
During backfilling, this OSD again segfaulted a couple of times (seems stable now) with a similar but different trace:
Oct 18 23:03:28 node1.ceph ceph-osd: *** Caught signal (Segmentation fault) **
Oct 18 23:03:28 node1.ceph ceph-osd: in thread 7f084dffc700 thread_name:tp_fstore_op
Oct 18 23:03:28 node1.ceph ceph-osd: ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
Oct 18 23:03:28 node1.ceph ceph-osd: 1: (()+0xa29511) [0x55a2b4cf0511]
Oct 18 23:03:28 node1.ceph ceph-osd: 2: (()+0xf5e0) [0x7f0859eef5e0]
Oct 18 23:03:28 node1.ceph ceph-osd: 3: (()+0x1cdff) [0x7f085c8cfdff]
Oct 18 23:03:28 node1.ceph ceph-osd: 4: (rocksdb::BlockBasedTable::NewIndexIterator(rocksdb::ReadOptions const&, rocksdb::BlockIter*, rocksdb::BlockBasedTable::CachableEntry<rocksdb::BlockBasedTable::IndexReader>*)+0x466) [0x55a2b5063ca6]
Oct 18 23:03:28 node1.ceph ceph-osd: 5: (rocksdb::BlockBasedTable::Get(rocksdb::ReadOptions const&, rocksdb::Slice const&, rocksdb::GetContext*, bool)+0x297) [0x55a2b50646f7]
Oct 18 23:03:28 node1.ceph ceph-osd: 6: (rocksdb::TableCache::Get(rocksdb::ReadOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileDescriptor const&, rocksdb::Slice const&, rocksdb::GetContext*, rocksdb::HistogramImpl*, bool, int)+0x2a4) [0x55a2b5124554]
Oct 18 23:03:28 node1.ceph ceph-osd: 7: (rocksdb::Version::Get(rocksdb::ReadOptions const&, rocksdb::LookupKey const&, rocksdb::PinnableSlice*, rocksdb::Status*, rocksdb::MergeContext*, rocksdb::RangeDelAggregator*, bool*, bool*, unsigned long*)+0x810) [0x55a2b5027c40]
Oct 18 23:03:28 node1.ceph ceph-osd: 8: (rocksdb::DBImpl::GetImpl(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::PinnableSlice*, bool*)+0x5a4) [0x55a2b50d2ab4]
Oct 18 23:03:28 node1.ceph ceph-osd: 9: (rocksdb::DBImpl::Get(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::PinnableSlice*)+0x19) [0x55a2b50d3039]
Oct 18 23:03:28 node1.ceph ceph-osd: 10: (rocksdb::DB::Get(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, std::string*)+0x95) [0x55a2b50d7055]
Oct 18 23:03:28 node1.ceph ceph-osd: 11: (rocksdb::DB::Get(rocksdb::ReadOptions const&, rocksdb::Slice const&, std::string*)+0x4a) [0x55a2b50d64fa]
Oct 18 23:03:28 node1.ceph ceph-osd: 12: (RocksDBStore::get(std::string const&, std::string const&, ceph::buffer::list*)+0x142) [0x55a2b4c46342]
Oct 18 23:03:28 node1.ceph ceph-osd: 13: (DBObjectMap::_lookup_map_header(DBObjectMap::MapHeaderLock const&, ghobject_t const&)+0x5ef) [0x55a2b4c6fa0f]
Oct 18 23:03:28 node1.ceph ceph-osd: 14: (DBObjectMap::clear(ghobject_t const&, SequencerPosition const*)+0x75) [0x55a2b4c70f15]
Oct 18 23:03:28 node1.ceph ceph-osd: 15: (FileStore::lfn_unlink(coll_t const&, ghobject_t const&, SequencerPosition const&, bool)+0x23b) [0x55a2b4afeccb]
Oct 18 23:03:28 node1.ceph ceph-osd: 16: (FileStore::_remove(coll_t const&, ghobject_t const&, SequencerPosition const&)+0x83) [0x55a2b4aff753]
Oct 18 23:03:28 node1.ceph ceph-osd: 17: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0x1ce7) [0x55a2b4b14b27]
Oct 18 23:03:28 node1.ceph ceph-osd: 18: (FileStore::_do_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, unsigned long, ThreadPool::TPHandle*)+0x3b) [0x55a2b4b18b0b]
Oct 18 23:03:28 node1.ceph ceph-osd: 19: (FileStore::_do_op(FileStore::OpSequencer*, ThreadPool::TPHandle&)+0x3fa) [0x55a2b4b18f3a]
Oct 18 23:03:28 node1.ceph ceph-osd: 20: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa8e) [0x55a2b4d35b0e]
Oct 18 23:03:28 node1.ceph ceph-osd: 21: (ThreadPool::WorkThread::entry()+0x10) [0x55a2b4d369f0]
Oct 18 23:03:28 node1.ceph ceph-osd: 22: (()+0x7e25) [0x7f0859ee7e25]
Oct 18 23:03:28 node1.ceph ceph-osd: 23: (clone()+0x6d) [0x7f0858fdb34d]
Oct 18 23:03:28 node1.ceph ceph-osd: 2017-10-18 23:03:28.617529 7f084dffc700 -1 *** Caught signal (Segmentation fault) **
Oct 18 23:03:28 node1.ceph ceph-osd: in thread 7f084dffc700 thread_name:tp_fstore_op
Oct 18 23:03:28 node1.ceph ceph-osd: ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
Oct 18 23:03:28 node1.ceph ceph-osd: 1: (()+0xa29511) [0x55a2b4cf0511]
Oct 18 23:03:28 node1.ceph ceph-osd: 2: (()+0xf5e0) [0x7f0859eef5e0]
Oct 18 23:03:28 node1.ceph ceph-osd: 3: (()+0x1cdff) [0x7f085c8cfdff]
Oct 18 23:03:28 node1.ceph ceph-osd: 4: (rocksdb::BlockBasedTable::NewIndexIterator(rocksdb::ReadOptions const&, rocksdb::BlockIter*, rocksdb::BlockBasedTable::CachableEntry<rocksdb::BlockBasedTable::IndexReader>*)+0x466) [0x55a2b5063ca6]
Oct 18 23:03:28 node1.ceph ceph-osd: 5: (rocksdb::BlockBasedTable::Get(rocksdb::ReadOptions const&, rocksdb::Slice const&, rocksdb::GetContext*, bool)+0x297) [0x55a2b50646f7]
Oct 18 23:03:28 node1.ceph ceph-osd: 6: (rocksdb::TableCache::Get(rocksdb::ReadOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileDescriptor const&, rocksdb::Slice const&, rocksdb::GetContext*, rocksdb::HistogramImpl*, bool, int)+0x2a4) [0x55a2b5124554]
Oct 18 23:03:28 node1.ceph ceph-osd: 7: (rocksdb::Version::Get(rocksdb::ReadOptions const&, rocksdb::LookupKey const&, rocksdb::PinnableSlice*, rocksdb::Status*, rocksdb::MergeContext*, rocksdb::RangeDelAggregator*, bool*, bool*, unsigned long*)+0x810) [0x55a2b5027c40]
Oct 18 23:03:28 node1.ceph ceph-osd: 8: (rocksdb::DBImpl::GetImpl(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::PinnableSlice*, bool*)+0x5a4) [0x55a2b50d2ab4]
Oct 18 23:03:28 node1.ceph ceph-osd: 9: (rocksdb::DBImpl::Get(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::PinnableSlice*)+0x19) [0x55a2b50d3039]
Oct 18 23:03:28 node1.ceph ceph-osd: 10: (rocksdb::DB::Get(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, std::string*)+0x95) [0x55a2b50d7055]
Oct 18 23:03:28 node1.ceph ceph-osd: 11: (rocksdb::DB::Get(rocksdb::ReadOptions const&, rocksdb::Slice const&, std::string*)+0x4a) [0x55a2b50d64fa]
Oct 18 23:03:28 node1.ceph ceph-osd: 12: (RocksDBStore::get(std::string const&, std::string const&, ceph::buffer::list*)+0x142) [0x55a2b4c46342]
Oct 18 23:03:28 node1.ceph ceph-osd: 13: (DBObjectMap::_lookup_map_header(DBObjectMap::MapHeaderLock const&, ghobject_t const&)+0x5ef) [0x55a2b4c6fa0f]
Oct 18 23:03:28 node1.ceph ceph-osd: 14: (DBObjectMap::clear(ghobject_t const&, SequencerPosition const*)+0x75) [0x55a2b4c70f15]
Oct 18 23:03:28 node1.ceph ceph-osd: 15: (FileStore::lfn_unlink(coll_t const&, ghobject_t const&, SequencerPosition const&, bool)+0x23b) [0x55a2b4afeccb]
Oct 18 23:03:28 node1.ceph ceph-osd: 16: (FileStore::_remove(coll_t const&, ghobject_t const&, SequencerPosition const&)+0x83) [0x55a2b4aff753]
Oct 18 23:03:28 node1.ceph ceph-osd: 17: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0x1ce7) [0x55a2b4b14b27]
Oct 18 23:03:28 node1.ceph ceph-osd: 18: (FileStore::_do_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, unsigned long, ThreadPool::TPHandle*)+0x3b) [0x55a2b4b18b0b]
Oct 18 23:03:28 node1.ceph ceph-osd: 19: (FileStore::_do_op(FileStore::OpSequencer*, ThreadPool::TPHandle&)+0x3fa) [0x55a2b4b18f3a]
Oct 18 23:03:28 node1.ceph ceph-osd: 20: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa8e) [0x55a2b4d35b0e]
Oct 18 23:03:28 node1.ceph ceph-osd: 21: (ThreadPool::WorkThread::entry()+0x10) [0x55a2b4d369f0]
Oct 18 23:03:28 node1.ceph ceph-osd: 22: (()+0x7e25) [0x7f0859ee7e25]
Oct 18 23:03:28 node1.ceph ceph-osd: 23: (clone()+0x6d) [0x7f0858fdb34d]
Oct 18 23:03:28 node1.ceph ceph-osd: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Oct 18 23:03:28 node1.ceph ceph-osd: 0> 2017-10-18 23:03:28.617529 7f084dffc700 -1 *** Caught signal (Segmentation fault) **
Oct 18 23:03:28 node1.ceph ceph-osd: in thread 7f084dffc700 thread_name:tp_fstore_op
Oct 18 23:03:28 node1.ceph ceph-osd: ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
Oct 18 23:03:28 node1.ceph ceph-osd: 1: (()+0xa29511) [0x55a2b4cf0511]
Oct 18 23:03:28 node1.ceph ceph-osd: 2: (()+0xf5e0) [0x7f0859eef5e0]
Oct 18 23:03:28 node1.ceph ceph-osd: 3: (()+0x1cdff) [0x7f085c8cfdff]
Oct 18 23:03:28 node1.ceph ceph-osd: 4: (rocksdb::BlockBasedTable::NewIndexIterator(rocksdb::ReadOptions const&, rocksdb::BlockIter*, rocksdb::BlockBasedTable::CachableEntry<rocksdb::BlockBasedTable::IndexReader>*)+0x466) [0x55a2b5063ca6]
Oct 18 23:03:28 node1.ceph ceph-osd: 5: (rocksdb::BlockBasedTable::Get(rocksdb::ReadOptions const&, rocksdb::Slice const&, rocksdb::GetContext*, bool)+0x297) [0x55a2b50646f7]
Oct 18 23:03:28 node1.ceph ceph-osd: 6: (rocksdb::TableCache::Get(rocksdb::ReadOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileDescriptor const&, rocksdb::Slice const&, rocksdb::GetContext*, rocksdb::HistogramImpl*, bool, int)+0x2a4) [0x55a2b5124554]
Oct 18 23:03:28 node1.ceph ceph-osd: 7: (rocksdb::Version::Get(rocksdb::ReadOptions const&, rocksdb::LookupKey const&, rocksdb::PinnableSlice*, rocksdb::Status*, rocksdb::MergeContext*, rocksdb::RangeDelAggregator*, bool*, bool*, unsigned long*)+0x810) [0x55a2b5027c40]
Oct 18 23:03:28 node1.ceph ceph-osd: 8: (rocksdb::DBImpl::GetImpl(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::PinnableSlice*, bool*)+0x5a4) [0x55a2b50d2ab4]
Oct 18 23:03:28 node1.ceph ceph-osd: 9: (rocksdb::DBImpl::Get(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::PinnableSlice*)+0x19) [0x55a2b50d3039]
Oct 18 23:03:28 node1.ceph ceph-osd: 10: (rocksdb::DB::Get(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, std::string*)+0x95) [0x55a2b50d7055]
Oct 18 23:03:28 node1.ceph ceph-osd: 11: (rocksdb::DB::Get(rocksdb::ReadOptions const&, rocksdb::Slice const&, std::string*)+0x4a) [0x55a2b50d64fa]
Oct 18 23:03:28 node1.ceph ceph-osd: 12: (RocksDBStore::get(std::string const&, std::string const&, ceph::buffer::list*)+0x142) [0x55a2b4c46342]
Oct 18 23:03:28 node1.ceph ceph-osd: 13: (DBObjectMap::_lookup_map_header(DBObjectMap::MapHeaderLock const&, ghobject_t const&)+0x5ef) [0x55a2b4c6fa0f]
Oct 18 23:03:28 node1.ceph ceph-osd: 14: (DBObjectMap::clear(ghobject_t const&, SequencerPosition const*)+0x75) [0x55a2b4c70f15]
Oct 18 23:03:28 node1.ceph ceph-osd: 15: (FileStore::lfn_unlink(coll_t const&, ghobject_t const&, SequencerPosition const&, bool)+0x23b) [0x55a2b4afeccb]
Oct 18 23:03:28 node1.ceph ceph-osd: 16: (FileStore::_remove(coll_t const&, ghobject_t const&, SequencerPosition const&)+0x83) [0x55a2b4aff753]
Oct 18 23:03:28 node1.ceph ceph-osd: 17: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0x1ce7) [0x55a2b4b14b27]
Oct 18 23:03:28 node1.ceph ceph-osd: 18: (FileStore::_do_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, unsigned long, ThreadPool::TPHandle*)+0x3b) [0x55a2b4b18b0b]
Oct 18 23:03:28 node1.ceph ceph-osd: 19: (FileStore::_do_op(FileStore::OpSequencer*, ThreadPool::TPHandle&)+0x3fa) [0x55a2b4b18f3a]
Oct 18 23:03:28 node1.ceph ceph-osd: 20: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa8e) [0x55a2b4d35b0e]
Oct 18 23:03:28 node1.ceph ceph-osd: 21: (ThreadPool::WorkThread::entry()+0x10) [0x55a2b4d369f0]
Oct 18 23:03:28 node1.ceph ceph-osd: 22: (()+0x7e25) [0x7f0859ee7e25]
Oct 18 23:03:28 node1.ceph ceph-osd: 23: (clone()+0x6d) [0x7f0858fdb34d]
Oct 18 23:03:28 node1.ceph ceph-osd: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Oct 18 23:03:28 node1.ceph systemd: ceph-osd@1.service: main process exited, code=killed, status=11/SEGV
Oct 18 23:03:28 node1.ceph systemd: Unit ceph-osd@1.service entered failed state.
Oct 18 23:03:28 node1.ceph systemd: ceph-osd@1.service failed.
Oct 18 23:03:48 node1.ceph systemd: ceph-osd@1.service holdoff time over, scheduling restart.
Oct 18 23:03:48 node1.ceph systemd: Starting Ceph object storage daemon osd.1...
Oct 18 23:03:48 node1.ceph systemd: Started Ceph object storage daemon osd.1.
Oct 18 23:03:48 node1.ceph ceph-osd: 2017-10-18 23:03:48.827569 7f1da201fd00 -1 Public network was set, but cluster network was not set
Oct 18 23:03:48 node1.ceph ceph-osd: 2017-10-18 23:03:48.827576 7f1da201fd00 -1 Using public network also for cluster network
Oct 18 23:03:48 node1.ceph ceph-osd: starting osd.1 at - osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
Oct 18 23:03:53 node1.ceph ceph-osd: 2017-10-18 23:03:53.334206 7f1da201fd00 -1 osd.1 27929 log_to_monitors {default=true}
Updated by Sage Weil over 6 years ago
- Status changed from New to Duplicate
Comment out the jemalloc line in /etc/{default,sysconfig}/ceph.
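For anyone scripting this across several nodes, here is a rough Python sketch of the edit. It only transforms text: it prefixes `#` to every active line that mentions jemalloc and leaves everything else alone. The sample contents, including the `LD_PRELOAD` path and the `TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES` line, are assumptions for illustration and may not match the file on your system, and the helper name `comment_out_jemalloc` is made up here.

```python
def comment_out_jemalloc(text: str) -> str:
    """Prefix '#' to each active line that mentions jemalloc.

    Lines that are already comments are left unchanged.
    """
    out = []
    for line in text.splitlines(keepends=True):
        is_comment = line.lstrip().startswith("#")
        if "jemalloc" in line and not is_comment:
            line = "#" + line
        out.append(line)
    return "".join(out)


# Hypothetical file contents; the real /etc/sysconfig/ceph or
# /etc/default/ceph on a given node may look different.
sample = (
    "# use jemalloc instead of tcmalloc\n"
    "LD_PRELOAD=/usr/lib64/libjemalloc.so.1\n"
    "TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728\n"
)

if __name__ == "__main__":
    print(comment_out_jemalloc(sample), end="")
```

To apply it, read the real file, pass its contents through the function, and write the result back after taking a backup. The OSD presumably has to be restarted afterwards for the change to take effect, since the environment file is read when the service starts.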
Updated by Sage Weil over 6 years ago
- Related to Bug #20557: segmentation fault with rocksdb|BlueStore and jemalloc added