Bug #21826

Filestore OSDs start segfaulting

Added by Patrick Fruh over 4 years ago. Updated about 4 years ago.

Status: Duplicate
Priority: Normal
Assignee: -
Category: OSD
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Since upgrading to luminous 12.1.0, some of my filestore OSDs regularly start segfaulting/flapping during the nightly scrubbing time and are automatically marked out.
I then have to delete them from the cluster and recreate them as new OSDs, since they won't start anymore (same segfaults).

(I have not upgraded them to bluestore since I don't know if the bug I reported at http://tracker.ceph.com/issues/21416 is fixed yet)

As the logs below show, there are two different segfaults. Both recur every couple of days on one or more OSDs across my 6 hosts during the nightly scrubbing time.

Oct 18 05:36:22 node1.ceph systemd[1]: Starting Ceph object storage daemon osd.1...
Oct 18 05:36:22 node1.ceph systemd[1]: Started Ceph object storage daemon osd.1.
Oct 18 05:36:22 node1.ceph ceph-osd[4844]: 2017-10-18 05:36:22.822452 7f6e99fc8d00 -1 Public network was set, but cluster network was not set
Oct 18 05:36:22 node1.ceph ceph-osd[4844]: 2017-10-18 05:36:22.822462 7f6e99fc8d00 -1     Using public network also for cluster network
Oct 18 05:36:22 node1.ceph ceph-osd[4844]: starting osd.1 at - osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: *** Caught signal (Segmentation fault) **
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: in thread 7f6e99fc8d00 thread_name:ceph-osd
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 1: (()+0xa29511) [0x562c98e7c511]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 2: (()+0xf5e0) [0x7f6e971cb5e0]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 3: (()+0x1cdff) [0x7f6e99babdff]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 4: (rocksdb::Arena::AllocateNewBlock(unsigned long)+0x5b) [0x562c9921987b]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 5: (rocksdb::Arena::AllocateFallback(unsigned long, bool)+0x47) [0x562c99219937]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 6: (rocksdb::DoGenerateLevelFilesBrief(rocksdb::LevelFilesBrief*, std::vector<rocksdb::FileMetaData*, std::allocator<rocksdb::FileMetaData*> > const&, rocksdb::Arena*)+0x139) [0x562c991a8139]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 7: (rocksdb::VersionStorageInfo::GenerateLevelFilesBrief()+0x94) [0x562c991b0134]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 8: (rocksdb::Version::PrepareApply(rocksdb::MutableCFOptions const&, bool)+0x76) [0x562c991b3416]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 9: (rocksdb::VersionSet::LogAndApply(rocksdb::ColumnFamilyData*, rocksdb::MutableCFOptions const&, rocksdb::autovector<rocksdb::VersionEdit*, 8ul> const&, rocksdb::InstrumentedMutex*, rocksdb::Directory*, bool, rocksdb::ColumnFamilyOptions const*)+0x102a) [0x562c991b625a]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 10: (rocksdb::DBImpl::RecoverLogFiles(std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long*, bool)+0x1ab3) [0x562c99184b03]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 11: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool)+0x7e6) [0x562c991858d6]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 12: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**)+0xed3) [0x562c99186b93]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 13: (rocksdb::DB::Open(rocksdb::Options const&, std::string const&, rocksdb::DB**)+0x186) [0x562c99187dd6]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 14: (RocksDBStore::do_open(std::ostream&, bool)+0x8db) [0x562c98dd78fb]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 15: (RocksDBStore::create_and_open(std::ostream&)+0x7a) [0x562c98dd91fa]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 16: (FileStore::mount()+0x274f) [0x562c98c913ef]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 17: (OSD::init()+0x3ba) [0x562c9895ac0a]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 18: (main()+0x2def) [0x562c98861fff]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 19: (__libc_start_main()+0xf5) [0x7f6e961e0c05]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 20: (()+0x4ad796) [0x562c98900796]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 2017-10-18 05:36:24.376128 7f6e99fc8d00 -1 *** Caught signal (Segmentation fault) **
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: in thread 7f6e99fc8d00 thread_name:ceph-osd
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 1: (()+0xa29511) [0x562c98e7c511]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 2: (()+0xf5e0) [0x7f6e971cb5e0]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 3: (()+0x1cdff) [0x7f6e99babdff]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 4: (rocksdb::Arena::AllocateNewBlock(unsigned long)+0x5b) [0x562c9921987b]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 5: (rocksdb::Arena::AllocateFallback(unsigned long, bool)+0x47) [0x562c99219937]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 6: (rocksdb::DoGenerateLevelFilesBrief(rocksdb::LevelFilesBrief*, std::vector<rocksdb::FileMetaData*, std::allocator<rocksdb::FileMetaData*> > const&, rocksdb::Arena*)+0x139) [0x562c991a8139]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 7: (rocksdb::VersionStorageInfo::GenerateLevelFilesBrief()+0x94) [0x562c991b0134]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 8: (rocksdb::Version::PrepareApply(rocksdb::MutableCFOptions const&, bool)+0x76) [0x562c991b3416]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 9: (rocksdb::VersionSet::LogAndApply(rocksdb::ColumnFamilyData*, rocksdb::MutableCFOptions const&, rocksdb::autovector<rocksdb::VersionEdit*, 8ul> const&, rocksdb::InstrumentedMutex*, rocksdb::Directory*, bool, rocksdb::ColumnFamilyOptions const*)+0x102a) [0x562c991b625a]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 10: (rocksdb::DBImpl::RecoverLogFiles(std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long*, bool)+0x1ab3) [0x562c99184b03]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 11: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool)+0x7e6) [0x562c991858d6]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 12: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**)+0xed3) [0x562c99186b93]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 13: (rocksdb::DB::Open(rocksdb::Options const&, std::string const&, rocksdb::DB**)+0x186) [0x562c99187dd6]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 14: (RocksDBStore::do_open(std::ostream&, bool)+0x8db) [0x562c98dd78fb]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 15: (RocksDBStore::create_and_open(std::ostream&)+0x7a) [0x562c98dd91fa]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 16: (FileStore::mount()+0x274f) [0x562c98c913ef]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 17: (OSD::init()+0x3ba) [0x562c9895ac0a]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 18: (main()+0x2def) [0x562c98861fff]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 19: (__libc_start_main()+0xf5) [0x7f6e961e0c05]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 20: (()+0x4ad796) [0x562c98900796]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: -185> 2017-10-18 05:36:22.822452 7f6e99fc8d00 -1 Public network was set, but cluster network was not set
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: -184> 2017-10-18 05:36:22.822462 7f6e99fc8d00 -1     Using public network also for cluster network
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 0> 2017-10-18 05:36:24.376128 7f6e99fc8d00 -1 *** Caught signal (Segmentation fault) **
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: in thread 7f6e99fc8d00 thread_name:ceph-osd
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 1: (()+0xa29511) [0x562c98e7c511]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 2: (()+0xf5e0) [0x7f6e971cb5e0]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 3: (()+0x1cdff) [0x7f6e99babdff]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 4: (rocksdb::Arena::AllocateNewBlock(unsigned long)+0x5b) [0x562c9921987b]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 5: (rocksdb::Arena::AllocateFallback(unsigned long, bool)+0x47) [0x562c99219937]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 6: (rocksdb::DoGenerateLevelFilesBrief(rocksdb::LevelFilesBrief*, std::vector<rocksdb::FileMetaData*, std::allocator<rocksdb::FileMetaData*> > const&, rocksdb::Arena*)+0x139) [0x562c991a8139]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 7: (rocksdb::VersionStorageInfo::GenerateLevelFilesBrief()+0x94) [0x562c991b0134]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 8: (rocksdb::Version::PrepareApply(rocksdb::MutableCFOptions const&, bool)+0x76) [0x562c991b3416]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 9: (rocksdb::VersionSet::LogAndApply(rocksdb::ColumnFamilyData*, rocksdb::MutableCFOptions const&, rocksdb::autovector<rocksdb::VersionEdit*, 8ul> const&, rocksdb::InstrumentedMutex*, rocksdb::Directory*, bool, rocksdb::ColumnFamilyOptions const*)+0x102a) [0x562c991b625a]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 10: (rocksdb::DBImpl::RecoverLogFiles(std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long*, bool)+0x1ab3) [0x562c99184b03]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 11: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool)+0x7e6) [0x562c991858d6]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 12: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**)+0xed3) [0x562c99186b93]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 13: (rocksdb::DB::Open(rocksdb::Options const&, std::string const&, rocksdb::DB**)+0x186) [0x562c99187dd6]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 14: (RocksDBStore::do_open(std::ostream&, bool)+0x8db) [0x562c98dd78fb]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 15: (RocksDBStore::create_and_open(std::ostream&)+0x7a) [0x562c98dd91fa]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 16: (FileStore::mount()+0x274f) [0x562c98c913ef]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 17: (OSD::init()+0x3ba) [0x562c9895ac0a]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 18: (main()+0x2def) [0x562c98861fff]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 19: (__libc_start_main()+0xf5) [0x7f6e961e0c05]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: 20: (()+0x4ad796) [0x562c98900796]
Oct 18 05:36:24 node1.ceph ceph-osd[4844]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Oct 18 05:36:24 node1.ceph systemd[1]: ceph-osd@1.service: main process exited, code=killed, status=11/SEGV
Oct 18 05:36:24 node1.ceph systemd[1]: Unit ceph-osd@1.service entered failed state.
Oct 18 05:36:24 node1.ceph systemd[1]: ceph-osd@1.service failed.
Oct 18 05:36:44 node1.ceph systemd[1]: ceph-osd@1.service holdoff time over, scheduling restart.
Oct 18 05:36:44 node1.ceph systemd[1]: Starting Ceph object storage daemon osd.1...
Oct 18 05:36:44 node1.ceph systemd[1]: Started Ceph object storage daemon osd.1.
Oct 18 05:36:44 node1.ceph ceph-osd[4935]: 2017-10-18 05:36:44.561720 7f03c0fd7d00 -1 Public network was set, but cluster network was not set
Oct 18 05:36:44 node1.ceph ceph-osd[4935]: 2017-10-18 05:36:44.561730 7f03c0fd7d00 -1     Using public network also for cluster network
Oct 18 05:36:44 node1.ceph ceph-osd[4935]: starting osd.1 at - osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: *** Caught signal (Segmentation fault) **
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: in thread 7f03b13fe700 thread_name:ceph-osd
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 1: (()+0xa29511) [0x56470aaa6511]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 2: (()+0xf5e0) [0x7f03be1da5e0]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 3: (()+0x1cdff) [0x7f03c0bbadff]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 4: (rocksdb::BlockBasedTable::NewIndexIterator(rocksdb::ReadOptions const&, rocksdb::BlockIter*, rocksdb::BlockBasedTable::CachableEntry<rocksdb::BlockBasedTable::IndexReader>*)+0x466) [0x56470ae19ca6]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 5: (rocksdb::BlockBasedTable::Open(rocksdb::ImmutableCFOptions const&, rocksdb::EnvOptions const&, rocksdb::BlockBasedTableOptions const&, rocksdb::InternalKeyComparator const&, std::unique_ptr<rocksdb::RandomAccessFileReader, std::default_delete<rocksdb::RandomAccessFileReader> >&&, unsigned long, std::unique_ptr<rocksdb::TableReader, std::default_delete<rocksdb::TableReader> >*, bool, bool, int)+0xe3c) [0x56470ae1f66c]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 6: (rocksdb::BlockBasedTableFactory::NewTableReader(rocksdb::TableReaderOptions const&, std::unique_ptr<rocksdb::RandomAccessFileReader, std::default_delete<rocksdb::RandomAccessFileReader> >&&, unsigned long, std::unique_ptr<rocksdb::TableReader, std::default_delete<rocksdb::TableReader> >*, bool) const+0x51) [0x56470ae13771]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 7: (rocksdb::TableCache::GetTableReader(rocksdb::EnvOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileDescriptor const&, bool, unsigned long, bool, rocksdb::HistogramImpl*, std::unique_ptr<rocksdb::TableReader, std::default_delete<rocksdb::TableReader> >*, bool, int, bool)+0x215) [0x56470aed9495]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 8: (rocksdb::TableCache::FindTable(rocksdb::EnvOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileDescriptor const&, rocksdb::Cache::Handle**, bool, bool, rocksdb::HistogramImpl*, bool, int, bool)+0x2b0) [0x56470aed9aa0]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 9: (std::_Function_handler<void (), rocksdb::VersionBuilder::Rep::LoadTableHandlers(rocksdb::InternalStats*, int, bool)::{lambda()#1}>::_M_invoke(std::_Any_data const&)+0xa2) [0x56470aedf032]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 10: (()+0xb52b0) [0x7f03bdb5e2b0]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 11: (()+0x7e25) [0x7f03be1d2e25]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 12: (clone()+0x6d) [0x7f03bd2c634d]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 2017-10-18 05:36:45.442100 7f03b13fe700 -1 *** Caught signal (Segmentation fault) **
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: in thread 7f03b13fe700 thread_name:ceph-osd
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 1: (()+0xa29511) [0x56470aaa6511]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 2: (()+0xf5e0) [0x7f03be1da5e0]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 3: (()+0x1cdff) [0x7f03c0bbadff]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 4: (rocksdb::BlockBasedTable::NewIndexIterator(rocksdb::ReadOptions const&, rocksdb::BlockIter*, rocksdb::BlockBasedTable::CachableEntry<rocksdb::BlockBasedTable::IndexReader>*)+0x466) [0x56470ae19ca6]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 5: (rocksdb::BlockBasedTable::Open(rocksdb::ImmutableCFOptions const&, rocksdb::EnvOptions const&, rocksdb::BlockBasedTableOptions const&, rocksdb::InternalKeyComparator const&, std::unique_ptr<rocksdb::RandomAccessFileReader, std::default_delete<rocksdb::RandomAccessFileReader> >&&, unsigned long, std::unique_ptr<rocksdb::TableReader, std::default_delete<rocksdb::TableReader> >*, bool, bool, int)+0xe3c) [0x56470ae1f66c]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 6: (rocksdb::BlockBasedTableFactory::NewTableReader(rocksdb::TableReaderOptions const&, std::unique_ptr<rocksdb::RandomAccessFileReader, std::default_delete<rocksdb::RandomAccessFileReader> >&&, unsigned long, std::unique_ptr<rocksdb::TableReader, std::default_delete<rocksdb::TableReader> >*, bool) const+0x51) [0x56470ae13771]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 7: (rocksdb::TableCache::GetTableReader(rocksdb::EnvOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileDescriptor const&, bool, unsigned long, bool, rocksdb::HistogramImpl*, std::unique_ptr<rocksdb::TableReader, std::default_delete<rocksdb::TableReader> >*, bool, int, bool)+0x215) [0x56470aed9495]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 8: (rocksdb::TableCache::FindTable(rocksdb::EnvOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileDescriptor const&, rocksdb::Cache::Handle**, bool, bool, rocksdb::HistogramImpl*, bool, int, bool)+0x2b0) [0x56470aed9aa0]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 9: (std::_Function_handler<void (), rocksdb::VersionBuilder::Rep::LoadTableHandlers(rocksdb::InternalStats*, int, bool)::{lambda()#1}>::_M_invoke(std::_Any_data const&)+0xa2) [0x56470aedf032]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 10: (()+0xb52b0) [0x7f03bdb5e2b0]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 11: (()+0x7e25) [0x7f03be1d2e25]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 12: (clone()+0x6d) [0x7f03bd2c634d]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: -180> 2017-10-18 05:36:44.561720 7f03c0fd7d00 -1 Public network was set, but cluster network was not set
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: -179> 2017-10-18 05:36:44.561730 7f03c0fd7d00 -1     Using public network also for cluster network
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 0> 2017-10-18 05:36:45.442100 7f03b13fe700 -1 *** Caught signal (Segmentation fault) **
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: in thread 7f03b13fe700 thread_name:ceph-osd
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 1: (()+0xa29511) [0x56470aaa6511]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 2: (()+0xf5e0) [0x7f03be1da5e0]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 3: (()+0x1cdff) [0x7f03c0bbadff]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 4: (rocksdb::BlockBasedTable::NewIndexIterator(rocksdb::ReadOptions const&, rocksdb::BlockIter*, rocksdb::BlockBasedTable::CachableEntry<rocksdb::BlockBasedTable::IndexReader>*)+0x466) [0x56470ae19ca6]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 5: (rocksdb::BlockBasedTable::Open(rocksdb::ImmutableCFOptions const&, rocksdb::EnvOptions const&, rocksdb::BlockBasedTableOptions const&, rocksdb::InternalKeyComparator const&, std::unique_ptr<rocksdb::RandomAccessFileReader, std::default_delete<rocksdb::RandomAccessFileReader> >&&, unsigned long, std::unique_ptr<rocksdb::TableReader, std::default_delete<rocksdb::TableReader> >*, bool, bool, int)+0xe3c) [0x56470ae1f66c]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 6: (rocksdb::BlockBasedTableFactory::NewTableReader(rocksdb::TableReaderOptions const&, std::unique_ptr<rocksdb::RandomAccessFileReader, std::default_delete<rocksdb::RandomAccessFileReader> >&&, unsigned long, std::unique_ptr<rocksdb::TableReader, std::default_delete<rocksdb::TableReader> >*, bool) const+0x51) [0x56470ae13771]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 7: (rocksdb::TableCache::GetTableReader(rocksdb::EnvOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileDescriptor const&, bool, unsigned long, bool, rocksdb::HistogramImpl*, std::unique_ptr<rocksdb::TableReader, std::default_delete<rocksdb::TableReader> >*, bool, int, bool)+0x215) [0x56470aed9495]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 8: (rocksdb::TableCache::FindTable(rocksdb::EnvOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileDescriptor const&, rocksdb::Cache::Handle**, bool, bool, rocksdb::HistogramImpl*, bool, int, bool)+0x2b0) [0x56470aed9aa0]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 9: (std::_Function_handler<void (), rocksdb::VersionBuilder::Rep::LoadTableHandlers(rocksdb::InternalStats*, int, bool)::{lambda()#1}>::_M_invoke(std::_Any_data const&)+0xa2) [0x56470aedf032]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 10: (()+0xb52b0) [0x7f03bdb5e2b0]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 11: (()+0x7e25) [0x7f03be1d2e25]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: 12: (clone()+0x6d) [0x7f03bd2c634d]
Oct 18 05:36:45 node1.ceph ceph-osd[4935]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Oct 18 05:36:45 node1.ceph systemd[1]: ceph-osd@1.service: main process exited, code=killed, status=11/SEGV
Oct 18 05:36:45 node1.ceph systemd[1]: Unit ceph-osd@1.service entered failed state.
Oct 18 05:36:45 node1.ceph systemd[1]: ceph-osd@1.service failed.
Oct 18 05:37:05 node1.ceph systemd[1]: ceph-osd@1.service holdoff time over, scheduling restart.
Oct 18 05:37:05 node1.ceph systemd[1]: start request repeated too quickly for ceph-osd@1.service
Oct 18 05:37:05 node1.ceph systemd[1]: Failed to start Ceph object storage daemon osd.1.
Oct 18 05:37:05 node1.ceph systemd[1]: Unit ceph-osd@1.service entered failed state.
Oct 18 05:37:05 node1.ceph systemd[1]: ceph-osd@1.service failed.


Related issues

Related to bluestore - Bug #20557: segmentation fault with rocksdb|BlueStore and jemalloc Closed 07/10/2017

History

#1 Updated by Sage Weil over 4 years ago

  • Related to Bug #20557: segmentation fault with rocksdb|BlueStore and jemalloc added

#2 Updated by Sage Weil over 4 years ago

  • Status changed from New to Duplicate

Disable jemalloc in /etc/{sysconfig,default}/ceph
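
A minimal sketch of that workaround, assuming jemalloc is loaded through an uncommented LD_PRELOAD line in the environment file (check your own file, since this ticket does not show its contents). The helper name disable_jemalloc and the regex are illustrative and not part of Ceph.

```python
import re
from typing import Tuple

# Assumption: jemalloc is enabled by an active (uncommented) LD_PRELOAD
# line that names a jemalloc library. Commented lines do not match.
JEMALLOC_PRELOAD = re.compile(r"^\s*(?:export\s+)?LD_PRELOAD=.*jemalloc", re.IGNORECASE)


def disable_jemalloc(env_text: str) -> Tuple[str, int]:
    """Comment out active LD_PRELOAD lines that mention jemalloc.

    Returns the rewritten text and the number of lines that were changed.
    """
    out = []
    changed = 0
    for line in env_text.splitlines(keepends=True):
        if JEMALLOC_PRELOAD.match(line):
            out.append("#" + line)  # disable by commenting the line out
            changed += 1
        else:
            out.append(line)
    return "".join(out), changed
```

To use it, read /etc/sysconfig/ceph or /etc/default/ceph (whichever exists on the host), pass the contents through disable_jemalloc, write the result back if the count is non-zero, and restart the affected OSDs so the new environment takes effect.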

#3 Updated by Adam Kupczyk about 4 years ago

Note.
SIGSEGV from (rocksdb::Arena::AllocateNewBlock(...)+0x5b) is related to invoking malloc_usable_size.

SIGSEGV from (rocksdb::BlockBasedTable::NewIndexIterator(..)+0x466) is caused by invoking "input_iter->SetStatus(s);".
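
The two crash sites named above are the first symbolized frames in each backtrace. As a rough sketch, assuming journal lines in the same format as the log in the description, the following Python helper pulls that frame out per ceph-osd PID. The function name first_named_frames and the regex are illustrative only.

```python
import re

# Matches backtrace lines such as:
#   ... ceph-osd[4844]: 4: (rocksdb::Arena::AllocateNewBlock(unsigned long)+0x5b) [0x562c9921987b]
# Group 1 is the PID, group 2 is the symbol ("()" when the frame is unsymbolized).
FRAME = re.compile(
    r"ceph-osd\[(\d+)\]: \d+: \((.+)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]\s*$"
)


def first_named_frames(log_text):
    """Return {pid: symbol} with the first symbolized frame seen for each PID."""
    result = {}
    for line in log_text.splitlines():
        m = FRAME.search(line)
        if not m:
            continue
        pid, symbol = m.group(1), m.group(2)
        if symbol == "()":
            continue  # unsymbolized frame, e.g. (()+0xa29511)
        result.setdefault(pid, symbol)  # keep only the first named frame
    return result
```

Running this over the log in the description should yield one entry per crashing process, which makes it easy to tell the AllocateNewBlock crash from the NewIndexIterator crash when many OSDs are flapping.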
