Bug #38375 (open): OSD segmentation fault on rbd create

Added by Ryan Farrington about 5 years ago. Updated over 2 years ago.

Status: Need More Info
Priority: Normal
Assignee: -
Category: Correctness/Safety
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Random segfault when attempting to create an RBD image on an erasure-coded pool.

ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)

ceph-post-file: b06c638e-7b96-43dc-be78-b827171e71de
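
For context, since the pool setup is not shown in the report, here is a minimal sketch of the kind of configuration involved (pool and image names are placeholders, not the reporter's actual names): on luminous, RBD on an erasure-coded pool means a replicated metadata pool whose image data is directed at the EC pool via --data-pool, with overwrites enabled on the EC pool.

 # Assumed setup: EC data pool with partial overwrites enabled (required for RBD)
 ceph osd pool create ecpool 32 32 erasure
 ceph osd pool set ecpool allow_ec_overwrites true
 # Replicated pool holding the RBD metadata
 ceph osd pool create rbdmeta 32
 ceph osd pool application enable rbdmeta rbd
 # The kind of create that triggered the segfault on the OSD side
 rbd create --size 1G --data-pool ecpool rbdmeta/testimg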

2019-02-18 02:43:55.364337 9c8617e0 -1 *** Caught signal (Segmentation fault) **
 in thread 9c8617e0 thread_name:tp_osd_tp

 ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)
 1: (()+0x793188) [0xc44188]
 2: (()+0x24fd0) [0xb68ddfd0]
 3: (hobject_t::hobject_t(hobject_t const&)+0x9) [0x88d036]
 4: (std::_Rb_tree_iterator<std::pair<hobject_t const, interval_map<unsigned long long, ceph::buffer::list, bl_split_merge> > > std::_Rb_tree<hobject_t, std::pair<hobject_t const, interval_map<unsigned long long, ceph::buffer::list, bl_$
 5: (()+0x5f081a) [0xaa181a]
 6: (ECTransaction::generate_transactions(ECTransaction::WritePlan&, std::shared_ptr<ceph::ErasureCodeInterface>&, pg_t, bool, ECUtil::stripe_info_t const&, std::map<hobject_t, interval_map<unsigned long long, ceph::buffer::list, bl_spl$
 7: (ECBackend::try_reads_to_commit()+0x3ef) [0xa89fe0]
 8: (ECBackend::check_ops()+0x13) [0xa8ceb0]
 9: (ECBackend::start_rmw(ECBackend::Op*, std::unique_ptr<PGTransaction, std::default_delete<PGTransaction> >&&)+0xa15) [0xa93e6a]
 10: (ECBackend::submit_transaction(hobject_t const&, object_stat_sum_t const&, eversion_t const&, std::unique_ptr<PGTransaction, std::default_delete<PGTransaction> >&&, eversion_t const&, eversion_t const&, std::vector<pg_log_entry_t, $
 11: (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, PrimaryLogPG::OpContext*)+0x58b) [0x95f1b4]
 12: (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0xc89) [0x99353e]
 13: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x25f9) [0x99635a]
 14: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xae1) [0x967aea]
 15: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x261) [0x85d4fa]
 16: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x4b) [0xa28e34]
 17: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x961) [0x87a2c6]
 18: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5bb) [0xc76e84]
 19: (ShardedThreadPool::WorkThreadSharded::entry()+0x9) [0xc78da6]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

I uploaded the OSD log and the objdump output for the ceph-osd binary.
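
For reference, a sketch of how such a dump is produced and uploaded (the binary and log paths are assumptions; adjust for the actual install):

 # Disassemble the OSD binary with source interleaved, per the NOTE in the backtrace
 objdump -rdS /usr/bin/ceph-osd > ceph-osd.objdump
 # Post both files; ceph-post-file prints a tag id like the one quoted above
 ceph-post-file ceph-osd.objdump /var/log/ceph/ceph-osd.0.log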

#1

Updated by Brad Hubbard about 5 years ago

  • Project changed from Ceph to RADOS
  • Category changed from chef to Correctness/Safety
  • Priority changed from Normal to High
#2

Updated by Neha Ojha over 3 years ago

  • Status changed from New to Need More Info

It seems we have lost the ceph-post-file upload due to one of the lab incidents.

#3

Updated by Neha Ojha over 2 years ago

  • Priority changed from High to Normal
#4

Updated by Ryan Farrington over 2 years ago

I do not have the files to re-upload, so it might be worth closing this out: I have moved on to another release and this issue is no longer present.
