Bug #51644

don't assert on bogus CEPH_OSD_ZERO request

Added by Jeff Layton over 2 years ago. Updated over 2 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

While testing some changes to the kernel client (kclient), I was able to crash the OSD with a stack trace like this:

Jul 13 09:44:35 cephadm1 ceph-osd[3809]: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-5828-gbe>
                                         /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-5828-gbe>

                                          ceph version 17.0.0-5828-gbe8cd9ca (be8cd9caa37babf5884c275ec525e7b316e3669a) quincy (dev)
                                          1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x5616b94fe596]
                                          2: /usr/bin/ceph-osd(+0x5bf7b7) [0x5616b94fe7b7]
                                          3: (PrimaryLogPG::do_osd_ops(PrimaryLogPG::OpContext*, std::vector<OSDOp, std::allocator<OSDOp> >&)+0xf979) [0x5616b97637a9]
                                          4: (PrimaryLogPG::prepare_transaction(PrimaryLogPG::OpContext*)+0x177) [0x5616b9768e07]
                                          5: (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0x31d) [0x5616b976aeed]
                                          6: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2dbb) [0x5616b977497b]
                                          7: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xd1c) [0x5616b977bb6c]
                                          8: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x309) [0x5616b9604ec9]
                                          9: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x68) [0x5616b9867c48]
                                          10: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xc28) [0x5616b96219b8]
                                          11: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5c4) [0x5616b9cbe5b4]
                                          12: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x5616b9cbf954]
                                          13: (Thread::_entry_func(void*)+0xd) [0x5616b9ca567d]
                                          14: /lib64/libpthread.so.0(+0x814a) [0x7faba84ad14a]
                                          15: clone()

Instead of asserting when a client sends a malformed request like this, the OSD should just return -EINVAL.
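
To illustrate the idea of validating and returning an error rather than asserting, here is a minimal standalone sketch. It is not the actual PrimaryLogPG code or the eventual fix. The ZeroExtent struct and validate_zero_extent() are hypothetical names, and this report does not say which field of the request was bogus, so the two checks below (a zero length, and an offset/length pair that wraps around) are assumptions chosen only for illustration.

```cpp
#include <cerrno>
#include <cstdint>

// Hypothetical stand-in for the extent carried by a zero request.
// This is NOT the real OSDOp layout, just enough to show the pattern.
struct ZeroExtent {
  std::uint64_t offset;
  std::uint64_t length;
};

// Validate client-supplied input and report failure to the caller
// instead of asserting. Returns 0 if the extent looks sane, -EINVAL if not.
// Assumption: "bogus" here means a zero length or a range that overflows.
constexpr int validate_zero_extent(const ZeroExtent& e) {
  if (e.length == 0) {
    return -EINVAL;  // nothing to zero; reject rather than crash the daemon
  }
  if (e.offset > UINT64_MAX - e.length) {
    return -EINVAL;  // offset + length would wrap around
  }
  return 0;
}

// Compile-time checks of the behaviour described above.
static_assert(validate_zero_extent(ZeroExtent{0, 0}) == -EINVAL,
              "zero-length request is rejected");
static_assert(validate_zero_extent(ZeroExtent{4096, 8192}) == 0,
              "ordinary request is accepted");
```

The point is that anything arriving from a client is untrusted input, so a bad value should produce an error reply to that client rather than take down the OSD for everyone else.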

History

#1 Updated by Jeff Layton over 2 years ago

  • Description updated (diff)

#2 Updated by Jeff Layton over 2 years ago

  • Pull request ID set to 42308

#3 Updated by Jeff Layton over 2 years ago

Building an image to test the fix now.

#4 Updated by Neha Ojha over 2 years ago

  • Status changed from New to Fix Under Review

#5 Updated by Kefu Chai over 2 years ago

  • Status changed from Fix Under Review to Resolved
