Bug #51644
don't assert on bogus CEPH_OSD_ZERO request
Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Regression: No
Severity: 3 - minor
Description
While testing some changes to the kernel client (kclient), I was able to crash the OSD with a stack trace like this:
Jul 13 09:44:35 cephadm1 ceph-osd[3809]: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-5828-gbe>
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-5828-gbe>
ceph version 17.0.0-5828-gbe8cd9ca (be8cd9caa37babf5884c275ec525e7b316e3669a) quincy (dev)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x5616b94fe596]
2: /usr/bin/ceph-osd(+0x5bf7b7) [0x5616b94fe7b7]
3: (PrimaryLogPG::do_osd_ops(PrimaryLogPG::OpContext*, std::vector<OSDOp, std::allocator<OSDOp> >&)+0xf979) [0x5616b97637a9]
4: (PrimaryLogPG::prepare_transaction(PrimaryLogPG::OpContext*)+0x177) [0x5616b9768e07]
5: (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0x31d) [0x5616b976aeed]
6: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2dbb) [0x5616b977497b]
7: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xd1c) [0x5616b977bb6c]
8: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x309) [0x5616b9604ec9]
9: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x68) [0x5616b9867c48]
10: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xc28) [0x5616b96219b8]
11: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5c4) [0x5616b9cbe5b4]
12: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x5616b9cbf954]
13: (Thread::_entry_func(void*)+0xd) [0x5616b9ca567d]
14: /lib64/libpthread.so.0(+0x814a) [0x7faba84ad14a]
15: clone()
A bogus request from a client should not be able to take down the OSD. Instead of asserting in this situation, do_osd_ops should just fail the op with -EINVAL.
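As a rough illustration of the proposed behaviour, here is a minimal sketch of validating a zero request and returning an error code rather than asserting. It is not the actual Ceph code or the contents of the fix: `ZeroExtent`, `validate_zero_extent` and `max_object_size` are hypothetical names, and the report does not say which condition tripped the assert, so a zero-length or out-of-range extent is assumed here purely for the example.

```cpp
#include <cerrno>
#include <cstdint>

// Hypothetical stand-in for the extent carried by a zero request.
// This is NOT the real Ceph type; it exists only for this sketch.
struct ZeroExtent {
  std::uint64_t offset;
  std::uint64_t length;
};

// Returns 0 if the extent looks sane, or -EINVAL if it is malformed.
// The point is that client-supplied values are checked and rejected
// with an error code instead of being fed to an assert.
int validate_zero_extent(const ZeroExtent& ext, std::uint64_t max_object_size) {
  // Assumed failure mode: a request that asks to zero nothing at all.
  if (ext.length == 0) {
    return -EINVAL;
  }
  // Reject extents that run past the (hypothetical) object size limit,
  // written so that offset + length cannot overflow.
  if (ext.offset > max_object_size ||
      ext.length > max_object_size - ext.offset) {
    return -EINVAL;
  }
  return 0;
}
```

A caller would propagate the negative return value to the client as the op result, so a malformed request fails on its own without affecting the daemon.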
History
#1 Updated by Jeff Layton over 2 years ago
- Description updated (diff)
#2 Updated by Jeff Layton over 2 years ago
- Pull request ID set to 42308
#3 Updated by Jeff Layton over 2 years ago
Building an image to test the fix now.
#4 Updated by Neha Ojha over 2 years ago
- Status changed from New to Fix Under Review
#5 Updated by Kefu Chai over 2 years ago
- Status changed from Fix Under Review to Resolved