Bug #46381

open

PG down and OSD crash (abort) in Objecter::_op_submit_with_budget

Added by Amine Liu almost 4 years ago. Updated almost 3 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (user)
Tags: _op_submit_with_budget
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: rados
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ENV:
LSB Version: :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 7.2.1511 (Core)
Release: 7.2.1511
Codename: Core
ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
Setup: SSD cache tier + data pool

Logs:

2020-07-06 18:40:14.446452 7f83acac5700 1 -- 172.26.224.197:6806/2876125 >> 172.26.224.165:0/500874 conn(0x7f83d0798000 sd=1393 :6806 s=STATE_OPEN pgs=12559 cs=1 l=1). tx 0x7f83cf667c00 osd_ping(ping_reply e45014 stamp 2020-07-06 18:40:14.448379) v2
2020-07-06 18:40:14.446495 7f83acac5700 10 osd.24 45014 note_peer_epoch osd.14 has 45014
2020-07-06 18:40:14.446504 7f83acac5700 20 osd.24 45014 share_map_peer 0x7f83d07b0800 already has epoch 45014
2020-07-06 18:40:14.483186 7f8385487700 -1 *** Caught signal (Aborted) **
 in thread 7f8385487700 thread_name:tp_osd_tp

ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
1: (()+0x91341a) [0x7f83b5e8b41a]
2: (()+0xf100) [0x7f83b3ec1100]
3: (gsignal()+0x37) [0x7f83b24835f7]
4: (abort()+0x148) [0x7f83b2484ce8]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x267) [0x7f83b5f8b797]
6: (Throttle::take(long)+0x2fc) [0x7f83b5f6f73c]
7: (Objecter::_op_submit_with_budget(Objecter::Op*, ceph::shunique_lock<boost::shared_mutex>&, unsigned long*, int*)+0x2d2) [0x7f83b5b48652]
8: (Objecter::op_submit(Objecter::Op*, unsigned long*, int*)+0xed) [0x7f83b5b488dd]
9: (ReplicatedPG::_copy_some(std::shared_ptr<ObjectContext>, std::shared_ptr<ReplicatedPG::CopyOp>)+0xfb0) [0x7f83b5a42420]
10: (ReplicatedPG::start_copy(ReplicatedPG::CopyCallback*, std::shared_ptr<ObjectContext>, hobject_t, object_locator_t, unsigned long, unsigned int, bool, unsigned int, unsigned int)+0x682) [0x7f83b5a433c2]
11: (ReplicatedPG::do_osd_ops(ReplicatedPG::OpContext*, std::vector<OSDOp, std::allocator<OSDOp> >&)+0xa34a) [0x7f83b5a7ccca]
12: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0xbf) [0x7f83b5a8a83f]
13: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0x8f0) [0x7f83b5a8b6f0]
14: (ReplicatedPG::do_op(std::shared_ptr<OpRequest>&)+0x3740) [0x7f83b5a90560]
15: (ReplicatedPG::do_request(std::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x747) [0x7f83b5a4be57]
16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x41d) [0x7f83b5900a8d]
17: (PGQueueable::RunVis::operator()(std::shared_ptr<OpRequest>&)+0x6d) [0x7f83b5900cdd]
18: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x869) [0x7f83b5905809]
19: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x887) [0x7f83b5f7b557]
20: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f83b5f7d4c0]
21: (()+0x7dc5) [0x7f83b3eb9dc5]
22: (clone()+0x6d) [0x7f83b2544ced]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
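
For triage, the numbered frames above can be split into index, symbol, offset, and address. The following is a minimal sketch in Python, assuming only the frame layout seen in this report (`N: (symbol+0xOFFSET) [0xADDRESS]`). `FRAME_RE` and `parse_frame` are illustrative names and are not part of Ceph.

```python
import re

# Assumed frame layout, taken from the backtrace in this report:
#   "6: (Throttle::take(long)+0x2fc) [0x7f83b5f6f73c]"
# Unresolved frames show "()" in place of a symbol name.
FRAME_RE = re.compile(
    r"^\s*(?P<index>\d+):\s+"
    r"\((?P<symbol>.*)\+0x(?P<offset>[0-9a-fA-F]+)\)\s+"
    r"\[0x(?P<address>[0-9a-fA-F]+)\]\s*$"
)


def parse_frame(line):
    """Return a dict describing one backtrace frame, or None if the
    line is not a frame (log lines, the NOTE line, and so on)."""
    match = FRAME_RE.match(line)
    if match is None:
        return None
    symbol = match.group("symbol")
    if symbol == "()":
        symbol = None  # frame without a resolved symbol
    return {
        "index": int(match.group("index")),
        "symbol": symbol,
        "offset": int(match.group("offset"), 16),
        "address": int(match.group("address"), 16),
    }


if __name__ == "__main__":
    sample = [
        "1: (()+0x91341a) [0x7f83b5e8b41a]",
        "6: (Throttle::take(long)+0x2fc) [0x7f83b5f6f73c]",
        "in thread 7f8385487700 thread_name:tp_osd_tp",
    ]
    for line in sample:
        print(parse_frame(line))
```

Frames whose symbol is missing still carry an offset and address, so they can be resolved against the matching binary with `objdump`, as the NOTE above says.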

Files

osd.24.debug_07061841.txt (77.5 KB), osd error log, Amine Liu, 07/07/2020 06:41 AM
sxceph5_pg_down2.png (181 KB), ceph status, Amine Liu, 07/07/2020 06:46 AM
#1

Updated by Greg Farnum almost 3 years ago

  • Project changed from Ceph to RADOS
  • Category deleted (Objecter)