Bug #22882

closed

Objecter deadlocked on op budget while holding rwlock in ms_handle_reset()

Added by Jason Dillaman over 6 years ago. Updated about 6 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Thread 50 (Thread 0x7fbd75ff3700 (LWP 26238)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007fbdc5fe08ad in Cond::Wait (mutex=..., this=0x7fbd64041ea0) at /build/ceph-12.2.2-558-g5445703/src/common/Cond.h:48
#2  Throttle::_wait (this=this@entry=0x55683eaa6250, c=c@entry=1) at /build/ceph-12.2.2-558-g5445703/src/common/Throttle.cc:111
#3  0x00007fbdc5fe16d1 in Throttle::get (this=this@entry=0x55683eaa6250, c=c@entry=1, m=m@entry=0) at /build/ceph-12.2.2-558-g5445703/src/common/Throttle.cc:179
#4  0x00007fbdceb2f977 in Objecter::_throttle_op (this=this@entry=0x55683eaa5c60, op=op@entry=0x7fbd6413e090, sul=..., op_budget=op_budget@entry=0)
    at /build/ceph-12.2.2-558-g5445703/src/osdc/Objecter.cc:3296
#5  0x00007fbdceb4c7d8 in Objecter::_take_op_budget (sul=..., op=0x7fbd6413e090, this=0x55683eaa5c60) at /build/ceph-12.2.2-558-g5445703/src/osdc/Objecter.h:1980
#6  Objecter::_op_submit_with_budget (this=this@entry=0x55683eaa5c60, op=op@entry=0x7fbd6413e090, sul=..., ptid=ptid@entry=0x7fbd5c053dc0, ctx_budget=ctx_budget@entry=0x0)
    at /build/ceph-12.2.2-558-g5445703/src/osdc/Objecter.cc:2276
#7  0x00007fbdceb4f444 in Objecter::_send_linger (this=this@entry=0x55683eaa5c60, info=info@entry=0x7fbd5c0539d0, sul=...) at /build/ceph-12.2.2-558-g5445703/src/osdc/Objecter.cc:583
#8  0x00007fbdceb4fdff in Objecter::_linger_ops_resend (this=this@entry=0x55683eaa5c60, lresend=std::map with 1 elements = {...}, ul=...)
    at /build/ceph-12.2.2-558-g5445703/src/osdc/Objecter.cc:2095
#9  0x00007fbdceb5047d in Objecter::ms_handle_reset (this=0x55683eaa5c60, con=<optimized out>) at /build/ceph-12.2.2-558-g5445703/src/osdc/Objecter.cc:4405
#10 0x00007fbdc6061c6d in Messenger::ms_deliver_handle_reset (con=0x7fbd6401d3f0, this=0x55683ea1a0f0) at /build/ceph-12.2.2-558-g5445703/src/msg/Messenger.h:741
#11 DispatchQueue::entry (this=0x55683ea1a270) at /build/ceph-12.2.2-558-g5445703/src/msg/DispatchQueue.cc:182
#12 0x00007fbdc615907d in DispatchQueue::DispatchThread::entry (this=<optimized out>) at /build/ceph-12.2.2-558-g5445703/src/msg/DispatchQueue.h:101
#13 0x00007fbdd93f16ba in start_thread (arg=0x7fbd75ff3700) at pthread_create.c:333
#14 0x00007fbdd91273dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:74
#15 0x0000000000000000 in ?? ()

The Objecter in this case did not have 1024 ops in flight, so there appears to be a leak in the op budget accounting, possibly triggered when socket failures are introduced.
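
To illustrate the kind of accounting leak suspected above, here is a minimal sketch. It is not Ceph's Throttle or Objecter code: SketchThrottle, SketchClient, and fail_leaky() are hypothetical names, and the leaky error path is an assumption made purely for illustration. It shows how a budget can be exhausted while far fewer ops are actually in flight, which is the state in which a blocking get() would wait indefinitely.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for a counting throttle; NOT Ceph's Throttle class.
class SketchThrottle {
 public:
  explicit SketchThrottle(int64_t max) : max_(max) {}

  // Non-blocking take: returns false where a blocking get() would wait.
  bool try_get(int64_t c = 1) {
    std::lock_guard<std::mutex> l(lock_);
    if (current_ + c > max_) return false;
    current_ += c;
    return true;
  }

  // Return budget previously taken with try_get().
  void put(int64_t c = 1) {
    std::lock_guard<std::mutex> l(lock_);
    current_ -= c;
  }

  int64_t current() const {
    std::lock_guard<std::mutex> l(lock_);
    return current_;
  }

 private:
  mutable std::mutex lock_;
  const int64_t max_;
  int64_t current_ = 0;
};

// Hypothetical client that tracks in-flight ops separately from the budget.
struct SketchClient {
  SketchThrottle budget;
  int64_t in_flight = 0;

  explicit SketchClient(int64_t max) : budget(max) {}

  // Take one unit of budget and count the op as in flight.
  bool submit() {
    if (!budget.try_get()) return false;
    ++in_flight;
    return true;
  }

  // Normal completion: the op leaves and its budget is returned.
  void complete() {
    --in_flight;
    budget.put();
  }

  // Assumed buggy error path: the op is dropped but put() is never called.
  void fail_leaky() { --in_flight; }
};

// True when the budget is exhausted even though no ops remain in flight.
inline bool leak_exhausts_budget(int64_t max) {
  SketchClient c(max);
  for (int64_t i = 0; i < max; ++i) {
    if (!c.submit()) return false;
    c.fail_leaky();  // every op hits the leaky path
  }
  return c.in_flight == 0 && c.budget.current() == max && !c.submit();
}
```

Calling leak_exhausts_budget(1024) uses the figure quoted in this report. In the sketch, submit() simply fails once the budget is gone. A blocking take in the same state, made while a lock is held, would stall every other thread that needs that lock.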
