Bug #37733

closed

os/bluestore: fixup access a destroy cond cause deadlock or undefine behaviors

Added by bing lin over 5 years ago. Updated about 5 years ago.

Status: Resolved
Priority: High
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: luminous,mimic
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

1. The OSD was marked down because of heartbeat failures.
2. After attaching gdb, I found a thread hung in __lll_lock_wait:

(gdb) t 2
[Switching to thread 2 (Thread 0x7f6af1c87700 (LWP 1543))]
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
135     2:      movl    %edx, %eax

3. cond.__data.__lock is 2, which causes the hang:
(gdb) p cond
$7 = {_M_cond = {__data = {__lock = 2, __futex = 0, __total_seq = 18446744073709551615, __wakeup_seq = 94435500207862, __woken_seq = 0, __mutex = 0xe7d39a28b922f100, __nwaiters = 0, __broadcast_seq = 0},
    __size = "\002\000\000\000\000\000\000\000\377\377\377\377\377\377\377\377\366\256qz\343U", '\000' <repeats 11 times>, "\361\"\271(\232\323\347\000\000\000\000\000\000\000", __align = 2}}

4. I wrote a test. The main finding: when a cond is destroyed, cond.__data.__lock is set to 1, which means the cond must not be used again until pthread_cond_init is called on it. Accessing the destroyed cond then blocks in __lll_lock_wait.
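
The lifecycle rule from step 4 can be sketched as follows. This is only an illustration, not the code from the fix: CheckedCond and reuse_after_destroy_demo are hypothetical names, and the alive_ flag is a plain bool, so the sketch assumes a single thread drives init/destroy. It relies only on the POSIX rule that a destroyed condition variable may be reinitialized with pthread_cond_init, and does not touch glibc-internal fields such as __data.__lock.

```cpp
#include <pthread.h>
#include <cassert>
#include <cerrno>

// Hypothetical wrapper (not Ceph code): remembers whether the underlying
// pthread_cond_t is currently initialized, so that use after destroy is
// rejected by a check instead of reaching undefined behavior.
// Assumption: alive_ is not synchronized; one thread owns the lifecycle.
class CheckedCond {
public:
  CheckedCond() {
    int r = init();
    assert(r == 0);
    (void)r;
  }
  ~CheckedCond() {
    if (alive_) destroy();
  }
  CheckedCond(const CheckedCond&) = delete;
  CheckedCond& operator=(const CheckedCond&) = delete;

  // (Re)initialize the cond. POSIX allows this after a destroy.
  int init() {
    if (alive_) return EBUSY;
    int r = pthread_cond_init(&cond_, nullptr);
    alive_ = (r == 0);
    return r;
  }

  // Destroy the cond; afterwards only init() may be called.
  int destroy() {
    if (!alive_) return EINVAL;
    int r = pthread_cond_destroy(&cond_);
    if (r == 0) alive_ = false;
    return r;
  }

  // Refuse to pass a destroyed cond to pthread_cond_signal.
  int signal() {
    if (!alive_) return EINVAL;
    return pthread_cond_signal(&cond_);
  }

private:
  pthread_cond_t cond_;
  bool alive_ = false;
};

// Walks through: use, destroy, rejected use, re-init, use.
// Returns 0 if every step behaved as expected, else the failing step.
int reuse_after_destroy_demo() {
  CheckedCond c;
  if (c.signal() != 0) return 1;       // live cond: signal succeeds
  if (c.destroy() != 0) return 2;      // no waiters, destroy succeeds
  if (c.signal() != EINVAL) return 3;  // destroyed: wrapper refuses
  if (c.init() != 0) return 4;         // re-init makes it usable again
  if (c.signal() != 0) return 5;
  return 0;
}
```

A guard like this only makes the misuse visible. When another thread can still reach the cond, the destroying side also has to ensure that no such access is possible before the object goes away.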


Related issues 2 (0 open, 2 closed)

Copied to bluestore - Backport #38142: luminous: os/bluestore: fixup access a destroy cond cause deadlock or undefine behaviors (Resolved, Prashant D)
Copied to bluestore - Backport #38143: mimic: os/bluestore: fixup access a destroy cond cause deadlock or undefine behaviors (Resolved, Prashant D)

#1

Updated by Sage Weil over 5 years ago

  • Status changed from New to Fix Under Review

#2

Updated by Sage Weil over 5 years ago

  • Priority changed from Normal to High
  • Backport set to luminous,mimic

#3

Updated by Kefu Chai over 5 years ago

  • Pull request ID set to 25631

#4

Updated by Neha Ojha about 5 years ago

  • Status changed from Fix Under Review to Pending Backport

#5

Updated by Nathan Cutler about 5 years ago

  • Copied to Backport #38142: luminous: os/bluestore: fixup access a destroy cond cause deadlock or undefine behaviors added

#6

Updated by Nathan Cutler about 5 years ago

  • Copied to Backport #38143: mimic: os/bluestore: fixup access a destroy cond cause deadlock or undefine behaviors added

#7

Updated by Nathan Cutler about 5 years ago

  • Status changed from Pending Backport to Resolved