Bug #37733
os/bluestore: fixup access a destroy cond cause deadlock or undefine behaviors
Status: Resolved
Priority: High
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: luminous,mimic
Regression: No
Severity: 3 - minor
Reviewed:
Description
1. The OSD was marked down because of missed heartbeats.
2. Attaching gdb showed a thread hung in __lll_lock_wait:
(gdb) t 2
[Switching to thread 2 (Thread 0x7f6af1c87700 (LWP 1543))]
#0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
135 2: movl %edx, %eax
3. cond._M_cond.__data.__lock is 2, which causes the hang:
(gdb) p cond
$7 = {_M_cond = {__data = {__lock = 2, __futex = 0, __total_seq = 18446744073709551615, __wakeup_seq = 94435500207862, __woken_seq = 0, __mutex = 0xe7d39a28b922f100, __nwaiters = 0, __broadcast_seq = 0},
__size = "\002\000\000\000\000\000\000\000\377\377\377\377\377\377\377\377\366\256qz\343U", '\000' <repeats 11 times>, "\361\"\271(\232\323\347\000\000\000\000\000\000\000", __align = 2}}
4. I wrote a test. The main logic: when a cond is destroyed, cond.__data.__lock is set to 1, which means the cond must not be used again until pthread_cond_init is called on it. Accessing the destroyed cond then blocks in __lll_lock_wait.
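One common way a destroyed cond ends up being accessed is a waiter that keeps the cond on its stack, wakes up, and returns while the completing thread is still inside its notify call. The sketch below is not taken from the Ceph patch. StackWaiter and wait_for_worker are hypothetical names used only for illustration, and the example assumes the cond belongs to a short-lived, stack-allocated object. It shows the usual ordering rule: notify while still holding the mutex, so the waiter cannot leave wait() and destroy the cond until the notifier has finished touching it.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical one-shot completion object (not a Ceph type). It is meant
// to live on the waiting thread's stack, so its cond is destroyed as soon
// as the waiter returns.
struct StackWaiter {
  std::mutex lock;
  std::condition_variable cond;
  bool done = false;

  // Called by the completing thread. The notify happens while the mutex
  // is still held: the waiter needs the mutex to return from wait(), so it
  // cannot proceed (and destroy cond) before notify_all() has returned.
  void complete() {
    std::lock_guard<std::mutex> l(lock);
    done = true;
    cond.notify_all();
  }

  // Called by the owning thread. The predicate guards against spurious
  // wakeups and against complete() running before wait().
  void wait() {
    std::unique_lock<std::mutex> l(lock);
    cond.wait(l, [this] { return done; });
  }
};

// Hypothetical driver: returns true once the worker has signalled.
bool wait_for_worker() {
  StackWaiter w;
  std::thread worker([&w] { w.complete(); });
  w.wait();
  // Joining also guarantees the worker is finished before w is destroyed.
  // With a long-lived completing thread there is no join, and the
  // notify-under-lock ordering above is what protects the cond.
  worker.join();
  return w.done;
}
```

Calling wait_for_worker() from a test or from main() exercises the pattern. The unsafe variant would set done under the lock, release the lock, and only then call cond.notify_all(), leaving a window in which the waiter can return and destroy the cond first.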
Updated by Sage Weil over 5 years ago
- Status changed from New to Fix Under Review
Updated by Sage Weil over 5 years ago
- Priority changed from Normal to High
- Backport set to luminous,mimic
Updated by Neha Ojha about 5 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Nathan Cutler about 5 years ago
- Copied to Backport #38142: luminous: os/bluestore: fixup access a destroy cond cause deadlock or undefine behaviors added
Updated by Nathan Cutler about 5 years ago
- Copied to Backport #38143: mimic: os/bluestore: fixup access a destroy cond cause deadlock or undefine behaviors added
Updated by Nathan Cutler about 5 years ago
- Status changed from Pending Backport to Resolved