Bug #21171
closed
bluestore: aio submission deadlock
Description
- thread a holds deferred_submit_lock, blocks on aio submission (queue is full)
- thread b holds deferred_lock, blocks taking deferred_submit_lock
- aio completion handler blocks on deferred_lock, cannot drain aio queue.
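The three conditions above form a circular wait. As a rough illustration only (not BlueStore code), the sketch below models the situation as a wait-for graph and walks it to find the cycle. The party names `thread_a`, `thread_b`, and `aio_completion`, and the helper `find_cycle`, are hypothetical labels chosen for this example; only the lock names in the comments come from the description.

```python
from typing import Dict, List, Optional

# Wait-for graph: each party maps to the party it is blocked on.
# The keys are illustrative labels, not identifiers from the Ceph source.
WAITS_FOR: Dict[str, str] = {
    # Holds deferred_submit_lock; needs a free aio queue slot, which only
    # the completion handler can provide by draining the queue.
    "thread_a": "aio_completion",
    # Holds deferred_lock; wants deferred_submit_lock (held by thread_a).
    "thread_b": "thread_a",
    # Wants deferred_lock (held by thread_b) before it can drain the queue.
    "aio_completion": "thread_b",
}


def find_cycle(waits_for: Dict[str, str], start: str) -> Optional[List[str]]:
    """Follow wait-for edges from `start`; return the cycle if one is reached."""
    seen: List[str] = []
    node: Optional[str] = start
    while node is not None and node not in seen:
        seen.append(node)
        node = waits_for.get(node)
    if node is None:
        return None  # chain ended at a party that is not blocked
    return seen[seen.index(node):]


if __name__ == "__main__":
    print(find_cycle(WAITS_FOR, "thread_a"))
```

Removing any one edge (for example, letting the completion handler make progress without `deferred_lock`) breaks the cycle in this model, so `find_cycle` returns `None`.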
Updated by Sage Weil over 6 years ago
- Status changed from 12 to Fix Under Review
Updated by Joao Eduardo Luis over 6 years ago
Sage, is there an identifiable behavior when this happens? Do the OSDs die, or is IO simply blocked forever?
Updated by Sage Weil over 6 years ago
- Related to Bug #21246: bluestore: hang while replaying deferred ios from journal added
Updated by Sage Weil over 6 years ago
- Related to Bug #21180: Bluestore throttler causes down OSD added
Updated by Sage Weil over 6 years ago
There was also an aio submission bug that dropped IOs on the floor. It was consistently reproducible with
make ceph_test_objectstore && rm -rf bluestore*test*dir c && CEPH_ARGS="--log-file c --no-log-to-stderr --debug-bluestore 20 --debug-bdev 20 --bdev-debug-aio --bdev-aio-max-queue-depth 16 --bluestore-cache-trim-interval .05" bin/ceph_test_objectstore --gtest_filter=*Syn*/2 --gtest_filter=ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompression/2
on an NVMe device. That bug is also fixed by the PR.
Updated by Sage Weil over 6 years ago
- Status changed from Fix Under Review to Pending Backport
https://github.com/ceph/ceph/pull/17601 is the backport.
Updated by Sage Weil over 6 years ago
- Related to Bug #21314: Ceph OSDs crashing in BlueStore::queue_transactions() using EC added
Updated by Nathan Cutler over 6 years ago
- Copied to Backport #21325: luminous: bluestore: aio submission deadlock added
Updated by Sage Weil over 6 years ago
- Status changed from Pending Backport to Resolved
Updated by Sage Weil over 6 years ago
- Related to Bug #19511: bluestore overwhelms aio queue added
Updated by Bob Bobington over 6 years ago
Since my issue (http://tracker.ceph.com/issues/21314) was marked as a dupe of this one and I haven't received a response to the updates on that issue in a week, I thought I'd add a note here as well: the fixes given haven't led to any improvement for me. I still consistently see the same problems.
I've tried applying this fix as well as adding some of the suggested workarounds, but my OSDs still crash with the same messages.
Updated by Sage Weil over 6 years ago
- Related to Bug #20222: v12.0.3 Luminous bluestore 'tp_osd_tp thread tp_osd_tp' had timed out after 60 added
Updated by Sage Weil over 6 years ago
- Related to Bug #21475: 12.2.0 bluestore - OSD down/crash " internal heartbeat not healthy, dropping ping request added