Bug #56549
ImageWatcher race condition
Description
ImageWatcher hits a race condition when a snap_remove is processed while a notify request for the exclusive lock is pending.
1. RBD mirror detects a newer snapshot and queues the removal of the old snapshot as operation #1 (exclusive lock).
2. The current lock owner doesn't support snap removal.
3. Another operation, #2 (exclusive lock), is queued for the same image. This portion of the code checks is_lock_owner; the client is not the owner, so it continues down the code path that requests the exclusive lock.
4. Operation #1 is processed. It gets the exclusive lock and performs the snap removal.
5. Operation #2 is processed. The handler portion of the callback still expects not to be the lock owner, enforced by another is_lock_owner check. But because of the timing of step 4, it is now the lock owner, so ceph_assert fires and the process crashes.
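The interleaving in steps 1-5 is a check-then-act race: ownership is checked when an operation is queued, and the handler later acts on that stale answer. The following is a minimal single-threaded sketch of that ordering, not librbd code. FakeImage, run_scenario and the recheck_in_handler flag are hypothetical names introduced for illustration, and the "recheck" branch is only one conceivable way to tolerate the state change, not necessarily what the linked pull request does. Instead of aborting, the sketch records where the assert would fire.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an image's exclusive-lock state (not librbd).
struct FakeImage {
  bool lock_owner = false;
  bool is_lock_owner() const { return lock_owner; }
  void acquire_lock() {
    assert(!lock_owner);  // acquiring twice would be a bug in this sketch
    lock_owner = true;
  }
};

// Replays steps 1-5 and returns a log of what each operation did.
// recheck_in_handler == false models a handler that assumes it is still
// not the lock owner; true models a handler that tolerates the change.
std::vector<std::string> run_scenario(bool recheck_in_handler) {
  FakeImage image;
  std::queue<std::function<void()>> work_queue;
  std::vector<std::string> log;

  auto submit = [&](const std::string& name) {
    // First is_lock_owner check, made when the operation is queued.
    if (image.is_lock_owner()) {
      log.push_back(name + ": ran locally");
      return;
    }
    log.push_back(name + ": queued lock request");
    work_queue.push([&, name] {
      // Second is_lock_owner check, made later in the handler.
      if (image.is_lock_owner()) {
        if (recheck_in_handler) {
          log.push_back(name + ": already owner, ran locally");
        } else {
          log.push_back(name + ": stale non-owner assumption (assert would fire)");
        }
        return;
      }
      image.acquire_lock();
      log.push_back(name + ": acquired lock, ran");
    });
  };

  submit("op1");  // step 1: snap removal queued, not lock owner
  submit("op2");  // step 3: second operation queued, still not lock owner

  // Steps 4 and 5: handlers run in order; op1 changes ownership before op2.
  while (!work_queue.empty()) {
    auto task = std::move(work_queue.front());
    work_queue.pop();
    task();
  }
  return log;
}
```

Calling run_scenario(false) ends with op2 hitting the stale assumption, which corresponds to the crash in step 5. Calling run_scenario(true) ends with op2 noticing it already owns the lock and proceeding.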
Related issues
History
#1 Updated by Christopher Hoffman about 1 year ago
- Pull request ID set to 47116
#2 Updated by Christopher Hoffman about 1 year ago
- Status changed from In Progress to Fix Under Review
#3 Updated by Ilya Dryomov about 1 year ago
- Backport set to octopus,pacific,quincy
#4 Updated by Ilya Dryomov about 1 year ago
- Status changed from Fix Under Review to Pending Backport
#5 Updated by Backport Bot about 1 year ago
- Copied to Backport #56617: octopus: ImageWatcher race condition added
#6 Updated by Backport Bot about 1 year ago
- Copied to Backport #56618: pacific: ImageWatcher race condition added
#7 Updated by Backport Bot about 1 year ago
- Copied to Backport #56619: quincy: ImageWatcher race condition added
#8 Updated by Backport Bot about 1 year ago
- Tags set to backport_processed
#9 Updated by Christopher Hoffman 8 months ago
- Status changed from Pending Backport to Resolved