Bug #49226
closed
librbd: refuse to release exclusive lock when removing
Description
Commit [1] changed PreRemoveRequest to request exclusive lock from the peer instead of giving up and proceeding without exclusive lock. This caused one of the test cases, which sometimes runs concurrent "rbd rm" commands against the same image, to fail intermittently, most often on this assert:
template <typename I>
class C_RemoveObject : public C_AsyncObjectThrottle<I> {
public:
  C_RemoveObject(AsyncObjectThrottle<I> &throttle, I *image_ctx,
                 uint64_t object_no)
    : C_AsyncObjectThrottle<I>(throttle, *image_ctx), m_object_no(object_no) {
  }

  int send() override {
    I &image_ctx = this->m_image_ctx;
    ceph_assert(ceph_mutex_is_locked(image_ctx.owner_lock));
    ceph_assert(image_ctx.exclusive_lock == nullptr ||
                image_ctx.exclusive_lock->is_lock_owner());  <----

    {
      std::shared_lock image_locker{image_ctx.image_lock};
      if (image_ctx.object_map != nullptr &&
          !image_ctx.object_map->object_may_exist(m_object_no)) {
        return 1;
      }
    }
because the exclusive lock is now automatically transitioned to another "rbd rm" at its request.
The root cause is older and probably goes back to when the synchronous librbd::remove(), which held owner_lock across all operations including trim_image(), was converted to a set of state machines, starting in 2017 [2]. Since then, any peer that requests exclusive lock (instead of trying once and backing off) has been able to interfere with image removal.
[1] https://github.com/ceph/ceph/commit/25c2ffe145becf6e32dd88682673f9761ee62fa8
[2] https://github.com/ceph/ceph/commit/10a012f1dee91b781d85be5b5121b473e5e257ef
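The race described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyExclusiveLock, set_refuse_release(), request_lock() and simulate_removal() are hypothetical names invented for illustration and are not librbd classes or functions. It shows why a multi-step removal loses its "I am the lock owner" invariant when a peer's lock request is honored midway, and how refusing to release the lock for the duration of the removal keeps the invariant intact.

```cpp
#include <mutex>
#include <string>
#include <utility>

// Toy model only: none of these types or methods exist in librbd.
class ToyExclusiveLock {
public:
  explicit ToyExclusiveLock(std::string owner) : m_owner(std::move(owner)) {}

  // The owner opts in to refusing peer requests, e.g. while removing an image.
  void set_refuse_release(bool refuse) {
    std::lock_guard<std::mutex> locker(m_mutex);
    m_refuse_release = refuse;
  }

  // A peer asks for the lock. Returns true if ownership was handed over.
  bool request_lock(const std::string &peer) {
    std::lock_guard<std::mutex> locker(m_mutex);
    if (m_refuse_release) {
      return false;  // owner keeps the lock; the peer has to back off
    }
    m_owner = peer;  // automatic transition to the requesting peer
    return true;
  }

  bool is_lock_owner(const std::string &client) const {
    std::lock_guard<std::mutex> locker(m_mutex);
    return m_owner == client;
  }

private:
  mutable std::mutex m_mutex;
  std::string m_owner;
  bool m_refuse_release = false;
};

// Simulates a removal made of several steps performed by client "rm-A".
// A second client, "rm-B", requests the lock just before step 1.
// Returns the number of steps during which "rm-A" still owned the lock.
int simulate_removal(bool refuse_release, int steps) {
  ToyExclusiveLock lock("rm-A");
  lock.set_refuse_release(refuse_release);

  int owned_steps = 0;
  for (int i = 0; i < steps; ++i) {
    if (i == 1) {
      lock.request_lock("rm-B");  // concurrent "rbd rm" asks for the lock
    }
    if (lock.is_lock_owner("rm-A")) {
      ++owned_steps;  // a real assert on ownership would pass at this step
    }
  }
  return owned_steps;
}
```

In this model, simulate_removal(false, 4) returns 1, because ownership moves to the peer before step 1 and every later ownership check fails, which corresponds to the failing assert. simulate_removal(true, 4) returns 4, because the request is refused and the remover stays the owner throughout.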
Updated by Ilya Dryomov about 3 years ago
- Status changed from New to Fix Under Review
- Pull request ID set to 39375
Updated by Jason Dillaman about 3 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Backport Bot about 3 years ago
- Copied to Backport #49257: octopus: librbd: refuse to release exclusive lock when removing added
Updated by Backport Bot about 3 years ago
- Copied to Backport #49258: nautilus: librbd: refuse to release exclusive lock when removing added
Updated by Ilya Dryomov about 2 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".