Bug #58673
Closed
When bucket index ops are cancelled it can leave behind zombie index entries
100%
Description
We discovered a significant number of extra bucket index entries for some of our buckets and found that these entries all pointed to objects which no longer existed. In our case, we traced this back to a scenario where a particular client commonly issues multiple simultaneous delete requests for the same object keys. The first racing delete request succeeds, but the second one results in an ECANCELED error due to a failed cmpxattr check [1] set by a prepare_atomic_modification call [2]. The ECANCELED error causes the index op to be canceled [3], but the OSD cls logic for index op cancellation doesn't remove the index entry, so the zombie index entry is never cleaned up. It looks like this could manifest in other scenarios as well: whenever an index op is canceled for an index entry that otherwise shouldn't exist and has no other pending modifications.
[1] https://github.com/ceph/ceph/blob/main/src/rgw/driver/rados/rgw_rados.cc#L5833
[2] https://github.com/ceph/ceph/blob/main/src/rgw/driver/rados/rgw_rados.cc#L5254
[3] https://github.com/ceph/ceph/blob/main/src/rgw/driver/rados/rgw_rados.cc#L5293
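
To make the sequence above easier to follow, here is a minimal toy model of the race in Python. It is not Ceph code: ToyBucketIndex, IndexEntry, race_two_deletes and the remove_on_cancel flag are all hypothetical names invented for illustration, and the model assumes that a cancel writes back an entry with no pending ops even when the entry had already been removed. The remove_on_cancel branch is only one possible way to avoid the leftover entry, not necessarily how it should be fixed in cls.

```python
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    # Hypothetical stand-in for a bucket index entry.
    exists: bool = False                        # True while the object is really stored
    pending: set = field(default_factory=set)   # tags of in-flight index ops


class ToyBucketIndex:
    """Toy model of prepare / complete / cancel on a bucket index (assumption, not Ceph)."""

    def __init__(self, remove_on_cancel=False):
        self.entries = {}
        self.remove_on_cancel = remove_on_cancel

    def prepare(self, key, tag):
        # Register an in-flight op, creating a placeholder entry if needed.
        self.entries.setdefault(key, IndexEntry()).pending.add(tag)

    def complete_delete(self, key, tag):
        # A successful delete removes the index entry.
        self.entries.pop(key, None)

    def cancel(self, key, tag):
        # Assumed behaviour: cancel drops the pending tag and writes the entry back.
        entry = self.entries.setdefault(key, IndexEntry())
        entry.pending.discard(tag)
        # Possible mitigation: drop entries that do not exist and have no pending ops.
        if self.remove_on_cancel and not entry.exists and not entry.pending:
            del self.entries[key]


def race_two_deletes(index, key="obj"):
    """Replay two racing deletes of the same key and return the leftover zombie keys."""
    index.entries[key] = IndexEntry(exists=True)
    index.prepare(key, "A")
    index.prepare(key, "B")
    index.complete_delete(key, "A")   # first delete wins and removes the entry
    index.cancel(key, "B")            # second delete hit ECANCELED and is canceled
    return [k for k, e in index.entries.items() if not e.exists and not e.pending]


if __name__ == "__main__":
    print("zombies without cleanup:", race_two_deletes(ToyBucketIndex()))
    print("zombies with cleanup:   ", race_two_deletes(ToyBucketIndex(remove_on_cancel=True)))
```

In this model the first run leaves an entry for "obj" that points at nothing and has no pending modifications, which matches the zombie entries described above; the second run leaves none.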