Bug #58673: When bucket index ops are cancelled it can leave behind zombie index entries - rgw - Ceph

Bug #58673

closed

When bucket index ops are cancelled it can leave behind zombie index entries

Added by Cory Snyder about 1 year ago. Updated 9 months ago.

Status: Resolved
Priority: High
Assignee: Casey Bodley
Target version:
% Done: 100%
Source:
Tags: cls_rgw backport_processed
Backport: quincy,pacific
Regression:
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID: 50041
Crash signature (v1):
Crash signature (v2):

Description

We discovered a significant number of extra bucket index entries in some of our buckets and found that all of these entries pointed to objects that no longer existed. In our case, we traced this back to a client that commonly issues multiple simultaneous delete requests for the same object keys. The first racing delete request succeeds, but the second one fails with an ECANCELED error due to a failed cmpxattr check [1] set up by a prepare_atomic_modification call [2]. The ECANCELED error causes the index op to be cancelled [3], but the OSD cls logic for index op cancellation doesn't remove the index entry, so the zombie index entry is never cleaned up. It looks like this could manifest in other scenarios as well: whenever an index op is cancelled for an index entry that otherwise shouldn't exist and has no other pending modifications.

[1] https://github.com/ceph/ceph/blob/main/src/rgw/driver/rados/rgw_rados.cc#L5833
[2] https://github.com/ceph/ceph/blob/main/src/rgw/driver/rados/rgw_rados.cc#L5254
[3] https://github.com/ceph/ceph/blob/main/src/rgw/driver/rados/rgw_rados.cc#L5293
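
To make the sequence above easier to follow, here is a minimal toy model in Python. It is not the cls_rgw code: the names ToyBucketIndex, IndexEntry, remove_on_cancel and racing_deletes are hypothetical, and the bookkeeping (an "exists" flag plus a set of pending op tags per entry) is a simplifying assumption. The remove_on_cancel switch only illustrates one possible remedy and is not claimed to be what the linked pull request implements.

```python
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    # Hypothetical, simplified stand-in for a bucket index entry.
    exists: bool = False                        # object is believed to exist
    pending: set = field(default_factory=set)   # tags of in-flight index ops


class ToyBucketIndex:
    """Toy model of prepare / complete / cancel index ops (an assumption)."""

    def __init__(self, remove_on_cancel):
        # remove_on_cancel=False mimics the behaviour described in the report.
        self.remove_on_cancel = remove_on_cancel
        self.entries = {}

    def prepare(self, key, tag):
        # Record a pending op, creating the entry if needed.
        self.entries.setdefault(key, IndexEntry()).pending.add(tag)

    def complete_put(self, key, tag):
        entry = self.entries.setdefault(key, IndexEntry())
        entry.pending.discard(tag)
        entry.exists = True

    def complete_delete(self, key, tag):
        entry = self.entries.get(key)
        if entry is None:
            return
        entry.pending.discard(tag)
        entry.exists = False
        # The entry is dropped only when no other op is still pending.
        if not entry.pending:
            del self.entries[key]

    def cancel(self, key, tag):
        entry = self.entries.get(key)
        if entry is None:
            return
        entry.pending.discard(tag)
        # Without this branch the entry stays behind even though the
        # object is gone and nothing else is pending.
        if self.remove_on_cancel and not entry.exists and not entry.pending:
            del self.entries[key]

    def zombies(self):
        # Entries for non-existent objects with no pending modifications.
        return sorted(k for k, e in self.entries.items()
                      if not e.exists and not e.pending)


def racing_deletes(index, key="obj"):
    # Create the object, then issue two simultaneous deletes for it.
    index.prepare(key, "put")
    index.complete_put(key, "put")
    index.prepare(key, "del-1")
    index.prepare(key, "del-2")
    index.complete_delete(key, "del-1")  # first racing delete succeeds
    index.cancel(key, "del-2")           # second gets ECANCELED, op is cancelled
    return index.zombies()


print(racing_deletes(ToyBucketIndex(remove_on_cancel=False)))  # ['obj']
print(racing_deletes(ToyBucketIndex(remove_on_cancel=True)))   # []
```

In this model the first delete cannot drop the entry because the second delete is still pending, and the cancellation only clears the pending tag, which is how the entry ends up orphaned.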

Related issues 3 (0 open — 3 closed)

Updated by Casey Bodley about 1 year ago

Updated by J. Eric Ivancich about 1 year ago

Updated by Backport Bot about 1 year ago

Updated by Backport Bot about 1 year ago

Updated by Backport Bot about 1 year ago

Updated by Cory Snyder 11 months ago

Updated by Konstantin Shalygin 9 months ago