Bug #10739: Lock contention on rgw bucket index object (ondisk_read/write_lock) stuck op threads - Ceph - Ceph

Added by Guang Yang about 9 years ago. Updated about 9 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: OSD
Target version:
% Done:
Source: Community (dev)
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

In our production cluster (with rgw), we came across a problem where all rgw processes got stuck: all worker threads were stuck waiting for responses from an OSD, and rgw started returning 500 responses to clients. Dumping objecter_requests (on rgw) showed that the slow in-flight ops were all caused by one OSD. That OSD had 2 PGs backfilling and held 2 bucket index objects.

On the OSD side, we have 8 op threads. It turned out that when the problem occurred, several (if not all) of the op threads took seconds (even tens of seconds) to handle bucket index ops. As a result, the throughput of the op threads dropped, and the slowdown cascaded to the entire cluster.

An op takes the following locks during its lifetime (omitting those not related to this bug):

1. Take the PG lock (blocking) [mutex] #op_thread#
   - Released upon finishing handling the op #op_thread#
2. Try to take the write lock of the obc (object context, one per object) (non-blocking) [multiple readers or multiple writers, but not intermixed] #op_thread#
   - Released upon all_applied and all_committed #another_op_thread#
3. Take the obc’s ondisk_read_lock (blocking) [multiple readers or multiple writers, but not intermixed] #op_thread#
   - Usually this is blocked by the ondisk_write_lock
4. Prepare the transaction #op_thread#
5. Release the obc’s ondisk_read_lock #op_thread#
6. Take the obc’s ondisk_write_lock (blocking) [multiple readers or multiple writers, but not intermixed] #op_thread#
   - Released by the filestore op thread upon finishing applying the transaction #filestore_op_thread#
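
To make the "multiple readers or multiple writers, but not intermixed" semantics concrete, here is a minimal Python sketch of such a lock. It is only a toy model, not the Ceph implementation: the class name SharedModeLock and its methods are hypothetical, and the demo at the bottom just imitates a slow filestore holding the write side while an op thread waits on the read side.

```python
import threading
import time


class SharedModeLock:
    """Toy model (hypothetical, not Ceph code): any number of holders may
    share the lock in one mode ("read" or "write"), but never both modes."""

    def __init__(self):
        self._cond = threading.Condition()
        self._mode = None   # mode of the current holders, or None if free
        self._count = 0     # number of current holders

    def _available(self, mode):
        return self._count == 0 or self._mode == mode

    def try_acquire(self, mode):
        """Non-blocking attempt; returns False instead of waiting."""
        with self._cond:
            if not self._available(mode):
                return False
            self._mode = mode
            self._count += 1
            return True

    def acquire(self, mode, timeout=None):
        """Blocking acquire; returns False only if the timeout expires."""
        with self._cond:
            if not self._cond.wait_for(lambda: self._available(mode), timeout):
                return False
            self._mode = mode
            self._count += 1
            return True

    def release(self):
        with self._cond:
            self._count -= 1
            if self._count == 0:
                self._mode = None
                self._cond.notify_all()


if __name__ == "__main__":
    ondisk = SharedModeLock()
    ondisk.acquire("write")                        # step 6: transaction queued
    threading.Timer(0.2, ondisk.release).start()   # slow filestore applies it later
    start = time.monotonic()
    ondisk.acquire("read")                         # step 3 of the next op blocks here
    print(f"op thread blocked for {time.monotonic() - start:.2f}s")
    ondisk.release()
```

In this model the op thread is stalled for as long as the simulated filestore keeps the write side held, which is the behaviour described for steps 3 and 6.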

When the filestore gets slow (for whatever reason, usually during backfilling), the ondisk_write_lock is held longer, and the op thread is blocked.

I am wondering if we can check the availability of ondisk_read_lock at step 2?
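
A rough sketch of that idea, written as a small Python model rather than actual OSD code: the op checks whether the ondisk read side is available without blocking and, if it is not, gets parked on a per-object waiting list and requeued once the filestore finishes applying. All names here (ObjectContextModel, try_start_op, submit_to_filestore, finish_apply) are hypothetical and only illustrate the control flow.

```python
from collections import deque


class ObjectContextModel:
    """Hypothetical per-object state; not the real ObjectContext."""

    def __init__(self):
        self.pending_writes = 0   # transactions queued to the filestore, not yet applied
        self.waiting = deque()    # ops parked until the pending writes drain

    def ondisk_read_available(self):
        # The read side is only available when no write is still being applied.
        return self.pending_writes == 0


def try_start_op(obc, op):
    """Non-blocking check: return True if the op may proceed now,
    otherwise park it on the object's waiting list and return False."""
    if not obc.ondisk_read_available():
        obc.waiting.append(op)
        return False
    return True


def submit_to_filestore(obc):
    """The op thread hands the prepared transaction to the filestore."""
    obc.pending_writes += 1


def finish_apply(obc):
    """Called when the filestore has applied one transaction.
    Returns the ops that should be requeued, in arrival order."""
    obc.pending_writes -= 1
    if obc.pending_writes > 0:
        return []
    ready = list(obc.waiting)
    obc.waiting.clear()
    return ready
```

With this shape the op thread never sleeps on the ondisk lock; it returns to its queue immediately and can serve ops for other objects while the filestore catches up.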

Updated by Guang Yang about 9 years ago

Guang Yang wrote:

I am wondering if we can check the availability of ondisk_read_lock at step 2?

Pull request - https://github.com/ceph/ceph/pull/3610

Updated by Samuel Just about 9 years ago

Recent changes already merged for hammer should prevent blocking the thread on the ondisk_read_lock by expanding the ObjectContext::rwstate lists mostly as you suggested.
