Bug #10739 (closed)
Lock contention on rgw bucket index object (ondisk_read/write_lock) stuck op threads
Description
In our production cluster (with rgw), we hit a problem where all rgw processes got stuck: every worker thread was waiting for a response from an OSD, and rgw started returning 500 responses to clients. Dumping objecter_requests on rgw showed that the slow in-flight ops were all caused by one OSD. That OSD had 2 PGs backfilling and hosted 2 bucket index objects.
On the OSD side we have 8 op threads. When the problem occurred, several (if not all) of them were taking seconds (even tens of seconds) to handle a bucket index op. As a result, the throughput of the op threads dropped, and the slowdown cascaded to the entire cluster.
The op takes the following locks during its lifetime (omitting those not related to this bug):
1. Take PG lock (blocking) [mutex] #op_thread#
   Released upon finishing handling the op #op_thread#
2. Try to take the obc (object context, per object basis) write lock (non-blocking) [multiple-readers or multiple-writers but not intermixed] #op_thread#
   Released upon all_applied and all_committed #another_op_thread#
3. Take obc's ondisk_read_lock (blocking) [multiple-readers or multiple-writers but not intermixed] #op_thread#
   Usually this is blocked by ondisk_write_lock
4. Prepare the transaction #op_thread#
5. Release obc's ondisk_read_lock #op_thread#
6. Take obc's ondisk_write_lock (blocking) [multiple-readers or multiple-writers but not intermixed] #op_thread#
   Released by the filestore op thread upon finishing applying the transaction #filestore_op_thread#
When the filestore gets slow (for whatever reason, usually during backfilling), the ondisk_write_lock is held longer, and any op thread waiting for the ondisk_read_lock on the same object is blocked.
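To make the blocking behaviour concrete, here is a minimal sketch of a lock with the "multiple-readers or multiple-writers but not intermixed" semantics described above. The class `OndiskLock` and all of its member names are hypothetical and are not taken from the Ceph source; it only models the waiting rules, assuming a plain mutex and condition variable.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical model of the per-object ondisk lock (not Ceph's code):
// any number of readers OR any number of writers, never both at once.
class OndiskLock {
public:
  // Blocks the calling op thread while any writer holds the lock.
  void read_lock() {
    std::unique_lock<std::mutex> l(m_);
    cv_.wait(l, [this] { return writers_ == 0; });
    ++readers_;
  }

  void read_unlock() {
    std::lock_guard<std::mutex> l(m_);
    assert(readers_ > 0);
    if (--readers_ == 0) cv_.notify_all();
  }

  // Blocks while any reader holds the lock; writers may stack up.
  void write_lock() {
    std::unique_lock<std::mutex> l(m_);
    cv_.wait(l, [this] { return readers_ == 0; });
    ++writers_;
  }

  // In the scenario above this is called by the filestore thread once the
  // transaction has been applied, so a slow filestore delays it.
  void write_unlock() {
    std::lock_guard<std::mutex> l(m_);
    assert(writers_ > 0);
    if (--writers_ == 0) cv_.notify_all();
  }

  // True if read_lock() would park the caller right now.
  bool read_would_block() {
    std::lock_guard<std::mutex> l(m_);
    return writers_ > 0;
  }

private:
  std::mutex m_;
  std::condition_variable cv_;
  int readers_ = 0;
  int writers_ = 0;
};
```

In this model, an op thread that calls `read_lock()` while one or more writes are outstanding stays parked until the last `write_unlock()`. With only 8 op threads and a hot bucket index object, a few such waits are enough to stall the whole OSD.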
I am wondering if we can check the availability of ondisk_read_lock at step 2?
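A rough sketch of what this could look like, assuming the op can be parked on the object and re-dispatched later instead of blocking the op thread. `ObjectState`, `try_start_op`, `on_queued`, `on_applied` and the `Dispatch` enum are hypothetical names for illustration only and do not reflect the actual Ceph code or the linked fix.

```cpp
#include <cassert>
#include <deque>
#include <mutex>
#include <string>

// Hypothetical per-object state; names are illustrative, not Ceph's.
struct ObjectState {
  std::mutex m;
  int unstable_writes = 0;          // queued to the filestore, not yet applied
  std::deque<std::string> waiting;  // ops parked until the object is readable
};

enum class Dispatch { Proceed, Requeued };

// Step 2, extended: besides the non-blocking obc lock attempt, also check
// whether ondisk_read_lock could be taken without waiting. If not, park the
// op and return immediately so the op thread can serve other objects.
Dispatch try_start_op(ObjectState& obc, const std::string& op) {
  std::lock_guard<std::mutex> l(obc.m);
  if (obc.unstable_writes > 0) {
    obc.waiting.push_back(op);
    return Dispatch::Requeued;
  }
  return Dispatch::Proceed;
}

// Called when a transaction is handed to the filestore (step 6).
void on_queued(ObjectState& obc) {
  std::lock_guard<std::mutex> l(obc.m);
  ++obc.unstable_writes;
}

// Called by the filestore thread once the transaction is applied.
// Returns the parked ops that can now be re-dispatched.
std::deque<std::string> on_applied(ObjectState& obc) {
  std::lock_guard<std::mutex> l(obc.m);
  assert(obc.unstable_writes > 0);
  std::deque<std::string> ready;
  if (--obc.unstable_writes == 0) ready.swap(obc.waiting);
  return ready;
}
```

The point of the design is that a slow filestore then only delays ops on the affected object, rather than tying up op threads that could be handling other PGs.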
Updated by Guang Yang about 9 years ago
Guang Yang wrote:
I am wondering if we can check the availability of ondisk_read_lock at step 2?
Pull request - https://github.com/ceph/ceph/pull/3610
Updated by Samuel Just about 9 years ago
Recent changes already merged for hammer should prevent blocking the thread on the ondisk_read_lock by expanding the ObjectContext::rwstate lists mostly as you suggested.
Updated by Guang Yang about 9 years ago
- Status changed from New to Duplicate
This scenario has been fixed by Sam's commit - a81f3e6e61abfc7eca7743a83bf4af810705b449