Bug #36473
closed
hung osd_repop, bluestore committed but failed to trigger repop_commit
Added by Sage Weil over 5 years ago.
Updated over 3 years ago.
Description
/a/sage-2018-10-16_18:31:27-rados-wip-sage-testing-2018-10-16-0724-distro-basic-smithi/3148851
Usually, after the bluestore commit succeeds, the op wq wakes up and we see something like:
2018-10-16 20:57:10.468 7f26725c0700 10 osd.2 pg_epoch: 45 pg[1.6( v 45'14646 (45'14623,45'14646] local-lis/les=29/30 n=12628 ec=9/9 lis/c 29/29 les/c/f 30/30/0 26/29/29) [1,2] r=1 lpr=29 luod=0'0 lua=25'12628 crt=45'14646 lcod 45'14632 active mbc={}] repop_commit on op osd_repop(osd.1.0:64467 1.6 e45/29 1:64f1278e:::benchmark_data_smithi176_34325_object9600:head v 45'14634) v2, sending commit to osd.1
2018-10-16 20:57:10.468 7f26725c0700 20 osd.2 45 share_map_peer 0x55f95b2c5080 already has epoch 45
2018-10-16 20:57:10.468 7f26725c0700 1 -- 172.21.15.176:6806/31292 --> 172.21.15.176:6802/39730 -- osd_repop_reply(osd.1.0:64467 1.6 e45/29 ondisk, result = 0) v2 -- ?+0 0x55f97436a780 con 0x55f95b2c5080
but for this op it didn't:
2018-10-16 20:57:41.245 7f268c5f4700 20 slow op osd_repop(osd.1.0:64485 1.6 e45/29 1:64f950b4:::benchmark_data_smithi176_38726_object47030:head v 45'14652) initiated 2018-10-16 20:57:10.470116
2018-10-16 20:57:41.245 7f268c5f4700 -1 osd.2 45 get_health_metrics reporting 1 slow ops, oldest is osd_repop(osd.1.0:64485 1.6 e45/29 1:64f950b4:::benchmark_data_smithi176_38726_object47030:head v 45'14652)
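For context, the wakeup path that failed here works roughly as follows: bluestore's kv-finalize thread runs the transaction's on-commit contexts, the completion is queued onto the op shard's context_queue, and the shard's condition variable is signaled so a worker thread can run repop_commit. A simplified sketch of that producer side, with illustrative names rather than the actual Ceph types:

#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>

// Illustrative op shard: completion contexts plus the condvar used to
// wake sleeping worker threads. Not the actual Ceph types.
struct OpShard {
  std::mutex sdata_wait_lock;    // guards the sleep/wake handshake
  std::condition_variable sdata_cond;
  std::mutex ctx_lock;           // guards context_queue
  std::deque<std::function<void()>> context_queue;

  // Called from the store's commit completion: queue the context,
  // then take sdata_wait_lock and signal a sleeping worker.
  void queue_context(std::function<void()> c) {
    {
      std::lock_guard<std::mutex> l(ctx_lock);
      context_queue.push_back(std::move(c));
    }
    std::lock_guard<std::mutex> wl(sdata_wait_lock);
    sdata_cond.notify_one();
  }
};

The consumer side of this handshake is the _process loop quoted below, and that is where the signal appears to get lost.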
Maybe a race condition on the queue wakeup?
2018-10-16 20:57:10.472 7f26815de700 20 bluestore(/var/lib/ceph/osd/ceph-2) _kv_finalize_thread kv_committed <0x55f976e3d200>
2018-10-16 20:57:10.472 7f26815de700 20 bluestore(/var/lib/ceph/osd/ceph-2) _kv_finalize_thread deferred_stable <>
2018-10-16 20:57:10.472 7f26815de700 10 bluestore(/var/lib/ceph/osd/ceph-2) _txc_state_proc txc 0x55f976e3d200 kv_submitted
2018-10-16 20:57:10.472 7f26815de700 20 bluestore(/var/lib/ceph/osd/ceph-2) _txc_committed_kv txc 0x55f976e3d200
2018-10-16 20:57:10.472 7f26815de700 10 bluestore(/var/lib/ceph/osd/ceph-2) _txc_state_proc txc 0x55f976e3d200 finishing
2018-10-16 20:57:10.472 7f26815de700 20 bluestore(/var/lib/ceph/osd/ceph-2) _txc_finish 0x55f976e3d200 onodes 0x55f975723440
2018-10-16 20:57:10.472 7f26815de700 20 bluestore(/var/lib/ceph/osd/ceph-2) _txc_finish txc 0x55f976e3d200 done
2018-10-16 20:57:10.472 7f26815de700 20 bluestore(/var/lib/ceph/osd/ceph-2) _txc_finish osr 0x55f95b3ab8c0 q now empty
2018-10-16 20:57:10.472 7f26815de700 10 bluestore(/var/lib/ceph/osd/ceph-2) _txc_release_alloc(sync) 0x55f976e3d200 []
2018-10-16 20:57:10.472 7f26815de700 10 fbmap_alloc 0x55f95a18ea00 release done
2018-10-16 20:57:10.472 7f26815de700 20 bluestore(/var/lib/ceph/osd/ceph-2) _kv_finalize_thread sleep
2018-10-16 20:57:10.472 7f26725c0700 20 osd.2 op_wq(6) _process empty q, waiting
2018-10-16 20:57:10.472 7f266e5b8700 20 osd.2 op_wq(6) _process empty q, waiting
So... how could the _process threads not notice that bluestore queued something? Here is the OSD code:
sdata->shard_lock.Lock();
if (sdata->pqueue->empty() &&
    !(is_smallest_thread_index && !sdata->context_queue.empty())) {
  sdata->sdata_wait_lock.Lock();
  if (!sdata->stop_waiting) {
    dout(20) << __func__ << " empty q, waiting" << dendl;
What if:
- thread A: the code above checks whether the queue is empty, and it is
- thread B: bluestore queues the finisher context, takes sdata_wait_lock, and signals
- thread A: takes sdata_wait_lock and waits, having already missed the signal
Maybe we need to re-test for empty after taking sdata_wait_lock (see the sketch below)?
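A minimal standalone sketch of that lost wakeup and the proposed re-test, using std::mutex and std::condition_variable rather than Ceph's Mutex wrappers; names are illustrative, and this is not the actual code:

#include <condition_variable>
#include <deque>
#include <mutex>

// Minimal model of the suspected race; illustrative names only.
struct Shard {
  std::mutex shard_lock;         // protects the work queue
  std::deque<int> q;
  std::mutex sdata_wait_lock;    // guards the sleep/wake handshake
  std::condition_variable sdata_cond;
};

bool queue_empty(Shard& s) {
  std::lock_guard<std::mutex> l(s.shard_lock);
  return s.q.empty();
}

// Thread B: queue work, then take sdata_wait_lock and signal.
void enqueue(Shard& s, int item) {
  {
    std::lock_guard<std::mutex> l(s.shard_lock);
    s.q.push_back(item);
  }
  std::lock_guard<std::mutex> wl(s.sdata_wait_lock);
  s.sdata_cond.notify_all();
}

// Thread A. The buggy pattern tested emptiness before taking
// sdata_wait_lock; a notify landing between that test and the wait()
// was lost, leaving the thread asleep with work queued. Re-testing
// after taking sdata_wait_lock (and again on every wakeup) closes the
// window, because the producer only signals while holding that lock.
void wait_for_work(Shard& s) {
  std::unique_lock<std::mutex> wl(s.sdata_wait_lock);
  while (queue_empty(s))         // re-test for empty under the wait lock
    s.sdata_cond.wait(wl);
}

Presumably the real re-test would also have to cover context_queue, since that is what the bluestore completion feeds (it already appears in the empty-check quoted above).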
- Status changed to Resolved
- Has duplicate Bug #36284: Bluestore might be hanging OSD added
This race condition was introduced in Mimic (pr#22739 + pr#24269) and fixed in Nautilus (pr#24761).
Was this evaluated for a backport?
- Status changed from Resolved to Pending Backport
- Backport set to mimic
Tagging for Mimic backport consideration.
- Pull request ID set to 24761
- Copied to Backport #45025: mimic: hung osd_repop, bluestore committed but failed to trigger repop_commit added
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".