Project

General

Profile

Bug #36473

hung osd_repop, bluestore committed but failed to trigger repop_commit

Added by Sage Weil over 5 years ago. Updated about 3 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
mimic
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/sage-2018-10-16_18:31:27-rados-wip-sage-testing-2018-10-16-0724-distro-basic-smithi/3148851

Usually after the bluestre commit succeeds the op wq wakes up and calls something like

2018-10-16 20:57:10.468 7f26725c0700 10 osd.2 pg_epoch: 45 pg[1.6( v 45'14646 (45'14623,45'14646] local-lis/les=29/30 n=12628 ec=9/9 lis/c 29/29 les/c/f 30/30/0 26/29/29) [1,2] r=1 lpr=29 luod=0'0 lua=25'12628 crt=45'14646 lcod 45'14632 active mbc={}] repop_commit on op osd_repop(osd.1.0:64467 1.6 e45/29 1:64f1278e:::benchmark_data_smithi176_34325_object9600:head v 45'14634) v2, sending commit to osd.1
2018-10-16 20:57:10.468 7f26725c0700 20 osd.2 45 share_map_peer 0x55f95b2c5080 already has epoch 45
2018-10-16 20:57:10.468 7f26725c0700  1 -- 172.21.15.176:6806/31292 --> 172.21.15.176:6802/39730 -- osd_repop_reply(osd.1.0:64467 1.6 e45/29 ondisk, result = 0) v2 -- ?+0 0x55f97436a780 con 0x55f95b2c5080

but for this op it didn't:
2018-10-16 20:57:41.245 7f268c5f4700 20 slow op osd_repop(osd.1.0:64485 1.6 e45/29 1:64f950b4:::benchmark_data_smithi176_38726_object47030:head v 45'14652) initiated 2018-10-16 20:57:10.470116
2018-10-16 20:57:41.245 7f268c5f4700 -1 osd.2 45 get_health_metrics reporting 1 slow ops, oldest is osd_repop(osd.1.0:64485 1.6 e45/29 1:64f950b4:::benchmark_data_smithi176_38726_object47030:head v 45'14652)

maybe a race condition on the queue wakeup?


Related issues

Duplicated by bluestore - Bug #36284: Bluestore might be hanging OSD Duplicate 10/01/2018
Copied to RADOS - Backport #45025: mimic: hung osd_repop, bluestore committed but failed to trigger repop_commit Rejected

History

#1 Updated by Sage Weil over 5 years ago

2018-10-16 20:57:10.472 7f26815de700 20 bluestore(/var/lib/ceph/osd/ceph-2) _kv_finalize_thread kv_committed <0x55f976e3d200>
2018-10-16 20:57:10.472 7f26815de700 20 bluestore(/var/lib/ceph/osd/ceph-2) _kv_finalize_thread deferred_stable <>
2018-10-16 20:57:10.472 7f26815de700 10 bluestore(/var/lib/ceph/osd/ceph-2) _txc_state_proc txc 0x55f976e3d200 kv_submitted
2018-10-16 20:57:10.472 7f26815de700 20 bluestore(/var/lib/ceph/osd/ceph-2) _txc_committed_kv txc 0x55f976e3d200
2018-10-16 20:57:10.472 7f26815de700 10 bluestore(/var/lib/ceph/osd/ceph-2) _txc_state_proc txc 0x55f976e3d200 finishing
2018-10-16 20:57:10.472 7f26815de700 20 bluestore(/var/lib/ceph/osd/ceph-2) _txc_finish 0x55f976e3d200 onodes 0x55f975723440
2018-10-16 20:57:10.472 7f26815de700 20 bluestore(/var/lib/ceph/osd/ceph-2) _txc_finish  txc 0x55f976e3d200 done
2018-10-16 20:57:10.472 7f26815de700 20 bluestore(/var/lib/ceph/osd/ceph-2) _txc_finish osr 0x55f95b3ab8c0 q now empty
2018-10-16 20:57:10.472 7f26815de700 10 bluestore(/var/lib/ceph/osd/ceph-2) _txc_release_alloc(sync) 0x55f976e3d200 []
2018-10-16 20:57:10.472 7f26815de700 10 fbmap_alloc 0x55f95a18ea00 release done
2018-10-16 20:57:10.472 7f26815de700 20 bluestore(/var/lib/ceph/osd/ceph-2) _kv_finalize_thread sleep
2018-10-16 20:57:10.472 7f26725c0700 20 osd.2 op_wq(6) _process empty q, waiting
2018-10-16 20:57:10.472 7f266e5b8700 20 osd.2 op_wq(6) _process empty q, waiting

So.. how could the _process threads not notice that bluestore queued something? Here is the OSD code:

  sdata->shard_lock.Lock();
  if (sdata->pqueue->empty() &&
     !(is_smallest_thread_index && !sdata->context_queue.empty())) {
    sdata->sdata_wait_lock.Lock();
    if (!sdata->stop_waiting) {
      dout(20) << __func__ << " empty q, waiting" << dendl;

what if
- thread A: above code checks for empty, and it is
- thread B: bluestore queues the finisher, takes wait_lock, signals
- thread A: takes wait_lock, waits

Maybe we need to retest for empty after taking wait_lock above?

#2 Updated by xie xingguo over 5 years ago

See https://github.com/ceph/ceph/pull/23317#issuecomment-423432234 (should be the same issue):

I also took a close look at the failure case at: http://qa-proxy.ceph.com/teuthology/xxg-2018-09-21_02:02:01-rados-wip-fix-polog-overtrim-distro-basic-smithi/3047956/, which revealed that bluestore were starting dropping callbacks(e.g., on_commit) from high-level callers.

#3 Updated by Sage Weil over 5 years ago

  • Status changed from 12 to Resolved

#4 Updated by Sage Weil over 5 years ago

  • Duplicated by Bug #36284: Bluestore might be hanging OSD added

#5 Updated by Dan Hill almost 4 years ago

Mimic (pr#22739 + pr#24269) introduced this race condition, which was fixed in Nautilus (pr#24761).

Was this evaluated for a backport?

#6 Updated by Dan Hill almost 4 years ago

  • Status changed from Resolved to Pending Backport
  • Backport set to mimic

Tagging for Mimic backport consideration.

#7 Updated by Nathan Cutler almost 4 years ago

  • Pull request ID set to 24761

24761

#8 Updated by Nathan Cutler almost 4 years ago

  • Copied to Backport #45025: mimic: hung osd_repop, bluestore committed but failed to trigger repop_commit added

#9 Updated by Nathan Cutler about 3 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".

Also available in: Atom PDF