Project

General

Profile

Bug #21480

bluestore: flush_commit is racy

Added by Sage Weil almost 2 years ago. Updated about 1 year ago.

Status:
Resolved
Priority:
Urgent
Assignee:
-
Target version:
-
Start date:
09/20/2017
Due date:
% Done:

0%

Source:
Tags:
Backport:
mimic,luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

observed hang on 'osd bench' command:
/a/yuriw-2017-09-19_19:54:13-rados-wip-yuri-testing3-2017-09-19-1710-distro-basic-smithi/1648854

2017-09-19 22:29:11.978724 7f70bd261700  1 heartbeat_map is_healthy 'OSD::command_tp thread 0x7f709a9e7700' had suicide timed out after 900
2017-09-19 22:29:11.983003 7f709a9e7700 -1 *** Caught signal (Aborted) **
 in thread 7f709a9e7700 thread_name:tp_osd_cmd

 ceph version 13.0.0-1010-g1c941a3 (1c941a39eaa824e91551e0b37ebcca96e0f6f174) mimic (dev)
 1: (()+0xa39e89) [0x7f70c29c3e89]
 2: (()+0x10330) [0x7f70c0699330]
 3: (pthread_cond_wait()+0xc4) [0x7f70c0695404]
 4: (C_SaferCond::wait()+0x8c) [0x7f70c24f741c]
 5: (OSD::do_command(Connection*, unsigned long, std::vector<std::string, std::allocator<std::string> >&, ceph::buffer::list&)+0x1b7d) [0x7f70c24e4d7d]
 6: (OSD::CommandWQ::_process(OSD::Command*, ThreadPool::TPHandle&)+0x49) [0x7f70c25291a9]
 7: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa6e) [0x7f70c2a06d4e]
 8: (ThreadPool::WorkThread::entry()+0x10) [0x7f70c2a07c30]

The flush_commit appears to be racy because it sets state to KV_DONE without holding the osr lock, but uses the lock for the flush_commit() waiter.

the possibly good news is that there are only a handful of users of flush_commit(); maybe we can just drop it.


Related issues

Copied to bluestore - Backport #24260: luminous: bluestore: flush_commit is racy Resolved
Copied to bluestore - Backport #24261: mimic: bluestore: flush_commit is racy Resolved

History

#1 Updated by Sage Weil almost 2 years ago

08-rados-wip-sage2-testing-2017-10-25-1347-distro-basic-smithi/1773520

#2 Updated by Sage Weil over 1 year ago

Best path is I think tor emove flush_commit and adjust callers to do something else.

#3 Updated by Sage Weil over 1 year ago

  • Project changed from RADOS to bluestore

#4 Updated by Sage Weil over 1 year ago

Hrm, this is used by peering,

void PG::reset_interval_flush()
{
  dout(10) << "Clearing blocked outgoing recovery messages" << dendl;
  recovery_state.clear_blocked_outgoing();

  Context *c = new QueuePeeringEvt<IntervalFlush>(
    this, get_osdmap()->get_epoch(), IntervalFlush());
  if (!ch->flush_commit(c)) {
    dout(10) << "Beginning to block outgoing recovery messages" << dendl;
    recovery_state.begin_block_outgoing();
  } else {
    dout(10) << "Not blocking outgoing recovery messages" << dendl;
    delete c;
  }
}

...and possibly leading to real hangs, see /a/sage-2018-05-17_13:38:58-rados-wip-sage2-testing-2018-05-17-0701-distro-basic-smithi/2543814
2018-05-17 16:32:47.642 7f5cf9646700 10 osd.4 pg_epoch: 447 pg[7.9( v 446'765 (431'400,446'765] local-lis/les=413/414 n=765 ec=406/406 lis/c 413/413 les/c/f 414/427/0 413/413/406) [4,6,5] r=0 lpr=447 luod=0'0 crt=446'765 lcod 446'762 mlcod 0'0 active mbc={}] Beginning to block outgoing recovery messages
...

and peering sticks because we never get a callback.

#5 Updated by Sage Weil over 1 year ago

  • Priority changed from High to Urgent

#6 Updated by Sage Weil over 1 year ago

  • Status changed from Verified to Need Review
  • Backport set to mimic,luminous

#7 Updated by Sage Weil about 1 year ago

  • Status changed from Need Review to Pending Backport

#8 Updated by Nathan Cutler about 1 year ago

  • Copied to Backport #24260: luminous: bluestore: flush_commit is racy added

#9 Updated by Nathan Cutler about 1 year ago

#10 Updated by Igor Fedotov about 1 year ago

  • Status changed from Pending Backport to Resolved

Also available in: Atom PDF