Bug #21480
bluestore: flush_commit is racy
Status:
Resolved
Priority:
Urgent
Assignee:
-
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
mimic,luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
observed hang on 'osd bench' command:
/a/yuriw-2017-09-19_19:54:13-rados-wip-yuri-testing3-2017-09-19-1710-distro-basic-smithi/1648854
2017-09-19 22:29:11.978724 7f70bd261700 1 heartbeat_map is_healthy 'OSD::command_tp thread 0x7f709a9e7700' had suicide timed out after 900 2017-09-19 22:29:11.983003 7f709a9e7700 -1 *** Caught signal (Aborted) ** in thread 7f709a9e7700 thread_name:tp_osd_cmd ceph version 13.0.0-1010-g1c941a3 (1c941a39eaa824e91551e0b37ebcca96e0f6f174) mimic (dev) 1: (()+0xa39e89) [0x7f70c29c3e89] 2: (()+0x10330) [0x7f70c0699330] 3: (pthread_cond_wait()+0xc4) [0x7f70c0695404] 4: (C_SaferCond::wait()+0x8c) [0x7f70c24f741c] 5: (OSD::do_command(Connection*, unsigned long, std::vector<std::string, std::allocator<std::string> >&, ceph::buffer::list&)+0x1b7d) [0x7f70c24e4d7d] 6: (OSD::CommandWQ::_process(OSD::Command*, ThreadPool::TPHandle&)+0x49) [0x7f70c25291a9] 7: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa6e) [0x7f70c2a06d4e] 8: (ThreadPool::WorkThread::entry()+0x10) [0x7f70c2a07c30]
The flush_commit appears to be racy because it sets state to KV_DONE without holding the osr lock, but uses the lock for the flush_commit() waiter.
the possibly good news is that there are only a handful of users of flush_commit(); maybe we can just drop it.
Related issues
History
#1 Updated by Sage Weil over 6 years ago
08-rados-wip-sage2-testing-2017-10-25-1347-distro-basic-smithi/1773520
#2 Updated by Sage Weil over 6 years ago
Best path is I think tor emove flush_commit and adjust callers to do something else.
#3 Updated by Sage Weil almost 6 years ago
- Project changed from RADOS to bluestore
#4 Updated by Sage Weil almost 6 years ago
Hrm, this is used by peering,
void PG::reset_interval_flush() { dout(10) << "Clearing blocked outgoing recovery messages" << dendl; recovery_state.clear_blocked_outgoing(); Context *c = new QueuePeeringEvt<IntervalFlush>( this, get_osdmap()->get_epoch(), IntervalFlush()); if (!ch->flush_commit(c)) { dout(10) << "Beginning to block outgoing recovery messages" << dendl; recovery_state.begin_block_outgoing(); } else { dout(10) << "Not blocking outgoing recovery messages" << dendl; delete c; } }
...and possibly leading to real hangs, see /a/sage-2018-05-17_13:38:58-rados-wip-sage2-testing-2018-05-17-0701-distro-basic-smithi/2543814
2018-05-17 16:32:47.642 7f5cf9646700 10 osd.4 pg_epoch: 447 pg[7.9( v 446'765 (431'400,446'765] local-lis/les=413/414 n=765 ec=406/406 lis/c 413/413 les/c/f 414/427/0 413/413/406) [4,6,5] r=0 lpr=447 luod=0'0 crt=446'765 lcod 446'762 mlcod 0'0 active mbc={}] Beginning to block outgoing recovery messages ...
and peering sticks because we never get a callback.
#5 Updated by Sage Weil almost 6 years ago
- Priority changed from High to Urgent
#6 Updated by Sage Weil almost 6 years ago
- Status changed from 12 to Fix Under Review
- Backport set to mimic,luminous
#7 Updated by Sage Weil almost 6 years ago
- Status changed from Fix Under Review to Pending Backport
#8 Updated by Nathan Cutler almost 6 years ago
- Copied to Backport #24260: luminous: bluestore: flush_commit is racy added
#9 Updated by Nathan Cutler almost 6 years ago
- Copied to Backport #24261: mimic: bluestore: flush_commit is racy added
#10 Updated by Igor Fedotov over 5 years ago
- Status changed from Pending Backport to Resolved