Bug #13666 (Closed)
"osd/ReplicatedPG.cc: 2818: FAILED assert(0 == "out of order op")" with async msgr
Description
Run: http://pulpito.ceph.com/teuthology-2015-10-29_21:00:11-rados-infernalis-distro-basic-multi/
Jobs: ['1131148', '1131285']
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2015-10-29_21:00:11-rados-infernalis-distro-basic-multi/1131148/teuthology.log
Assertion: osd/ReplicatedPG.cc: 2818: FAILED assert(0 == "out of order op")
ceph version 9.1.0-98-g3af21a1 (3af21a1432aabb86f82f9063aef7a6ab9865345e)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7fcbb9c1894b]
 2: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0x183d) [0x7fcbb987640d]
 3: (ReplicatedPG::do_op(std::shared_ptr<OpRequest>&)+0x2962) [0x7fcbb9879182]
 4: (ReplicatedPG::do_request(std::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x6dd) [0x7fcbb981493d]
 5: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3bd) [0x7fcbb968743d]
 6: (PGQueueable::RunVis::operator()(std::shared_ptr<OpRequest>&)+0x5d) [0x7fcbb968765d]
 7: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x8b4) [0x7fcbb96ac614]
 8: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x85f) [0x7fcbb9c0936f]
 9: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7fcbb9c0b270]
 10: (()+0x8182) [0x7fcbb8238182]
 11: (clone()+0x6d) [0x7fcbb657f47d]
Updated by Yuri Weinstein over 8 years ago
- Related to Bug #13017: osd/ReplicatedPG.cc: 2752: FAILED assert(0 == "out of order op") added
Updated by Sage Weil over 8 years ago
- Subject changed from "osd/ReplicatedPG.cc: 2818: FAILED assert(0 == "out of order op")" in rados-infernalis-distro-basic-multi to "osd/ReplicatedPG.cc: 2818: FAILED assert(0 == "out of order op")" with async msgr
Updated by Haomai Wang over 8 years ago
Oh, I found why this doesn't happen with the simple messenger. The async messenger batches acks instead of replying immediately (https://github.com/ceph/ceph/blob/master/src/msg/async/AsyncConnection.cc#L847).
So I guess we need to reply with an ack ASAP.
Updated by Greg Farnum over 8 years ago
I'm not understanding. How does batching up acks lead to this assert?
Updated by Haomai Wang over 8 years ago
For example, osd.0 sends three messages to osd.1 successfully, but because of ack batching osd.1 never replies with an ack. Then a socket error is injected, so osd.0 reconnects to osd.1 and resends the three earlier messages. Meanwhile osd.1 has trimmed its pg log, so do_op can no longer detect the duplicate ops (https://github.com/ceph/ceph/blob/master/src/osd/ReplicatedPG.cc#L1664), and osd.1 hits this assert later. If we always replied with an ack ASAP, there would be no gap between the pg log trim and the acked messages.
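The race above can be sketched in a few lines of plain Python (this is an illustrative model, not Ceph code; the class, method, and field names are invented for the sketch). The receiver keeps a bounded "recent ops" log for dup detection, like the pg log, and asserts if an op arrives with a sequence number it has already passed but can no longer recognize as a replay:

```python
class Receiver:
    """Toy model of an OSD's dup-op detection via a trimmed log."""

    def __init__(self, log_limit):
        self.log_limit = log_limit
        self.recent = []      # bounded dup-detection log (like the pg log)
        self.last_seq = 0     # highest sequence number applied so far

    def handle(self, seq):
        if seq in self.recent:
            return "dup"      # replay detected in the log, dropped safely
        if seq <= self.last_seq:
            # Already applied, but trimmed out of the log: the replay
            # looks like a brand-new op arriving out of order.
            return "ASSERT out of order"
        self.last_seq = seq
        self.recent.append(seq)
        del self.recent[:-self.log_limit]   # trim, keeping the newest entries
        return "applied"

r = Receiver(log_limit=2)
for seq in (1, 2, 3):
    r.handle(seq)   # delivered and applied, but never acked (batched acks)

# Socket error injected: the sender reconnects and resends seqs 1..3.
# By now the log has been trimmed down to [2, 3], so seq 1 is not
# recognized as a dup and trips the out-of-order check.
print(r.handle(1))  # "ASSERT out of order"
print(r.handle(2))  # "dup" -- still in the log, dropped safely
```

With immediate acks the sender would never resend ops the receiver has already applied, so a replay could only reference ops still present in the log, which is the gap Haomai's fix closes.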
Updated by Greg Farnum over 8 years ago
Oh, so this is just a log trimming/op replay race that the async messenger is making a lot larger than usual. Got it. That's...pretty impressive.
Updated by Yuri Weinstein over 8 years ago
Run: http://pulpito.ceph.com/teuthology-2015-10-31_21:00:10-rados-infernalis-distro-basic-multi/
Jobs: ['1133613', '1133950']
Updated by Kefu Chai over 8 years ago
- Status changed from New to Fix Under Review
- Assignee set to Haomai Wang
Updated by Haomai Wang over 8 years ago
- Status changed from Fix Under Review to Resolved