Bug #8048
osd/ReplicatedPG: FAILED assert(!parent->get_log().get_missing().is_missing(soid))
Description
From effectively the master branch: http://qa-proxy.ceph.com/teuthology/joshd-2014-04-08_17:21:20-rados-wip-6480-testing-basic-plana/179672/
osd/ReplicatedPG.cc: 7297: FAILED assert(!parent->get_log().get_missing().is_missing(soid))
ceph version 0.79-87-gc2f37bb (c2f37bb723178e6ae5fe5121baa06aba994025c1)
1: (ReplicatedBackend::sub_op_modify(std::tr1::shared_ptr<OpRequest>)+0x11f5) [0x7ddb75]
2: (ReplicatedBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x55c) [0x90bc8c]
3: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x1ee) [0x7bd04e]
4: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x34a) [0x61a3aa]
5: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x628) [0x6353e8]
6: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x67aefc]
7: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0xa57136]
8: (ThreadPool::WorkThread::entry()+0x10) [0xa58f40]
9: (()+0x7e9a) [0x7f8e9b07be9a]
10: (clone()+0x6d) [0x7f8e9963c3fd]
Associated revisions
osd/ReplicatedPG: check clones for degraded
We check whether the head is degraded, and we check whether a clone is
unreadable, but in the case where we have a cache op on a degraded object,
we don't check. That leads to an assert when the repop hits the replica
and the object is in the peer's missing set.
Fix this by adding a check on the clone when write_ordered is true. Note
that checking write_ordered is better than whether it is a cache op because
we want to preserve write ordering even for reads that are flagged by the
client.
Fixes: #8048
Signed-off-by: Sage Weil <sage@inktank.com>
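The check described in the commit message can be sketched roughly as follows. This is a minimal, self-contained model and not the actual ReplicatedPG code: `PGModel`, `Verdict`, `admit_op`, and `replica_apply` are hypothetical names invented for illustration, and the ordering of the checks is an assumption. Only the idea of gating the clone's degraded check on `write_ordered` comes from the commit message.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical, simplified model; none of these names are the real Ceph API.
struct PGModel {
    std::set<std::string> primary_missing;  // objects the primary cannot read yet
    std::set<std::string> replica_missing;  // objects a replica is still missing

    bool is_unreadable(const std::string& oid) const {
        return primary_missing.count(oid) > 0;
    }
    bool is_degraded(const std::string& oid) const {
        return replica_missing.count(oid) > 0;
    }
};

enum class Verdict { Proceed, WaitUnreadable, WaitDegraded };

// Primary-side admission check for an op that resolves to `clone` of `head`.
// The order of the checks is an assumption made for this sketch.
Verdict admit_op(const PGModel& pg, const std::string& head,
                 const std::string& clone, bool write_ordered) {
    // Pre-existing check: the head must not be degraded for ordered ops.
    if (write_ordered && pg.is_degraded(head)) return Verdict::WaitDegraded;
    // Pre-existing check: the clone must be readable on the primary.
    if (pg.is_unreadable(clone)) return Verdict::WaitUnreadable;
    // Check added by the fix: an ordered op (for example a cache op) must
    // also wait while the clone is missing on a replica.
    if (write_ordered && pg.is_degraded(clone)) return Verdict::WaitDegraded;
    return Verdict::Proceed;
}

// Replica-side precondition, mirroring the assert that failed in this bug:
// a repop must never arrive for an object the replica is still missing.
void replica_apply(const PGModel& pg, const std::string& oid) {
    assert(!pg.is_degraded(oid));
}
```

In this model, an ordered op on a clone that a replica is missing is parked with `Verdict::WaitDegraded` instead of being sent on to `replica_apply`, where it would trip the assert. Unordered reads of the same clone still proceed.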
History
#1 Updated by Sage Weil almost 10 years ago
- Status changed from New to 12
#2 Updated by Sage Weil almost 10 years ago
- Status changed from 12 to In Progress
- Assignee set to Sage Weil
#3 Updated by Sage Weil almost 10 years ago
- Status changed from In Progress to Fix Under Review
#4 Updated by Sage Weil almost 10 years ago
- Assignee deleted (Sage Weil)
#5 Updated by Samuel Just almost 10 years ago
- Status changed from Fix Under Review to Resolved
#6 Updated by Dmitry Smirnov almost 10 years ago
Please have a look at the comments on bug #8008; they may contain additional information related to this issue. I suspect that #8008 and this bug (#8048) are related. It is possible that either these two bugs are not completely fixed, or another bug is causing the OSD to crash when it attempts to repair an inconsistent PG.