Bug #3268
closedosd: localize reads handling is incorrect
0%
Description
Found by code inspection when changing the way MOSDOp->rmw_flags is populated.
The only place that checks for CEPH_OSD_FLAG_LOCALIZE_READS is in ReplicatedPG on an error condition.
rmw_flags is checked for CEPH_OSD_FLAG_LOCALIZE_READS, but rmw_flags is cleared by OSD::init_osd_op_flags so it cannot include this flag.
From the history I can't see this check ever being reachable.
rmw_flags should just be used by OSD internal flags. The flags given by the client should be checked for CEPH_OSD_FLAG_LOCALIZE_READS instead.
Additionally, if the client does not set CEPH_OSD_FLAG_LOCALIZE_READS on an op, the OSD should reject it if it is not the primary.
Tests should be added for each of these cases.
Updated by Noah Watkins over 11 years ago
I'm starting on this bug now. Before fixing the flag handling described in the ticket, I want to make sure that the OSD is actually servicing requests so our localized read thrash is working.
I want to make sure I understand what is happening here on the OSD side. Looking at find_object_context, it seems to be the case that the error condition below, if (r), is ever to primary vs. replica conditions. So then the checks for localized reads are only occurring under some error condition that would prevent reads anyway.
So, in the non-error case, there also doesn't appear to be any primary/non-primary checking (e.g. set_read() looks to never be modified based on anything other than the operation type) that would prevent the replica from being served.
void ReplicatedPG::do_op(OpRequestRef op) ... 672 ObjectContext *obc; 673 bool can_create = op->may_write(); 674 snapid_t snapid; 675 int r = find_object_context( 676 hobject_t(m->get_oid(), 677 m->get_object_locator().key, 678 m->get_snapid(), 679 m->get_pg().ps(), 680 m->get_object_locator().get_pool()), 681 m->get_object_locator(), 682 &obc, can_create, &snapid); 683 if (r) { 684 if (r == -EAGAIN) { 685 // If we're not the primary of this OSD, and we have 686 // CEPH_OSD_FLAG_LOCALIZE_READS set, we just return -EAGAIN. Otherwise, 687 // we have to wait for the object. 688 if (is_primary() || (!(m->get_flags() & CEPH_OSD_FLAG_LOCALIZE_READS))) { 689 // missing the specific snap we need; requeue and wait. 690 assert(!can_create); // only happens on a read 691 hobject_t soid(m->get_oid(), m->get_object_locator().key, 692 snapid, m->get_pg().ps(), 693 info.pgid.pool()); 694 wait_for_missing_object(soid, op); 695 return; 696 } 697 } 698 osd->reply_op_error(op, r); 699 return; 700 } 701
Updated by Greg Farnum over 11 years ago
Yes, the OSDs will serve replica reads as things stand.