Project

General

Profile

Actions

Bug #3268

closed

osd: localize reads handling is incorrect

Added by Josh Durgin over 11 years ago. Updated over 10 years ago.

Status:
Rejected
Priority:
Normal
Assignee:
Category:
OSD
Target version:
-
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Found by code inspection when changing the way MOSDOp->rmw_flags is populated.
The only place that checks for CEPH_OSD_FLAG_LOCALIZE_READS is in ReplicatedPG on an error condition.
rmw_flags is checked for CEPH_OSD_FLAG_LOCALIZE_READS, but rmw_flags is cleared by OSD::init_osd_op_flags so it cannot include this flag.
From the history I can't see this check ever being reachable.

rmw_flags should just be used by OSD internal flags. The flags given by the client should be checked for CEPH_OSD_FLAG_LOCALIZE_READS instead.
Additionally, if the client does not set CEPH_OSD_FLAG_LOCALIZE_READS on an op, the OSD should reject it if it is not the primary.
Tests should be added for each of these cases.

Actions #1

Updated by Sage Weil over 11 years ago

  • Assignee set to Noah Watkins
Actions #2

Updated by Noah Watkins about 11 years ago

I'm starting on this bug now. Before fixing the flag handling described in the ticket, I want to make sure that the OSD is actually servicing requests so our localized read thrash is working.

I want to make sure I understand what is happening here on the OSD side. Looking at find_object_context, it seems to be the case that the error condition below, if (r), is ever to primary vs. replica conditions. So then the checks for localized reads are only occurring under some error condition that would prevent reads anyway.

So, in the non-error case, there also doesn't appear to be any primary/non-primary checking (e.g. set_read() looks to never be modified based on anything other than the operation type) that would prevent the replica from being served.

void ReplicatedPG::do_op(OpRequestRef op)
   ...
672   ObjectContext *obc;                                                                                                                                                                    
673   bool can_create = op->may_write();                                                                                                                                                     
674   snapid_t snapid;                                                                                                                                                                       
675   int r = find_object_context(                                                                                                                                                           
676       hobject_t(m->get_oid(),                                                                                                                                                            
677         m->get_object_locator().key,                                                                                                                                                     
678         m->get_snapid(),                                                                                                                                                                 
679         m->get_pg().ps(),                                                                                                                                                                
680         m->get_object_locator().get_pool()),                                                                                                                                             
681       m->get_object_locator(),                                                                                                                                                           
682       &obc, can_create, &snapid);                                                                                                                                                        
683   if (r) {                                                                                                                                                                               
684     if (r == -EAGAIN) {                                                                                                                                                                  
685       // If we're not the primary of this OSD, and we have                                                                                                                               
686       // CEPH_OSD_FLAG_LOCALIZE_READS set, we just return -EAGAIN. Otherwise,                                                                                                            
687       // we have to wait for the object.                                                                                                                                                 
688       if (is_primary() || (!(m->get_flags() & CEPH_OSD_FLAG_LOCALIZE_READS))) {                                                                                                          
689         // missing the specific snap we need; requeue and wait.                                                                                                                          
690         assert(!can_create); // only happens on a read                                                                                                                                   
691         hobject_t soid(m->get_oid(), m->get_object_locator().key,                                                                                                                        
692             snapid, m->get_pg().ps(),                                                                                                                                                    
693             info.pgid.pool());                                                                                                                                                           
694         wait_for_missing_object(soid, op);                                                                                                                                               
695         return;                                                                                                                                                                          
696       }                                                                                                                                                                                  
697     }                                                                                                                                                                                    
698     osd->reply_op_error(op, r);                                                                                                                                                          
699     return;                                                                                                                                                                              
700   }                                                                                                                                                                                      
701
Actions #3

Updated by Greg Farnum about 11 years ago

Yes, the OSDs will serve replica reads as things stand.

Actions #4

Updated by Sage Weil over 10 years ago

  • Status changed from New to Rejected
Actions

Also available in: Atom PDF