Subtask #2745

Updated by Joao Eduardo Luis over 11 years ago

Three different "roles" on a monitor cluster regarding synchronization:

 * Leader - responsible for disabling the Paxos trim while there's at least one on-going sync; also responsible for denying any new sync request until the Paxos state is trimmed, once we go over some predetermined threshold.

Create the following functions:

 * sync_start() -- Start the synchronization process
 # Clean up the store (i.e., remove all existing keys)
 # Set a special key ‘monitor_synchronizing’ to true
 # Create a MMonSync message with op type OP_START
 # Set a synchronization expired timer, to guarantee that we will eventually progress if mon.Y does not reply to messages for some reason; this timer should be reset each time a message is received
 # Send the message to mon.Y
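
The sync_start() steps can be sketched roughly as follows. This is only an illustration under stated assumptions: @SyncOp@, @SyncMessage@, @SyncRequester@, the in-memory @store@ map and the @outbox@ vector are hypothetical stand-ins for MMonSync, the monitor store and the messenger, not the real Ceph types.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real Ceph types.
enum class SyncOp { START, CHUNK, CHUNK_ACK };

struct SyncMessage {
  SyncOp op;
  std::string target;  // e.g. "mon.Y"
};

struct SyncRequester {
  using Clock = std::chrono::steady_clock;

  std::map<std::string, std::string> store;  // in-memory stand-in for the monitor store
  std::vector<SyncMessage> outbox;           // stand-in for the messenger
  Clock::time_point sync_deadline{};         // the "synchronization expired" timer
  bool timer_armed = false;

  // Create the timer if needed, otherwise push its deadline back.
  void sync_reset_timeout(std::chrono::seconds timeout) {
    sync_deadline = Clock::now() + timeout;
    timer_armed = true;
  }

  void sync_start(const std::string &provider, std::chrono::seconds timeout) {
    assert(!provider.empty());
    store.clear();                             // 1. clean up the store
    store["monitor_synchronizing"] = "true";   // 2. mark the sync as in progress
    SyncMessage msg{SyncOp::START, provider};  // 3. build the OP_START message
    sync_reset_timeout(timeout);               // 4. arm the expiry timer
    outbox.push_back(msg);                     // 5. "send" it to the provider
  }
};
```

Calling @sync_start("mon.Y", std::chrono::seconds(30))@ on a requester with stale keys leaves only the ‘monitor_synchronizing’ key in the store and one OP_START message queued; the 30 second timeout is an arbitrary value chosen for the example.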

 * Sync Requester - The monitor in need of synchronization; it contacts the Leader for the green light and to indicate which monitor it is about to synchronize with, and then obtains a consistent, up-to-date store state from a quorum member.

 * sync_handle_chunk() -- Handle the reception of a chunk from mon.Y
 # If m->flags & FLAG_DENIED: we can no longer sync, or mon.Y is already handling a sync; bootstrap the monitor and restart the sync.
 # Call sync_reset_timeout()
 # Ack the reception of the message by sending a MMonSync with op type OP_CHUNK_ACK
 # Create a transaction from the message’s bufferlist, and apply it to the store
 # If the message has the ‘last_chunk’ flag set, then we finish the synchronization and bootstrap the monitor (this should force the whole Probe workflow again, and we should head to a consistent state)
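
A minimal sketch of the sync_handle_chunk() flow, assuming a chunk has already been decoded into key/value pairs. @ChunkMessage@, @ChunkReceiver@, @ChunkResult@ and the two flag constants are hypothetical names for this example; the counters merely record that a timeout reset and an OP_CHUNK_ACK would have happened.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real Ceph types.
constexpr std::uint8_t CHUNK_FLAG_DENIED = 0x1;
constexpr std::uint8_t CHUNK_FLAG_LAST = 0x2;  // plays the role of 'last_chunk'

struct ChunkMessage {
  std::uint8_t flags = 0;
  std::vector<std::pair<std::string, std::string>> entries;  // decoded key/value pairs
};

enum class ChunkResult { RESTART_SYNC, ACKED, FINISHED };

struct ChunkReceiver {
  std::map<std::string, std::string> store;
  int acks_sent = 0;       // counts OP_CHUNK_ACK replies
  int timeout_resets = 0;  // counts calls to sync_reset_timeout()

  ChunkResult sync_handle_chunk(const ChunkMessage &m) {
    if (m.flags & CHUNK_FLAG_DENIED) {
      // The provider refused us: the caller bootstraps and restarts the sync.
      return ChunkResult::RESTART_SYNC;
    }
    ++timeout_resets;  // the provider is alive, so push the deadline back
    ++acks_sent;       // acknowledge the chunk (OP_CHUNK_ACK)
    for (const auto &kv : m.entries) {
      store[kv.first] = kv.second;  // apply the chunk's transaction to the store
    }
    if (m.flags & CHUNK_FLAG_LAST) {
      return ChunkResult::FINISHED;  // the caller finishes the sync and bootstraps
    }
    return ChunkResult::ACKED;
  }
};
```

Returning a result value instead of calling bootstrap directly is a choice made here so the three outcomes are easy to tell apart; the description above does not prescribe it.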

 * sync_handle_start() -- Handle the reception of a synchronization start request from mon.X
 # If we already have a ‘sync_peer_inst’ set: if it is different from mon.X, reply with an OP_CHUNK_ACK with flag FLAG_DENIED; otherwise, continue.
 # Call sync_reset_timeout()
 # Keep mon.X’s instance for later use; save it in ‘sync_peer_inst’.
 # If not the leader, request the leader to stop trimming.
 # Else, stop trimming and call sync_start_chunks()
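
The decision made in sync_handle_start() can be illustrated as below. This is a sketch under assumptions: @StartHandler@ and @StartOutcome@ are hypothetical, peers are identified by plain strings, and an empty ‘sync_peer_inst’ means that no sync is in progress.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration only; not the real Ceph types.
enum class StartOutcome { DENIED, ASKED_LEADER_TO_STOP_TRIM, STARTED_CHUNKS };

struct StartHandler {
  bool is_leader = false;
  bool trimming_enabled = true;
  std::string sync_peer_inst;  // requester being served; empty means none
  int timeout_resets = 0;      // counts calls to sync_reset_timeout()

  StartOutcome sync_handle_start(const std::string &requester) {
    assert(!requester.empty());
    if (!sync_peer_inst.empty() && sync_peer_inst != requester) {
      return StartOutcome::DENIED;  // reply OP_CHUNK_ACK with FLAG_DENIED
    }
    ++timeout_resets;            // sync_reset_timeout()
    sync_peer_inst = requester;  // remember who we are serving
    if (!is_leader) {
      // A peon cannot stop trimming itself; it asks the leader to do so.
      return StartOutcome::ASKED_LEADER_TO_STOP_TRIM;
    }
    trimming_enabled = false;             // we are the leader: stop trimming
    return StartOutcome::STARTED_CHUNKS;  // proceed to sync_start_chunks()
  }
};
```

A second request from the same requester is accepted again, while a request from a different monitor is denied until the first sync ends.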

 * sync_reset_timeout() -- Reset the synchronization expired timer 
 # if the synchronization expired timer does not exist, create it; otherwise, reset it 
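
The create-or-reset behaviour of sync_reset_timeout() might look like the sketch below. @SyncExpiryTimer@ is a hypothetical class that only tracks a deadline; an actual monitor would presumably schedule a callback instead, and the caller passes the current time in so the logic can be exercised without sleeping.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical timer for illustration only; it tracks a deadline, nothing more.
class SyncExpiryTimer {
 public:
  using Clock = std::chrono::steady_clock;

  explicit SyncExpiryTimer(std::chrono::milliseconds timeout) : timeout_(timeout) {
    assert(timeout.count() > 0);
  }

  // Create the timer if it does not exist; otherwise push the deadline back.
  void reset(Clock::time_point now) {
    deadline_ = now + timeout_;
    armed_ = true;
  }

  // Remove the timer, e.g. once the last chunk has been acknowledged.
  void cancel() { armed_ = false; }

  // True once the deadline has passed without an intervening reset().
  bool expired(Clock::time_point now) const { return armed_ && now >= deadline_; }

  bool armed() const { return armed_; }

 private:
  std::chrono::milliseconds timeout_;
  Clock::time_point deadline_{};
  bool armed_ = false;
};
```

Each received message would call @reset()@ with the current time, so the timer only expires when the peer has been silent for the whole timeout.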

 * sync_start_chunks() -- Start sending chunks to the requesting monitor. 
 # Create a MonitorDBStore::Synchronizer instance
 # Call sync_send_chunk() 

 * handle_sync_trim_disable_ack() -- Handle the acknowledgement of an OP_TRIM_DISABLE
 # assert(m->version == get_first_committed()) 
 # Keep m->version for later use, to check whether any new commits went through while we were synchronizing. We will also have to renew the trim disable while we are synchronizing the store, so we must check that the version returned on each ack is the same as the first m->version.
 # If m->flags & FLAG_RENEW: assert(m->version == sync_first_version) 
 # Else, call sync_start_chunks() 
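
The version bookkeeping in handle_sync_trim_disable_ack() can be sketched as follows. @TrimDisableAck@, @TrimAckTracker@ and @AckAction@ are hypothetical names, and the sketch returns an ABORT value where the steps above use assert(), purely so that a mismatch can be observed instead of terminating the program.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; not the real Ceph types.
struct TrimDisableAck {
  std::uint64_t version;  // first committed version reported in the ack
  bool renew;             // stands in for FLAG_RENEW
};

enum class AckAction { START_CHUNKS, KEEP_GOING, ABORT };

struct TrimAckTracker {
  bool have_first = false;
  std::uint64_t sync_first_version = 0;

  // `first_committed` stands in for the local get_first_committed().
  AckAction handle_sync_trim_disable_ack(const TrimDisableAck &m,
                                         std::uint64_t first_committed) {
    if (m.version != first_committed) {
      return AckAction::ABORT;  // the description asserts on this condition
    }
    if (!m.renew) {
      // First ack: remember the version and start streaming the store.
      sync_first_version = m.version;
      have_first = true;
      return AckAction::START_CHUNKS;
    }
    // Renewal: nothing may have been trimmed since the first ack.
    if (!have_first || m.version != sync_first_version) {
      return AckAction::ABORT;
    }
    return AckAction::KEEP_GOING;
  }
};
```

The first ack records the version and starts the chunk stream; later renewals succeed only while the reported version is unchanged.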

 * Sync Provider - A monitor belonging to the quorum, which may be the Leader, against which the Sync Requester will be synchronized.

 * handle_sync_trim_disable() -- Handle the request to stop trimming coming from a Peon (this should only happen on the Leader, and the Leader must not be mon.Y)
 # If m->flags & FLAG_RENEW:
 # If trimming is enabled:
 # reply OP_TRIM_DISABLE_ACK with flag FLAG_DENIED
 # Otherwise, disable trimming.
 # If state == ACTIVE: Call sync_trim_disable_ack(m)
 # Otherwise, wait_for_active(C_DisableTrimAck(this, m))

 * C_DisableTrimAck() -- Calls ‘sync_trim_disable_ack(m)’

 * sync_trim_disable_ack(m) -- Acknowledge that the trimming has been disabled
 # Reset a timer that after ‘N’ seconds will re-enable trimming.
 # Reply with OP_TRIM_DISABLE_ACK with the first version (V) available

 * sync_send_chunk() -- Helper function for sync_handle_start() and sync_handle_chunk_ack(), with the objective of building and sending a chunk to mon.X
 # Iterate over each key/value on the store, building up a bufferlist with the encoded transactions to be sent over to mon.X; a config option ‘mon_sync_payload_max_size’ should be taken into account when populating these bufferlists
 # Send a MMonSync message of op type OP_CHUNK to mon.X
 # Reset the synchronization expired timer
 # If this is the last chunk, send an OP_TRIM_ENABLE to the Leader and return; otherwise, send an OP_TRIM_DISABLE with flag FLAG_RENEW to the Leader

 * sync_handle_chunk_ack() -- Handle the reception of a chunk ack from mon.X and behave accordingly
 # Reset the synchronization expired timer
 # If there are any more keys to share, call sync_send_chunk(); otherwise, remove the synchronization expired timer and set a new timer to re-enable trimming after, say, 30 seconds, giving time for mon.X to bootstrap and obtain the newer Paxos versions before we trim them


 *Synchronization Implementation*

 *Role-independent*
 <pre>
   set<string> get_sync_targets_names();
   void handle_sync(MMonSync *m);
   void handle_sync_abort(MMonSync *m);
   void reset_sync();
 </pre>

 *Leader-specific*
 <pre>
   Mutex trim_lock;
   map<entity_inst_t, Context*> trim_timeouts;
   Context *trim_enable_timer;

   struct C_TrimTimeout;
   struct C_TrimEnable;

   void sync_send_heartbeat(entity_inst_t &other, bool reply);
   void handle_sync_start(MMonSync *m);
   void handle_sync_heartbeat(MMonSync *m);
   void handle_sync_finish(MMonSync *m);
   void sync_finish(entity_inst_t &entity, bool abort);
   void sync_finish_abort(entity_inst_t &entity);
 </pre>

 *Sync Provider-specific*
 <pre>
   struct SyncEntity;
   SyncEntity get_sync_entity(entity_inst_t &entity, Monitor *mon);

   struct C_SyncTimeout;

   map<entity_inst_t, SyncEntity> sync_entities;

   void sync_provider_cleanup(entity_inst_t &entity);
   void handle_sync_start_chunks(MMonSync *m);
   void handle_sync_heartbeat_reply(MMonSync *m);
   void handle_sync_chunk_reply(MMonSync *m);
   void sync_send_chunks(SyncEntity sync, pair<string,string> &last_key);
   void sync_timeout(entity_inst_t &entity);
 </pre>

 *Sync Requester-specific*
 <pre>
   struct C_SyncStartTimeout;
   struct C_SyncStartRetry;
   struct C_HeartbeatTimeout;
   struct C_SyncFinishReplyTimeout;

   SyncEntity sync_leader;
   SyncEntity sync_provider;

   void sync_requester_cleanup();
   void sync_requester_abort();
   void sync_start(entity_inst_t &entity);
   void handle_sync_start_reply(MMonSync *m);
   void handle_sync_chunk(MMonSync *m);
   void handle_sync_finish_reply(MMonSync *m);
   void sync_stop();
   void sync_abort();
 </pre>
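
The chunk-building step described for sync_send_chunk(), which respects ‘mon_sync_payload_max_size’, could be approached as in the sketch below. It is an illustration under assumptions: @build_chunk@, @StoreChunk@ and @KeyValue@ are hypothetical, the store is a sorted in-memory map, an entry's size is approximated as key length plus value length, and an empty @last_key@ means "start from the beginning".

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real Ceph types.
using KeyValue = std::pair<std::string, std::string>;

struct StoreChunk {
  std::vector<KeyValue> entries;
  bool last_chunk = false;
};

// Build the next chunk, starting strictly after `last_key`.
// `max_payload` plays the role of 'mon_sync_payload_max_size'. At least one
// entry is always included so that an oversized value cannot stall the sync.
StoreChunk build_chunk(const std::map<std::string, std::string> &store,
                       const std::string &last_key, std::size_t max_payload) {
  assert(max_payload > 0);
  StoreChunk chunk;
  std::size_t used = 0;
  auto it = last_key.empty() ? store.begin() : store.upper_bound(last_key);
  for (; it != store.end(); ++it) {
    const std::size_t cost = it->first.size() + it->second.size();
    if (!chunk.entries.empty() && used + cost > max_payload) {
      break;  // budget exhausted; the rest goes into a later chunk
    }
    chunk.entries.emplace_back(it->first, it->second);
    used += cost;
  }
  chunk.last_chunk = (it == store.end());  // nothing left to send after this
  return chunk;
}
```

The provider would call this once per chunk ack, passing the last key of the previous chunk, and stop once @last_chunk@ is set; at that point it would send OP_TRIM_ENABLE rather than renewing the trim disable.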
