Bug #61304

open

crimson: maintain Heartbeat::Session::epoch correctly in Heartbeat::maybe_share_osdmap

Added by Samuel Just 11 months ago. Updated 11 months ago.

Status: New
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

seastar::future<> Heartbeat::maybe_share_osdmap(
  crimson::net::ConnectionRef conn,
  Ref<MOSDPing> m)
{
  const osd_id_t from = m->get_source().num();
  const epoch_t osdmap_epoch = service.get_map()->get_epoch();
  const epoch_t peer_epoch = m->map_epoch;
  auto found = peers.find(from);
  if (found == peers.end()) {
    return seastar::now();
  }
  auto& peer = found->second;

  if (peer_epoch > peer.get_last_epoch_sent()) {
    logger().debug("{} updating session's last epoch sent " 
                   "from {} to peer's (id: {}) map epoch of {}",
                   __func__, peer.get_last_epoch_sent(),
                   from, peer_epoch);
    peer.set_last_epoch_sent(peer_epoch);
  }

  if (osdmap_epoch <= peer.get_last_epoch_sent()) {
    logger().info("{} latest epoch sent {} is already later " 
                  "than osdmap epoch of {}",
                  __func__ , peer.get_last_epoch_sent(),
                  osdmap_epoch);
    return seastar::now();
  }

  logger().info("{} peer id: {} epoch is {} while osdmap is {}",
                __func__ , from, m->map_epoch, osdmap_epoch);
  if (osdmap_epoch > m->map_epoch) {
    logger().debug("{} sharing osdmap epoch of {} with peer id {}",
                   __func__, osdmap_epoch, from);
    // Peer's newest map is m->map_epoch. Therefore it misses
    // the osdmaps in the range of `m->map_epoch` to `osdmap_epoch`.
    return service.send_incremental_map_to_osd(from, m->map_epoch);
  }
  return seastar::now();
}

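The above code sets Heartbeat::Session::epoch based on the epoch received from the peer rather than on the epoch we are sending (via peer.set_last_epoch_sent(peer_epoch);). Because this function never records the epoch it actually sent, a later ping from a peer that has not yet caught up looks the same as the first one, so we will tend to resend the same maps more than once.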
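A minimal sketch of the bookkeeping the description seems to ask for, written as a standalone model rather than as crimson code. SessionModel, HeartbeatModel, and the std::optional return value are hypothetical names introduced only for illustration; the real function returns a seastar::future<> and sends via service.send_incremental_map_to_osd. The sketch also assumes the send succeeds, which the real code would need to handle.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <optional>

using epoch_t = std::uint32_t;
using osd_id_t = int;

// Hypothetical stand-in for Heartbeat::Session: it tracks only the newest
// epoch we believe the peer has, or will have once our send arrives.
struct SessionModel {
  epoch_t last_epoch_sent = 0;
};

// Hypothetical stand-in for Heartbeat, reduced to the epoch bookkeeping.
struct HeartbeatModel {
  epoch_t osdmap_epoch = 0;
  std::map<osd_id_t, SessionModel> peers;

  // Returns the epoch an incremental send would start from, or
  // std::nullopt when nothing needs to be sent.
  std::optional<epoch_t> maybe_share_osdmap(osd_id_t from, epoch_t peer_epoch) {
    auto found = peers.find(from);
    if (found == peers.end()) {
      return std::nullopt;
    }
    auto& peer = found->second;

    // A peer reporting a newer epoch than we recorded still tells us
    // what it already has, so keep the larger of the two values.
    peer.last_epoch_sent = std::max(peer.last_epoch_sent, peer_epoch);

    if (osdmap_epoch <= peer.last_epoch_sent) {
      return std::nullopt;
    }

    const epoch_t since = peer.last_epoch_sent;
    // Assumption: record the epoch we are about to send, so a repeated
    // ping carrying the old peer epoch does not trigger a resend.
    peer.last_epoch_sent = osdmap_epoch;
    return since;
  }
};
```

In this model, two back-to-back pings that both carry the same stale peer epoch produce a single send: the first call advances last_epoch_sent to osdmap_epoch, and the second call returns early.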

Actions #1

Updated by Samuel Just 11 months ago

  • Assignee set to Samuel Just

Actions #2

Updated by Samuel Just 11 months ago

  • Subject changed from crimson: share correct maps with peers to crimson: maintain Heartbeat::Session::epoch correctly in Heartbeat::maybe_share_osdmap
  • Description updated (diff)

Actions #3

Updated by Radoslaw Zarzynski 11 months ago

  • Project changed from RADOS to crimson