Bug #61962

open

trim_maps - possible leak on `skip_maps`

Added by Matan Breizman 10 months ago. Updated 6 months ago.

Status: Pending Backport
Priority: Normal
Category: -
Target version: -
% Done: 0%
Source:
Tags: backport_processed
Backport: quincy,reef
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Background:

OSD::trim_maps trims osdmaps from the superblock's oldest_map epoch up to `min`, the earlier of the cluster trim lower bound and the osdmap cache's key lower bound.
When `skip_maps` is false, trimming is done in small batches (`osd_target_transaction_size`). As a result, oldest_map may lag behind the trim lower bound for a while.
If `skip_maps` is true, trimming proceeds unconditionally up to `min`; the target transaction size is not taken into account.
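
The two modes described above can be sketched with a small model. This is not the actual OSD code: `OsdModel`, `trim_maps_model` and its parameters are hypothetical names used for illustration, and the real function also deals with transactions and superblock writes that are left out here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <set>

using epoch_t = std::uint32_t;

// Hypothetical stand-in for the OSD state that matters for trimming.
struct OsdModel {
  std::set<epoch_t> stored_maps;  // epochs of the osdmaps still stored
  epoch_t oldest_map = 0;         // superblock's oldest_map
};

// Trims epochs in [oldest_map, min) and returns how many were removed.
// With skip_maps == false the loop stops after one batch of
// target_transaction_size epochs; with skip_maps == true it runs up to min.
inline int trim_maps_model(OsdModel& osd,
                           epoch_t cluster_trim_lower_bound,
                           epoch_t cache_key_lower_bound,
                           int target_transaction_size,
                           bool skip_maps) {
  assert(target_transaction_size > 0);
  const epoch_t min = std::min(cluster_trim_lower_bound, cache_key_lower_bound);
  int removed = 0;
  for (epoch_t e = osd.oldest_map; e < min; ++e) {
    osd.stored_maps.erase(e);
    osd.oldest_map = e + 1;  // remember where trimming has reached
    ++removed;
    if (!skip_maps && removed >= target_transaction_size) {
      break;  // batch is full; the next call continues from oldest_map
    }
  }
  return removed;
}
```

In this model, a call with `skip_maps` set to false advances oldest_map by at most one batch, which is why oldest_map can trail the trim lower bound across several calls.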

Leak:

The leak can happen when `skip_maps` is true: oldest_map is moved to `first` (the first epoch of the MOSDMap message being handled) without all the osdmaps between the current oldest_map epoch and `first` actually being trimmed.
The oldest_map epoch records where the last trim finished, so that the next trim_maps call can continue from that epoch. Osdmaps left below it are therefore not revisited by later trims.

The faulty trimming may occur when `min` is selected based on the osdmap cache lower bound (with `skip_maps`) rather than on the cluster trim lower bound.
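
A minimal sketch of the scenario, under the assumption that handling the message amounts to "trim up to `min`, then set oldest_map to `first`". `LeakScenario`, `handle_skipped_maps` and `leaked_maps` are hypothetical names, not functions from the OSD.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <set>

using epoch_t = std::uint32_t;

// Hypothetical model of the stored osdmaps and the superblock's oldest_map.
struct LeakScenario {
  std::set<epoch_t> stored_maps;
  epoch_t oldest_map = 0;
};

// Models handling an MOSDMap with skip_maps == true: trim everything in
// [oldest_map, min), then move oldest_map forward to `first`.
inline void handle_skipped_maps(LeakScenario& s,
                                epoch_t cluster_trim_lower_bound,
                                epoch_t cache_key_lower_bound,
                                epoch_t first) {
  assert(first >= s.oldest_map);
  const epoch_t min = std::min(cluster_trim_lower_bound, cache_key_lower_bound);
  for (epoch_t e = s.oldest_map; e < min; ++e) {
    s.stored_maps.erase(e);
  }
  s.oldest_map = first;  // later trims start here and never look below it
}

// Stored epochs below oldest_map can no longer be reached by trimming.
inline std::size_t leaked_maps(const LeakScenario& s) {
  return static_cast<std::size_t>(std::distance(
      s.stored_maps.begin(), s.stored_maps.lower_bound(s.oldest_map)));
}
```

With epochs 1..60 stored, a cluster trim lower bound of 90, a cache lower bound of 40 and `first` = 95, the model trims 1..39 and leaves 40..60 below the new oldest_map. If the cache lower bound is at or above the cluster trim lower bound, nothing is left behind.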

For affected clusters:

A `trim_stale_maps` command has been introduced. See: https://github.com/ceph/ceph/pull/53227


Related issues 2 (2 open, 0 closed)

Copied to RADOS - Backport #63464: reef: trim_maps - possible leak on `skip_maps` (New, Matan Breizman)
Copied to RADOS - Backport #63465: quincy: trim_maps - possible leak on `skip_maps` (New, Matan Breizman)
Actions #1

Updated by Radoslaw Zarzynski 10 months ago

  • Status changed from New to In Progress

Bump up.

Actions #2

Updated by Radoslaw Zarzynski 9 months ago

Bump up.

Actions #3

Updated by Matan Breizman 9 months ago

  • Pull request ID set to 52545

The wip PR is currently blocked by https://github.com/ceph/ceph/pull/52339

Actions #4

Updated by Matan Breizman 9 months ago

  • Assignee set to Matan Breizman
Actions #5

Updated by Matan Breizman 8 months ago

  • Status changed from In Progress to Fix Under Review
Actions #6

Updated by Matan Breizman 8 months ago

  • Description updated (diff)
Actions #8

Updated by Matan Breizman 6 months ago

  • Backport set to quincy,reef

Will backport after running in main for a while.

Actions #9

Updated by Matan Breizman 6 months ago

  • Status changed from Fix Under Review to Pending Backport
Actions #10

Updated by Backport Bot 6 months ago

  • Copied to Backport #63464: reef: trim_maps - possible leak on `skip_maps` added
Actions #11

Updated by Backport Bot 6 months ago

  • Copied to Backport #63465: quincy: trim_maps - possible leak on `skip_maps` added
Actions #12

Updated by Backport Bot 6 months ago

  • Tags set to backport_processed