Bug #61962
trim_maps - possible leak on `skip_maps`
Description
Background:
OSD::trim_maps trims osdmaps starting from the superblock's oldest_map epoch up to `min`, the lower of two epochs: the cluster trim lower bound and the osdmap cache's key lower bound.
When `skip_maps` is false, we trim in small batches (`osd_target_transaction_size`). As a result, the oldest_map may lag behind the trim_lower_bound for a while.
If `skip_maps` is true, trimming will occur unconditionally up to `min`. The target transaction size will not be taken into account.
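To make the two modes concrete, here is a minimal Python model of the behaviour described above. It is only a sketch, not the OSD code: the function name, its parameters, the use of a set to stand in for the on-disk osdmaps, and the exclusive upper bound are all assumptions made for illustration.

```python
def trim_maps_model(stored, oldest_map, cluster_lower_bound,
                    cache_lower_bound, skip_maps, target_transaction_size):
    """Hypothetical model: drop epochs from `stored`, return the new oldest_map."""
    # `min` in the description: the lower of the two bounds.
    upper = min(cluster_lower_bound, cache_lower_bound)
    epoch = oldest_map
    trimmed = 0
    while epoch < upper:  # assumption: the upper bound itself is kept
        stored.discard(epoch)
        epoch += 1
        trimmed += 1
        # Batched mode stops after one batch; skip_maps ignores the batch size.
        if not skip_maps and trimmed >= target_transaction_size:
            break
    return epoch


# Batched trim: oldest_map advances by one batch and lags behind the bound.
batched_store = set(range(10, 50))
batched_oldest = trim_maps_model(batched_store, 10, 40, 45,
                                 skip_maps=False, target_transaction_size=5)

# skip_maps trim: everything below `min` is removed in one call.
skip_store = set(range(10, 50))
skip_oldest = trim_maps_model(skip_store, 10, 40, 45,
                              skip_maps=True, target_transaction_size=5)
```

In this model the batched call leaves oldest_map at 15 while the lower bound is 40, which is the lag mentioned above; the `skip_maps` call goes straight to 40.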
Leak:
The leak can happen when `skip_maps` is true: the oldest_map is moved to `first`, the first epoch of the MOSDMap message being handled, without all the osdmaps between the current oldest_map epoch and `first` actually being trimmed.
The oldest_map epoch records where the last trim finished, so that the next trim_maps call can continue from that epoch. Osdmaps left behind below the new oldest_map are therefore never revisited.
The faulty trimming may occur when, with `skip_maps` set, the `min` epoch is selected based on the osdmap cache lower bound and not based on the cluster trim lower bound.
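The scenario can be illustrated with a small Python simulation. This is a hedged sketch of the sequence described in this report, not the actual OSD code path: `handle_map_message_model`, its parameters, and the epoch numbers are hypothetical, and the second run assumes for illustration that `first` equals the cluster trim lower bound.

```python
def handle_map_message_model(stored, oldest_map, first,
                             cluster_lower_bound, cache_lower_bound):
    """Hypothetical model of the skip_maps path: trim up to `min`,
    then move oldest_map to `first`. Returns (new_oldest_map, leaked)."""
    upper = min(cluster_lower_bound, cache_lower_bound)
    for epoch in range(oldest_map, upper):
        stored.discard(epoch)
    # Faulty step from the report: oldest_map jumps to `first`
    # even if trimming stopped earlier, at the cache lower bound.
    new_oldest_map = first
    # Anything still stored below oldest_map will never be trimmed again.
    leaked = sorted(e for e in stored if e < new_oldest_map)
    return new_oldest_map, leaked


# `min` comes from the cache lower bound (150), below `first` (180).
store_a = set(range(100, 200))
oldest_a, leaked_a = handle_map_message_model(
    store_a, oldest_map=100, first=180,
    cluster_lower_bound=180, cache_lower_bound=150)

# `min` comes from the cluster trim lower bound (180): nothing is left behind.
store_b = set(range(100, 200))
oldest_b, leaked_b = handle_map_message_model(
    store_b, oldest_map=100, first=180,
    cluster_lower_bound=180, cache_lower_bound=190)
```

In the first run, epochs 150 through 179 stay in the store below the new oldest_map, so later trims starting from oldest_map never reach them. In the second run the leaked list is empty.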
For affected clusters:
A `trim_stale_maps` command is introduced. See: https://github.com/ceph/ceph/pull/53227
Updated by Matan Breizman 9 months ago
- Pull request ID set to 52545
The WIP PR is currently blocked by https://github.com/ceph/ceph/pull/52339
Updated by Matan Breizman 8 months ago
- Status changed from In Progress to Fix Under Review
Updated by Yuri Weinstein 6 months ago
Updated by Matan Breizman 6 months ago
- Backport set to quincy,reef
Will backport after running in main for a while.
Updated by Matan Breizman 6 months ago
- Status changed from Fix Under Review to Pending Backport
Updated by Backport Bot 6 months ago
- Copied to Backport #63464: reef: trim_maps - possible leak on `skip_maps` added
Updated by Backport Bot 6 months ago
- Copied to Backport #63465: quincy: trim_maps - possible leak on `skip_maps` added