Bug #64519

OSD/MON: No snapshot metadata keys trimming

Added by Matan Breizman 2 months ago. Updated 5 days ago.

Status:
In Progress
Priority:
Normal
Category:
Snapshots
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
quincy, reef, squid
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The Monitor's purged_snap_ / purged_epoch_ keys and the OSD's PSN_ (SnapMapper::PURGED_SNAP_PREFIX) keys are never trimmed and will continue to accumulate.
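
For a rough sense of how much has accumulated on the Monitor side, the osd_snap prefix of a monitor's store can be listed with ceph-kvstore-tool. This is only a sketch and not part of this tracker: it assumes a traditional default store path, must be run against a stopped monitor (or a copy of its store.db), and the exact key names may vary slightly between releases.

    # Count accumulated purged snap keys in a mon store (mon stopped, or use a copy of store.db).
    # The path is an example; adjust for your deployment (cephadm paths differ).
    MON_STORE=/var/lib/ceph/mon/ceph-$(hostname -s)/store.db
    ceph-kvstore-tool rocksdb "$MON_STORE" list osd_snap | grep -c 'purged_snap_'
    ceph-kvstore-tool rocksdb "$MON_STORE" list osd_snap | grep -c 'purged_epoch_'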

Relevant threads:
https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/UOJG46YXTIPOXJUSELIN42ATAD5FPMDY/
https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/B72HSXIGX6IJFLTZU2SPXCQQWFTOXS5A/


Related issues 1 (0 open, 1 closed)

Related to RADOS - Bug #62983: OSD/MON: purged snap keys are not merged (Resolved, Matan Breizman)

Actions #1

Updated by Matan Breizman 2 months ago

  • Description updated (diff)
Actions #2

Updated by Joshua Baergen 2 months ago

This reminded me of the notes in https://pad.ceph.com/p/removing_removed_snaps/timeslider#4651 that talk about why the set of deleted snapshots needs to stick around for a while. But I'm assuming that "a while" probably doesn't need to mean permanently...

Actions #3

Updated by Matan Breizman about 1 month ago

  • Related to Bug #62983: OSD/MON: purged snap keys are not merged added
Actions #4

Updated by Matan Breizman about 1 month ago

  • Status changed from New to In Progress

https://tracker.ceph.com/issues/62983 should help with avoiding the gaps in the purged snap id intervals. As a result, all the purged_snap ids will be merged into a single entry.

For clusters already impacted by this issue, https://github.com/ceph/ceph/pull/53545 may help with removing the "ghost" snapids which cause the gaps, allowing all the entries to be merged.
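
As a rough, hypothetical illustration of what such gaps look like in practice (not part of the PR): grouping the monitor's purged_snap_ keys by pool id shows how fragmented the purged snap intervals are. The key layout assumed here is purged_snap_<pool>_<snapid>, which may differ slightly between releases; many entries for the same pool suggest gaps left by ghost snapids, while after merging each pool should converge to very few entries.

    # Rough per-pool fragmentation check (mon stopped, or run on a copy of store.db).
    # Many purged_snap_ keys for one pool id indicate gaps that prevent merging into a single entry.
    MON_STORE=/var/lib/ceph/mon/ceph-$(hostname -s)/store.db
    ceph-kvstore-tool rocksdb "$MON_STORE" list osd_snap \
        | grep -o 'purged_snap_[0-9]*_' | sort | uniq -c | sort -rn | head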

Actions #5

Updated by Matan Breizman about 1 month ago

  • Assignee set to Matan Breizman
  • Pull request ID set to 53545

Adding 53545 as a candidate for fixing this issue. The fix will require additional documentation on how to use the tool, so I'll keep the tracker open.

Actions #6

Updated by Eugen Block 23 days ago

I know I'm a bit early asking this, but I helped raise this issue and Mykola picked it up on the devel mailing list. I talked to one of our customers who is affected by this (more than 40 million purged_snap entries) and they would be interested in testing this feature on their secondary site (they mirror rbd images). But they're currently in the planning process to remove the second site (I have no ETA for that, though). I expect this fix (and the respective tool to trim affected mon stores) no earlier than Squid. There's no telling if and when they will be able to upgrade to Squid; they recently upgraded to Quincy, though. So would there be a chance to backport this to Reef and Quincy as well, depending on which release they'll be on when this is considered ready?
And a couple more questions regarding the purge tool:
  1. Will it be possible to trim the keys online (without cluster downtime)?
  2. How "safe" will it be? What could go wrong and would there be some rollback mechanism?
Actions #7

Updated by Radoslaw Zarzynski 19 days ago

Looks pretty backportable but let's wait for Matan's word.

Actions #8

Updated by Matan Breizman 17 days ago

  • Backport set to quincy, reef, squid

Eugen Block wrote in #note-6:

I know I'm a bit early asking this, but I helped raise this issue and Mykola picked it up on the devel mailing list. I talked to one of our customers who is affected by this (more than 40 million purged_snap entries) and they would be interested in testing this feature on their secondary site (they mirror rbd images). But they're currently in the planning process to remove the second site (I have no ETA for that, though). I expect this fix (and the respective tool to trim affected mon stores) no earlier than Squid. There's no telling if and when they will be able to upgrade to Squid; they recently upgraded to Quincy, though. So would there be a chance to backport this to Reef and Quincy as well, depending on which release they'll be on when this is considered ready?

Hey Eugen,
There should be no issues with backporting this back to Quincy, as the PR adds a new, separate command.
The relevant usage of the command will be with the default option:

     *  * Default: All the snapids in the given range which are not
     *    marked as purged in the Monitor will be removed. Mostly useful
     *    for cases in which the snapid is leaked in the client side.
     *    See: https://tracker.ceph.com/issues/64646

And a couple more questions regarding the purge tool:
  1. Will it be possible to trim the keys online (without cluster downtime)?
  2. How "safe" will it be? What could go wrong and would there be some rollback mechanism?
  1. Yes, it's possible. The (online) command doesn't require shutting down the OSDs or MONs.
  2. The command was also added to our testing workloads and seems to work well.
    The tricky part is the unknown unknowns. I do not expect anything to go wrong, as the command will only interact with ghost snapids. Moreover, the command can also be used gradually (short snapid removal intervals) to verify nothing goes wrong while using it.
Actions #9

Updated by Radoslaw Zarzynski 12 days ago

The PR is in QA.

Actions #10

Updated by Eugen Block 11 days ago

Thanks, Matan! It sounds very promising. I talked to the customer and they are willing to test this cleanup procedure on their secondary site. Apparently, this will be backported to Quincy so that will make it easier. I'm still not entirely sure if I understand all the required steps or if simply running ceph osd pool force-remove-snap unique_pool_0 will suffice. But maybe we can discuss that in Slack or something.

Actions #11

Updated by Matan Breizman 9 days ago

Eugen Block wrote in #note-10:

Thanks, Matan! It sounds very promising. I talked to the customer and they are willing to test this cleanup procedure on their secondary site. Apparently, this will be backported to Quincy so that will make it easier. I'm still not entirely sure if I understand all the required steps or if simply running ceph osd pool force-remove-snap unique_pool_0 will suffice. But maybe we can discuss that in Slack or something.

I'll provide a detailed explanation once the PR has passed QA. Broadly speaking, just running the command should be sufficient.
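
Until that write-up exists, here is a minimal sketch of what "just running the command" could look like, with a before/after sanity check. It only uses the invocation quoted above; the exact syntax and any range options should be taken from PR 53545's documentation once merged, and the key counting again assumes a stopped monitor or a copy of its store.db.

    # Hedged sketch only; consult PR 53545's documentation for the command's final syntax.
    MON_STORE=/var/lib/ceph/mon/ceph-$(hostname -s)/store.db   # example path; use a copy for a live mon

    # Before: how many purged_snap_ keys does the mon hold?
    ceph-kvstore-tool rocksdb "$MON_STORE" list osd_snap | grep -c 'purged_snap_'

    # Run the cleanup discussed in this thread (online; no OSD/MON restart required, see note #8).
    ceph osd pool force-remove-snap unique_pool_0

    # After: recount (on a fresh copy) and confirm the entries shrink/merge over time.
    ceph-kvstore-tool rocksdb "$MON_STORE" list osd_snap | grep -c 'purged_snap_'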

Actions #12

Updated by Radoslaw Zarzynski 5 days ago

note from scrub: bump up.
