Bug #54396: Setting osd_pg_max_concurrent_snap_trims to 0 prematurely clears the snaptrim queue

Status: Closed
% Done: 0%
Backport: octopus, pacific, quincy
Regression: No
Severity: 3 - minor
Description
See https://www.spinics.net/lists/ceph-users/msg71061.html
This time around, after a few hours of snaptrimming, users complained of high IO latency, and indeed Ceph reported "slow ops" on a number of OSDs and on the active MDS. I attributed this to the snaptrimming and decided to reduce it, initially setting osd_pg_max_concurrent_snap_trims to 1, which didn't seem to help much. I then set it to 0, which had the surprising effect of transitioning all PGs back to active+clean (is this intended?). I also restarted the MDS, which seemed to be struggling. IO latency went back to normal immediately.
In the code, when osd_pg_max_concurrent_snap_trims is 0, PrimaryLogPG::AwaitAsyncWork::react(const DoSnapWork&) calls pg->snap_mapper.get_next_objects_to_trim asking for at most 0 objects to trim. get_next_objects_to_trim returns -ENOENT in that case, which the handler interprets as "nothing left to trim for this snap", so it erases snap_to_trim from the trim queue even though the snap has not actually been trimmed.
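Below is a minimal, self-contained C++ sketch of that control flow. It is not the actual Ceph source: the names mirror the description above, but the types and function bodies are simplified stand-ins for illustration only.

    // Simplified illustration of the bug described above (not Ceph code).
    #include <cerrno>
    #include <cstdio>
    #include <set>
    #include <vector>

    struct Object { int id; };

    // Stand-in for SnapMapper::get_next_objects_to_trim(): with max == 0 the
    // output stays empty, so the function reports -ENOENT ("nothing left for
    // this snap") even though objects may still remain to be trimmed.
    int get_next_objects_to_trim(unsigned max, std::vector<Object>* out) {
      static std::vector<Object> remaining = {{1}, {2}, {3}};
      for (unsigned i = 0; i < max && !remaining.empty(); ++i) {
        out->push_back(remaining.back());
        remaining.pop_back();
      }
      return out->empty() ? -ENOENT : 0;
    }

    int main() {
      unsigned osd_pg_max_concurrent_snap_trims = 0;  // the problematic setting
      std::set<int> snap_trimq = {42};  // a snap still queued for trimming
      int snap_to_trim = *snap_trimq.begin();

      std::vector<Object> to_trim;
      int r = get_next_objects_to_trim(osd_pg_max_concurrent_snap_trims, &to_trim);
      if (r == -ENOENT) {
        // The handler treats -ENOENT as "snap fully trimmed" and drops it from
        // the queue, even though nothing was actually trimmed.
        snap_trimq.erase(snap_to_trim);
        std::printf("snap %d removed from snap_trimq without trimming anything\n",
                    snap_to_trim);
      }
      return 0;
    }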