Bug #54396

Setting osd_pg_max_concurrent_snap_trims to 0 prematurely clears the snaptrim queue

Added by Dan van der Ster 12 months ago. Updated 7 months ago.

Status: Resolved
Priority: High
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: octopus, pacific, quincy
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

See https://www.spinics.net/lists/ceph-users/msg71061.html

This time around, after a few hours of snaptrimming, users complained of high IO
latency, and indeed Ceph reported "slow ops" on a number of OSDs and on the
active MDS. I attributed this to the snaptrimming and decided to reduce it by
initially setting osd_pg_max_concurrent_snap_trims to 1, which didn't seem to
help much, so I then set it to 0, which had the surprising effect of
transitioning all PGs back to active+clean (is this intended?). I also restarted
the MDS which seemed to be struggling. IO latency went back to normal
immediately.

In the code, when osd_pg_max_concurrent_snap_trims is 0,
PrimaryLogPG::AwaitAsyncWork::react(const DoSnapWork&) calls
pg->snap_mapper.get_next_objects_to_trim asking for 0 objects to trim. In that
case get_next_objects_to_trim returns -ENOENT, and the DoSnapWork handler then
erases the remaining snap_to_trim from the trim queue as if it had been fully
trimmed.
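
For illustration, here is a minimal, self-contained C++ sketch of that control
flow (the SnapMapper state and the trim queue below are hypothetical stand-ins,
not the actual Ceph classes): with max == 0 no objects are ever returned, the
-ENOENT path fires, and the snap is dropped from the queue even though objects
still reference it.

    // Simplified sketch of the buggy control flow described above
    // (hypothetical stand-ins; not verbatim Ceph code).
    #include <cerrno>
    #include <iostream>
    #include <set>
    #include <string>
    #include <vector>

    struct SnapMapper {
      // Objects still referencing the snap being trimmed.
      std::vector<std::string> objects{"obj1", "obj2", "obj3"};

      // Mirrors the behaviour described in the report: fetch up to "max"
      // objects; if none are returned, signal -ENOENT.
      int get_next_objects_to_trim(unsigned max, std::vector<std::string>* out) {
        for (unsigned i = 0; i < max && !objects.empty(); ++i) {
          out->push_back(objects.back());
          objects.pop_back();
        }
        return out->empty() ? -ENOENT : 0;
      }
    };

    int main() {
      unsigned osd_pg_max_concurrent_snap_trims = 0;  // the setting in question
      std::set<unsigned> snap_trimq{42};               // one snap left to trim
      SnapMapper snap_mapper;

      unsigned snap_to_trim = *snap_trimq.begin();
      std::vector<std::string> to_trim;
      int r = snap_mapper.get_next_objects_to_trim(
          osd_pg_max_concurrent_snap_trims, &to_trim);
      if (r == -ENOENT) {
        // With max == 0 nothing is fetched, so the caller concludes the snap
        // is fully trimmed and erases it from the queue prematurely.
        snap_trimq.erase(snap_to_trim);
        std::cout << "snap " << snap_to_trim << " erased; "
                  << snap_mapper.objects.size()
                  << " objects were never trimmed\n";
      }
      return 0;
    }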


Related issues

Related to RADOS - Bug #52026: osd: pgs went back into snaptrim state after osd restart Resolved
Copied to RADOS - Backport #54466: pacific: Setting osd_pg_max_concurrent_snap_trims to 0 prematurely clears the snaptrim queue Resolved
Copied to RADOS - Backport #54467: quincy: Setting osd_pg_max_concurrent_snap_trims to 0 prematurely clears the snaptrim queue Resolved
Copied to RADOS - Backport #54468: octopus: Setting osd_pg_max_concurrent_snap_trims to 0 prematurely clears the snaptrim queue Resolved

History

#1 Updated by Dan van der Ster 12 months ago

  • Status changed from New to Fix Under Review
  • Assignee set to Dan van der Ster
  • Pull request ID set to 45140

#2 Updated by Dan van der Ster 12 months ago

More context:

ceph pg dump reports a SNAPTRIMQ_LEN of 0 on all PGs.

Did CephFS just leak a massive 12 TiB worth of objects...? It seems to me that
the snaptrim operation did not complete at all.

#3 Updated by Radoslaw Zarzynski 12 months ago

  • Related to Bug #52026: osd: pgs went back into snaptrim state after osd restart added

#4 Updated by Radoslaw Zarzynski 12 months ago

  • Priority changed from Normal to High

#5 Updated by Laura Flores 11 months ago

  • Status changed from Fix Under Review to Resolved

#6 Updated by Neha Ojha 11 months ago

  • Status changed from Resolved to Pending Backport

#7 Updated by Backport Bot 11 months ago

  • Copied to Backport #54466: pacific: Setting osd_pg_max_concurrent_snap_trims to 0 prematurely clears the snaptrim queue added

#8 Updated by Backport Bot 11 months ago

  • Copied to Backport #54467: quincy: Setting osd_pg_max_concurrent_snap_trims to 0 prematurely clears the snaptrim queue added

#9 Updated by Backport Bot 11 months ago

  • Copied to Backport #54468: octopus: Setting osd_pg_max_concurrent_snap_trims to 0 prematurely clears the snaptrim queue added

#10 Updated by Neha Ojha 7 months ago

  • Status changed from Pending Backport to Resolved
