Feature #41080: rgw: break up user reset-stats into multiple cls ops - rgw - Ceph

closed

Added by J. Eric Ivancich over 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Low
Assignee: Matt Benjamin
Target version: Ceph - v15.0.0
% Done:
Source:
Tags:
Backport: octopus
Reviewed:
Affected Versions:
Pull request ID: 34869

Description

Currently, when a user requests a reset of user stats via radosgw-admin, a single write op is sent to the OSD holding the user's info object. Within that op, the object's omap entries are read in a loop and the final result is written to the object's header.

The advantage to this technique is that it is atomic, manipulating a single object with a write operation.

The downside, though, is that on OSDs that are bogged down for other reasons, this write operation may take a while and limit access to the PG on which this object resides.

There are a couple of ideas that might mitigate this. However, given how infrequent this operation is, these changes are likely not worth implementing, at least at this point in time. Instead, this tracker is here primarily to capture these ideas for the future.

It's important to understand that one object is read from and written to for this op.

For a user with many buckets, this operation incrementally reads through their bucket entries, totaling the stats as it goes along, and ends with one final write. Bucket stats are read in groups of 1000, so if a user had 100,000 buckets, this would involve 100 reads.
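
To make the read count concrete, here is a minimal Python sketch of the batching arithmetic. It is a toy in-memory model, not the actual cls code: the `Stats` fields and the function name are hypothetical, and only the batch size of 1000 comes from the description above.

```python
from dataclasses import dataclass

BATCH_SIZE = 1000  # group size stated in the description


@dataclass
class Stats:
    # Hypothetical per-bucket stats; the real entries carry more fields.
    num_objects: int = 0
    size_bytes: int = 0

    def add(self, other: "Stats") -> None:
        self.num_objects += other.num_objects
        self.size_bytes += other.size_bytes


def reset_stats_single_op(bucket_stats: list[Stats]) -> tuple[Stats, int]:
    """Total all bucket entries in batches; return (total, number_of_reads)."""
    total = Stats()
    reads = 0
    for start in range(0, len(bucket_stats), BATCH_SIZE):
        batch = bucket_stats[start:start + BATCH_SIZE]
        reads += 1  # one batched read of up to BATCH_SIZE entries
        for entry in batch:
            total.add(entry)
    # In the scheme described above, the single header write happens here,
    # still inside the same op.
    return total, reads


buckets = [Stats(num_objects=2, size_bytes=10) for _ in range(100_000)]
total, reads = reset_stats_single_op(buckets)
print(reads, total.num_objects, total.size_bytes)
```

All of the reads in this loop happen inside one op, which is why the op can hold up other work on the PG for as long as the loop runs.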

So the first idea is to do the reads as one op to determine the total, and the write as a second op to update the header. Presumably other reads on the PG could take place during the read op. The primary challenge here is to make sure there were no intervening writes between the read op and the write op. A generation number and/or timestamp of the last header write could be used to ensure that the write op is OK to complete. Otherwise the write op would fail with an error, possibly followed by a set of retries.
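
The generation check in the first idea could look roughly like the following Python sketch. This is an in-memory model only; `UserObject`, `StaleHeaderError`, and the `interfere` hook are hypothetical names for illustration and do not correspond to any existing Ceph API. It assumes every write to the object bumps a generation counter (a timestamp is not modeled).

```python
class StaleHeaderError(Exception):
    """Raised when the object changed between the read op and the write op."""


class UserObject:
    # Toy stand-in for the user info object: entries plus a header total.
    def __init__(self, entries):
        self.entries = list(entries)
        self.header_total = 0
        self.generation = 0

    def read_total(self):
        """Op 1 (read): sum the entries and report the generation observed."""
        return sum(self.entries), self.generation

    def add_entry(self, value):
        """Any other write to the object bumps the generation."""
        self.entries.append(value)
        self.generation += 1

    def write_header(self, total, expected_generation):
        """Op 2 (write): only succeeds if nothing was written in between."""
        if self.generation != expected_generation:
            raise StaleHeaderError
        self.header_total = total
        self.generation += 1


def reset_stats(obj, max_retries=3, interfere=None):
    """Read, then conditionally write; retry if an intervening write occurred."""
    for attempt in range(max_retries):
        total, generation = obj.read_total()
        if interfere is not None:
            interfere(attempt)  # test hook simulating a concurrent writer
        try:
            obj.write_header(total, generation)
            return attempt + 1  # number of attempts that were needed
        except StaleHeaderError:
            continue
    raise RuntimeError("reset-stats gave up after repeated conflicts")


obj = UserObject([1, 2, 3])
# A concurrent write lands between the read and the write on the first attempt.
attempts = reset_stats(obj, interfere=lambda n: obj.add_entry(4) if n == 0 else None)
print(attempts, obj.header_total)
```

In this run the first attempt is rejected because the generation moved, and the second attempt recomputes the total with the new entry included.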

The second idea would be to go further and break the reads themselves into multiple ops, with enough information returned from each to continue the operation with more reads, followed by a single write. The same challenge listed above applies here, although with more opportunities for races with other write ops.
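
A rough Python sketch of the second idea, where each read op returns a partial total plus a continuation marker. Again this is only an in-memory model under stated assumptions: the class, the marker format, and the generation counter are hypothetical and are not taken from the Ceph code or from the pull request.

```python
import bisect


class UserObject:
    # Toy stand-in for the user info object, keyed by bucket name.
    def __init__(self, bucket_sizes):
        self.keys = sorted(bucket_sizes)
        self.sizes = dict(bucket_sizes)
        self.header_total = None
        self.generation = 0

    def read_chunk(self, marker, max_entries):
        """One short read op over the entries strictly after `marker`."""
        start = bisect.bisect_right(self.keys, marker)
        chunk = self.keys[start:start + max_entries]
        partial = sum(self.sizes[key] for key in chunk)
        next_marker = chunk[-1] if chunk else marker
        truncated = start + len(chunk) < len(self.keys)
        return partial, next_marker, truncated, self.generation

    def write_header(self, total, expected_generation):
        """Final write op; refused if the object changed since the scan began."""
        if expected_generation != self.generation:
            return False
        self.header_total = total
        self.generation += 1
        return True


def reset_stats_chunked(obj, max_entries=1000):
    """Return (total, ops_issued); the caller carries state between ops."""
    total, marker, ops = 0, "", 0
    first_generation = None
    truncated = True
    while truncated:
        partial, marker, truncated, generation = obj.read_chunk(marker, max_entries)
        ops += 1
        if first_generation is None:
            first_generation = generation
        elif generation != first_generation:
            raise RuntimeError("object changed during scan; caller should retry")
        total += partial
    if not obj.write_header(total, first_generation):
        raise RuntimeError("object changed before header write; caller should retry")
    return total, ops + 1


obj = UserObject({f"bucket-{i:05d}": 1 for i in range(2500)})
total, ops = reset_stats_chunked(obj)
print(total, ops)
```

Because each read op is short, other work can be scheduled between them; the price is that the window for a conflicting write spans the whole scan rather than a single op.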

Related issues 1 (0 open — 1 closed)

Updated by J. Eric Ivancich over 4 years ago

Comments to this tracker are invited.

Updated by Josh Durgin over 4 years ago

There are a finite number of OSD op threads. If the 100 reads in a single op take a while, they will block one of those threads. By default there are 2 threads per shard for SSD, and 8 shards, so if these kinds of ops were more common, they could end up blocking I/O for 1/8th of the PGs.

The first idea doesn't help much, since reads to the same PG would still be blocked on each other; there's no parallelism there today. The second idea, with multiple ops, would get around this and let other work happen interspersed with these operations.

Updated by J. Eric Ivancich over 4 years ago

Thank you, Josh. That's very helpful info.

Updated by Matt Benjamin almost 4 years ago

Pull request ID set to 34869

Updated by Matt Benjamin almost 4 years ago

Status changed from New to Fix Under Review
Assignee set to Matt Benjamin
Backport set to octopus

Updated by J. Eric Ivancich over 3 years ago

Status changed from Fix Under Review to Pending Backport

Updated by Nathan Cutler over 3 years ago

Copied to Backport #46968: octopus: rgw: break up user reset-stats into multiple cls ops added

Updated by Nathan Cutler over 3 years ago

Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
