Feature #22448
closed
Visibility for snap trim queue length
Added by Piotr Dalek over 6 years ago.
Updated about 6 years ago.
Description
We observed an unexplained, constant increase in disk space usage on a few of our production clusters. At first we thought customers were abusing them, but that wasn't it. Then we thought that images were constantly being filled with data, but the space usage reported by Ceph wasn't consistent with the filesystem. After further digging, we realized that the snap trim queues for some PGs were in 250k-element territory. We increased the snap trimmer frequency and the number of parallel snap trim ops, and disk space usage finally started to drop.
Ceph needs a feature to efficiently and conveniently access snap trim queue lengths so they can be used with monitoring, and a feature to warn Ceph cluster admins when snap trim queues are long enough to require attention.
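To illustrate the kind of monitoring check this would enable, here is a minimal sketch. It assumes per-PG stats are available as a list of dicts carrying a queue-length field. The field name `snaptrimq_len`, the helper `long_snap_trim_queues`, and the threshold value are assumptions made for illustration, not something this ticket specifies.

```python
def long_snap_trim_queues(pg_stats, warn_threshold):
    """Return (pgid, queue_length) pairs whose snap trim queue exceeds
    warn_threshold, longest queue first.

    pg_stats: list of dicts, one per PG. The "snaptrimq_len" key is an
    assumed field name; PGs that lack it are treated as having an empty queue.
    """
    flagged = [
        (pg["pgid"], pg["snaptrimq_len"])
        for pg in pg_stats
        if pg.get("snaptrimq_len", 0) > warn_threshold
    ]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample data; real values would come from the cluster.
    sample = [
        {"pgid": "1.0", "snaptrimq_len": 250000},
        {"pgid": "1.1", "snaptrimq_len": 12},
        {"pgid": "1.2"},  # no queue-length field reported
    ]
    for pgid, length in long_snap_trim_queues(sample, warn_threshold=1000):
        print(f"PG {pgid}: snap trim queue length {length}")
```

A monitoring job could run a check like this periodically and alert on any non-empty result; the right threshold depends on the cluster.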
https://github.com/ceph/ceph/pull/19520
- Copied to Backport #22449: jewel: Visibility for snap trim queue length added
- Copied to Backport #22450: luminous: Visibility for snap trim queue length added
@Piotr: Please wait until the master PR is merged before starting the backporting process. Thanks.
- Status changed from New to Fix Under Review
- Backport set to jewel, luminous
@Piotr: It's OK to add e.g. "jewel, luminous" to the "Backport" field right from the beginning, though.
When the master PR is merged, the status of the ticket is changed to "Pending Backport", and a script then creates the backport tickets automatically from the value of the "Backport" field.
- Status changed from Fix Under Review to Pending Backport
- Status changed from Pending Backport to Resolved
Already merged to master, luminous and jewel.