Feature #62670

open

[RFE] cephfs should track and expose subvolume usage and quota

Added by Paul Cuzner 9 months ago. Updated 6 months ago.

Status:
Need More Info
Priority:
Normal
Category:
Performance/Resource Usage
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Reviewed:
Affected Versions:
Component(FS):
Labels (FS):
Pull request ID:

Description

Subvolumes can be queried individually, but at scale we need subvolume usage and quota thresholds to drive alerts, either within Ceph as a health check and/or via Prometheus as alerts and usage metrics.

Here are some ideas for the kinds of metrics that would be useful for the mgr/prometheus module to expose for alerting and usage tracking:

ceph_fs_subvolume_count{fs_id="1", data_pool="pool_a"} <n>
ceph_fs_subvolume_metadata{fs_id="1", data_pool="pool_a", name="subvol_1"} 1
ceph_fs_subvolume_usage_bytes_total{fs_id="1", name="subvol_1"} <n>
ceph_fs_subvolume_quota_bytes_total{fs_id="1", name="subvol_1"} <n>
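
To make the proposed shape concrete, here is a minimal Python sketch that renders these four series in Prometheus text exposition format. It is only an illustration: the function name and the input dicts (with keys name, used_bytes, quota_bytes) are hypothetical, and nothing here reflects how mgr/prometheus is actually implemented.

```python
def render_subvolume_metrics(fs_id, data_pool, subvolumes):
    """Render the proposed subvolume metrics as exposition-format text.

    `subvolumes` is a hypothetical list of dicts with the keys
    "name", "used_bytes" and "quota_bytes" (an assumption for this sketch).
    """
    # One count series per filesystem/data pool.
    lines = [
        f'ceph_fs_subvolume_count{{fs_id="{fs_id}", data_pool="{data_pool}"}} '
        f'{len(subvolumes)}'
    ]
    for sv in subvolumes:
        name = sv["name"]
        # Metadata series: constant value 1, labels carry the information.
        lines.append(
            f'ceph_fs_subvolume_metadata{{fs_id="{fs_id}", '
            f'data_pool="{data_pool}", name="{name}"}} 1'
        )
        lines.append(
            f'ceph_fs_subvolume_usage_bytes_total{{fs_id="{fs_id}", '
            f'name="{name}"}} {sv["used_bytes"]}'
        )
        lines.append(
            f'ceph_fs_subvolume_quota_bytes_total{{fs_id="{fs_id}", '
            f'name="{name}"}} {sv["quota_bytes"]}'
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_subvolume_metrics(
        "1", "pool_a",
        [{"name": "subvol_1", "used_bytes": 10, "quota_bytes": 100}],
    ))
```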

With metrics like these, we could:
  • raise alerts when a subvolume's usage nears its quota, to avoid application outages
  • use PromQL functions like predict_linear to forecast fill rates per subvolume
  • understand overcommit of the filesystem (sum of quotas > fs capacity)
  • identify unused subvolumes so admins can follow up and delete them
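
As a rough illustration of the four use cases above, the following Python sketch works on samples shaped like the proposed usage and quota metrics. All function names, the 0.85 threshold, and the input layout are assumptions made for this example; treating a quota of 0 as "no quota set" is also an assumption. The forecast is a plain least-squares fit, similar in spirit to predict_linear but extrapolating from the last sample rather than from the query evaluation time.

```python
def near_full(subvolumes, threshold=0.85):
    """Names of subvolumes whose usage/quota ratio is at or above threshold."""
    names = []
    for sv in subvolumes:
        quota = sv["quota_bytes"]
        if quota <= 0:
            # Assumption: quota 0 means no quota is set, so skip it.
            continue
        if sv["used_bytes"] / quota >= threshold:
            names.append(sv["name"])
    return names


def overcommit_ratio(subvolumes, fs_capacity_bytes):
    """Sum of quotas divided by filesystem capacity; > 1.0 means overcommitted."""
    return sum(sv["quota_bytes"] for sv in subvolumes) / fs_capacity_bytes


def unused(subvolumes):
    """Names of subvolumes that currently hold no data."""
    return [sv["name"] for sv in subvolumes if sv["used_bytes"] == 0]


def linear_forecast(samples, horizon_seconds):
    """Least-squares fit over (timestamp, bytes) pairs, extrapolated
    horizon_seconds past the last sample."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        # A single timestamp gives no slope; fall back to the mean value.
        return mean_v
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    intercept = mean_v - slope * mean_t
    return intercept + slope * (samples[-1][0] + horizon_seconds)


if __name__ == "__main__":
    subvols = [
        {"name": "a", "used_bytes": 90, "quota_bytes": 100},
        {"name": "b", "used_bytes": 10, "quota_bytes": 100},
        {"name": "c", "used_bytes": 0, "quota_bytes": 0},
    ]
    print(near_full(subvols))
    print(overcommit_ratio(subvols, 150))
    print(unused(subvols))
    print(linear_forecast([(0, 100), (60, 160), (120, 220)], 60))
```

In practice these checks would more likely live in Prometheus alert rules than in Python; the sketch only shows that usage and quota per subvolume are sufficient inputs for all four.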

Having these metrics would be key to managing capacity usage within native CephFS and Ganesha/CephFS.
