Bug #43546

mgr/pg-autoscaler: Autoscaler creates too many PGs for EC pools

Added by Dan van der Ster 6 months ago. Updated 5 months ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: pg_autoscaler module
Target version:
% Done: 0%
Source:
Tags:
Backport: nautilus
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

The pg_autoscaler uses raw_used_rate to calculate the target number of PGs per pool; this results in too many PGs for EC pools. Instead it should use the pool's size, i.e. the number of replicas or shards per PG.

Consider this example: a 96-OSD cluster with mon_target_pg_per_osd=100.

Three pools:
  • test (target ratio = 0.01)
  • cephfs_metadata (3x replication, target ratio = 0.09)
  • cephfs_ec_k2m2 (2+2 EC, target ratio = 0.9)

The pg_autoscaler makes the following suggestions:

2020-01-10 10:50:55.117 7f3fbad87700  4 mgr[pg_autoscaler] Pool 'test' root_id -1 using 0.01 of space, bias 1.0, pg target 32.0 quantized to 32 (current 64)
2020-01-10 10:50:55.117 7f3fbad87700  4 mgr[pg_autoscaler] Pool 'cephfs_metadata' root_id -1 using 0.09 of space, bias 4.0, pg target 1152.0 quantized to 1024 (current 512)
2020-01-10 10:50:55.118 7f3fbad87700  4 mgr[pg_autoscaler] Pool 'cephfs_ec_k2m2' root_id -1 using 0.9 of space, bias 1.0, pg target 4320.0 quantized to 4096 (current 2048)

If we follow the suggestion (32+1024+4096=5152) we'll get ~214 PGs per OSD, double the mon_target_pg_per_osd.

This is because (e.g. for our EC pool) the pg_autoscaler is using raw_used_rate (2.0) rather than size (4) to calculate pool_pg_target.
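The arithmetic behind the EC pool's log line can be reproduced with a small standalone sketch. It is not the module's code: `pool_pg_target` and `nearest_power_of_two` here are simplified stand-ins, and the root PG target of 96 * 100 = 9600 is an assumption inferred from the cluster description and the logged targets.

```python
import math

def nearest_power_of_two(n):
    """Round n to the closest power of two.
    Simplified stand-in for the module's helper, not its actual code."""
    low = 2 ** math.floor(math.log2(n))
    high = low * 2
    return low if (n - low) <= (high - n) else high

def pool_pg_target(final_ratio, root_pg_target, divisor, bias):
    """Same shape as the expression in module.py, with the divisor made explicit."""
    return (final_ratio * root_pg_target) / divisor * bias

# Assumption: root pg_target = OSD count * mon_target_pg_per_osd.
ROOT_PG_TARGET = 96 * 100

# cephfs_ec_k2m2: target ratio 0.9, bias 1.0
before = pool_pg_target(0.9, ROOT_PG_TARGET, 2.0, 1.0)  # divide by raw_used_rate
after = pool_pg_target(0.9, ROOT_PG_TARGET, 4, 1.0)     # divide by size (k+m)

print(before, nearest_power_of_two(before))
print(after, nearest_power_of_two(after))
```

Dividing by raw_used_rate accounts for the space overhead of a 2+2 pool (2x), whereas each PG of that pool still occupies 4 OSDs, which is why the unfixed target comes out twice as large.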

This fixes the calculation:

diff --git a/src/pybind/mgr/pg_autoscaler/module.py b/src/pybind/mgr/pg_autoscaler/module.py
index e3dc83ff2c..e50860edbe 100644
--- a/src/pybind/mgr/pg_autoscaler/module.py
+++ b/src/pybind/mgr/pg_autoscaler/module.py
@@ -303,7 +303,7 @@ class PgAutoscaler(MgrModule):
             final_ratio = max(capacity_ratio, target_ratio)

             # So what proportion of pg allowance should we be using?
-            pool_pg_target = (final_ratio * root_map[root_id].pg_target) / raw_used_rate * bias
+            pool_pg_target = (final_ratio * root_map[root_id].pg_target) / p['size'] * bias

             final_pg_target = max(p['options'].get('pg_num_min', PG_NUM_MIN),
                                   nearest_power_of_two(pool_pg_target))

Log after above fix:

2020-01-10 11:32:52.711 7fdd7c9e6700  4 mgr[pg_autoscaler] Pool 'test' root_id -1 using 0.01 of space, bias 1.0, pg target 32.0 quantized to 32 (current 64)
2020-01-10 11:32:52.711 7fdd7c9e6700  4 mgr[pg_autoscaler] Pool 'cephfs_metadata' root_id -1 using 0.09 of space, bias 4.0, pg target 1152.0 quantized to 1024 (current 512)
2020-01-10 11:32:52.712 7fdd7c9e6700  4 mgr[pg_autoscaler] Pool 'cephfs_ec_k2m2' root_id -1 using 0.9 of space, bias 1.0, pg target 2160.0 quantized to 2048 (current 2048)
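As a rough cross-check of the per-OSD load, the following sketch counts PG replicas (or EC shards) per OSD for the quantized pg_num values in the two logs. The size of 3 for the 'test' pool is an assumption inferred from its logged target of 32; the helper name is hypothetical. Because the result depends on the size assumed for each pool, it need not match the estimate quoted earlier exactly.

```python
OSD_COUNT = 96

# (pool, size = replicas or shards per PG, pg_num before fix, pg_num after fix)
POOLS = [
    ("test", 3, 32, 32),                 # size 3 is an assumption
    ("cephfs_metadata", 3, 1024, 1024),  # 3x replication
    ("cephfs_ec_k2m2", 4, 4096, 2048),   # 2+2 EC, so 4 shards per PG
]

def pg_replicas_per_osd(pools, column):
    """Average number of PG replicas/shards hosted per OSD (hypothetical helper)."""
    total = sum(pool[1] * pool[column] for pool in pools)
    return total / OSD_COUNT

per_osd_before = pg_replicas_per_osd(POOLS, 2)
per_osd_after = pg_replicas_per_osd(POOLS, 3)

print(round(per_osd_before, 1), round(per_osd_after, 1))
```

Under these assumptions the figure drops substantially with the fix. It can still sit somewhat above mon_target_pg_per_osd, since the metadata pool carries a bias of 4.0.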


Related issues

Copied to mgr - Backport #43727: nautilus: mgr/pg-autoscaler: Autoscaler creates too many PGs for EC pools Resolved

History

#2 Updated by Lenz Grimmer 6 months ago

  • Subject changed from pg-autoscaler creates too many PGs for EC pools to mgr/pg-autoscaler: Autoscaler creates too many PGs for EC pools
  • Category set to pg_autoscaler module
  • Target version set to v15.0.0
  • Backport set to nautilus

#3 Updated by Lenz Grimmer 6 months ago

  • Pull request ID set to 32592

#4 Updated by Lenz Grimmer 6 months ago

  • Status changed from New to Fix Under Review
  • Affected Versions v15.0.0 added

#5 Updated by Sage Weil 6 months ago

  • Status changed from Fix Under Review to Pending Backport

#6 Updated by Nathan Cutler 6 months ago

  • Copied to Backport #43727: nautilus: mgr/pg-autoscaler: Autoscaler creates too many PGs for EC pools added

#7 Updated by Nathan Cutler 5 months ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
