Bug #58821
pg_autoscaler module is not working since Pacific version upgrade from v16.2.4 to v16.2.9
Description
Hello Team,
We upgraded our clusters from Pacific v16.2.4 to v16.2.9 a few months back. Before the upgrade I was able to get output from "ceph osd pool autoscale-status".
Since upgrading the cluster to v16.2.9, that same command shows no output. Currently we have 16M+ objects in the pool "cephfs_data", which only has 32 PGs. pg_autoscaler should kick off PG creation, but I am not seeing it happen.
While checking "ceph progress" I can see it actually reduced the PG count for the "cephfs_data" pool from 128 to 32.
[Complete]: Global Recovery Event (3M)
[============================]
[Complete]: PG autoscaler decreasing pool 2 PGs from 128 to 32 (3M)
[============================]
[Complete]: Global Recovery Event (3M)
[============================]
[Complete]: Global Recovery Event (3M)
[============================]
[Complete]: PG autoscaler decreasing pool 3 PGs from 128 to 32 (3M)
I am attaching details of the cluster running v16.2.9 and a comparison with another cluster on v16.2.4.
Below is the error I am getting in the mgr logs:
2023-02-22T05:12:58.295+0000 7f3f45196700 0 [pg_autoscaler INFO root] _maybe_adjust
2023-02-22T05:12:58.297+0000 7f3f45196700 0 [pg_autoscaler ERROR root] pool 2 has overlapping roots: {-20, -1}
2023-02-22T05:12:58.298+0000 7f3f45196700 0 [pg_autoscaler ERROR root] pool 3 has overlapping roots: {-20, -1}
2023-02-22T05:12:58.298+0000 7f3f45196700 0 [pg_autoscaler WARNING root] pool 1 contains an overlapping root -1... skipping scaling
2023-02-22T05:12:58.298+0000 7f3f45196700 0 [pg_autoscaler WARNING root] pool 2 contains an overlapping root -20... skipping scaling
2023-02-22T05:12:58.298+0000 7f3f45196700 0 [pg_autoscaler WARNING root] pool 3 contains an overlapping root -20... skipping scaling
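For context on those warnings: the autoscaler skips any pool whose CRUSH root shares OSDs with another root, because overlapping roots make capacity double-counted. A minimal sketch of that check (the pool/root/OSD data below is a hypothetical example mirroring the log, not the actual mgr module code):

```python
# Simplified sketch of the overlapping-roots check pg_autoscaler performs
# (the real logic lives in the mgr pg_autoscaler module; data is invented).

def find_overlapping_roots(pool_roots, root_osds):
    """Return pools whose CRUSH root shares OSDs with another root.

    pool_roots: {pool_id: root_id}
    root_osds:  {root_id: set of OSD ids reachable under that root}
    """
    overlapping = {}
    for pool, root in pool_roots.items():
        for other_root, osds in root_osds.items():
            if other_root != root and root_osds[root] & osds:
                # Two CRUSH roots reach the same OSDs, so the autoscaler
                # cannot attribute capacity cleanly and skips this pool.
                overlapping.setdefault(pool, set()).update({root, other_root})
    return overlapping

# Example matching the log: root -1 (default) and root -20 (custom)
# both reach the same OSDs, so pools 1, 2 and 3 are all skipped.
pool_roots = {1: -1, 2: -20, 3: -20}
root_osds = {-1: {0, 1, 2}, -20: {0, 1, 2}}
print(find_overlapping_roots(pool_roots, root_osds))
```

With disjoint roots, the function returns an empty dict and no pool is skipped; that is the state the cluster needs to reach for scaling to resume.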
Please let me know if I am missing something at the conf level, or if there is an existing issue with Pacific v16.2.9.
Regards
Prayank
Updated by Prayank Saxena about 1 year ago
I see a ticket already opened for the same issue: https://tracker.ceph.com/issues/55611
But can I get a solution on how to resolve pg_autoscaler so that the cluster can automatically scale the PGs for each pool?
Updated by Prayank Saxena about 1 year ago
What will happen if I change the crush rule of pool 1 from the replicated rule (default) to a customised crush rule?
Will that trigger the pg_autoscaler module to create PGs for the "cephfs_data" pool?
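To see whether that would remove the overlap, it helps to first check which rule each pool uses and which root each rule draws from. A sketch of the inspection steps (pool name "cephfs_data" taken from this report; the rule name in the last command is a placeholder):

```shell
# Which CRUSH rule is assigned to the pool
ceph osd pool get cephfs_data crush_rule

# Dump all rules; look at the "take" step to see which root each rule uses
ceph osd crush rule dump

# Show the CRUSH hierarchy, including the roots (-1 default and -20 custom here)
ceph osd crush tree

# If every pool is moved onto rules whose roots do not share OSDs, e.g.:
# ceph osd pool set cephfs_data crush_rule <non-overlapping-rule>
```

If, after this, all pools draw from a single root (or from disjoint roots), the "overlapping roots" warnings should stop and the autoscaler should resume adjusting pg_num.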
Updated by Ilya Dryomov about 1 year ago
- Target version changed from v16.2.12 to v16.2.13