cephadm upgrade from Pacific to Quincy: autoscaler scales PGs from 32 -> 32768 for the cephfs meta pool
Pacific version - 16.2.7-34.el8cp
Quincy version - 17.0.0-10315-ga00e8b31
After some analysis, it looks like the autoscaler TARGET RATIO was set to 4.0 during the upgrade to Quincy.
Output of some commands after the upgrade:
# ceph osd pool ls detail
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 25006 flags hashpspool,backfillfull stripe_width 0 pg_num_min 1 application mgr,mgr_devicehealth
pool 2 'rbd' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode on last_change 25006 lfor 0/0/1324 flags hashpspool,backfillfull,selfmanaged_snaps stripe_width 0 application rbd
pool 3 'cephfs.cephfs.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32768 pgp_num 32768 autoscale_mode on last_change 25006 lfor 0/0/9281 flags hashpspool,backfillfull stripe_width 0 pg_num_min 16 recovery_priority 5 target_size_ratio 4 application cephfs
pool 4 'cephfs.cephfs.data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 4435 pgp_num 4214 pg_num_target 32 pgp_num_target 32 autoscale_mode on last_change 25817 lfor 0/25815/25813 flags hashpspool,backfillfull stripe_width 0 application cephfs
# ceph osd pool autoscale-status
POOL                SIZE    TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK
.mgr                448.5k               3.0   12506G        0.0000                                 1.0   1                   on         False
rbd                 40985M               3.0   12506G        0.0096                                 1.0   256                 on         False
cephfs.cephfs.meta  102.9M               3.0   12506G        1.0000  4.0000        1.0000           1.0   32768               on         False
cephfs.cephfs.data  1733G                3.0   12506G        0.4158                                 1.0   32                  on         False
From MGR and system logs:
Before upgrade:
2634769 Feb 11 00:48:44 f03-h02-000-r640 conmon: debug 2022-02-11T00:48:44.028+0000 7f3bab474700 0 [pg_autoscaler INFO root] effective_target_ratio 0.0 0.0 0 13428844396544
After upgrade:
2022-02-11T00:57:14.734+0000 7f4ceec03000 0 ceph version 17.0.0-10315-ga00e8b31 (a00e8b315af02865380634f8100dc7d18a18af4f) quincy (dev), process ceph-mgr, pid 7
2022-02-11T00:58:57.186+0000 7f4add690700 0 [pg_autoscaler INFO root] effective_target_ratio 0.0 4.0 0 13428844396544
#4 Updated by Kamoltat Sirivadhna 8 months ago
The root cause of the problem is that, after the upgrade to Quincy, the cephfs metadata pool was somehow given a target_size_ratio of 4.0. This should not happen when we only have 4 pools in the same root of the cluster. Because total_target_byte is also 0 for cephfs.cephfs.meta and no other pool in the root has a target ratio, the effective ratio for cephfs.cephfs.meta is guaranteed to be 1.0. That value takes precedence over capacity_ratio, so the autoscaler gives cephfs.cephfs.meta the maximum number of PGs it is allowed to give, in this case 32768 PGs.
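To make the arithmetic concrete, here is a minimal sketch of how a target ratio could be normalized into an effective ratio. It is a simplified illustration, not the pg_autoscaler source: the function body and the assumption that its four arguments match the four values in the `effective_target_ratio` log lines (target ratio, total target ratio in the root, total target bytes, capacity) are mine. The capacity_ratio for the meta pool is approximated from the SIZE, RATE and RAW CAPACITY columns of autoscale-status.

```python
def effective_target_ratio(target_ratio, total_target_ratio,
                           total_target_bytes, capacity):
    """Normalize one pool's target ratio against its CRUSH root (sketch)."""
    ratio = float(target_ratio)
    # Scale by the sum of all target ratios in the same root.
    if total_target_ratio:
        ratio /= total_target_ratio
    # Leave room for pools that reserve space via target bytes.
    if total_target_bytes and capacity:
        ratio *= 1.0 - min(1.0, total_target_bytes / capacity)
    return ratio


CAPACITY = 13428844396544  # capacity value taken from the mgr log lines

# cephfs.cephfs.meta: the only pool with a target ratio, so 4.0 / 4.0.
meta_effective = effective_target_ratio(4.0, 4.0, 0, CAPACITY)

# Any other pool in the root (the "0.0 4.0 0 ..." log line after upgrade).
other_effective = effective_target_ratio(0.0, 4.0, 0, CAPACITY)

# Approximate capacity ratio of the meta pool: 102.9M * 3.0 / 12506G.
meta_capacity_ratio = 102.9 * 3.0 / (12506 * 1024)

# The larger of the two wins, so the meta pool is sized as if it
# owned the whole root.
meta_final_ratio = max(meta_capacity_ratio, meta_effective)

print(meta_effective, other_effective, meta_final_ratio)
```

With these inputs the meta pool ends up with a final ratio of 1.0 even though it holds only about 100 MB, which matches the RATIO and EFFECTIVE RATIO columns shown in autoscale-status.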
Here is a link to my findings:
#5 Updated by Kamoltat Sirivadhna 7 months ago
On the monitor side, pool creation does not allow target_size_ratio to be more than 1.0 or less than 0.0, as specified in /src/mon/MonCommands.h.
Therefore, we can rule out the possibility of `target_size_ratio` having been set by the command `ceph osd pool create <pool-name> --target_size_ratio <ratio>`. However,
`ceph osd pool set <pool-name> target_size_ratio <ratio>` is able to set target_size_ratio outside the 0.0-1.0 range.
Since target_size_ratio can be more than 1.0, the bound applied at pool creation in /src/mon/MonCommands.h should be changed.
#9 Updated by Kamoltat Sirivadhna 7 months ago
After reproducing the problem by tweaking the upgrade/pacific-x/parallel suite and adding extra logging, we conclude that the problem lies in the declaration of `opt_mapping` in src/osd/osd_types.cc. https://github.com/ceph/ceph/pull/44054 added PG_NUM_MAX to the middle of the list. We found that the order of the list matters: new options must always be added to the end of the list so that the order of the existing options is preserved across an upgrade. For more information on the bug analysis, please see: https://docs.google.com/document/d/10PJDwU2H7uY2o7_1lwTtQHUFoF9Fx7jguKeT-mKR2sA/edit?usp=sharing
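The following sketch illustrates why inserting into the middle of an ordered option list can corrupt values stored before the upgrade. It is a toy model, not the code in src/osd/osd_types.cc: the key lists, their order, the `encode`/`decode` helpers and the stored values are hypothetical, and it assumes that options are persisted by their position in the list.

```python
# Hypothetical option order before the upgrade (illustrative only).
OLD_KEYS = ["pg_num_min", "target_size_bytes",
            "target_size_ratio", "pg_autoscale_bias"]

# New option inserted in the middle: every later key shifts by one.
NEW_KEYS_MIDDLE = ["pg_num_min", "pg_num_max", "target_size_bytes",
                   "target_size_ratio", "pg_autoscale_bias"]

# New option appended at the end: existing positions are unchanged.
NEW_KEYS_APPENDED = OLD_KEYS + ["pg_num_max"]


def encode(opts, keys):
    """Store each option under its position in the key list."""
    return {keys.index(name): value for name, value in opts.items()}


def decode(encoded, keys):
    """Map stored positions back to option names."""
    return {keys[index]: value for index, value in encoded.items()}


# Options written by the old version (values are assumptions).
stored = encode({"pg_num_min": 16, "pg_autoscale_bias": 4.0}, OLD_KEYS)

after_middle_insert = decode(stored, NEW_KEYS_MIDDLE)
after_append = decode(stored, NEW_KEYS_APPENDED)

print(after_middle_insert)  # the 4.0 now appears under target_size_ratio
print(after_append)         # options decode to their original names
```

In this toy model a value stored under one option before the upgrade is read back as `target_size_ratio` after the middle insertion, while appending leaves every existing option intact. That is consistent with the symptom above, but the actual keys involved are described in the linked analysis.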