Bug #54263


cephadm upgrade pacific to quincy autoscaler is scaling pgs from 32 -> 32768 for cephfs meta pool

Added by Vikhyat Umrao about 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: High
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: quincy, pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Pacific version - 16.2.7-34.el8cp
Quincy version - 17.0.0-10315-ga00e8b31

After some analysis, it looks like the autoscaler TARGET RATIO for the cephfs metadata pool was set to 4.0 during the upgrade to the Quincy version.

- Command output after the upgrade:

# ceph osd pool ls detail
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 25006 flags hashpspool,backfillfull stripe_width 0 pg_num_min 1 application mgr,mgr_devicehealth
pool 2 'rbd' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode on last_change 25006 lfor 0/0/1324 flags hashpspool,backfillfull,selfmanaged_snaps stripe_width 0 application rbd
pool 3 'cephfs.cephfs.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32768 pgp_num 32768 autoscale_mode on last_change 25006 lfor 0/0/9281 flags hashpspool,backfillfull stripe_width 0 pg_num_min 16 recovery_priority 5 target_size_ratio 4 application cephfs
pool 4 'cephfs.cephfs.data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 4435 pgp_num 4214 pg_num_target 32 pgp_num_target 32 autoscale_mode on last_change 25817 lfor 0/25815/25813 flags hashpspool,backfillfull stripe_width 0 application cephfs
# ceph osd pool autoscale-status
POOL                  SIZE  TARGET SIZE  RATE  RAW CAPACITY   RATIO  TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK   
.mgr                448.5k                3.0        12506G  0.0000                                  1.0       1              on         False  
rbd                 40985M                3.0        12506G  0.0096                                  1.0     256              on         False  
cephfs.cephfs.meta  102.9M                3.0        12506G  1.0000        4.0000           1.0000   1.0   32768              on         False  
cephfs.cephfs.data   1733G                3.0        12506G  0.4158                                  1.0      32              on         False 
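
A possible mitigation sketch, assuming the goal is only to stop the autoscaler from growing the metadata pool while the regression is investigated. The pool name comes from the output above; the pg_num_max cap requires a build that includes it (see Backport #54412), and the value 128 is only illustrative:

# ceph osd pool get cephfs.cephfs.meta target_size_ratio      (confirm the ratio the autoscaler is acting on)
# ceph osd pool set cephfs.cephfs.meta target_size_ratio 0    (clear the unexpected ratio so sizing is usage-based again)
# ceph osd pool set cephfs.cephfs.meta pg_num_max 128         (optional cap on pg growth; illustrative value)
# ceph osd pool set cephfs.cephfs.meta pg_autoscale_mode off  (or disable autoscaling for this pool while investigating)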

From MGR and system logs:

Before upgrade:

2634769 Feb 11 00:48:44 f03-h02-000-r640 conmon[2849344]: debug 2022-02-11T00:48:44.028+0000 7f3bab474700  0 [pg_autoscaler INFO root] effective_target_ratio 0.0 0.0 0 13428844396544

After upgrade:

2022-02-11T00:57:14.734+0000 7f4ceec03000  0 ceph version 17.0.0-10315-ga00e8b31 (a00e8b315af02865380634f8100dc7d18a18af4f) quincy (dev), process ceph-mgr, pid 7
2022-02-11T00:58:57.186+0000 7f4add690700  0 [pg_autoscaler INFO root] effective_target_ratio 0.0 4.0 0 13428844396544
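
Assuming the usual autoscaler normalization, this single ratio explains the jump: cephfs.cephfs.meta is the only pool with a target ratio set, so

effective_target_ratio = 4.0 / 4.0 = 1.0

which matches the EFFECTIVE RATIO 1.0000 in the autoscale-status output above. The autoscaler then sizes the metadata pool as if it will eventually consume the entire 12506G of raw capacity instead of its actual ~103 MiB of data, which is what pushes NEW PG_NUM to 32768.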


Related issues 4 (1 open, 3 closed)

Related to RADOS - Bug #54238: cephadm upgrade pacific to quincy -> causing osd's FULL/cascading failure (New)
Related to RADOS - Backport #54412: pacific: osd: add pg_num_max value (Rejected) - Kamoltat (Junior) Sirivadhna
Copied to RADOS - Backport #54526: pacific: cephadm upgrade pacific to quincy autoscaler is scaling pgs from 32 -> 32768 for cephfs meta pool (Resolved) - Kamoltat (Junior) Sirivadhna
Copied to RADOS - Backport #54527: quincy: cephadm upgrade pacific to quincy autoscaler is scaling pgs from 32 -> 32768 for cephfs meta pool (Resolved) - Kamoltat (Junior) Sirivadhna
