Bug #62770

17.2.6 (new cluster) pool creation errors.

Added by Janne Johansson 8 months ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: Yes
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I have recently installed two clusters (17.2.6 on Ubuntu 20), where I can't create pools unless I specify pg_num and pgp_num manually. The mgr could create its own .mgr pool, which I guess is because it specifies 1 PG for itself, and I can run this on the mons:

ceph osd pool create name 8
or
ceph osd pool create name 8 8

without problems, but if I run

ceph osd pool create name

it will bug out with this error:

Error ERANGE: 'pgp_num' must be greater than 0 and lower or equal than 'pg_num', which in this case is 1

My defaults are 8 for both pg_num and pgp_num, but for some reason it falls back to 0.
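For reference, this is roughly how I checked which defaults the mons resolve (a sketch; mon.HOST is a placeholder for your actual mon name):

```shell
# Ask the cluster config database what the mons should use as pool defaults
ceph config get mon osd_pool_default_pg_num
ceph config get mon osd_pool_default_pgp_num

# Compare with what a running mon has actually resolved
ceph daemon mon.HOST config get osd_pool_default_pg_num
ceph daemon mon.HOST config get osd_pool_default_pgp_num
```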

I found this out because the second cluster, which will serve S3, could not start its radosgws: they got stuck being unable to create the .rgw.root pool.
The rgw logs say:

rgw main: rgw_init_ioctx ERROR: librados::Rados::pool_create returned (34) Numerical result out of range (this can be due to a pool or placement group misconfiguration, e.g. pg_num < pgp_num or mon_max_pg_per_osd exceeded)

I believe whatever causes this "can't create pools without stating pg_num" behaviour was not visible before because rgw had a setting of its own (rgw_rados_pool_pg_num_min), but that was deleted in 17.2.1 according to https://github.com/ceph/ceph/pull/46234. While I can see why no one tests "ceph osd pool create name", it is curious that no one noticed the rgw issue.

Both of our new clusters have ceph.conf settings that set the pg_num and pgp_num defaults to 8, and I have also tried using ceph config to override the defaults. While ceph config get and the config dumps show the mons having 8 as the default, pool creations without an explicit pg_num still fail.
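For completeness, the config-database overrides were applied along these lines (a sketch; the values match our setup):

```shell
# Override the pool-creation defaults cluster-wide
ceph config set global osd_pool_default_pg_num 8
ceph config set global osd_pool_default_pgp_num 8

# Both report 8 afterwards, yet creation without an explicit pg_num still fails
ceph config get mon osd_pool_default_pg_num
ceph config get mon osd_pool_default_pgp_num
```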

I have gotten the rgws running by manually creating all the pools myself, but if we ever enable Swift or do anything else that creates more pools, the rgws will mysteriously start failing again if this is not fixed.
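The manual workaround amounts to pre-creating the pools rgw would otherwise autocreate, with explicit PG counts. A sketch, assuming the default zone names; the exact pool list depends on your rgw configuration:

```shell
# Pre-create the rgw pools with explicit pg_num/pgp_num so rgw never
# has to autocreate them (pool names assume the default zone "default")
for pool in .rgw.root \
            default.rgw.control \
            default.rgw.meta \
            default.rgw.log \
            default.rgw.buckets.index \
            default.rgw.buckets.data; do
    ceph osd pool create "$pool" 8 8
done
```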

So it isn't strictly an rgw issue; that is just where I noticed it first. Since I usually create rbd and other pools with an explicit size, the bug may have been there for a while, but I only noticed it when rgw could not autocreate its pools.

ceph daemon mon.HOST config diff says:

    ...
    "osd_pool_default_pg_num": {
        "default": 32,
        "mon": 8,
        "file": 8,
        "final": 8
    },
    "osd_pool_default_pgp_num": {
        "default": 0,
        "mon": 8,
        "file": 8,
        "final": 8
    },
    ...

ceph versions

    "mon": {
        "ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)": 3
    },
    "mgr": {
        "ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)": 3
    },
    "osd": {
        "ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)": 144
    },
    "mds": {},
    "rgw": {
        "ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)": 9
    },
    "overall": {
        "ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)": 159
    }

