Bug #56626
closed"ceph fs volume create" fails with error ERANGE
Description
Trying to create a CephFS filesystem within a cluster deployed with cephadm fails.
Steps followed:
1. sudo cephadm --image "quincy" bootstrap --fsid <fsid> --config <bootstrap-config> --output-config <ceph-config> --output-keyring <keyring> --output-pub-ssh-key <pub-key> --allow-overwrite --allow-fqdn-hostname --skip-monitoring-stack --skip-dashboard --single-host-defaults --skip-firewalld --skip-mon-network --mon-ip <host-ip>
bootstrap.config
[global]
log to file = true
osd crush chooseleaf type = 0
osd_pool_default_pg_num = 8
osd_pool_default_pgp_num = 8
osd_pool_default_size = 1
[mon]
mon_warn_on_pool_no_redundancy = False
[osd]
osd_memory_target_autotune = true
osd_numa_auto_affinity = true
[mgr]
mgr/cephadm/autotune_memory_target_ratio = 0.2
2. sudo cephadm shell --fsid <fsid> --config <ceph_config> --keyring <keyring> -- ceph orch daemon add osd "<hostname>:<osd>"
3. sudo cephadm shell --fsid <fsid> --config <ceph_config> --keyring <keyring> -- ceph fs volume create <my-fs>
Expected output: new filesystem is created successfully, with the data and metadata pools
Actual output: Error ERANGE: 'pgp_num' must be greater than 0 and lower or equal than 'pg_num', which in this case is 1
Environment status:
root@quincy-cephadm:/# ceph -s
cluster:
id: 4cbcbb65-3109-47ee-bd9a-2aab3f4debe7
health: HEALTH_OK
services:
mon: 1 daemons, quorum quincy-cephadm (age 19m)
mgr: quincy-cephadm.jbzzql(active, since 18m), standbys: quincy-cephadm.telyok
osd: 1 osds: 1 up (since 17m), 1 in (since 18m)
data:
pools: 1 pools, 1 pgs
objects: 2 objects, 449 KiB
usage: 20 MiB used, 30 GiB / 30 GiB avail
pgs: 1 active+clean
Image used: v17.2.1 in Quay
Updated by Patrick Donnelly almost 2 years ago
- Category set to Correctness/Safety
- Assignee set to Kotresh Hiremath Ravishankar
- Target version changed from v17.2.2 to v18.0.0
- Backport set to quincy,pacific
- Component(FS) mgr/volumes added
Kotresh, PTAL.
Updated by Patrick Donnelly almost 2 years ago
- Status changed from New to Triaged
Updated by Kotresh Hiremath Ravishankar almost 2 years ago
Hi Victoria,
I am not very familiar with the OSD configs, but per the code, if 'osd_pool_default_pg_autoscale_mode' is 'on', 'pg_num' is set to 1. Please see the code below.
    int OSDMonitor::prepare_new_pool(....)
    {
      ...
      ...
      if (pg_num == 0) {
        auto pg_num_from_mode =
          [pg_num=g_conf().get_val<uint64_t>("osd_pool_default_pg_num")]
          (const string& mode) {
            return mode == "on" ? 1 : pg_num;
        };
        pg_num = pg_num_from_mode(
          pg_autoscale_mode.empty() ?
          g_conf().get_val<string>("osd_pool_default_pg_autoscale_mode") :
          pg_autoscale_mode);
      }
      ...
      ...
Your config sets 'osd_pool_default_pgp_num = 8', which is greater than the 'pg_num' that gets set (i.e., 8 > 1). Setting 'osd_pool_default_pgp_num = 1' should fix the issue.
Please see the code below.
    int OSDMonitor::prepare_new_pool(....)
    {
      ...
      ...
      if (pgp_num > pg_num) {
        *ss << "'pgp_num' must be greater than 0 and lower or equal than 'pg_num'"
            << ", which in this case is " << pg_num;
        return -ERANGE;
      }
      ...
Updated by Kotresh Hiremath Ravishankar almost 2 years ago
- Status changed from Triaged to In Progress
Updated by Ramana Raja almost 2 years ago
- Status changed from In Progress to Need More Info
This does not seem like a bug in the mgr/volumes code. The mgr/volumes module creates FS pools using `osd pool create <pool-name>` with no additional CLI arguments. As Kotresh mentioned, the bootstrap_config settings in devstack-plugin-ceph need to be changed.
I suggest removing the settings 'osd_pool_default_pg_num = 8' and 'osd_pool_default_pgp_num = 8' from the file src/devstack/lib/cephadm. See https://opendev.org/openstack/devstack-plugin-ceph/src/commit/400b1011be405b240be52c6d61f5257e3a0e39b5/devstack/lib/cephadm#L162 . If 'osd_pool_default_pgp_num' is not set, then pgp_num is set to follow pg_num. See https://github.com/ceph/ceph/commit/0ff1ef3de76e#diff-c44edb74ef654677b989657a8dad6cd87b952012f21fc055c2f531e7ab637b3bR6689
Victoria, can you check whether my suggestion works?
Updated by Victoria Martinez de la Cruz almost 2 years ago
I tested the deployment with "osd_pool_default_pg_autoscale_mode = off" in bootstrap_conf and it seems to fix the issue. Was this option's default changed to 'on' between Pacific and Quincy? Is setting this value to 'off' advised, or should I try a different config?
The current config and cluster status are:
stack@devstack-quincy:~$ cat bootstrap_ceph.conf
[global]
log to file = true
osd crush chooseleaf type = 0
osd_pool_default_pg_num = 8
osd_pool_default_pgp_num = 8
osd_pool_default_size = 1
osd_pool_default_pg_autoscale_mode = off
[mon]
mon_warn_on_pool_no_redundancy = False
[osd]
osd_memory_target_autotune = true
osd_numa_auto_affinity = true
[mgr]
mgr/cephadm/autotune_memory_target_ratio = 0.2
stack@devstack-quincy:~/devstack$ sudo cephadm shell
Inferring fsid ddc1d35b-4f75-4956-a55d-6a9d7e77f1d1
Inferring config /var/lib/ceph/ddc1d35b-4f75-4956-a55d-6a9d7e77f1d1/mon.devstack-quincy/config
Using ceph image with id 'e5af760fa1c1' and tag 'v17.2.1' created on 2022-06-23 19:49:45 +0000 UTC
quay.io/ceph/ceph@sha256:d3f3e1b59a304a280a3a81641ca730982da141dad41e942631e4c5d88711a66b
root@devstack-quincy:/# ceph -s
cluster:
id: ddc1d35b-4f75-4956-a55d-6a9d7e77f1d1
health: HEALTH_OK
services:
mon: 1 daemons, quorum devstack-quincy (age 15m)
mgr: devstack-quincy.repaot(active, since 14m), standbys: devstack-quincy.marjpi
mds: 1/1 daemons up, 1 standby
osd: 1 osds: 1 up (since 13m), 1 in (since 14m)
data:
volumes: 1/1 healthy
pools: 4 pools, 25 pgs
objects: 27 objects, 451 KiB
usage: 21 MiB used, 30 GiB / 30 GiB avail
pgs: 25 active+clean
io:
client: 85 B/s rd, 0 op/s rd, 0 op/s wr
Updated by Ramana Raja almost 2 years ago
Once we confirm that removing osd_pool_default_pgp_num and osd_pool_default_pg_num in devstack-plugin-ceph works, we can close this tracker.
https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/851521
Updated by Ramana Raja over 1 year ago
- Status changed from Need More Info to Closed
Closing the bug. The change to devstack-plugin-ceph, https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/851521, fixed the issue in the OpenStack upstream CI.