Bug #52550
cephadm: using --single-host-defaults with bootstrap results in health warn state
Status: Closed
Description
After deploying a single-node cluster on a two-disk VM with --single-host-defaults, then adding OSDs and pools, the cluster enters HEALTH_WARN due to the pg undersized health check.
e.g.
[ceph: root@cephtest /]# ceph health detail
HEALTH_WARN Degraded data redundancy: 33 pgs undersized
[WRN] PG_DEGRADED: Degraded data redundancy: 33 pgs undersized
pg 1.0 is stuck undersized for 24m, current state active+undersized, last acting [1]
pg 2.0 is stuck undersized for 23m, current state active+undersized, last acting [0]
pg 2.1 is stuck undersized for 23m, current state active+undersized, last acting [1]
pg 2.2 is stuck undersized for 23m, current state active+undersized, last acting [0]
pg 2.3 is stuck undersized for 23m, current state active+undersized, last acting [1]
pg 2.4 is stuck undersized for 23m, current state active+undersized, last acting [1]
pg 2.5 is stuck undersized for 23m, current state active+undersized, last acting [1]
pg 2.6 is stuck undersized for 23m, current state active+undersized, last acting [1]
pg 2.7 is stuck undersized for 23m, current state active+undersized, last acting [1]
Looking at the cephadm source, I can see the CRUSH chooseleaf type being set to 0 (osd) and the default pool size set to 2, but the default CRUSH rule is not changed. So when a pool create runs, the default rule (0) is used, which still has a failure domain of host.
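The mismatch can be confirmed by dumping the default rule; on a stock cluster the rule is usually named replicated_rule (that name is an assumption here, not taken from this report), and its chooseleaf step still targets host:

    # Inspect the default CRUSH rule; look for the chooseleaf step's "type"
    ceph osd crush rule dump replicated_rule
    # With --single-host-defaults the pool size is 2, but the rule above
    # still picks replicas across distinct *hosts*, so on one host the
    # second replica can never be placed and PGs stay undersized.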
Steps to reproduce
1. create a machine with 2 disks for OSD use
2. bootstrap the cluster with --single-host-defaults
3. configure the OSDs with ceph orch apply osd --all-available-devices
4. create a pool - ceph osd pool create rbd
The pool will point to the default CRUSH rule (0), which is still set with a failure domain of host.
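As a sketch of a possible workaround (not part of this report): create a replicated rule whose failure domain is osd and point affected pools at it. The rule name replicated_rule_osd below is illustrative:

    # Create a replicated rule that selects replicas across OSDs,
    # not hosts, so two replicas can land on one node
    ceph osd crush rule create-replicated replicated_rule_osd default osd
    # Switch an existing pool over to the new rule
    ceph osd pool set rbd crush_rule replicated_rule_osd

The proper fix (per the linked pull request) is for cephadm to adjust the default rule itself when --single-host-defaults is used.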
Kudos to Chris Blum for reporting the issue.
Updated by Sebastian Wagner over 2 years ago
- Status changed from New to Pending Backport
- Pull request ID set to 43616
Updated by Sebastian Wagner over 2 years ago
- Status changed from Pending Backport to Resolved