Bug #47127
osd_id_claims uses shortlabel instead of the FQDN and cannot be fulfilled.
Description
Hello!
I was testing replacing disks on my 9-node Ceph 15.2.4 cluster, provisioned by cephadm and using the orchestrator.
We give our hosts an FQDN instead of a shortlabel in /etc/hostname.
The cluster:
ceph orch host ls
HOST                     ADDR                     LABELS  STATUS
mon1.ceph2.example.net   mon1.ceph2.example.net
mon2.ceph2.example.net   mon2.ceph2.example.net
mon3.ceph2.example.net   mon3.ceph2.example.net
node1.ceph2.example.net  node1.ceph2.example.net
node2.ceph2.example.net  node2.ceph2.example.net
node3.ceph2.example.net  node3.ceph2.example.net
node4.ceph2.example.net  node4.ceph2.example.net
node5.ceph2.example.net  node5.ceph2.example.net
node6.ceph2.example.net  node6.ceph2.example.net
I have told the orchestrator to remove osd.59 with the --replace flag so that the OSD ID gets reserved for this host, like so:
ceph orch osd rm 59 --replace
We have 60 OSDs, 0 through 59. We replaced the disk that was osd.59 on node6.ceph2.example.net and the orchestrator gave it osd.60.
59 is still claimed and reserved for 'node6'.
What I think goes wrong is that node6 does not match node6.ceph2.example.net and therefore the orchestrator hands out a new ID.
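To illustrate what I mean, here is a minimal sketch of the lookup mismatch (plain Python of my own, not actual cephadm code; the function name is hypothetical):

# Hypothetical illustration of the mismatch, not cephadm code.
# The claims mapping is keyed by the short hostname, as in my spec below.
osd_id_claims = {'node6': ['59']}

def claims_for_host(host):
    # The orchestrator looks hosts up by the name they were added with,
    # which on my cluster is the FQDN.
    return osd_id_claims.get(host, [])

print(claims_for_host('node6'))                    # ['59']
print(claims_for_host('node6.ceph2.example.net'))  # [] -> a new ID gets handed out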
The spec:
ceph orch ls osd --export
block_db_size: null
block_wal_size: null
data_devices: null
data_directories: null
db_devices: null
db_slots: null
encrypted: false
journal_devices: null
journal_size: null
objectstore: bluestore
osd_id_claims: {}
osds_per_device: null
placement:
  hosts:
  - hostname: node1.ceph2.example.net
    name: ''
    network: ''
service_id: '1'
service_name: osd.1
service_type: osd
unmanaged: true
wal_devices: null
wal_slots: null
---
block_db_size: null
block_wal_size: null
data_devices:
  all: false
  limit: null
  model: null
  paths: []
  rotational: 0
  size: null
  vendor: null
data_directories: null
db_devices: null
db_slots: null
encrypted: false
journal_devices: null
journal_size: null
objectstore: bluestore
osd_id_claims:
  node6:
  - '59'
osds_per_device: null
placement:
  host_pattern: node*
service_id: nvme_drive_group
service_name: osd.nvme_drive_group
service_type: osd
unmanaged: false
wal_devices: null
wal_slots: null
The cephadm log:
8/25/20 9:54:17 AM [INF] Applying nvme_drive_group on host node6.ceph2.example.net...
8/25/20 9:54:17 AM [INF] Applying nvme_drive_group on host node5.ceph2.example.net...
8/25/20 9:54:17 AM [INF] Applying nvme_drive_group on host node4.ceph2.example.net...
8/25/20 9:54:17 AM [INF] Applying nvme_drive_group on host node3.ceph2.example.net...
8/25/20 9:54:17 AM [INF] Applying nvme_drive_group on host node2.ceph2.example.net...
8/25/20 9:54:17 AM [INF] Applying nvme_drive_group on host node1.ceph2.example.net...
8/25/20 9:54:17 AM [INF] Found osd claims for drivegroup nvme_drive_group -> {'node6': ['59']}
8/25/20 9:54:17 AM [INF] Found osd claims -> {'node6': ['59']}
8/25/20 9:54:17 AM [INF] Applying nvme_drive_group on host node6.ceph2.example.net...
8/25/20 9:54:17 AM [INF] Applying nvme_drive_group on host node5.ceph2.example.net...
8/25/20 9:54:17 AM [INF] Applying nvme_drive_group on host node4.ceph2.example.net...
8/25/20 9:54:17 AM [INF] Applying nvme_drive_group on host node3.ceph2.example.net...
8/25/20 9:54:17 AM [INF] Applying nvme_drive_group on host node2.ceph2.example.net...
8/25/20 9:54:17 AM [INF] Applying nvme_drive_group on host node1.ceph2.example.net...
8/25/20 9:54:17 AM [INF] Found osd claims for drivegroup nvme_drive_group -> {'node6': ['59']}
8/25/20 9:54:17 AM [INF] Found osd claims -> {'node6': ['59']}
My Ceph health status is now WARNING:
root @ node1.ceph2 # ceph health detail
HEALTH_WARN 1 stray daemon(s) not managed by cephadm
[WRN] CEPHADM_STRAY_DAEMON: 1 stray daemon(s) not managed by cephadm
    stray daemon osd.59 on host node6.ceph2.example.net not managed by cephadm
There is no (stray) daemon for osd.59 running on node6.ceph2.example.net. I assume this error is incorrect and should say that there are unclaimed OSD IDs on node6.ceph2.example.net.
root @ node6.ceph2 # docker ps
CONTAINER ID  IMAGE              COMMAND                  CREATED       STATUS       PORTS  NAMES
398897ed1450  ceph/ceph:v15.2.4  "/usr/bin/ceph-osd -…"   18 hours ago  Up 18 hours         ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-osd.60
6db03a587263  ceph/ceph:v15.2.4  "/usr/bin/ceph-osd -…"   7 days ago    Up 7 days           ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-osd.57
df367319fe04  ceph/ceph:v15.2.4  "/usr/bin/ceph-osd -…"   7 days ago    Up 7 days           ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-osd.53
b6a5e5978ba9  ceph/ceph:v15.2.4  "/usr/bin/ceph-osd -…"   7 days ago    Up 7 days           ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-osd.50
c3628ca8f50e  ceph/ceph:v15.2.4  "/usr/bin/ceph-osd -…"   7 days ago    Up 7 days           ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-osd.54
89787a68d57e  ceph/ceph:v15.2.4  "/usr/bin/ceph-osd -…"   7 days ago    Up 7 days           ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-osd.52
cf2e266c0394  ceph/ceph:v15.2.4  "/usr/bin/ceph-osd -…"   7 days ago    Up 7 days           ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-osd.56
c97f093ca2b4  ceph/ceph:v15.2.4  "/usr/bin/ceph-osd -…"   7 days ago    Up 7 days           ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-osd.58
e6c8b4b417e3  ceph/ceph:v15.2.4  "/usr/bin/ceph-osd -…"   7 days ago    Up 7 days           ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-osd.55
42571862a4a1  ceph/ceph:v15.2.4  "/usr/bin/ceph-osd -…"   7 days ago    Up 7 days           ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-osd.51
3a9f2cba488b  ceph/ceph:v15.2.4  "/usr/bin/ceph-crash…"  7 days ago    Up 7 days           ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-crash.node6
As per the instructions on https://docs.ceph.com/docs/master/cephadm/concepts/, my hosts comply with the second valid way to resolve hostnames:
~ root @ node6.ceph2 # hostname
node6.ceph2.example.net
~ root @ node6.ceph2 # hostname -s
node6
If there is any information you need to debug this, let me know and I will be happy to provide it.
For now I am stuck with a health state of WARNING and a claim on osd.59 that I cannot fulfill. How can I remove an OSD claim so that my cluster goes back to healthy?
I have tried removing the osd.nvme_drive_group spec and re-applying my spec.yml, but that did not do the trick.
Updated by Daniël Vos over 3 years ago
I have tracked it down to the find_destroyed_osds() function in mgr/cephadm/services/osd.py.
The data is extracted from ceph osd tree, but the tree doesn't take FQDNs into consideration and only shows shortlabels, like so:
root @ mon1.ceph2 # ceph osd tree
ID   CLASS  WEIGHT    TYPE NAME           STATUS  REWEIGHT  PRI-AFF
 -1         54.57889  root default
-15         27.28793      room 1d12
 -9          9.09698          host node4
 30    ssd   0.90970              osd.30      up   1.00000  1.00000
 31    ssd   0.90970              osd.31      up   1.00000  1.00000
 32    ssd   0.90970              osd.32      up   1.00000  1.00000
This is because Ceph automatically sets a ceph-osd daemon's location to be root=default host=HOSTNAME (based on the output of hostname -s). (Source: https://docs.ceph.com/docs/master/rados/operations/crush-map/)
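So the host bucket name in the CRUSH map is effectively the FQDN truncated at the first dot. A quick sketch of that derivation (my own illustration, not the actual ceph-osd code):

# Illustration only: the default CRUSH location is
# root=default host=$(hostname -s), i.e. the first label of the FQDN.
fqdn = 'node6.ceph2.example.net'   # contents of /etc/hostname on my nodes
short = fqdn.split('.', 1)[0]      # equivalent of `hostname -s`
print(short)                       # node6 -> the key that ends up in osd_id_claims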
If I am not mistaken the following happens:
The call to osd_id_claims.get(host, []) here: https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/services/osd.py#L40 will result in an empty list, because it looks for the FQDN of my host (node6.ceph2.example.net) in a mapping that only contains an entry for node6.
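If that is right, one possible fix (just a sketch on my part, someone who knows the code should judge; the helper name is my own) would be to fall back to the short hostname when the FQDN lookup comes up empty:

# Hypothetical fix sketch, not a tested patch: try the FQDN first,
# then fall back to the short hostname.
def get_claims(osd_id_claims, host):
    claims = osd_id_claims.get(host)
    if claims is None:
        claims = osd_id_claims.get(host.split('.', 1)[0], [])
    return claims

print(get_claims({'node6': ['59']}, 'node6.ceph2.example.net'))  # ['59']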
Of course, someone who has experience with this part of the code would need to confirm my findings (and hopefully think of a fix?).
Updated by Juan Miguel Olmo Martínez about 3 years ago
- Assignee set to Juan Miguel Olmo Martínez
Updated by Sebastian Wagner almost 3 years ago
- Assignee deleted (Juan Miguel Olmo Martínez)
- Priority changed from Normal to High
Updated by Sebastian Wagner over 2 years ago
- Priority changed from High to Normal
Updated by Sebastian Wagner over 2 years ago
- Related to Bug #50776: cephadm: CRUSH uses bare host names added