Bug #57870

cephadm: --apply-spec is trying to do too much and failing as a result

Added by Adam King 4 months ago. Updated 17 days ago.

Status: Pending Backport
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags: backport_processed
Backport: quincy
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

--apply-spec is intended to do two things:

1) distribute SSH keys to the hosts that have host specs in the applied spec
2) open a shell and apply the given spec to the mgr module

Right now, the first step can fail, which in turn can cause the second step to fail if it tries to add a host that does not have the public key present. I think this is largely because we attempt a fair amount of actual YAML parsing in the binary without a YAML parser, which is very error prone. We should do only the bare minimum needed to get the hostnames and addresses for distributing the SSH keys, and then apply the spec to the mgr module. Most of the attempted parsing is unnecessary.

For example, trying to apply the spec

service_type: host
addr: 192.168.122.177
hostname: vm-01
---
service_type: host
addr: 192.168.122.217
hostname: vm-02
---
service_type: alertmanager
service_name: alertmanager
placement:
  count: 1
---
service_type: crash
service_name: crash
placement:
  host_pattern: '*'
---

resulted in

Applying spec.yaml to cluster
Unable to parse spec.yaml succesfully
Non-zero exit code 22 from /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e CONTAINER_IMAGE=quay.io/adk3798/ceph:latest -e NODE_NAME=vm-00 -e CEPH_USE_RANDOM_NONCE=1 -v /var/log/ceph/d6b74f9a-4bde-11ed-91b8-5254004b5306:/var/log/ceph:z -v /tmp/ceph-tmprbejd10z:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpmsw36jg1:/etc/ceph/ceph.conf:z -v /root/spec.yaml:/tmp/spec.yml:ro quay.io/adk3798/ceph:latest orch apply -i /tmp/spec.yml
/usr/bin/ceph: stderr Error EINVAL: Failed to connect to vm-01 (192.168.122.177). Permission denied
/usr/bin/ceph: stderr Log: Opening SSH connection to 192.168.122.177, port 22
/usr/bin/ceph: stderr [conn=1] Connected to SSH server at 192.168.122.177, port 22
/usr/bin/ceph: stderr [conn=1]   Local address: 192.168.122.125, port 52840
/usr/bin/ceph: stderr [conn=1]   Peer address: 192.168.122.177, port 22
/usr/bin/ceph: stderr [conn=1] Beginning auth for user root
/usr/bin/ceph: stderr [conn=1] Auth failed for user root
/usr/bin/ceph: stderr [conn=1] Connection failure: Permission denied
/usr/bin/ceph: stderr [conn=1] Aborting connection
/usr/bin/ceph: stderr 

Applying spec.yaml to cluster failed!

despite the spec being valid.
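
To illustrate the "bare minimum" approach proposed in the description, here is a minimal sketch that pulls only hostnames and addresses out of a multi-document spec without any YAML library. It is not the cephadm implementation: `extract_hosts` and `SPEC` are hypothetical names, and the sketch assumes that host specs keep `service_type`, `hostname` and `addr` as unindented `key: value` lines with no trailing comments, and that a host without `addr` can be reached by its hostname.

```python
import re
from typing import Dict, List

# Hypothetical helper for illustration only; not taken from cephadm.
# Matches only unindented (top-level) keys, so nested keys such as
# "  host_pattern: '*'" under "placement:" are ignored.
_TOP_LEVEL_KV = re.compile(r'^(service_type|hostname|addr)\s*:\s*(.*?)\s*$')


def extract_hosts(spec_text: str) -> List[Dict[str, str]]:
    """Return hostname/addr pairs for every 'service_type: host' document."""
    hosts: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    # Append a trailing separator so the last document is always flushed.
    for line in spec_text.splitlines() + ['---']:
        if line.strip() == '---':
            if current.get('service_type') == 'host' and 'hostname' in current:
                hosts.append({
                    'hostname': current['hostname'],
                    # Assumption: fall back to the hostname when addr is absent.
                    'addr': current.get('addr', current['hostname']),
                })
            current = {}
            continue
        match = _TOP_LEVEL_KV.match(line)
        if match:
            # Strip optional surrounding quotes from the scalar value.
            current[match.group(1)] = match.group(2).strip('\'"')
    return hosts


# The spec from this report.
SPEC = """\
service_type: host
addr: 192.168.122.177
hostname: vm-01
---
service_type: host
addr: 192.168.122.217
hostname: vm-02
---
service_type: alertmanager
service_name: alertmanager
placement:
  count: 1
---
service_type: crash
service_name: crash
placement:
  host_pattern: '*'
---
"""
```

Calling `extract_hosts(SPEC)` yields entries for vm-01 and vm-02 only; the alertmanager and crash documents are skipped rather than parsed, and full validation of the spec is left to the mgr module when the spec is applied.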


Related issues

Copied to Orchestrator - Backport #58454: quincy: cephadm: --apply-spec is trying to do too much and failing as a result (In Progress)

History

#1 Updated by Adam King 4 months ago

  • Backport set to quincy

#2 Updated by Adam King 4 months ago

  • Pull request ID set to 48496

#3 Updated by Adam King 17 days ago

  • Status changed from In Progress to Pending Backport

#4 Updated by Backport Bot 17 days ago

  • Copied to Backport #58454: quincy: cephadm: --apply-spec is trying to do too much and failing as a result added

#5 Updated by Backport Bot 17 days ago

  • Tags set to backport_processed
