Bug #50306

/etc/hosts is not passed to ceph containers; clusters that were relying on /etc/hosts for name resolution will have strange behavior

Added by John Fulton about 3 years ago. Updated over 2 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: cephadm
Target version: -
% Done: 0%
Source: Community (dev)
Tags: -
Backport: pacific
Regression: No
Severity: 3 - minor
Reviewed: -
Affected Versions: -
ceph-qa-suite: -
Pull request ID: -
Crash signature (v1): -
Crash signature (v2): -

Description

While using `cephadm bootstrap --apply-spec` to bootstrap a spec containing other hosts, cephadm attempts to set up SSH keys for root on those other hosts even though I passed the following options, which refer to an account and SSH keypair that already work:

  --ssh-private-key /home/ceph-admin/.ssh/id_rsa \
  --ssh-public-key /home/ceph-admin/.ssh/id_rsa.pub \
  --ssh-user ceph-admin \

I understand that cephadm defaults to the root SSH user and root's SSH keys when sudo is used, but when the options above are passed, cephadm should use the given keys and account to override those defaults.
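
For reference, the connectivity checks that cephadm itself suggests in its error output (shown below) can be run by hand against the provided account. A minimal sketch, using the hostnames and the ceph-admin user from this reproducer:

  # Export the public key the cephadm mgr module actually uses and install
  # it for the ssh user on one of the other hosts:
  ceph cephadm get-pub-key > ~/ceph.pub
  ssh-copy-id -f -i ~/ceph.pub ceph-admin@oc0-ceph-3

  # Reproduce the mgr's SSH connection from the bootstrap host:
  ceph cephadm get-ssh-config > ssh_config
  ceph config-key get mgr/cephadm/ssh_identity_key > ~/cephadm_private_key
  chmod 0600 ~/cephadm_private_key
  ssh -F ssh_config -i ~/cephadm_private_key ceph-admin@oc0-ceph-3

If this check passes from the host while `orch apply` still fails, the keys are fine and the failure is happening inside the mgr container, which is what points at /etc/hosts here.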

I hit this issue even with the fix for bug #50041 installed, as reported in Bug #49277.

[ceph-admin@oc0-ceph-2 ~]$ sudo /usr/sbin/cephadm --image quay.ceph.io/ceph-ci/daemon:v6.0.0-stable-6.0-pacific-centos-8-x86_64 bootstrap --skip-firewalld --ssh-private-key /home/ceph-admin/.ssh/id_rsa --ssh-public-key /home/ceph-admin/.ssh/id_rsa.pub --ssh-user ceph-admin --allow-fqdn-hostname --output-keyring /etc/ceph/ceph.client.admin.keyring --output-config /etc/ceph/ceph.conf --fsid ca9bf37b-ed0f-4e5a-bb21-e5b5f9b75135 --apply-spec /home/ceph-admin/specs/ceph_spec.yaml --config /home/ceph-admin/bootstrap_ceph.conf --skip-monitoring-stack --skip-dashboard --mon-ip 192.168.24.22                                            
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman|docker (/bin/podman) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: ca9bf37b-ed0f-4e5a-bb21-e5b5f9b75135
Verifying IP 192.168.24.22 port 3300 ...
Verifying IP 192.168.24.22 port 6789 ...
Mon IP 192.168.24.22 is in CIDR network 192.168.24.0/24
- internal network (--cluster-network) has not been provided, OSD replication will default to the public_network
Pulling container image quay.ceph.io/ceph-ci/daemon:v6.0.0-stable-6.0-pacific-centos-8-x86_64...
Ceph version: ceph version 16.2.0 (0c2054e95bcd9b30fdd908a79ac1d8bbc3394442) pacific (stable)
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting mon public_network to 192.168.24.0/24
Wrote config to /etc/ceph/ceph.conf
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Creating mgr...
Verifying port 9283 ...
Waiting for mgr to start...
Waiting for mgr...
mgr not available, waiting (1/15)...
mgr not available, waiting (2/15)...
mgr not available, waiting (3/15)...
mgr is available
Enabling cephadm module...
Waiting for the mgr to restart...
Waiting for mgr epoch 5...
mgr epoch 5 is available
Setting orchestrator backend to cephadm...
Using provided ssh keys...
Adding host oc0-ceph-2...
Deploying mon service with default placement...
Deploying mgr service with default placement...
Deploying crash service with default placement...
Applying /home/ceph-admin/specs/ceph_spec.yaml to cluster
Adding ssh key to oc0-ceph-3
Adding ssh key to oc0-ceph-4
Non-zero exit code 22 from /bin/podman run --rm --ipc=host --no-hosts --net=host --entrypoint /usr/bin/ceph --init -e CONTAINER_IMAGE=quay.ceph.io/ceph-ci/daemon:v6.0.0-stable-6.0-pacific-centos-8-x86_64 -e NODE_NAME=oc0-ceph-2 -e CEPH_USE_RANDOM_NONCE=1 -v /var/log/ceph/ca9bf37b-ed0f-4e5a-bb21-e5b5f9b75135:/var/log/ceph:z -v /tmp/ceph-tmpbfayxvwp:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpjm_n51dx:/etc/ceph/ceph.conf:z -v /home/ceph-admin/specs/ceph_spec.yaml:/tmp/spec.yml:z quay.ceph.io/ceph-ci/daemon:v6.0.0-stable-6.0-pacific-centos-8-x86_64 orch apply -i /tmp/spec.yml                                            
/usr/bin/ceph: stderr Error EINVAL: Failed to connect to oc0-ceph-3 (oc0-ceph-3).
/usr/bin/ceph: stderr Please make sure that the host is reachable and accepts connections using the cephadm SSH key
/usr/bin/ceph: stderr
/usr/bin/ceph: stderr To add the cephadm SSH key to the host:
/usr/bin/ceph: stderr > ceph cephadm get-pub-key > ~/ceph.pub
/usr/bin/ceph: stderr > ssh-copy-id -f -i ~/ceph.pub ceph-admin@oc0-ceph-3
/usr/bin/ceph: stderr
/usr/bin/ceph: stderr To check that the host is reachable:
/usr/bin/ceph: stderr > ceph cephadm get-ssh-config > ssh_config
/usr/bin/ceph: stderr > ceph config-key get mgr/cephadm/ssh_identity_key > ~/cephadm_private_key
/usr/bin/ceph: stderr > chmod 0600 ~/cephadm_private_key
/usr/bin/ceph: stderr > ssh -F ssh_config -i ~/cephadm_private_key ceph-admin@oc0-ceph-3
Traceback (most recent call last):
  File "/usr/sbin/cephadm", line 7924, in <module>
    main()
  File "/usr/sbin/cephadm", line 7912, in main
    r = ctx.func(ctx)
  File "/usr/sbin/cephadm", line 1717, in _default_image
    return func(ctx)
  File "/usr/sbin/cephadm", line 4037, in command_bootstrap
    out = cli(['orch', 'apply', '-i', '/tmp/spec.yml'], extra_mounts=mounts)
  File "/usr/sbin/cephadm", line 3931, in cli
    ).run(timeout=timeout)
  File "/usr/sbin/cephadm", line 3174, in run
    desc=self.entrypoint, timeout=timeout)
  File "/usr/sbin/cephadm", line 1411, in call_throws
    raise RuntimeError('Failed command: %s' % ' '.join(command))
RuntimeError: Failed command: /bin/podman run --rm --ipc=host --no-hosts --net=host --entrypoint /usr/bin/ceph --init -e CONTAINER_IMAGE=quay.ceph.io/ceph-ci/daemon:v6.0.0-stable-6.0-pacific-centos-8-x86_64 -e NODE_NAME=oc0-ceph-2 -e CEPH_USE_RANDOM_NONCE=1 -v /var/log/ceph/ca9bf37b-ed0f-4e5a-bb21-e5b5f9b75135:/var/log/ceph:z -v /tmp/ceph-tmpbfayxvwp:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpjm_n51dx:/etc/ceph/ceph.conf:z -v /home/ceph-admin/specs/ceph_spec.yaml:/tmp/spec.yml:z quay.ceph.io/ceph-ci/daemon:v6.0.0-stable-6.0-pacific-centos-8-x86_64 orch apply -i /tmp/spec.yml                                         
[ceph-admin@oc0-ceph-2 ~]$ 
[ceph-admin@oc0-ceph-2 ~]$ cat /home/ceph-admin/specs/ceph_spec.yaml
---
service_type: host
addr: oc0-ceph-3
hostname: oc0-ceph-3
---
service_type: host
addr: oc0-ceph-4
hostname: oc0-ceph-4
---
service_type: mon
placement:
  hosts:
    - oc0-ceph-2
    - oc0-ceph-3
    - oc0-ceph-4
---
service_type: osd
service_id: default_drive_group
placement:
  hosts:
    - oc0-ceph-2
    - oc0-ceph-3
    - oc0-ceph-4
data_devices:
  all: true
[ceph-admin@oc0-ceph-2 ~]$ 
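
The root cause named in the title is visible in the failed podman invocation above: cephadm starts the ceph CLI container with --no-hosts, so the container gets a minimal /etc/hosts, and names that resolve only through the host's /etc/hosts (such as oc0-ceph-3 here) cannot be resolved inside it. A quick way to confirm this, sketched against the image from this report and assuming getent is available in it:

  # With --no-hosts, the container does not inherit the host's /etc/hosts,
  # so a name that exists only there should fail to resolve:
  sudo podman run --rm --no-hosts --entrypoint getent \
      quay.ceph.io/ceph-ci/daemon:v6.0.0-stable-6.0-pacific-centos-8-x86_64 \
      hosts oc0-ceph-3

  # Without --no-hosts, podman copies the host's /etc/hosts into the
  # container and the same lookup succeeds:
  sudo podman run --rm --entrypoint getent \
      quay.ceph.io/ceph-ci/daemon:v6.0.0-stable-6.0-pacific-centos-8-x86_64 \
      hosts oc0-ceph-3

Until the behavior changes, putting the hosts in DNS, or using each host's IP address in the spec's addr: field instead of its hostname, avoids relying on /etc/hosts.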

Files

gt7DGcXc.txt (9.29 KB) - Daniel Pivonka, 04/14/2021 06:12 PM

Related issues: 1 (0 open, 1 closed)

Related to Orchestrator - Bug #49654: iSCSI stops working after Upgrade 15.2.4 -> 15.2.9 (Resolved)
