Bug #49277

Updated by Sebastian Wagner about 3 years ago

The feature introduced by https://tracker.ceph.com/issues/44873 seems to have the following flaw. 

   If I bootstrap a cluster on node oc0-ceph-0 _with_ --apply-spec, then the bootstrap proceeds, but the spec [1] is never applied and the cephadm log shows it waiting to acquire a lock. 

   If I bootstrap a cluster on node oc0-ceph-0 _without_ --apply-spec and then apply the same spec file [1] a few seconds later, then the spec is applied flawlessly.  


 I have Ansible tasks [2][3] that reproduce this consistently. I used cephadm-15.2.5-0.el8.x86_64.rpm with the latest "docker.io/ceph/ceph:v15" image as of Jan 10, 2020. 


 The command run by Ansible is: 

  /usr/sbin/cephadm bootstrap --ssh-private-key /home/ceph-admin/.ssh/id_rsa --ssh-public-key /home/ceph-admin/.ssh/id_rsa.pub --ssh-user ceph-admin --output-keyring /etc/ceph/ceph.client.admin.keyring --output-config /etc/ceph/ceph.conf --fsid 77642368-c850-5eb9-ba49-e59024b4d0ab --mon-ip 192.168.24.6 


 FWIW: This is not a major problem for TripleO's cephadm integration because we can bootstrap a single node and apply the spec afterwards.  
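
 A rough sketch of that two-step workaround, run on the bootstrap node after @cephadm bootstrap@ has finished without --apply-spec. It is not taken from the tasks in [3]. The @SPEC@ path, the @RUN@ switch and the @apply_spec_cmd@ helper are names made up for this example, and it assumes that @cephadm shell --mount FILE@ exposes the file under /mnt inside the container and that the spec path contains no spaces.

```shell
#!/bin/sh
# Workaround sketch: bootstrap without --apply-spec, then apply the spec afterwards.
# SPEC and RUN are hypothetical names introduced for this example.
SPEC="${SPEC:-./ceph_spec.yml}"

# Build the command that applies a spec file from the bootstrap node.
# Assumption: `cephadm shell --mount FILE` makes FILE visible as /mnt/<basename>
# inside the container, so `ceph orch apply -i` can read it there.
apply_spec_cmd() {
    spec_path="$1"
    printf 'cephadm shell --mount %s -- ceph orch apply -i /mnt/%s' \
        "$spec_path" "$(basename "$spec_path")"
}

cmd="$(apply_spec_cmd "$SPEC")"
if [ "${RUN:-0}" = "1" ]; then
    # Execute for real; requires a bootstrapped cluster on this node.
    sh -c "$cmd"
else
    # Default: only print the command that would be run.
    echo "$cmd"
fi
```

 By default the script only prints the command. Setting RUN=1 executes it, which makes sense only once the bootstrap command has returned.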



 [1] ceph_spec.yml 
 --- 

 <pre><code class="yaml"> 
 service_type: host 
 addr: oc0-ceph-1 
 hostname: oc0-ceph-1 
 --- 
 service_type: host 
 addr: oc0-ceph-2 
 hostname: oc0-ceph-2 
 --- 
 service_type: mon 
 placement: 
   hosts: 
     - oc0-ceph-0 
     - oc0-ceph-1 
     - oc0-ceph-2 
 --- 
 service_type: osd 
 service_id: default_drive_group 
 placement: 
   hosts: 
     - oc0-ceph-0 
     - oc0-ceph-1 
     - oc0-ceph-2 
 data_devices: 
   all: true 

 </code></pre> 

 [2] https://review.opendev.org/c/openstack/tripleo-ansible/+/770674/54/tripleo_ansible/roles/tripleo_cephadm/tasks/bootstrap.yaml 
 [3] https://review.opendev.org/c/openstack/tripleo-ansible/+/770674/54/tripleo_ansible/roles/tripleo_cephadm/tasks/apply_spec.yaml 
