Bug #12380

ansible failures (github certificate validation) on setup

Added by Greg Farnum about 4 years ago. Updated almost 4 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
ansible
Target version:
-
Start date:
07/17/2015
Due date:
% Done:
0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:

Description

Sorry if this is a dup; I can't figure out the actual failure inside ansible here.

info:   http://pulpito.ceph.com/teuthology-2015-07-14_23:14:02-samba-master---basic-multi/974604/
log:    http://qa-proxy.ceph.com/teuthology/teuthology-2015-07-14_23:14:02-samba-master---basic-multi/974604/
sentry: http://sentry.ceph.com/sepia/teuthology/search?q=d0e784afd67747c58f93c250b5603181

    Command failed with status 2: 'ansible-playbook -v --extra-vars
    \'{"ansible_ssh_user": "ubuntu"}\' -i /etc/ansible/hosts --limit
    burnupi05.front.sepia.ceph.com,burnupi46.front.sepia.ceph.com
    /var/lib/teuthworker/src/ceph-cm-ansible_master/cephlab.yml'

History

#1 Updated by Dan Mick about 4 years ago

Relevant failure:

failed: [burnupi46.front.sepia.ceph.com] => (item={'name': 'aschoen', 'key': 'https://raw.githubusercontent.com/ceph/keys/master/ssh/aschoen.pub'}) => {"failed": true, "item": {"key": "https://raw.githubusercontent.com/ceph/keys/master/ssh/aschoen.pub", "name": "aschoen"}}
msg: Failed to validate the SSL certificate for raw.githubusercontent.com:443. Use validate_certs=False (insecure) or make sure your managed systems have a valid CA certificate installed. Paths checked for this platform: /etc/ssl/certs, /etc/pki/ca-trust/extracted/pem, /etc/pki/tls/certs, /usr/share/ca-certificates/cacert.org, /etc/ansible

(this happened for several of the keys)

This has happened several times recently.

Why is CA validation failing for github all of a sudden? Are we missing a package that contains root certs, and/or should we disable cert validation?

#2 Updated by Dan Mick about 4 years ago

  • Priority changed from High to Urgent

#3 Updated by Dan Mick about 4 years ago

  • Subject changed from ansible failures on setup to ansible failures (github certificate validation) on setup

#4 Updated by Zack Cerza about 4 years ago

My random googling found:
https://github.com/ansible/ansible/issues/6474

Perhaps we should be calling update-ca-certificates ?

#5 Updated by Dan Mick about 4 years ago

Workaround: although it appears undocumented, it looks like supplying the param "validate_certs=no" to the authorized_key module would make fetch_url() avoid trying the validation. I don't know offhand why it should be failing, but this might avoid the issue.

--- a/roles/testnode/tasks/ssh.yml
+++ b/roles/testnode/tasks/ssh.yml
@@ -21,5 +21,6 @@
   authorized_key:
     user="{{ teuthology_user }}" 
     key=https://raw.githubusercontent.com/ceph/keys/autogenerated/ssh/@all.pub
+    validate_certs=no
   tags:
     - pubkeys

#6 Updated by Dan Mick about 4 years ago

I further note that chef and ansible both add a wgetrc that disables certificate checking on Debian, but of course that doesn't affect the authorized_key play. In the chef world, the equivalent step would have happened outside of the CM, perhaps based on wget.

I also note that nothing seems to install ca-certificates, which is probably where openssl gets its certs from, on Ubuntu at least. burnupi05, where this failed, was running trusty.

I also verified on my desktop that with validate_certs=no the CA check doesn't happen.

Maybe the right thing is to add ca-certificates to the required packages for test nodes.
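
For illustration, adding the package could be done with a task along these lines. This is only a sketch: the task name and its placement are hypothetical, it is not taken from ceph-cm-ansible, and it assumes apt-based test nodes such as the trusty machine mentioned above.

```yaml
# Hypothetical task, not from ceph-cm-ansible: make sure the CA root
# certificate bundle is present on Debian/Ubuntu test nodes.
- name: Ensure ca-certificates is installed (illustrative sketch)
  apt:
    name: ca-certificates
    state: present
    update_cache: yes
  # Only run on apt-based distributions.
  when: ansible_os_family == "Debian"
```

Whether a missing package is really the cause is not established in this ticket, so this would be a precaution rather than a confirmed fix.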

#7 Updated by Andrew Schoen about 4 years ago

Maybe we do want ``validate_certs: false`` here? Another workaround would be to store these keys directly in the secrets repo without making the call out to keys.git on github.

#8 Updated by Zack Cerza about 4 years ago

Andrew Schoen wrote:

Maybe we do want ``validate_certs: false`` here? Another workaround would be to store these keys directly in the secrets repo without making the call out to keys.git on github.

No way am I skipping cert validation when downloading SSH keys :)

I just opened a PR which I'll mention in the next comment.

#9 Updated by Zack Cerza about 4 years ago

  • Status changed from New to Need Review
  • Assignee set to Zack Cerza

This PR causes ansible to retry each key three times with a five-second delay.

https://github.com/ceph/ceph-cm-ansible/pull/86
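
A retry of this kind might look roughly like the task below. This is only a sketch of the general Ansible retry pattern (register / until / retries / delay) applied to the authorized_key task from the diff earlier in this ticket. It is not copied from the PR, and the registered variable name pubkey_result is made up for illustration.

```yaml
# Sketch only: retry the key download instead of disabling cert validation.
# "pubkey_result" is a hypothetical variable name.
- name: Add authorized keys, retrying on transient failures
  authorized_key:
    user: "{{ teuthology_user }}"
    key: "https://raw.githubusercontent.com/ceph/keys/autogenerated/ssh/@all.pub"
  register: pubkey_result
  # Re-run the task until it succeeds or the retries are used up.
  # Older Ansible releases spell this test as a filter: pubkey_result|success
  until: pubkey_result is succeeded
  retries: 3
  delay: 5   # seconds to wait between attempts
  tags:
    - pubkeys
```

Certificate validation stays enabled here, so an intermittent failure is retried rather than bypassed.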

#10 Updated by Dan Mick about 4 years ago

What about ca-certificates? Do these machines have it installed? Should we add it to the list of packages to install?

#11 Updated by Zack Cerza about 4 years ago

I really doubt that ca-certificates is the issue; I've seen the task succeed on lots of machines without it installed. I also haven't been able to reproduce the failure manually.

#13 Updated by Zack Cerza about 4 years ago

  • Status changed from Testing to Resolved
