Bug #14459
cephlab_rc_local needs to be smarter about waiting for the network
Description
From a conversation with dgalloway:
<wusui> i am having trouble ssh'ing from magna002 to magna049 magna079 and magna086 (all ask for ubuntu's password)
<dgalloway> ok. that leads me to believe something's wrong in the ansible. the logs indicate the user is created and password is deleted but that's obviously not the case
<wusui> okay. i have enough other machines locked. it's not crucial that you figure out these machines right now.
<dgalloway> i'll fix 079 and 086. can you put a ticket in with your observations and say 049 can be used to investigate?
<wusui> okay
<dgalloway> i just want to make sure it doesn't get forgotten
<dgalloway> you can just paste this IRC convo if you want
<wusui> okay
<dgalloway> thanks
Updated by Zack Cerza over 8 years ago
Just a bit of information: magna049 was reimaged a second time today, and ansible wasn't properly run after the second reimage.
[root@magna049 ~]# TZ=UTC ls -l /.cephlab_rc_local
-rw-r--r--. 1 root root 0 Jan 21 18:30 /.cephlab_rc_local
[root@magna049 ~]# ls -l /ceph-qa-ready
ls: cannot access /ceph-qa-ready: No such file or directory

[root@magna001 ansible]# TZ=UTC ls -l /var/log/ansible/magna049.log
-rw-r--r-- 1 root root 320903 Jan 21 17:04 /var/log/ansible/magna049.log

[root@magna049 ~]# cat /tmp/rc.local.log
+ '[' -e /.cephlab_rc_local ']'
+ sleep 30
+ wget -t1 -O /dev/null http://10.8.128.1:80/cblr/svc/op/trig/mode/post/system/magna049
--2016-01-21 13:29:12-- http://10.8.128.1/cblr/svc/op/trig/mode/post/system/magna049
Connecting to 10.8.128.1:80... connected.
HTTP request sent, awaiting response... 500 Internal Server Error
2016-01-21 13:30:12 ERROR 500: Internal Server Error.
+ true
+ touch /.cephlab_rc_local
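The trace above shows the script making a single wget attempt (-t1), ignoring the 500 response, and creating /.cephlab_rc_local anyway, so the trigger is never retried on later boots. One possible way to make this more robust is sketched below. This is only an illustration, not the change that was actually made in ceph-cm-ansible: the function names retry and notify_cobbler and the MAX_TRIES and RETRY_DELAY variables are hypothetical, and the defaults are arbitrary.

```shell
#!/bin/bash
# Hypothetical sketch: retry the post-install trigger and create the marker
# file only after the request has succeeded.

# retry MAX_TRIES DELAY CMD...
# Run CMD until it exits 0 or MAX_TRIES attempts have been made.
retry() {
    local max_tries=$1 delay=$2
    shift 2
    local attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max_tries" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}

# notify_cobbler URL MARKER
# Do nothing if MARKER already exists. Otherwise request URL with retries
# and touch MARKER only when a request succeeds.
notify_cobbler() {
    local url=$1 marker=$2
    if [ -e "$marker" ]; then
        return 0
    fi
    # MAX_TRIES and RETRY_DELAY are assumed tunables, not existing settings.
    if retry "${MAX_TRIES:-10}" "${RETRY_DELAY:-30}" wget -q -t1 -O /dev/null "$url"; then
        touch "$marker"
    else
        echo "post-install trigger failed; marker not created" >&2
        return 1
    fi
}
```

With the values from the log, the call would look like: notify_cobbler http://10.8.128.1:80/cblr/svc/op/trig/mode/post/system/magna049 /.cephlab_rc_local. Because the marker is left absent on failure, the next boot would try the trigger again instead of skipping it.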
Updated by Zack Cerza over 8 years ago
If you look at the http error log on magna001 around the timestamp Thu Jan 21 18:31:14 2016, there are lots of timeouts...
Updated by Zack Cerza about 8 years ago
- Project changed from 19 to ceph-cm-ansible
- Subject changed from Some magna machines are not reachable via passwordless ssh to cephlab_rc_local needs to be smarter about waiting for the network
- Assignee set to Zack Cerza
Updated by David Galloway almost 8 years ago
- Status changed from New to Resolved
- Assignee changed from Zack Cerza to David Galloway
Updated by David Galloway almost 8 years ago
- Has duplicate Bug #14477: Refine post-install cobbler ansible trigger added