Bug #57092
teuthology reimage Ubuntu 22.04 fails w/ ssh_keyscan reached maximum tries
% Done: 0%
Regression: No
Severity: 3 - minor
Description
2022-08-10 13:32:50,385.385 INFO:teuthology.provision.fog.smithi100:Scheduling deploy of ubuntu 22.04
2022-08-10 13:32:50,907.907 INFO:teuthology.orchestra.console:Power off smithi100
2022-08-10 13:33:00,388.388 INFO:teuthology.orchestra.console:Power off for smithi100 completed
2022-08-10 13:33:00,488.488 INFO:teuthology.orchestra.console:Power on smithi100
2022-08-10 13:33:06,197.197 INFO:teuthology.orchestra.console:Power on for smithi100 completed
2022-08-10 13:33:06,299.299 INFO:teuthology.provision.fog.smithi100:Waiting for deploy to finish
2022-08-10 13:38:28,179.179 INFO:teuthology.orchestra.run:Running command with timeout 600
2022-08-10 13:38:28,723.723 INFO:teuthology.provision.fog.smithi100:Node is ready
2022-08-10 13:38:28,789.789 INFO:teuthology.orchestra.run.smithi100.stdout:smithi100.front.sepia.ceph.com
2022-08-10 13:38:28,892.892 INFO:teuthology.orchestra.run.smithi100.stdout:172.21.15.100 smithi100.front.sepia.ceph.com smithi100
2022-08-10 13:38:29,420.420 INFO:teuthology.provision.fog.smithi100:Deploy complete!
Traceback (most recent call last):
  File "/home/dgalloway/git/ceph/teuthology/virtualenv/bin/teuthology-lock", line 33, in <module>
    sys.exit(load_entry_point('teuthology', 'console_scripts', 'teuthology-lock')())
  File "/home/dgalloway/git/ceph/teuthology/scripts/lock.py", line 18, in main
    sys.exit(teuthology.lock.cli.main(parse_args(sys.argv[1:])))
  File "/home/dgalloway/git/ceph/teuthology/teuthology/lock/cli.py", line 211, in main
    ctx.desc, ctx.os_type, ctx.os_version, ctx.arch)
  File "/home/dgalloway/git/ceph/teuthology/teuthology/lock/ops.py", line 142, in lock_many
    return reimage_machines(ctx, machines, machine_type)
  File "/home/dgalloway/git/ceph/teuthology/teuthology/lock/ops.py", line 325, in reimage_machines
    reimaged = do_update_keys(list(reimaged.keys()))[1]
  File "/home/dgalloway/git/ceph/teuthology/teuthology/lock/ops.py", line 288, in do_update_keys
    keys_dict = misc.ssh_keyscan(machines, _raise=_raise)
  File "/home/dgalloway/git/ceph/teuthology/teuthology/misc.py", line 1108, in ssh_keyscan
    while proceed():
  File "/home/dgalloway/git/ceph/teuthology/teuthology/contextutil.py", line 133, in __call__
    raise MaxWhileTries(error_msg)
teuthology.exceptions.MaxWhileTries: 'ssh_keyscan smithi100.front.sepia.ceph.com' reached maximum tries (5) after waiting for 5 seconds
I can reimage and ssh directly though.
History
#1 Updated by Zack Cerza over 1 year ago
- Status changed from New to Resolved
- Assignee set to Zack Cerza
#2 Updated by Matan Breizman over 1 year ago
Also failed on RHEL:
/a/yuriw-2022-08-15_17:54:08-rados-wip-yuri2-testing-2022-08-15-0848-quincy-distro-default-smithi/6973876/
#3 Updated by David Galloway over 1 year ago
Matan Breizman wrote:
Also failed on RHEL:
/a/yuriw-2022-08-15_17:54:08-rados-wip-yuri2-testing-2022-08-15-0848-quincy-distro-default-smithi/6973876/
I think this is different. It seems smithi165 should have actually been available according to the reimage log.
+ TheTimeIs 2022-08-16T00:05:14.063
+ touch /.cephlab_net_configured
+ break
+ set +e
+ attempts=0
+ myips=
+ '[' '' '!=' '' ']'
+ '[' 0 -ge 10 ']'
++ ip -4 addr
++ grep -oP '(?<=inet\s)\d+(\.\d+){3}'
++ grep -v '127.0.0.1\|127.0.1.1'
+ myips=172.21.15.165
+ attempts=1
+ sleep 1
+ '[' 172.21.15.165 '!=' '' ']'
+ set -e
+ '[' -n 172.21.15.165 ']'
+ for ip in $myips
+ timeout 1s ping -I 172.21.15.165 -nq -c1 172.21.0.1
++ dig +short -x 172.21.15.165 @172.21.0.1
++ sed 's/\.com.*/\.com/g'
+ newhostname=smithi165.front.sepia.ceph.com
+ '[' -n smithi165.front.sepia.ceph.com ']'
+ hostname smithi165.front.sepia.ceph.com
++ hostname -d
+ newdomain=front.sepia.ceph.com
++ hostname -s
+ shorthostname=smithi165
+ echo smithi165
+ grep -q front.sepia.ceph.com /etc/hosts
+ sed -i 's/.*front.sepia.ceph.com.*/172.21.15.165 smithi165.front.sepia.ceph.com smithi165/g' /etc/hosts
+ break
+ command -v zypper
+ command -v apt-get
+ '[' -e /.cephlab_rc_local ']'
+ exit 0
[  OK  ] Started /etc/rc.d/rc.local Compatibility.
Starting Terminate Plymouth Boot Screen...
Starting Hold until boot process finishes up...
Red Hat Enterprise Linux 8.4 (Ootpa)
Kernel 4.18.0-372.19.1.el8_6.x86_64 on an x86_64
Activate the web console with: systemctl enable --now cockpit.socket
smithi165 login:
Yet
2022-08-16T00:00:29.746 INFO:teuthology.provision.fog.smithi074:Waiting for deploy to finish
2022-08-16T00:00:29.794 INFO:teuthology.orchestra.console:Power on for smithi165 completed
2022-08-16T00:00:29.896 INFO:teuthology.provision.fog.smithi165:Waiting for deploy to finish
2022-08-16T00:00:37.897 INFO:teuthology.orchestra.console:Power on for smithi066 completed
2022-08-16T00:00:37.998 INFO:teuthology.provision.fog.smithi066:Waiting for deploy to finish
2022-08-16T00:03:24.571 ERROR:teuthology.orchestra.connection:Error authenticating with smithi066.front.sepia.ceph.com: Authentication failed.
2022-08-16T00:05:01.640 INFO:teuthology.orchestra.run:Running command with timeout 600
2022-08-16T00:05:01.861 INFO:teuthology.provision.fog.smithi074:Node is ready
2022-08-16T00:05:01.877 INFO:teuthology.orchestra.run.smithi074.stdout:smithi074.front.sepia.ceph.com
2022-08-16T00:05:01.933 INFO:teuthology.orchestra.run.smithi074.stdout:172.21.15.74 smithi074.front.sepia.ceph.com smithi074
2022-08-16T00:05:02.252 INFO:teuthology.provision.fog.smithi074:Deploy complete!
2022-08-16T00:05:28.320 INFO:teuthology.orchestra.run:Running command with timeout 600
2022-08-16T00:05:28.389 INFO:teuthology.provision.fog.smithi165:Node is ready
2022-08-16T00:05:28.445 INFO:teuthology.orchestra.run.smithi165.stdout:smithi165.front.sepia.ceph.com
2022-08-16T00:05:28.501 INFO:teuthology.orchestra.run.smithi165.stdout:172.21.15.165 smithi165.front.sepia.ceph.com smithi165
2022-08-16T00:05:28.846 INFO:teuthology.provision.fog.smithi165:Deploy complete!
2022-08-16T00:05:55.963 INFO:teuthology.orchestra.run:Running command with timeout 600
2022-08-16T00:05:56.193 INFO:teuthology.provision.fog.smithi066:Node is ready
2022-08-16T00:05:56.209 INFO:teuthology.orchestra.run.smithi066.stdout:smithi066.front.sepia.ceph.com
2022-08-16T00:05:56.265 INFO:teuthology.orchestra.run.smithi066.stdout:172.21.15.66 smithi066.front.sepia.ceph.com smithi066
2022-08-16T00:05:56.591 INFO:teuthology.provision.fog.smithi066:Deploy complete!
2022-08-16T00:06:05.834 ERROR:teuthology.dispatcher.supervisor:Reimaging error. Nuking machines...
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_9e7483cc68a9eb6b54dacbb0bec3bf23a5d32425/teuthology/dispatcher/supervisor.py", line 209, in reimage
    reimaged = reimage_machines(ctx, targets, job_config['machine_type'])
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_9e7483cc68a9eb6b54dacbb0bec3bf23a5d32425/teuthology/lock/ops.py", line 325, in reimage_machines
    reimaged = do_update_keys(list(reimaged.keys()))[1]
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_9e7483cc68a9eb6b54dacbb0bec3bf23a5d32425/teuthology/lock/ops.py", line 288, in do_update_keys
    keys_dict = misc.ssh_keyscan(machines, _raise=_raise)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_9e7483cc68a9eb6b54dacbb0bec3bf23a5d32425/teuthology/misc.py", line 1108, in ssh_keyscan
    while proceed():
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_9e7483cc68a9eb6b54dacbb0bec3bf23a5d32425/teuthology/contextutil.py", line 133, in __call__
    raise MaxWhileTries(error_msg)
teuthology.exceptions.MaxWhileTries: 'ssh_keyscan smithi165.front.sepia.ceph.com' reached maximum tries (5) after waiting for 5 seconds
Maybe it just needed another minute?
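One way to test the "needed another minute" idea is a more patient retry loop around the key scan. The following is only a rough sketch, not teuthology's implementation: wait_for_host_keys, MaxTriesReached, and keyscan_probe are hypothetical names made up for illustration, and the try count and delays are guesses. The only external tool assumed is the OpenSSH ssh-keyscan command with its -T (timeout in seconds) option.

```python
import subprocess
import time
from typing import Callable, Optional


class MaxTriesReached(Exception):
    """Hypothetical stand-in for an error like MaxWhileTries."""


def wait_for_host_keys(
    probe: Callable[[], Optional[str]],
    tries: int = 5,
    sleep: float = 1.0,
    increment: float = 0.0,
    sleeper: Callable[[float], None] = time.sleep,
) -> str:
    """Call probe() until it returns a non-empty string or tries run out."""
    waited = 0.0
    delay = sleep
    for attempt in range(1, tries + 1):
        result = probe()
        if result:
            return result
        if attempt < tries:
            # Only sleep between attempts, growing the delay if requested.
            sleeper(delay)
            waited += delay
            delay += increment
    raise MaxTriesReached(
        f"gave up after {tries} tries, waited {waited:g} seconds"
    )


def keyscan_probe(host: str) -> Optional[str]:
    """Run ssh-keyscan once; return its output, or None if no key came back."""
    proc = subprocess.run(
        ["ssh-keyscan", "-T", "5", host],
        capture_output=True,
        text=True,
    )
    return proc.stdout.strip() or None
```

For example, wait_for_host_keys(lambda: keyscan_probe("smithi165.front.sepia.ceph.com"), tries=12, sleep=5.0) would keep probing for roughly a minute of sleep time before giving up, instead of a few seconds. If the scan then succeeds, that points to sshd simply coming up late after the reimage.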