Bug #21317
open
Update VPS with latest distro: RuntimeError: Could not reconnect to ubuntu@vpm129.front.sepia.ceph.com
Description
We see this quite frequently in VPS jobs: the kernel task tries to update the kernel and, after the update (mostly on xenial), it fails to reconnect. Ilya has been looking at something similar,
but can we update the VPS images to use the latest distro so that the kernel update task can be skipped?
2017-09-08T17:08:53.310 INFO:teuthology.orchestra.run.vpm139:Running: 'sudo python -c \'import shutil, sys; shutil.copyfileobj(sys.stdin, file(sys.argv[1], "wb"))\' /etc/grub.d/01_ceph_kernel && sudo chmod 755 /etc/grub.d/01_ceph_kernel'
2017-09-08T17:08:53.430 INFO:teuthology.task.kernel:Distro Kernel Version: 4.4.0-93-generic
2017-09-08T17:08:53.439 INFO:teuthology.orchestra.run.vpm139:Running: 'sudo update-grub'
2017-09-08T17:08:53.575 INFO:teuthology.orchestra.run.vpm139.stderr:Generating grub configuration file ...
2017-09-08T17:08:53.624 INFO:teuthology.orchestra.run.vpm139.stderr:Found linux image: /boot/vmlinuz-4.4.0-93-generic
2017-09-08T17:08:53.630 INFO:teuthology.orchestra.run.vpm139.stderr:Found initrd image: /boot/initrd.img-4.4.0-93-generic
2017-09-08T17:08:53.735 INFO:teuthology.orchestra.run.vpm139.stderr:Found linux image: /boot/vmlinuz-4.4.0-34-generic
2017-09-08T17:08:53.741 INFO:teuthology.orchestra.run.vpm139.stderr:Found initrd image: /boot/initrd.img-4.4.0-34-generic
2017-09-08T17:08:53.846 INFO:teuthology.orchestra.run.vpm139.stderr:done
2017-09-08T17:08:53.851 INFO:teuthology.orchestra.run.vpm139:Running: 'sudo shutdown -r now'
2017-09-08T17:08:53.859 INFO:teuthology.misc:Re-opening connections...
2017-09-08T17:08:53.864 INFO:teuthology.misc:trying to connect to ubuntu@vpm139.front.sepia.ceph.com
2017-09-08T17:08:53.870 DEBUG:teuthology.orchestra.connection:{'username': 'ubuntu', 'hostname': 'vpm139.front.sepia.ceph.com', 'key_filename': ['/home/teuthworker/.ssh/id_rsa'], 'timeout': 60}
2017-09-08T17:08:54.367 INFO:teuthology.orchestra.run.vpm139:Running: 'true'
2017-09-08T17:08:54.714 INFO:teuthology.misc:trying to connect to ubuntu@vpm045.front.sepia.ceph.com
2017-09-08T17:08:54.718 DEBUG:teuthology.orchestra.connection:{'username': 'ubuntu', 'hostname': 'vpm045.front.sepia.ceph.com', 'key_filename': ['/home/teuthworker/.ssh/id_rsa'], 'timeout': 60}
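For illustration only, the reconnect step seen in the log (reboot, then try to reach the host again) could be sketched as a polling loop like the one below. This is not teuthology's actual reconnect code: the function name `wait_for_ssh`, the 15-minute default timeout, and the 5-second poll interval are all assumptions made for the example. It only checks that the SSH port accepts a TCP connection and raises the same kind of `RuntimeError` seen in the bug title when the deadline passes.

```python
import socket
import time


def wait_for_ssh(host, port=22, timeout=900.0, interval=5.0, connect_timeout=5.0):
    """Poll until a TCP connection to host:port succeeds.

    Hypothetical helper, not part of teuthology. Returns the number of
    seconds waited on success; raises RuntimeError once `timeout` elapses.
    """
    start = time.monotonic()
    deadline = start + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=connect_timeout):
                return time.monotonic() - start
        except OSError:
            # Refused, unreachable, or timed out: the host is not back yet.
            pass
        if time.monotonic() >= deadline:
            raise RuntimeError("Could not reconnect to %s" % host)
        time.sleep(interval)
```

A port check like this can also succeed while the machine is still shutting down, so a real implementation would probably need to confirm that the host actually rebooted (for example by comparing boot times) before treating the connection as good.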
Updated by David Galloway over 6 years ago
- Category set to Infrastructure Service
Vasu Kulkarni wrote:
That VPS eventually came back up at Sep 8 17:19:34.
Are you asking for CentOS 7.4? I'm not sure what you're asking for here.
Updated by Vasu Kulkarni over 6 years ago
David Galloway wrote:
Vasu Kulkarni wrote:
That VPS eventually came back up at Sep 8 17:19:34.
Are you asking for CentOS 7.4? I'm not sure what you're asking for here.
Sorry, I am not asking for 7.4. I am asking to refresh the current VPS images so that they contain the latest distro kernel; the kernel task would then be skipped because the kernel is already the latest. Right now it tries to update the kernel and eventually fails during reconnect.
If you look at the smoke jobs that use VPS, you will see at least a couple of jobs that fail due to the reconnect issue.
ex: http://pulpito.ceph.com/teuthology-2017-09-08_05:00:13-smoke-master-testing-basic-vps/
Updated by Vasu Kulkarni over 6 years ago
Related to: http://tracker.ceph.com/issues/19918
Updated by David Galloway over 6 years ago
Is 4.4.0-92-generic the kernel you're looking for?
EDIT: Determined even the latest 16.04 cloud image ships with 4.4.0-92-generic. 4.4.0-93-generic is the latest. I've asked Vasu to test these jobs on OVH nodes instead of VPSes.
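As a rough sketch of the check discussed in this thread (is the image's kernel already the latest, so the update can be skipped?), the version strings can be compared numerically. The helper names `kernel_key` and `needs_kernel_update` are hypothetical, and this is not how teuthology's kernel task decides; it only assumes Ubuntu-style version strings such as `4.4.0-93-generic`.

```python
import re


def kernel_key(version):
    """Turn an Ubuntu-style kernel string like '4.4.0-93-generic'
    into a tuple of ints, e.g. (4, 4, 0, 93), for numeric comparison."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", version)
    if match is None:
        raise ValueError("unrecognized kernel version: %r" % version)
    return tuple(int(part) for part in match.groups())


def needs_kernel_update(installed, latest):
    """Return True when the installed kernel is older than the latest one."""
    return kernel_key(installed) < kernel_key(latest)


# The versions mentioned in this thread: the image ships -92, latest is -93.
print(needs_kernel_update("4.4.0-92-generic", "4.4.0-93-generic"))
```

Comparing integer tuples rather than raw strings matters once the ABI number reaches three digits, since a plain string comparison would rank `4.4.0-100` below `4.4.0-93`.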