Support #20621

closed

Investigate reimaging testnodes after every job

Added by David Galloway almost 7 years ago. Updated over 6 years ago.

Status:
Resolved
Priority:
Normal
Category:
Infrastructure Service
Target version:
-
% Done:
0%

Tags:
Reviewed:
Affected Versions:

Description

It has been suggested (again) that we spend some time researching reimaging testnodes after every job. I'm going to do some research and see what can be done to get reimaging done as fast as possible.

Some ideas:
  1. Revisit edeploy
  2. Either bake or find a bare minimum OS install of each distro and let ceph-cm-ansible do most of the heavy lifting
  3. Tweaks to kickstarts
  4. @minimal instead of @base package group for EL
  5. xCAT sysclone (https://xcat-docs.readthedocs.io/en/stable/advanced/sysclone/sysclone.html) or a similar solution
  6. After reimage or even during teuthology jobs, run ceph-cm-ansible playbooks on localhost (the testnode) instead of from Cobbler or teuthology.front.
  7. Make sure the NIC boot order on each testnode is set to only try booting from the cabled NIC (this would save about 5-10 seconds per boot, since that's how long it takes PXE to time out on an uncabled NIC). This would also have the added benefit of saving time whenever the node is rebooted during a job.
  8. OpenStack's Ironic basically does what edeploy does: it boots a tiny Linux image into memory and dd's an OS image onto the drive via nova-baremetal-agent
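
To put a rough number on idea 7, here is a minimal sketch of the arithmetic. It takes the 5-10 second PXE timeout quoted above as the saving per boot. The function name `pxe_timeout_savings` and the boot count in the usage line are hypothetical, chosen for illustration only, and are not measurements from the lab.

```python
def pxe_timeout_savings(boots, low_s=5, high_s=10):
    """Return (min, max) seconds saved over `boots` boots.

    low_s/high_s default to the 5-10 second PXE timeout estimate
    from idea 7; `boots` is a hypothetical input, not measured data.
    """
    if boots < 0:
        raise ValueError("boots must be non-negative")
    return boots * low_s, boots * high_s


if __name__ == "__main__":
    # Hypothetical job: one reimage boot plus two reboots during the job.
    low, high = pxe_timeout_savings(boots=3)
    print(f"Estimated saving: {low}-{high} seconds per job")
```

The saving scales linearly with the number of boots, so it matters most for jobs that reboot the testnode several times.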