Support #49209
closed
Machines are not reachable after reboot
Added by yogesh mane about 3 years ago. Updated about 3 years ago.
Description
ymane@magna002:~$ ssh magna021
ssh: connect to host magna021 port 22: No route to host
ymane@magna002:~$ ssh plena001
ssh: connect to host plena001 port 22: Connection timed out
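The two failures above are different errors: "No route to host" usually means the network reported the host as unreachable, while "Connection timed out" means the connection attempt simply got no answer. A minimal Python sketch for checking port 22 from a script and telling these cases apart follows. The function names `probe_ssh` and `classify_connect_error` are hypothetical and not part of any lab tooling.

```python
import errno
import socket


def classify_connect_error(exc):
    """Map a connection exception to a short label (illustrative labels only)."""
    if isinstance(exc, socket.gaierror):
        return "name resolution failed"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timed out"
    if isinstance(exc, ConnectionRefusedError):
        return "refused"
    if isinstance(exc, OSError) and exc.errno == errno.EHOSTUNREACH:
        return "no route to host"
    return "other"


def probe_ssh(host, port=22, timeout=5.0):
    """Try a plain TCP connect to the SSH port and report the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "reachable"
    except OSError as exc:
        return classify_connect_error(exc)
```

Calling `probe_ssh("magna021")` from a host inside the lab network would return one of the labels above. It only tests TCP reachability and does not attempt an SSH login.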
Updated by adam kraitman about 3 years ago
- Status changed from New to In Progress
- Assignee set to adam kraitman
Hey yogesh,
Can I reimage the machines?
Thanks
Updated by David Galloway about 3 years ago
Here's where they're getting stuck.
[FAILED] Failed to mount /mnt/c2.
See 'systemctl status mnt-c2.mount' for details.
[DEPEND] Dependency failed for [ 14.024056] EDAC MC0: Giving out device to module sb_edac controller Ivy Bridge SrcID#0_Ha#0: DEV 0000:ff:0e.0 (INTERRUPT)
Local File Syste[ 14.024056] EDAC sbridge: Ver: 1.1.2
ms.
[DEPEND] Dependency failed for Mark the need to relabel after reboot.
[FAILED] Failed to mount /mnt/c1.
See 'systemctl status mnt-c1.mount' for details.
Starting Restore /run/initramfs on shutdown...
[ OK ] Reached target User and Group Name Lookups.
[ OK ] Reached target Login Prompts.
[ OK ] Reached target Network.
[ OK ] Reached target Timers.
[ OK ] Reached target Ceph cluster 8d835b22-49a4-11eb-9589-002590fc2a2e.
[ OK ] Reached target All Ceph clusters and services.
[ OK ] Reached target Network is Online.
Starting Notify NFS peers of a restart...
[ OK ] Reached target Sockets.
Starting Tell Plymouth To Write Out Runtime Data...
Starting Import network configuration from initramfs...
[ OK ] Started Restore /run/initramfs on shutdown.
[ OK ] Started Notify NFS peers of a restart.
[ OK ] Started Tell Plymouth To Write Out Runtime Data.
[ OK ] Started Import network configuration from initramfs.
[ OK ] Started Emergency Shell.
Starting Create Volatile Files and Directories...
[ OK ] Reached target Emergency Mode.
[ 14.294912] intel_rapl_common: Found RAPL domain package
[ OK ] Started Create Volatile Files and Directories.
Starting Security Auditing Service...
Mounting RPC Pipe File System...
Starting RPC Bind...
[ 14.363553] intel_rapl_common: Found RAPL domain core
[ 14.369256] intel_rapl_common: Found RAPL domain dram
[ 14.389886] RPC: Registered named UNIX socket transport module.
[ 14.399755] RPC: Registered udp transport module.
[ 14.407321] RPC: Registered tcp transport module.
[ 14.415803] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ OK ] Mounted RPC Pipe File System.
[ OK ] Reached target rpc_pipefs.target.
[ OK ] Reached target NFS client services.
[ OK ] Reached target Remote File Systems (Pre).
[ OK ] Reached target Remote File Systems.
Starting Crash recovery kernel arming...
[ OK ] Started RPC Bind.
[ OK ] Started Security Auditing Service.
Starting Update UTMP about System Boot/Shutdown...
[ OK ] Started Update UTMP about System Boot/Shutdown.
Starting Update UTMP about System Runlevel Changes...
[ OK ] Started Update UTMP about System Runlevel Changes.
[ OK ] Started Crash recovery kernel arming.
You are in emergency mode. After logging in, type "journalctl -xb" to view system logs, "systemctl reboot" to reboot, "systemctl default" or "exit" to boot into default mode.
Give root password for maintenance (or press Control-D to continue):
You can see this by opening a Serial-Over-LAN connection.
http://wiki.ceph.redhat.com/dokuwiki/doku.php?id=testnodeaccess#ipmi_serial-over-lan
ipmitool -I lanplus -U inktank -P XXXXX -H magna021.ipmi.ceph.redhat.com sol activate
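In a captured console log like the one above, the lines that matter are the "[FAILED] Failed to mount ..." entries, since those are what drop the boot into emergency mode. A minimal Python sketch for pulling the failed mount points out of saved console text follows. The function names are hypothetical, and the unit-name conversion is a simplification that only handles plain paths such as /mnt/c2 (it does not escape special characters the way systemd does).

```python
import re


def failed_mounts(console_text):
    """Return mount points named in '[FAILED] Failed to mount <path>.' messages."""
    return re.findall(r"\[FAILED\] Failed to mount (\S+)\.", console_text)


def mount_unit_name(mount_point):
    """Simplified path-to-unit conversion, e.g. /mnt/c2 -> mnt-c2.mount."""
    return mount_point.strip("/").replace("/", "-") + ".mount"


if __name__ == "__main__":
    sample = (
        "[FAILED] Failed to mount /mnt/c2. "
        "See 'systemctl status mnt-c2.mount' for details. "
        "[FAILED] Failed to mount /mnt/c1. "
        "See 'systemctl status mnt-c1.mount' for details."
    )
    for path in failed_mounts(sample):
        # Print the status command that the console message itself suggests.
        print(f"systemctl status {mount_unit_name(path)}")
```

The printed commands match the ones the console output suggests running from the emergency shell.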
You can either rescue the machines via the console or reimage them by unlocking them and re-locking them.
teuthology-lock --unlock magna021
teuthology-lock --lock magna021 --os-type rhel --os-version X.Y
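When more than one machine needs the same treatment, the unlock and re-lock pair can be generated per host. The following is a dry-run Python sketch that only prints the commands and does not run them. The helper name `reimage_commands` is hypothetical, the flags are exactly those shown above, and "X.Y" remains a placeholder for the actual OS version.

```python
import shlex


def reimage_commands(host, os_type, os_version):
    """Build the unlock and re-lock command lines for one host (not executed)."""
    return [
        ["teuthology-lock", "--unlock", host],
        ["teuthology-lock", "--lock", host,
         "--os-type", os_type, "--os-version", os_version],
    ]


# Dry run: print the commands for the two hosts named in this ticket.
for host in ["magna021", "plena001"]:
    for argv in reimage_commands(host, "rhel", "X.Y"):
        print(shlex.join(argv))
```

Printing first makes it easy to review the host list before anything is unlocked.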
Updated by Kumar Hemanth about 3 years ago
adam kraitman wrote:
Hey yogesh,
Can I reimage the machines?
Thanks
Hi adam,
I am unable to use the IPMI console via the browser because of a Java Web Start app issue.
Is it possible to unmount the directories "/mnt/c2" and "/mnt/c1" and reboot before we re-image?
Thanks
Hemanth
Updated by adam kraitman about 3 years ago
Hey Kumar, if you re-image them, those mounts will be disconnected anyway when the machine reboots, so you can run the following:
teuthology-lock --unlock magna021
teuthology-lock --lock magna021 --os-type rhel --os-version X.Y
Updated by yogesh mane about 3 years ago
Hi Adam,
I have reimaged magna021 and we will reimage plena001 as well.
We can close this issue.
Thanks
Yogesh
Updated by adam kraitman about 3 years ago
- Status changed from In Progress to Resolved