Support #49209

closed

Machines are not reachable after reboot

Added by yogesh mane about 3 years ago. Updated about 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
% Done: 0%
Tags:
Reviewed:
Affected Versions:

Description

ymane@magna002:~$ ssh magna021
ssh: connect to host magna021 port 22: No route to host
ymane@magna002:~$ ssh plena001
ssh: connect to host plena001 port 22: Connection timed out

Actions #1

Updated by adam kraitman about 3 years ago

  • Status changed from New to In Progress
  • Assignee set to adam kraitman

Hey Yogesh,
Can I reimage the machines?

Thanks

Actions #2

Updated by David Galloway about 3 years ago

Here's where they're getting stuck.

[FAILED] Failed to mount /mnt/c2.
See 'systemctl status mnt-c2.mount' for details.
[DEPEND] Dependency failed for Local File Systems.
[   14.024056] EDAC MC0: Giving out device to module sb_edac controller Ivy Bridge SrcID#0_Ha#0: DEV 0000:ff:0e.0 (INTERRUPT)
[   14.024056] EDAC sbridge:  Ver: 1.1.2
[DEPEND] Dependency failed for Mark the need to relabel after reboot.
[FAILED] Failed to mount /mnt/c1.
See 'systemctl status mnt-c1.mount' for details.
         Starting Restore /run/initramfs on shutdown...
[  OK  ] Reached target User and Group Name Lookups.
[  OK  ] Reached target Login Prompts.
[  OK  ] Reached target Network.
[  OK  ] Reached target Timers.
[  OK  ] Reached target Ceph cluster 8d835b22-49a4-11eb-9589-002590fc2a2e.
[  OK  ] Reached target All Ceph clusters and services.
[  OK  ] Reached target Network is Online.
         Starting Notify NFS peers of a restart...
[  OK  ] Reached target Sockets.
         Starting Tell Plymouth To Write Out Runtime Data...
         Starting Import network configuration from initramfs...
[  OK  ] Started Restore /run/initramfs on shutdown.
[  OK  ] Started Notify NFS peers of a restart.
[  OK  ] Started Tell Plymouth To Write Out Runtime Data.
[  OK  ] Started Import network configuration from initramfs.
[  OK  ] Started Emergency Shell.
         Starting Create Volatile Files and Directories...
[  OK  ] Reached target Emergency Mode.
[   14.294912] intel_rapl_common: Found RAPL domain package
[  OK  ] Started Create Volatile Files and Directories.
         Starting Security Auditing Service...
         Mounting RPC Pipe File System...
         Starting RPC Bind...
[   14.363553] intel_rapl_common: Found RAPL domain core
[   14.369256] intel_rapl_common: Found RAPL domain dram
[   14.389886] RPC: Registered named UNIX socket transport module.
[   14.399755] RPC: Registered udp transport module.
[   14.407321] RPC: Registered tcp transport module.
[   14.415803] RPC: Registered tcp NFSv4.1 backchannel transport module.
[  OK  ] Mounted RPC Pipe File System.
[  OK  ] Reached target rpc_pipefs.target.
[  OK  ] Reached target NFS client services.
[  OK  ] Reached target Remote File Systems (Pre).
[  OK  ] Reached target Remote File Systems.
         Starting Crash recovery kernel arming...
[  OK  ] Started RPC Bind.
[  OK  ] Started Security Auditing Service.
         Starting Update UTMP about System Boot/Shutdown...
[  OK  ] Started Update UTMP about System Boot/Shutdown.
         Starting Update UTMP about System Runlevel Changes...
[  OK  ] Started Update UTMP about System Runlevel Changes.
[  OK  ] Started Crash recovery kernel arming.
You are in emergency mode. After logging in, type "journalctl -xb" to view
system logs, "systemctl reboot" to reboot, "systemctl default" or "exit" 
to boot into default mode.
Give root password for maintenance
(or press Control-D to continue):

You can see this by opening a Serial-Over-LAN connection.

http://wiki.ceph.redhat.com/dokuwiki/doku.php?id=testnodeaccess#ipmi_serial-over-lan

ipmitool -I lanplus -U inktank -P XXXXX -H magna021.ipmi.ceph.redhat.com sol activate

You can either rescue the machines via the console or reimage them by unlocking them and re-locking them.

teuthology-lock --unlock magna021
teuthology-lock --lock magna021 --os-type rhel --os-version X.Y
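
For the console-rescue route, one possible approach is to stop the failed mounts from blocking boot. The sketch below is hypothetical: it assumes /mnt/c1 and /mnt/c2 are defined in /etc/fstab (the ticket does not confirm this) and that marking them "nofail" is acceptable on these nodes. add_nofail is an illustrative helper, not part of any existing tool.

```shell
# Hypothetical helper: read fstab content on stdin and append the "nofail"
# option to the entries whose mount point is listed as an argument.
# Lines are written to stdout; nothing on disk is modified.
add_nofail() {
    awk -v targets="$*" '
        BEGIN {
            # Build a lookup table of the mount points to patch.
            n = split(targets, t, " ")
            for (i = 1; i <= n; i++) want[t[i]] = 1
        }
        # Pass comments and short or blank lines through untouched.
        /^[[:space:]]*#/ || NF < 4 { print; next }
        # Field 2 is the mount point, field 4 the option list.
        ($2 in want) && $4 !~ /(^|,)nofail(,|$)/ { $4 = $4 ",nofail" }
        { print }
    '
}

# Possible use from the emergency shell (review the result before replacing):
#   systemctl status mnt-c1.mount mnt-c2.mount
#   add_nofail /mnt/c1 /mnt/c2 < /etc/fstab > /etc/fstab.new
#   diff /etc/fstab /etc/fstab.new
```

Writing to a separate file first keeps the original /etc/fstab intact until the diff has been checked. Patched lines are re-emitted with single spaces between fields, which fstab accepts.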

Actions #3

Updated by Kumar Hemanth about 3 years ago

adam kraitman wrote:

Hey Yogesh,
Can I reimage the machines?

Thanks

Hi adam,

I am unable to use the IPMI console via the browser because of an issue with the Java Web Start app.
Is it possible to unmount the directories "/mnt/c2" and "/mnt/c1" and reboot before we re-image?

Thanks
Hemanth

Actions #4

Updated by adam kraitman about 3 years ago

Hey Kumar, if you re-image them, those mounts will be disconnected anyway when the machine reboots, so you can run the following:
teuthology-lock --unlock magna021
teuthology-lock --lock magna021 --os-type rhel --os-version X.Y
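
When several nodes need the same treatment, the unlock and re-lock pair above could be wrapped in a small function. This is only a sketch: reimage_node and the LOCK_CMD override are hypothetical names introduced here, and the OS version is left for the caller to supply, as in the commands above.

```shell
# Hypothetical wrapper around the unlock / re-lock sequence shown above.
# LOCK_CMD is an assumed override so the sequence can be dry-run
# (for example LOCK_CMD=echo); it defaults to teuthology-lock.
reimage_node() {
    node=$1
    os_type=$2
    os_version=$3
    lock_cmd=${LOCK_CMD:-teuthology-lock}

    # Only re-lock if the unlock succeeded.
    "$lock_cmd" --unlock "$node" &&
    "$lock_cmd" --lock "$node" --os-type "$os_type" --os-version "$os_version"
}

# Example (replace X.Y with the desired release):
#   for n in magna021 plena001; do reimage_node "$n" rhel X.Y; done
```

Setting LOCK_CMD=echo prints the two commands instead of running them, which gives a quick check before touching real machines.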

Actions #5

Updated by yogesh mane about 3 years ago

Hi Adam,

I have reimaged magna021 and we will reimage plena001 as well.
We can close this issue.

Thanks
Yogesh

Actions #6

Updated by adam kraitman about 3 years ago

  • Status changed from In Progress to Resolved