Bug #18689
mira108 and mira072 inaccessible and can't be nuked
Status: closed
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):
Description
2017-01-26 16:38:08,006.006 INFO:teuthology.orchestra.console:Power cycling mira108
2017-01-26 16:48:08,192.192 ERROR:teuthology.nuke:
Traceback (most recent call last):
  File "/home/yuriw/teuthology/teuthology/nuke/__init__.py", line 312, in nuke_helper
    check_console(host)
  File "/home/yuriw/teuthology/teuthology/nuke/actions.py", line 422, in check_console
    console.power_cycle()
  File "/home/yuriw/teuthology/teuthology/orchestra/console.py", line 203, in power_cycle
    self._wait_for_login(timeout=300)
  File "/home/yuriw/teuthology/teuthology/orchestra/console.py", line 155, in _wait_for_login
    raise ConsoleError("Did not get a login prompt from %s!" % self.name)
ConsoleError: Did not get a login prompt from mira108.front.sepia.ceph.com!
2017-01-26 16:48:08,192.192 INFO:teuthology.nuke:Will attempt to connect via SSH
2017-01-26 16:48:11,738.738 ERROR:teuthology.nuke:Could not nuke {u'mira108.front.sepia.ceph.com': u'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDnKiE7063UDRi6+dUhGB49c1jds1c+Gdnbgv5zB8VFt7j51ByjJTFwx46wWJSd3ZFd0yFS+t/aJfNOV61lVhHNI9RClI0ON0W/qupGKenxgsd1ncGKrVMcoKzJ8Khu7XJ7gUS+yFLYbdhJozEW9mYcggp7DYitcD7yHUBahiuDZVrCCYFpi7iGqkcms8mkE1s6UN22nSMyeEoJ3vnN4N8FmduR8HTVOAL/1FxSdRgEd8SHtytg+IKgkKmfuCKqgXSOtjqAQVD5DMEUnzSDAUpNBhZbWDZOrTilgNhWaSD60OCGMWgWjEK26plJQ6YEmJELCSd2T/6H0b3cKT17MHOF'}
Traceback (most recent call last):
  File "/home/yuriw/teuthology/teuthology/nuke/__init__.py", line 281, in nuke_one
    nuke_helper(ctx, should_unlock)
  File "/home/yuriw/teuthology/teuthology/nuke/__init__.py", line 317, in nuke_helper
    remote.connect()
  File "/home/yuriw/teuthology/teuthology/orchestra/remote.py", line 59, in connect
    self.ssh = connection.connect(**args)
  File "/home/yuriw/teuthology/teuthology/orchestra/connection.py", line 104, in connect
    ssh.connect(**connect_args)
  File "/home/yuriw/teuthology/virtualenv/local/lib/python2.7/site-packages/paramiko/client.py", line 324, in connect
    raise NoValidConnectionsError(errors)
NoValidConnectionsError: [Errno None] Unable to connect to port 22 on 172.21.8.104
2017-01-26 16:48:11,740.740 ERROR:teuthology.nuke:Could not nuke the following targets:
targets:
  mira108.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDnKiE7063UDRi6+dUhGB49c1jds1c+Gdnbgv5zB8VFt7j51ByjJTFwx46wWJSd3ZFd0yFS+t/aJfNOV61lVhHNI9RClI0ON0W/qupGKenxgsd1ncGKrVMcoKzJ8Khu7XJ7gUS+yFLYbdhJozEW9mYcggp7DYitcD7yHUBahiuDZVrCCYFpi7iGqkcms8mkE1s6UN22nSMyeEoJ3vnN4N8FmduR8HTVOAL/1FxSdRgEd8SHtytg+IKgkKmfuCKqgXSOtjqAQVD5DMEUnzSDAUpNBhZbWDZOrTilgNhWaSD60OCGMWgWjEK26plJQ6YEmJELCSd2T/6H0b3cKT17MHOF
Updated by David Galloway about 7 years ago
mira108 dropped to initramfs.
dmesg output
[ 2.722018] =================================
[ 2.722018] [ INFO: inconsistent lock state ]
[ 2.722019] 4.10.0-rc5-ceph-g10f34ebb06e7 #1 Not tainted
[ 2.722020] ---------------------------------
[ 2.722021] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[ 2.722022] cpuhp/5/37 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 2.722023] (tick_broadcast_lock){?.....}, at: [<ffffffffb1126aee>] tick_broadcast_control+0x4e/0x180
[ 2.722031] {IN-HARDIRQ-W} state was registered at:
[ 2.722031]
[ 2.722034] [<ffffffffb10eac52>] __lock_acquire+0x762/0x1250
[ 2.722034]
[ 2.722035] [<ffffffffb10ebba0>] lock_acquire+0x100/0x1f0
[ 2.722035]
[ 2.722038] [<ffffffffb189eb80>] _raw_spin_lock_irqsave+0x50/0x70
[ 2.722039]
[ 2.722040] [<ffffffffb1126c36>] tick_broadcast_switch_to_oneshot+0x16/0x50
[ 2.722040]
[ 2.722042] [<ffffffffb1126fda>] tick_switch_to_oneshot+0x4a/0xc0
[ 2.722042]
[ 2.722043] [<ffffffffb11270d5>] tick_init_highres+0x15/0x20
[ 2.722044]
[ 2.722046] [<ffffffffb1116c8b>] hrtimer_run_queues+0x8b/0xd0
[ 2.722046]
[ 2.722048] [<ffffffffb111519e>] run_local_timers+0x1e/0x50
[ 2.722048]
[ 2.722049] [<ffffffffb11151f7>] update_process_times+0x27/0x60
[ 2.722049]
[ 2.722051] [<ffffffffb11253af>] tick_periodic+0x2f/0xc0
[ 2.722051]
[ 2.722052] [<ffffffffb1125465>] tick_handle_periodic+0x25/0x70
[ 2.722053]
[ 2.722056] [<ffffffffb105cd78>] local_apic_timer_interrupt+0x38/0x60
[ 2.722056]
[ 2.722058] [<ffffffffb18a1928>] smp_apic_timer_interrupt+0x38/0x50
[ 2.722059]
[ 2.722060] [<ffffffffb18a0ab3>] apic_timer_interrupt+0x93/0xa0
[ 2.722060]
[ 2.722062] [<ffffffffb189dabc>] mwait_idle+0x6c/0x220
[ 2.722062]
[ 2.722065] [<ffffffffb103f93f>] arch_cpu_idle+0xf/0x20
[ 2.722065]
[ 2.722066] [<ffffffffb189df33>] default_idle_call+0x23/0x40
[ 2.722067]
[ 2.722069] [<ffffffffb10de4ca>] do_idle+0x16a/0x200
[ 2.722069]
[ 2.722070] [<ffffffffb10de8e2>] cpu_startup_entry+0x62/0x70
[ 2.722070]
[ 2.722072] [<ffffffffb105b2be>] start_secondary+0x14e/0x180
[ 2.722072]
[ 2.722074] [<ffffffffb10001c4>] verify_cpu+0x0/0xfc
[ 2.722074] irq event stamp: 95
[ 2.722076] hardirqs last enabled at (95): [<ffffffffb189e52c>] _raw_spin_unlock_irq+0x2c/0x40
[ 2.722079] hardirqs last disabled at (94): [<ffffffffb18953ea>] __schedule+0xca/0xb60
[ 2.722082] softirqs last enabled at (0): [<ffffffffb108a5e6>] copy_process+0x576/0x20a0
[ 2.722083] softirqs last disabled at (0): [< (null)>] (null)
[ 2.722083]
[ 2.722083] other info that might help us debug this:
[ 2.722083] Possible unsafe locking scenario:
[ 2.722083]
[ 2.722084] CPU0
[ 2.722084] ----
[ 2.722084] lock(tick_broadcast_lock);
[ 2.722085] <Interrupt>
[ 2.722085] lock(tick_broadcast_lock);
[ 2.722086]
[ 2.722086] *** DEADLOCK ***
[ 2.722086]
[ 2.722087] no locks held by cpuhp/5/37.
[ 2.722087]
[ 2.722087] stack backtrace:
[ 2.722089] CPU: 5 PID: 37 Comm: cpuhp/5 Not tainted 4.10.0-rc5-ceph-g10f34ebb06e7 #1
[ 2.722089] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.0c 02/25/2010
[ 2.722090] Call Trace:
[ 2.722093] dump_stack+0x85/0xc2
[ 2.722094] print_usage_bug+0x1e1/0x1f0
[ 2.722096] mark_lock+0x526/0x5d0
[ 2.722098] ? check_usage_forwards+0x100/0x100
[ 2.722099] __lock_acquire+0x5ce/0x1250
[ 2.722101] lock_acquire+0x100/0x1f0
[ 2.722102] ? tick_broadcast_control+0x4e/0x180
[ 2.722106] ? backlight_resume+0x90/0x90
[ 2.722107] _raw_spin_lock+0x38/0x50
[ 2.722108] ? tick_broadcast_control+0x4e/0x180
[ 2.722109] tick_broadcast_control+0x4e/0x180
[ 2.722111] ? backlight_resume+0x90/0x90
[ 2.722112] intel_idle_cpu_online+0x22/0x100
[ 2.722114] cpuhp_invoke_callback+0x1f2/0x810
[ 2.722116] cpuhp_thread_fun+0x4a/0x110
[ 2.722118] smpboot_thread_fn+0x11a/0x1e0
[ 2.722120] kthread+0x10c/0x140
[ 2.722121] ? sort_range+0x30/0x30
[ 2.722122] ? kthread_stop+0x2b0/0x2b0
[ 2.722124] ret_from_fork+0x31/0x40
Updated by David Galloway about 7 years ago
- Category set to Test Node
- Status changed from New to In Progress
mira072 had locked up and wouldn't power on. I power cycled its PDU port and it came back. Will reimage and flash latest firmware.
mira108 has been reimaged and released.
Updated by David Galloway about 7 years ago
- Status changed from In Progress to Resolved
mira072 had its firmware updated, was reimaged and released.