Bug #14619
Closed
mira039 missing drive 5
Description
mira039 was listed as 'could not nuke' on the stale nodes list:
2016-02-02 16:41:22,836.836 ERROR:teuthology.nuke:Could not nuke {u'mira039.front.sepia.ceph.com': u'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDUlZ2hLNhfSTqsWtuDLJ7Zk+kGcUy93lP6jgfcAscmyLiHhtX86VLWMoANh1Hp/A2ZY35wX6WvFGly5qjQY/3sKAIg2IeGfarz1t7bz4HM67cQZwnCrk9BeChFw8ICwwCCF6/V4NLHphSF+6pxpzipiv2tj+w1ZTv4xHIlX/TA/ThtTRp0vuSX+FTDTY6HqFduEUnTRuV1IDB02Qt92CnlfWAWAZHIZ6FUjrBLnaFVp37E+0dmLijV1nRqUN5ldy6Cl0RZ0P/ksyiMcZ69SY9sAFRFaulJTdXCX+Ki+XHfN2XQGcWBBooRCt2+f3ToPuVuSA5vTvcpty1pTt/QS/nv'}
Could not sol activate:
yuriw@Yuris-MacBook-Air:~$ ipmitool XXXXXXX sol activate
[SOL Session operational. Use ~? for help]
ERROR: Received message with invalid authcode!
Assertion failed: (0), function ipmi_lan_poll_recv, file lanplus.c, line 659.
Abort trap: 6
It was locked by VPSHOST, though I think I saw it actually coming up as stale and locked by Sage?!
teuthology-lock --unlock --owner VPSHOST@VPSHOST mira036
It's marked down now
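For triage, the affected hostname can be pulled out of a "Could not nuke" line like the one above mechanically. The following is only a rough sketch, assuming the log format shown in this ticket (a Python 2 dict repr after the message); the `failed_hosts` helper and the `NUKE_FAILURE` pattern are hypothetical names, not part of teuthology, and the SSH key in the sample line is shortened.

```python
import ast
import re

# Hypothetical pattern: matches the "Could not nuke {...}" line format seen in this ticket.
NUKE_FAILURE = re.compile(r"ERROR:teuthology\.nuke:Could not nuke (\{.*\})\s*$")


def failed_hosts(log_line):
    """Return the hostnames named in a 'Could not nuke' line, or [] if it does not match."""
    match = NUKE_FAILURE.search(log_line)
    if match is None:
        return []
    # The payload is a dict repr with u'...' strings; literal_eval parses it
    # without executing arbitrary code.
    targets = ast.literal_eval(match.group(1))
    return sorted(targets)


# Sample line modelled on the ticket; the key is abbreviated for readability.
line = (
    "2016-02-02 16:41:22,836.836 ERROR:teuthology.nuke:Could not nuke "
    "{u'mira039.front.sepia.ceph.com': u'ssh-rsa AAAA'}"
)
print(failed_hosts(line))
```

Matching on the full hostname rather than a short name typed by hand would help avoid the kind of mira036/mira039 mix-up seen in the unlock command.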
Updated by Yuri Weinstein about 8 years ago
- Subject changed from can't ssh nuke or access via ipmi sol mira036 to can't ssh nuke or access via ipmi sol mira039
Updated by Dan Mick about 8 years ago
- Subject changed from can't ssh nuke or access via ipmi sol mira039 to mira039 missing drive 5
- Description updated (diff)
- Assignee set to David Galloway
The upshot is: mira036 was not involved; mira039 was unresponsive, and on reboot spent a long time determining that drive 5 had failed, but now is just in "needs drive 5" state.
Updated by Dan Mick about 8 years ago
It's also behaving badly on reboot, still spending a lot of time in the RAID BIOS. Maybe the drive has failed in such a way that it's poisoning the attempts to probe it, and it would behave better with the drive removed or replaced, but...be prepared for something deeper being wrong.
Updated by David Galloway about 8 years ago
- Status changed from New to In Progress
Updated the RAID controller firmware from V1.49 (2011-08-24) to V1.52 (2015-11-20), and the system now gets through the RAID BIOS considerably faster on boot.
Ticket is on hold till we get more drives.
Updated by David Galloway about 8 years ago
Something's up with this machine. I started a reimage on 5 miras at the same time. The other four are done and this one hasn't gotten through package installation yet.
Maybe try replacing RAID controller after all.
Updated by David Galloway about 8 years ago
- Status changed from In Progress to Resolved
I'm unable to reproduce any weirdness on this machine now. Releasing to the pool and will try replacing RAID controller if issues persist.