Bug #4596

closed

broken ipmi on plana48

Added by Sage Weil about 11 years ago. Updated about 11 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Sandon Van Ness
Category:
-
Target version:
-
% Done:
100%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ubuntu@teuthology:/a/teuthology-2013-03-30_01:00:05-rados-next-testing-basic

6232/summary.yaml:failure_reason: 'IPMI console required for powercycling, but not available on osd role:

Actions #1

Updated by Sandon Van Ness about 11 years ago

IPMI looks OK on that machine; however, going by the error message, it had a problem getting the 'IPMI console'. This is just a guess (I haven't looked at the code), but it probably checks for a getty login prompt or something similar. That machine is crashed (stuck in kbd prompt), so you would not get a getty login prompt when accessing it via SOL, which could be why it got that error.

Or do you think it was just another networking issue cropping up?

Actions #2

Updated by Sage Weil about 11 years ago

  • Status changed from New to Closed

Makes sense!

Actions #3

Updated by Sage Weil about 11 years ago

  • Status changed from Closed to In Progress

Actually.. hmm. IIRC 6233 also errored out with the same message. After the first error, it should have nuked the node (and powercycled it). Can you see if there are other clues in that job's output?

Actions #4

Updated by Sage Weil about 11 years ago

also, alex said on ceph-qa:

>> 6430: (1147s) collection:cephfs clusters:fixed-3.yaml fs:btrfs.yaml tasks:kclient_workunit_suites_pjd.yaml
>>     [Errno 9] Bad file descriptor (plana 11, 48, 64)

This too.  I also couldn't reach the plana 48 console.

Actions #5

Updated by Sandon Van Ness about 11 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

Something went wrong when the inktank user was set up on this machine, probably some dropped IPMI commands. I fixed it up. Before, it looked like:

root@sigoto: 12:03 PM :~# ipmitool -I lanplus -H plana48.ipmi.sepia.ceph.com -U root -P XXXXXXXXX user list
ID  Name       Callin  Link Auth  IPMI Msg  Channel Priv Limit
2   root       true    true       true      ADMINISTRATOR
3   planatemp  true    false      true      ADMINISTRATOR
4              true    true       true      ADMINISTRATOR

and now:

root@sigoto: 12:05 PM :~# ipmitool -I lanplus -H plana48.ipmi.sepia.ceph.com -U root -P XXXXXXXXX user list
ID  Name       Callin  Link Auth  IPMI Msg  Channel Priv Limit
2   root       true    true       true      ADMINISTRATOR
3   planatemp  true    false      true      ADMINISTRATOR
4   inktank    true    true       true      ADMINISTRATOR

I also tested SOL using the inktank user.
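For anyone wanting to spot this kind of half-created user on other nodes, here is a rough sketch of a check over `ipmitool ... user list` output. It is not part of teuthology or any existing tooling: the names `parse_user_list` and `unnamed_users` are made up for illustration, and the parser assumes whitespace-separated rows shaped like the ones pasted above (numeric ID, optional name, three true/false flags, then the privilege limit). The sample text is copied from this ticket.

```python
import re

# One row of `user list` output: ID, optional name, three boolean flags,
# then the privilege limit. The name group is optional so that a row whose
# name was never written (as on plana48) still matches.
ROW = re.compile(
    r"^(\d+)\s+(?:(\S+)\s+)?(true|false)\s+(true|false)\s+(true|false)\s+(.+)$"
)


def parse_user_list(text):
    """Return one dict per user row; header and other lines are skipped."""
    users = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue
        uid, name, callin, link_auth, ipmi_msg, priv = match.groups()
        users.append({
            "id": int(uid),
            "name": name or "",
            "callin": callin == "true",
            "link_auth": link_auth == "true",
            "ipmi_msg": ipmi_msg == "true",
            "priv": priv.strip(),
        })
    return users


def unnamed_users(text):
    """IDs of user slots that have flags set but no name."""
    return [u["id"] for u in parse_user_list(text) if not u["name"]]


# Sample output copied from this ticket (before and after the fix).
BEFORE = """\
ID Name Callin Link Auth IPMI Msg Channel Priv Limit
2 root true true true ADMINISTRATOR
3 planatemp true false true ADMINISTRATOR
4 true true true ADMINISTRATOR
"""

AFTER = """\
ID Name Callin Link Auth IPMI Msg Channel Priv Limit
2 root true true true ADMINISTRATOR
3 planatemp true false true ADMINISTRATOR
4 inktank true true true ADMINISTRATOR
"""

if __name__ == "__main__":
    print(unnamed_users(BEFORE))  # [4]
    print(unnamed_users(AFTER))   # []
```

A user whose name is literally `true` or `false` would confuse the optional-name match, so treat this as a quick sanity check rather than a full parser.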
