Bug #23753

"Error ENXIO: problem getting command descriptions from osd.4" in upgrade:kraken-x-luminous-distro-basic-smithi

Added by Yuri Weinstein almost 6 years ago. Updated almost 6 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Category:
-
Target version:
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
upgrade/kraken-x
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Run: http://pulpito.ceph.com/teuthology-2018-04-15_03:25:02-upgrade:kraken-x-luminous-distro-basic-smithi/
Jobs: '2398926', '2398931'
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2018-04-15_03:25:02-upgrade:kraken-x-luminous-distro-basic-smithi/2398926/teuthology.log

2018-04-15T20:00:29.442 INFO:teuthology.orchestra.run.smithi182:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.3.asok dump_historic_ops'
2018-04-15T20:00:29.449 INFO:teuthology.orchestra.run.smithi049.stderr:osd.2: osd_enable_op_tracker = 'true'
2018-04-15T20:00:29.462 INFO:teuthology.orchestra.run.smithi049.stderr:osd.3: osd_enable_op_tracker = 'true'
2018-04-15T20:00:29.463 INFO:teuthology.orchestra.run.smithi049.stderr:Error ENXIO: problem getting command descriptions from osd.4
2018-04-15T20:00:29.463 INFO:teuthology.orchestra.run.smithi049.stderr:osd.4: problem getting command descriptions from osd.4
2018-04-15T20:00:29.475 INFO:teuthology.orchestra.run.smithi049.stderr:osd.5: osd_enable_op_tracker = 'true'
2018-04-15T20:00:29.506 INFO:tasks.ceph.osd.3.smithi182.stderr:2018-04-15 20:00:29.506015 7f10fb637700 -1 received  signal: Hangup from  PID: 13271 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 3  UID: 0
2018-04-15T20:00:29.532 INFO:teuthology.orchestra.run.smithi049:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_ops_in_flight'
2018-04-15T20:00:29.606 INFO:tasks.ceph.osd.0.smithi049.stderr:2018-04-15 20:00:29.606517 7fdd5ee64700 -1 received  signal: Hangup from  PID: 14672 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 0  UID: 0
2018-04-15T20:00:29.652 INFO:teuthology.orchestra.run.smithi049:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_blocked_ops'
2018-04-15T20:00:29.707 INFO:tasks.ceph.osd.0.smithi049.stderr:2018-04-15 20:00:29.707005 7fdd5ee64700 -1 received  signal: Hangup from  PID: 14672 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 0  UID: 0
2018-04-15T20:00:29.808 INFO:tasks.ceph.osd.2.smithi049.stderr:2018-04-15 20:00:29.808011 7fc36e15d700 -1 received  signal: Hangup from  PID: 80801 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 2  UID: 0
2018-04-15T20:00:29.839 INFO:teuthology.orchestra.run.smithi049:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_historic_ops'
2018-04-15T20:00:29.908 INFO:tasks.ceph.osd.5.smithi182.stderr:2018-04-15 20:00:29.908787 7fed37f6e700 -1 received  signal: Hangup from  PID: 24710 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 5  UID: 0
2018-04-15T20:00:29.914 INFO:teuthology.orchestra.run.smithi049.stderr:noscrub is unset
2018-04-15T20:00:30.010 INFO:tasks.ceph.osd.1.smithi049.stderr:2018-04-15 20:00:30.009816 7f8252483700 -1 received  signal: Hangup from  PID: 94916 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 1  UID: 0
..........
2018-04-15T20:00:30.518 INFO:tasks.ceph.osd.5.smithi182.stderr:2018-04-15 20:00:30.518380 7fed37f6e700 -1 received  signal: Hangup from  PID: 24710 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 5  UID: 0
2018-04-15T20:00:30.530 INFO:teuthology.orchestra.run.smithi049:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.2.asok dump_blocked_ops'
2018-04-15T20:00:30.539 INFO:teuthology.orchestra.run.smithi182:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 0 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.4.asok dump_ops_in_flight'
2018-04-15T20:00:30.619 INFO:tasks.ceph.osd.5.smithi182.stderr:2018-04-15 20:00:30.619288 7fed37f6e700 -1 received  signal: Hangup from  PID: 24710 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 5  UID: 0
2018-04-15T20:00:30.630 INFO:teuthology.orchestra.run.smithi182.stderr:admin_socket: exception getting command descriptions: [Errno 111] Connection refused
2018-04-15T20:00:30.639 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_ceph_luminous/qa/tasks/ceph_manager.py", line 917, in wrapper
    return func(self)
  File "/home/teuthworker/src/git.ceph.com_ceph_luminous/qa/tasks/ceph_manager.py", line 1040, in do_thrash
    self.choose_action()()
  File "/home/teuthworker/src/git.ceph.com_ceph_luminous/qa/tasks/ceph_manager.py", line 438, in revive_osd
    skip_admin_check=skip_admin_check)
  File "/home/teuthworker/src/git.ceph.com_ceph_luminous/qa/tasks/ceph_manager.py", line 2432, in revive_osd
    timeout=timeout, stdout=DEVNULL)
  File "/home/teuthworker/src/git.ceph.com_ceph_luminous/qa/tasks/ceph_manager.py", line 1513, in wait_run_admin_socket
    id=service_id))
Exception: timed out waiting for admin_socket to appear after osd.4 restart
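
For readers unfamiliar with the failure mode in the traceback above, the following is a minimal sketch of a "wait for the admin socket" loop. It is not the code in qa/tasks/ceph_manager.py; the function names, the default timeout and interval, and the choice to probe with a plain Unix-socket connect are all assumptions made for illustration. It shows how "[Errno 111] Connection refused" (socket file present, nothing listening) can be treated as "not ready yet" until a deadline passes.

```python
import errno
import socket
import time


def admin_socket_ready(path):
    """Return True if something accepts connections on the Unix socket at path.

    Hypothetical helper: probes with a raw connect instead of running
    `ceph --admin-daemon <path> ...`.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
        return True
    except OSError as exc:
        # ECONNREFUSED: the socket file exists but no daemon is listening.
        # ENOENT: the socket file has not been created yet.
        if exc.errno in (errno.ECONNREFUSED, errno.ENOENT):
            return False
        raise
    finally:
        sock.close()


def wait_for_admin_socket(path, timeout=30.0, interval=1.0,
                          probe=admin_socket_ready):
    """Poll until the admin socket answers, or raise TimeoutError.

    The timeout and interval defaults are arbitrary illustration values.
    """
    deadline = time.monotonic() + timeout
    while True:
        if probe(path):
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "timed out waiting for admin socket %s" % path)
        time.sleep(interval)
```

With a loop of this shape, an OSD that never comes back after a restart would produce a timeout like the one reported for osd.4, rather than an immediate failure on the first refused connection.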

History

#1 Updated by Yuri Weinstein almost 6 years ago

  • Project changed from CephFS to Ceph

#2 Updated by Greg Farnum almost 6 years ago

  • Project changed from Ceph to RADOS
  • Priority changed from Urgent to High

This generally means the OSD isn't running?
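
To check which OSDs reported this error in a saved teuthology.log, a small filter along these lines could be used. This is only a sketch: the function name is hypothetical, and the regular expression assumes the message wording seen in the log excerpt in the description.

```python
import re

# Matches the error text as it appears in the log excerpt of this report.
ENXIO_RE = re.compile(
    r"Error ENXIO: problem getting command descriptions from (osd\.\d+)")


def osds_reporting_enxio(log_lines):
    """Return the OSD names that reported ENXIO, in first-seen order."""
    seen = []
    for line in log_lines:
        match = ENXIO_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

Usage would be something like `osds_reporting_enxio(open("teuthology.log"))`, where the file name is whatever local copy of the job log is at hand.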

#3 Updated by Nathan Cutler almost 6 years ago

  • Description updated (diff)

#4 Updated by Josh Durgin almost 6 years ago

  • Priority changed from High to Normal

#5 Updated by Josh Durgin almost 6 years ago

  • Status changed from New to Can't reproduce

Re-open if it recurs.
