Bug #10126

closed

"Exception: timed out waiting for admin_socket" in upgrade:giant-x-next-distro-basic-vps run

Added by Yuri Weinstein over 9 years ago. Updated over 9 years ago.

Status:
Rejected
Priority:
Urgent
Assignee:
David Zafman
Category:
-
Target version:
-
% Done:
0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-17_08:56:42-upgrade:giant-x-next-distro-basic-vps/604989/

2014-11-17T11:18:01.619 INFO:tasks.thrashosds:joining thrashosds
2014-11-17T11:18:01.620 ERROR:teuthology.run_tasks:Manager failed: thrashosds
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 119, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/var/lib/teuthworker/src/ceph-qa-suite_next/tasks/thrashosds.py", line 174, in task
    thrash_proc.do_join()
  File "/var/lib/teuthworker/src/ceph-qa-suite_next/tasks/ceph_manager.py", line 288, in do_join
    self.thread.get()
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
Exception: timed out waiting for admin_socket to appear after osd.10 restart
Actions #1

Updated by Yuri Weinstein over 9 years ago

  • Subject changed from "Exception: timed out waiting for admin_socket" in to "Exception: timed out waiting for admin_socket" in upgrade:giant-x-next-distro-basic-vps run
Actions #2

Updated by Samuel Just over 9 years ago

  • Assignee set to David Zafman
  • Priority changed from Normal to Urgent
Actions #3

Updated by David Zafman over 9 years ago

  • Status changed from New to Rejected

I suspect this is caused by slow VMs. There are no core files, and I saw nothing of interest in the osd.10 log. The load average may have been about 6 around the time of the timeout, and values as high as 18 were seen.
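For readers unfamiliar with this failure mode: the exception in the traceback comes from a wait that gives up when the OSD's admin socket has not appeared within a deadline, so a slow node can trip it without any crash. The following is only a minimal sketch of such a poll-until-timeout loop, not the actual ceph_manager.py code. The function name, the default timeout and interval, and the injectable `exists`/`sleep`/`clock` parameters are all assumptions made for illustration.

```python
import os
import time


def wait_for_admin_socket(path, timeout=300.0, interval=5.0,
                          exists=os.path.exists, sleep=time.sleep,
                          clock=time.monotonic):
    """Poll until `path` exists, or raise once `timeout` seconds have passed.

    Hypothetical helper: the name, defaults, and parameters are illustrative
    and are not taken from ceph-qa-suite.
    """
    deadline = clock() + timeout
    while True:
        # Check first so a socket that already exists returns immediately.
        if exists(path):
            return
        # Give up only after the deadline; a heavily loaded node may simply
        # need longer than the budget allows.
        if clock() >= deadline:
            raise Exception(
                "timed out waiting for admin_socket to appear at %s" % path)
        sleep(interval)
```

On a heavily loaded VM the daemon may still be starting when the deadline passes, which would be consistent with finding no core files and nothing unusual in the OSD log.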

Actions #4

Updated by Yuri Weinstein over 9 years ago

Still an issue, not sure what to make of it.
Run http://pulpito.ceph.com/teuthology-2014-11-24_17:05:01-upgrade:giant-x-next-distro-basic-vps/

Job http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-24_17:05:01-upgrade:giant-x-next-distro-basic-vps/619428/

2014-11-24T22:48:57.505 INFO:teuthology.orchestra.run.vpm052.stdout:successfully deleted pool unique_pool_2
2014-11-24T22:48:57.509 DEBUG:teuthology.run_tasks:Unwinding manager ceph.restart
2014-11-24T22:48:57.509 DEBUG:teuthology.run_tasks:Unwinding manager rados
2014-11-24T22:48:57.509 INFO:tasks.rados:joining rados
2014-11-24T22:48:57.509 DEBUG:teuthology.run_tasks:Unwinding manager rados
2014-11-24T22:48:57.509 INFO:tasks.rados:joining rados
2014-11-24T22:48:57.509 DEBUG:teuthology.run_tasks:Unwinding manager ceph.restart
2014-11-24T22:48:57.509 DEBUG:teuthology.run_tasks:Unwinding manager thrashosds
2014-11-24T22:48:57.510 INFO:tasks.thrashosds:joining thrashosds
2014-11-24T22:48:57.510 ERROR:teuthology.run_tasks:Manager failed: thrashosds
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 119, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/var/lib/teuthworker/src/ceph-qa-suite_next/tasks/thrashosds.py", line 174, in task
    thrash_proc.do_join()
  File "/var/lib/teuthworker/src/ceph-qa-suite_next/tasks/ceph_manager.py", line 288, in do_join
    self.thread.get()
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
Exception: timed out waiting for admin_socket to appear after osd.13 restart
Actions #5

Updated by Yuri Weinstein over 9 years ago

  • Status changed from Rejected to New
Actions #6

Updated by David Zafman over 9 years ago

  • Status changed from New to Rejected

The second time this was reproduced, a misc.log was available. The load average was 17 on the node hosting osd.13, which evidently did not start up in a timely manner:

2014-11-25T00:29:25.931795-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-osd -f -i 13
2014-11-25T00:29:25.931815-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:29:40.298567-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 17
2014-11-25T00:29:55.313893-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 17
2014-11-25T00:29:59.135975-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:30:04.493125-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:30:04.861795-05:00 vpm045 CROND19050: (root) CMD (/usr/lib64/sa/sa1 1 1)
2014-11-25T00:30:09.716262-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:30:10.335799-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 17
2014-11-25T00:30:14.871124-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:30:24.773126-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:30:25.351825-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 20
2014-11-25T00:30:32.665934-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:30:38.845530-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:30:40.366813-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 19
2014-11-25T00:30:47.604863-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:30:55.382290-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 18
2014-11-25T00:31:04.185813-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:31:10.040583-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:31:10.393310-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 17
2014-11-25T00:31:17.417797-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:31:25.409078-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 16
2014-11-25T00:31:28.725539-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:31:40.425752-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 16
2014-11-25T00:31:40.690805-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:31:48.630767-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:31:53.806705-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:31:55.441199-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 15
2014-11-25T00:31:59.339533-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:32:10.452191-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 14
2014-11-25T00:32:12.831295-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:32:18.838913-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:32:24.090366-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:32:25.465091-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 14
2014-11-25T00:32:37.203739-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:32:40.479649-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 15
2014-11-25T00:32:48.064169-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:32:53.757125-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:32:55.490796-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 15
2014-11-25T00:33:00.988492-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:33:10.505857-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 15
2014-11-25T00:33:10.765120-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:33:18.896284-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:33:25.521236-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 14
2014-11-25T00:33:26.384788-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:33:32.955774-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:33:40.535540-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 15
2014-11-25T00:33:55.546410-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 16
2014-11-25T00:34:00.594364-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:34:10.561813-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 16
2014-11-25T00:34:13.413935-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:34:21.705924-05:00 vpm045 sudo: ubuntu : TTY=unknown ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
2014-11-25T00:34:25.576320-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 17
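The load figures in the excerpt above come from sendmail's "rejecting connections" messages, which it logs while the load average is above its configured refusal threshold. A minimal sketch of pulling those samples out of a misc.log excerpt follows, assuming the line layout matches what is pasted here. The regular expression and the `load_samples` name are illustrative, not part of teuthology.

```python
import re

# Matches lines such as:
# 2014-11-25T00:30:25.351825-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 20
LOAD_RE = re.compile(
    r"^(\S+) (\S+) sendmail\S*: rejecting connections on daemon MTA: "
    r"load average: (\d+)$"
)


def load_samples(lines):
    """Return (timestamp, host, load) tuples for sendmail load-average lines.

    Hypothetical helper; lines that do not match (sudo, cron) are skipped.
    """
    samples = []
    for line in lines:
        m = LOAD_RE.match(line.strip())
        if m:
            samples.append((m.group(1), m.group(2), int(m.group(3))))
    return samples


if __name__ == "__main__":
    excerpt = [
        "2014-11-25T00:29:40.298567-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 17",
        "2014-11-25T00:30:04.861795-05:00 vpm045 CROND19050: (root) CMD (/usr/lib64/sa/sa1 1 1)",
        "2014-11-25T00:30:25.351825-05:00 vpm045 sendmail1231: rejecting connections on daemon MTA: load average: 20",
    ]
    samples = load_samples(excerpt)
    peak = max(samples, key=lambda s: s[2])
    print("peak load %d on %s at %s" % (peak[2], peak[1], peak[0]))
```

Running the same function over the full excerpt would give the load range during the window in which osd.13 was being restarted.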
