Bug #11995

closed

"HEALTH_WARN 1 mons down, quorum 0,2 b,c" in upgrade:giant-x-next-distro-basic-vps run

Added by Yuri Weinstein almost 9 years ago. Updated almost 9 years ago.

Status: Duplicate
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: upgrade/giant-x
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Run: http://pulpito.ceph.com/teuthology-2015-06-11_17:05:04-upgrade:giant-x-next-distro-basic-vps/
Jobs: ['929547', '929549', '929553', '929555', '929563', '929573', '929581', '929583', '929595', '929614', '929616', '929628', '929642', '929648', '929654']
Logs for one: http://qa-proxy.ceph.com/teuthology/teuthology-2015-06-11_17:05:04-upgrade:giant-x-next-distro-basic-vps/929549/

2015-06-11T18:22:26.236 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 3 mons down, quorum  
2015-06-11T18:22:33.237 INFO:teuthology.orchestra.run.vpm127:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2015-06-11T18:22:41.364 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 1 mons down, quorum 0,2 b,c
2015-06-11T18:22:48.365 INFO:teuthology.orchestra.run.vpm127:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2015-06-11T18:22:57.946 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 1 mons down, quorum 0,2 b,c
2015-06-11T18:23:04.947 INFO:teuthology.orchestra.run.vpm127:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2015-06-11T18:23:14.334 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 1 mons down, quorum 0,2 b,c
2015-06-11T18:23:15.334 ERROR:teuthology.parallel:Exception in parallel execution
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 82, in __exit__
    for result in self:
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 101, in next
    resurrect_traceback(result)
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 19, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/task/parallel.py", line 50, in _run_spawned
    mgr = run_tasks.run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 41, in run_one_task
    return fn(**kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/task/sequential.py", line 48, in task
    mgr.__enter__()
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/var/lib/teuthworker/src/ceph-qa-suite_next/tasks/ceph.py", line 1136, in restart
    healthy(ctx=ctx, config=None)
  File "/var/lib/teuthworker/src/ceph-qa-suite_next/tasks/ceph.py", line 1042, in healthy
    remote=mon0_remote,
  File "/home/teuthworker/src/teuthology_master/teuthology/misc.py", line 874, in wait_until_healthy
    while proceed():
  File "/home/teuthworker/src/teuthology_master/teuthology/contextutil.py", line 134, in __call__
    raise MaxWhileTries(error_msg)
MaxWhileTries: 'wait_until_healthy' reached maximum tries (150) after waiting for 900 seconds
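
For readers following the log above, here is a minimal Python sketch of the pattern it shows: poll a health summary, parse it, and give up after a bounded number of tries. It is not teuthology's implementation. `parse_health`, `wait_until_ok`, and `MaxTriesReached` are hypothetical names introduced for illustration. The 6-second interval is an assumption, inferred only from the reported 150 tries over 900 seconds.

```python
import re
import time

# Hypothetical pattern for health summaries of the form seen in this report,
# e.g. "HEALTH_WARN 1 mons down, quorum 0,2 b,c" or plain "HEALTH_OK".
HEALTH_RE = re.compile(
    r"^(?P<status>HEALTH_\w+)"
    r"(?:\s+(?P<down>\d+) mons down, quorum\s*(?P<quorum>.*))?$"
)


def parse_health(line):
    """Split a health summary into status, down count, and quorum members."""
    m = HEALTH_RE.match(line.strip())
    if m is None:
        raise ValueError("unrecognized health line: %r" % line)
    down = int(m.group("down")) if m.group("down") else 0
    fields = (m.group("quorum") or "").split()
    # First field holds monitor ranks, second holds monitor names.
    ranks = [int(r) for r in fields[0].split(",")] if fields else []
    names = fields[1].split(",") if len(fields) > 1 else []
    return {
        "status": m.group("status"),
        "mons_down": down,
        "quorum_ranks": ranks,
        "quorum_names": names,
    }


class MaxTriesReached(Exception):
    """Hypothetical stand-in for the MaxWhileTries error in the traceback."""


def wait_until_ok(get_health, tries=150, sleep=6, sleeper=time.sleep):
    """Poll get_health() until it reports HEALTH_OK or tries run out.

    Returns the number of attempts used. The defaults are assumptions:
    150 tries * 6 seconds = 900 seconds, matching the figures in the report.
    """
    for attempt in range(1, tries + 1):
        if parse_health(get_health())["status"] == "HEALTH_OK":
            return attempt
        sleeper(sleep)
    raise MaxTriesReached(
        "'wait_until_ok' reached maximum tries (%d) after waiting for %d seconds"
        % (tries, tries * sleep)
    )
```

Applied to the repeated line in the log, `parse_health` reports one monitor down with ranks 0 and 2 (names b and c) in quorum. That leaves rank 1 as the missing monitor, which matches the `mon.a@1(electing)` entries in the monitor log quoted in the comments.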


Related issues 1 (0 open, 1 closed)

Is duplicate of Ceph - Bug #12064: mon: MMonMetadata send to mons that don't support it (Status: Resolved, Assignee: Sage Weil, 06/17/2015)

Actions #1

Updated by Yuri Weinstein almost 9 years ago

  • Project changed from teuthology to Ceph
Actions #2

Updated by Samuel Just almost 9 years ago

2015-06-12 01:25:04.610650 7fc245154700 10 mon.a@1(electing) e1 ms_get_authorizer for mon
2015-06-12 01:25:04.611871 7fc245154700 1 -- 10.214.130.127:6789/0 >> 10.214.130.71:6789/0 pipe(0x3b3e500 sd=21 :55060 s=2 pgs=1006804 cs=2013073 l=0 c=0x3c256e0).do_sendmsg error (32) Broken pipe
2015-06-12 01:25:04.611955 7fc24e9dd700 0 -- 10.214.130.127:6789/0 >> 10.214.130.71:6789/0 pipe(0x3b3e500 sd=21 :55060 s=2 pgs=1006804 cs=2013073 l=0 c=0x3c256e0).fault, initiating reconnect
2015-06-12 01:25:04.611991 7fc245154700 1 -- 10.214.130.127:6789/0 >> 10.214.130.71:6789/0 pipe(0x3b3e500 sd=21 :55060 s=1 pgs=1006804 cs=2013074 l=0 c=0x3c256e0).writer error sending 0x3a758c0, (32) Broken pipe
2015-06-12 01:25:04.612005 7fc245154700 0 -- 10.214.130.127:6789/0 >> 10.214.130.71:6789/0 pipe(0x3b3e500 sd=21 :55060 s=1 pgs=1006804 cs=2013074 l=0 c=0x3c256e0).fault

Actions #3

Updated by Samuel Just almost 9 years ago

  • Status changed from New to Rejected

Network blip

Actions #4

Updated by Samuel Just almost 9 years ago

  • Status changed from Rejected to 12
  • Assignee set to Sage Weil
  • Priority changed from Normal to Urgent
Actions #5

Updated by David Zafman almost 9 years ago

  • Status changed from 12 to Duplicate