Bug #9702

closed

"MaxWhileTries: 'wait_until_healthy'reached maximum tries" in upgrade:firefly-x-giant-distro-basic-multi run

Added by Yuri Weinstein over 9 years ago. Updated over 9 years ago.

Status: Duplicate
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-06_19:20:01-upgrade:firefly-x-giant-distro-basic-multi/531749/

2014-10-07T10:00:23.096 INFO:teuthology.orchestra.run.plana83:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2014-10-07T10:00:23.374 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 1 requests are blocked > 32 sec
2014-10-07T10:00:23.940 INFO:tasks.rados.rados.0.plana09.stdout:update_object_version oid 2 v 920 (ObjNum 714 snap 221 seq_num 714) dirty exists
2014-10-07T10:00:24.375 ERROR:teuthology.parallel:Exception in parallel execution
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 82, in __exit__
    for result in self:
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 101, in next
    resurrect_traceback(result)
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 19, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/task/parallel.py", line 50, in _run_spawned
    mgr = run_tasks.run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 39, in run_one_task
    return fn(**kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/task/sequential.py", line 48, in task
    mgr.__enter__()
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/var/lib/teuthworker/src/ceph-qa-suite_master/tasks/ceph.py", line 1090, in restart
    healthy(ctx=ctx, config=None)
  File "/var/lib/teuthworker/src/ceph-qa-suite_master/tasks/ceph.py", line 995, in healthy
    remote=mon0_remote,
  File "/home/teuthworker/src/teuthology_master/teuthology/misc.py", line 822, in wait_until_healthy
    while proceed():
  File "/home/teuthworker/src/teuthology_master/teuthology/contextutil.py", line 127, in __call__
    raise MaxWhileTries(error_msg)
MaxWhileTries: 'wait_until_healthy'reached maximum tries (150) after waiting for 900 seconds
archive_path: /var/lib/teuthworker/archive/teuthology-2014-10-06_19:20:01-upgrade:firefly-x-giant-distro-basic-multi/531749
branch: giant
description: upgrade:firefly-x/parallel/{0-cluster/start.yaml 1-firefly-install/firefly.yaml
  2-workload/ec-rados-default.yaml 3-upgrade-sequence/upgrade-mon-osd-mds.yaml 4-final-upgrade/client.yaml
  5-final-workload/rados_loadgenmix.yaml distros/ubuntu_12.04.yaml}
email: ceph-qa@ceph.com
job_id: '531749'
kernel: &id001
  kdb: true
  sha1: distro
last_in_suite: false
machine_type: plana,burnupi,mira
name: teuthology-2014-10-06_19:20:01-upgrade:firefly-x-giant-distro-basic-multi
nuke-on-error: true
os_type: ubuntu
os_version: '12.04'
overrides:
  admin_socket:
    branch: giant
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
        mon warn on legacy crush tunables: false
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
    log-whitelist:
    - slow request
    - scrub mismatch
    - ScrubResult
    sha1: 260933b1107182e37ef30593134dfea415fb3a3b
  ceph-deploy:
    branch:
      dev: giant
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: 260933b1107182e37ef30593134dfea415fb3a3b
  s3tests:
    branch: giant
  workunit:
    sha1: 260933b1107182e37ef30593134dfea415fb3a3b
owner: scheduled_teuthology@teuthology
priority: 1000
roles:
- - mon.a
  - mds.a
  - osd.0
  - osd.1
- - mon.b
  - mon.c
  - osd.2
  - osd.3
- - client.0
  - client.1
suite: upgrade:firefly-x
suite_branch: master
suite_path: /var/lib/teuthworker/src/ceph-qa-suite_master
targets:
  ubuntu@plana09.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDEYHVcLAeMRhz4qfwrWIFsf79HIqaChMbGC2LGuvVgkMM/a1uVGhsYFHWlfDKMo9UnShyfjTUUmpD9jjjNpfuejhynQeTYBLjjrvE2yS7J9chyxRhSCXUN1rnAa5UmcDzd9CJjltX9h18iHNDPRGu1H3gzaZzonQo9Hwk6H+Xhubz/Y7GYBRq4jySYCnQ11hNj2pdnwxfiqjNawjaB3yYYTnA6NT9QUEPNNtTgFStyuACTKBI1JowDdHaxafRz0lMnCoU+r84gSZ+TWqyNkj9+8tyir6QqXQRV9CDlAQPIPiMPVnl9u17vEISaIR/I5RL7G3KR5CxER/Kc0oIhXPn1
  ubuntu@plana73.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDZ2A+g0XPeZCpKCT1zMMe/aFZFeCv92KQNgVc4vln9Nr1zrJ9Lntpim0wMUwSbxOEVRRPGqqmbkDtA9FZ5TWNI6azqokdFe/4GWIq4hTb0KeinmxtKEQTEcPfGtnf37I2QOPIlltDLNIWAAjBB3GUWe76kKhdbKl44HsqcDXIXcodrbLuxbNZ6YNY+wGNda22+LzFZ8clyYjG9FmiFO65ykV6HBm5gn8LF7dVNpCPzk7Occq3T+/w4lJfIcEZVKtGs/a1NJOa3hJltprzzjamsjTxqaPfvN8H/EFllI5wwbvcZJ/ryikg7CXo9YczApz/YP1GGB0bx3Fmic8fGg7CT
  ubuntu@plana83.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDRaI5lWSgbamgXGNjThU4cSIYyLBgFaaMSPHKuimiUTqvuO8yCqK6aTNIq2cc7SoaDbVayx9m7Kr8CF5WAzXyBy/bfEInCckhI1sQL+x+9gRaisL5w7lSS/paxrroaTO/pG+JNydCZmLZZdmzI++g4jCqlxONu5pfUCoO5nBEnBEgyMZKB6mz7NFQIlHhmCUE61uepVE4CUUPFISEWBotxyrRAkROZGv8lncNZFM0MKXjX55ep360O8x9v0hYddqW3BO3Aszbro93161Lw/sOfXuO1n/bn9s2qfDed9wbPYeA9RiLRZMVZuNwNMETH2gH0uuMKK6q7p375WVvLFegF
tasks:
- internal.lock_machines:
  - 3
  - plana,burnupi,mira
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.push_inventory: null
- internal.serialize_remote_roles: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.sudo: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install:
    branch: firefly
- print: '**** done installing firefly'
- ceph:
    fs: xfs
- print: '**** done ceph'
- parallel:
  - workload
  - upgrade-sequence
- print: '**** done parallel'
- install.upgrade:
    client.0: null
- print: '**** done install.upgrade'
- workunit:
    clients:
      client.1:
      - rados/load-gen-mix.sh
teuthology_branch: master
tube: multi
upgrade-sequence:
  sequential:
  - install.upgrade:
      mon.a: null
  - print: '**** done install.upgrade mon.a to the version from teuthology-suite arg'
  - ceph.restart:
      daemons:
      - mon.a
      wait-for-healthy: true
  - sleep:
      duration: 60
  - ceph.restart:
      daemons:
      - osd.0
      - osd.1
      wait-for-healthy: true
  - sleep:
      duration: 60
  - ceph.restart:
    - mds.a
  - sleep:
      duration: 60
  - print: '**** running mixed versions of osds and mons'
  - exec:
      mon.b:
      - ceph osd crush tunables firefly
  - install.upgrade:
      mon.b: null
  - print: '**** done install.upgrade mon.b to the version from teuthology-suite arg'
  - ceph.restart:
      daemons:
      - mon.b
      - mon.c
      wait-for-healthy: true
  - sleep:
      duration: 60
  - ceph.restart:
      daemons:
      - osd.2
      - osd.3
      wait-for-healthy: true
  - sleep:
      duration: 60
verbose: true
worker_log: /var/lib/teuthworker/archive/worker_logs/worker.multi.3172
workload:
  sequential:
  - rados:
      clients:
      - client.0
      ec_pool: true
      objects: 50
      op_weights:
        append: 100
        copy_from: 50
        delete: 50
        read: 100
        rmattr: 25
        rollback: 50
        setattr: 25
        snap_create: 50
        snap_remove: 50
        write: 0
      ops: 4000
description: upgrade:firefly-x/parallel/{0-cluster/start.yaml 1-firefly-install/firefly.yaml
  2-workload/ec-rados-default.yaml 3-upgrade-sequence/upgrade-mon-osd-mds.yaml 4-final-upgrade/client.yaml
  5-final-workload/rados_loadgenmix.yaml distros/ubuntu_12.04.yaml}
duration: 1743.8906021118164
failure_reason: '''wait_until_healthy''reached maximum tries (150) after waiting for
  900 seconds'
flavor: basic
owner: scheduled_teuthology@teuthology
success: false
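
For readers unfamiliar with the failure message, here is a minimal sketch of the kind of bounded polling loop that produces it. This is not teuthology's actual implementation: the `wait_until` function, its parameters, and the 6-second interval are assumptions, the interval being inferred only from the numbers in the message (900 seconds / 150 tries). The `MaxWhileTries` class below is a local stand-in with the same name as the exception in the traceback.

```python
import time


class MaxWhileTries(Exception):
    """Local stand-in for the exception named in the traceback."""


def wait_until(check, action="wait_until_healthy", tries=150, sleep=6,
               sleeper=time.sleep):
    """Poll check() until it returns True; raise after `tries` failed attempts.

    Hypothetical helper: names and defaults are assumptions chosen to match
    the figures in the failure message, not taken from teuthology.
    """
    for attempt in range(1, tries + 1):
        if check():
            return attempt  # number of polls it took to succeed
        sleeper(sleep)      # wait before the next poll
    raise MaxWhileTries(
        "'%s' reached maximum tries (%d) after waiting for %d seconds"
        % (action, tries, tries * sleep)
    )
```

In this job the check would correspond to `ceph health` reporting a healthy cluster; because the cluster stayed in HEALTH_WARN with a blocked request, a loop like this would exhaust all 150 attempts and raise.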

Related issues 1 (0 open, 1 closed)

Related to Ceph - Bug #9835: osd: bug in misdirected op checks (firefly) - Resolved - 10/20/2014

Actions #2

Updated by Sage Weil over 9 years ago

  • Status changed from New to Duplicate

Probably a duplicate of #9835.

Actions #3

Updated by Yuri Weinstein over 9 years ago

Update in run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-26_18:13:01-upgrade:firefly-x-giant-distro-basic-multi/

Jobs ['571925', '571926'] appear to have failed for the same reason.

Actions #6

Updated by Yuri Weinstein over 9 years ago

Same issue in run http://pulpito.front.sepia.ceph.com/teuthology-2014-11-10_17:18:01-upgrade:firefly-x-next-distro-basic-vps/

Failure: 'wait_until_healthy'reached maximum tries (150) after waiting for 900 seconds
['594228', '594230']
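
To confirm that reruns stall on the same health warning, it may help to pull the health samples out of each job's teuthology.log. The sketch below assumes every sample follows the format of the single `DEBUG:teuthology.misc:Ceph health:` line quoted in the description; `health_samples` is a hypothetical helper, not part of teuthology.

```python
import re

# Assumed line format, based on the one sample in the description:
# 2014-10-07T10:00:23.374 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 1 requests are blocked > 32 sec
HEALTH_LINE = re.compile(
    r"^(?P<ts>\S+) DEBUG:teuthology\.misc:Ceph health: "
    r"(?P<status>HEALTH_\w+)\s*(?P<detail>.*)$"
)


def health_samples(lines):
    """Return (timestamp, status, detail) for each 'Ceph health' log line."""
    samples = []
    for line in lines:
        match = HEALTH_LINE.match(line.strip())
        if match:
            samples.append(
                (match.group("ts"), match.group("status"), match.group("detail"))
            )
    return samples
```

Passing an open log file (or any iterable of lines) returns the health samples in order, so the last few entries show what the cluster was reporting when the wait timed out.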