Bug #8769

closed

osd.3 crashed in upgrade:dumpling-x:stress-split-firefly---basic-multi suite

Added by Yuri Weinstein almost 10 years ago. Updated almost 10 years ago.

Status:
Rejected
Priority:
Urgent
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-07-05_19:55:01-upgrade:dumpling-x:stress-split-firefly---basic-multi/344776/

I could not find a crash trace in the OSD's log file (?)
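A minimal sketch of how an OSD log could be scanned for crash markers. The two patterns are assumptions about what a Ceph crash dump usually contains, not an exhaustive list, and the helper name `find_crash_lines` is hypothetical. The sample lines are synthetic and not taken from this run.

```python
import re

# Assumed crash markers; adjust for the Ceph version in use.
CRASH_PATTERNS = [
    re.compile(r"\*\*\* Caught signal"),
    re.compile(r"FAILED assert"),
]


def find_crash_lines(lines):
    """Return (line_number, text) pairs for lines matching any crash marker."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        if any(p.search(line) for p in CRASH_PATTERNS):
            hits.append((lineno, line.rstrip("\n")))
    return hits


# Synthetic sample lines, for illustration only (not from job 344776).
sample = [
    "2014-07-07 12:00:00.000 7f0 10 osd.3 100 tick\n",
    "*** Caught signal (Aborted) **\n",
    "2014-07-07 12:00:01.000 7f0 10 osd.3 100 tick\n",
]
print(find_crash_lines(sample))
```

For a real log, pass an open file object instead of `sample`; an empty result would be consistent with the daemon never starting rather than crashing mid-run.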

2014-07-07T12:18:50.439 INFO:teuthology.orchestra.run.mira070:Running: 'rm -rf /home/ubuntu/cephtest/apache/tmp.client.0 && rmdir /home/ubuntu/cephtest/apache/htdocs.client.0'
2014-07-07T12:18:50.487 INFO:teuthology.orchestra.run.mira070:Running: 'rmdir /home/ubuntu/cephtest/apache'
2014-07-07T12:18:50.595 INFO:teuthology.task.thrashosds:joining thrashosds
2014-07-07T12:18:50.595 ERROR:teuthology.run_tasks:Manager failed: thrashosds
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 105, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/teuthworker/src/teuthology_master/teuthology/task/thrashosds.py", line 178, in task
    thrash_proc.do_join()
  File "/home/teuthworker/src/teuthology_master/teuthology/task/ceph_manager.py", line 166, in do_join
    self.thread.get()
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
Exception: timed out waiting for admin_socket to appear after osd.3 restart
archive_path: /var/lib/teuthworker/archive/teuthology-2014-07-05_19:55:01-upgrade:dumpling-x:stress-split-firefly---basic-multi/344776
branch: firefly
description: upgrade/dumpling-x/stress-split/{0-cluster/start.yaml 1-dumpling-install/dumpling.yaml
  2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/rbd-cls.yaml
  6-next-mon/monb.yaml 7-workload/rbd_api.yaml 8-next-mon/monc.yaml 9-workload/{rados_api_tests.yaml
  rbd-python.yaml rgw-s3tests.yaml snaps-many-objects.yaml} distros/ubuntu_14.04.yaml}
email: null
job_id: '344776'
last_in_suite: false
machine_type: plana,mira
name: teuthology-2014-07-05_19:55:01-upgrade:dumpling-x:stress-split-firefly---basic-multi
nuke-on-error: true
os_type: ubuntu
os_version: '14.04'
overrides:
  admin_socket:
    branch: firefly
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
        mon warn on legacy crush tunables: false
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
    log-whitelist:
    - slow request
    - wrongly marked me down
    - objects unfound and apparently lost
    - log bound mismatch
    sha1: 6d6039a5a56743c006a0d081157cb6ee9e3b7af6
  ceph-deploy:
    branch:
      dev: firefly
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: 6d6039a5a56743c006a0d081157cb6ee9e3b7af6
  s3tests:
    branch: firefly
  workunit:
    sha1: 6d6039a5a56743c006a0d081157cb6ee9e3b7af6
owner: scheduled_teuthology@teuthology
priority: 1000
roles:
- - mon.a
  - mon.b
  - mds.a
  - osd.0
  - osd.1
  - osd.2
- - osd.3
  - osd.4
  - osd.5
  - mon.c
- - client.0
suite: upgrade:dumpling-x:stress-split
targets:
  ubuntu@mira070.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAyAL58JJIeR08vrPBlno92U2H4WTdNS6tMPXy5j6ASuYycUmACwAlxdCyqhzzhnuHXPMxhGLEhpzPTsbtFkCoLc8QtwxJMQ4S4JhrKTJVJDG5CCPmBmth26RuAR1wtT61CIYak0+Pbo1H1GbHQTBIAzlybwJfcgVHGsQwxWrXz67l5WSwlPuH0InksalSqotuQRx2PMiS1AbWPWl/B8HfNI6M7J828Z0VGhNcPYNay6p7aYZZyWJ13LMQtHdfjePor3TjKjG24UVT6BzMKP+l8Da/lh27IstW3g3KWw136gdE+3QbaV8e1fq7NhG0X78rqx/58D+MCsLqxaoQkJlgMQ==
  ubuntu@mira071.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAst1zlCeD0IgD/y+HQZxNewwT7z5zG0jMxopPq9jd+4bTN2U4rn+Uok1J4+owV+dXbShXRw11W/1TMERNqwE3bpuE7DF1uljOph/KqZa8kDqtApqjRJhvSh3MOWAOqjxK2ZLLPx1KwUyiKvRA+GtlmR2nlQLgNaim72ZHhQhxwts4bHrEwo9eWZ0uD+xaDkDiSjZRffBIIdyrZKJpxUglsS/Pp+YebdcoRay3KqmRTxxlIsMavp79683HqFFG9UbIb9WtbX5l35bZUelkmpc2vMvUpBKYNGz3j4FHPpLIzBVC3Uk8p4PhOQONTHmQeSupBwFNB/TpG0URaJuyOsr+vw==
  ubuntu@mira072.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA0eJaqz9JzA5IgFq8po2chIKwyg+eTY6QuxUJJMXRVn5sB8BSgNW9tWF24glW5wOaLKke6dVKFUm1smjoTeVoXtgLhSXRlbTAOCToTozrE/LmxFHAiXgEAsUdWA8YgycBUiz1yGKBLUEicM7Cuh5nT/JKm8lyr05fNfPXDeV9ASUA3PMgLK7xjzN61+SEGXwVA20MnK+wXUkJ5E4IMdZkHD0Vee31Sffg0V1TfwKAv/ldnpReEPGUX6Y2S1+WyzOAvO/PgS+2969TIyWKLWDs0DdrOpQu0kcDx4LjXIYrfJ0WlOd1OvPGGsyxZ39lY51xhb18ZztK23yseQtAo/qbOw==
tasks:
- internal.lock_machines:
  - 3
  - plana,mira
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.serialize_remote_roles: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.sudo: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install:
    branch: dumpling
- ceph:
    fs: xfs
- install.upgrade:
    osd.0: null
- ceph.restart:
    daemons:
    - osd.0
    - osd.1
    - osd.2
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    thrash_primary_affinity: false
    timeout: 1200
- ceph.restart:
    daemons:
    - mon.a
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    branch: dumpling
    clients:
      client.0:
      - cls/test_cls_rbd.sh
- ceph.restart:
    daemons:
    - mon.b
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rbd/test_librbd.sh
- install.upgrade:
    mon.c: null
- ceph.restart:
    daemons:
    - mon.c
    wait-for-healthy: false
    wait-for-osds-up: true
- ceph.wait_for_mon_quorum:
  - a
  - b
  - c
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rados/test-upgrade-firefly.sh
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rbd/test_librbd_python.sh
- rgw:
    client.0:
      idle_timeout: 300
- swift:
    client.0:
      rgw_server: client.0
- rados:
    clients:
    - client.0
    objects: 500
    op_weights:
      delete: 50
      read: 100
      rollback: 50
      snap_create: 50
      snap_remove: 50
      write: 100
    ops: 4000
teuthology_branch: master
tube: multi
verbose: false
worker_log: /var/lib/teuthworker/archive/worker_logs/worker.multi.9915
description: upgrade/dumpling-x/stress-split/{0-cluster/start.yaml 1-dumpling-install/dumpling.yaml
  2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/rbd-cls.yaml
  6-next-mon/monb.yaml 7-workload/rbd_api.yaml 8-next-mon/monc.yaml 9-workload/{rados_api_tests.yaml
  rbd-python.yaml rgw-s3tests.yaml snaps-many-objects.yaml} distros/ubuntu_14.04.yaml}
duration: 4533.542229890823
failure_reason: timed out waiting for admin_socket to appear after osd.3 restart
flavor: basic
owner: scheduled_teuthology@teuthology
success: false
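
For context, the failure reason above comes from a poll-with-timeout check after the daemon restart. The sketch below shows the general pattern only; it is not teuthology's actual implementation (which runs its check on the remote host), and the function name, defaults, and socket path are hypothetical.

```python
import os
import time


def wait_for_path(path, timeout=300.0, interval=5.0,
                  exists=os.path.exists, sleep=time.sleep,
                  clock=time.monotonic):
    """Poll until `path` exists, raising once `timeout` seconds have passed.

    `exists`, `sleep`, and `clock` are injectable so the loop can be
    exercised without a real daemon or real delays.
    """
    deadline = clock() + timeout
    while not exists(path):
        if clock() >= deadline:
            raise Exception(
                "timed out waiting for {0} to appear".format(path))
        sleep(interval)
    return True


# Hypothetical usage: the path below is an assumed admin socket location.
# wait_for_path("/var/run/ceph/ceph-osd.3.asok", timeout=300)
```

A timeout from a loop like this only says the socket never showed up; it does not distinguish a crashed daemon from one that failed to start, which is why the OSD log is needed.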
Actions #1

Updated by Sage Weil almost 10 years ago

  • Priority changed from Normal to Urgent
Actions #2

Updated by Sage Weil almost 10 years ago

  • Status changed from New to Rejected

Not much to go on without the OSD log; let's wait for it to reproduce.
