Bug #9714
Status: closed
Dead jobs in upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-multi run
Description
Dead jobs: ['534726', '534728', '534730', '534737', '534738', '534741', '534742']
Dead: 2014-10-09T03:51:11.965 INFO:tasks.thrashosds.thrasher:choose_action: min_in 3 min_out 0 min_live 2 min_dead 0
Logs for one example are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-08_19:30:01-upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-multi/534726/
Last lines:
2014-10-09T03:51:05.869 INFO:tasks.thrashosds.thrasher:in_osds: [0, 1, 2, 3, 4, 5] out_osds: [] dead_osds: [] live_osds: [1, 0, 3, 2, 5, 4]
2014-10-09T03:51:05.869 INFO:tasks.thrashosds.thrasher:choose_action: min_in 3 min_out 0 min_live 2 min_dead 0
2014-10-09T03:51:05.870 INFO:tasks.thrashosds.thrasher:Removing osd 0, in_osds are: [0, 1, 2, 3, 4, 5]
2014-10-09T03:51:05.870 INFO:teuthology.orchestra.run.mira046:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph osd out 0'
2014-10-09T03:51:06.956 INFO:teuthology.orchestra.run.mira046.stderr:marked out osd.0.
2014-10-09T03:51:11.965 INFO:tasks.thrashosds.thrasher:in_osds: [1, 2, 3, 4, 5] out_osds: [0] dead_osds: [] live_osds: [1, 0, 3, 2, 5, 4]
2014-10-09T03:51:11.965 INFO:tasks.thrashosds.thrasher:choose_action: min_in 3 min_out 0 min_live 2 min_dead 0
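The job dies between two `choose_action` iterations. At each step the thrasher picks its next OSD operation subject to the minimum counts it logs (`min_in 3 min_out 0 min_live 2 min_dead 0`). As a rough illustration of that constraint check (a hypothetical simplification for readers, not the actual `tasks.thrashosds` code):

```python
import random

def choose_action(in_osds, out_osds, live_osds, dead_osds,
                  min_in=3, min_out=0, min_live=2, min_dead=0):
    """Pick a thrash action while honoring the minimum-count
    constraints the thrasher logs (min_in/min_out/min_live/min_dead)."""
    actions = []
    # Only mark an OSD out if enough OSDs would remain "in".
    if len(in_osds) > min_in:
        actions.append(('out_osd', random.choice(in_osds)))
    # Only mark an OSD back in if enough would remain "out".
    if len(out_osds) > min_out:
        actions.append(('in_osd', random.choice(out_osds)))
    # Only kill an OSD if enough would remain alive.
    if len(live_osds) > min_live:
        actions.append(('kill_osd', random.choice(live_osds)))
    # Only revive an OSD if enough would remain dead.
    if len(dead_osds) > min_dead:
        actions.append(('revive_osd', random.choice(dead_osds)))
    return random.choice(actions) if actions else None
```

In the log above, `in_osds` has six members and `min_in` is 3, so "Removing osd 0" (`ceph osd out 0`) is a legal choice; the hang happens after the subsequent `choose_action`, not inside this selection logic.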
archive_path: /var/lib/teuthworker/archive/teuthology-2014-10-08_19:30:01-upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-multi/534726
branch: giant
description: upgrade:dumpling-firefly-x:stress-split/{00-cluster/start.yaml 01-dumpling-install/dumpling.yaml 02-partial-upgrade-firefly/firsthalf.yaml 03-thrash/default.yaml 04-mona-upgrade-firefly/mona.yaml 05-workload/rbd-cls.yaml 06-monb-upgrade-firefly/monb.yaml 07-workload/radosbench.yaml 08-monc-upgrade-firefly/monc.yaml 09-workload/{rbd-python.yaml rgw-s3tests.yaml} 10-osds-upgrade-firefly/secondhalf.yaml 11-workload/snaps-few-objects.yaml 12-partial-upgrade-x/first.yaml 13-thrash/default.yaml 14-mona-upgrade-x/mona.yaml 15-workload/rbd-import-export.yaml 16-monb-upgrade-x/monb.yaml 17-workload/readwrite.yaml 18-monc-upgrade-x/monc.yaml 19-workload/radosbench.yaml 20-osds-upgrade-x/osds_secondhalf.yaml 21-final-workload/rados_stress_watch.yaml distros/ubuntu_14.04.yaml}
email: ceph-qa@ceph.com
job_id: '534726'
kernel: &id001
  kdb: true
  sha1: distro
last_in_suite: false
machine_type: plana,burnupi,mira
name: teuthology-2014-10-08_19:30:01-upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-multi
nuke-on-error: true
os_type: ubuntu
os_version: '14.04'
overrides:
  admin_socket:
    branch: giant
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
        mon warn on legacy crush tunables: false
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
    log-whitelist:
    - slow request
    - wrongly marked me down
    - objects unfound and apparently lost
    - log bound mismatch
    - wrongly marked me down
    - objects unfound and apparently lost
    - log bound mismatch
    sha1: 3bfb5fab41b6247259183c3f52c786e35beb3b01
  ceph-deploy:
    branch:
      dev: giant
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: 3bfb5fab41b6247259183c3f52c786e35beb3b01
  s3tests:
    branch: giant
  workunit:
    sha1: 3bfb5fab41b6247259183c3f52c786e35beb3b01
owner: scheduled_teuthology@teuthology
priority: 1000
roles:
- - mon.a
  - mon.b
  - mds.a
  - osd.0
  - osd.1
  - osd.2
  - mon.c
- - osd.3
  - osd.4
  - osd.5
- - client.0
suite: upgrade:dumpling-firefly-x:stress-split
suite_branch: master
suite_path: /var/lib/teuthworker/src/ceph-qa-suite_master
targets:
  ubuntu@mira046.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDYIVIibHwBdEDJ25Owjw8QkSrlozG8FODNRxu1ttOagKkY3uaBnwVVQw0sLCHZi3n1O1nAWWRTfY69/4OPxJIRuFy6Jqz8dx9d6SHIZk1IwS+PUM1s2vJVY7cm3V3ibfQBmiyTD8ydRlKW8nmOMMHnMz5on1zNFgPgwEVziXdr0dmU5qakTwkUOchrHka/fH6CzAvHTmMANsWgpMek/Nqs2fxRF7/bufj4/4H8Et6AP2iF7mGIgE5beg+WLoXHE4mQdv5Zcs6FsDFiKpLSxrZFa6fx4VO1H0sRwbFdDVKuASH68HT+8eni6qvm+l2wYJHloAuYYbpH6xBMhMTW97WZ
  ubuntu@plana24.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC722lqzhbMaA+ku5+dLCoZyGcbaXA5/YIPr4oEHwmfjeRZIuARFjMpJaA/+TNdUACVxBP/vePdPYxurjBi0RggFa4YqvKoN8m1RcVZ8QiSPqDCBb+Og6Tjc7/NrRdP9wiJHwCqAhJ2Mgc94NX3oHs1WvASmeY1LI0B29ufDCSyR5p8MGxTWc4JggBEUHWI8jPEKrN+GxvLD/Ezya6t48TG3yN1BApJH8VzniCGf2J1IBoQ5vc8AnjtNYJCyTMhuX0aKOxIphyVEIJC3bz3VeyHfFNIoTJrXriIxhP6LfBF8UbQMhKPbiVpkJbqFqBmOMlgNCnex60fEpqO6DuI82bh
  ubuntu@plana27.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDC65VysYnJj+jrnAlVO9Ibdj56IzYapOaGvYU2pGAyCOZq6UbUkg0nGa+snNTIFBO640lsn+RFGEl8+DtxM6NKQeRW3GZKXf6+CtouEAg18qaU7wTdFu++A5Fp5SjIaTpFko1IDBuGelYfXfXmySIDMa54IorjevKPYHvO7soCxy/Y4lfotp/wna9xzr4QIKG4fQ4dC+wuQANT8gdfg/c7jSIq7sioHfs3Xg7dH/nKIg4KqLuMi7gc36tXLwWHWlpIzXR9WFMzlFSsRD1pIx8cyYv6rHSAj1vEizygMOaErqynioVVIE7UT+Qwp1HoJlShdsLwqFtxDseftRTQWzO3
tasks:
- internal.lock_machines:
  - 3
  - plana,burnupi,mira
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.push_inventory: null
- internal.serialize_remote_roles: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.sudo: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install:
    branch: dumpling
- ceph:
    fs: xfs
- install.upgrade:
    osd.0:
      branch: firefly
- ceph.restart:
    daemons:
    - osd.0
    - osd.1
    - osd.2
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    thrash_primary_affinity: false
    timeout: 1200
- ceph.restart:
    daemons:
    - mon.a
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    branch: dumpling
    clients:
      client.0:
      - cls/test_cls_rbd.sh
- ceph.restart:
    daemons:
    - mon.b
    wait-for-healthy: false
    wait-for-osds-up: true
- radosbench:
    clients:
    - client.0
    time: 1800
- install.upgrade:
    mon.c: null
- ceph.restart:
    daemons:
    - mon.c
    wait-for-healthy: false
    wait-for-osds-up: true
- ceph.wait_for_mon_quorum:
  - a
  - b
  - c
- workunit:
    clients:
      client.0:
      - rbd/test_librbd_python.sh
- rgw:
    client.0: null
    default_idle_timeout: 300
- s3tests:
    client.0:
      rgw_server: client.0
- install.upgrade:
    osd.3:
      branch: firefly
- ceph.restart:
    daemons:
    - osd.3
    - osd.4
    - osd.5
- rados:
    clients:
    - client.0
    objects: 50
    op_weights:
      delete: 50
      read: 100
      rollback: 50
      snap_create: 50
      snap_remove: 50
      write: 100
    ops: 4000
- install.upgrade:
    osd.0: null
- ceph.restart:
    daemons:
    - osd.0
    - osd.1
    - osd.2
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    thrash_primary_affinity: false
    timeout: 1200
- ceph.restart:
    daemons:
    - mon.a
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    clients:
      client.0:
      - rbd/import_export.sh
    env:
      RBD_CREATE_ARGS: --new-format
- ceph.restart:
    daemons:
    - mon.b
    wait-for-healthy: false
    wait-for-osds-up: true
- rados:
    clients:
    - client.0
    objects: 500
    op_weights:
      delete: 10
      read: 45
      write: 45
    ops: 4000
- ceph.restart:
    daemons:
    - mon.c
    wait-for-healthy: false
    wait-for-osds-up: true
- ceph.wait_for_mon_quorum:
  - a
  - b
  - c
- radosbench:
    clients:
    - client.0
    time: 1800
- install.upgrade:
    osd.3: null
- ceph.restart:
    daemons:
    - osd.3
    - osd.4
    - osd.5
- workunit:
    clients:
      client.0:
      - rados/stress_watch.sh
teuthology_branch: master
tube: multi
verbose: true
worker_log: /var/lib/teuthworker/archive/worker_logs/worker.multi.3172
Updated by Yuri Weinstein over 9 years ago
The same problem appears in run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-10_19:00:01-upgrade:dumpling-x-firefly-distro-basic-multi/
suite: upgrade:dumpling-x
Jobs: '537893', '537898', '537899', '537904', '537905', '537910', '537911'
Logs for one example are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-10_19:00:01-upgrade:dumpling-x-firefly-distro-basic-multi/537893/
2014-10-11T02:28:29.353 INFO:tasks.thrashosds.thrasher:in_osds: [2, 3, 5, 1, 4] out_osds: [0] dead_osds: [] live_osds: [1, 0, 3, 2, 5, 4]
2014-10-11T02:28:29.353 INFO:tasks.thrashosds.thrasher:choose_action: min_in 3 min_out 0 min_live 2 min_dead 0
2014-10-11T02:28:29.353 INFO:tasks.thrashosds.thrasher:inject_pause on 3
2014-10-11T02:28:29.353 INFO:tasks.thrashosds.thrasher:Testing filestore_inject_stall pause injection for duration 3
2014-10-11T02:28:29.354 INFO:tasks.thrashosds.thrasher:Checking after 0, should_be_down=False
2014-10-11T02:28:29.354 INFO:teuthology.orchestra.run.mira115:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.3.asok config set filestore_inject_stall 3'
2014-10-11T02:28:34.439 INFO:tasks.thrashosds.thrasher:in_osds: [2, 3, 5, 1, 4] out_osds: [0] dead_osds: [] live_osds: [1, 0, 3, 2, 5, 4]
2014-10-11T02:28:34.439 INFO:tasks.thrashosds.thrasher:choose_action: min_in 3 min_out 0 min_live 2 min_dead 0
Updated by Yuri Weinstein over 9 years ago
- Assignee set to Samuel Just
Sam, can you take a look at this?
Still an issue in a one-off run: http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-13_08:58:30-upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-vps/543574/teuthology.log
Updated by Sage Weil over 9 years ago
- Status changed from New to Duplicate
I think this is a dup of #9757.