Bug #9787
Status: Closed
% Done: 0%
Description
Looks similar to #9702
2014-10-14T22:02:20.581 INFO:teuthology.orchestra.run.burnupi34:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2014-10-14T22:02:20.745 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN pool data pg_num 34 > pgp_num 24
2014-10-14T22:02:21.745 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 54, in run_tasks
    manager.__enter__()
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/var/lib/teuthworker/src/ceph-qa-suite_giant/tasks/ceph.py", line 1090, in restart
    healthy(ctx=ctx, config=None)
  File "/var/lib/teuthworker/src/ceph-qa-suite_giant/tasks/ceph.py", line 995, in healthy
    remote=mon0_remote,
  File "/home/teuthworker/src/teuthology_master/teuthology/misc.py", line 822, in wait_until_healthy
    while proceed():
  File "/home/teuthworker/src/teuthology_master/teuthology/contextutil.py", line 127, in __call__
    raise MaxWhileTries(error_msg)
MaxWhileTries: 'wait_until_healthy' reached maximum tries (150) after waiting for 900 seconds
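For context, the 900 seconds come from teuthology's polling loop. A minimal sketch of that pattern, assuming a hypothetical run_ceph_health() helper that returns the first line of 'ceph health' output (this is not the real teuthology API):

import time

class MaxWhileTries(Exception):
    pass

def wait_until_healthy(run_ceph_health, tries=150, sleep=6):
    # Poll cluster health until HEALTH_OK; 150 tries x 6 s = the 900 s above.
    for _ in range(tries):
        if run_ceph_health().startswith('HEALTH_OK'):
            return
        time.sleep(sleep)
    raise MaxWhileTries("'wait_until_healthy' reached maximum tries (%d) "
                        "after waiting for %d seconds" % (tries, tries * sleep))

Here the cluster appears stuck in HEALTH_WARN (pool data pg_num 34 > pgp_num 24), so the loop exhausts its tries.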
archive_path: /var/lib/teuthworker/archive/teuthology-2014-10-13_19:30:01-upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-multi/546368
branch: giant
description: upgrade:dumpling-firefly-x:stress-split/{00-cluster/start.yaml 01-dumpling-install/dumpling.yaml 02-partial-upgrade-firefly/firsthalf.yaml 03-thrash/default.yaml 04-mona-upgrade-firefly/mona.yaml 05-workload/readwrite.yaml 06-monb-upgrade-firefly/monb.yaml 07-workload/rbd_api.yaml 08-monc-upgrade-firefly/monc.yaml 09-workload/{rbd-python.yaml rgw-s3tests.yaml} 10-osds-upgrade-firefly/secondhalf.yaml 11-workload/snaps-few-objects.yaml 12-partial-upgrade-x/first.yaml 13-thrash/default.yaml 14-mona-upgrade-x/mona.yaml 15-workload/rbd-import-export.yaml 16-monb-upgrade-x/monb.yaml 17-workload/readwrite.yaml 18-monc-upgrade-x/monc.yaml 19-workload/radosbench.yaml 20-osds-upgrade-x/osds_secondhalf.yaml 21-final-workload/rgw-swift.yaml distros/ubuntu_14.04.yaml}
email: ceph-qa@ceph.com
job_id: '546368'
kernel: &id001
  kdb: true
  sha1: distro
last_in_suite: false
machine_type: plana,burnupi,mira
name: teuthology-2014-10-13_19:30:01-upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-multi
nuke-on-error: true
os_type: ubuntu
os_version: '14.04'
overrides:
  admin_socket:
    branch: giant
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
        mon warn on legacy crush tunables: false
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
    log-whitelist:
    - slow request
    - wrongly marked me down
    - objects unfound and apparently lost
    - log bound mismatch
    - wrongly marked me down
    - objects unfound and apparently lost
    - log bound mismatch
    sha1: 674781960b8856ae684520c3b0e9a6b8c2bc7bec
  ceph-deploy:
    branch:
      dev: giant
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: 674781960b8856ae684520c3b0e9a6b8c2bc7bec
  s3tests:
    branch: giant
  workunit:
    sha1: 674781960b8856ae684520c3b0e9a6b8c2bc7bec
owner: scheduled_teuthology@teuthology
priority: 1000
roles:
- - mon.a
  - mon.b
  - mds.a
  - osd.0
  - osd.1
  - osd.2
  - mon.c
- - osd.3
  - osd.4
  - osd.5
- - client.0
suite: upgrade:dumpling-firefly-x:stress-split
suite_branch: giant
suite_path: /var/lib/teuthworker/src/ceph-qa-suite_giant
targets:
  ubuntu@burnupi34.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDVHu1u8/oxlx4Gs/CzuGsF6R5obvz8zBIZJ2oW6ZlWn3da3ybaWDEY3rmRtCEpmFIXK5UKFRFEqlKcbDVbl3OB53a4SUcgLgH0YcVgab3zy4rp7SDdBXzGJK7aM7hhGiKY73O7pKpFLX8thRxNIzRBR1Rr49Re41WXfb/45fDl2tiGNMX0QgorKUtMCkeKv4C/NhG4g+pk0j2kur4QCUfFGGzcYJNlpGzmyBoe0g8UYtLAPKOBjpUHY4iDwe2hB36ifiW1T9WvJ3f7/axcZpFuFosdMEJJ3mrIOAeko4CpcV7lJVCT3S/Kj9KsyklLt682ni999dQ/RRHDQkqd0Qth
  ubuntu@burnupi58.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDGDu9GmokE8emaK7D1+nDTlqZV1YkO/l2lzTlBu3MWBgSiilStQlYCm7swRwfAiujm1tu9XyeIYfyhFfAGClbl21bKWYjUjp3HDifDRpGO6iOOWBXx8rk7tHiGsJV/A/6+3M7M9MLdHRWD0rxVOk58KxLnE7i+1TcPWZ0SeectH10oO5n8D/f0u8EHsNSnfw9dKMBIzfPAZl+KLi1ULVVd36KXi332ZmzNaaMx+OdKRl7DL2dyu7zPF6lfY4N3T+Ret1Rb0WcD+6yZXs/jvD8tAq+FHnLa5M0rcjGwOXF0qfxTYHrS34fahmNiTr4HE6WQxb4B/FlEKHOpAVfKa3fr
  ubuntu@mira115.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDgh+VAmGnwc68GBShVM5vhbsX68xB34mUrcikqlpsR+AdiRsJ/sTJX1NHgZtR2YYYNUkXEOck5JbCY6H1JYJ7c6t56CdMq6HA46ozL8aLUvpkebV8Ey3gYiDzUk/B7PmRgmW2AUYqFa2jOzkJDop7yMFgM6J2wq/ZbiuvJh1rV2IHQJv1OUjzGQP3cL6GeJVEWNzVZkmEGSn7EtMan7/8SB1KQ7qcRcp7Aol2V6lqB6LdqrQcDOAG3BGi7ssAEwowVrXM0t3f8VuxoEFYJg3rBaxqKyR3bh/uiFdhaOFHKRuOOek5vyHfBEHqiL3+3ogl89329t1W0LJH6gVY5tWAn
tasks:
- internal.lock_machines:
  - 3
  - plana,burnupi,mira
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.push_inventory: null
- internal.serialize_remote_roles: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.sudo: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install:
    branch: dumpling
- ceph:
    fs: xfs
- install.upgrade:
    osd.0:
      branch: firefly
- ceph.restart:
    daemons:
    - osd.0
    - osd.1
    - osd.2
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    thrash_primary_affinity: false
    timeout: 1200
- ceph.restart:
    daemons:
    - mon.a
    wait-for-healthy: false
    wait-for-osds-up: true
- rados:
    clients:
    - client.0
    objects: 500
    op_weights:
      delete: 10
      read: 45
      write: 45
    ops: 4000
- ceph.restart:
    daemons:
    - mon.b
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    clients:
      client.0:
      - rbd/test_librbd.sh
- install.upgrade:
    mon.c: null
- ceph.restart:
    daemons:
    - mon.c
    wait-for-healthy: false
    wait-for-osds-up: true
- ceph.wait_for_mon_quorum:
  - a
  - b
  - c
- workunit:
    clients:
      client.0:
      - rbd/test_librbd_python.sh
- rgw:
    client.0: null
    default_idle_timeout: 300
- s3tests:
    client.0:
      rgw_server: client.0
- install.upgrade:
    osd.3:
      branch: firefly
- ceph.restart:
    daemons:
    - osd.3
    - osd.4
    - osd.5
- rados:
    clients:
    - client.0
    objects: 50
    op_weights:
      delete: 50
      read: 100
      rollback: 50
      snap_create: 50
      snap_remove: 50
      write: 100
    ops: 4000
- install.upgrade:
    osd.0: null
- ceph.restart:
    daemons:
    - osd.0
    - osd.1
    - osd.2
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    thrash_primary_affinity: false
    timeout: 1200
- ceph.restart:
    daemons:
    - mon.a
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    clients:
      client.0:
      - rbd/import_export.sh
    env:
      RBD_CREATE_ARGS: --new-format
- ceph.restart:
    daemons:
    - mon.b
    wait-for-healthy: false
    wait-for-osds-up: true
- rados:
    clients:
    - client.0
    objects: 500
    op_weights:
      delete: 10
      read: 45
      write: 45
    ops: 4000
- ceph.restart:
    daemons:
    - mon.c
    wait-for-healthy: false
    wait-for-osds-up: true
- ceph.wait_for_mon_quorum:
  - a
  - b
  - c
- radosbench:
    clients:
    - client.0
    time: 1800
- install.upgrade:
    osd.3: null
- ceph.restart:
    daemons:
    - osd.3
    - osd.4
    - osd.5
- rgw:
    client.0: null
    default_idle_timeout: 300
- swift:
    client.0:
      rgw_server: client.0
teuthology_branch: master
tube: multi
verbose: true
worker_log: /var/lib/teuthworker/archive/worker_logs/worker.multi.3181
description: upgrade:dumpling-firefly-x:stress-split/{00-cluster/start.yaml 01-dumpling-install/dumpling.yaml 02-partial-upgrade-firefly/firsthalf.yaml 03-thrash/default.yaml 04-mona-upgrade-firefly/mona.yaml 05-workload/readwrite.yaml 06-monb-upgrade-firefly/monb.yaml 07-workload/rbd_api.yaml 08-monc-upgrade-firefly/monc.yaml 09-workload/{rbd-python.yaml rgw-s3tests.yaml} 10-osds-upgrade-firefly/secondhalf.yaml 11-workload/snaps-few-objects.yaml 12-partial-upgrade-x/first.yaml 13-thrash/default.yaml 14-mona-upgrade-x/mona.yaml 15-workload/rbd-import-export.yaml 16-monb-upgrade-x/monb.yaml 17-workload/readwrite.yaml 18-monc-upgrade-x/monc.yaml 19-workload/radosbench.yaml 20-osds-upgrade-x/osds_secondhalf.yaml 21-final-workload/rgw-swift.yaml distros/ubuntu_14.04.yaml}
duration: 2669.694764852524
failure_reason: '''wait_until_healthy'' reached maximum tries (150) after waiting for 900 seconds'
flavor: basic
owner: scheduled_teuthology@teuthology
success: false
Updated by Samuel Just over 9 years ago
I see the following in the log:
2014-10-14T22:02:27.978 ERROR:teuthology.run_tasks:Manager failed: thrashosds
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 117, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/var/lib/teuthworker/src/ceph-qa-suite_giant/tasks/thrashosds.py", line 172, in task
    thrash_proc.do_join()
  File "/var/lib/teuthworker/src/ceph-qa-suite_giant/tasks/ceph_manager.py", line 275, in do_join
    self.thread.get()
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
CommandFailedError: Command failed on burnupi58 with status 1: 'sudo ceph_objectstore_tool --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --log-file=/var/log/ceph/objectstore_tool.\\$pid.log --op list-pgs'
If this is the export/import stuff, we should disable it until it is more stable.
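For reference, a hedged sketch of the call that fails above: the thrasher runs ceph_objectstore_tool against a stopped OSD's data store to enumerate its PGs. Here remote_run is a hypothetical stand-in for teuthology's remote command runner, returning an object with returncode and stdout attributes:

def list_pgs(remote_run, osd_id):
    # Build the same command line seen in the CommandFailedError above.
    data_path = '/var/lib/ceph/osd/ceph-%d' % osd_id
    proc = remote_run([
        'sudo', 'ceph_objectstore_tool',
        '--data-path', data_path,
        '--journal-path', data_path + '/journal',
        '--op', 'list-pgs',
    ])
    if proc.returncode != 0:
        # A non-zero exit status here is what surfaces as CommandFailedError.
        raise RuntimeError('ceph_objectstore_tool failed on osd.%d' % osd_id)
    return proc.stdout.splitlines()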
Updated by Samuel Just over 9 years ago
- Assignee set to David Zafman
- Priority changed from Normal to High
Updated by David Zafman over 9 years ago
- Project changed from Ceph to teuthology
- Assignee changed from David Zafman to Anonymous
The command was uninstalled from the machine while the test was running. Is this part of what happens during an upgrade?
2014-10-14T21:28:56.259 INFO:teuthology.orchestra.run.burnupi58:Running: 'sudo ceph_objectstore_tool --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --log-file=/var/log/ceph/objectstore_tool.\\$pid.log --op list-pgs'
2014-10-14T21:28:56.322 INFO:teuthology.orchestra.run.burnupi58.stderr:sudo: ceph_objectstore_tool: command not found
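One way to confirm that on the node is to check whether the binary is still on PATH and which package owns it (a diagnostic sketch, assuming a Debian/Ubuntu target; the binary name is taken from the log above):

import subprocess

def tool_status(tool='ceph_objectstore_tool'):
    # Report whether the tool is on PATH and, if so, which package owns it.
    try:
        path = subprocess.check_output(['which', tool],
                                       universal_newlines=True).strip()
    except subprocess.CalledProcessError:
        return '%s: not on PATH' % tool
    return subprocess.check_output(['dpkg', '-S', path],
                                   universal_newlines=True).strip()

print(tool_status())

On a node where install.upgrade has swapped package versions, binaries can legitimately appear or disappear mid-run.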
Updated by Anonymous over 9 years ago
- Status changed from New to In Progress
I'm currently working on this. It's just a guess, but I think that this may be related to chef not being automatically installed.
Updated by Anonymous over 9 years ago
This appears to be the same problem as #9627.
Updated by Anonymous over 9 years ago
Note to self: #9627 is also this bug. That one was prompted by a test run trying to fix #9553. The YAML file ~/tests/t9553.yaml reproduces this. I have added a few chef: lines to see whether that fixes things; it takes a while to run.
Updated by David Zafman over 9 years ago
This is not a chef problem: ceph_objectstore_tool didn't exist in dumpling or firefly; it was introduced in giant. I'll mark this as a duplicate and close the other bug.
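The fix direction this implies, as a minimal sketch: gate the thrasher's objectstore-tool testing on the binary actually being present, rather than failing the whole run on pre-giant packages. remote_run and list_pgs are the same hypothetical helpers as in the earlier sketch:

def objectstore_tool_available(remote_run):
    # 'which' exits non-zero when the binary is missing, e.g. on
    # dumpling/firefly packages, which do not ship ceph_objectstore_tool.
    return remote_run(['which', 'ceph_objectstore_tool']).returncode == 0

def maybe_list_pgs(remote_run, osd_id):
    if not objectstore_tool_available(remote_run):
        return None  # skip quietly instead of raising CommandFailedError
    return list_pgs(remote_run, osd_id)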
Updated by David Zafman over 9 years ago
- Status changed from In Progress to Duplicate