Bug #8204
"timed out waiting for admin_socket to appear after osd.5 restart" in upgrade:dumpling-x:stress-split-firefly-distro-basic-vps
Status:
Duplicate
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
I could not reproduce this manually, but after consulting with devel I am still logging it so that we can trace similar race conditions.
2014-04-24T02:27:38.131 DEBUG:teuthology.orchestra.run:Running [10.214.138.165]: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage rados rmpool unique_pool_0 unique_pool_0 --yes-i-really-really-mean-it'
2014-04-24T02:27:39.010 INFO:teuthology.orchestra.run.out:[10.214.138.165]: successfully deleted pool unique_pool_0
2014-04-24T02:27:39.013 DEBUG:teuthology.run_tasks:Unwinding manager ceph.restart
2014-04-24T02:27:39.013 DEBUG:teuthology.run_tasks:Unwinding manager ceph.restart
2014-04-24T02:27:39.013 DEBUG:teuthology.run_tasks:Unwinding manager thrashosds
2014-04-24T02:27:39.013 INFO:teuthology.task.thrashosds:joining thrashosds
2014-04-24T02:27:39.013 ERROR:teuthology.run_tasks:Manager failed: thrashosds
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-firefly/teuthology/run_tasks.py", line 92, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/teuthworker/teuthology-firefly/teuthology/task/thrashosds.py", line 172, in task
    thrash_proc.do_join()
  File "/home/teuthworker/teuthology-firefly/teuthology/task/ceph_manager.py", line 153, in do_join
    self.thread.get()
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
Exception: timed out waiting for admin_socket to appear after osd.5 restart
2014-04-24T02:27:39.113 DEBUG:teuthology.run_tasks:Unwinding manager ceph.restart
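Judging by the traceback, the exception is re-raised from the thrasher greenlet, which restarts the OSD and then polls for the daemon's admin socket, giving up after a fixed timeout. A minimal sketch of that kind of wait loop, assuming the default socket path and an illustrative timeout (not the exact values used by ceph_manager.py):

import os
import time

def wait_for_admin_socket(osd_id, timeout=60, interval=1):
    """Poll until the OSD's admin socket file exists, or give up.

    Approximates the post-restart wait in teuthology's ceph_manager;
    the path and timeout here are assumptions for illustration.
    """
    path = '/var/run/ceph/ceph-osd.%d.asok' % osd_id
    deadline = time.time() + timeout
    while time.time() < deadline:
        if os.path.exists(path):
            return path
        time.sleep(interval)
    raise Exception(
        'timed out waiting for admin_socket to appear '
        'after osd.%d restart' % osd_id)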
archive_path: /var/lib/teuthworker/archive/teuthology-2014-04-23_20:35:02-upgrade:dumpling-x:stress-split-firefly-distro-basic-vps/212185
description: upgrade/dumpling-x/stress-split/{0-cluster/start.yaml 1-dumpling-install/dumpling.yaml 2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/rados_api_tests.yaml 6-next-mon/monb.yaml 7-workload/radosbench.yaml 8-next-mon/monc.yaml 9-workload/{rados_api_tests.yaml rbd-python.yaml rgw-s3tests.yaml snaps-many-objects.yaml} distros/ubuntu_12.04.yaml}
email: null
job_id: '212185'
kernel: &id001
  kdb: true
  sha1: distro
last_in_suite: false
machine_type: vps
name: teuthology-2014-04-23_20:35:02-upgrade:dumpling-x:stress-split-firefly-distro-basic-vps
nuke-on-error: true
os_type: ubuntu
os_version: '12.04'
overrides:
  admin_socket:
    branch: firefly
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
        mon warn on legacy crush tunables: false
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
    log-whitelist:
    - slow request
    - wrongly marked me down
    - objects unfound and apparently lost
    - log bound mismatch
    sha1: 2708c3c559d99e6f3b557ee1d223efa3745f655c
  ceph-deploy:
    branch:
      dev: firefly
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: 2708c3c559d99e6f3b557ee1d223efa3745f655c
  s3tests:
    branch: master
  workunit:
    sha1: 2708c3c559d99e6f3b557ee1d223efa3745f655c
owner: scheduled_teuthology@teuthology
roles:
- - mon.a
  - mon.b
  - mds.a
  - osd.0
  - osd.1
  - osd.2
- - osd.3
  - osd.4
  - osd.5
  - mon.c
- - client.0
targets:
  ubuntu@vpm026.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDPlbj7ZynqKzcinro/0OnwOJ9ZzSlS3lQnuRrF0BN6I9sbQ+cc8XJgfjHayj1dVfbVhrTHSOpI/W5Ye5y67s7StF17+7bvlij4nvum48Gj0j49Kk4o92fB+dEEUXGNJ8DMgt+a38Ir3rofY7QUSYTKW6ral4vMXcx5WDIMN5snVqavVyimBEx8R1WHgwyJvMhngnIQso1QAiGdJGBnc9PpLta5MrMcjSDdUkqV/j+lKQPJOxZ46iNtclwQT7N8ebPNjvrrBfq6NXYwrF0cyAHdXn6vmTWekJanngrvX9vBS5SwxQlavHhYRnXD1IU9lgu6AKgMGDi4VYuze70R0qmN
  ubuntu@vpm081.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC02OsNIQMfPdbMm+6ZoA5GGHe1JtzCavw5MLTDaDZg/Lu+MigtAAF6rF+y0Dexo/1zb4nSBIKuXLSrIqOa2o97I4h6dUDZFLEXVOa099OpzPWH4AQvQHvcWX8uTonTJwgQoNfq7ZiiZsfxiw8bGuUDs4E0JzirW4wdMq3vx/08BP2JEmcUHSy3R2bjSWkrR/UQ/2TwjHBdxuViUdJw2Qb82OhXN4s+japtthmzPkXM01K3Utd73aX4vF0ugysZBp7PNICjwMvUIm3TPyXVtxNLasRA9+KYyWwhAYWF6ARwRhfwdjXb+hlItYQMBnYVa4B7lpYo+/2cIKlTq6JWeME7
  ubuntu@vpm082.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDxC+68142vzg20oxUeTY36SNiNHikRRrYaPPwD/mKkgf5QXdMhDEZXP5tSwpkmAliEwtxpvW22LW81CsJh0peEyKJtTAdgMCcatDYsjZAiMVYFUYIxFmudV+79g4ZSUS0ThRf876cCBivZLg5CZ2drCjaR8bTKWfWrjVsvBeWnBcKEGzSDI9hy1Sl49s8B8zMA5J6jPv52nnCxnQ89kHjHmj1mokhx2ftGgnX2+hdYISLIMjd/ZNAs/WT88ChgLCAmbotNJ1EWGtI9U68VSGqVjFeMtckCwRMhcmEqCSUvXRfvtzKtxKAUDxBVCPrESJoWxgenE5EON0tp8tpU87sl
tasks:
- internal.lock_machines:
  - 3
  - vps
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.sudo: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install:
    branch: dumpling
- ceph:
    fs: xfs
- install.upgrade:
    osd.0: null
- ceph.restart:
    daemons:
    - osd.0
    - osd.1
    - osd.2
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    thrash_primary_affinity: false
    timeout: 1200
- ceph.restart:
    daemons:
    - mon.a
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rados/test-upgrade-firefly.sh
- ceph.restart:
    daemons:
    - mon.b
    wait-for-healthy: false
    wait-for-osds-up: true
- radosbench:
    clients:
    - client.0
    time: 1800
- install.upgrade:
    mon.c: null
- ceph.restart:
    daemons:
    - mon.c
    wait-for-healthy: false
    wait-for-osds-up: true
- ceph.wait_for_mon_quorum:
  - a
  - b
  - c
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rados/test-upgrade-firefly.sh
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rbd/test_librbd_python.sh
- rgw:
    client.0:
      idle_timeout: 300
- swift:
    client.0:
      rgw_server: client.0
- rados:
    clients:
    - client.0
    objects: 500
    op_weights:
      delete: 50
      read: 100
      rollback: 50
      snap_create: 50
      snap_remove: 50
      write: 100
    ops: 4000
teuthology_branch: firefly
verbose: true
worker_log: /var/lib/teuthworker/archive/worker_logs/worker.vps.30590
description: upgrade/dumpling-x/stress-split/{0-cluster/start.yaml 1-dumpling-install/dumpling.yaml 2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/rados_api_tests.yaml 6-next-mon/monb.yaml 7-workload/radosbench.yaml 8-next-mon/monc.yaml 9-workload/{rados_api_tests.yaml rbd-python.yaml rgw-s3tests.yaml snaps-many-objects.yaml} distros/ubuntu_12.04.yaml}
duration: 9434.099356889725
failure_reason: timed out waiting for admin_socket to appear after osd.5 restart
flavor: basic
owner: scheduled_teuthology@teuthology
success: false
Related issues
History
#1 Updated by Samuel Just almost 10 years ago
- Status changed from New to Duplicate
#2 Updated by Yuri Weinstein over 9 years ago
2014-11-21T23:28:44.607 ERROR:teuthology.run_tasks:Manager failed: thrashosds
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 119, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/var/lib/teuthworker/src/ceph-qa-suite_next/tasks/thrashosds.py", line 174, in task
    thrash_proc.do_join()
  File "/var/lib/teuthworker/src/ceph-qa-suite_next/tasks/ceph_manager.py", line 288, in do_join
    self.thread.get()
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
Exception: timed out waiting for admin_socket to appear after osd.10 restart
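When this shows up again, a quick manual check on the affected node is to see whether the daemon ever recreated its admin socket and, if it did, whether it answers. A hypothetical helper for that check, assuming the default /var/run/ceph socket path:

import os
import subprocess

def check_osd_socket(osd_id):
    # Default admin socket location; adjust for non-default run dirs.
    path = '/var/run/ceph/ceph-osd.%d.asok' % osd_id
    if not os.path.exists(path):
        print('%s is missing - the daemon probably never finished starting' % path)
        return
    # Query the daemon directly over its admin socket, bypassing the monitors.
    print(subprocess.check_output(['ceph', '--admin-daemon', path, 'version']))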