Bug #15278
apt-get update failure apparently due to timeout rather than hash mismatch
Status: Closed
Description
2016-03-24T16:30:59.181 INFO:teuthology.task.ansible.out:
TASK: [common | Update apt cache.] ************************************
2016-03-24T16:34:07.970 INFO:teuthology.task.ansible.out:failed: [smithi048.front.sepia.ceph.com] => {"attempts": 24, "failed": true}
msg: Task failed as maximum retries was encountered
2016-03-24T16:34:11.565 INFO:teuthology.task.ansible.out:failed: [smithi035.front.sepia.ceph.com] => {"attempts": 24, "failed": true}
msg: Task failed as maximum retries was encountered
2016-03-24T16:34:11.577 INFO:teuthology.task.ansible.out:
FATAL: all hosts have already failed -- aborting
2016-03-24T16:34:11.578 INFO:teuthology.task.ansible.out:
PLAY RECAP ************************************************************
to retry, use: --limit @/var/lib/teuthworker/cephlab.retry
smithi035.front.sepia.ceph.com : ok=16 changed=0 unreachable=0 failed=1
smithi048.front.sepia.ceph.com : ok=16 changed=0 unreachable=0 failed=1
2016-03-24T16:34:11.709 INFO:teuthology.task.ansible:Archiving ansible failure log at: /var/lib/teuthworker/archive/samuelj-2016-03-24_15:06:08-rados-wip-sam-testing-distro-basic-smithi/84787/ansible_failures.yaml
2016-03-24T16:34:11.712 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 69, in run_tasks
    manager.__enter__()
  File "/home/teuthworker/src/teuthology_master/teuthology/task/__init__.py", line 121, in __enter__
    self.begin()
  File "/home/teuthworker/src/teuthology_master/teuthology/task/ansible.py", line 236, in begin
    self.execute_playbook()
  File "/home/teuthworker/src/teuthology_master/teuthology/task/ansible.py", line 262, in execute_playbook
    self._handle_failure(command, status)
  File "/home/teuthworker/src/teuthology_master/teuthology/task/ansible.py", line 285, in _handle_failure
    raise AnsibleFailedError(failures)
AnsibleFailedError: {'smithi035.front.sepia.ceph.com': {'invocation': {'module_name': 'apt', 'module_args': ''}, 'failed': True, 'attempts': 24, 'msg': 'Task failed as maximum retries was encountered'}, 'smithi048.front.sepia.ceph.com': {'invocation': {'module_name': 'apt', 'module_args': ''}, 'failed': True, 'attempts': 24, 'msg': 'Task failed as maximum retries was encountered'}}
2016-03-24T16:34:11.731 ERROR:teuthology.run_tasks: Sentry event: http://sentry.ceph.com/sepia/teuthology/?q=4e21798f5f104ccca97246d3267b94ce
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 69, in run_tasks
    manager.__enter__()
  File "/home/teuthworker/src/teuthology_master/teuthology/task/__init__.py", line 121, in __enter__
    self.begin()
  File "/home/teuthworker/src/teuthology_master/teuthology/task/ansible.py", line 236, in begin
    self.execute_playbook()
  File "/home/teuthworker/src/teuthology_master/teuthology/task/ansible.py", line 262, in execute_playbook
    self._handle_failure(command, status)
  File "/home/teuthworker/src/teuthology_master/teuthology/task/ansible.py", line 285, in _handle_failure
    raise AnsibleFailedError(failures)
AnsibleFailedError: {'smithi035.front.sepia.ceph.com': {'invocation': {'module_name': 'apt', 'module_args': ''}, 'failed': True, 'attempts': 24, 'msg': 'Task failed as maximum retries was encountered'}, 'smithi048.front.sepia.ceph.com': {'invocation': {'module_name': 'apt', 'module_args': ''}, 'failed': True, 'attempts': 24, 'msg': 'Task failed as maximum retries was encountered'}}
There are several examples of this in this run. For the moment, I am assuming that this is a networking or apt server availability problem?
sjust@teuthology:/a/samuelj-2016-03-24_15:06:08-rados-wip-sam-testing-distro-basic-smithi/84787
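To sort failures like the one above from the hash-mismatch cases, something like the following sketch could be run over the failure dict from the AnsibleFailedError. The helper names are hypothetical and not part of teuthology. It assumes a hash mismatch would surface apt's "Hash Sum mismatch" text in the `msg` field, which is not confirmed here; the retry message in this log does not include the underlying apt error, so "retries-exhausted" only means no mismatch text was reported.

```python
# Hypothetical helpers (not part of teuthology) for triaging apt failures
# reported in an AnsibleFailedError failure dict.

# Assumption: a hash mismatch would show apt's "Hash Sum mismatch" text in msg.
HASH_MISMATCH_MARKER = "hash sum mismatch"
# Substring of the ansible message seen in the log above.
RETRY_MARKER = "maximum retries"


def classify_failure(result):
    """Return a coarse label for one host's failure result."""
    msg = result.get("msg", "").lower()
    if HASH_MISMATCH_MARKER in msg:
        return "hash-mismatch"
    if RETRY_MARKER in msg:
        return "retries-exhausted"
    return "other"


def classify_failures(failures):
    """Map each host name to the label of its failure."""
    return {host: classify_failure(result) for host, result in failures.items()}


# The failure dict from the traceback above.
failures = {
    'smithi035.front.sepia.ceph.com': {
        'invocation': {'module_name': 'apt', 'module_args': ''},
        'failed': True, 'attempts': 24,
        'msg': 'Task failed as maximum retries was encountered'},
    'smithi048.front.sepia.ceph.com': {
        'invocation': {'module_name': 'apt', 'module_args': ''},
        'failed': True, 'attempts': 24,
        'msg': 'Task failed as maximum retries was encountered'},
}

summary = classify_failures(failures)
for host, label in sorted(summary.items()):
    print(host, label)
```

Both hosts here fall into the retries-exhausted bucket, which is consistent with the title but does not by itself prove a timeout.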
Updated by Samuel Just about 8 years ago
I think sjust@teuthology:/a/samuelj-2016-03-24_15:06:08-rados-wip-sam-testing-distro-basic-smithi/84783 might be another instance? It happened around the same time and is also apparently not a hash mismatch.
Updated by David Galloway about 8 years ago
- Category set to Test Node
- Status changed from New to 4
- Assignee set to David Galloway
Is this resolved by https://github.com/ceph/ceph-cm-ansible/pull/221 ?
Updated by David Galloway almost 8 years ago
- Status changed from 4 to Closed
Assuming this is resolved by your PR that blasts and rebuilds the apt cache with each ansible run.
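For anyone reproducing this by hand, here is a rough sketch of what "blast and rebuild" could look like, assuming it means emptying /var/lib/apt/lists and re-running apt-get update. This is a guess at the approach, not the contents of the PR, and the function names are hypothetical. It needs root on a real node; the `runner` parameter exists only so the command can be stubbed out.

```python
import shutil
import subprocess
from pathlib import Path


def clear_apt_lists(lists_dir="/var/lib/apt/lists"):
    """Delete everything under lists_dir but keep the directory itself.

    Returns the sorted names of the removed entries.
    """
    removed = []
    for entry in Path(lists_dir).iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed.append(entry.name)
    return sorted(removed)


def rebuild_apt_cache(lists_dir="/var/lib/apt/lists", runner=subprocess.run):
    """Clear the cached package lists, then fetch them again.

    Assumption: this mirrors the intent described in the comment above,
    not necessarily what the playbook does.
    """
    removed = clear_apt_lists(lists_dir)
    # check=True makes subprocess.run raise CalledProcessError on failure.
    runner(["apt-get", "update"], check=True)
    return removed
```

Calling `rebuild_apt_cache()` with the defaults removes the cached lists and runs `apt-get update` once; any retry policy would still have to live in the caller.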