Bug #53298
Status: Closed
Teuthology jobs running for more than 12 hours, timeout not working
Description
Some jobs get stuck in a loop of identical Traceback errors and keep running for a very long time; the job never times out.
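For context, a minimal sketch of how a per-command timeout (like the "Running command with timeout 3600" lines below) is typically enforced in Python. This is illustrative only and assumes nothing about teuthology's internals; the bug here is that jobs keep running despite such a limit being logged.

```python
import subprocess

def run_with_timeout(cmd, timeout_s):
    """Run cmd, raising subprocess.TimeoutExpired if it exceeds timeout_s.

    subprocess.run kills the child process when the timeout expires, so a
    well-behaved caller should never run past its limit.
    """
    return subprocess.run(cmd, timeout=timeout_s, check=True)

# A command that outlives its timeout is killed and the caller is notified:
try:
    run_with_timeout(["sleep", "5"], timeout_s=1)
    timed_out = False
except subprocess.TimeoutExpired:
    timed_out = True
```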
2021-11-04T23:32:43.081 INFO:tasks.workunit:timeout=3h
2021-11-04T23:32:43.081 INFO:tasks.workunit:cleanup=True
INFO:teuthology.orchestra.run:Running command with timeout 3600
2021-11-04T23:41:36.256 INFO:teuthology.orchestra.run:Running command with timeout 3600
2021-11-04T23:41:36.257 DEBUG:teuthology.orchestra.run.smithi039:> sudo rm -rf -- /home/ubuntu/cephtest/mnt.0/client.0/tmp
2021-11-04T23:41:36.289 INFO:tasks.workunit:Stopping ['rbd/test_librbd.sh'] on client.0...
2021-11-04T23:41:36.289 DEBUG:teuthology.orchestra.run.smithi039:> sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0
2021-11-04T23:41:36.518 INFO:journalctl@ceph.mon.a.smithi039.stdout:Nov 04 23:41:36 smithi039 ceph-mon[83812]: osdmap e1459: 8 total, 8 up, 8 in
2021-11-04T23:41:36.518 INFO:journalctl@ceph.mon.a.smithi039.stdout:Nov 04 23:41:36 smithi039 ceph-mon[83812]: pgmap v3710: 105 pgs: 9 creating+peering, 6 unknown, 90 active+clean; 4.6 MiB data, 1.2 GiB used, 714 GiB / 715 GiB avail; 3.7 KiB/s rd, 5.5 KiB/s wr, 13 op/s
2021-11-04T23:41:36.519 INFO:journalctl@ceph.mon.c.smithi039.stdout:Nov 04 23:41:36 smithi039 ceph-mon[86457]: osdmap e1459: 8 total, 8 up, 8 in
2021-11-04T23:41:36.519 INFO:journalctl@ceph.mon.c.smithi039.stdout:Nov 04 23:41:36 smithi039 ceph-mon[86457]: pgmap v3710: 105 pgs: 9 creating+peering, 6 unknown, 90 active+clean; 4.6 MiB data, 1.2 GiB used, 714 GiB / 715 GiB avail; 3.7 KiB/s rd, 5.5 KiB/s wr, 13 op/s
2021-11-04T23:41:36.520 DEBUG:teuthology.parallel:result is None
2021-11-04T23:41:36.521 DEBUG:teuthology.orchestra.run.smithi039:> sudo rm -rf -- /home/ubuntu/cephtest/mnt.0/client.0
2021-11-04T23:41:36.546 INFO:tasks.workunit:Deleted dir /home/ubuntu/cephtest/mnt.0/client.0
2021-11-04T23:41:36.546 DEBUG:teuthology.orchestra.run.smithi039:> rmdir -- /home/ubuntu/cephtest/mnt.0
2021-11-04T23:41:36.600 INFO:tasks.workunit:Deleted artificial mount point /home/ubuntu/cephtest/mnt.0/client.0
2021-11-04T23:41:36.601 INFO:teuthology.task.full_sequential:In full_sequential, running task print...
2021-11-04T23:41:36.601 INFO:teuthology.task.print:**** done end test_rbd_api.yaml
2021-11-04T23:41:36.602 DEBUG:teuthology.parallel:result is None
2021-11-04T23:41:36.682 INFO:journalctl@ceph.mon.b.smithi052.stdout:Nov 04 23:41:36 smithi052 ceph-mon[62286]: osdmap e1459: 8 total, 8 up, 8 in
2021-11-04T23:41:36.682 INFO:journalctl@ceph.mon.b.smithi052.stdout:Nov 04 23:41:36 smithi052 ceph-mon[62286]: pgmap v3710: 105 pgs: 9 creating+peering, 6 unknown, 90 active+clean; 4.6 MiB data, 1.2 GiB used, 714 GiB / 715 GiB avail; 3.7 KiB/s rd, 5.5 KiB/s wr, 13 op/s
2021-11-04T23:41:37.582 INFO:journalctl@ceph.mon.a.smithi039.stdout:Nov 04 23:41:37 smithi039 ceph-mon[83812]: osdmap e1460: 8 total, 8 up, 8 in
2021-11-04T23:41:37.582 INFO:journalctl@ceph.mon.c.smithi039.stdout:Nov 04 23:41:37 smithi039 ceph-mon[86457]: osdmap e1460: 8 total, 8 up, 8 in
2021-11-04T23:41:37.682 INFO:journalctl@ceph.mon.b.smithi052.stdout:Nov 04 23:41:37 smithi052 ceph-mon[62286]: osdmap e1460: 8 total, 8 up, 8 in
2021-11-04T23:41:38.082 INFO:journalctl@ceph.mgr.y.smithi039.stdout:Nov 04 23:41:37 smithi039 conmon[79749]: 2021-11-04T23:41:37.815+0000 7fd3e27db700 -1 log_channel(cephadm) log [ERR] : Failed to apply alertmanager spec AlertManagerSpec({'placement': PlacementSpec(count=1, hosts=[HostPlacementSpec(hostname='smithi039', network='', name='a')]), 'service_type': 'alertmanager', 'service_id': None, 'unmanaged': False, 'preview_only': False, 'networks': [], 'config': None, 'user_data': {}, 'port': None}): name alertmanager.a already in use
2021-11-04T23:41:38.083 INFO:journalctl@ceph.mgr.y.smithi039.stdout:Nov 04 23:41:37 smithi039 conmon[79749]: Traceback (most recent call last):
2021-11-04T23:41:38.083 INFO:journalctl@ceph.mgr.y.smithi039.stdout:Nov 04 23:41:37 smithi039 conmon[79749]: File "/usr/share/ceph/mgr/cephadm/serve.py", line 545, in _apply_all_services
2021-11-04T23:41:38.083 INFO:journalctl@ceph.mgr.y.smithi039.stdout:Nov 04 23:41:37 smithi039 conmon[79749]: if self._apply_service(spec):
2021-11-04T23:41:38.084 INFO:journalctl@ceph.mgr.y.smithi039.stdout:Nov 04 23:41:37 smithi039 conmon[79749]: File "/usr/share/ceph/mgr/cephadm/serve.py", line 747, in _apply_service
2021-11-04T23:41:38.084 INFO:journalctl@ceph.mgr.y.smithi039.stdout:Nov 04 23:41:37 smithi039 conmon[79749]: rank_generation=slot.rank_generation,
2021-11-04T23:41:38.084 INFO:journalctl@ceph.mgr.y.smithi039.stdout:Nov 04 23:41:37 smithi039 conmon[79749]: File "/usr/share/ceph/mgr/cephadm/module.py", line 636, in get_unique_name
2021-11-04T23:41:38.084 INFO:journalctl@ceph.mgr.y.smithi039.stdout:Nov 04 23:41:37 smithi039 conmon[79749]: f'name {daemon_type}.{forcename} already in use')
2021-11-04T23:41:38.084 INFO:journalctl@ceph.mgr.y.smithi039.stdout:Nov 04 23:41:37 smithi039 conmon[79749]: orchestrator._interface.OrchestratorValidationError: name alertmanager.a already in use
2021-11-04T23:41:38.085 INFO:journalctl@ceph.mgr.y.smithi039.stdout:Nov 04 23:41:37 smithi039 conmon[79749]: 2021-11-04T23:41:37.817+0000 7fd3e27db700 -1 log_channel(cephadm) log [ERR] : Failed to apply grafana spec MonitoringSpec({'placement': PlacementSpec(count=1, hosts=[HostPlacementSpec(hostname='smithi052', network='', name='a')]), 'service_type': 'grafana', 'service_id': None, 'unmanaged': False, 'preview_only': False, 'networks': [], 'config': None, 'port': None}): name grafana.a already in use
2021-11-04T23:41:38.085 INFO:journalctl@ceph.mgr.y.smithi039.stdout:Nov 04 23:41:37 smithi039 conmon[79749]: Traceback (most recent call last):
2021-11-04T23:41:38.085 INFO:journalctl@ceph.mgr.y.smithi039.stdout:Nov 04 23:41:37 smithi039 conmon[79749]: File "/usr/share/ceph/mgr/cephadm/serve.py", line 545, in _apply_all_services
2021-11-04T23:41:38.085 INFO:journalctl@ceph.mgr.y.smithi039.stdout:Nov 04 23:41:37 smithi039 conmon[79749]: if self._apply_service(spec):
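The repeating traceback above follows a common pattern: each scheduling pass re-applies a spec whose daemon name is already taken, hits the same validation error, and the loop never converges. A hypothetical sketch of that pattern (names and structure are illustrative, not cephadm's actual code):

```python
class ValidationError(Exception):
    """Stand-in for OrchestratorValidationError (illustrative only)."""

def get_unique_name(existing, daemon_type, forcename):
    # Raise if the forced name collides with an existing daemon,
    # mirroring the "name alertmanager.a already in use" error above.
    name = f"{daemon_type}.{forcename}"
    if name in existing:
        raise ValidationError(f"name {name} already in use")
    return name

existing = {"alertmanager.a"}
errors = []
for _ in range(3):  # every apply pass fails identically; nothing unsticks it
    try:
        get_unique_name(existing, "alertmanager", "a")
    except ValidationError as e:
        errors.append(str(e))
```

Because the error is raised on every pass without changing any state, the apply loop keeps logging the same traceback indefinitely, which matches the stuck behavior described above.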
Updated by Aishwarya Mathuria over 2 years ago
- Status changed from New to Closed