Bug #53298


Teuthology jobs running for more than 12 hours, timeout not working

Added by Aishwarya Mathuria over 2 years ago. Updated over 2 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

Some jobs seem to get stuck in a loop of Traceback errors and run for a very long time (more than 12 hours in this case) without ever timing out. A short sketch of the per-command versus whole-job timeout distinction follows, then the log excerpt.
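
For illustration of the timeout behaviour being reported: each command in the log below is started with a 3600-second per-command timeout, but that alone does not bound the job as a whole. The sketch below is not teuthology code; the names (JOB_DEADLINE, COMMAND_TIMEOUT, run_command) are hypothetical, and it only shows the difference between a per-command timeout and an overall wall-clock deadline, which is one way a looping job could run past 12 hours even though every individual command is bounded.

import subprocess
import time

JOB_DEADLINE = 12 * 60 * 60      # hypothetical overall wall-clock budget for the whole job
COMMAND_TIMEOUT = 3600           # per-command timeout, matching "timeout 3600" in the log

def run_command(args, job_started_at):
    """Run one command, bounded by the smaller of the per-command timeout
    and the time remaining in the overall job budget."""
    remaining = JOB_DEADLINE - (time.monotonic() - job_started_at)
    if remaining <= 0:
        raise TimeoutError("job exceeded its overall deadline")
    # Without the overall check, a sequence of short, repeatedly failing
    # commands can keep a job alive indefinitely even though each command
    # is individually limited to COMMAND_TIMEOUT seconds.
    return subprocess.run(args, timeout=min(COMMAND_TIMEOUT, remaining), check=True)

start = time.monotonic()
run_command(["true"], start)     # trivial example command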

2021-11-04T23:32:43.081 INFO:tasks.workunit:timeout=3h
2021-11-04T23:32:43.081 INFO:tasks.workunit:cleanup=True
INFO:teuthology.orchestra.run:Running command with timeout 3600
2021-11-04T23:41:36.256 INFO:teuthology.orchestra.run:Running command with timeout 3600
2021-11-04T23:41:36.257 DEBUG:teuthology.orchestra.run.smithi039:> sudo rm -rf -- /home/ubuntu/cephtest/mnt.0/client.0/tmp
2021-11-04T23:41:36.289 INFO:tasks.workunit:Stopping ['rbd/test_librbd.sh'] on client.0...
2021-11-04T23:41:36.289 DEBUG:teuthology.orchestra.run.smithi039:> sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0
2021-11-04T23:41:36.518 INFO::Nov 04 23:41:36 smithi039 ceph-mon83812: osdmap e1459: 8 total, 8 up, 8 in
2021-11-04T23:41:36.518 INFO::Nov 04 23:41:36 smithi039 ceph-mon83812: pgmap v3710: 105 pgs: 9 creating+peering, 6 unknown, 90 active+clean; 4.6 MiB data, 1.2 GiB used, 714 GiB / 715 GiB avail; 3.7 KiB/s rd, 5.5 KiB/s wr, 13 op/s
2021-11-04T23:41:36.519 INFO::Nov 04 23:41:36 smithi039 ceph-mon86457: osdmap e1459: 8 total, 8 up, 8 in
2021-11-04T23:41:36.519 INFO::Nov 04 23:41:36 smithi039 ceph-mon86457: pgmap v3710: 105 pgs: 9 creating+peering, 6 unknown, 90 active+clean; 4.6 MiB data, 1.2 GiB used, 714 GiB / 715 GiB avail; 3.7 KiB/s rd, 5.5 KiB/s wr, 13 op/s
2021-11-04T23:41:36.520 DEBUG:teuthology.parallel:result is None
2021-11-04T23:41:36.521 DEBUG:teuthology.orchestra.run.smithi039:> sudo rm -rf -- /home/ubuntu/cephtest/mnt.0/client.0
2021-11-04T23:41:36.546 INFO:tasks.workunit:Deleted dir /home/ubuntu/cephtest/mnt.0/client.0
2021-11-04T23:41:36.546 DEBUG:teuthology.orchestra.run.smithi039:> rmdir -- /home/ubuntu/cephtest/mnt.0
2021-11-04T23:41:36.600 INFO:tasks.workunit:Deleted artificial mount point /home/ubuntu/cephtest/mnt.0/client.0
2021-11-04T23:41:36.601 INFO:teuthology.task.full_sequential:In full_sequential, running task print...
2021-11-04T23:41:36.601 INFO:teuthology.task.print:**** done end test_rbd_api.yaml
2021-11-04T23:41:36.602 DEBUG:teuthology.parallel:result is None
2021-11-04T23:41:36.682 INFO::Nov 04 23:41:36 smithi052 ceph-mon62286: osdmap e1459: 8 total, 8 up, 8 in
2021-11-04T23:41:36.682 INFO::Nov 04 23:41:36 smithi052 ceph-mon62286: pgmap v3710: 105 pgs: 9 creating+peering, 6 unknown, 90 active+clean; 4.6 MiB data, 1.2 GiB used, 714 GiB / 715 GiB avail; 3.7 KiB/s rd, 5.5 KiB/s wr, 13 op/s
2021-11-04T23:41:37.582 INFO::Nov 04 23:41:37 smithi039 ceph-mon83812: osdmap e1460: 8 total, 8 up, 8 in
2021-11-04T23:41:37.582 INFO::Nov 04 23:41:37 smithi039 ceph-mon86457: osdmap e1460: 8 total, 8 up, 8 in
2021-11-04T23:41:37.682 INFO::Nov 04 23:41:37 smithi052 ceph-mon62286: osdmap e1460: 8 total, 8 up, 8 in
2021-11-04T23:41:38.082 INFO::Nov 04 23:41:37 smithi039 conmon79749: 2021-11-04T23:41:37.815+0000 7fd3e27db700 -1 log_channel(cephadm) log [ERR] : Failed to apply alertmanager spec AlertManagerSpec({'placement': PlacementSpec(count=1, hosts=[HostPlacementSpec(hostname='smithi039', network='', name='a')]), 'service_type': 'alertmanager', 'service_id': None, 'unmanaged': False, 'preview_only': False, 'networks': [], 'config': None, 'user_data': {}, 'port': None}): name alertmanager.a already in use
2021-11-04T23:41:38.083 INFO::Nov 04 23:41:37 smithi039 conmon79749: Traceback (most recent call last):
2021-11-04T23:41:38.083 INFO::Nov 04 23:41:37 smithi039 conmon79749: File "/usr/share/ceph/mgr/cephadm/serve.py", line 545, in _apply_all_services
2021-11-04T23:41:38.083 INFO::Nov 04 23:41:37 smithi039 conmon79749: if self._apply_service(spec):
2021-11-04T23:41:38.084 INFO::Nov 04 23:41:37 smithi039 conmon79749: File "/usr/share/ceph/mgr/cephadm/serve.py", line 747, in _apply_service
2021-11-04T23:41:38.084 INFO::Nov 04 23:41:37 smithi039 conmon79749: rank_generation=slot.rank_generation,
2021-11-04T23:41:38.084 INFO::Nov 04 23:41:37 smithi039 conmon79749: File "/usr/share/ceph/mgr/cephadm/module.py", line 636, in get_unique_name
2021-11-04T23:41:38.084 INFO::Nov 04 23:41:37 smithi039 conmon79749: f'name {daemon_type}.{forcename} already in use')
2021-11-04T23:41:38.084 INFO::Nov 04 23:41:37 smithi039 conmon79749: orchestrator._interface.OrchestratorValidationError: name alertmanager.a already in use
2021-11-04T23:41:38.085 INFO::Nov 04 23:41:37 smithi039 conmon79749: 2021-11-04T23:41:37.817+0000 7fd3e27db700 -1 log_channel(cephadm) log [ERR] : Failed to apply grafana spec MonitoringSpec({'placement': PlacementSpec(count=1, hosts=[HostPlacementSpec(hostname='smithi052', network='', name='a')]), 'service_type': 'grafana', 'service_id': None, 'unmanaged': False, 'preview_only': False, 'networks': [], 'config': None, 'port': None}): name grafana.a already in use
2021-11-04T23:41:38.085 INFO::Nov 04 23:41:37 smithi039 conmon79749: Traceback (most recent call last):
2021-11-04T23:41:38.085 INFO::Nov 04 23:41:37 smithi039 conmon79749: File "/usr/share/ceph/mgr/cephadm/serve.py", line 545, in _apply_all_services
2021-11-04T23:41:38.085 INFO::Nov 04 23:41:37 smithi039 conmon79749: if self._apply_service(spec):
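
The cephadm traceback above repeats because the same error is raised again on every pass of the orchestrator's apply loop. The following is a generic, hypothetical sketch of that pattern (catch, log, retry on the next pass), not the actual cephadm code; it only illustrates why a deterministic validation error such as "name alertmanager.a already in use" produces an endless stream of identical tracebacks, and why only a job-level deadline would stop such a run.

import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("reconcile-sketch")

RECONCILE_INTERVAL = 1.0   # seconds between passes (hypothetical)

def apply_one(spec: str) -> None:
    # Stand-in for applying a service spec; always fails the same way,
    # like a persistent "name already in use" validation error.
    raise ValueError(f"name {spec}.a already in use")

def reconcile_forever(specs):
    while True:                      # no overall deadline bounds this loop
        for spec in specs:
            try:
                apply_one(spec)
            except Exception:
                # The error is logged and swallowed; the loop keeps going.
                log.exception("Failed to apply %s spec", spec)
        time.sleep(RECONCILE_INTERVAL)

# reconcile_forever(["alertmanager", "grafana"])  # would log the same traceback on every pass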

#1 Updated by Aishwarya Mathuria over 2 years ago

  • Status changed from New to Closed
