Bug #53298


Teuthology jobs running for more than 12 hours, timeout not working

Added by Aishwarya Mathuria over 2 years ago. Updated over 2 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

Some jobs seem to get stuck in a loop of Traceback errors and run for a very long time (more than 12 hours in this case) without ever timing out. A short sketch of the per-command versus whole-job timeout distinction follows, then the log excerpt.
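
For illustration of the timeout behaviour being reported: each command in the log below is started with a 3600-second per-command timeout, but that alone does not bound the job as a whole. The sketch below is not teuthology code; the names (JOB_DEADLINE, COMMAND_TIMEOUT, run_command) are hypothetical, and it only shows the difference between a per-command timeout and an overall wall-clock deadline, which is one way a looping job could run past 12 hours even though every individual command is bounded.

import subprocess
import time

JOB_DEADLINE = 12 * 60 * 60      # hypothetical overall wall-clock budget for the whole job
COMMAND_TIMEOUT = 3600           # per-command timeout, matching "timeout 3600" in the log

def run_command(args, job_started_at):
    """Run one command, bounded by the smaller of the per-command timeout
    and the time remaining in the overall job budget."""
    remaining = JOB_DEADLINE - (time.monotonic() - job_started_at)
    if remaining <= 0:
        raise TimeoutError("job exceeded its overall deadline")
    # Without the overall check, a sequence of short, repeatedly failing
    # commands can keep a job alive indefinitely even though each command
    # is individually limited to COMMAND_TIMEOUT seconds.
    return subprocess.run(args, timeout=min(COMMAND_TIMEOUT, remaining), check=True)

start = time.monotonic()
run_command(["true"], start)     # trivial example command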

2021-11-04T23:32:43.081 INFO:tasks.workunit:timeout=3h
2021-11-04T23:32:43.081 INFO:tasks.workunit:cleanup=True
INFO:teuthology.orchestra.run:Running command with timeout 3600
2021-11-04T23:41:36.256 INFO:teuthology.orchestra.run:Running command with timeout 3600
2021-11-04T23:41:36.257 DEBUG:teuthology.orchestra.run.smithi039:> sudo rm -rf -- /home/ubuntu/cephtest/mnt.0/client.0/tmp
2021-11-04T23:41:36.289 INFO:tasks.workunit:Stopping ['rbd/test_librbd.sh'] on client.0...
2021-11-04T23:41:36.289 DEBUG:teuthology.orchestra.run.smithi039:> sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0
2021-11-04T23:41:36.518 INFO::Nov 04 23:41:36 smithi039 ceph-mon83812: osdmap e1459: 8 total, 8 up, 8 in
2021-11-04T23:41:36.518 INFO::Nov 04 23:41:36 smithi039 ceph-mon83812: pgmap v3710: 105 pgs: 9 creating+peering, 6 unknown, 90 active+clean; 4.6 MiB data, 1.2 GiB used, 714 GiB / 715 GiB avail; 3.7 KiB/s rd, 5.5 KiB/s wr, 13 op/s
2021-11-04T23:41:36.519 INFO::Nov 04 23:41:36 smithi039 ceph-mon86457: osdmap e1459: 8 total, 8 up, 8 in
2021-11-04T23:41:36.519 INFO::Nov 04 23:41:36 smithi039 ceph-mon86457: pgmap v3710: 105 pgs: 9 creating+peering, 6 unknown, 90 active+clean; 4.6 MiB data, 1.2 GiB used, 714 GiB / 715 GiB avail; 3.7 KiB/s rd, 5.5 KiB/s wr, 13 op/s
2021-11-04T23:41:36.520 DEBUG:teuthology.parallel:result is None
2021-11-04T23:41:36.521 DEBUG:teuthology.orchestra.run.smithi039:> sudo rm -rf -- /home/ubuntu/cephtest/mnt.0/client.0
2021-11-04T23:41:36.546 INFO:tasks.workunit:Deleted dir /home/ubuntu/cephtest/mnt.0/client.0
2021-11-04T23:41:36.546 DEBUG:teuthology.orchestra.run.smithi039:> rmdir -- /home/ubuntu/cephtest/mnt.0
2021-11-04T23:41:36.600 INFO:tasks.workunit:Deleted artificial mount point /home/ubuntu/cephtest/mnt.0/client.0
2021-11-04T23:41:36.601 INFO:teuthology.task.full_sequential:In full_sequential, running task print...
2021-11-04T23:41:36.601 INFO:teuthology.task.print:**** done end test_rbd_api.yaml
2021-11-04T23:41:36.602 DEBUG:teuthology.parallel:result is None
2021-11-04T23:41:36.682 INFO::Nov 04 23:41:36 smithi052 ceph-mon62286: osdmap e1459: 8 total, 8 up, 8 in
2021-11-04T23:41:36.682 INFO::Nov 04 23:41:36 smithi052 ceph-mon62286: pgmap v3710: 105 pgs: 9 creating+peering, 6 unknown, 90 active+clean; 4.6 MiB data, 1.2 GiB used, 714 GiB / 715 GiB avail; 3.7 KiB/s rd, 5.5 KiB/s wr, 13 op/s
2021-11-04T23:41:37.582 INFO::Nov 04 23:41:37 smithi039 ceph-mon83812: osdmap e1460: 8 total, 8 up, 8 in
2021-11-04T23:41:37.582 INFO::Nov 04 23:41:37 smithi039 ceph-mon86457: osdmap e1460: 8 total, 8 up, 8 in
2021-11-04T23:41:37.682 INFO::Nov 04 23:41:37 smithi052 ceph-mon62286: osdmap e1460: 8 total, 8 up, 8 in
2021-11-04T23:41:38.082 INFO::Nov 04 23:41:37 smithi039 conmon79749: 2021-11-04T23:41:37.815+0000 7fd3e27db700 -1 log_channel(cephadm) log [ERR] : Failed to apply alertmanager spec AlertManagerSpec({'placement': PlacementSpec(count=1, hosts=[HostPlacementSpec(hostname='smithi039', network='', name='a')]), 'service_type': 'alertmanager', 'service_id': None, 'unmanaged': False, 'preview_only': False, 'networks': [], 'config': None, 'user_data': {}, 'port': None}): name alertmanager.a already in use
2021-11-04T23:41:38.083 INFO::Nov 04 23:41:37 smithi039 conmon79749: Traceback (most recent call last):
2021-11-04T23:41:38.083 INFO::Nov 04 23:41:37 smithi039 conmon79749: File "/usr/share/ceph/mgr/cephadm/serve.py", line 545, in _apply_all_services
2021-11-04T23:41:38.083 INFO::Nov 04 23:41:37 smithi039 conmon79749: if self._apply_service(spec):
2021-11-04T23:41:38.084 INFO::Nov 04 23:41:37 smithi039 conmon79749: File "/usr/share/ceph/mgr/cephadm/serve.py", line 747, in _apply_service
2021-11-04T23:41:38.084 INFO::Nov 04 23:41:37 smithi039 conmon79749: rank_generation=slot.rank_generation,
2021-11-04T23:41:38.084 INFO::Nov 04 23:41:37 smithi039 conmon79749: File "/usr/share/ceph/mgr/cephadm/module.py", line 636, in get_unique_name
2021-11-04T23:41:38.084 INFO::Nov 04 23:41:37 smithi039 conmon79749: f'name {daemon_type}.{forcename} already in use')
2021-11-04T23:41:38.084 INFO::Nov 04 23:41:37 smithi039 conmon79749: orchestrator._interface.OrchestratorValidationError: name alertmanager.a already in use
2021-11-04T23:41:38.085 INFO::Nov 04 23:41:37 smithi039 conmon79749: 2021-11-04T23:41:37.817+0000 7fd3e27db700 -1 log_channel(cephadm) log [ERR] : Failed to apply grafana spec MonitoringSpec({'placement': PlacementSpec(count=1, hosts=[HostPlacementSpec(hostname='smithi052', network='', name='a')]), 'service_type': 'grafana', 'service_id': None, 'unmanaged': False, 'preview_only': False, 'networks': [], 'config': None, 'port': None}): name grafana.a already in use
2021-11-04T23:41:38.085 INFO::Nov 04 23:41:37 smithi039 conmon79749: Traceback (most recent call last):
2021-11-04T23:41:38.085 INFO::Nov 04 23:41:37 smithi039 conmon79749: File "/usr/share/ceph/mgr/cephadm/serve.py", line 545, in _apply_all_services
2021-11-04T23:41:38.085 INFO::Nov 04 23:41:37 smithi039 conmon79749: if self._apply_service(spec):
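
The cephadm traceback above repeats because the same error is raised again on every pass of the orchestrator's apply loop. The following is a generic, hypothetical sketch of that pattern (catch, log, retry on the next pass), not the actual cephadm code; it only illustrates why a deterministic validation error such as "name alertmanager.a already in use" produces an endless stream of identical tracebacks, and why only a job-level deadline would stop such a run.

import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("reconcile-sketch")

RECONCILE_INTERVAL = 1.0   # seconds between passes (hypothetical)

def apply_one(spec: str) -> None:
    # Stand-in for applying a service spec; always fails the same way,
    # like a persistent "name already in use" validation error.
    raise ValueError(f"name {spec}.a already in use")

def reconcile_forever(specs):
    while True:                      # no overall deadline bounds this loop
        for spec in specs:
            try:
                apply_one(spec)
            except Exception:
                # The error is logged and swallowed; the loop keeps going.
                log.exception("Failed to apply %s spec", spec)
        time.sleep(RECONCILE_INTERVAL)

# reconcile_forever(["alertmanager", "grafana"])  # would log the same traceback on every pass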

#1 Updated by Aishwarya Mathuria over 2 years ago

  • Status changed from New to Closed
