Actions
Bug #51944
closedteuthology.dispatcher.supervisor:Could not save logs
% Done:
0%
Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):
Description
2021-07-28T07:16:53.639 INFO:teuthology.misc:Compressing logs... 2021-07-28T07:16:53.640 INFO:teuthology.orchestra.remote:Trying to reconnect to host 2021-07-28T07:17:28.139 ERROR:teuthology.dispatcher.supervisor:Could not save logs Traceback (most recent call last): File "/home/teuthworker/src/git.ceph.com_git_teuthology_73aa7e3b960c7ffac669297b6aa86606265edd1b/teuthology/dispatcher/supervisor.py", line 231, in run_with_watchdog teuth_config.archive_base, job_config) File "/home/teuthworker/src/git.ceph.com_git_teuthology_73aa7e3b960c7ffac669297b6aa86606265edd1b/teuthology/dispatcher/supervisor.py", line 289, in transfer_archives compress_logs(ctx, log_path) File "/home/teuthworker/src/git.ceph.com_git_teuthology_73aa7e3b960c7ffac669297b6aa86606265edd1b/teuthology/misc.py", line 1378, in compress_logs wait=False, File "/home/teuthworker/src/git.ceph.com_git_teuthology_73aa7e3b960c7ffac669297b6aa86606265edd1b/teuthology/orchestra/cluster.py", line 64, in run return [remote.run(**kwargs) for remote in remotes] File "/home/teuthworker/src/git.ceph.com_git_teuthology_73aa7e3b960c7ffac669297b6aa86606265edd1b/teuthology/orchestra/cluster.py", line 64, in <listcomp> return [remote.run(**kwargs) for remote in remotes] File "/home/teuthworker/src/git.ceph.com_git_teuthology_73aa7e3b960c7ffac669297b6aa86606265edd1b/teuthology/orchestra/remote.py", line 509, in run r = self._runner(client=self.ssh, name=self.shortname, **kwargs) File "/home/teuthworker/src/git.ceph.com_git_teuthology_73aa7e3b960c7ffac669297b6aa86606265edd1b/teuthology/orchestra/run.py", line 434, in run transport = client.get_transport() AttributeError: 'NoneType' object has no attribute 'get_transport'
/a/yuriw-2021-07-27_17:19:39-rados-wip-yuri-testing-2021-07-27-0830-pacific-distro-basic-smithi/6297040$
Files
Updated by Neha Ojha almost 3 years ago
/a/yuriw-2021-07-28_20:56:33-rados-wip-yuri7-testing-2021-07-28-1155-octopus-distro-basic-smithi/6299474
Updated by Josh Durgin over 2 years ago
- Assignee set to Kamoltat (Junior) Sirivadhna
Updated by Deepika Upadhyay over 2 years ago
/ceph/teuthology-archive/yuriw-2021-09-02_14:43:38-rbd-wip-yuri7-testing-2021-09-01-0934-octopus-distro-basic-smithi/6372639/supervisor.6372639.log
Updated by David Galloway over 2 years ago
- File supervisor.6369338.log supervisor.6369338.log added
- Priority changed from Normal to Urgent
I think there's a much larger issue here.
2021-09-09T17:12:08.799 INFO:teuthology.dispatcher.supervisor:Running job 6369338 2021-09-09T17:12:08.799 DEBUG:teuthology.dispatcher.supervisor:Running: /home/teuthworker/src/git.ceph.com_git_teuthology_df69d6db9d68f5492f47c1dee57bca3b0a385796/virtualenv/bin/teuthology -v --owner scheduled_teuthology@teuthology --archive /home/teuthworker/archive/teuthology-2021-09-01_03:30:03-rados-octopus-distro-basic-gibba/6369338 --name teuthology-2021-09-01_03:30:03-rados-octopus-distro-basic-gibba --description rados/cephadm/upgrade/{1-start-distro/1-start-centos_8 2-repo_digest/defaut 3-start-upgrade 4-wait fixed-2} -- /home/teuthworker/archive/teuthology-2021-09-01_03:30:03-rados-octopus-distro-basic-gibba/6369338/orig.config.yaml 2021-09-09T17:12:08.803 INFO:teuthology.dispatcher.supervisor:Job archive: /home/teuthworker/archive/teuthology-2021-09-01_03:30:03-rados-octopus-distro-basic-gibba/6369338 2021-09-09T17:12:08.804 INFO:teuthology.dispatcher.supervisor:Job PID: 20266 2021-09-09T17:12:08.804 INFO:teuthology.dispatcher.supervisor:Running with watchdog 2021-09-10T05:12:53.550 WARNING:teuthology.dispatcher.supervisor:Job ran longer than 43200s. Killing... 2021-09-10T05:12:53.761 INFO:teuthology.kill:Killing Pids: {20266} 2021-09-10T05:12:54.855 INFO:teuthology.kill:Nuking machines: ['gibba001', 'gibba041'] 2021-09-10T05:12:55.711 INFO:teuthology.kill:2021-09-10 05:12:55,711.711 INFO:teuthology.nuke:targets: 2021-09-10T05:12:55.712 INFO:teuthology.kill: gibba001.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQD7Dwz5eZsjA4lRqJ5lAnlTHpnVPHYhTsrPLQy6CKb8Eoc5+gX05oyvpeKzNpGBsXw+R/hlp/Ti0e6teKxrTupL+HIDE6GwanV2J2iwcdWzr9iA2tXuJ4egbfrNpLDJbQnkX59D6Ia6uyEceS2N32CDP4vkwkJ4P64+K20+G3a63fiYWcYGxEGWJi3sWILuxS8rOPLy55GOY6GOntGV3sWluDsxxrT1JrZPG1tKM1JtT7vZT9ofe7N1Jro/FyUdW8vp9coRShM4RHIea/mj0QCmnohhoO+pS2ZnmOJNTAdMudsV/z3Ru8d/n+0sS5VjHwiC/hutTNZMaTkOFYSsIR8j+rSbeX3AICU22wSTFlyVVDaKByOt2qhMLxY9vRJVaxSKyFg0OWniK7igo/TxBtvMvVhn2Vpizb6w9H5Z74XExwxBtQx+TKEJncfGai7nkyR9Yhf0xOz+4489Tx/Ad8XzHKrY7B32KkaIu9e7XrJBW9grjwsWGrvuL5b7OrE5lss= 2021-09-10T05:12:55.713 INFO:teuthology.kill: gibba041.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDEcwXnlx05+pkQzUNij/F8XjjFK2w7q0rbnVzlOXaDW9fUbFHz+uWDlL+vR3LAZrqFLW7XXGazyyacjchn4+v9B8WlLT8pJysCnfukpQYNViL8tPU1+IK+mGG8pIzhKXVSEhsDHk9xvmwYBsq1JONz7Nw3hJrTCbEZD77XtA/JnOK1ksWwP1ZEOjZnQGmq4s7VPlUDb8HlH0dcd7IBXshWsTTDLk52y+CrJd/6VcgIeNDVsXXkzXwBwsKcQSvM20J3x0+ne8ETP6jFVa46eNXZUHcIwOyZ/6g1Sl0hasb6vOeiIwZafVte5H7an2GCrMbsGUkQ+nRizzH6fOZXRym3rlIDxUd8uKwCZpof9hrjkkhdmkb+cfwcdtTuPDEfZCaTKM8cBn7Wi46Enic0mdvR0AmnGGtwP47cCcLU4bPTqh54V3eSaFApNwnyWKY9iN/JSQ164GpbHCs26ixabDyNRUfGkdz44Qq9bOs0ZfFhghZ1YZD4gGTC7NshKsxChiE= 2021-09-10T05:12:55.713 INFO:teuthology.kill:2021-09-10 05:12:55,711.711 INFO:teuthology.task.internal.check_lock:Checking locks... 2021-09-10T05:12:55.715 INFO:teuthology.kill:2021-09-10 05:12:55,715.715 INFO:teuthology.task.internal.check_lock:Checking locks... 2021-09-10T05:12:55.775 INFO:teuthology.kill:2021-09-10 05:12:55,775.775 INFO:teuthology.orchestra.console:Power off gibba041 2021-09-10T05:12:55.788 INFO:teuthology.kill:2021-09-10 05:12:55,787.787 INFO:teuthology.orchestra.console:Power off gibba001 2021-09-10T05:13:08.280 INFO:teuthology.kill:2021-09-10 05:13:08,280.280 INFO:teuthology.orchestra.console:Power off for gibba001 completed 2021-09-10T05:13:08.285 INFO:teuthology.kill:2021-09-10 05:13:08,285.285 INFO:teuthology.orchestra.console:Power off for gibba041 completed 2021-09-10T05:13:08.474 INFO:teuthology.task.internal:roles: ubuntu@gibba001.front.sepia.ceph.com - ['mon.a', 'mon.c', 'mgr.y', 'osd.0', 'osd.1', 'osd.2', 'osd.3', 'client.0', 'node-exporter.a', 'alertmanager.a'] 2021-09-10T05:13:08.475 INFO:teuthology.task.internal:roles: ubuntu@gibba041.front.sepia.ceph.com - ['mon.b', 'mgr.x', 'osd.4', 'osd.5', 'osd.6', 'osd.7', 'client.1', 'prometheus.a', 'grafana.a', 'node-exporter.b'] 2021-09-10T05:13:08.475 INFO:teuthology.misc:Compressing logs... 2021-09-10T05:13:08.476 INFO:teuthology.orchestra.remote:Trying to reconnect to host 2021-09-10T05:14:08.524 ERROR:teuthology.dispatcher.supervisor:Could not save logs Traceback (most recent call last): File "/home/teuthworker/src/git.ceph.com_git_teuthology_df69d6db9d68f5492f47c1dee57bca3b0a385796/teuthology/dispatcher/supervisor.py", line 231, in run_with_watchdog teuth_config.archive_base, job_config) File "/home/teuthworker/src/git.ceph.com_git_teuthology_df69d6db9d68f5492f47c1dee57bca3b0a385796/teuthology/dispatcher/supervisor.py", line 289, in transfer_archives compress_logs(ctx, log_path) File "/home/teuthworker/src/git.ceph.com_git_teuthology_df69d6db9d68f5492f47c1dee57bca3b0a385796/teuthology/misc.py", line 1378, in compress_logs wait=False, File "/home/teuthworker/src/git.ceph.com_git_teuthology_df69d6db9d68f5492f47c1dee57bca3b0a385796/teuthology/orchestra/cluster.py", line 64, in run return [remote.run(**kwargs) for remote in remotes] File "/home/teuthworker/src/git.ceph.com_git_teuthology_df69d6db9d68f5492f47c1dee57bca3b0a385796/teuthology/orchestra/cluster.py", line 64, in <listcomp> return [remote.run(**kwargs) for remote in remotes] File "/home/teuthworker/src/git.ceph.com_git_teuthology_df69d6db9d68f5492f47c1dee57bca3b0a385796/teuthology/orchestra/remote.py", line 509, in run r = self._runner(client=self.ssh, name=self.shortname, **kwargs) File "/home/teuthworker/src/git.ceph.com_git_teuthology_df69d6db9d68f5492f47c1dee57bca3b0a385796/teuthology/orchestra/run.py", line 434, in run transport = client.get_transport() AttributeError: 'NoneType' object has no attribute 'get_transport' 2021-09-10T05:14:08.562 INFO:teuthology.kill:No teuthology processes running 2021-09-10T05:14:09.539 INFO:teuthology.kill:Nuking machines: ['gibba001', 'gibba041'] 2021-09-10T05:14:10.376 INFO:teuthology.kill:2021-09-10 05:14:10,375.375 INFO:teuthology.nuke:targets: 2021-09-10T05:14:10.376 INFO:teuthology.kill: gibba001.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQD7Dwz5eZsjA4lRqJ5lAnlTHpnVPHYhTsrPLQy6CKb8Eoc5+gX05oyvpeKzNpGBsXw+R/hlp/Ti0e6teKxrTupL+HIDE6GwanV2J2iwcdWzr9iA2tXuJ4egbfrNpLDJbQnkX59D6Ia6uyEceS2N32CDP4vkwkJ4P64+K20+G3a63fiYWcYGxEGWJi3sWILuxS8rOPLy55GOY6GOntGV3sWluDsxxrT1JrZPG1tKM1JtT7vZT9ofe7N1Jro/FyUdW8vp9coRShM4RHIea/mj0QCmnohhoO+pS2ZnmOJNTAdMudsV/z3Ru8d/n+0sS5VjHwiC/hutTNZMaTkOFYSsIR8j+rSbeX3AICU22wSTFlyVVDaKByOt2qhMLxY9vRJVaxSKyFg0OWniK7igo/TxBtvMvVhn2Vpizb6w9H5Z74XExwxBtQx+TKEJncfGai7nkyR9Yhf0xOz+4489Tx/Ad8XzHKrY7B32KkaIu9e7XrJBW9grjwsWGrvuL5b7OrE5lss= 2021-09-10T05:14:10.377 INFO:teuthology.kill: gibba041.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDEcwXnlx05+pkQzUNij/F8XjjFK2w7q0rbnVzlOXaDW9fUbFHz+uWDlL+vR3LAZrqFLW7XXGazyyacjchn4+v9B8WlLT8pJysCnfukpQYNViL8tPU1+IK+mGG8pIzhKXVSEhsDHk9xvmwYBsq1JONz7Nw3hJrTCbEZD77XtA/JnOK1ksWwP1ZEOjZnQGmq4s7VPlUDb8HlH0dcd7IBXshWsTTDLk52y+CrJd/6VcgIeNDVsXXkzXwBwsKcQSvM20J3x0+ne8ETP6jFVa46eNXZUHcIwOyZ/6g1Sl0hasb6vOeiIwZafVte5H7an2GCrMbsGUkQ+nRizzH6fOZXRym3rlIDxUd8uKwCZpof9hrjkkhdmkb+cfwcdtTuPDEfZCaTKM8cBn7Wi46Enic0mdvR0AmnGGtwP47cCcLU4bPTqh54V3eSaFApNwnyWKY9iN/JSQ164GpbHCs26ixabDyNRUfGkdz44Qq9bOs0ZfFhghZ1YZD4gGTC7NshKsxChiE= 2021-09-10T05:14:10.396 INFO:teuthology.kill:2021-09-10 05:14:10,396.396 INFO:teuthology.task.internal.check_lock:Checking locks... 2021-09-10T05:14:10.400 INFO:teuthology.kill:2021-09-10 05:14:10,399.399 INFO:teuthology.task.internal.check_lock:Checking locks... 2021-09-10T05:14:10.456 INFO:teuthology.kill:2021-09-10 05:14:10,455.455 INFO:teuthology.orchestra.console:Power off gibba041 2021-09-10T05:14:10.466 INFO:teuthology.kill:2021-09-10 05:14:10,466.466 INFO:teuthology.orchestra.console:Power off gibba001 2021-09-10T05:14:10.602 INFO:teuthology.kill:2021-09-10 05:14:10,602.602 INFO:teuthology.orchestra.console:Power off for gibba041 completed 2021-09-10T05:14:10.611 INFO:teuthology.kill:2021-09-10 05:14:10,611.611 INFO:teuthology.orchestra.console:Power off for gibba001 completed 2021-09-10T05:14:10.741 INFO:teuthology.kill:2021-09-10 05:14:10,740.740 INFO:teuthology.lock.ops:unlocked gibba041.front.sepia.ceph.com 2021-09-10T05:14:10.747 INFO:teuthology.kill:2021-09-10 05:14:10,747.747 INFO:teuthology.lock.ops:unlocked gibba001.front.sepia.ceph.com 2021-09-10T05:16:10.954 ERROR:teuthology.dispatcher.supervisor:Child exited with code -15 2021-09-10T05:16:11.015 INFO:teuthology.dispatcher.supervisor:Nuking machines... 2021-09-10T05:16:11.016 INFO:teuthology.nuke:Checking targets against current locks 2021-09-10T05:16:11.093 INFO:teuthology.nuke:Not nuking gibba041.front.sepia.ceph.com because description doesn't match 2021-09-10T05:16:11.093 INFO:teuthology.nuke:Not nuking gibba001.front.sepia.ceph.com because description doesn't match
- The job starts nuking the machines (powers them off) before trying to collect logs.
- It says it's nuking the machines three times. Notice the last time, the supervisor is trying to nuke the machine after another job already picked it up.
Updated by Zack Cerza over 2 years ago
This commit has good intentions, but since kill_job
calls nuke_targets
, it looks like this will happen any time a job is killed for going overtime.
I'll get a patch together today.
Updated by Zack Cerza over 2 years ago
Updated by Kamoltat (Junior) Sirivadhna over 2 years ago
- Status changed from New to Resolved
- Assignee changed from Kamoltat (Junior) Sirivadhna to Zack Cerza
Actions