Bug #44381 (Closed)
kclient: crash/hang during qa/workunits/fs/snaps/snaptest-capwb.sh
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
2020-02-29T09:35:22.472 INFO:tasks.workunit:Running workunit fs/snaps/snaptest-capwb.sh...
2020-02-29T09:35:22.473 INFO:teuthology.orchestra.run.smithi105:workunit test fs/snaps/snaptest-capwb.sh> mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=1b30588872aa57834eb528ae5a31abd968ddcfed TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.0 CEPH_ROOT=/home/ubuntu/cephtest/clone.client.0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/fs/snaps/snaptest-capwb.sh
2020-02-29T09:35:22.495 INFO:tasks.workunit.client.0.smithi105.stderr:+ set -e
2020-02-29T09:35:22.496 INFO:tasks.workunit.client.0.smithi105.stderr:+ mkdir foo
2020-02-29T09:35:22.501 INFO:tasks.workunit.client.0.smithi105.stderr:+ ceph fs set cephfs allow_new_snaps true --yes-i-really-mean-it
...
2020-02-29T09:35:24.393 INFO:tasks.workunit.client.0.smithi105.stderr:enabled new snapshots
2020-02-29T09:35:52.133 INFO:teuthology.orchestra.run.smithi012:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-02-29T09:35:52.136 INFO:teuthology.orchestra.run.smithi105:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-02-29T09:35:52.140 INFO:teuthology.orchestra.run.smithi167:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-02-29T09:36:22.299 INFO:teuthology.orchestra.run.smithi012:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-02-29T09:36:22.302 INFO:teuthology.orchestra.run.smithi105:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-02-29T09:36:22.307 INFO:teuthology.orchestra.run.smithi167:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-02-29T09:36:52.347 INFO:teuthology.orchestra.run.smithi012:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-02-29T09:36:52.349 INFO:teuthology.orchestra.run.smithi105:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-02-29T09:36:52.353 INFO:teuthology.orchestra.run.smithi167:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-02-29T09:37:22.482 INFO:teuthology.orchestra.run.smithi012:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-02-29T09:37:22.485 INFO:teuthology.orchestra.run.smithi105:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-02-29T09:52:31.974 ERROR:paramiko.transport:Socket exception: No route to host (113)
2020-02-29T09:52:32.002 DEBUG:teuthology.orchestra.run:got remote process result: None
2020-02-29T09:52:32.002 INFO:tasks.workunit:Stopping ['fs/snaps'] on client.0...
2020-02-29T09:52:32.002 INFO:teuthology.orchestra.remote:Trying to reconnect to host
2020-02-29T09:52:32.003 DEBUG:teuthology.orchestra.connection:{'username': 'ubuntu', 'hostname': 'smithi105.front.sepia.ceph.com', 'timeout': 60}
2020-02-29T09:52:32.004 DEBUG:tasks.ceph:Missed logrotate, host unreachable
2020-02-29T09:52:35.078 DEBUG:teuthology.orchestra.remote:[Errno None] Unable to connect to port 22 on 172.21.15.105
2020-02-29T09:52:35.078 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/run_tasks.py", line 86, in run_tasks
    manager = run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/run_tasks.py", line 65, in run_one_task
    return task(**kwargs)
  File "/home/teuthworker/src/github.com_batrick_ceph_wip-pdonnell-testing-20200229.001503/qa/tasks/workunit.py", line 140, in task
    cleanup=cleanup)
  File "/home/teuthworker/src/github.com_batrick_ceph_wip-pdonnell-testing-20200229.001503/qa/tasks/workunit.py", line 290, in _spawn_on_all_clients
    timeout=timeout)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 87, in __exit__
    for result in self:
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 101, in __next__
    resurrect_traceback(result)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 37, in resurrect_traceback
    reraise(*exc_info)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 24, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/src/github.com_batrick_ceph_wip-pdonnell-testing-20200229.001503/qa/tasks/workunit.py", line 426, in _run_tests
    args=args,
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/remote.py", line 198, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 416, in run
    raise ConnectionLostError(command=quote(args), node=name)
ConnectionLostError: SSH connection to smithi105 was lost: 'sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0'
From: /ceph/teuthology-archive/pdonnell-2020-02-29_02:56:38-kcephfs-wip-pdonnell-testing-20200229.001503-distro-basic-smithi/4811017/teuthology.log
See also:
Failure: SSH connection to smithi105 was lost: 'sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0'
5 jobs: ['4811017', '4810943', '4810906', '4811165', '4811128']
suites intersection: ['clusters/1-mds-1-client.yaml', 'conf/{client.yaml', 'k-testing.yaml}', 'kcephfs/cephfs/{begin.yaml', 'kclient/{mount.yaml', 'log-config.yaml', 'mds.yaml', 'mon.yaml', 'ms-die-on-skipped.yaml}}', 'osd-asserts.yaml', 'osd.yaml}', 'overrides/{frag_enable.yaml', 'tasks/kclient_workunit_snaps.yaml}', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}']
suites union: ['clusters/1-mds-1-client.yaml', 'conf/{client.yaml', 'k-testing.yaml}', 'kcephfs/cephfs/{begin.yaml', 'kclient/{mount.yaml', 'log-config.yaml', 'mds.yaml', 'mon.yaml', 'ms-die-on-skipped.yaml}}', 'objectstore-ec/bluestore-bitmap.yaml', 'objectstore-ec/bluestore-comp.yaml', 'objectstore-ec/bluestore-ec-root.yaml', 'objectstore-ec/filestore-xfs.yaml', 'osd-asserts.yaml', 'osd.yaml}', 'overrides/{distro/testing/{flavor/centos_latest.yaml', 'overrides/{distro/testing/{flavor/ubuntu_latest.yaml', 'overrides/{frag_enable.yaml', 'tasks/kclient_workunit_snaps.yaml}', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}']
I think the final error message is misleading: we had not yet reached the point of cleaning up the workunit directory when the connection was lost.
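A minimal sketch of why the reported command can mislead here. Per the traceback, teuthology's orchestra/run.py raises ConnectionLostError(command=quote(args), node=name) for whatever command it was attempting when the loss was noticed; so when the SSH link dies while the workunit itself is hung, the error surfaces only on the next command (the cleanup rm -rf) and names that command instead of the test. The Remote/ConnectionLostError classes below are simplified stand-ins, not teuthology's actual implementation:

```python
class ConnectionLostError(Exception):
    """Simplified stand-in for teuthology's ConnectionLostError."""
    def __init__(self, command, node):
        # Same message shape as the log: the command attached is whichever
        # one happened to be running when the loss was detected.
        super().__init__("SSH connection to %s was lost: %r" % (node, command))
        self.command = command
        self.node = node

class Remote:
    """Hypothetical stand-in for teuthology.orchestra.remote.Remote."""
    def __init__(self, node):
        self.node = node
        self.connected = True

    def run(self, command):
        if not self.connected:
            # The failure only surfaces when the *next* command is attempted.
            raise ConnectionLostError(command=command, node=self.node)
        return "ok"

remote = Remote("smithi105")
remote.connected = False  # host became unreachable during the hung workunit

try:
    # The workunit task has moved on to cleanup by the time run() is called.
    remote.run("sudo rm -rf -- /home/ubuntu/cephtest/clone.client.0")
except ConnectionLostError as e:
    msg = str(e)

print(msg)  # names the cleanup command, not the test that actually hung
```

This matches the log: the 'sudo rm -rf' in the ConnectionLostError is just the cleanup step that happened to be next, while the real failure is the earlier kclient crash/hang that took smithi105 off the network.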