Bug #53288
Failed jobs hanging for 12 hours
% Done: 0%
Regression: No
Severity: 3 - minor
Description
2021-11-14T07:49:49.697 INFO:teuthology.orchestra.run.smithi137.stderr:mount error 110 = Connection timed out
2021-11-14T07:49:49.698 INFO:teuthology.orchestra.run.smithi137.stdout:parsing options: rw,norequire_active_mds,name=1,conf=/etc/ceph/ceph.conf,norbytes
2021-11-14T07:49:49.698 INFO:teuthology.orchestra.run.smithi137.stdout:mount.ceph: options "norequire_active_mds,name=1,norbytes" will pass to kernel.
2021-11-14T07:49:49.701 DEBUG:teuthology.orchestra.run:got remote process result: 32
2021-11-14T07:49:49.701 INFO:tasks.cephfs.kernel_mount:mount command failed
2021-11-14T07:49:49.702 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d4737010a85099043cf081dc05b4069d301b23fb/teuthology/run_tasks.py", line 94, in run_tasks
    manager.__enter__()
  File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_bed1599fa788bf76a9a9c97632799d018a249f4e/qa/tasks/kclient.py", line 112, in task
    kernel_mount.mount(mntopts=client_config.get('mntopts', []))
  File "/home/teuthworker/src/github.com_ceph_ceph-c_bed1599fa788bf76a9a9c97632799d018a249f4e/qa/tasks/cephfs/kernel_mount.py", line 49, in mount
    retval = self._run_mount_cmd(mntopts, check_status)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_bed1599fa788bf76a9a9c97632799d018a249f4e/qa/tasks/cephfs/kernel_mount.py", line 70, in _run_mount_cmd
    stderr=mountcmd_stderr, omit_sudo=False)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d4737010a85099043cf081dc05b4069d301b23fb/teuthology/orchestra/remote.py", line 509, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d4737010a85099043cf081dc05b4069d301b23fb/teuthology/orchestra/run.py", line 455, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d4737010a85099043cf081dc05b4069d301b23fb/teuthology/orchestra/run.py", line 161, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d4737010a85099043cf081dc05b4069d301b23fb/teuthology/orchestra/run.py", line 183, in _raise_for_status
    node=self.hostname, label=self.label
teuthology.exceptions.CommandFailedError: Command failed on smithi137 with status 32: 'sudo nsenter --net=/var/run/netns/ceph-ns--home-ubuntu-cephtest-mnt.1 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage /bin/mount -t ceph :/ /home/ubuntu/cephtest/mnt.1 -v -o norequire_active_mds,name=1,conf=/etc/ceph/ceph.conf,norbytes'
2021-11-14T07:49:49.833 ERROR:teuthology.run_tasks: Sentry event: https://sentry.ceph.com/organizations/ceph/?query=e6eebe4b2f4a4c80bdd9a238b8404bb6
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d4737010a85099043cf081dc05b4069d301b23fb/teuthology/run_tasks.py", line 94, in run_tasks
    manager.__enter__()
  File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_bed1599fa788bf76a9a9c97632799d018a249f4e/qa/tasks/kclient.py", line 112, in task
    kernel_mount.mount(mntopts=client_config.get('mntopts', []))
  File "/home/teuthworker/src/github.com_ceph_ceph-c_bed1599fa788bf76a9a9c97632799d018a249f4e/qa/tasks/cephfs/kernel_mount.py", line 49, in mount
    retval = self._run_mount_cmd(mntopts, check_status)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_bed1599fa788bf76a9a9c97632799d018a249f4e/qa/tasks/cephfs/kernel_mount.py", line 70, in _run_mount_cmd
    stderr=mountcmd_stderr, omit_sudo=False)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d4737010a85099043cf081dc05b4069d301b23fb/teuthology/orchestra/remote.py", line 509, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d4737010a85099043cf081dc05b4069d301b23fb/teuthology/orchestra/run.py", line 455, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d4737010a85099043cf081dc05b4069d301b23fb/teuthology/orchestra/run.py", line 161, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d4737010a85099043cf081dc05b4069d301b23fb/teuthology/orchestra/run.py", line 183, in _raise_for_status
    node=self.hostname, label=self.label
teuthology.exceptions.CommandFailedError: Command failed on smithi137 with status 32: 'sudo nsenter --net=/var/run/netns/ceph-ns--home-ubuntu-cephtest-mnt.1 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage /bin/mount -t ceph :/ /home/ubuntu/cephtest/mnt.1 -v -o norequire_active_mds,name=1,conf=/etc/ceph/ceph.conf,norbytes'
2021-11-14T07:49:49.836 DEBUG:teuthology.run_tasks:Unwinding manager kclient
2021-11-14T07:49:49.849 DEBUG:teuthology.run_tasks:Unwinding manager cephadm
2021-11-14T07:49:49.877 INFO:tasks.cephadm:Teardown begin
Later...
2021-11-14T07:50:19.435 INFO:teuthology.task.internal:Tidying up after the test...
2021-11-14T07:50:19.435 DEBUG:teuthology.orchestra.run.smithi073:> find /home/ubuntu/cephtest -ls ; rmdir -- /home/ubuntu/cephtest
2021-11-14T07:50:19.438 DEBUG:teuthology.orchestra.run.smithi137:> find /home/ubuntu/cephtest -ls ; rmdir -- /home/ubuntu/cephtest
2021-11-14T07:50:19.453 INFO:teuthology.orchestra.run.smithi137.stdout: 262155 4 drwxr-xr-x 3 ubuntu ubuntu 4096 Nov 14 07:50 /home/ubuntu/cephtest
2021-11-14T07:50:19.454 INFO:teuthology.orchestra.run.smithi137.stdout: 397627 4 d--------- 2 ubuntu ubuntu 4096 Nov 14 07:48 /home/ubuntu/cephtest/mnt.1
2021-11-14T07:50:19.455 INFO:teuthology.orchestra.run.smithi137.stderr:find: ‘/home/ubuntu/cephtest/mnt.1’: Permission denied
2021-11-14T07:50:19.455 INFO:teuthology.orchestra.run.smithi137.stderr:rmdir: failed to remove '/home/ubuntu/cephtest': Directory not empty
2021-11-14T19:30:07.385 DEBUG:teuthology.exit:Got signal 15; running 1 handler...
2021-11-14T19:30:07.444 DEBUG:teuthology.task.console_log:Killing console logger for smithi073
2021-11-14T19:30:07.448 DEBUG:teuthology.task.console_log:Killing console logger for smithi137
2021-11-14T19:30:07.449 DEBUG:teuthology.exit:Finished running handlers
This does not seem to be a supervisor issue; the supervisor's watchdog killed the job once it ran past the 43200s limit:
2021-11-14T07:29:13.039 INFO:teuthology.dispatcher.supervisor:Job archive: /home/teuthworker/archive/yuriw-2021-11-13_15:31:06-rados-wip-yuriw-master-11.12.21-distro-basic-smithi/6501803
2021-11-14T07:29:13.040 INFO:teuthology.dispatcher.supervisor:Job PID: 20051
2021-11-14T07:29:13.040 INFO:teuthology.dispatcher.supervisor:Running with watchdog
2021-11-14T19:30:07.168 WARNING:teuthology.dispatcher.supervisor:Job ran longer than 43200s. Killing...
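As a quick sanity check on the timestamps in the logs above, the following sketch computes how long the job ran before the watchdog fired and how long it sat idle after the failed cleanup. The timestamps and the 43200s limit are copied from the logs; the variable names are illustrative only.

```python
from datetime import datetime

# Timestamp layout used by the teuthology log lines quoted above.
FMT = "%Y-%m-%dT%H:%M:%S.%f"

started = datetime.strptime("2021-11-14T07:29:13.040", FMT)       # "Running with watchdog"
last_cleanup = datetime.strptime("2021-11-14T07:50:19.455", FMT)  # rmdir failure
killed = datetime.strptime("2021-11-14T19:30:07.168", FMT)        # "Job ran longer than 43200s"
signalled = datetime.strptime("2021-11-14T19:30:07.385", FMT)     # "Got signal 15"

limit_s = 43200  # 12 hours, taken from the supervisor warning

elapsed_s = (killed - started).total_seconds()
idle_s = (signalled - last_cleanup).total_seconds()

print(f"job ran {elapsed_s:.3f}s against a {limit_s}s limit")
print(f"no log output for {idle_s / 3600:.2f} hours after the failed cleanup")
```

The job was killed roughly 54 seconds past the 12 hour limit, and nearly all of that time (about 11.7 hours) passed after the failed `rmdir`, which is consistent with the supervisor behaving as designed and the hang being in the job itself.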
Updated by Patrick Donnelly over 2 years ago
- Related to Bug #53293: qa: v16.2.4 mds crash caused by centos stream kernel added
Updated by Zack Cerza over 2 years ago
I think we need to have the kclient task reboot all nodes during teardown if a job failure is detected.
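A minimal sketch of that idea, not the actual qa/tasks/kclient.py code: it assumes a generator-based context manager like the one visible in the traceback, and the `mount`, `umount`, and `reboot` callables are hypothetical stand-ins for whatever the task would really use.

```python
from contextlib import contextmanager


@contextmanager
def kclient_like_task(nodes, mount, umount, reboot):
    """Hypothetical mount task that reboots every node if anything failed."""
    failed = False
    try:
        for node in nodes:
            mount(node)      # a mount timeout here surfaces from __enter__()
        yield
    except Exception:
        failed = True        # covers failures in mount and in the with-body
        raise
    finally:
        for node in nodes:
            if failed:
                reboot(node)  # do not trust a possibly stuck kernel mount
            else:
                umount(node)  # normal teardown path
```

Because the `finally` clause runs even when the failure happens before the `yield`, a mount that times out during setup (as in this job) would still trigger the reboot.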
Updated by Aishwarya Mathuria over 2 years ago
- Assignee set to Aishwarya Mathuria
Updated by Aishwarya Mathuria over 2 years ago
- Status changed from New to In Progress