Bug #50821
qa: untar_snap_rm failure during mds thrashing
Status: New
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS, kceph
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
2021-05-14T22:51:46.078 INFO:tasks.workunit.client.0.smithi094.stderr:tar: linux-2.6.33/arch/microblaze: Cannot stat: Permission denied
2021-05-14T22:51:46.078 INFO:tasks.workunit.client.0.smithi094.stderr:tar: linux-2.6.33/arch: Cannot stat: Permission denied
2021-05-14T22:51:46.078 INFO:tasks.workunit.client.0.smithi094.stderr:tar: linux-2.6.33: Cannot stat: Permission denied
2021-05-14T22:51:46.078 INFO:tasks.workunit.client.0.smithi094.stderr:tar: Error is not recoverable: exiting now
2021-05-14T22:51:46.079 DEBUG:teuthology.orchestra.run:got remote process result: 2
2021-05-14T22:51:46.080 INFO:tasks.workunit:Stopping ['fs/snaps'] on client.0...
2021-05-14T22:51:46.080 DEBUG:teuthology.orchestra.run.smithi094:> sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0
2021-05-14T22:51:46.264 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/run_tasks.py", line 91, in run_tasks
    manager = run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/run_tasks.py", line 70, in run_one_task
    return task(**kwargs)
  File "/home/teuthworker/src/github.com_batrick_ceph_e78e41c7f45263bfc3d22dafa953b7e485aac84d/qa/tasks/workunit.py", line 147, in task
    cleanup=cleanup)
  File "/home/teuthworker/src/github.com_batrick_ceph_e78e41c7f45263bfc3d22dafa953b7e485aac84d/qa/tasks/workunit.py", line 297, in _spawn_on_all_clients
    timeout=timeout)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/parallel.py", line 84, in __exit__
    for result in self:
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/parallel.py", line 98, in __next__
    resurrect_traceback(result)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/parallel.py", line 30, in resurrect_traceback
    raise exc.exc_info[1]
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/parallel.py", line 23, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/src/github.com_batrick_ceph_e78e41c7f45263bfc3d22dafa953b7e485aac84d/qa/tasks/workunit.py", line 425, in _run_tests
    label="workunit test {workunit}".format(workunit=workunit)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/orchestra/remote.py", line 509, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/orchestra/run.py", line 455, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/orchestra/run.py", line 161, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/orchestra/run.py", line 183, in _raise_for_status
    node=self.hostname, label=self.label
teuthology.exceptions.CommandFailedError: Command failed (workunit test fs/snaps/untar_snap_rm.sh) on smithi094 with status 2: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=e78e41c7f45263bfc3d22dafa953b7e485aac84d TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.0 CEPH_ROOT=/home/ubuntu/cephtest/clone.client.0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/fs/snaps/untar_snap_rm.sh'
From: /ceph/teuthology-archive/pdonnell-2021-05-14_21:45:42-fs-master-distro-basic-smithi/6115751/teuthology.log
This occurred with the stock RHEL kernel. It might be related to some other issues I've recently started seeing with that kernel.
Related issues
History
#1 Updated by Patrick Donnelly over 1 year ago
I don't think this is related to #50281, but it may be.
#2 Updated by Patrick Donnelly over 1 year ago
- Related to Bug #50823: qa: RuntimeError: timeout waiting for cluster to stabilize added
#3 Updated by Patrick Donnelly over 1 year ago
- Related to Bug #50824: qa: snaptest-git-ceph bus error added
#4 Updated by Patrick Donnelly over 1 year ago
- Related to Bug #51278: mds: "FAILED ceph_assert(!segments.empty())" added
#5 Updated by Venky Shankar 10 months ago
Similar failure here: https://pulpito.ceph.com/vshankar-2022-04-11_12:24:06-fs-wip-vshankar-testing1-20220411-144044-testing-default-smithi/6786336/
In this instance, though, we see ESTALE/EIO instead of the "Permission denied" errors in the original report.
2022-04-11T15:56:23.599 INFO:teuthology.orchestra.run.smithi141.stderr:2022-04-11T15:56:23.590+0000 7f3cba9ff700 1 -- 172.21.15.141:0/3624046670 --> [v2:172.21.15.153:6808/205989,v1:172.21.15.153:6809/205989] -- command(tid 11: {"prefix": "get_command_descriptions"}) v1 -- 0x7f3c90018dc0 con 0x7f3c90011730
2022-04-11T15:56:23.599 INFO:teuthology.orchestra.run.smithi141.stderr:2022-04-11T15:56:23.590+0000 7f3cb37fe700 1 --2- 172.21.15.141:0/3624046670 >> [v2:172.21.15.153:6808/205989,v1:172.21.15.153:6809/205989] conn(0x7f3c90011730 0x7f3c90011b60 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=1 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload supported=3 required=0
2022-04-11T15:56:23.628 INFO:tasks.ceph.osd.7.smithi153.stderr:2022-04-11T15:56:23.619+0000 7f22a0340700 -1 received signal: Hangup from /usr/bin/python3 /bin/daemon-helper kill ceph-osd -f --cluster ceph -i 7 (PID: 27672) UID: 0
2022-04-11T15:56:23.644 INFO:tasks.workunit.client.0.smithi141.stdout:'.snap/k' -> './k'
2022-04-11T15:56:23.644 INFO:tasks.workunit.client.0.smithi141.stdout:'.snap/k/linux-2.6.33.tar.bz2' -> './k/linux-2.6.33.tar.bz2'
2022-04-11T15:56:23.645 INFO:tasks.workunit.client.0.smithi141.stderr:cp: error writing './k/linux-2.6.33.tar.bz2': Stale file handle
2022-04-11T15:56:23.645 INFO:teuthology.orchestra.run.smithi141.stderr:umount: /home/ubuntu/cephtest/mnt.0: target is busy.
2022-04-11T15:56:23.646 INFO:tasks.workunit.client.0.smithi141.stderr:cp: cannot stat '.snap/k/linux-2.6.33': Input/output error
2022-04-11T15:56:23.646 INFO:tasks.workunit.client.0.smithi141.stderr:cp: preserving times for './k': Input/output error
2022-04-11T15:56:23.647 INFO:teuthology.orchestra.run.smithi141.stderr:2022-04-11T15:56:23.639+0000 7f3cb37fe700 1 --2- 172.21.15.141:0/3624046670 >> [v2:172.21.15.153:6808/205989,v1:172.21.15.153:6809/205989] conn(0x7f3c90011730 0x7f3c90011b60 crc :-1 s=READY pgs=222 cs=0 l=1 rev1=1 crypto rx=0 tx=0 comp rx=0 tx=0).ready entity=osd.5 client_cookie=0 server_cookie=0 in_seq=0 out_seq=0
2022-04-11T15:56:23.647 DEBUG:teuthology.orchestra.run:got remote process result: 1
2022-04-11T15:56:23.648 INFO:tasks.workunit:Stopping ['fs/snaps'] on client.0...
2022-04-11T15:56:23.648 DEBUG:teuthology.orchestra.run.smithi141:> sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0
2022-04-11T15:56:23.658 DEBUG:teuthology.orchestra.run:got remote process result: 32
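For triage, the workunit stderr lines in a teuthology.log can be grouped by the errno their message implies, which makes it easier to tell a "Permission denied" run from an ESTALE/EIO run. The following is only a rough sketch and not part of teuthology: the function name `classify_workunit_errors` and the message-to-errno table are hypothetical, and the line pattern is an assumption based solely on the log lines quoted in this ticket.

```python
import re
from collections import Counter

# Error strings seen in this ticket, mapped to errno names.
# Hypothetical table for illustration; extend as needed.
ERRNO_BY_MESSAGE = {
    "Permission denied": "EACCES",
    "Stale file handle": "ESTALE",
    "Input/output error": "EIO",
}

# Assumed shape of a workunit stderr line, e.g.
# "... INFO:tasks.workunit.client.0.smithi094.stderr:tar: ...: Permission denied"
STDERR_LINE = re.compile(r"INFO:tasks\.workunit\.(client\.\d+)\.\S+?\.stderr:(.*)$")


def classify_workunit_errors(lines):
    """Count (client, errno name) pairs found in workunit stderr log lines."""
    counts = Counter()
    for line in lines:
        match = STDERR_LINE.search(line)
        if not match:
            continue  # not a workunit stderr line (stdout, daemon logs, ...)
        client, message = match.groups()
        for text, errno_name in ERRNO_BY_MESSAGE.items():
            if message.rstrip().endswith(text):
                counts[(client, errno_name)] += 1
    return counts
```

It could be fed the lines of a downloaded teuthology.log, for example `classify_workunit_errors(open("teuthology.log"))`. Lines that do not match the assumed pattern are skipped.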
#6 Updated by Patrick Donnelly 7 months ago
- Target version deleted (v17.0.0)