Bug #55236
openqa: fs/snaps test fails with "hit max job timeout"
Status:
Triaged
Priority:
Normal
Assignee:
Category:
Correctness/Safety
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
quincy, pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
qa, task(medium)
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
yaml matrix
Description: fs/thrash/workloads/{begin/{0-install 1-ceph 2-logrotate} clusters/1a5s-mds-1c-client conf/{client mds mon osd} distro/{rhel_8} mount/fuse msgr-failures/osd-mds-delay objectstore-ec/bluestore-comp-ec-root overrides/{frag prefetch_dirfrags/no prefetch_entire_dirfrags/no races session_timeout thrashosds-health whitelist_health whitelist_wrongly_marked_down} ranks/3 tasks/{1-thrash/osd 2-workunit/fs/snaps}}
fs/snaps runs `untar_snap_rm.sh` while a task thrashes OSDs. The test reaches a point where we see the following error:
2022-04-07T18:55:55.943 INFO:tasks.workunit.client.0.smithi131.stdout:'.snap/k/linux-2.6.33/arch/ia64/scripts/check-segrel.lds' -> './k/linux-2.6.33/arch/ia64/scripts/check-segrel.lds'
2022-04-07T18:55:55.944 INFO:tasks.workunit.client.0.smithi131.stdout:'.snap/k/linux-2.6.33/arch/ia64/scripts/check-serialize.S' -> './k/linux-2.6.33/arch/ia64/scripts/check-serialize.S'
2022-04-07T18:55:55.944 INFO:tasks.workunit.client.0.smithi131.stdout:'.snap/k/linux-2.6.33/arch/ia64/scripts/check-text-align.S' -> './k/linux-2.6.33/arch/ia64/scripts/check-text-align.S'
2022-04-07T18:55:55.944 INFO:tasks.workunit.client.0.smithi131.stdout:'.snap/k/linux-2.6.33/arch/ia64/scripts/pvcheck.sed' -> './k/linux-2.6.33/arch/ia64/scripts/pvcheck.sed'
2022-04-07T18:55:55.944 INFO:tasks.workunit.client.0.smithi131.stdout:'.snap/k/linux-2.6.33/arch/ia64/scripts/toolchain-flags' -> './k/linux-2.6.33/arch/ia64/scripts/toolchain-flags'
2022-04-07T18:55:55.945 INFO:tasks.workunit.client.0.smithi131.stdout:'.snap/k/linux-2.6.33/arch/ia64/scripts/unwcheck.py' -> './k/linux-2.6.33/arch/ia64/scripts/unwcheck.py'
2022-04-07T18:55:55.945 INFO:tasks.workunit.client.0.smithi131.stdout:'.snap/k/linux-2.6.33/arch/ia64/sn' -> './k/linux-2.6.33/arch/ia64/sn'
2022-04-07T18:55:55.945 INFO:tasks.workunit.client.0.smithi131.stdout:'.snap/k/linux-2.6.33/arch/ia64/sn/Makefile' -> './k/linux-2.6.33/arch/ia64/sn/Makefile'
2022-04-07T18:55:55.945 INFO:tasks.workunit.client.0.smithi131.stderr:cp: error reading '.snap/k/linux-2.6.33/arch/ia64/sn/Makefile': Connection timed out
2022-04-07T18:55:56.034 INFO:tasks.ceph.osd.1.smithi131.stderr:2022-04-07T18:55:56.032+0000 7f1b106c1700 -1 received signal: Hangup from /usr/bin/python3 /bin/daemon-helper kill ceph-osd -f --cluster ceph -i 1 (PID: 299246) UID: 0
2022-04-07T18:55:56.042 INFO:teuthology.orchestra.run.smithi131.stderr:2022-04-07T18:55:56.040+0000 7f38f759e700 1 Processor -- start
2022-04-07T18:55:56.042 INFO:teuthology.orchestra.run.smithi131.stderr:2022-04-07T18:55:56.040+0000 7f38f759e700 1 -- start start
2022-04-07T18:55:56.045 INFO:teuthology.orchestra.run.smithi131.stderr:2022-04-07T18:55:56.043+0000 7f38f759e700 1 --2- >> v2:172.21.15.173:3300/0 conn(0x7f38f004f3e0 0x7f38f00dee40 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).connect
2022-04-07T18:55:56.046 INFO:teuthology.orchestra.run.smithi131.stderr:2022-04-07T18:55:56.044+0000 7f38f759e700 1 --2- >> [v2:172.21.15.173:3301/0,v1:172.21.15.173:6790/0] conn(0x7f38f00dcd40 0x7f38f00dc210 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).connect
The `cp` fails with "Connection timed out" while reading a file from a snapshot. This might not be related to CephFS as such; the timeout could be coming from the OSDs, which are being thrashed while the workunit runs. That still needs to be checked. If it turns out to be the case, this tracker could be moved to the RADOS component.
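One rough way to start checking the OSD angle is to look in the teuthology log for OSD daemon messages that land close in time to each workunit timeout. The sketch below is only an illustration, not part of the qa suite: the function name `osd_events_near_timeouts`, the 5 second window, and the line regex are assumptions based on the log excerpt above, and the two sample lines are copied from that excerpt.

```python
import re
from datetime import datetime, timedelta

# Assumed line shape, taken from the excerpt above:
#   <timestamp> <LEVEL>:<source>:<message>
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}) (\w+):([^:]+):(.*)$"
)

SAMPLE_LINES = [
    "2022-04-07T18:55:55.945 INFO:tasks.workunit.client.0.smithi131.stderr:"
    "cp: error reading '.snap/k/linux-2.6.33/arch/ia64/sn/Makefile': "
    "Connection timed out",
    "2022-04-07T18:55:56.034 INFO:tasks.ceph.osd.1.smithi131.stderr:"
    "2022-04-07T18:55:56.032+0000 7f1b106c1700 -1 received signal: Hangup "
    "from /usr/bin/python3 /bin/daemon-helper kill ceph-osd -f --cluster "
    "ceph -i 1 (PID: 299246) UID: 0",
]


def parse(line):
    """Return (timestamp, source, message), or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S.%f")
    return ts, m.group(3), m.group(4)


def osd_events_near_timeouts(lines, window_s=5.0):
    """For each workunit 'Connection timed out' line, collect OSD log lines
    whose teuthology timestamp is within window_s seconds of it."""
    parsed = [p for p in (parse(line) for line in lines) if p is not None]
    window = timedelta(seconds=window_s)
    results = []
    for ts, src, msg in parsed:
        if not (src.startswith("tasks.workunit.") and "Connection timed out" in msg):
            continue
        nearby = [
            (o_ts, o_src, o_msg)
            for o_ts, o_src, o_msg in parsed
            if o_src.startswith("tasks.ceph.osd.") and abs(o_ts - ts) <= window
        ]
        results.append((ts, msg, nearby))
    return results


if __name__ == "__main__":
    for ts, msg, nearby in osd_events_near_timeouts(SAMPLE_LINES):
        print(ts.isoformat(), msg)
        for o_ts, o_src, o_msg in nearby:
            print("    ", o_ts.isoformat(), o_src, o_msg)
```

Proximity in the log is only a hint. In the excerpt the Hangup to osd.1 is logged after the `cp` error, so confirming the cause would still need the OSD and client logs for the objects backing the snapshot file.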