Bug #55236
openqa: fs/snaps test fails with "hit max job timeout"
Status: Triaged
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version: -
% Done: 0%
Source:
Tags:
Backport: quincy, pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS): qa, task(medium)
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
yaml matrix
Description: fs/thrash/workloads/{begin/{0-install 1-ceph 2-logrotate} clusters/1a5s-mds-1c-client conf/{client mds mon osd} distro/{rhel_8} mount/fuse msgr-failures/osd-mds-delay objectstore-ec/bluestore-comp-ec-root overrides/{frag prefetch_dirfrags/no prefetch_entire_dirfrags/no races session_timeout thrashosds-health whitelist_health whitelist_wrongly_marked_down} ranks/3 tasks/{1-thrash/osd 2-workunit/fs/snaps}}
fs/snaps runs `untar_snap_rm.sh` while a task thrashes OSDs. The test reaches a point where we see the following error:
2022-04-07T18:55:55.943 INFO:tasks.workunit.client.0.smithi131.stdout:'.snap/k/linux-2.6.33/arch/ia64/scripts/check-segrel.lds' -> './k/linux-2.6.33/arch/ia64/scripts/check-segrel.lds'
2022-04-07T18:55:55.944 INFO:tasks.workunit.client.0.smithi131.stdout:'.snap/k/linux-2.6.33/arch/ia64/scripts/check-serialize.S' -> './k/linux-2.6.33/arch/ia64/scripts/check-serialize.S'
2022-04-07T18:55:55.944 INFO:tasks.workunit.client.0.smithi131.stdout:'.snap/k/linux-2.6.33/arch/ia64/scripts/check-text-align.S' -> './k/linux-2.6.33/arch/ia64/scripts/check-text-align.S'
2022-04-07T18:55:55.944 INFO:tasks.workunit.client.0.smithi131.stdout:'.snap/k/linux-2.6.33/arch/ia64/scripts/pvcheck.sed' -> './k/linux-2.6.33/arch/ia64/scripts/pvcheck.sed'
2022-04-07T18:55:55.944 INFO:tasks.workunit.client.0.smithi131.stdout:'.snap/k/linux-2.6.33/arch/ia64/scripts/toolchain-flags' -> './k/linux-2.6.33/arch/ia64/scripts/toolchain-flags'
2022-04-07T18:55:55.945 INFO:tasks.workunit.client.0.smithi131.stdout:'.snap/k/linux-2.6.33/arch/ia64/scripts/unwcheck.py' -> './k/linux-2.6.33/arch/ia64/scripts/unwcheck.py'
2022-04-07T18:55:55.945 INFO:tasks.workunit.client.0.smithi131.stdout:'.snap/k/linux-2.6.33/arch/ia64/sn' -> './k/linux-2.6.33/arch/ia64/sn'
2022-04-07T18:55:55.945 INFO:tasks.workunit.client.0.smithi131.stdout:'.snap/k/linux-2.6.33/arch/ia64/sn/Makefile' -> './k/linux-2.6.33/arch/ia64/sn/Makefile'
2022-04-07T18:55:55.945 INFO:tasks.workunit.client.0.smithi131.stderr:cp: error reading '.snap/k/linux-2.6.33/arch/ia64/sn/Makefile': Connection timed out
2022-04-07T18:55:56.034 INFO:tasks.ceph.osd.1.smithi131.stderr:2022-04-07T18:55:56.032+0000 7f1b106c1700 -1 received signal: Hangup from /usr/bin/python3 /bin/daemon-helper kill ceph-osd -f --cluster ceph -i 1 (PID: 299246) UID: 0
2022-04-07T18:55:56.042 INFO:teuthology.orchestra.run.smithi131.stderr:2022-04-07T18:55:56.040+0000 7f38f759e700 1 Processor -- start
2022-04-07T18:55:56.042 INFO:teuthology.orchestra.run.smithi131.stderr:2022-04-07T18:55:56.040+0000 7f38f759e700 1 -- start start
2022-04-07T18:55:56.045 INFO:teuthology.orchestra.run.smithi131.stderr:2022-04-07T18:55:56.043+0000 7f38f759e700 1 --2- >> v2:172.21.15.173:3300/0 conn(0x7f38f004f3e0 0x7f38f00dee40 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).connect
2022-04-07T18:55:56.046 INFO:teuthology.orchestra.run.smithi131.stderr:2022-04-07T18:55:56.044+0000 7f38f759e700 1 --2- >> [v2:172.21.15.173:3301/0,v1:172.21.15.173:6790/0] conn(0x7f38f00dcd40 0x7f38f00dc210 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).connect
The connection timed out while reading a file from a snapshot. This might not be related to CephFS as such: the timeout could be coming from the OSDs, which are being thrashed during the test. That still needs to be checked. If it turns out to be the case, this tracker could be moved to the RADOS component.
Updated by Venky Shankar about 2 years ago
Another instance: https://pulpito.ceph.com/vshankar-2022-04-09_12:55:41-fs-wip-vshankar-testing-55110-20220408-203242-testing-default-smithi/6783880/
In this case, the job encountered an IO error:
2022-04-09T16:08:12.061 INFO:tasks.workunit.client.0.smithi050.stdout:removed 'k/linux-2.6.33/sound/soc/codecs/wm8903.h'
2022-04-09T16:08:12.062 INFO:tasks.workunit.client.0.smithi050.stdout:removed 'k/linux-2.6.33/sound/soc/codecs/wm8993.h'
2022-04-09T16:08:12.062 INFO:tasks.workunit.client.0.smithi050.stdout:removed 'k/linux-2.6.33/sound/soc/codecs/wm8523.c'
2022-04-09T16:08:12.062 INFO:tasks.workunit.client.0.smithi050.stdout:removed 'k/linux-2.6.33/sound/soc/codecs/wm8580.c'
2022-04-09T16:08:12.062 INFO:tasks.workunit.client.0.smithi050.stdout:removed 'k/linux-2.6.33/sound/soc/codecs/ad1938.h'
2022-04-09T16:08:12.062 INFO:tasks.workunit.client.0.smithi050.stdout:removed 'k/linux-2.6.33/sound/soc/codecs/ac97.h'
2022-04-09T16:08:12.063 INFO:tasks.workunit.client.0.smithi050.stdout:removed 'k/linux-2.6.33/sound/soc/codecs/ad1836.h'
2022-04-09T16:08:12.063 INFO:tasks.workunit.client.0.smithi050.stdout:removed 'k/linux-2.6.33/sound/soc/codecs/wm8960.c'
2022-04-09T16:08:12.064 INFO:tasks.workunit.client.0.smithi050.stderr:rm: cannot remove 'k/linux-2.6.33/sound/soc/codecs/ads117x.h': Input/output error
2022-04-09T16:08:12.064 INFO:tasks.workunit.client.0.smithi050.stderr:rm: cannot remove 'k/linux-2.6.33/sound/soc/codecs/max9877.h': Input/output error
2022-04-09T16:08:12.064 INFO:tasks.workunit.client.0.smithi050.stderr:rm: cannot remove 'k/linux-2.6.33/sound/soc/codecs/wm9713.c': Input/output error
2022-04-09T16:08:12.064 INFO:tasks.workunit.client.0.smithi050.stderr:rm: cannot remove 'k/linux-2.6.33/sound/soc/codecs/stac9766.h': Input/output error
2022-04-09T16:08:12.065 INFO:tasks.workunit.client.0.smithi050.stderr:rm: cannot remove 'k/linux-2.6.33/sound/soc/codecs/wm8728.c': Input/output error
2022-04-09T16:08:12.065 INFO:tasks.workunit.client.0.smithi050.stderr:rm: cannot remove 'k/linux-2.6.33/sound/soc/codecs/ad1938.c': Input/output error
2022-04-09T16:08:12.065 INFO:tasks.workunit.client.0.smithi050.stderr:rm: cannot remove 'k/linux-2.6.33/sound/soc/codecs/wm8903.c': Input/output error
2022-04-09T16:08:12.066 INFO:tasks.workunit.client.0.smithi050.stderr:rm: cannot remove 'k/linux-2.6.33/sound/soc/codecs/wm9081.c': Input/output error
2022-04-09T16:08:12.066 INFO:tasks.workunit.client.0.smithi050.stderr:rm: cannot remove 'k/linux-2.6.33/sound/soc/codecs/wm8510.c': Input/output error
2022-04-09T16:08:12.066 INFO:tasks.workunit.client.0.smithi050.stderr:rm: cannot remove 'k/linux-2.6.33/sound/soc/codecs/wm8350.c': Input/output error
The EIO in this job might be coming from the MDS rather than from the OSDs (which were the suspected source of the timeout in the earlier failed job from the tracker description).
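To compare the two failed jobs, it may help to pull out only the workunit stderr records and see which error shows up first and how often. The following is a minimal sketch, not part of teuthology: the function name `summarize_workunit_errors` and the regular expression are assumptions based only on the record format visible in the excerpts above (`<timestamp> INFO:tasks.workunit.client.N.<host>.stderr:<message>`), and it assumes the log is read with one record per line.

```python
import re
from collections import Counter

# Assumed record shape, taken from the excerpts in this tracker:
#   2022-04-09T16:08:12.064 INFO:tasks.workunit.client.0.smithi050.stderr:rm: ...: Input/output error
RECORD = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}) "
    r"INFO:tasks\.workunit\.(?P<client>client\.\d+)\.(?P<host>[^.]+)\.stderr:"
    r"(?P<msg>.*)"
)

# The two error strings seen in the failed jobs so far.
ERRORS = ("Connection timed out", "Input/output error")


def summarize_workunit_errors(lines):
    """Count known errors in workunit stderr records and note when each first appears.

    Returns (counts, first_seen): a Counter keyed by error string and a dict
    mapping each error string to the timestamp of its first occurrence.
    """
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = RECORD.match(line.strip())
        if match is None:
            continue  # stdout records, daemon logs, or anything else
        message = match.group("msg").rstrip()
        for error in ERRORS:
            if message.endswith(error):
                counts[error] += 1
                first_seen.setdefault(error, match.group("ts"))
    return counts, first_seen
```

With a downloaded job log (the path is a placeholder), something like `summarize_workunit_errors(open("teuthology.log"))` gives the timestamp of the first failure, which can then be lined up against the OSD thrasher and MDS logs from the same moment.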
Updated by Venky Shankar about 2 years ago
- Status changed from New to Triaged
- Assignee set to Kotresh Hiremath Ravishankar