Bug #57656
[testing] dbench: write failed on handle 10009 (Resource temporarily unavailable)
Description
When testing with my postgres changes:
https://github.com/ceph/ceph/labels/wip-pdonnell-testing2
I've observed failures with the kernel testing branch. The set of PRs notably introduces periodic snapshots to fs:workload, which seems to expose this behavior:
2022-09-22T14:04:02.650 INFO:tasks.workunit.client.0.smithi124.stdout:[1415095] write failed on handle 10009 (Resource temporarily unavailable)
2022-09-22T14:04:02.650 INFO:tasks.workunit.client.0.smithi124.stdout:Child failed with status 1
I've not gone through all of these events but the failures seem to share: (a) testing branch; (b) snapshots turned on.
Updated by Patrick Donnelly over 1 year ago
/ceph/teuthology-archive/pdonnell-2022-09-26_19:11:10-fs-wip-pdonnell-testing-20220923.171109-distro-default-smithi/7044485/teuthology.log
/ceph/teuthology-archive/pdonnell-2022-09-26_19:11:10-fs-wip-pdonnell-testing-20220923.171109-distro-default-smithi/7044491/teuthology.log
Updated by Xiubo Li over 1 year ago
Another failure is:
2022-09-26T20:03:01.601 INFO:tasks.workunit.client.0.smithi066.stdout:Wrote -1 instead of 4096 bytes.
2022-09-26T20:03:01.601 INFO:tasks.workunit.client.0.smithi066.stdout:Probably out of disk space
2022-09-26T20:03:01.602 INFO:tasks.workunit.client.0.smithi066.stderr:write: Resource temporarily unavailable
2022-09-26T20:03:01.639 DEBUG:teuthology.orchestra.run:got remote process result: 1
2022-09-26T20:03:01.640 INFO:tasks.workunit:Stopping ['suites/ffsb.sh'] on client.0...
This is the kclient, and I didn't see any errors in the kernel logs. I also checked the MDS and OSD logs and found nothing.
This should be a known issue: https://tracker.ceph.com/issues/51410
Patrick,
Could you reproduce this?
Thanks!
Updated by Xiubo Li over 1 year ago
- Related to Bug #51410: kclient: fails to finish reconnect during MDS thrashing (testing branch) added
Updated by Xiubo Li over 1 year ago
- Status changed from In Progress to Need More Info
Today I spent more than half a day reading the MDS and OSD logs, but I still couldn't find anything suspicious. Usually, if we can get some logs from the kernel, it is easy to work out exactly what happened, as in a previous tracker: https://tracker.ceph.com/issues/54461.
I tried but couldn't reproduce it locally. Let's wait for new occurrences and see whether we can get any useful info from them.
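For triaging new runs, a minimal sketch of a scanner that pulls the two failure signatures quoted in this ticket out of a teuthology.log. The function name `find_eagain_failures`, the `Hit` tuple, and the line-prefix regex are illustrative assumptions based only on the excerpts above; they are not part of teuthology.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Failure signatures copied from the log excerpts in this ticket.
SIGNATURES = (
    re.compile(r"write failed on handle \d+ \(Resource temporarily unavailable\)"),
    re.compile(r"write: Resource temporarily unavailable"),
)

# Assumed line shape, inferred from the excerpts: "<timestamp> <LEVEL>:<message>"
LINE_RE = re.compile(r"^(?P<ts>\S+) (?P<level>[A-Z]+):(?P<rest>.*)$")


class Hit(NamedTuple):
    timestamp: str
    message: str


def find_eagain_failures(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield one Hit per log line that matches a known failure signature."""
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m is None:
            continue  # skip lines that do not look like teuthology log lines
        rest = m.group("rest")
        if any(sig.search(rest) for sig in SIGNATURES):
            yield Hit(m.group("ts"), rest)
```

To use it on a downloaded log, pass an open file, for example `with open(path, errors="replace") as f: hits = list(find_eagain_failures(f))`, where `path` is the local copy of the teuthology.log.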
Updated by Venky Shankar about 1 year ago
Updated by Patrick Donnelly 24 days ago
/teuthology/pdonnell-2024-03-24_04:56:01-fs-wip-batrick-testing-20240323.003144-squid-distro-default-smithi/7619048/teuthology.log
Updated by Venky Shankar 13 days ago
This shows up on almost every run. Latest: https://pulpito.ceph.com/yuriw-2024-04-01_20:57:46-fs-wip-yuri3-testing-2024-04-01-0837-squid-distro-default-smithi/7634520/
Nothing in the kernel ring buffer either.