Bug #65309

open

qa: dbench.sh failed with "ERROR: handle 10318 was not found"

Added by Rishabh Dave 29 days ago. Updated 23 days ago.

Status: New
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source:
Tags:
Backport: quincy,reef,squid
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS): qa-failure
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Link to the job: https://pulpito.ceph.com/rishabh-2024-03-27_05:27:11-fs-wip-rishabh-testing-20240326.131558-testing-default-smithi/7625621

On the surface, this failure looks similar to https://tracker.ceph.com/issues/57656, but it isn't the same, because the failed job for this ticket doesn't contain the following lines:

2022-09-22T14:04:02.650 INFO:tasks.workunit.client.0.smithi124.stdout:[1415095] write failed on handle 10009 (Resource temporarily unavailable)
2022-09-22T14:04:02.650 INFO:tasks.workunit.client.0.smithi124.stdout:Child failed with status 1

Instead, it contains the following lines:

2024-03-27T07:16:18.287 DEBUG:teuthology.orchestra.run.smithi033:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph osd pool get cephfs_metadata pg_num
2024-03-27T07:16:18.431 INFO:tasks.workunit.client.0.smithi033.stdout:[2060] open ./clients/client0/~dmtmp/PARADOX/__50172.DB failed for handle 10318 (Transport endpoint is not connected)
2024-03-27T07:16:18.431 INFO:tasks.workunit.client.0.smithi033.stdout:(2062) ERROR: handle 10318 was not found
2024-03-27T07:16:18.431 INFO:tasks.workunit.client.0.smithi033.stdout:Child failed with status 1
2024-03-27T07:16:18.435 DEBUG:teuthology.orchestra.run:got remote process result: 1
2024-03-27T07:16:18.435 INFO:tasks.workunit:Stopping ['suites/dbench.sh'] on client.0...
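
To make the distinction between the two failures easier to check on other jobs, here is a rough sketch that classifies teuthology log lines by the signatures quoted above. The function name and the returned labels are hypothetical and not part of teuthology or the qa suite; the patterns are derived only from the log lines in this ticket and may need adjusting for other runs.

```python
import re

# Signature quoted from https://tracker.ceph.com/issues/57656
WRITE_FAILED = re.compile(r"write failed on handle (\d+) \((.+?)\)")
# Signatures quoted from the failed job for this ticket
OPEN_FAILED = re.compile(r"open (\S+) failed for handle (\d+) \((.+?)\)")
HANDLE_NOT_FOUND = re.compile(r"ERROR: handle (\d+) was not found")


def classify_dbench_failure(lines):
    """Return a (hypothetical) label naming the signature the lines match."""
    text = "\n".join(lines)
    if WRITE_FAILED.search(text):
        return "tracker-57656"
    # Handles whose open failed, and handles dbench later could not find.
    open_handles = {m.group(2) for m in OPEN_FAILED.finditer(text)}
    missing_handles = {m.group(1) for m in HANDLE_NOT_FOUND.finditer(text)}
    # This ticket's pattern: the same handle shows up in both messages.
    if open_handles & missing_handles:
        return "tracker-65309"
    return "unknown"
```

Feeding it the lines of a job's teuthology.log (for example `classify_dbench_failure(open(path))`, with `path` pointing at a local copy of the log) should return "tracker-65309" for the job linked above, assuming the log matches the excerpt.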

The following entries were found in /a/rishabh-2024-03-27_05:27:11-fs-wip-rishabh-testing-20240326.131558-testing-default-smithi/7625621/remote/smithi033/log/ceph-client.0.44701.log.gz:

169792:2024-03-27T07:15:16.165+0000 143de640 10 client.4605 _lookup concluded ENOENT locally for 0x1000000021f.head(faked_ino=0 nref=5 ll_ref=187 cap_refs={} open={} mode=40700 size=0/0 nlink=1 btime=2024-03-27T07:13:12.869114+0000 mtime=2024-03-27T07:15:13.929586+0000 ctime=2024-03-27T07:15:13.929586+0000 change_attr=27 caps=pAsLsXsFsx(0=pAsLsXsFsx) COMPLETE parents=0x10000000004.head["PARADOX"] 0x15d9f720) dn 'ERRORCHG.FAM'
169853:2024-03-27T07:15:16.197+0000 13bdd640 10 client.4605 _lookup concluded ENOENT locally for 0x1000000021f.head(faked_ino=0 nref=6 ll_ref=188 cap_refs={} open={} mode=40700 size=0/0 nlink=1 btime=2024-03-27T07:13:12.869114+0000 mtime=2024-03-27T07:15:13.929586+0000 ctime=2024-03-27T07:15:13.929586+0000 change_attr=27 caps=pAsLsXsFsx(0=pAsLsXsFsx) COMPLETE parents=0x10000000004.head["PARADOX"] 0x15d9f720) dn 'ERRORCHG.DB'

There are more entries in the same log for ERRORCHG.DB and ERRORCHG.FAM.
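
For anyone who wants to see how many such entries exist per dentry, a minimal sketch follows. It assumes the client log lines have the same shape as the two entries quoted above; the helper names are hypothetical, and the regular expression is based only on those two lines.

```python
import gzip
import re
from collections import Counter

# Matches the "_lookup concluded ENOENT locally" lines and captures the
# dentry name from the trailing  dn '<name>'  part.
ENOENT_LOOKUP = re.compile(r"_lookup concluded ENOENT locally for .* dn '([^']+)'")


def count_enoent_lookups(lines):
    """Count local-ENOENT lookup entries per dentry name."""
    counts = Counter()
    for line in lines:
        match = ENOENT_LOOKUP.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def count_enoent_lookups_in_gz(path):
    """Same as above, reading a gzip-compressed client log from `path`."""
    with gzip.open(path, "rt", errors="replace") as log:
        return count_enoent_lookups(log)
```

Calling `count_enoent_lookups_in_gz()` on a local copy of ceph-client.0.44701.log.gz would give a per-name tally, which may help tell whether ERRORCHG.DB and ERRORCHG.FAM are the only names affected.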

#1

Updated by Rishabh Dave 29 days ago

  • Description updated (diff)

#2

Updated by Rishabh Dave 29 days ago

  • Labels (FS) qa-failure added

#3

Updated by Venky Shankar 23 days ago

  • Category set to Correctness/Safety
  • Assignee set to Xiubo Li
  • Target version set to v20.0.0
  • Backport set to quincy,reef,squid