Bug #64504

open

aio ops queued but never executed

Added by Nitzan Mordechai 3 months ago. Updated 3 months ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

A few teuthology tests failed because an aio_write was issued and the subsequent wait_for_complete never returned.
The client log shows the op being sent:

2024-02-13T17:25:17.599+0000 7f67c83ce740 10 client.5066.objecter _op_submit op 0x555f070ec010
2024-02-13T17:25:17.599+0000 7f67c83ce740 20 client.5066.objecter _calc_target epoch 81 base foo @43;nspace precalc_pgid 0 pgid 0.0 is_write
2024-02-13T17:25:17.599+0000 7f67c83ce740 20 client.5066.objecter _calc_target target foo @43;nspace -> pgid 43.1272949
2024-02-13T17:25:17.599+0000 7f67c83ce740 10 client.5066.objecter _calc_target  raw pgid 43.1272949 -> actual 43.9 acting [1,6,7] primary 1
2024-02-13T17:25:17.599+0000 7f67c83ce740  1 --2- 172.21.15.84:0/686509261 >> [v2:172.21.15.84:6800/3364418886,v1:172.21.15.84:6801/3364418886] conn(0x555f071a2640 0x555f071a4a30 unknown :-1 s=NONE pgs=0 cs=0 l=1 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).connect
2024-02-13T17:25:17.599+0000 7f67c83ce740 20 client.5066.objecter _get_session s=0x555f070ece90 osd=1 3
2024-02-13T17:25:17.599+0000 7f67c83ce740 10 client.5066.objecter _op_submit oid foo '@43;nspace' '@43;nspace' [write 0~128 in=128b] tid 3 osd.1
2024-02-13T17:25:17.599+0000 7f67c83ce740 20 client.5066.objecter get_session s=0x555f070ece90 osd=1 3
2024-02-13T17:25:17.599+0000 7f67c83ce740 15 client.5066.objecter _session_op_assign 1 3
2024-02-13T17:25:17.599+0000 7f67c83ce740 15 client.5066.objecter _send_op 3 to 43.9 on osd.1
2024-02-13T17:25:17.599+0000 7f67c83ce740  1 -- 172.21.15.84:0/686509261 --> [v2:172.21.15.84:6800/3364418886,v1:172.21.15.84:6801/3364418886] -- osd_op(unknown.0.0:3 43.9 43:9294e480:nspace::foo:head [write 0~128 in=128b] snapc 0=[] ondisk+write+known_if_redirected+supports_pool_eio e81) v8 -- 0x555f071a4fb0 con 0x555f071a2640

The OSD log shows the op being enqueued:

2024-02-13T17:25:17.599+0000 7fa2f661d640 15 osd.1 85 enqueue_op osd_op(client.5066.0:3 43.9 43.1272949 (undecoded) ondisk+write+known_if_redirected+supports_pool_eio e81) v8 prio 63 type 42 cost 128 latency 0.000016 epoch 81 osd_op(client.5066.0:3 43.9 43.1272949 (undecoded) ondisk+write+known_if_redirected+supports_pool_eio e81) v8
2024-02-13T17:25:17.599+0000 7fa2f661d640 20 osd.1 op_wq(1) _enqueue OpSchedulerItem(43.9 PGOpItem(op=osd_op(client.5066.0:3 43.9 43.1272949 (undecoded) ondisk+write+known_if_redirected+supports_pool_eio e81) v8) class_id 3 prio 63 cost 128 e81)

However, the OSD never dequeues or executes that op. Affected runs:
/a/lflores-2024-01-22_04:12:37-rados-wip-lflores-testing-2024-01-18-2129-distro-default-smithi/7525368
/a/yuriw-2024-02-13_23:41:23-rados-wip-yuri6-testing-2024-02-13-0904-distro-default-smithi/7558996
/a/lflores-2024-02-13_16:18:32-rados-wip-yuri5-testing-2024-02-12-1152-distro-default-smithi/7558384
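
To look for the same pattern in other runs, the OSD log can be scanned for ops that have an enqueue_op line but no later dequeue_op line. The following is only a rough sketch, not part of Ceph or teuthology: the function name find_stuck_ops is hypothetical, and it assumes that a dequeued op is logged on a line containing " dequeue_op " together with the same osd_op(client.X.Y:tid ...) text seen in the enqueue_op line above. Adjust the matching if your log level or version prints it differently.

```python
import re
import sys
from typing import Iterable, List

# Request id inside an osd_op(...) entry, e.g. "client.5066.0:3".
OP_ID = re.compile(r"osd_op\((client\.\d+\.\d+:\d+)")


def find_stuck_ops(lines: Iterable[str]) -> List[str]:
    """Return ids of ops that were enqueued but never dequeued (first-seen order)."""
    pending = {}  # op id -> enqueue line; dict keeps insertion order
    for line in lines:
        match = OP_ID.search(line)
        if match is None:
            continue
        op_id = match.group(1)
        if " dequeue_op " in line:
            # Assumed marker for an op leaving the queue.
            pending.pop(op_id, None)
        elif " enqueue_op " in line:
            pending.setdefault(op_id, line)
    return list(pending)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_stuck_ops.py ceph-osd.1.log
    with open(sys.argv[1], errors="replace") as log:
        for op_id in find_stuck_ops(log):
            print(op_id)
```

Run against an OSD log showing the behaviour described here, this would be expected to list client.5066.0:3, since its enqueue_op line has no matching dequeue. Ops still legitimately in flight at the end of the log are listed as well, so results near the tail need a manual check.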


Related issues 1 (1 open, 0 closed)

Related to RADOS - Bug #58130: LibRadosAio.SimpleWrite hang and pkill (In Progress, Nitzan Mordechai)
