Project

General

Profile

Actions

Bug #50510

open

OSD will return -EAGAIN on balance_reads although it can return the data

Added by Or Friedmann about 3 years ago. Updated almost 3 years ago.

Status:
Need More Info
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

PrimaryLogPG.cc:

if (!is_primary()) {
if (!recovery_state.can_serve_replica_read(oid)) {
dout(20) << func
<< ": unstable write on replica, bouncing to primary "
<< *m << dendl;
osd->reply_op_error(op, -EAGAIN);
return;
}
dout(20) << func << ": serving replica read on oid " << oid
<< dendl;
}

changing this (!recovery_state.can_serve_replica_read(oid)) to (false) will allow the osd to return the object's data

osd.2 is the primary

The object is replicated and didn't change, it always returns EAGAIN, changing to false returns valid response.

ms log from version with false in this condition:

2021-04-25T17:40:44.664+0300 7f5b117da640 1 -- 192.168.31.119:0/2540045748 --> [v2:192.168.31.119:6808/103538,v1:192.168.31.119:6810/103538] -- osd_op(unknown.0.0:848 7.11 7:890d6271:::a45ef6be-ae65-4c8b-a097-0ae0fa8dbac9.24121.1_hosts:head [getxattrs,stat,read 0~4194304] snapc 0=[] ondisk+read+balance_reads+known_if_redirected e80) v8 -- 0x7f5bc8019950 con 0x7f5c2c03b190
2021-04-25T17:40:44.664+0300 7f5c3b7fe640 1 -- 192.168.31.119:0/2540045748 <== osd.0 v2:192.168.31.119:6808/103538 354 ==== osd_op_reply(848 a45ef6be-ae65-4c8b-a097-0ae0fa8dbac9.24121.1_hosts [getxattrs out=1250b,stat out=16b,read 0~215 out=215b] v0'0 uv3 ondisk = 0) v8 ==== 278+0+1481 (crc 0 0 0) 0x7f5c34045040 con 0x7f5c2c03b190

ms log from upstream:

2021-04-25T20:57:02.419+0300 7f79a07e8640 1 -- 192.168.31.119:0/1658518928 --> [v2:192.168.31.119:6809/116462,v1:192.168.31.119:6811/116462] -- osd_op(unknown.0.0:307 7.11 7:890d6271:::a45ef6be-ae65-4c8b-a097-0ae0fa8dbac9.24121.1_hosts:head [getxattrs,stat,read 0~4194304] snapc 0=[] ondisk+read+balance_reads+known_if_redirected e85) v8 -- 0x7f7a00041390 con 0x7f7aac03b3e0
2021-04-25T20:57:02.419+0300 7f7abd7fa640 1 -- 192.168.31.119:0/1658518928 <== osd.0 v2:192.168.31.119:6809/116462 114 ==== osd_op_reply(307 a45ef6be-ae65-4c8b-a097-0ae0fa8dbac9.24121.1_hosts [getxattrs,stat,read 0~4194304] v0'0 uv0 ondisk = 11 ((11) Resource temporarily unavailable)) v8 ==== 278+0+0 (crc 0 0 0) 0x7f7ab8045040 con 0x7f7aac03b3e0
2021-04-25T20:57:02.419+0300 7f7abd7fa640 1 -
192.168.31.119:0/1658518928 --> [v2:192.168.31.119:6808/116460,v1:192.168.31.119:6810/116460] -- osd_op(unknown.0.0:308 7.11 7:890d6271:::a45ef6be-ae65-4c8b-a097-0ae0fa8dbac9.24121.1_hosts:head [getxattrs,stat,read 0~4194304] snapc 0=[] RETRY=1 ondisk+retry+read+known_if_redirected e85) v8 -- 0x7f79e0004000 con 0x140e2d0

Code that is sending balance read flag: https://github.com/pleiadesian/ceph/pull/1/files


Files

osd.2.log (4.16 KB) osd.2.log Replica osd log Zulai Wang, 05/02/2021 01:43 PM
osd.1.log (9.36 KB) osd.1.log Primary osd log Zulai Wang, 05/02/2021 01:43 PM
osd.2.log.nocheck (11.7 KB) osd.2.log.nocheck Replica osd log with false in the condition Zulai Wang, 05/02/2021 01:43 PM
Actions

Also available in: Atom PDF