Bug #62938

closed

RGW s3website API prefetches data for range requests

Added by Ondrej Kukla 7 months ago. Updated 6 months ago.

Status: Resolved
Priority: Normal
Assignee:
Target version:
% Done: 100%
Source: Community (dev)
Tags: website backport_processed
Backport: pacific quincy reef
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: rgw
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Similar to Bug #44508. Reproducible only with the s3website API, not with the s3 API.

You can reproduce it by running this wrk command:

wrk -t56 -c500 -d5m http://${rgwipaddress}:8080/${bucket}/videos/ -s wrk-range-small.lua

It sends range requests to 7 mp4 files.

wrk script

-- Initialize the pseudo random number generator
math.randomseed( os.time())
math.random(); math.random(); math.random()

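-- Index of the next file to request (1.mp4 .. 7.mp4)
i = 1

function request()
   -- Wrap around after the 7th file
   if i == 8 then
       i = 1
   end

   -- math.random() with no arguments returns a float in [0, 1)
   local nrangefrom = math.random()
   -- math.random(100) returns an integer in [1, 100]
   local nrangeto = math.random(100)
   local path = wrk.path
   local url = path..i..".mp4"
   wrk.headers["Range"] = nrangefrom.."-"..nrangeto
   i = i+1
   return wrk.format(nil, url)
end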

During testing, the s3website RGW was reading from the OSDs at a rate of 3Gb/s, compared to ~22Mb/s on the s3 RGW. In both cases the bandwidth towards the client was ~20Mb/s.

In the RGW log I found this entry:

2023-09-20T12:52:06.670+0000 7f216d702700 1 -- xxx.xxx.58.15:0/758879303 --> [v2:xxx.xxx.58.2:6816/8556,v1:xxx.xxx.58.2:6817/8556] -- osd_op(unknown.0.0:238 18.651 18:8a75a7b2:::39078a70-7768-48c8-96a5-1e13ced83b5b.58017020.1_videos%2f7.mp4:head [getxattrs,stat,read 0~4194304] snapc 0=[] ondisk+read+known_if_redirected+supports_pool_eio e60419) v8 -- 0x7f21dc00a420 con 0x7f21dc007820

You can find the OSD parts of the log here - https://pastebin.com/nGQw4ugd


Related issues 3 (0 open, 3 closed)

Copied to rgw - Backport #63049: pacific: RGW s3website API prefetches data for range requests (Resolved, Casey Bodley)
Copied to rgw - Backport #63050: quincy: RGW s3website API prefetches data for range requests (Resolved, Casey Bodley)
Copied to rgw - Backport #63051: reef: RGW s3website API prefetches data for range requests (Resolved, Casey Bodley)
Actions #1

Updated by Casey Bodley 7 months ago

  • Status changed from New to Fix Under Review
  • Assignee set to Casey Bodley
  • Tags set to website
  • Backport set to pacific quincy reef
  • Pull request ID set to 53602

thanks Ondrej,

wrk.headers["Range"] = nrangefrom.."-"..nrangeto

from my reading of RGWGetObj::parse_range() at https://github.com/ceph/ceph/blob/9fedc1e0/src/rgw/rgw_op.cc#L112-L139, it expects the format of the Range header to look like "bytes=from-to". this parsing logic seems consistent with the syntax described in https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Range
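
To illustrate the "bytes=from-to" format mentioned above, here is a minimal Python sketch of a single-range parser. It is not the RGW implementation: the function name `parse_range`, its return convention (an inclusive `(start, end)` pair, or `None` when the header is unusable), and the rejection of multi-range headers are all assumptions made for this example.

```python
from typing import Optional, Tuple


def parse_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Parse a single-range header of the form 'bytes=from-to'.

    Hypothetical helper for illustration only. Returns an inclusive
    (start, end) byte range, or None if the header cannot be used.
    """
    if size <= 0:
        return None
    unit, sep, spec = header.partition("=")
    # The unit prefix is mandatory: "0-99" without "bytes=" is rejected.
    if sep != "=" or unit.strip().lower() != "bytes":
        return None
    spec = spec.strip()
    if "," in spec:
        return None  # multiple ranges are out of scope for this sketch
    first, dash, last = spec.partition("-")
    if dash != "-":
        return None
    try:
        if first == "":
            # Suffix form "bytes=-N": the final N bytes of the object.
            n = int(last)
            if n <= 0:
                return None
            return (max(size - n, 0), size - 1)
        start = int(first)
        # Open-ended form "bytes=N-": from N to the end of the object.
        end = int(last) if last else size - 1
    except ValueError:
        # Non-integer bounds such as "0.42" end up here.
        return None
    if start > end or start >= size:
        return None
    return (start, min(end, size - 1))
```

Under these assumptions, `parse_range("bytes=-500", 8 * 1024 * 1024)` yields the last 500 bytes of an 8 MiB object, while a value without the `bytes=` prefix, such as `"0.42-57"`, yields `None`.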

when passing the "Range: bytes=-500" header to an s3website endpoint, i see it correctly returning the final 500 bytes without prefetching from the given object. however, i do see an earlier object lookup incorrectly using s->prefetch_data=1:

2023-09-22T15:26:21.995-0400 7f540f4ec6c0 10 req 13424627313409767571 0.003000102s retarget Starting retarget
2023-09-22T15:26:21.995-0400 7f540f4ec6c0 20 req 13424627313409767571 0.003000102s get_obj_state: rctx=0x559a351b1ce0 obj=testbucket:8m.iso state=0x559a3c2a2de8 s->prefetch_data=1
2023-09-22T15:26:21.995-0400 7f540f4ec6c0 1 -- 192.168.245.130:0/3181500925 --> [v2:192.168.245.130:6800/1515984712,v1:192.168.245.130:6801/1515984712] -- osd_op(unknown.0.0:200 6.0 6:710b9184:::a648e116-a6fb-48ba-a8d0-888d37298654.4149.1_8m.iso:head [getxattrs,stat,read 0~4194304] snapc 0=[] ondisk+read+known_if_redirected+supports_pool_eio e19) v8 -- 0x559a3c3c6a80 con 0x559a39fa2480

this was coming from the function bool RGWHandler_REST_S3Website::web_dir(). i've opened https://github.com/ceph/ceph/pull/53602 to avoid prefetch there. with that fix applied, we only transfer the requested 500 bytes from the osd
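
As a rough back-of-the-envelope check of why this matters, the sketch below compares the 4194304-byte read seen in the osd_op log line with the 500 bytes the client asked for. It assumes exactly one such prefetch per request; the helper name `read_amplification` is made up for this example.

```python
def read_amplification(osd_bytes: float, client_bytes: float) -> float:
    """Ratio of bytes read from the OSD to bytes returned to the client."""
    if client_bytes <= 0:
        raise ValueError("client_bytes must be positive")
    return osd_bytes / client_bytes


# "read 0~4194304" from the log above versus a 500-byte range request,
# assuming one prefetch per request.
per_request = read_amplification(4194304, 500)

# Rates from the original report: 3Gb/s on s3website versus ~22Mb/s on s3,
# assuming 1 Gb = 1000 Mb.
observed = read_amplification(3000, 22)

print(f"per-request amplification: {per_request:.1f}x")
print(f"observed read-rate ratio:  {observed:.1f}x")
```

With these inputs the per-request ratio is about 8389x and the observed read-rate ratio is about 136x; the two are not directly comparable, since the reported rates depend on object sizes and the mix of requests.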

Actions #2

Updated by Casey Bodley 7 months ago

  • Status changed from Fix Under Review to Pending Backport
Actions #3

Updated by Backport Bot 7 months ago

  • Copied to Backport #63049: pacific: RGW s3website API prefetches data for range requests added
Actions #4

Updated by Backport Bot 7 months ago

  • Copied to Backport #63050: quincy: RGW s3website API prefetches data for range requests added
Actions #5

Updated by Backport Bot 7 months ago

  • Copied to Backport #63051: reef: RGW s3website API prefetches data for range requests added
Actions #6

Updated by Backport Bot 7 months ago

  • Tags changed from website to website backport_processed
Actions #7

Updated by Konstantin Shalygin 6 months ago

  • Status changed from Pending Backport to Resolved
  • Target version set to v19.0.0
  • % Done changed from 0 to 100
  • Source set to Community (dev)