Bug #63017

closed

write_data failed: Connection reset by peer error observed against main while uploading a multipart object

Added by Pritha Srivastava 7 months ago. Updated 6 months ago.

Status: Won't Fix
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The error 'write_data failed: Connection reset by peer' was first observed on a 'get object' call while running the test_multipart_upload() test against the d3n filter driver branch. On investigation, the error goes away when latency is introduced after the 'get object' statement, for example by the 'assert' statements that follow it.

I have created a boto3 script based on test_multipart_upload(); its first 'get object' call produces this error in the log file.

I brought up vstart using the following command:
MON=1 OSD=1 RGW=1 MGR=0 MDS=0 ../src/vstart.sh -n -d

I have attached the script here; some parts have been commented out so that the error reproduces.
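
For readers without the attachment, the sequence described above (multipart upload, then a 'get object' call) might look roughly like the sketch below. This is not the attached test_multipart.py. The bucket name, the total object size, and the function and parameter names are assumptions made for illustration; only the key 'mymultipart' and the 5242880-byte part size are taken from the log. The report does not say whether the script reads the response body, so the sketch leaves that to the caller.

```python
import os

PART_SIZE = 5 * 1024 * 1024  # 5242880, the rule->part_size value seen in the log


def split_into_parts(data, part_size=PART_SIZE):
    """Split data into consecutive chunks of at most part_size bytes."""
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


def multipart_upload_then_get(endpoint_url, access_key, secret_key,
                              bucket="mybucket", key="mymultipart",
                              size=4 * PART_SIZE):
    """Upload `size` random bytes as a multipart object, then issue get_object.

    bucket and size are assumed values; they are not given in the report.
    """
    # Imported here so split_into_parts() stays usable without boto3 installed.
    import boto3

    client = boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    client.create_bucket(Bucket=bucket)

    upload_id = client.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    parts = []
    for number, chunk in enumerate(split_into_parts(os.urandom(size)), start=1):
        resp = client.upload_part(Bucket=bucket, Key=key, PartNumber=number,
                                  UploadId=upload_id, Body=chunk)
        parts.append({"ETag": resp["ETag"], "PartNumber": number})
    client.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id,
                                     MultipartUpload={"Parts": parts})

    # The 'get object' call after which the error shows up in the RGW log.
    # The caller decides whether and when to read response["Body"].
    return client.get_object(Bucket=bucket, Key=key)
```

The endpoint URL and credentials depend on the vstart cluster and are not stated in the report, so they are passed in as arguments rather than hard-coded.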

A snippet from the log file is below:
2023-09-28T15:35:10.052+0530 7f4e9b7c86c0 20 req 6030671164491735147 0.005000023s s3:get_obj RGWObjManifest::operator++(): rule->part_size=5242880 rules.size()=1
2023-09-28T15:35:10.052+0530 7f4e9b7c86c0 20 req 6030671164491735147 0.005000023s s3:get_obj RGWObjManifest::operator++(): stripe_ofs=18874368 part_ofs=10485760 rule->part_size=5242880
2023-09-28T15:35:10.052+0530 7f4e9b7c86c0 20 req 6030671164491735147 0.005000023s s3:get_obj RGWObjManifest::operator++(): result: ofs=15728640 stripe_ofs=15728640 part_ofs=15728640 rule->part_size=5242880
2023-09-28T15:35:10.052+0530 7f4e9b7c86c0 20 req 6030671164491735147 0.005000023s s3:get_obj rados->get_obj_iterate_cb oid=38f1b649-d8be-4772-a31a-ed016eba5ea6.4149.1__multipart_mymultipart.2~KCjBpxY4Zb7Z3s_Pw_j9Il1RUe11iXJ.4 obj-ofs=15728640 read_ofs=0 len=4194304
...
2023-09-28T15:35:10.091+0530 7f4e947ba6c0 4 write_data failed: Connection reset by peer
2023-09-28T15:35:10.091+0530 7f4e947ba6c0 0 req 6030671164491735147 0.044000205s s3:get_obj iterate_obj() failed with -104
2023-09-28T15:35:10.091+0530 7f4e947ba6c0 2 req 6030671164491735147 0.044000205s s3:get_obj completing
2023-09-28T15:35:10.091+0530 7f4e947ba6c0 10 req 6030671164491735147 0.044000205s cache get: name=default.rgw.log++script.postrequest. : hit (negative entry)
2023-09-28T15:35:10.091+0530 7f4e947ba6c0 2 req 6030671164491735147 0.044000205s s3:get_obj op status=-104
2023-09-28T15:35:10.091+0530 7f4e947ba6c0 2 req 6030671164491735147 0.044000205s s3:get_obj http status=200
2023-09-28T15:35:10.091+0530 7f4e947ba6c0 1 ====== req done req=0x7f4e69763710 op status=-104 http_status=200 latency=0.044000205s ======
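
Two details in the snippet above can be checked mechanically, as the small sketch below shows. The op status of -104 is the negated Linux errno for ECONNRESET, which matches the 'Connection reset by peer' message. The offset obj-ofs=15728640 with a part size of 5242880 falls at the start of the fourth part, which is consistent with the '.4' suffix of the oid. The helper names are hypothetical, and the part-number arithmetic assumes uniform part sizes and 1-based part numbering. The http_status of 200 alongside a failed op status is presumably because the response headers had already been sent before the body write failed; the log itself does not state this.

```python
import errno
import re

# Final line of the log snippet, copied from the report.
REQ_DONE = ("2023-09-28T15:35:10.091+0530 7f4e947ba6c0 1 ====== req done "
            "req=0x7f4e69763710 op status=-104 http_status=200 "
            "latency=0.044000205s ======")


def parse_req_done(line):
    """Return (op_status, http_status) from a 'req done' line, or None."""
    match = re.search(r"op status=(-?\d+) http_status=(\d+)", line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def errno_name(op_status):
    """Map a negative op status to its errno name on the current platform."""
    return errno.errorcode.get(-op_status, "unknown")


def part_number(obj_ofs, part_size):
    """1-based part number containing obj_ofs, assuming uniform part sizes."""
    return obj_ofs // part_size + 1


op_status, http_status = parse_req_done(REQ_DONE)
# A failed op reported together with a 2xx HTTP status.
mismatch = op_status < 0 and 200 <= http_status < 300
print(op_status, errno_name(op_status), http_status, mismatch)
print(part_number(15728640, 5242880))
```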


Files

test_multipart.py (4.51 KB), Pritha Srivastava, 09/28/2023 10:08 AM