Ceph rgw - Bug #50141: consecutive complete-multipart-upload requests following the 1st (successful) one on the same upload-id result in 500
https://tracker.ceph.com/issues/50141?journal_id=189593 - 2021-04-06T12:48:53Z - Matt Benjamin (mbenjamin@redhat.com)
<ul><li><strong>Pull request ID</strong> set to <i>40594</i></li></ul>
Journal update: https://tracker.ceph.com/issues/50141?journal_id=189743 - 2021-04-07T17:47:18Z - Mark Kogan (mkogan@redhat.com)
<p>Reproducing below a flow similar to a Quay deployment with a geo-replication zone configuration, using the aws CLI:</p>
<p>1st multipart-complete:<br /><pre>
dd if=/dev/zero of=container.img bs=1M count=4
aws --endpoint-url http://localhost:8000 s3 mb s3://alln-registry
aws --endpoint-url http://localhost:8000 s3api create-multipart-upload --bucket alln-registry --key 'datastorage/registry/sha256/07/07680f8abcbce9cff4b73de2d6ce0aa9d65f35554da0f856fbd4cb4a4492dd4f' | jq
{
"Bucket": "alln-registry",
"Key": "datastorage/registry/sha256/07/07680f8abcbce9cff4b73de2d6ce0aa9d65f35554da0f856fbd4cb4a4492dd4f",
"UploadId": "2~zVVHC7y1pqXH0zbp2_bCcZa61teKt8w"
}
export UPID='2~zVVHC7y1pqXH0zbp2_bCcZa61teKt8w'
aws --endpoint-url http://localhost:8000 s3api upload-part --bucket alln-registry --key 'datastorage/registry/sha256/07/07680f8abcbce9cff4b73de2d6ce0aa9d65f35554da0f856fbd4cb4a4492dd4f' --part-number 1 --body ./container.img --upload-id "${UPID}"
{
"ETag": "\"b5cfa9d6c8febd618f91ac2843d50a1c\""
}
aws --endpoint-url http://localhost:8000 s3api list-parts --bucket alln-registry --key 'datastorage/registry/sha256/07/07680f8abcbce9cff4b73de2d6ce0aa9d65f35554da0f856fbd4cb4a4492dd4f' --upload-id "${UPID}"
{
"Parts": [
{
"PartNumber": 1,
"LastModified": "2021-04-07T11:50:59.516Z",
"ETag": "\"b5cfa9d6c8febd618f91ac2843d50a1c\"",
"Size": 4194304
}
],
"Initiator": null,
"Owner": {
"DisplayName": "",
"ID": ""
},
"StorageClass": "STANDARD"
}
cat <<_EOF > fileparts.json
{
"Parts": [{
"ETag": "b5cfa9d6c8febd618f91ac2843d50a1c",
"PartNumber":1
}]
}
_EOF
aws --endpoint-url http://localhost:8000 s3api complete-multipart-upload --multipart-upload file://fileparts.json --bucket alln-registry --key 'datastorage/registry/sha256/07/07680f8abcbce9cff4b73de2d6ce0aa9d65f35554da0f856fbd4cb4a4492dd4f' --upload-id "${UPID}"
</pre></p>
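<p>As a side note, the part ETag returned above can be verified locally: for a non-SSE single part, RGW (like S3) uses the plain MD5 of the part body as the ETag, and container.img is exactly 4 MiB of zeros:</p>

```python
import hashlib

# Same content as `dd if=/dev/zero of=container.img bs=1M count=4`
part = b"\x00" * (4 * 1024 * 1024)

etag = hashlib.md5(part).hexdigest()
print(etag)  # b5cfa9d6c8febd618f91ac2843d50a1c, matching the upload-part response
```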
<p>In gdb, with a breakpoint in src/rgw/rgw_op.cc at RGWCompleteMultipart::execute():<br /><pre>
serializer = meta_obj->get_serializer("RGWCompleteMultipart");
op_ret = serializer->try_lock(dur, y);
Thread 99 "civetweb-worker" hit Breakpoint 2, RGWCompleteMultipart::execute (this=0x7ffed0016640, y=...) at ../src/rgw/rgw_op.cc:5956
5956 if (op_ret < 0) {
(gdb) p op_ret
$1 = 0
</pre></p>
<p>On the 1st multipart-complete request, execution proceeds as before and does not enter the new flow (it does not check for the existence of the first part object before the multipart completes).</p>
<p>2nd multipart-complete - when the 200 OK response to the 1st complete-multipart gets lost in transmission (a network problem occurred), Quay re-sends the complete-multipart request:<br /><pre>
aws --endpoint-url http://localhost:8000 s3api complete-multipart-upload --multipart-upload file://fileparts.json --bucket alln-registry --key 'datastorage/registry/sha256/07/07680f8abcbce9cff4b73de2d6ce0aa9d65f35554da0f856fbd4cb4a4492dd4f' --upload-id "${UPID}"
</pre><br />In gdb, at the same breakpoint in src/rgw/rgw_op.cc at RGWCompleteMultipart::execute():<br /><pre>
serializer = meta_obj->get_serializer("RGWCompleteMultipart");
op_ret = serializer->try_lock(dur, y);
Thread 99 "civetweb-worker" hit Breakpoint 2, RGWCompleteMultipart::execute (this=0x7ffed0016640, y=...) at ../src/rgw/rgw_op.cc:5956
5956 if (op_ret < 0) {
(gdb) p op_ret
$2 = -2
</pre><br />In this scenario the new flow is executed.</p>
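<p>The two gdb sessions above can be summarized as a small decision function (an illustrative model only, not the actual RGW code; the function and return-value names are made up):</p>

```python
ENOENT = 2  # errno: the multipart meta object was already removed by a completed upload

def classify_complete(try_lock_ret: int, first_part_exists: bool) -> str:
    """Model of what RGWCompleteMultipart::execute() decides after
    serializer->try_lock() (sketch; the real logic lives in rgw_op.cc)."""
    if try_lock_ret == 0:
        # 1st request: lock acquired, complete the upload normally
        return "complete"
    if try_lock_ret == -ENOENT:
        # Meta object gone: either this upload-id already completed, or the
        # bucket/key/upload-id is wrong. Disambiguate by stat()ing the
        # first part object.
        if first_part_exists:
            return "already-complete"  # idempotent success instead of a 500
        return "no-such-upload"        # surfaces as 404 NoSuchUpload
    return "failed-to-acquire-lock"

print(classify_complete(0, True))    # complete         (1st request, op_ret = 0)
print(classify_complete(-2, True))   # already-complete (retry, op_ret = -2)
print(classify_complete(-2, False))  # no-such-upload   (wrong key/bucket/upload-id)
```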
<p>rgw log:<br /><pre>
2021-04-07T16:32:12.592+0300 7ffedeffd700 0 req 47 3.161037683s s3:complete_multipart failed to acquire lock
2021-04-07T16:32:12.592+0300 7ffedeffd700 20 req 47 3.161037683s s3:complete_multipart check already complete multipart: 1c816972-ff50-47f6-9d88-88c87c8b9343.4183.1__multipart_datastorage/registry/sha256/07/07680f8abcbce9cff4b73de2d6ce0aa9d65f35554da0f856fbd4cb4a4492dd4f.2~zVVHC7y1pqXH0zbp2_bCcZa61teKt8w.1
2021-04-07T16:32:12.593+0300 7ffedeffd700 2 req 47 3.162037611s s3:complete_multipart completing
2021-04-07T16:32:12.594+0300 7ffedeffd700 20 req 47 3.163037539s get_system_obj_state: rctx=0x7ffedeff8470 obj=default.rgw.log:script.postrequest. state=0x7fffe40e10a0 s->prefetch_data=0
2021-04-07T16:32:12.594+0300 7ffedeffd700 20 cache get: touching lru, lru_counter=59 promotion_ts=55
2021-04-07T16:32:12.594+0300 7ffedeffd700 10 moving default.rgw.log++script.postrequest. to cache LRU end
2021-04-07T16:32:12.594+0300 7ffedeffd700 10 cache get: name=default.rgw.log++script.postrequest. : hit (negative entry)
2021-04-07T16:32:12.594+0300 7ffedeffd700 2 req 47 3.163037539s s3:complete_multipart op status=1901
2021-04-07T16:32:12.594+0300 7ffedeffd700 2 req 47 3.163037539s s3:complete_multipart http status=202
2021-04-07T16:32:12.594+0300 7ffedeffd700 1 ====== req done req=0x7ffedeffa5a0 op status=1901 http_status=202 latency=3.163037539s ======
</pre></p>
<p>Potential false-positive scenarios that are handled correctly:<br />1. wrong key<br />2. wrong bucket<br />3. wrong UploadId</p>
<p>Caught by:<br /><pre>
ret = data_ioctx.stat(chk_oid, nullptr, nullptr);
if (ret < 0) {
  ldpp_dout(this, 0) << __func__ << ": ERROR: ioctx.stat(" << chk_oid << ") returned ret = " << ret << dendl;
  op_ret = -ERR_NO_SUCH_UPLOAD;
  return;
}
</pre><br />(ret will be -2, i.e. -ENOENT)</p>
<p>example:<br /><pre>
2021-04-07T16:23:48.823+0300 7ffedeffd700 0 req 44 6.753080368s s3:complete_multipart failed to acquire lock
2021-04-07T16:23:48.823+0300 7ffedeffd700 20 req 44 6.753080368s s3:complete_multipart check already complete multipart: 1c816972-ff50-47f6-9d88-88c87c8b9343.4183.1__multipart_datastorage/re
gistry/sha256/07/07680f8abcbce9cff4b73de2d6ce0aa9d65f35554da0f856fbd4cb4a4492dd4fZ.2~zVVHC7y1pqXH0zbp2_bCcZa61teKt8w.1
2021-04-07T16:23:48.824+0300 7ffedeffd700 0 req 44 6.754080296s s3:complete_multipart execute: ERROR: ioctx.stat(1c816972-ff50-47f6-9d88-88c87c8b9343.4183.1__multipart_datastorage/registry/
sha256/07/07680f8abcbce9cff4b73de2d6ce0aa9d65f35554da0f856fbd4cb4a4492dd4fZ.2~zVVHC7y1pqXH0zbp2_bCcZa61teKt8w.1) returned ret = -2
2021-04-07T16:23:48.824+0300 7ffedeffd700 2 req 44 6.754080296s s3:complete_multipart op status=-2009
2021-04-07T16:23:48.824+0300 7ffedeffd700 2 req 44 6.754080296s s3:complete_multipart http status=404
</pre><br />A 404 is returned, as shown above.</p>
<p>Potential false-positive scenarios that are NOT handled correctly yet (related to ETag verification):<br />1. specifying a wrong ETag returns 200 OK (as noted in the GitHub comments, ETags are not verified)<br />2. a complete-multipart request with any previously successful UploadId returns 200 OK.</p>
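<p>A hypothetical ETag check that would close scenario 1 could compare the client's part list against the parts recorded for the upload-id (a sketch only; RGW does not currently perform this check, and the function name is invented):</p>

```python
def parts_match(requested, stored):
    """Compare a CompleteMultipartUpload part list against the parts
    recorded for the upload-id (hypothetical check, not in RGW)."""
    stored_etags = {p["PartNumber"]: p["ETag"].strip('"') for p in stored}
    for part in requested:
        if stored_etags.get(part["PartNumber"]) != part["ETag"].strip('"'):
            return False  # would map to 400 InvalidPart in S3 semantics
    return True

stored = [{"PartNumber": 1, "ETag": '"b5cfa9d6c8febd618f91ac2843d50a1c"'}]
print(parts_match([{"PartNumber": 1, "ETag": "b5cfa9d6c8febd618f91ac2843d50a1c"}], stored))  # True
print(parts_match([{"PartNumber": 1, "ETag": "wrong-etag"}], stored))  # False
```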
<p>In tests with the Quay client, reproducing the network-error scenario that prevents the 200 OK transmission, the false-positive scenarios above related to ETag verification do not occur.</p>
Journal update: https://tracker.ceph.com/issues/50141?journal_id=203991 - 2021-10-06T15:20:09Z - Mark Kogan (mkogan@redhat.com)
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Resolved</i></li></ul>
Journal update: https://tracker.ceph.com/issues/50141?journal_id=208380 - 2022-01-13T15:23:38Z - J. Eric Ivancich (ivancich@redhat.com)
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-3 priority-4 priority-default closed" href="/issues/53808">Bug #53808</a>: Invalid response for CompleteMultipartUpload 500 'InternalError': This multipart completion is already in progress</i> added</li></ul>
Journal update: https://tracker.ceph.com/issues/50141?journal_id=208381 - 2022-01-13T15:24:42Z - Casey Bodley (cbodley@redhat.com)
<ul><li><strong>Status</strong> changed from <i>Resolved</i> to <i>Pending Backport</i></li><li><strong>Tags</strong> set to <i>multipart</i></li><li><strong>Backport</strong> set to <i>octopus pacific</i></li></ul>
Journal update: https://tracker.ceph.com/issues/50141?journal_id=208383 - 2022-01-13T15:26:28Z - Backport Bot
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-9 status-3 priority-4 priority-default closed" href="/issues/53867">Backport #53867</a>: octopus: consecutive complete-multipart-upload requests following the 1st (successful) one on the same upload-id result in 500</i> added</li></ul>
Journal update: https://tracker.ceph.com/issues/50141?journal_id=208385 - 2022-01-13T15:26:36Z - Backport Bot
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-9 status-3 priority-4 priority-default closed" href="/issues/53868">Backport #53868</a>: pacific: consecutive complete-multipart-upload requests following the 1st (successful) one on the same upload-id result in 500</i> added</li></ul>
Journal update: https://tracker.ceph.com/issues/50141?journal_id=222802 - 2022-08-08T16:34:10Z - Backport Bot
<ul><li><strong>Tags</strong> changed from <i>multipart</i> to <i>multipart backport_processed</i></li></ul>
Journal update: https://tracker.ceph.com/issues/50141?journal_id=226873 - 2022-09-30T10:08:59Z - Konstantin Shalygin (k0ste@k0ste.ru)
<ul><li><strong>Tracker</strong> changed from <i>Fix</i> to <i>Bug</i></li><li><strong>Status</strong> changed from <i>Pending Backport</i> to <i>Resolved</i></li><li><strong>% Done</strong> changed from <i>0</i> to <i>100</i></li><li><strong>Tags</strong> changed from <i>multipart backport_processed</i> to <i>multipart</i></li><li><strong>Regression</strong> set to <i>No</i></li><li><strong>Severity</strong> set to <i>3 - minor</i></li></ul>