Bug #17698

closed

multisite: ECANCELED & 500 error on bucket delete

Added by Abhishek Lekshmanan over 7 years ago. Updated almost 7 years ago.

Status: Resolved
Priority: High
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: jewel
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Sometimes, on a bucket delete at the secondary, the secondary returns a 500 (caused by re-raising the ECANCELED returned by the OSDs). However, the bucket in question is actually deleted, and subsequent client requests for it fail with a 404.

2016-10-25 13:59:21.999214 7f7a48417700  1 -- 127.0.0.1:0/1778501402 <== osd.0 127.0.0.1:6812/27412 2962 ==== osd_op_reply(6705 bucket26 [call,call,delete] v0'0 uv0 ondisk = -125 ((125) Operation canceled)) v7 ==== 212+0+0 (1575458175 0 0) 0x7f7a20013ba0 con 0x564acad41190
2016-10-25 13:59:21.999244 7f7920ff1700  2 req 1192:0.312583:s3:DELETE /bucket26/:delete_bucket:completing
2016-10-25 13:59:21.999247 7f7920ff1700  0 WARNING: set_req_state_err err_no=125 resorting to 500
2016-10-25 13:59:21.999286 7f7920ff1700  2 req 1192:0.312625:s3:DELETE /bucket26/:delete_bucket:op status=-125
2016-10-25 13:59:21.999289 7f7920ff1700  2 req 1192:0.312628:s3:DELETE /bucket26/:delete_bucket:http status=500
2016-10-25 13:59:21.999291 7f7920ff1700  1 ====== req done req=0x7f7920fee7e0 op status=-125 http_status=500 ======
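
The `set_req_state_err err_no=125 resorting to 500` warning in the log suggests that an error number with no dedicated HTTP mapping falls back to 500. A minimal Python sketch of that kind of lookup follows; the table contents, the function name and the `success_status` parameter are illustrative assumptions, not RGW's actual code.

```python
import errno

# Illustrative errno -> HTTP table (an assumption; not RGW's real table).
ERRNO_TO_HTTP = {
    errno.ENOENT: 404,
    errno.EACCES: 403,
    errno.EEXIST: 409,
}


def http_status_for(op_status, success_status=204):
    """Translate a negative errno-style op status into an HTTP status."""
    if op_status == 0:
        return success_status
    # Unmapped errors fall back to 500, like the warning in the log.
    return ERRNO_TO_HTTP.get(-op_status, 500)
```

With a table like this, `http_status_for(-errno.ECANCELED)` yields 500 because ECANCELED has no entry, which matches the `op status=-125` / `http status=500` pair in the log.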


Files

rgw-bucket26.log (108 KB), Abhishek Lekshmanan, 10/25/2016 04:35 PM

Related issues 1 (0 open, 1 closed)

Copied to rgw - Backport #17886: jewel: multisite: ECANCELED & 500 error on bucket delete (Resolved, Abhishek Varshney)
#1

Updated by Abhishek Lekshmanan over 7 years ago

Adding the rgw log of the relevant bucket. It seems we're issuing the delete on the bucket object twice: once at
2016-10-25 13:59:21.931712 7f7a01ffb700 1 -- 127.0.0.1:0/1778501402 --> 127.0.0.1:6812/27412 -- osd_op(client.4111.0:6691 3.c8ef106b bucket26 [call version.check_conds,call version.set,delete] snapc 0=[] ondisk+write+known_if_redirected e29) v7 -- ?+0 0x7f79e0018850 con 0x564acad41190

(which is probably from the mdlog/admin log?), and a second time when the secondary itself processes the delete.
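
To illustrate why the second of two version-checked deletes can come back with ECANCELED, here is a toy Python model. `ToyObjectStore` and `checked_delete` are hypothetical names, not Ceph APIs. The model assumes that the failing version check is what produces ECANCELED for the losing request (the log does not show the exact OSD-side condition), and it leaves out the `version.set` step of the real compound op.

```python
import errno


class ToyObjectStore:
    """Hypothetical in-memory stand-in for the OSD; not a Ceph API."""

    def __init__(self):
        self.versions = {}  # object name -> current version number

    def create(self, name, version=1):
        self.versions[name] = version

    def checked_delete(self, name, expected_version):
        """Delete `name` only if its stored version equals `expected_version`.

        Returns 0 on success and -ECANCELED when the condition fails,
        which in this toy model includes the object already being gone.
        """
        if self.versions.get(name) != expected_version:
            return -errno.ECANCELED
        del self.versions[name]
        return 0


store = ToyObjectStore()
store.create("bucket26", version=1)

# Both code paths saw version 1 before either delete reached the store.
first = store.checked_delete("bucket26", expected_version=1)   # wins the race
second = store.checked_delete("bucket26", expected_version=1)  # loses the race
```

In this model the first delete succeeds and removes the object, while the second fails its version check even though the bucket ends up deleted, which is the outcome both requests wanted.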

Actions #2

Updated by Yehuda Sadeh over 7 years ago

Sounds like the secondary is racing with the request forwarded from the master (which the secondary itself initiated). We should handle that ECANCELED gracefully.
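
One possible shape for handling ECANCELED gracefully, sketched in Python rather than RGW's C++: treat a lost race on bucket delete as success, since the bucket is gone either way. The function name `finish_delete_bucket` is hypothetical, and this is an illustration of the idea under that assumption, not the change that was actually merged.

```python
import errno


def finish_delete_bucket(op_status):
    """Normalize the result of a bucket delete that may have lost a race."""
    if op_status == -errno.ECANCELED:
        # Assumption: ECANCELED here means a concurrent delete of the same
        # bucket won, so the state the client asked for already holds.
        return 0
    # Every other status, success or failure, is passed through unchanged.
    return op_status
```

Only ECANCELED is swallowed here, so genuine failures such as ENOENT still reach the client as errors.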

#4

Updated by Yehuda Sadeh over 7 years ago

  • Priority changed from Normal to High
#5

Updated by Casey Bodley over 7 years ago

  • Status changed from New to Pending Backport
#6

Updated by Nathan Cutler over 7 years ago

  • Source deleted (other)
  • Backport set to jewel
#7

Updated by Nathan Cutler over 7 years ago

  • Copied to Backport #17886: jewel: multisite: ECANCELED & 500 error on bucket delete added
#8

Updated by Nathan Cutler almost 7 years ago

  • Status changed from Pending Backport to Resolved