Bug #53199
bucket stats inconsistency from racing multipart completions
Status: Closed
% Done: 100%
Description
steps to reproduce:
1. create a fresh bucket
$ s3cmd mb s3://testbucket
Bucket 's3://testbucket/' created
2. create a 32MB object (any size over the 15MB multipart threshold)
$ truncate -s 32M 32m.iso
3. perform 8 concurrent multipart uploads to the same object
$ for i in {0..7}; do s3cmd put 32m.iso s3://testbucket & done
...
[8]+ Done s3cmd put 32m.iso s3://testbucket
4. check 'bucket stats'
$ radosgw-admin bucket stats --bucket testbucket
...
    "usage": {
        "rgw.main": {
            "size": 134217728,
            "size_actual": 134217728,
            "size_utilized": 134217728,
            "size_kb": 131072,
            "size_kb_actual": 131072,
            "size_kb_utilized": 131072,
            "num_objects": 10
        },
expected stats: 1 object of size=33554432
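A quick arithmetic check of the gap between expected and observed stats (the numbers come straight from the `bucket stats` output above; the interpretation of the extra entries as leftover multipart parts is confirmed by the analysis in the comments below):

```python
# expected: the single completed 32MB object
expected_size, expected_objects = 33554432, 1
# observed: values reported by 'radosgw-admin bucket stats' above
observed_size, observed_objects = 134217728, 10

# the bucket is accounting 4x the data and 9 extra index entries
print(observed_size // expected_size)       # 4
print(observed_objects - expected_objects)  # 9
```

9 extra entries carrying 3 objects' worth of extra size is consistent with 3 uploads leaving their part entries behind (3 parts each, summing to 32MB per upload, assuming s3cmd's default 15MB chunking).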
Updated by Casey Bodley over 2 years ago
it looks like this is the result of ECANCELED errors on the head object during CompleteMultipart. the rgw log shows 3 of these errors out of the 8 multipart uploads:
2021-11-08T15:56:35.266-0500 7f0756ffd640 1 -- 192.168.245.128:0/2416006518 <== osd.0 v2:192.168.245.128:6800/920291 583 ==== osd_op_reply(589 f9658b86-62b3-45c0-831b-e7bcabd754a6.4137.1_32m.iso [cmpxattr (63) op 1 mode 1,create,call,setxattr (62),setxattr (62),setxattr (347),setxattr (147),setxattr (25),setxattr (34),setxattr (8),setxattr (4),setxattr (65),setxattr (17),setxattr (145)] v20'103 uv101 ondisk = -125 ((125) Operation canceled)) v8 ==== 741+0+0 (crc 0 0 0) 0x7f074c0c2040 con 0x5648d187cda0
2021-11-08T15:56:35.280-0500 7f0756ffd640 1 -- 192.168.245.128:0/2416006518 <== osd.0 v2:192.168.245.128:6800/920291 597 ==== osd_op_reply(603 f9658b86-62b3-45c0-831b-e7bcabd754a6.4137.1_32m.iso [cmpxattr (62) op 1 mode 1,create,call,setxattr (62),setxattr (62),setxattr (347),setxattr (147),setxattr (25),setxattr (34),setxattr (8),setxattr (4),setxattr (65),setxattr (17),setxattr (145)] v20'108 uv107 ondisk = -125 ((125) Operation canceled)) v8 ==== 741+0+0 (crc 0 0 0) 0x7f074c0c2040 con 0x5648d187cda0
2021-11-08T15:56:35.282-0500 7f0756ffd640 1 -- 192.168.245.128:0/2416006518 <== osd.0 v2:192.168.245.128:6800/920291 598 ==== osd_op_reply(604 f9658b86-62b3-45c0-831b-e7bcabd754a6.4137.1_32m.iso [cmpxattr (62) op 1 mode 1,create,call,setxattr (62),setxattr (62),setxattr (347),setxattr (147),setxattr (25),setxattr (34),setxattr (8),setxattr (4),setxattr (65),setxattr (17),setxattr (145)] v20'109 uv107 ondisk = -125 ((125) Operation canceled)) v8 ==== 741+0+0 (crc 0 0 0) 0x7f074c0c2040 con 0x5648d187cda0
when the head object write fails during a bucket index transaction, we cancel the transaction by calling cls_rgw_bucket_complete_op() with op=CLS_RGW_OP_CANCEL instead of CLS_RGW_OP_ADD. the main difference between CLS_RGW_OP_CANCEL and the other ops is that the cls call bails out early on "rgw_bucket_complete_op(): cancel requested" and doesn't try to handle remove_objs
in multipart uploads, remove_objs contains the list of multipart part entries, and we rely on cls to remove their index entries and deduct their stats from the bucket totals
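The cancel-vs-add behavior described above can be sketched as a simplified Python model. This is a toy illustration only (the real logic is C++ in src/cls/rgw/cls_rgw.cc, and all names and structures here are invented for the sketch, not the actual cls API):

```python
CLS_RGW_OP_ADD = "add"
CLS_RGW_OP_CANCEL = "cancel"

def bucket_complete_op(index, stats, op, head, remove_objs):
    """Toy model: 'index' maps entry name -> size, 'stats' holds bucket totals."""
    if op == CLS_RGW_OP_CANCEL:
        # the bug: cancel bails out immediately, so remove_objs is never
        # processed and the multipart part entries stay accounted
        return
    # link the head object and account for it
    index[head["name"]] = head["size"]
    stats["num_objects"] += 1
    stats["size"] += head["size"]
    # successful completion unlinks each part entry and deducts its stats
    for name in remove_objs:
        stats["num_objects"] -= 1
        stats["size"] -= index.pop(name)

# one upload: 3 part entries already in the index, then CompleteMultipart
MB = 1024 * 1024
index = {"part.1": 15 * MB, "part.2": 15 * MB, "part.3": 2 * MB}
stats = {"num_objects": 3, "size": 32 * MB}
parts = list(index)
head = {"name": "32m.iso", "size": 32 * MB}

# canceled transaction: parts stay accounted, stats unchanged (inflated)
bucket_complete_op(index, stats, CLS_RGW_OP_CANCEL, head, parts)
print(stats)  # {'num_objects': 3, 'size': 33554432}

# successful completion: parts removed, only the head object remains
bucket_complete_op(index, stats, CLS_RGW_OP_ADD, head, parts)
print(stats)  # {'num_objects': 1, 'size': 33554432}
```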
a grep for 'remove_objs' in the osd log only shows 5 cases where remove_objs was non-empty:
...
2021-11-08T15:56:35.244-0500 7f9c33fff640 20 <cls> ../src/cls/rgw/cls_rgw.cc:1121: rgw_bucket_complete_op(): remove_objs.size()=0
2021-11-08T15:56:35.265-0500 7f9c33fff640 20 <cls> ../src/cls/rgw/cls_rgw.cc:1121: rgw_bucket_complete_op(): remove_objs.size()=3
2021-11-08T15:56:35.279-0500 7f9c33fff640 20 <cls> ../src/cls/rgw/cls_rgw.cc:1121: rgw_bucket_complete_op(): remove_objs.size()=3
2021-11-08T15:56:35.286-0500 7f9c33fff640 20 <cls> ../src/cls/rgw/cls_rgw.cc:1121: rgw_bucket_complete_op(): remove_objs.size()=3
2021-11-08T15:56:35.290-0500 7f9c33fff640 20 <cls> ../src/cls/rgw/cls_rgw.cc:1121: rgw_bucket_complete_op(): remove_objs.size()=0
2021-11-08T15:56:35.301-0500 7f9c33fff640 20 <cls> ../src/cls/rgw/cls_rgw.cc:1121: rgw_bucket_complete_op(): remove_objs.size()=3
2021-11-08T15:56:35.304-0500 7f9c33fff640 20 <cls> ../src/cls/rgw/cls_rgw.cc:1121: rgw_bucket_complete_op(): remove_objs.size()=0
2021-11-08T15:56:35.308-0500 7f9c33fff640 20 <cls> ../src/cls/rgw/cls_rgw.cc:1121: rgw_bucket_complete_op(): remove_objs.size()=0
2021-11-08T15:56:35.310-0500 7f9c33fff640 20 <cls> ../src/cls/rgw/cls_rgw.cc:1121: rgw_bucket_complete_op(): remove_objs.size()=3
2021-11-08T15:56:35.313-0500 7f9c33fff640 20 <cls> ../src/cls/rgw/cls_rgw.cc:1121: rgw_bucket_complete_op(): remove_objs.size()=0
...
these correspond to the 5 multipart uploads which completed successfully. the 3 canceled transactions did not clean up the index entries for their parts (3 entries each), leaving 9 stale part entries plus the 1 head object, which explains why num_objects=10 instead of 1
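The tally above can be reproduced programmatically. This sketch parses an abbreviated copy of the osd log excerpt (lines trimmed to the cls message; the regex is keyed to the log format shown):

```python
import re

# abbreviated copy of the rgw_bucket_complete_op() lines from the excerpt above
log = """
rgw_bucket_complete_op(): remove_objs.size()=0
rgw_bucket_complete_op(): remove_objs.size()=3
rgw_bucket_complete_op(): remove_objs.size()=3
rgw_bucket_complete_op(): remove_objs.size()=3
rgw_bucket_complete_op(): remove_objs.size()=0
rgw_bucket_complete_op(): remove_objs.size()=3
rgw_bucket_complete_op(): remove_objs.size()=0
rgw_bucket_complete_op(): remove_objs.size()=0
rgw_bucket_complete_op(): remove_objs.size()=3
rgw_bucket_complete_op(): remove_objs.size()=0
"""
sizes = [int(s) for s in re.findall(r"remove_objs\.size\(\)=(\d+)", log)]
cleaned = [s for s in sizes if s > 0]

print(len(cleaned))  # 5 completions processed their part entries
print(sum(cleaned))  # 15 part entries removed
# 8 uploads x 3 parts = 24 entries written; 24 - 15 = 9 left behind,
# plus the 1 head object -> num_objects=10
print(24 - sum(cleaned) + 1)  # 10
```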
Updated by Casey Bodley over 2 years ago
- Status changed from New to Fix Under Review
- Tags changed from stats to stats multipart
- Pull request ID set to 43854
Updated by Casey Bodley over 2 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Backport Bot over 2 years ago
- Copied to Backport #53285: pacific: bucket stats inconsistency from racing multipart completions added
Updated by Backport Bot over 2 years ago
- Copied to Backport #53286: octopus: bucket stats inconsistency from racing multipart completions added
Updated by Backport Bot over 1 year ago
- Tags changed from stats multipart to stats multipart backport_processed
Updated by Konstantin Shalygin about 1 year ago
- Status changed from Pending Backport to Resolved
- % Done changed from 0 to 100
- Source set to Development