Bug #62709

open

num_objects in bucket stats mismatch between primary and secondary sites which can be fixed by a bucket list

Added by Jane Zhu 8 months ago. Updated 7 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Target version:
-
% Done:
0%

Source:
Tags:
multisite
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

"num_objects", along with other fields in the "usage" section, in "radosgw-admin bucket stats" on the secondary site is 1 or 2 less than on the primary site after the replication is done. However this is not a real replication mismatch, since it can be fixed by running "radosgw-admin bucket list" on the bucket on the secondary site. This can be reproduced with the main branch (as of Aug. 29th, 2023).

An example:
This is after 1 hr of write-only cosbench traffic against a pair of Ceph multisite clusters.
Sync status shows all caught up on both sites.

On Primary site:

$ sudo radosgw-admin -n client.rgw.`hostname`.dev.1 bucket stats --bucket=prod-mixed-1k-thisisbcstestuser0081015
...
    "usage": {
        "rgw.main": {
            "size": 1357000,
            "size_actual": 5558272,
            "size_utilized": 1357000,
            "size_kb": 1326,
            "size_kb_actual": 5428,
            "size_kb_utilized": 1326,
            "num_objects": 1357
        }
    },
...

On Secondary site:

$ sudo radosgw-admin -n client.rgw.`hostname`.dev.1 bucket stats --bucket=prod-mixed-1k-thisisbcstestuser0081015
...
    "usage": {
        "rgw.main": {
            "size": 1356000,
            "size_actual": 5554176,
            "size_utilized": 1356000,
            "size_kb": 1325,
            "size_kb_actual": 5424,
            "size_kb_utilized": 1325,
            "num_objects": 1356
        }
    },
...

$ sudo radosgw-admin -n client.rgw.`hostname`.dev.1 bucket list --max-entries=10000 --bucket=prod-mixed-1k-thisisbcstestuser0081015 > /dev/null

$ sudo radosgw-admin -n client.rgw.`hostname`.dev.1 bucket stats --bucket=prod-mixed-1k-thisisbcstestuser0081015
...
    "usage": {
        "rgw.main": {
            "size": 1357000,
            "size_actual": 5558272,
            "size_utilized": 1357000,
            "size_kb": 1326,
            "size_kb_actual": 5428,
            "size_kb_utilized": 1326,
            "num_objects": 1357
        }
    },
...
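To spot the mismatch without eyeballing the full stats output, the counter can be extracted directly. A minimal sketch, assuming jq is available; the client name and bucket are the ones from the example above, adjust for your environment:

$ sudo radosgw-admin -n client.rgw.`hostname`.dev.1 bucket stats \
      --bucket=prod-mixed-1k-thisisbcstestuser0081015 \
      | jq '.usage["rgw.main"].num_objects'

Run on each site: before the workaround the secondary prints a value 1 or 2 lower than the primary, and after the "bucket list" call the two match.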

Actions #1

Updated by Jane Zhu 8 months ago

I accidentally put this in the wrong project. Can somebody please move it to the "rgw" project? Thanks!

Actions #2

Updated by Casey Bodley 8 months ago

  • Project changed from teuthology to rgw

Actions #3

Updated by Jane Zhu 8 months ago

More findings:

Take this bucket as an example: prod-mixed-1m-thisisbcstestuser0035009

$  sudo radosgw-admin -n client.rgw.`hostname`.dev.1 bucket stats --bucket=prod-mixed-1m-thisisbcstestuser0035009
...
    "num_shards": 101,
...
    "id": "3cedf2a0-fc58-4236-857c-36815b850df6.18443.23",
...

On primary:

$ sudo rados -p dev-zone-bcc-master.rgw.buckets.index getomapheader .dir.3cedf2a0-fc58-4236-857c-36815b850df6.18443.23.91 omapheader.91

$ ceph-dencoder type rgw_bucket_dir_header import omapheader.91 decode dump_json
{
    "ver": 15,
    "master_ver": 0,
    "stats": [
        1,
        {
            "total_size": 14000000,
            "total_size_rounded": 14049280,
            "num_entries": 14,
            "actual_size": 14000000
        }
    ],
    "new_instance": {
        "reshard_status": "not-resharding" 
    }
}

$ sudo rados -p dev-zone-bcc-master.rgw.buckets.index listomapkeys .dir.3cedf2a0-fc58-4236-857c-36815b850df6.18443.23.91 | wc -l
14

On secondary:

$ sudo rados -p dev-zone-bcc-secondary.rgw.buckets.index getomapheader .dir.3cedf2a0-fc58-4236-857c-36815b850df6.18443.23.91 omapheader.91

$ ceph-dencoder type rgw_bucket_dir_header import omapheader.91 decode dump_json
{
    "ver": 14,
    "master_ver": 0,
    "stats": [
        1,
        {
            "total_size": 13000000,
            "total_size_rounded": 13045760,
            "num_entries": 13,
            "actual_size": 13000000
        }
    ],
    "new_instance": {
        "reshard_status": "not-resharding" 
    }
}

$ sudo rados -p dev-zone-bcc-secondary.rgw.buckets.index listomapkeys .dir.3cedf2a0-fc58-4236-857c-36815b850df6.18443.23.91 | wc -l
14

The entries for all the S3 objects have been added to the OMAP properly; however, the OMAP header has not been updated properly on the secondary cluster. We can see that "ver" is 14 on the secondary, whereas it is 15 on the primary.
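A rough way to find which index shards are affected is to compare each shard's decoded header against its actual omap key count. This is only a sketch, assuming the pool name, bucket instance id and shard count from the example above; it sums num_entries across all categories in the header, and note that the key count can legitimately differ from num_entries for versioned buckets or in-flight operations, so treat mismatches as candidates rather than proof:

POOL=dev-zone-bcc-secondary.rgw.buckets.index
INSTANCE=3cedf2a0-fc58-4236-857c-36815b850df6.18443.23
NUM_SHARDS=101

for shard in $(seq 0 $((NUM_SHARDS - 1))); do
    obj=".dir.${INSTANCE}.${shard}"
    # dump this shard's header and add up num_entries across all categories
    sudo rados -p "$POOL" getomapheader "$obj" "/tmp/hdr.${shard}"
    hdr=$(ceph-dencoder type rgw_bucket_dir_header import "/tmp/hdr.${shard}" decode dump_json \
          | grep '"num_entries"' | awk -F'[:,]' '{s += $2} END {print s + 0}')
    # count the actual index entries in the shard's omap
    keys=$(sudo rados -p "$POOL" listomapkeys "$obj" | wc -l)
    if [ "$hdr" -ne "$keys" ]; then
        echo "shard ${shard}: header num_entries=${hdr}, omap keys=${keys}"
    fi
done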

Actions #4

Updated by Shilpa MJ 7 months ago

  • Tags set to multisite, multisite-backlog