Bug #46400 (closed): Space usage accounting overestimated

Added by Liam Monahan almost 4 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
High
Assignee:
Target version:
-
% Done:
0%

Source:
Community (user)
Tags:
Backport:
octopus
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Since upgrading from Nautilus 14.2.9 to Octopus 15.2.3 we are seeing
large upticks in the reported usage (both space and object count) for a number of our RGW
users. It does not seem to be isolated to a single user, so I don't think the problem is
in the users' usage patterns. Users are hitting their quotas very quickly even though
they are not writing anywhere near the reported amounts.

For example, here is a bucket that suddenly reports 18446744073709551615 objects in its
rgw.none category. That value is 2^64 - 1, which looks like an unsigned 64-bit counter
that has been decremented below zero; the actual count I know to be around 20,000 objects.

[root@objproxy01 ~]# radosgw-admin bucket stats --bucket=droot-2020
{
    "bucket": "droot-2020",
    "num_shards": 32,
    "tenant": "",
    "zonegroup": "29946069-33ce-49b7-b93d-de8c95a0c344",
    "placement_rule": "default-placement",
    "explicit_placement": {
        "data_pool": "",
        "data_extra_pool": "",
        "index_pool": "" 
    },
    "id": "8b980d5b-23de-41f9-8b14-84a5bbc3f1c9.93433056.64",
    "marker": "8b980d5b-23de-41f9-8b14-84a5bbc3f1c9.93433056.64",
    "index_type": "Normal",
    "owner": "-droot",
    "ver":
"0#12052,1#15700,2#11033,3#11079,4#11521,5#13708,6#12427,7#10442,8#12769,9#11965,10#12820,11#11015,12#12073,13#11741,14#11851,15#124
97,16#10611,17#11652,18#10162,19#13699,20#9519,21#14224,22#13575,23#12635,24#9413,25#11450,26#12700,27#13122,28#10762,29#14674,30#10809,31#1223
2",
    "master_ver":
"0#0,1#0,2#0,3#0,4#0,5#0,6#0,7#0,8#0,9#0,10#0,11#0,12#0,13#0,14#0,15#0,16#0,17#0,18#0,19#0,20#0,21#0,22#0,23#0,24#0,25#0,26#0
,27#0,28#0,29#0,30#0,31#0",
    "mtime": "2020-06-29T15:14:49.363664Z",
    "creation_time": "2020-02-04T20:36:40.752748Z",
    "max_marker":
"0#,1#,2#,3#,4#,5#,6#,7#,8#,9#,10#,11#,12#,13#,14#,15#,16#,17#,18#,19#,20#,21#,22#,23#,24#,25#,26#,27#,28#,29#,30#,31#",
    "usage": {
        "rgw.none": {
            "size": 0,
            "size_actual": 0,
            "size_utilized": 0,
            "size_kb": 0,
            "size_kb_actual": 0,
            "size_kb_utilized": 0,
            "num_objects": 18446744073709551615
        },
        "rgw.main": {
            "size": 11612169555286,
            "size_actual": 11612211085312,
            "size_utilized": 11612169555286,
            "size_kb": 11340009332,
            "size_kb_actual": 11340049888,
            "size_kb_utilized": 11340009332,
            "num_objects": 20034
        },
        "rgw.multimeta": {
            "size": 0,
            "size_actual": 0,
            "size_utilized": 0,
            "size_kb": 0,
            "size_kb_actual": 0,
            "size_kb_utilized": 0,
            "num_objects": 0
        }
    },
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    }
}
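
In case it is useful for anyone checking for the same symptom, this is a rough sketch of how the wrapped rgw.none count can be spotted across all buckets (not an official procedure; it assumes "radosgw-admin bucket list" prints a JSON array of bucket names and that jq is installed):

# Flag every bucket whose stats contain the wrapped 2^64-1 counter.
for b in $(radosgw-admin bucket list | jq -r '.[]'); do
    if radosgw-admin bucket stats --bucket="$b" | grep -q 18446744073709551615; then
        echo "$b reports a wrapped object count"
    fi
done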

A second (but possibly related) issue is that the user who owns the bucket above is reportedly
using 1.3 PB of space, while the known usage from before we did the upgrade is 96.3 TB.

[root@objproxy01 ~]# radosgw-admin user stats --uid=-droot
{
    "stats": {
        "size": 1428764900976977,
        "size_actual": 1428770491326464,
        "size_utilized": 0,
        "size_kb": 1395278223611,
        "size_kb_actual": 1395283682936,
        "size_kb_utilized": 0,
        "num_objects": 2604800
    },
    "last_stats_sync": "2020-06-29T13:42:26.474035Z",
    "last_stats_update": "2020-06-29T13:42:26.471413Z" 
}

Downgrading back to Nautilus 14.2.10 and running "radosgw-admin user stats --uid=<uid> --sync-stats" on every user in our cluster corrected the space accounting.
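
In case it helps others, this is roughly the loop we used to trigger that recalculation for every user (a sketch, assuming "radosgw-admin metadata list user" returns a JSON array of uids; it does not handle tenanted users):

# Resync the usage stats for every RGW user.
for uid in $(radosgw-admin metadata list user | jq -r '.[]'); do
    radosgw-admin user stats --uid="$uid" --sync-stats
done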

This seems to be happening for all users who write data to the cluster, since (as I understand it) stats are only updated when data is written. It is more than a minor annoyance because the reported usage is what gets used to decide whether users are over their quota.
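
For completeness, the quota that these inflated numbers get compared against can be inspected per user with something like the following (a sketch; the uid is the one from the example above):

# Show the quota settings that the reported stats are checked against.
radosgw-admin user info --uid=-droot | jq '.user_quota, .bucket_quota'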


Files

ceph.conf (735 Bytes) - Liam Monahan, 07/16/2020 02:56 PM
ceph-config-dump.txt (118 KB) - output from running "ceph config dump" - Liam Monahan, 07/16/2020 02:56 PM

Related issues 2 (0 open, 2 closed)

Related to rgw - Bug #45970: rgw: bucket index entries marked rgw.none not accounted for correctly during reshard (Resolved)

Copied to rgw - Backport #47037: octopus: rgw: Space usage accounting overestimated (Resolved) - Nathan Cutler
