Feature #54972

open

[Feature specification] Introduce Storage classes in usage stats

Added by Rafael Weingartner about 2 years ago. Updated 3 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

Introduce Storage classes in usage stats

This specification proposes the addition of storage class information to the GET "/admin/usage" and GET "/bucket" data. The goal of the specification is to validate the proposals presented here with the community before we implement them and create the pull requests. Any feedback is welcome here.

Problem Description

In RadosGW, one can configure different storage classes to provide different qualities of service (QoS); for instance, using HDDs to store non-critical, non-latency-sensitive information at a lower cost. On the other hand, it is also possible to use a custom storage class to store objects in NVMe storage pools to provide better response times. It is also possible to store data in erasure-coded pools, which consequently use less raw storage space and may have some impact on performance. There are many other possibilities and use cases that can be achieved with storage classes; we have mentioned only a few of them here.

One example is the following configuration (obtained with "radosgw-admin zone get").

{
    ..
    Other fields that we do not care about here
    ..
    "placement_pools": [
        {
            "key": "default-placement",
            "val": {
                "index_pool": "default.rgw.buckets.index",
                "storage_classes": {
                    "STANDARD": {
                        "data_pool": "rgw.buckets.hdd" 
                    },
                    "STANDARD_EC": {
                        "data_pool": "rgw.buckets.hdd.data_ec" 
                    },
                    "SSD": {
                        "data_pool": "rgw.buckets.ssd" 
                    },
                    "NVME": {
                        "data_pool": "rgw.buckets.nvme" 
                    }
                },
                "data_extra_pool": "default.rgw.buckets.non-ec",
                "index_type": 0
            }
        }
    ]
}
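
For context, clients can already target a specific storage class when uploading through the regular S3 API. A minimal sketch using boto3 (the endpoint, credentials, bucket, object key, and the choice of the "NVME" class are placeholders matching the configuration above):

import boto3

# Connect to the RadosGW S3 endpoint (placeholder endpoint and credentials).
s3 = boto3.client(
    "s3",
    endpoint_url="http://rgw.example.com:8080",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Upload an object into the custom "NVME" storage class defined in the zone placement.
s3.put_object(
    Bucket="bucket1",
    Key="reports/2021-11.csv",
    Body=b"some,data\n",
    StorageClass="NVME",
)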

Everything works fine with respect to the RadosGW APIs for uploading objects using specific storage classes. However, there is no practical method to find out the volume of data and the number of objects in a bucket that use each of the storage classes provided by RadosGW. Below is one example of the output of the bucket stats API.

{
    "bucket": "bucket1",
    "num_shards": 11,
    "tenant": "",
    "zonegroup": "2a761aef-610c-4da1-9eff-f5287252c513",
    "placement_rule": "default-placement",
    "explicit_placement": {
        "data_pool": "",
        "data_extra_pool": "",
        "index_pool": "" 
    },
    "id": "1f878f22-f69c-486b-81de-144920efcbdb.55411.2",
    "marker": "1f878f22-f69c-486b-81de-144920efcbdb.55411.2",
    "index_type": "Normal",
    "owner": "rafael",
    "ver": "0#201,1#1,2#201,3#1,4#201,5#799,6#799,7#400,8#401,9#1,10#401",
    "master_ver": "0#0,1#0,2#0,3#0,4#0,5#0,6#0,7#0,8#0,9#0,10#0",
    "mtime": "2021-11-26T11:23:47.375664Z",
    "creation_time": "2021-11-26T11:23:47.363417Z",
    "max_marker": "0#,1#,2#,3#,4#,5#,6#,7#,8#,9#,10#",
    "usage": {
        "rgw.main": {
            "size": 9215803392,
            "size_actual": 9215803392,
            "size_utilized": 9215803392,
            "size_kb": 8999808,
            "size_kb_actual": 8999808,
            "size_kb_utilized": 8999808,
            "num_objects": 3
        },
        "rgw.multimeta": {
            "size": 0,
            "size_actual": 0,
            "size_utilized": 0,
            "size_kb": 0,
            "size_kb_actual": 0,
            "size_kb_utilized": 0,
            "num_objects": 0
        }
    },
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    }
}

As shown above, there is no indication of the storage classes used to store the data. The same situation happens with the admin usage API: there is no indication of the storage classes of the objects involved in upload/download operations to/from RadosGW. Below is an example of the output of that API.

{
    "entries": [
        {
            "buckets": [
                {
                    "bucket": "",
                    "categories": [
                        {
                            "bytes_received": 0,
                            "bytes_sent": 35015,
                            "category": "list_buckets",
                            "ops": 45,
                            "successful_ops": 45
                        }
                    ],
                    "epoch": 1609520400,
                    "owner": "72431a5a-4a34-4319-a7f5-5f150f370284",
                    "time": "2021-01-01 17:00:00.000000Z" 
                },
                {
                    "bucket": "-",
                    "categories": [
                        {
                            "bytes_received": 0,
                            "bytes_sent": 293,
                            "category": "get_bucket_policy",
                            "ops": 1,
                            "successful_ops": 0
                        }
                    ],
                    "epoch": 1609866000,
                    "owner": "72431a5a-4a34-4319-a7f5-5f150f370284",
                    "time": "2021-01-05 17:00:00.000000Z" 
                },
                {
                    "bucket": "bucket_test-1",
                    "categories": [
                        {
                            "bytes_received": 0,
                            "bytes_sent": 0,
                            "category": "create_bucket",
                            "ops": 1,
                            "successful_ops": 1
                        },
                        {
                            "bytes_received": 0,
                            "bytes_sent": 6669,
                            "category": "get_bucket_policy",
                            "ops": 27,
                            "successful_ops": 27
                        },
                        {
                            "bytes_received": 0,
                            "bytes_sent": 0,
                            "category": "put_bucket_policy",
                            "ops": 1,
                            "successful_ops": 1
                        }
                    ],
                    "epoch": 1609862400,
                    "owner": "72431a5a-4a34-4319-a7f5-5f150f370284",
                    "time": "2021-01-05 16:00:00.000000Z" 
                }
            ],
            "user": "e6bde5e6-718c-4137-8438-f99883388042" 
        }
    ],
    "summary": [
        {
            "categories": [
                {
                    "bytes_received": 0,
                    "bytes_sent": 8,
                    "category": "list_buckets",
                    "ops": 4,
                    "successful_ops": 4
                }
            ],
            "total": {
                "bytes_received": 0,
                "bytes_sent": 8,
                "ops": 4,
                "successful_ops": 4
            },
            "user": "e6bde5e6-718c-4137-8438-f99883388042" 
        }]
}
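
For illustration, a billing pipeline consuming this output today can only aggregate per bucket and per category; nothing in the entries tells it which storage class the object operations touched. A minimal sketch in Python (assuming the JSON above was saved to a file named usage.json, e.g. from "radosgw-admin usage show"):

import json
from collections import defaultdict

# Load a usage report previously dumped from the admin usage API.
with open("usage.json") as f:
    usage = json.load(f)

# Aggregate bytes and operations per (bucket, category). There is no storage
# class dimension available in the current format.
totals = defaultdict(lambda: {"bytes_received": 0, "bytes_sent": 0, "ops": 0})
for entry in usage["entries"]:
    for bucket in entry["buckets"]:
        for cat in bucket["categories"]:
            key = (bucket["bucket"], cat["category"])
            totals[key]["bytes_received"] += cat["bytes_received"]
            totals[key]["bytes_sent"] += cat["bytes_sent"]
            totals[key]["ops"] += cat["ops"]

for (bucket, category), t in sorted(totals.items()):
    print(f"{bucket or '<account>'} {category}: {t['ops']} ops, "
          f"{t['bytes_received']} bytes in, {t['bytes_sent']} bytes out")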

Having said all that, storage classes in RadosGW work, but RadosGW does not provide a mechanism to bill/rate stored objects, and the operations that affect them, per storage class. This has also been reported in https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/D7GDNGCJYJMDF5JP74MOGC6EYLKZ3S7S/, and later registered as a feature request via https://tracker.ceph.com/issues/47342.

Proposed Change

To address the reported problem, we propose to extend the admin usage API (/admin/usage), which exposes usage data stored in Ceph about operations (PUT/POST/DELETE/LIST, and so on). These operations are grouped and counted per user and per bucket, and every interaction with the RadosGW API generates a usage entry that is persisted in Ceph itself. To extend this, we would need to load the storage class of the object being handled (for PUT/POST/DELETE methods; only object operations would be affected), extend the usage entry to hold this new attribute, and then take it into account when aggregating the data presented in the response of the request.
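
To make the intended change more concrete, the following is a conceptual sketch (in Python rather than the actual RGW C++ code) of how a usage entry could carry the storage class of the affected object and how aggregation would then group object operations by it; the type and key names are illustrative only:

from collections import defaultdict
from dataclasses import dataclass

@dataclass
class UsageEntry:
    bucket: str
    category: str        # e.g. "put_obj", "get_obj", "delete_obj", "list_buckets"
    storage_class: str   # new attribute, resolved from the object being handled (empty for non-object ops)
    bytes_received: int
    bytes_sent: int
    ops: int
    successful_ops: int

def aggregate(entries):
    # Group entries per bucket; object operations are additionally keyed by their
    # storage class, producing the proposed "categories-<storage-class-name>" sections.
    report = defaultdict(lambda: defaultdict(list))
    for e in entries:
        section = f"categories-{e.storage_class}" if e.storage_class else "categories"
        report[e.bucket][section].append(e)
    return report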

One example of a response for the API with the proposed changes is the following:


{
    "entries": [
        {
            "buckets": [
                {
                    "bucket": "",
                    "categories": [
                        {
                            "bytes_received": 0,
                            "bytes_sent": 35015,
                            "category": "list_buckets",
                            "ops": 45,
                            "successful_ops": 45
                        }
                    ],
                    "epoch": 1609520400,
                    "owner": "72431a5a-4a34-4319-a7f5-5f150f370284",
                    "time": "2021-01-01 17:00:00.000000Z" 
                },
                {
                    "bucket": "-",
                    "categories": [
                        {
                            "bytes_received": 0,
                            "bytes_sent": 293,
                            "category": "get_bucket_policy",
                            "ops": 1,
                            "successful_ops": 0
                        }
                    ],
                    "epoch": 1609866000,
                    "owner": "72431a5a-4a34-4319-a7f5-5f150f370284",
                    "time": "2021-01-05 17:00:00.000000Z" 
                },
                {
                    "bucket": "bucket_test-1",
                    "categories": [
                        {
                            "bytes_received": 0,
                            "bytes_sent": 0,
                            "category": "create_bucket",
                            "ops": 1,
                            "successful_ops": 1
                        },
                        {
                            "bytes_received": 0,
                            "bytes_sent": 6669,
                            "category": "get_bucket_policy",
                            "ops": 27,
                            "successful_ops": 27
                        },
                        {
                            "bytes_received": 0,
                            "bytes_sent": 0,
                            "category": "put_bucket_policy",
                            "ops": 1,
                            "successful_ops": 1
                        }
                    ],
                    "categories-<storage-class-name>": [
                         {
                            "bytes_received": 129836684,
                            "bytes_sent": 0,
                            "category": "put_obj",
                            "ops": 323,
                            "successful_ops": 323
                        },
                        {
                            "bytes_received": 0,
                            "bytes_sent": 119318246,
                            "category": "get_obj",
                            "ops": 17956,
                            "successful_ops": 17956
                        }
                      <Many other operations here>
                                ],
                    "epoch": 1609862400,
                    "owner": "72431a5a-4a34-4319-a7f5-5f150f370284",
                    "time": "2021-01-05 16:00:00.000000Z" 
                }
            ],
            "user": "e6bde5e6-718c-4137-8438-f99883388042" 
        }
    ],
    "summary": [
        {
            "categories": [
                {
                    "bytes_received": 0,
                    "bytes_sent": 8,
                    "category": "list_buckets",
                    "ops": 4,
                    "successful_ops": 4
                }
            ],
            "categories-<storage-class-name>": [
                 {
                    "bytes_received": 129836684,
                    "bytes_sent": 0,
                    "category": "put_obj",
                    "ops": 323,
                    "successful_ops": 323
                },
                {
                    "bytes_received": 0,
                    "bytes_sent": 119318246,
                    "category": "get_obj",
                    "ops": 17956,
                    "successful_ops": 17956
                }
              <Many other operations here>
            ],
            "total": {
                "bytes_received": 0,
                "bytes_sent": 8,
                "ops": 4,
                "successful_ops": 4
            },
            "user": "e6bde5e6-718c-4137-8438-f99883388042" 
        }]
}

As one can see, new entries would be created following the pattern "categories-<storage-class-name>", holding the data for the operations that affected objects of the given storage class.
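
On the consumer side, a billing or rating tool could then recover the storage class directly from those key names. A minimal sketch (the key pattern follows the proposal above; the file name and per-GiB prices are made-up placeholders):

import json

# Hypothetical per-GiB egress prices per storage class (placeholders).
PRICE_PER_GIB_SENT = {"STANDARD": 0.010, "STANDARD_EC": 0.008, "SSD": 0.015, "NVME": 0.020}

with open("usage_with_classes.json") as f:
    usage = json.load(f)

for entry in usage["entries"]:
    for bucket in entry["buckets"]:
        for key, categories in bucket.items():
            # Only the proposed per-storage-class sections are of interest here.
            if not isinstance(categories, list) or not key.startswith("categories-"):
                continue
            storage_class = key[len("categories-"):]
            sent = sum(c["bytes_sent"] for c in categories)
            cost = sent / 2**30 * PRICE_PER_GIB_SENT.get(storage_class, 0.0)
            print(f"{bucket['bucket']} [{storage_class}]: {sent} bytes sent -> {cost:.4f}")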

The other API, normally used via "radosgw-admin bucket stats --bucket=bucket1" (/bucket), presents the total amount of resources (objects) that a bucket holds. It is implemented by checking the objects stored in the bucket in Ceph and counting/summing them up. Objects that use different storage classes are stored in different storage pools; therefore, we can distinguish between them and provide more granular accounting based on the storage classes used in a bucket. One example of a response for the API with the proposed changes is the following:


{
    "bucket": "bucket1",
    "num_shards": 11,
    "tenant": "",
    "zonegroup": "2a761aef-610c-4da1-9eff-f5287252c513",
    "placement_rule": "default-placement",
    "explicit_placement": {
        "data_pool": "",
        "data_extra_pool": "",
        "index_pool": "" 
    },
    "id": "1f878f22-f69c-486b-81de-144920efcbdb.55411.2",
    "marker": "1f878f22-f69c-486b-81de-144920efcbdb.55411.2",
    "index_type": "Normal",
    "owner": "rafael",
    "ver": "0#201,1#1,2#201,3#1,4#201,5#799,6#799,7#400,8#401,9#1,10#401",
    "master_ver": "0#0,1#0,2#0,3#0,4#0,5#0,6#0,7#0,8#0,9#0,10#0",
    "mtime": "2021-11-26T11:23:47.375664Z",
    "creation_time": "2021-11-26T11:23:47.363417Z",
    "max_marker": "0#,1#,2#,3#,4#,5#,6#,7#,8#,9#,10#",
    "usage": {
        "rgw.main": {
            "size": 9215803392,
            "size_actual": 9215803392,
            "size_utilized": 9215803392,
            "size_kb": 8999808,
            "size_kb_actual": 8999808,
            "size_kb_utilized": 8999808,
            "num_objects": 3
        },
        "rgw.multimeta": {
            "size": 0,
            "size_actual": 0,
            "size_utilized": 0,
            "size_kb": 0,
            "size_kb_actual": 0,
            "size_kb_utilized": 0,
            "num_objects": 0
        },
        "rgw.storage-classes": [
        {
            "name": "STANDARD" 
            "size": 9215803392,
            "size_actual": 9215803392,
            "size_utilized": 9215803392,
            "size_kb": 8999808,
            "size_kb_actual": 8999808,
            "size_kb_utilized": 8999808,
            "num_objects": 3
        },
        {
            "name": "STANDARD_EC" 
            "size": 0,
            "size_actual": 0,
            "size_utilized": 0,
            "size_kb": 0,
            "size_kb_actual": 0,
            "size_kb_utilized": 0,
            "num_objects": 0
        },
        {
            "name": "SSD" 
            "size": 0,
            "size_actual": 0,
            "size_utilized": 0,
            "size_kb": 0,
            "size_kb_actual": 0,
            "size_kb_utilized": 0,
            "num_objects": 0
        },
        {
            "name": "NVME" 
            "size": 0,
            "size_actual": 0,
            "size_utilized": 0,
            "size_kb": 0,
            "size_kb_actual": 0,
            "size_kb_utilized": 0,
            "num_objects": 0
        }]
    },
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    }
}

One would then be able to see how many objects, how much data, and other information each of the storage classes used by the objects stored in a bucket accounts for.
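
As an illustration of how that breakdown could be consumed, the sketch below reads the proposed "rgw.storage-classes" list from the bucket stats output and prices the stored capacity per class (the file name and the monthly prices are made-up placeholders):

import json

# Hypothetical monthly storage prices per GiB for each class (placeholders).
PRICE_PER_GIB_MONTH = {"STANDARD": 0.020, "STANDARD_EC": 0.012, "SSD": 0.040, "NVME": 0.080}

with open("bucket_stats.json") as f:
    stats = json.load(f)

for cls in stats["usage"].get("rgw.storage-classes", []):
    gib = cls["size_actual"] / 2**30
    cost = gib * PRICE_PER_GIB_MONTH.get(cls["name"], 0.0)
    print(f"{cls['name']}: {cls['num_objects']} objects, {gib:.2f} GiB -> {cost:.4f}/month")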

Actions #1

Updated by hoan nv about 2 years ago

What about radosgw-admin user stats?

Does it need to change?

Actions #2

Updated by hoan nv about 2 years ago

hoan nv wrote:

What about the radosgw-admin user stats command?

Does it need to change?

Actions #3

Updated by Rafael Weingartner about 2 years ago

hoan nv wrote:

hoan nv wrote:

What about the radosgw-admin user stats command?

Does it need to change?

Yes, this other API also needs changes. There is a method "RGWUserAdminOp_User::info" that implements the user API. Thanks for reminding me of that! I will amend the proposal.

Actions #4

Updated by Rafael Weingartner about 2 years ago

Introduce Storage classes in usage stats

This specification proposes the addition of storage class information to the GET "/admin/usage", GET "/bucket", and GET "/user" data. The goal of the specification is to validate the proposals presented here with the community before we implement them and create the pull requests. Any feedback is welcome here.

Problem Description

In RadosGW, one can configure different storage classes to provide different qualities of service (QoS); for instance, using HDDs to store non-critical, non-latency-sensitive information at a lower cost. On the other hand, it is also possible to use a custom storage class to store objects in NVMe storage pools to provide better response times. It is also possible to store data in erasure-coded pools, which consequently use less raw storage space and may have some impact on performance. There are many other possibilities and use cases that can be achieved with storage classes; we have mentioned only a few of them here.

One example is the following configuration (obtained with "radosgw-admin zone get").

{
    ..
    Other fields that we do not care about here
    ..

    "placement_pools": [
        {
            "key": "default-placement",
            "val": {
                "index_pool": "default.rgw.buckets.index",
                "storage_classes": {
                    "STANDARD": {
                        "data_pool": "rgw.buckets.hdd" 
                    },
                    "STANDARD_EC": {
                        "data_pool": "rgw.buckets.hdd.data_ec" 
                    },
                    "SSD": {
                        "data_pool": "rgw.buckets.ssd" 
                    },
                    "NVME": {
                        "data_pool": "rgw.buckets.nvme" 
                    }
                },
                "data_extra_pool": "default.rgw.buckets.non-ec",
                "index_type": 0
            }
        }
    ]
}

Everything works fine with respect to the RadosGW APIs for uploading objects using specific storage classes. However, there is no practical method to find out the volume of data and the number of objects in a bucket that use each of the storage classes provided by RadosGW. Below is one example of the output of the bucket stats API.

{
    "bucket": "bucket1",
    "num_shards": 11,
    "tenant": "",
    "zonegroup": "2a761aef-610c-4da1-9eff-f5287252c513",
    "placement_rule": "default-placement",
    "explicit_placement": {
        "data_pool": "",
        "data_extra_pool": "",
        "index_pool": "" 
    },
    "id": "1f878f22-f69c-486b-81de-144920efcbdb.55411.2",
    "marker": "1f878f22-f69c-486b-81de-144920efcbdb.55411.2",
    "index_type": "Normal",
    "owner": "rafael",
    "ver": "0#201,1#1,2#201,3#1,4#201,5#799,6#799,7#400,8#401,9#1,10#401",
    "master_ver": "0#0,1#0,2#0,3#0,4#0,5#0,6#0,7#0,8#0,9#0,10#0",
    "mtime": "2021-11-26T11:23:47.375664Z",
    "creation_time": "2021-11-26T11:23:47.363417Z",
    "max_marker": "0#,1#,2#,3#,4#,5#,6#,7#,8#,9#,10#",
    "usage": {
        "rgw.main": {
            "size": 9215803392,
            "size_actual": 9215803392,
            "size_utilized": 9215803392,
            "size_kb": 8999808,
            "size_kb_actual": 8999808,
            "size_kb_utilized": 8999808,
            "num_objects": 3
        },
        "rgw.multimeta": {
            "size": 0,
            "size_actual": 0,
            "size_utilized": 0,
            "size_kb": 0,
            "size_kb_actual": 0,
            "size_kb_utilized": 0,
            "num_objects": 0
        }
    },
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    }
}

As shown above, there is no indication of the storage classes used to store the data. The same situation happens with the admin usage API: there is no indication of the storage classes of the objects involved in upload/download operations to/from RadosGW. Below is an example of the output of that API.

{
    "entries": [
        {
            "buckets": [
                {
                    "bucket": "",
                    "categories": [
                        {
                            "bytes_received": 0,
                            "bytes_sent": 35015,
                            "category": "list_buckets",
                            "ops": 45,
                            "successful_ops": 45
                        }
                    ],
                    "epoch": 1609520400,
                    "owner": "72431a5a-4a34-4319-a7f5-5f150f370284",
                    "time": "2021-01-01 17:00:00.000000Z" 
                },
                {
                    "bucket": "-",
                    "categories": [
                        {
                            "bytes_received": 0,
                            "bytes_sent": 293,
                            "category": "get_bucket_policy",
                            "ops": 1,
                            "successful_ops": 0
                        }
                    ],
                    "epoch": 1609866000,
                    "owner": "72431a5a-4a34-4319-a7f5-5f150f370284",
                    "time": "2021-01-05 17:00:00.000000Z" 
                },
                {
                    "bucket": "bucket_test-1",
                    "categories": [
                        {
                            "bytes_received": 0,
                            "bytes_sent": 0,
                            "category": "create_bucket",
                            "ops": 1,
                            "successful_ops": 1
                        },
                        {
                            "bytes_received": 0,
                            "bytes_sent": 6669,
                            "category": "get_bucket_policy",
                            "ops": 27,
                            "successful_ops": 27
                        },
                        {
                            "bytes_received": 0,
                            "bytes_sent": 0,
                            "category": "put_bucket_policy",
                            "ops": 1,
                            "successful_ops": 1
                        }
                    ],
                    "epoch": 1609862400,
                    "owner": "72431a5a-4a34-4319-a7f5-5f150f370284",
                    "time": "2021-01-05 16:00:00.000000Z" 
                }
            ],
            "user": "e6bde5e6-718c-4137-8438-f99883388042" 
        }
    ],
    "summary": [
        {
            "categories": [
                {
                    "bytes_received": 0,
                    "bytes_sent": 8,
                    "category": "list_buckets",
                    "ops": 4,
                    "successful_ops": 4
                }
            ],
            "total": {
                "bytes_received": 0,
                "bytes_sent": 8,
                "ops": 4,
                "successful_ops": 4
            },
            "user": "e6bde5e6-718c-4137-8438-f99883388042" 
        }]
}

Having said all that, storage classes in RadosGW work, but RadosGW does not provide a mechanism to bill/rate stored objects, and the operations that affect them, per storage class. This has also been reported in https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/D7GDNGCJYJMDF5JP74MOGC6EYLKZ3S7S/, and later registered as a feature request via https://tracker.ceph.com/issues/47342.

Proposed Change

To address the reported problem, we propose to extend the admin usage API (/admin/usage), which exposes usage data stored in Ceph about operations (PUT/POST/DELETE/LIST, and so on). These operations are grouped and counted per user and per bucket, and every interaction with the RadosGW API generates a usage entry that is persisted in Ceph itself. To extend this, we would need to load the storage class of the object being handled (for PUT/POST/DELETE methods; only object operations would be affected), extend the usage entry to hold this new attribute, and then take it into account when aggregating the data presented in the response of the request.

One example of a response for the API with the proposed changes is the following:

{
    "entries": [
        {
            "buckets": [
                {
                    "bucket": "",
                    "categories": [
                        {
                            "bytes_received": 0,
                            "bytes_sent": 35015,
                            "category": "list_buckets",
                            "ops": 45,
                            "successful_ops": 45
                        }
                    ],
                    "epoch": 1609520400,
                    "owner": "72431a5a-4a34-4319-a7f5-5f150f370284",
                    "time": "2021-01-01 17:00:00.000000Z" 
                },
                {
                    "bucket": "-",
                    "categories": [
                        {
                            "bytes_received": 0,
                            "bytes_sent": 293,
                            "category": "get_bucket_policy",
                            "ops": 1,
                            "successful_ops": 0
                        }
                    ],
                    "epoch": 1609866000,
                    "owner": "72431a5a-4a34-4319-a7f5-5f150f370284",
                    "time": "2021-01-05 17:00:00.000000Z" 
                },
                {
                    "bucket": "bucket_test-1",
                    "categories": [
                        {
                            "bytes_received": 0,
                            "bytes_sent": 0,
                            "category": "create_bucket",
                            "ops": 1,
                            "successful_ops": 1
                        },
                        {
                            "bytes_received": 0,
                            "bytes_sent": 6669,
                            "category": "get_bucket_policy",
                            "ops": 27,
                            "successful_ops": 27
                        },
                        {
                            "bytes_received": 0,
                            "bytes_sent": 0,
                            "category": "put_bucket_policy",
                            "ops": 1,
                            "successful_ops": 1
                        }
                    ],
                    "categories-<storage-class-name>": [
                         {
                            "bytes_received": 129836684,
                            "bytes_sent": 0,
                            "category": "put_obj",
                            "ops": 323,
                            "successful_ops": 323
                        },
                        {
                            "bytes_received": 0,
                            "bytes_sent": 119318246,
                            "category": "get_obj",
                            "ops": 17956,
                            "successful_ops": 17956
                        }
                      <Many other operations here>
                                ],
                    "epoch": 1609862400,
                    "owner": "72431a5a-4a34-4319-a7f5-5f150f370284",
                    "time": "2021-01-05 16:00:00.000000Z" 
                }
            ],
            "user": "e6bde5e6-718c-4137-8438-f99883388042" 
        }
    ],
    "summary": [
        {
            "categories": [
                {
                    "bytes_received": 0,
                    "bytes_sent": 8,
                    "category": "list_buckets",
                    "ops": 4,
                    "successful_ops": 4
                }
            ],
            "categories-<storage-class-name>": [
                 {
                    "bytes_received": 129836684,
                    "bytes_sent": 0,
                    "category": "put_obj",
                    "ops": 323,
                    "successful_ops": 323
                },
                {
                    "bytes_received": 0,
                    "bytes_sent": 119318246,
                    "category": "get_obj",
                    "ops": 17956,
                    "successful_ops": 17956
                }
              <Many other operations here>
            ],
            "total": {
                "bytes_received": 0,
                "bytes_sent": 8,
                "ops": 4,
                "successful_ops": 4
            },
            "user": "e6bde5e6-718c-4137-8438-f99883388042" 
        }]
}

As one can see, new entries would be created following the pattern "categories-<storage-class-name>", holding the data for the operations that affected objects of the given storage class.

The other API, normally used via "radosgw-admin bucket stats --bucket=bucket1" (/bucket), presents the total amount of resources (objects) that a bucket holds. It is implemented by checking the objects stored in the bucket in Ceph and counting/summing them up. Objects that use different storage classes are stored in different storage pools; therefore, we can distinguish between them and provide more granular accounting based on the storage classes used in a bucket. One example of a response for the API with the proposed changes is the following:

{
    "bucket": "bucket1",
    "num_shards": 11,
    "tenant": "",
    "zonegroup": "2a761aef-610c-4da1-9eff-f5287252c513",
    "placement_rule": "default-placement",
    "explicit_placement": {
        "data_pool": "",
        "data_extra_pool": "",
        "index_pool": "" 
    },
    "id": "1f878f22-f69c-486b-81de-144920efcbdb.55411.2",
    "marker": "1f878f22-f69c-486b-81de-144920efcbdb.55411.2",
    "index_type": "Normal",
    "owner": "rafael",
    "ver": "0#201,1#1,2#201,3#1,4#201,5#799,6#799,7#400,8#401,9#1,10#401",
    "master_ver": "0#0,1#0,2#0,3#0,4#0,5#0,6#0,7#0,8#0,9#0,10#0",
    "mtime": "2021-11-26T11:23:47.375664Z",
    "creation_time": "2021-11-26T11:23:47.363417Z",
    "max_marker": "0#,1#,2#,3#,4#,5#,6#,7#,8#,9#,10#",
    "usage": {
        "rgw.main": {
            "size": 9215803392,
            "size_actual": 9215803392,
            "size_utilized": 9215803392,
            "size_kb": 8999808,
            "size_kb_actual": 8999808,
            "size_kb_utilized": 8999808,
            "num_objects": 3
        },
        "rgw.multimeta": {
            "size": 0,
            "size_actual": 0,
            "size_utilized": 0,
            "size_kb": 0,
            "size_kb_actual": 0,
            "size_kb_utilized": 0,
            "num_objects": 0
        },
        "rgw.storage-classes": [
        {
            "name": "STANDARD" 
            "size": 9215803392,
            "size_actual": 9215803392,
            "size_utilized": 9215803392,
            "size_kb": 8999808,
            "size_kb_actual": 8999808,
            "size_kb_utilized": 8999808,
            "num_objects": 3
        },
        {
            "name": "STANDARD_EC" 
            "size": 0,
            "size_actual": 0,
            "size_utilized": 0,
            "size_kb": 0,
            "size_kb_actual": 0,
            "size_kb_utilized": 0,
            "num_objects": 0
        },
        {
            "name": "SSD" 
            "size": 0,
            "size_actual": 0,
            "size_utilized": 0,
            "size_kb": 0,
            "size_kb_actual": 0,
            "size_kb_utilized": 0,
            "num_objects": 0
        },
        {
            "name": "NVME" 
            "size": 0,
            "size_actual": 0,
            "size_utilized": 0,
            "size_kb": 0,
            "size_kb_actual": 0,
            "size_kb_utilized": 0,
            "num_objects": 0
        }]
    },
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    }
}

One would then be able to see how many objects, how much data, and other information each of the storage classes used by the objects stored in a bucket accounts for.

The last API that requires an extension is the one used via "radosgw-admin user stats --uid=rafael" (GET "/user"), which presents the total amount of resources (objects) that a user has. It is implemented with the method "rgw_user_sync_all_stats", which performs the accounting of objects and volume of storage used for each user and returns the result. The goal would be to extend this method, which is already used by the "/bucket" API, to distinguish between storage classes when performing the accounting, and thus provide more granular accounting based on the storage classes used in the user's buckets. One example of a response for the API with the proposed changes is the following:


{
    "stats": {
        "size": 9215803392,
        "size_actual": 9215803392,
        "size_utilized": 0,
        "size_kb": 8999808,
        "size_kb_actual": 8999808,
        "size_kb_utilized": 0,
        "num_objects": 3
    },
    "stats.storage-classes": [
      "STANDARD": {
          "size": 9215803392,
          "size_actual": 9215803392,
          "size_utilized": 0,
          "size_kb": 8999808,
          "size_kb_actual": 8999808,
          "size_kb_utilized": 0,
          "num_objects": 3
    },
      "STANDARD_EC": {
          "size": 0,
          "size_actual": 0,
          "size_utilized": 0,
          "size_kb": 0,
          "size_kb_actual": 0,
          "size_kb_utilized": 0,
          "num_objects": 0
    },
      "SSD": {
          "size": 0,
          "size_actual": 0,
          "size_utilized": 0,
          "size_kb": 0,
          "size_kb_actual": 0,
          "size_kb_utilized": 0,
          "num_objects": 0
    },
      "NVME": {
          "size": 0,
          "size_actual": 0,
          "size_utilized": 0,
          "size_kb": 0,
          "size_kb_actual": 0,
          "size_kb_utilized": 0,
          "num_objects": 0
    }
    ],
    "last_stats_sync": "2022-01-21T11:38:50.465577Z",
    "last_stats_update": "2022-01-21T11:38:50.460830Z" 
}

The granular information would be found in the "stats.storage-classes" attribute of the response, where each storage class would have its own entry.
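
As a rough usage example, assuming the proposed output above, a script could fetch the per-user stats through the CLI and walk the new breakdown (the uid is a placeholder; "radosgw-admin user stats" and its --sync-stats option already exist today, while the "stats.storage-classes" attribute is the proposed addition):

import json
import subprocess

# Fetch per-user stats via the CLI (placeholder uid).
out = subprocess.run(
    ["radosgw-admin", "user", "stats", "--uid=rafael", "--sync-stats"],
    check=True, capture_output=True, text=True,
).stdout
stats = json.loads(out)

# Walk the proposed per-storage-class breakdown (an object keyed by class name).
for name, s in stats.get("stats.storage-classes", {}).items():
    print(f"{name}: {s['num_objects']} objects, {s['size_actual']} bytes")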

Actions #5

Updated by Rafael Weingartner about 2 years ago

I just posted a new comment with the full spec here. Sadly, I did not find a method to overwrite the initial post.

Actions #6

Updated by hoan nv over 1 year ago

Rafael Weingartner wrote:

I just posted a new comment with the full spec here. Sadly, I did not find a method to overwrite the initial post.

Do you have any updates?
Thanks.

Actions #7

Updated by Rafael Weingartner over 1 year ago

hoan nv wrote:

Rafael Weingartner wrote:

I just posted a new comment with the full spec here. Sadly, I did not find a method to overwrite the initial post.

Do you have any updates?
Thanks.

Some priorities changed on our side. We will probably only be working on this in the first quarter of 2023. Moreover, we opened two PRs in Ceph to improve some other parts of its integration with Keystone, but they do not seem to be moving forward, which is something that worries us when we think about a job as big as this one.

Actions #8

Updated by hoan nv about 1 year ago

Rafael Weingartner wrote:

hoan nv wrote:

Rafael Weingartner wrote:

I just posted a new comment with the full spec here. Sadly, I did not find a method to overwrite the initial post.

Do you have any updates?
Thanks.

Some priorities changed on our side. We will probably only be working on this in the first quarter of 2023. Moreover, we opened two PRs in Ceph to improve some other parts of its integration with Keystone, but they do not seem to be moving forward, which is something that worries us when we think about a job as big as this one.

Now it is the end of the first quarter; will this feature be implemented next quarter?
Thanks

Actions #9

Updated by Huy Nguyen 11 months ago

+1 need this feature

Actions #10

Updated by Lam Nguyen 7 months ago

Rafael Weingartner wrote:

hoan nv wrote:

Rafael Weingartner wrote:

I just posted a new comment with the full spec here. Sadly, I did not find a method to overwrite the initial post.

Do you have any updates?
Thanks.

Some priorities changed on our side. We will probably only be working on this in the first quarter of 2023. Moreover, we opened two PRs in Ceph to improve some other parts of its integration with Keystone, but they do not seem to be moving forward, which is something that worries us when we think about a job as big as this one.

Now this is the last quarter of 2023, have you got any updates?
Thanks!

Actions #11

Updated by Rafael Weingartner 7 months ago

Lam Nguyen wrote:

Rafael Weingartner wrote:

hoan nv wrote:

Rafael Weingartner wrote:

I just posted a new comment with the full spec here. Sadly, I did not find a method to overwrite the initial post.

Do you have any updates?
Thanks.

Some priorities changed on our side. We will probably only be working on this in the first quarter of 2023. Moreover, we opened two PRs in Ceph to improve some other parts of its integration with Keystone, but they do not seem to be moving forward, which is something that worries us when we think about a job as big as this one.

Now this is the last quarter of 2023, have you got any updates?
Thanks!

We have already implemented it internally and finished the validations; it will go to production soon. The push upstream, though, will not happen yet, until we can move forward with other, simpler PRs of ours that got stuck.

Actions #12

Updated by hoan nv 3 months ago

Rafael Weingartner wrote:

Lam Nguyen wrote:

Rafael Weingartner wrote:

hoan nv wrote:

Rafael Weingartner wrote:

I just posted a new comment with the full spec here. Sadly, I did not find a method to overwrite the initial post.

Do you have any updates?
Thanks.

Some priorities changed on our side. We will probably only be working on this in the first quarter of 2023. Moreover, we opened two PRs in Ceph to improve some other parts of its integration with Keystone, but they do not seem to be moving forward, which is something that worries us when we think about a job as big as this one.

Now this is the last quarter of 2023, have you got any updates?
Thanks!

We have already implemented it internally and finished the validations; it will go to production soon. The push upstream, though, will not happen yet, until we can move forward with other, simpler PRs of ours that got stuck.

I am looking forward to your PRs. After a few months, is your code OK? Do you have a plan to make these PRs public?
Thanks
