Bug #55675

[rgw-ms][dbr]: 50K objects did not sync to the secondary after doing a 20M workload on a versioned bucket.

Added by Vidushi Mishra 9 months ago. Updated 8 months ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
Target version:
% Done:

0%

Source:
Tags:
multisite-reshard
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

1. Description:

When performing a 20M object workload on a versioned bucket, we observed that 50K objects did not sync to the secondary.

2. ceph version:
17.0.0-12145-g4520f3ad (4520f3adf7d8d678c39fd74a9c4aa538e6edbb5c) quincy (dev)

3. Steps to reproduce:

a. Create a bucket 'version-lb-1' and enable bucket versioning.
b. Ensure the bucket is synced on both sites.
c. Write a 20M object workload [10M current + 10M non-current].
d. The workload is bi-directional, written from both sites simultaneously.

The workload xml is as below. The same xml is run simultaneously from primary and secondary.

<workload name="fillCluster" description="RGW testing">
<!-- Initialization -->
<storage type="s3" config="timeout=900000;accesskey=123;secretkey=123;endpoint=http://localhost:5000;path_style_access=true" retry="3"/>
<workflow>
<workstage name="preparing_cluster">
<work type="prepare" workers="400" config="cprefix=ver-lb-;containers=r(1,1);objects=r(1,10000000);sizes=c(1)KB"/>
</workstage>
</workflow>
</workload>
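
As a rough cross-check of the "10M current + 10M non-current" figure, the sketch below reads the object range from the prepare stage and works out the expected version counts. It assumes both sites write the same key names exactly once, so every key ends up with one current and one non-current version. The function name `expected_versions`, the `sites` parameter, and the trimmed `WORKLOAD` string (storage element omitted) are illustrative only and not part of COSBench.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed copy of the workload above (storage element omitted for brevity).
WORKLOAD = """
<workload name="fillCluster" description="RGW testing">
<workflow>
<workstage name="preparing_cluster">
<work type="prepare" workers="400" config="cprefix=ver-lb-;containers=r(1,1);objects=r(1,10000000);sizes=c(1)KB"/>
</workstage>
</workflow>
</workload>
"""

def expected_versions(workload_xml, sites=2):
    """Estimate version counts when the same prepare stage runs from `sites` zones.

    Assumption: every site writes each key once under the same name, so the
    last write is the current version and the others become non-current.
    """
    root = ET.fromstring(workload_xml)
    work = root.find(".//work[@type='prepare']")
    # The config attribute is a ';'-separated list of key=value pairs.
    config = dict(item.split("=", 1) for item in work.get("config").split(";"))
    first, last = map(int, re.fullmatch(r"r\((\d+),(\d+)\)", config["objects"]).groups())
    keys = last - first + 1
    return {
        "keys": keys,
        "current": keys,
        "noncurrent": keys * (sites - 1),
        "total": keys * sites,
    }

print(expected_versions(WORKLOAD))
```

Under those assumptions the total comes to 20,000,000 versions, which is the number to compare against on each site once sync has caught up.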

4. Multisite configuration used:

Multisite Configuration = LB for ms-sync
Total RGW daemons per zone = 14
sync endpoints = 4 rgws behind 1 Load balancer per zone.
IO via = HAProxy (client IO for 10 rgws)
Object size = small (1-10 KB)
Object PUT = bi-directional on same bucket
objects per bucket = 20M (10M from each site simultaneously)
Cluster utilization < 40% of total cluster size
IO tool = Cosbench
Configs set on all rgws:
rgw_data_notify_interval_msec=0
debug_ms 0
debug_rgw 5
debug_rgw_sync 20

History

#1 Updated by Vidushi Mishra 9 months ago

In the output of 'sync error list', we observe errors like:

- "message": "failed to sync bucket instance: (125) Operation canceled" and
- "message": "failed to sync object(2300) Unknown error 2300"
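
To see how often each of these messages occurs, a small sketch like the one below could tally every "message" field in the JSON printed by `radosgw-admin sync error list`. It walks the document recursively instead of relying on a particular layout, because the exact nesting is an assumption here. The `SAMPLE` structure is hypothetical and only reuses the two messages quoted above.

```python
import json
from collections import Counter

def tally_messages(node, counts=None):
    """Recursively count every string value stored under a "message" key."""
    if counts is None:
        counts = Counter()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "message" and isinstance(value, str):
                counts[value] += 1
            else:
                tally_messages(value, counts)
    elif isinstance(node, list):
        for item in node:
            tally_messages(item, counts)
    return counts

# Hypothetical sample; the real nesting of the command output may differ.
SAMPLE = json.loads("""
[
  {"entries": [
    {"info": {"message": "failed to sync bucket instance: (125) Operation canceled"}},
    {"info": {"message": "failed to sync object(2300) Unknown error 2300"}},
    {"info": {"message": "failed to sync object(2300) Unknown error 2300"}}
  ]}
]
""")

for message, count in tally_messages(SAMPLE).most_common():
    print(count, message)
```

On a real cluster the input would come from something like `json.load(open("errors.json"))` after redirecting the command output to a file.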

#2 Updated by Mark Kogan 9 months ago

  • Assignee set to Mark Kogan

#3 Updated by Mark Kogan 8 months ago

Attempted to reproduce on a high-performance setup with a balancer, without success.

The flow was as follows:
a. Write 10M objects bi-directionally (5M <-> 5M).
b. Wait for sync to complete (caught up).
c. Enable versioning.
d. Write the same 10M objects again bi-directionally (5M <-> 5M).


>> bi-dir 5M * 2 = 10M
// terminal #1
nice numactl -N 0 -m 0 -- ~/go/bin/hsbench -a b2345678901234567890 -s b234567890123456789012345678901234567890 -u http://127.0.0.1:8000 -z 4K -d -1 -t  $(( $(numactl -N 0 -- nproc) / 2 )) -b 1 -n 5000000 -m p -bp b01b -op useast |& tee hsbench.log | stdbuf -oL -eL colrm 170 | ccze -Aonolookups
// terminal #2
nice numactl -N 1 -m 1 -- ~/go/bin/hsbench -a b2345678901234567890 -s b234567890123456789012345678901234567890 -u http://127.0.0.1:8002 -z 4K -d -1 -t  $(( $(numactl -N 1 -- nproc) / 2 )) -b 1 -n 5000000 -m p -bp b01b -op uswest |& tee hsbench.log | stdbuf -oL -eL colrm 170 | ccze -Aonolookups

# wait for 'caught up'

## enable versioning on master side:
aws --endpoint-url http://127.0.0.1:8000 s3api put-bucket-versioning --bucket b01b000000000000 --versioning-configuration Status=Enabled
aws --endpoint-url http://127.0.0.1:8000 s3api get-bucket-versioning --bucket b01b000000000000
    "Status": "Enabled",
aws --endpoint-url http://127.0.0.1:8002 s3api get-bucket-versioning --bucket b01b000000000000
    "Status": "Enabled",

>> bi-dir 5M * 2 = 10M - version 2 of objs
// terminal #1
nice numactl -N 0 -m 0 -- ~/go/bin/hsbench -a b2345678901234567890 -s b234567890123456789012345678901234567890 -u http://127.0.0.1:8000 -z 4K -d -1 -t  $(( $(numactl -N 0 -- nproc) / 2 )) -b 1 -n 5000000 -m p -bp b01b -op useast |& tee hsbench.log | stdbuf -oL -eL colrm 170 | ccze -Aonolookups
// terminal #2
nice numactl -N 1 -m 1 -- ~/go/bin/hsbench -a b2345678901234567890 -s b234567890123456789012345678901234567890 -u http://127.0.0.1:8002 -z 4K -d -1 -t  $(( $(numactl -N 1 -- nproc) / 2 )) -b 1 -n 5000000 -m p -bp b01b -op uswest |& tee hsbench.log | stdbuf -oL -eL colrm 170 | ccze -Aonolookups

zonegroup features enabled: resharding 
  metadata sync no sync (zone is master) 
      data sync source: 3a55d235-1d0e-4987-92bd-8c809e2431f3 (us-west) 
                        syncing 
                        full sync: 0/128 shards 
                        incremental sync: 128/128 shards 
                        data is caught up with source 
POOL_NAME                     USED   OBJECTS  CLONES    COPIES  MISSING_ON_PRIMARY  UNFOUND  DEGRADED     RD_OPS       RD     WR_OPS       WR  USED COMPR  UNDER COMPR 
.rgw.root                   80 KiB        21       0        21                   0        0         0     713438  743 MiB         45   35 KiB         0 B          0 B 
default.rgw.buckets.data       0 B         0       0         0                   0        0         0          4    5 KiB         14   23 KiB         0 B          0 B 
default.rgw.buckets.index      0 B        11       0        11                   0        0         0         10   10 KiB         17    3 KiB         0 B          0 B                     
default.rgw.control            0 B         8       0         8                   0        0         0          0      0 B          0      0 B         0 B          0 B                     
default.rgw.log            136 KiB       177       0       177                   0        0         0       1740  1.5 MiB        867   34 KiB         0 B          0 B                     
default.rgw.meta            79 KiB        25       0        25                   0        0         0         70   54 KiB         46   23 KiB         0 B          0 B                     
us-east.rgw.buckets.data    76 GiB  20000000       0  20000000                   0        0         0   80005562   67 GiB  290000754   76 GiB         0 B          0 B 

zonegroup features enabled: resharding 
  metadata sync syncing 
                full sync: 0/64 shards 
                incremental sync: 64/64 shards 
                metadata is caught up with master 
      data sync source: 6164cba2-4ff9-4485-b021-fb7875aff356 (us-east) 
                        syncing 
                        full sync: 0/128 shards 
                        incremental sync: 128/128 shards 
                        data is caught up with source 
POOL_NAME                     USED   OBJECTS  CLONES    COPIES  MISSING_ON_PRIMARY  UNFOUND  DEGRADED     RD_OPS       RD     WR_OPS       WR  USED COMPR  UNDER COMPR                     
.rgw.root                   80 KiB        21       0        21                   0        0         0     697805  727 MiB         43   37 KiB         0 B          0 B                     
default.rgw.control            0 B         8       0         8                   0        0         0          0      0 B          0      0 B         0 B          0 B                     
default.rgw.log            136 KiB       177       0       177                   0        0         0       1324  1.1 MiB        666   34 KiB         0 B          0 B                     
default.rgw.meta            59 KiB        19       0        19                   0        0         0         40   30 KiB         27   15 KiB         0 B          0 B                     
us-west.rgw.buckets.data    76 GiB  20000000       0  20000000                   0        0         0   80004240   67 GiB  290000034   76 GiB         0 B          0 B                     
us-west.rgw.buckets.index   19 GiB      1103       0      1103                   0        0         0  211866136  227 GiB  201048440  163 GiB         0 B          0 B                     
us-west.rgw.control            0 B         8       0         8                   0        0         0          0      0 B          0      0 B         0 B          0 B                     
us-west.rgw.log            221 MiB      2057       0      2057                   0        0         0   28389217   50 GiB    5176947  2.6 GiB         0 B          0 B                     
us-west.rgw.meta            28 KiB         9       0         9                   0        0         0     608709  503 MiB        216   66 KiB         0 B          0 B 
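
For scripting the "wait for 'caught up'" step, a minimal sketch like the following could inspect the text of `radosgw-admin sync status`. It assumes the output keeps the wording shown above ("full sync: N/M shards", "data sync source:", "data is caught up with source"). The function name `summarize_sync_status` is illustrative.

```python
import re

FULL_SYNC = re.compile(r"full sync:\s*(\d+)/(\d+) shards")

def summarize_sync_status(text):
    """Summarize sync status text: shards still in full sync and caught-up sources."""
    # Sums metadata and data shards that are still in full sync.
    full_sync_shards = sum(int(m.group(1)) for m in FULL_SYNC.finditer(text))
    sources = text.count("data sync source:")
    caught_up = text.count("data is caught up with source")
    return {
        "full_sync_shards": full_sync_shards,
        "data_sources": sources,
        "sources_caught_up": caught_up,
        # Done only if every listed source reports caught up and no full sync remains.
        "data_caught_up": sources > 0 and caught_up == sources and full_sync_shards == 0,
    }

# Status text copied from the secondary zone output above.
STATUS = """
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is caught up with master
      data sync source: 6164cba2-4ff9-4485-b021-fb7875aff356 (us-east)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source
"""

print(summarize_sync_status(STATUS))
```

A polling loop would call this on fresh output until `data_caught_up` is true on both zones before starting the next write phase.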

aws s3api --endpoint-url http://127.0.0.1:8000 list-object-versions --bucket b01b000000000000 --prefix useast000000000000 | jq
aws s3api --endpoint-url http://127.0.0.1:8002 list-object-versions --bucket b01b000000000000 --prefix useast000000000000 | jq
{
    "Versions": [
        {
            "ETag": "\"883abc8623528daf10b7e1bb1b2774ba\"",
            "Size": 4096,
            "StorageClass": "STANDARD",
            "Key": "useast000000000000",
            "VersionId": "ZRc.HYV5qSnWFFbG8r0Si6CVBsKq-p-",
            "IsLatest": true,
            "LastModified": "2022-05-22T16:58:49.415Z",
            "Owner": {
                "DisplayName": "cosbench_user",
                "ID": "cosbench" 
            }
        },
        {
            "ETag": "\"883abc8623528daf10b7e1bb1b2774ba\"",
            "Size": 4096,
            "StorageClass": "STANDARD",
            "Key": "useast000000000000",
            "VersionId": "null",
            "IsLatest": false,
            "LastModified": "2022-05-22T15:57:59.130Z",
            "Owner": {
                "DisplayName": "cosbench_user",
                "ID": "cosbench" 
            }
        }
    ]
}

aws s3api --endpoint-url http://127.0.0.1:8000 list-object-versions --bucket b01b000000000000 --prefix uswest000000000000 | jq
aws s3api --endpoint-url http://127.0.0.1:8002 list-object-versions --bucket b01b000000000000 --prefix uswest000000000000 | jq
{
  "Versions": [
    {
      "ETag": "\"883abc8623528daf10b7e1bb1b2774ba\"",
      "Size": 4096,
      "StorageClass": "STANDARD",
      "Key": "uswest000000000000",
      "VersionId": "Hz0qVesP1xU7s91EQV7FCvs4yMXU384",
      "IsLatest": true,
      "LastModified": "2022-05-22T16:58:53.526Z",
      "Owner": {
        "DisplayName": "cosbench_user",
        "ID": "cosbench" 
      }
    },
    {
      "ETag": "\"883abc8623528daf10b7e1bb1b2774ba\"",
      "Size": 4096,
      "StorageClass": "STANDARD",
      "Key": "uswest000000000000",
      "VersionId": "null",
      "IsLatest": false,
      "LastModified": "2022-05-22T15:58:19.344Z",
      "Owner": {
        "DisplayName": "cosbench_user",
        "ID": "cosbench" 
      }
    }
  ]
}
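
The spot checks above compare one key per prefix by eye. A sketch of the same comparison done programmatically is below: it takes the parsed JSON of `list-object-versions` from each endpoint and reports (Key, VersionId) pairs present on only one side. It only looks at the "Versions" array (delete markers are ignored), and the function names are illustrative. The sample listings reuse the key and version IDs shown above, with one version removed from the second listing purely to demonstrate a mismatch.

```python
def version_set(listing):
    """Collect (Key, VersionId) pairs from a list-object-versions response."""
    return {(v["Key"], v["VersionId"]) for v in listing.get("Versions", [])}

def diff_versions(primary, secondary):
    """Return the versions that exist on one site but not on the other."""
    p, s = version_set(primary), version_set(secondary)
    return {
        "missing_on_secondary": sorted(p - s),
        "missing_on_primary": sorted(s - p),
    }

# Listing as returned by both endpoints in the comment above (fields trimmed).
primary_listing = {
    "Versions": [
        {"Key": "useast000000000000", "VersionId": "ZRc.HYV5qSnWFFbG8r0Si6CVBsKq-p-"},
        {"Key": "useast000000000000", "VersionId": "null"},
    ]
}
# Hypothetical secondary listing with the newer version removed, for illustration.
secondary_listing = {
    "Versions": [
        {"Key": "useast000000000000", "VersionId": "null"},
    ]
}

print(diff_versions(primary_listing, primary_listing))
print(diff_versions(primary_listing, secondary_listing))
```

In the run reported here both endpoints returned identical listings, so both lists would be empty.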

#4 Updated by Vidushi Mishra 8 months ago

Hi Mark,

We tried a bi-directional workload of 20M objects on a versioned bucket.

Please find the steps below.

1. create a bucket 'version-lb-1'. Enable bucket versioning.
2. Ensure the bucket is synced on both sites.
3. Write a 20M object workload [10M current + 10M non-current].
4. The workload is bi-directional, written from both sites simultaneously.

We are trying to reproduce with the same steps on ceph version 17.0.0-12145-g4520f3ad.

#5 Updated by Vidushi Mishra 8 months ago

1. Test: Bi-directional sync on a versioned bucket with a workload of 20M objects.

2. Result: Not seeing sync issue

3. Steps: Mentioned in the description

4. Iterations: 2

5. ceph version : 17.0.0-12762-g63f84c50 (63f84c50e0851d456fc38b3330945c54162dd544) quincy (dev)

6. However, 'radosgw-admin bucket stats --bucket <bucket_name>' reports an incorrect number of objects.

========= primary =========

[root@magna051 ~]# date; radosgw-admin bucket stats --bucket version-2 | grep num_objects
Mon Jun 6 04:02:10 UTC 2022
"num_objects": 19999968
[root@magna051 ~]# date; radosgw-admin bucket stats --bucket version-2 | grep num_shards
Mon Jun 6 04:02:15 UTC 2022
"num_shards": 853,
[root@magna051 ~]#

========= secondary =========

[root@magna121 ~]# date; radosgw-admin bucket stats --bucket version-2 | grep num_objects
Mon Jun 6 04:01:43 UTC 2022
"num_objects": 19999991
[root@magna121 ~]# date; radosgw-admin bucket stats --bucket version-2 | grep num_shards
Mon Jun 6 04:01:59 UTC 2022
"num_shards": 839,
[root@magna121 ~]#
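
To put numbers on the mismatch, the sketch below extracts `num_objects` from the two grep lines above and compares them with the 20M objects the workload is expected to produce. The expected value is taken from the steps in the description and is an assumption about what `bucket stats` should report. The helper name `num_objects` is illustrative.

```python
import re

EXPECTED = 20_000_000  # assumed target: 20M objects per the description

def num_objects(stats_text):
    """Extract the num_objects value from bucket stats output."""
    match = re.search(r'"num_objects":\s*(\d+)', stats_text)
    if match is None:
        raise ValueError("num_objects not found")
    return int(match.group(1))

primary = num_objects('"num_objects": 19999968')
secondary = num_objects('"num_objects": 19999991')

shortfall = {
    "primary": EXPECTED - primary,
    "secondary": EXPECTED - secondary,
    "between_sites": abs(primary - secondary),
}
print(shortfall)
```

With the values above, the primary reports 32 fewer objects than expected, the secondary 9 fewer, and the two sites differ from each other by 23.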

#6 Updated by Casey Bodley 8 months ago

  • Status changed from New to Can't reproduce
