Bug #55675
[rgw-ms][dbr]: 50K objects did not sync to the secondary after a 20M-object workload on a versioned bucket.
Description
1. Description:
When performing a 20M object workload on a versioned bucket, we observed that 50K objects did not sync to the secondary.
2. ceph version:
17.0.0-12145-g4520f3ad (4520f3adf7d8d678c39fd74a9c4aa538e6edbb5c) quincy (dev)
3. Steps to reproduce:
a. Create a bucket 'version-lb-1' and enable bucket versioning (see the sketch after these steps).
b. Ensure the bucket is synced on both sites.
c. Write a 20M-object workload [10M current + 10M non-current].
d. The workload is bi-directional, written from both sites simultaneously.
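For reference, a minimal sketch of steps a-b with the AWS CLI; the endpoint URLs are placeholders, not taken from this report:

# step a: create the bucket and enable versioning (endpoints are placeholders)
aws --endpoint-url http://primary-lb:5000 s3api create-bucket --bucket version-lb-1
aws --endpoint-url http://primary-lb:5000 s3api put-bucket-versioning \
    --bucket version-lb-1 --versioning-configuration Status=Enabled
# step b: confirm the bucket metadata (including versioning state) reached the other site
aws --endpoint-url http://secondary-lb:5000 s3api get-bucket-versioning --bucket version-lb-1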
The workload XML is below; the same XML is run simultaneously from the primary and the secondary.
<workload name="fillCluster" description="RGW testing">
  <!-- Initialization -->
  <storage type="s3" config="timeout=900000;accesskey=123;secretkey=123;endpoint=http://localhost:5000;path_style_access=true" retry="3"/>
  <workflow>
    <workstage name="preparing_cluster">
      <work type="prepare" workers="400" config="cprefix=ver-lb-;containers=r(1,1);objects=r(1,10000000);sizes=c(1)KB"/>
    </workstage>
  </workflow>
</workload>
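The XML can be submitted with the COSBench CLI from each site; a minimal sketch, assuming a stock COSBench install (the workload file name is a placeholder):

# run from the COSBench install directory on each site's driver node
sh cli.sh submit fill-cluster.xml
# check progress of the running workload
sh cli.sh info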
4. Multisite configuration used:
Multisite configuration = LB for ms-sync
Total RGW daemons per zone = 14
Sync endpoints = 4 RGWs behind 1 load balancer per zone
IO via = HAProxy (client IO for 10 RGWs)
Object size = small (1-10 KB)
Object PUT = bi-directional on the same bucket
Objects per bucket = 20M (10M from each site simultaneously)
Cluster utilization < 40% of total cluster size
IO tool = Cosbench
Configs set on all RGWs (a sketch for applying them follows this list):
rgw_data_notify_interval_msec = 0
debug_ms = 0
debug_rgw = 5
debug_rgw_sync = 20
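A minimal sketch for applying these settings with 'ceph config set'; the 'client.rgw' target assumes the RGW daemons read the default client.rgw config section:

ceph config set client.rgw rgw_data_notify_interval_msec 0
ceph config set client.rgw debug_ms 0
ceph config set client.rgw debug_rgw 5
ceph config set client.rgw debug_rgw_sync 20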
History
#1 Updated by Vidushi Mishra almost 2 years ago
Via 'sync error list', we observe errors like:
- "message": "failed to sync bucket instance: (125) Operation canceled"
- "message": "failed to sync object(2300) Unknown error 2300"
#2 Updated by Mark Kogan almost 2 years ago
- Assignee set to Mark Kogan
#3 Updated by Mark Kogan almost 2 years ago
Attempted to reproduce, but was not successful (high-performance setup with a load balancer). The flow was as follows:
a. write 10M objects bi-directionally (5M <-> 5M)
b. wait for sync to complete (caught up)
c. enable versioning
d. write the same 10M objects again bi-directionally (5M <-> 5M)
>> bi-dir 5M * 2 = 10M

// terminal #1
nice numactl -N 0 -m 0 -- ~/go/bin/hsbench -a b2345678901234567890 -s b234567890123456789012345678901234567890 -u http://127.0.0.1:8000 -z 4K -d -1 -t $(( $(numactl -N 0 -- nproc) / 2 )) -b 1 -n 5000000 -m p -bp b01b -op useast |& tee hsbench.log | stdbuf -oL -eL colrm 170 | ccze -Aonolookups

// terminal #2
nice numactl -N 1 -m 1 -- ~/go/bin/hsbench -a b2345678901234567890 -s b234567890123456789012345678901234567890 -u http://127.0.0.1:8002 -z 4K -d -1 -t $(( $(numactl -N 1 -- nproc) / 2 )) -b 1 -n 5000000 -m p -bp b01b -op uswest |& tee hsbench.log | stdbuf -oL -eL colrm 170 | ccze -Aonolookups

# wait for 'caught up'

## enable versioning on master side:
aws --endpoint-url http://127.0.0.1:8000 s3api put-bucket-versioning --bucket b01b000000000000 --versioning-configuration Status=Enabled
aws --endpoint-url http://127.0.0.1:8000 s3api get-bucket-versioning --bucket b01b000000000000
    "Status": "Enabled",
aws --endpoint-url http://127.0.0.1:8002 s3api get-bucket-versioning --bucket b01b000000000000
    "Status": "Enabled",

>> bi-dir 5M * 2 = 10M - version 2 of objs

// terminal #1
nice numactl -N 0 -m 0 -- ~/go/bin/hsbench -a b2345678901234567890 -s b234567890123456789012345678901234567890 -u http://127.0.0.1:8000 -z 4K -d -1 -t $(( $(numactl -N 0 -- nproc) / 2 )) -b 1 -n 5000000 -m p -bp b01b -op useast |& tee hsbench.log | stdbuf -oL -eL colrm 170 | ccze -Aonolookups

// terminal #2
nice numactl -N 1 -m 1 -- ~/go/bin/hsbench -a b2345678901234567890 -s b234567890123456789012345678901234567890 -u http://127.0.0.1:8002 -z 4K -d -1 -t $(( $(numactl -N 1 -- nproc) / 2 )) -b 1 -n 5000000 -m p -bp b01b -op uswest |& tee hsbench.log | stdbuf -oL -eL colrm 170 | ccze -Aonolookups

zonegroup features enabled: resharding
  metadata sync no sync (zone is master)
      data sync source: 3a55d235-1d0e-4987-92bd-8c809e2431f3 (us-west)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source

POOL_NAME                  USED     OBJECTS   CLONES  COPIES    MISSING_ON_PRIMARY  UNFOUND  DEGRADED  RD_OPS    RD       WR_OPS     WR      USED COMPR  UNDER COMPR
.rgw.root                  80 KiB   21        0       21        0                   0        0         713438    743 MiB  45         35 KiB  0 B         0 B
default.rgw.buckets.data   0 B      0         0       0         0                   0        0         4         5 KiB    14         23 KiB  0 B         0 B
default.rgw.buckets.index  0 B      11        0       11        0                   0        0         10        10 KiB   17         3 KiB   0 B         0 B
default.rgw.control        0 B      8         0       8         0                   0        0         0         0 B      0          0 B     0 B         0 B
default.rgw.log            136 KiB  177       0       177       0                   0        0         1740      1.5 MiB  867        34 KiB  0 B         0 B
default.rgw.meta           79 KiB   25        0       25        0                   0        0         70        54 KiB   46         23 KiB  0 B         0 B
us-east.rgw.buckets.data   76 GiB   20000000  0       20000000  0                   0        0         80005562  67 GiB   290000754  76 GiB  0 B         0 B

zonegroup features enabled: resharding
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is caught up with master
      data sync source: 6164cba2-4ff9-4485-b021-fb7875aff356 (us-east)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source

POOL_NAME                  USED     OBJECTS   CLONES  COPIES    MISSING_ON_PRIMARY  UNFOUND  DEGRADED  RD_OPS     RD       WR_OPS     WR       USED COMPR  UNDER COMPR
.rgw.root                  80 KiB   21        0       21        0                   0        0         697805     727 MiB  43         37 KiB   0 B         0 B
default.rgw.control        0 B      8         0       8         0                   0        0         0          0 B      0          0 B      0 B         0 B
default.rgw.log            136 KiB  177       0       177       0                   0        0         1324       1.1 MiB  666        34 KiB   0 B         0 B
default.rgw.meta           59 KiB   19        0       19        0                   0        0         40         30 KiB   27         15 KiB   0 B         0 B
us-west.rgw.buckets.data   76 GiB   20000000  0       20000000  0                   0        0         80004240   67 GiB   290000034  76 GiB   0 B         0 B
us-west.rgw.buckets.index  19 GiB   1103      0       1103      0                   0        0         211866136  227 GiB  201048440  163 GiB  0 B         0 B
us-west.rgw.control        0 B      8         0       8         0                   0        0         0          0 B      0          0 B      0 B         0 B
us-west.rgw.log            221 MiB  2057      0       2057      0                   0        0         28389217   50 GiB   5176947    2.6 GiB  0 B         0 B
us-west.rgw.meta           28 KiB   9         0       9         0                   0        0         608709     503 MiB  216        66 KiB   0 B         0 B

aws s3api --endpoint-url http://127.0.0.1:8000 list-object-versions --bucket b01b000000000000 --prefix useast000000000000 | jq
aws s3api --endpoint-url http://127.0.0.1:8002 list-object-versions --bucket b01b000000000000 --prefix useast000000000000 | jq
{
  "Versions": [
    {
      "ETag": "\"883abc8623528daf10b7e1bb1b2774ba\"",
      "Size": 4096,
      "StorageClass": "STANDARD",
      "Key": "useast000000000000",
      "VersionId": "ZRc.HYV5qSnWFFbG8r0Si6CVBsKq-p-",
      "IsLatest": true,
      "LastModified": "2022-05-22T16:58:49.415Z",
      "Owner": {
        "DisplayName": "cosbench_user",
        "ID": "cosbench"
      }
    },
    {
      "ETag": "\"883abc8623528daf10b7e1bb1b2774ba\"",
      "Size": 4096,
      "StorageClass": "STANDARD",
      "Key": "useast000000000000",
      "VersionId": "null",
      "IsLatest": false,
      "LastModified": "2022-05-22T15:57:59.130Z",
      "Owner": {
        "DisplayName": "cosbench_user",
        "ID": "cosbench"
      }
    }
  ]
}

aws s3api --endpoint-url http://127.0.0.1:8000 list-object-versions --bucket b01b000000000000 --prefix uswest000000000000 | jq
aws s3api --endpoint-url http://127.0.0.1:8002 list-object-versions --bucket b01b000000000000 --prefix uswest000000000000 | jq
{
  "Versions": [
    {
      "ETag": "\"883abc8623528daf10b7e1bb1b2774ba\"",
      "Size": 4096,
      "StorageClass": "STANDARD",
      "Key": "uswest000000000000",
      "VersionId": "Hz0qVesP1xU7s91EQV7FCvs4yMXU384",
      "IsLatest": true,
      "LastModified": "2022-05-22T16:58:53.526Z",
      "Owner": {
        "DisplayName": "cosbench_user",
        "ID": "cosbench"
      }
    },
    {
      "ETag": "\"883abc8623528daf10b7e1bb1b2774ba\"",
      "Size": 4096,
      "StorageClass": "STANDARD",
      "Key": "uswest000000000000",
      "VersionId": "null",
      "IsLatest": false,
      "LastModified": "2022-05-22T15:58:19.344Z",
      "Owner": {
        "DisplayName": "cosbench_user",
        "ID": "cosbench"
      }
    }
  ]
}
#4 Updated by Vidushi Mishra almost 2 years ago
Hi Mark,
We tried a bi-directional workload of 20M objects on a versioned bucket.
Please find the steps below.
1. create a bucket 'version-lb-1'. Enable bucket versioning.
2. Ensure the bucket is synced on both sites.
3. Write a 20M object workload [10M current + 10M non-current].
4. The workload is bi-directional, written from both sites simultaneously.
We are trying to reproduce with the same steps on ceph version 17.0.0-12145-g4520f3ad.
#5 Updated by Vidushi Mishra almost 2 years ago
1. Test: Bi-directional sync on a versioned bucket with a workload of 20M objects.
2. Result: the sync issue was not seen.
3. Steps: Mentioned in the description
4. Iterations: 2
5. ceph version : 17.0.0-12762-g63f84c50 (63f84c50e0851d456fc38b3330945c54162dd544) quincy (dev)
6. However, 'radosgw-admin bucket stats --bucket <bucket_name>' reports an incorrect number of objects (see the counts below and the comparison sketch after them).
=========
primary
=========
[root@magna051 ~]# date; radosgw-admin bucket stats --bucket version-2 | grep num_objects
Mon Jun 6 04:02:10 UTC 2022
"num_objects": 19999968
[root@magna051 ~]# date; radosgw-admin bucket stats --bucket version-2 | grep num_shards
Mon Jun 6 04:02:15 UTC 2022
"num_shards": 853,
[root@magna051 ~]#
===========
secondary
===========
[root@magna121 ~]# date; radosgw-admin bucket stats --bucket version-2 | grep num_objects
Mon Jun 6 04:01:43 UTC 2022
"num_objects": 19999991
[root@magna121 ~]# date; radosgw-admin bucket stats --bucket version-2 | grep num_shards
Mon Jun 6 04:01:59 UTC 2022
"num_shards": 839,
[root@magna121 ~]#
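A minimal sketch for comparing the two counts in one step; the hostnames are the magna nodes above, and the jq path assumes the usual bucket-stats JSON layout:

# query both sites and print the per-site object counts side by side
for host in magna051 magna121; do
  echo -n "$host num_objects: "
  ssh root@$host radosgw-admin bucket stats --bucket version-2 \
    | jq '.usage."rgw.main".num_objects'
done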
#6 Updated by Casey Bodley almost 2 years ago
- Status changed from New to Can't reproduce