Bug #55390

closed

rgw-ms/resharding: Observing sync inconsistencies: ~50K out of 20M objects did not sync.

Added by Vidushi Mishra about 2 years ago. Updated almost 2 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Target version:
% Done:

0%

Source:
Q/A
Tags:
multisite-reshard
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Files

55390_ sync error list - sec (6.08 KB) - sync error list on the secondary. Vidushi Mishra, 04/21/2022 07:09 AM
Actions #1

Updated by Vidushi Mishra about 2 years ago

1. ceph version 17.0.0-10783-ge38464a1 (e38464a10ae9e8c7b43bae5a9a7395eb2cbb2444) quincy (dev)

2. Steps to reproduce:

i. Create a multi-site configuration with 14 RGWs on each site [4 for multisite sync and 10 for client I/O.]
ii. The 4 RGWs dedicated to multisite sync are not behind any load balancer.
iii. Create a bucket 'test-sync-no-lb-1' and upload 20M objects [10M from each site.]
iv. Wait for the workload to complete.
v. Monitor sync and wait for it to complete.
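The monitoring in step v can be sketched with the usual radosgw-admin subcommands. The bucket name comes from this report; which zone each command is run against is an assumption:

```shell
# Overall multisite sync status, run on the secondary zone.
radosgw-admin sync status

# Per-bucket sync status for the bucket under test.
radosgw-admin bucket sync status --bucket=test-sync-no-lb-1

# Compare object counts across sites; num_objects should converge
# to 20M on both zones once sync completes.
radosgw-admin bucket stats --bucket=test-sync-no-lb-1
```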

3. Result:

i. We observe 411672 objects not synced to the secondary.
ii. 'radosgw-admin sync status' on the secondary site reports 128 shards recovering.
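The "128 shards recovering" figure is read from the sync status output. As a minimal sketch, a parser for that line; the sample excerpt is an assumption modeled on the usual 'radosgw-admin sync status' output format, not taken from this cluster:

```python
import re

def count_recovering_shards(sync_status_output: str) -> int:
    """Return the recovering-shard count reported by
    'radosgw-admin sync status', or 0 if no such line appears."""
    m = re.search(r"(\d+)\s+shards\s+are\s+recovering", sync_status_output)
    return int(m.group(1)) if m else 0

# Hypothetical excerpt in the usual output format:
sample = """
  data sync source: 1234abcd (secondary)
        syncing
        full sync: 0/128 shards
        incremental sync: 128/128 shards
        data is behind on 128 shards
        128 shards are recovering
        recovering shards: [0,1,2,3]
"""
print(count_recovering_shards(sample))  # -> 128
```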

4. Additional info:

i. On both sites, 'ceph -s' shows PGs in the backfilling state.
ii. logs:
- period get: http://magna002.ceph.redhat.com/ceph-qe-logs/vidushi/upstream-dbr-2022/55390/period-get
- primary bucket stats: http://magna002.ceph.redhat.com/ceph-qe-logs/vidushi/upstream-dbr-2022/55390/bucket-stats-pri
- secondary bucket stats: http://magna002.ceph.redhat.com/ceph-qe-logs/vidushi/upstream-dbr-2022/55390/bucket-stats-sec
- ceph status (secondary): http://magna002.ceph.redhat.com/ceph-qe-logs/vidushi/upstream-dbr-2022/55390/ceph-s_sec

Actions #3

Updated by Vidushi Mishra about 2 years ago

We see errors like "failed to sync object (2300) Unknown error 2300" in the 'radosgw-admin sync error list' output on the secondary site.
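For reference, a sketch of inspecting the sync error log on the secondary; both subcommands exist in radosgw-admin, though whether trimming is appropriate here depends on the investigation:

```shell
# Dump the sync error log on the secondary site.
radosgw-admin sync error list

# After the underlying issue is resolved, old entries can be cleared.
radosgw-admin sync error trim
```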

Actions #4

Updated by Casey Bodley almost 2 years ago

  • Status changed from New to Can't reproduce

If the cluster isn't healthy, we can't really treat this as an RGW bug.
