Bug #62196 (closed)

multisite sync fairness: "sync status" in I/O error

Added by Tejas C 9 months ago. Updated 7 months ago.

Status: Duplicate
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ceph version
ceph version 18.0.0-5151-gf82b9942 (f82b9942d6dc16ef3b57c7b0c551cde2e85f4a81) reef (dev)

Steps:
1. Set up multisite with 1 RGW sync daemon per site; 20k objects written bidirectionally to 2 buckets.
2. Add 4 RGW sync daemons on each site (see the command sketch after this list).
3. Write another 20k objects bidirectionally to the same 2 buckets.
4. During sync on the secondary, "sync status" goes into "I/O error", yet the secondary continues to sync at a very slow pace. All RGW daemons are up on both sites, but the sync status remains at the I/O error.
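
A minimal sketch of the scale-up in step 2, assuming cephadm orchestration; the service name and hosts mirror the ceph orch ls output further below, and the exact spec is an assumption rather than a command taken from the report:

# Assumed command; the equivalent would be run on the secondary cluster with its own hosts.
~]# ceph orch apply rgw foo --port=80 \
      --placement="ceph-pri-tj1-jp7gab-node2 ceph-pri-tj1-jp7gab-node3 ceph-pri-tj1-jp7gab-node4 ceph-pri-tj1-jp7gab-node6"

The sync status below was captured on the secondary after step 4.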

~]# radosgw-admin sync status
          realm 908031d2-c71b-45d9-8a1b-bb85b738f316 (india)
      zonegroup 4e8919b1-6fc1-4cc4-bc0c-1abcc2172ff8 (shared)
           zone 0564223f-2006-49be-bee5-56adecaeb89e (secondary)
   current time 2023-07-27T10:07:38Z
zonegroup features enabled: resharding
                   disabled: compress-encrypted
  metadata sync syncing
                full sync: 0/64 shards
                failed to fetch master sync status: (5) Input/output error
      data sync source: 59b35d7a-a431-4922-bada-615643c529f2 (primary)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
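
The "(5) Input/output error" above is reported while fetching the master's metadata sync status. Not part of the original report, but a few standard commands that may help narrow down where that fetch fails:

# Errors recorded by the sync machinery on the secondary.
~]# radosgw-admin sync error list

# Confirm all RGW daemons on both clusters are actually running.
~]# ceph orch ps --daemon-type rgw

# Inspect the endpoints the secondary holds for the master zone/zonegroup.
~]# radosgw-admin zonegroup get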

Objects synced on both sites:
~]# radosgw-admin bucket stats | grep num
"num_shards": 11,
"num_objects": 40000
"num_shards": 11,
"num_objects": 40000

Primary (PRI):
~]# ceph orch ls
NAME PORTS RUNNING REFRESHED AGE PLACEMENT
....
prometheus ?:9095 1/1 - 27h count:1
rgw.foo ?:80 4/4 - 20h ceph-pri-tj1-jp7gab-node2;ceph-pri-tj1-jp7gab-node3;ceph-pri-tj1-jp7gab-node4;ceph-pri-tj1-jp7gab-node6;count:4
rgw.shared.pri ?:80 1/1 - 27h ceph-pri-tj1-jp7gab-node5

~]# radosgw-admin sync status
          realm 908031d2-c71b-45d9-8a1b-bb85b738f316 (india)
      zonegroup 4e8919b1-6fc1-4cc4-bc0c-1abcc2172ff8 (shared)
           zone 59b35d7a-a431-4922-bada-615643c529f2 (primary)
   current time 2023-07-27T10:14:05Z
zonegroup features enabled: resharding
                   disabled: compress-encrypted
  metadata sync no sync (zone is master)
      data sync source: 0564223f-2006-49be-bee5-56adecaeb89e (secondary)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source

Secondary (SEC):
~]# ceph orch ls
NAME PORTS RUNNING REFRESHED AGE PLACEMENT
....
rgw.foo ?:80 4/4 - 20h ceph-sec-tj1-jp7gab-node2;ceph-sec-tj1-jp7gab-node3;ceph-sec-tj1-jp7gab-node4;ceph-sec-tj1-jp7gab-node6;count:4
rgw.shared.sec ?:80 1/1 - 27h ceph-sec-tj1-jp7gab-node5


Related issues: 1 (0 open, 1 closed)

Is duplicate of rgw - Bug #53029: radosgw-admin fails on "sync status" if a single RGW process is down (Resolved, Jane Zhu)
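
Given the root cause in the linked ticket (the "sync status" query fails if any RGW endpoint is unreachable), a quick reachability probe against each primary RGW endpoint could confirm whether the same condition applies here; the hostnames and port 80 below are taken from the ceph orch ls output above and are assumptions about the zonegroup endpoints, not details from this report:

# Probe each primary RGW host from the secondary site and print the HTTP status code.
~]# for h in ceph-pri-tj1-jp7gab-node2 ceph-pri-tj1-jp7gab-node3 \
             ceph-pri-tj1-jp7gab-node4 ceph-pri-tj1-jp7gab-node6; do
      curl -s -o /dev/null -w "$h %{http_code}\n" "http://$h:80/"
    done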

#1 - Updated by Shilpa MJ 9 months ago

  • Assignee set to Shilpa MJ

#2 - Updated by Ilya Dryomov 9 months ago

  • Target version deleted (v18.2.0)

#3 - Updated by Casey Bodley 7 months ago

  • Is duplicate of Bug #53029: radosgw-admin fails on "sync status" if a single RGW process is down added

#4 - Updated by Casey Bodley 7 months ago

  • Status changed from New to Duplicate