Bug #53029 (closed)

radosgw-admin fails on "sync status" if a single RGW process is down

Added by David Piper over 2 years ago. Updated 17 days ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source:
Tags: multisite multisite-backlog
Backport: reef
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We're using ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable) in a containerized deployment.
We have two RGW zones in the same zonegroup.
Each zone is hosted in a separate ceph cluster, and has four RGW endpoints.
The master zone's endpoints are configured as endpoints for the zonegroup.
(We're also using pubsub zones but I don't think this is related.)

When a single RGW endpoint in the master zone is stopped or crashes, the 'radosgw-admin sync status' command returns an error on the cluster hosting the non-master zone:

[qs-admin@newbrunswick0 ~]$ radosgw-admin sync status
+ sudo docker ps --filter name=ceph-rgw-.*rgw -q
+ sudo docker exec aa87acb445c5 radosgw-admin
          realm 9d76aa86-99d1-41c3-966f-cc97eab2bfb3 (geored_realm)
      zonegroup 384c36ac-374b-4ae2-bf9f-ae951f25920a (geored_zg)
           zone b113b104-9c84-44ff-9058-4658c6e1df52 (siteB)
  metadata sync syncing
                full sync: 0/64 shards
                failed to fetch master sync status: (5) Input/output error
      data sync source: 0bbdd7ae-6e2a-4ad0-996b-5f0ed38443c1 (siteA)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                source: 9be18697-7423-41a7-a338-926aa938f9de (siteBpubsub)
                        not syncing from zone
                source: a2a5b39a-3df5-4be3-9270-68bf90bc2a51 (siteApubsub)
                        not syncing from zone

This is easy to reproduce by stopping any of the RGW containers in the master zone. As far as we can tell, sync still takes place. Once the container is restarted, the sync status command returns normally again.

[qs-admin@newbrunswick0 ~]$ radosgw-admin sync status
+ sudo docker ps --filter name=ceph-rgw-.*rgw -q
+ sudo docker exec aa87acb445c5 radosgw-admin
          realm 9d76aa86-99d1-41c3-966f-cc97eab2bfb3 (geored_realm)
      zonegroup 384c36ac-374b-4ae2-bf9f-ae951f25920a (geored_zg)
           zone b113b104-9c84-44ff-9058-4658c6e1df52 (siteB)
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is caught up with master
      data sync source: 0bbdd7ae-6e2a-4ad0-996b-5f0ed38443c1 (siteA)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is behind on 3 shards
                        behind shards: [90,101,107]
                        oldest incremental change not applied: 2021-10-25T13:59:52.007974+0000 [90]
                        6 shards are recovering
                        recovering shards: [2,3,54,57,107,116]
                source: 9be18697-7423-41a7-a338-926aa938f9de (siteBpubsub)
                        not syncing from zone
                source: a2a5b39a-3df5-4be3-9270-68bf90bc2a51 (siteApubsub)
                        not syncing from zone

Unless we have misconfigured something, this feels like a bug: shouldn't the other RGW endpoints still be suitable for reporting sync status?

RGW config:

(newbrunswick0 = 10.245.0.40)

[qs-admin@newbrunswick0 ~]$ radosgw-admin zonegroup get
+ sudo docker ps --filter name=ceph-rgw-.*rgw -q
+ sudo docker exec aa87acb445c5 radosgw-admin
{
    "id": "384c36ac-374b-4ae2-bf9f-ae951f25920a",
    "name": "geored_zg",
    "api_name": "geored_zg",
    "is_master": "true",
    "endpoints": [
        "https://10.245.0.20:7480",
        "https://10.245.0.21:7480",
        "https://10.245.0.22:7480",
        "https://10.245.0.23:7480"
    ],
    "hostnames": [],
    "hostnames_s3website": [],
    "master_zone": "0bbdd7ae-6e2a-4ad0-996b-5f0ed38443c1",
    "zones": [
        {
            "id": "0bbdd7ae-6e2a-4ad0-996b-5f0ed38443c1",
            "name": "siteA",
            "endpoints": [
                "https://10.245.0.20:7480",
                "https://10.245.0.21:7480",
                "https://10.245.0.22:7480",
                "https://10.245.0.23:7480"
            ],
            "log_meta": "false",
            "log_data": "true",
            "bucket_index_max_shards": 0,
            "read_only": "false",
            "tier_type": "",
            "sync_from_all": "true",
            "sync_from": [],
            "redirect_zone": ""
        },
        {
            "id": "9be18697-7423-41a7-a338-926aa938f9de",
            "name": "siteBpubsub",
            "endpoints": [
                "https://10.245.0.40:7481",
                "https://10.245.0.41:7481",
                "https://10.245.0.42:7481",
                "https://10.245.0.43:7481"
            ],
            "log_meta": "false",
            "log_data": "true",
            "bucket_index_max_shards": 0,
            "read_only": "false",
            "tier_type": "pubsub",
            "sync_from_all": "false",
            "sync_from": [
                "siteB"
            ],
            "redirect_zone": ""
        },
        {
            "id": "a2a5b39a-3df5-4be3-9270-68bf90bc2a51",
            "name": "siteApubsub",
            "endpoints": [
                "https://10.245.0.20:7481",
                "https://10.245.0.21:7481",
                "https://10.245.0.22:7481",
                "https://10.245.0.23:7481"
            ],
            "log_meta": "false",
            "log_data": "true",
            "bucket_index_max_shards": 0,
            "read_only": "false",
            "tier_type": "pubsub",
            "sync_from_all": "false",
            "sync_from": [
                "siteA"
            ],
            "redirect_zone": ""
        },
        {
            "id": "b113b104-9c84-44ff-9058-4658c6e1df52",
            "name": "siteB",
            "endpoints": [
                "https://10.245.0.40:7480",
                "https://10.245.0.41:7480",
                "https://10.245.0.42:7480",
                "https://10.245.0.43:7480"
            ],
            "log_meta": "false",
            "log_data": "true",
            "bucket_index_max_shards": 0,
            "read_only": "false",
            "tier_type": "",
            "sync_from_all": "true",
            "sync_from": [],
            "redirect_zone": ""
        }
    ],
    "placement_targets": [
        {
            "name": "default-placement",
            "tags": [],
            "storage_classes": [
                "STANDARD"
            ]
        }
    ],
    "default_placement": "default-placement",
    "realm_id": "9d76aa86-99d1-41c3-966f-cc97eab2bfb3",
    "sync_policy": {
        "groups": []
    }
}


Related issues: 1 (0 open, 1 closed)

Has duplicate: rgw - Bug #62196: multisite sync fairness: "sync status" in I/O error (Duplicate; Shilpa MJ)

Actions #1

Updated by Casey Bodley over 2 years ago

  • Status changed from New to Triaged
  • Assignee set to Casey Bodley
  • Tags set to multisite
Actions #2

Updated by Casey Bodley over 1 year ago

  • Assignee deleted (Casey Bodley)
Actions #3

Updated by Casey Bodley 12 months ago

  • Tags changed from multisite to multisite multisite-backlog
Actions #4

Updated by Jane Zhu 11 months ago

The root cause:
The radosgw-admin sync status command sends a request for the status of each metadata/data log shard, distributing the requests round-robin across the individual endpoints listed in the zonegroup configuration. The entire command fails if any single request fails; there is no retry in place.
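
For illustration, a minimal C++ sketch of that failure mode follows; the types and functions here (Endpoint, ShardInfo, fetch_shard_info, fetch_all_shards) are hypothetical stand-ins for the RGW internals, not the actual classes:

#include <optional>
#include <string>
#include <vector>

struct Endpoint { std::string url; };
struct ShardInfo { int shard_id; std::string marker; };

// Placeholder for the REST call to one gateway; nullopt models a connection error.
std::optional<ShardInfo> fetch_shard_info(const Endpoint& ep, int shard_id);

// Shard requests are spread round-robin across the zonegroup endpoints, and the
// first failed request aborts the whole report, which surfaces as
// "failed to fetch master sync status: (5) Input/output error".
int fetch_all_shards(const std::vector<Endpoint>& endpoints, int num_shards,
                     std::vector<ShardInfo>& out)
{
  for (int shard = 0; shard < num_shards; ++shard) {
    const Endpoint& ep = endpoints[shard % endpoints.size()];  // round-robin
    auto info = fetch_shard_info(ep, shard);
    if (!info) {
      return -5;  // -EIO: no retry against the remaining endpoints
    }
    out.push_back(*info);
  }
  return 0;
}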

Proposed fix:
Introduce retry logic in the radosgw-admin sync status command.
The simplest retry would just move on to the next endpoint when the current one fails, but that can be very inefficient when multiple endpoints are down.
To do this more efficiently, we may want to maintain a connection status for each endpoint in RGWRestConn. Keeping the status in RGWRestConn could also benefit other places that need retry logic. The status can carry a timestamp so that it is invalidated after a short period (assuming the corresponding RGW instance may recover quickly).
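
Below is a rough C++ sketch of what such a retry could look like, again using hypothetical names (EndpointStatusCache, fetch_all_shards) rather than the real RGWRestConn interface; the TTL on the cached "down" verdict corresponds to the timestamped connection status described above:

#include <chrono>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

using Clock = std::chrono::steady_clock;

struct Endpoint { std::string url; };
struct ShardInfo { int shard_id; std::string marker; };

// Placeholder for the REST call to one gateway; nullopt models a connection error.
std::optional<ShardInfo> fetch_shard_info(const Endpoint& ep, int shard_id);

// Remembers which endpoints recently failed, so repeated shard requests
// don't keep timing out against a gateway that is known to be down.
class EndpointStatusCache {
  std::unordered_map<std::string, Clock::time_point> down_until_;
  std::chrono::seconds ttl_{30};  // assume the RGW instance may recover quickly
public:
  bool usable(const Endpoint& ep) const {
    auto it = down_until_.find(ep.url);
    return it == down_until_.end() || Clock::now() >= it->second;
  }
  void mark_down(const Endpoint& ep) { down_until_[ep.url] = Clock::now() + ttl_; }
};

// Try every endpoint not known to be down before giving up on a shard.
int fetch_all_shards(const std::vector<Endpoint>& endpoints, int num_shards,
                     EndpointStatusCache& status, std::vector<ShardInfo>& out)
{
  for (int shard = 0; shard < num_shards; ++shard) {
    bool done = false;
    for (size_t i = 0; i < endpoints.size() && !done; ++i) {
      const Endpoint& ep = endpoints[(shard + i) % endpoints.size()];
      if (!status.usable(ep)) continue;     // skip endpoints recently seen down
      if (auto info = fetch_shard_info(ep, shard)) {
        out.push_back(*info);
        done = true;
      } else {
        status.mark_down(ep);               // cache the failure, try the next one
      }
    }
    if (!done) return -5;  // -EIO only if every endpoint failed
  }
  return 0;
}

With the status cached, once an endpoint has been marked down the remaining shard requests skip it immediately instead of timing out against it again and again.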

@Casey, we had a brief discussion of this solution in last week's refactoring meeting, which you were absent from. I would like to go through it with you as well to see whether you think RGWRestConn is the right place to maintain the connection status for endpoints.

Actions #6

Updated by Casey Bodley 8 months ago

  • Status changed from Triaged to Fix Under Review
  • Assignee set to Jane Zhu
  • Backport set to reef
  • Pull request ID set to 52812
Actions #7

Updated by Casey Bodley 7 months ago

  • Has duplicate Bug #62196: multisite sync fairness: "sync status" in I/O error added
Actions #8

Updated by Jane Zhu 6 months ago

All the changes in https://github.com/ceph/ceph/pull/52812 have been covered in https://github.com/ceph/ceph/pull/53320. So I closed the first one and will go with the latter.

Actions #9

Updated by Jane Zhu 17 days ago

  • Status changed from Fix Under Review to Resolved
  • Pull request ID changed from 52812 to 53320

Fixed in the solution for https://tracker.ceph.com/issues/62710
