Support #38125

open

Multisite Ceph cluster storage for data replication

Added by Krish Verma about 5 years ago. Updated over 4 years ago.

Status: New
Priority: Normal
Assignee: -
Target version:
% Done: 0%
Tags:
Reviewed:
Affected Versions:
Pull request ID:

Description

Hi Team,

We are looking for a solution to replicate data across multiple geographic locations using Ceph cluster storage. To achieve this we deployed two Ceph clusters, one in India and one in the US, and set up a gateway at each location, but when we try to check the sync status it fails.
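For reference, the zones were created following the standard RGW multisite procedure, roughly along these lines (commands are illustrative rather than copied from our shell history; access key and secret omitted):

On the India (master) gateway:

radosgw-admin realm create --rgw-realm=georep --default
radosgw-admin zonegroup create --rgw-zonegroup=noida --endpoints=http://zabbix-server:7480 --master --default
radosgw-admin zone create --rgw-zonegroup=noida --rgw-zone=noida1 --endpoints=http://vlno-ceph01:7480 --master --default
radosgw-admin period update --commit

On the US (secondary) gateway:

radosgw-admin realm pull --url=http://zabbix-server:7480 --access-key=<key> --secret=<secret>
radosgw-admin period pull --url=http://zabbix-server:7480 --access-key=<key> --secret=<secret>
radosgw-admin zone create --rgw-zonegroup=noida --rgw-zone=san-jose --endpoints=http://zabbix-client:7480 --access-key=<key> --secret=<secret>
radosgw-admin period update --commit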

Below are the setup details.

India Ceph Cluster detail:

[cephuser@vlno-ceph01 cluster]$ sudo ceph -s
cluster d52e50a4-ed2e-44cc-aa08-9309bc539a55
health HEALTH_OK
monmap e1: 1 mons at {vlno-ceph01=172.23.16.67:6789/0}
election epoch 3, quorum 0 vlno-ceph01
osdmap e62: 3 osds: 3 up, 3 in
flags sortbitwise,require_jewel_osds
pgmap v56684: 152 pgs, 12 pools, 2638 MB data, 885 objects
5615 MB used, 129 GB / 134 GB avail
152 active+clean
[cephuser@vlno-ceph01 cluster]$

India Gateway (Zonegroup detail at Master):

[cephuser@zabbix-server ~]$ radosgw-admin zonegroup get 2>/dev/null
{
"id": "74ad391b-fbca-4c05-b9e7-c90fd4851223",
"name": "noida",
"api_name": "noida",
"is_master": "true",
"endpoints": [
"http:\/\/zabbix-server:7480"
],
"hostnames": [],
"hostnames_s3website": [],
"master_zone": "71931e0e-1be6-449f-af34-edb4166c4e4a",
"zones": [ {
"id": "71931e0e-1be6-449f-af34-edb4166c4e4a",
"name": "noida1",
"endpoints": [
"http:\/\/vlno-ceph01:7480"
],
"log_meta": "false",
"log_data": "false",
"bucket_index_max_shards": 0,
"read_only": "false"
}
],
"placement_targets": [ {
"name": "default-placement",
"tags": []
}
],
"default_placement": "default-placement",
"realm_id": "1102c891-d81c-480e-9487-c9f874287d13"
}

[cephuser@zabbix-server ~]$

US Ceph Cluster detail:

[cephuser@vlsj-kverma1 cluster]$ sudo ceph -s
cluster c626be3a-4536-48b9-8db8-470437052313
health HEALTH_OK
monmap e1: 1 mons at {vlsj-kverma1=172.18.84.131:6789/0}
election epoch 3, quorum 0 vlsj-kverma1
osdmap e42: 3 osds: 3 up, 3 in
flags sortbitwise,require_jewel_osds
pgmap v38272: 120 pgs, 8 pools, 24996 kB data, 210 objects
372 MB used, 134 GB / 134 GB avail
120 active+clean
[cephuser@vlsj-kverma1 cluster]$

US Gateway (zonegroup detail at slave):

[cephuser@zabbix-client ~]$ radosgw-admin zonegroup get 2>/dev/null
{
"id": "74ad391b-fbca-4c05-b9e7-c90fd4851223",
"name": "noida",
"api_name": "noida",
"is_master": "true",
"endpoints": [
"http:\/\/zabbix-server:7480"
],
"hostnames": [],
"hostnames_s3website": [],
"master_zone": "71931e0e-1be6-449f-af34-edb4166c4e4a",
"zones": [ {
"id": "45c690a8-f39c-4b1d-9faf-e0e991ceaaac",
"name": "san-jose",
"endpoints": [
"http:\/\/zabbix-client:7480"
],
"log_meta": "false",
"log_data": "true",
"bucket_index_max_shards": 0,
"read_only": "false"
}, {
"id": "71931e0e-1be6-449f-af34-edb4166c4e4a",
"name": "noida1",
"endpoints": [
"http:\/\/vlno-ceph01:7480"
],
"log_meta": "false",
"log_data": "true",
"bucket_index_max_shards": 0,
"read_only": "false"
}
],
"placement_targets": [ {
"name": "default-placement",
"tags": []
}
],
"default_placement": "default-placement",
"realm_id": "1102c891-d81c-480e-9487-c9f874287d13"
}

[cephuser@zabbix-client ~]$

Sync status from Master:

[cephuser@zabbix-server ~]$ radosgw-admin sync status --source-zone san-jose 2>/dev/null
realm 1102c891-d81c-480e-9487-c9f874287d13 (georep)
zonegroup 74ad391b-fbca-4c05-b9e7-c90fd4851223 (noida)
zone 71931e0e-1be6-449f-af34-edb4166c4e4a (noida1)
metadata sync no sync (zone is master)
[cephuser@zabbix-server ~]$

Sync status from slave:

[cephuser@zabbix-client ~]$ radosgw-admin sync status --source-zone noida1 2>/dev/null
realm 1102c891-d81c-480e-9487-c9f874287d13 (georep)
zonegroup 74ad391b-fbca-4c05-b9e7-c90fd4851223 (noida)
zone 45c690a8-f39c-4b1d-9faf-e0e991ceaaac (san-jose)
metadata sync failed to read sync status: (2) No such file or directory
data sync source: 71931e0e-1be6-449f-af34-edb4166c4e4a (noida1)
failed to retrieve sync info: (5) Input/output error
[cephuser@zabbix-client ~]$

You can see that the slave zone "san-jose" is not getting updated at the master. We ran the period commit, but the sync status stays the same, and at the slave the commit fails with the error below:
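The commit in question is roughly the command below (debug flags are illustrative, raised to capture the log that follows); full outputs are attached as commit-output-master.txt and commit-output-slave.txt:

radosgw-admin period update --commit --debug-rgw=20 --debug-ms=1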

Thu Jan 31 12:12:57 2019
/admin/realm/period
2019-01-31 17:42:57.791779 7f7a17d989c0 15 generated auth header: AWS 99ZO3154HDPWCHYOFBBF:zERxw9GO7bPYnn2emFkPKlKuX7A=
2019-01-31 17:42:57.791834 7f7a17d989c0 20 sending request to http://vlno-ceph01:7480/admin/realm/period?rgwx-zonegroup=74ad391b-fbca-4c05-b9e7-c90fd4851223
2019-01-31 17:42:57.798110 7f7a17d989c0 0 curl_easy_perform returned error: Failed connect to vlno-ceph01:7480; Connection refused
request failed: (22) Invalid argument
failed to commit period: (22) Invalid argument
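
For what it's worth, the connection refused error can be narrowed down with a plain HTTP request from the US gateway host to the master zone endpoint (illustrative check; the point is whether the short name vlno-ceph01 resolves and is reachable from that site):

curl -v http://vlno-ceph01:7480/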

Please provide your expert advice.


Files

commit-output-master.txt (97.6 KB) commit-output-master.txt Krish Verma, 01/31/2019 01:18 PM
commit-output-slave.txt (94.6 KB) commit-output-slave.txt Krish Verma, 01/31/2019 01:18 PM
#1

Updated by Greg Farnum about 5 years ago

  • Project changed from Ceph to rgw
  • Category deleted (obsync)

You may also find faster answers on the ceph-users mailing list.

#2

Updated by Casey Bodley almost 5 years ago

  • Status changed from New to 4

Failed connect to vlno-ceph01:7480; Connection refused

You need to use fully qualified domain names in the zone endpoints, or they won't be able to communicate with each other.
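
For example, something along these lines (zone and zonegroup names taken from the report above; the example.com FQDNs are placeholders for names that actually resolve from both sites):

On the master gateway:

radosgw-admin zonegroup modify --rgw-zonegroup=noida --endpoints=http://zabbix-server.example.com:7480
radosgw-admin zone modify --rgw-zone=noida1 --endpoints=http://vlno-ceph01.example.com:7480
radosgw-admin period update --commit

On the secondary gateway:

radosgw-admin zone modify --rgw-zone=san-jose --endpoints=http://zabbix-client.example.com:7480
radosgw-admin period update --commit

Then restart the radosgw services on both sites so they pick up the updated period.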

#3

Updated by Patrick Donnelly over 4 years ago

  • Status changed from 4 to New