Bug #46062

closed

File Corruption in Multisite Replication with Encryption

Added by Howard Brown almost 4 years ago. Updated about 1 month ago.

Status: Resolved
Priority: High
Assignee:
Target version: -
% Done: 100%
Source:
Tags: multisite, encryption, backport_processed
Backport: pacific, quincy, reef
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This may be related to https://tracker.ceph.com/issues/39992 - though I didn't see any mention of encryption in that issue.

We've noticed that any object larger than ~8 MB is consistently modified ("corrupted") when it is replicated to the other site.

In the following example output, the first file in each comparison is the original that was uploaded (via aws cli) to the source zone, and the second file (under "checks/") was downloaded (via aws cli) from the secondary (replicated) zone:

$ cmp data.file_8M checks/data.file_8M
$ cmp data.file_9M checks/data.file_9M
data.file_9M checks/data.file_9M differ: byte 8388622, line 32962
$ cmp data.file_10M checks/data.file_10M
data.file_10M checks/data.file_10M differ: byte 8388622, line 32301
$ cmp data.file_90M checks/data.file_90M
data.file_90M checks/data.file_90M differ: byte 8388622, line 32593
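
The byte numbers that cmp reports are 1-based, so the reported position can be related to the 8 MiB mark with simple arithmetic. The following is a minimal Python sketch, not part of the original report, that mimics the cmp check and then does that arithmetic. The function name first_diff and the read chunk size are illustrative assumptions.

```python
def first_diff(path_a, path_b, chunk=1 << 20):
    """Return the 1-based byte number of the first difference (as cmp counts),
    or None if both files are identical."""
    pos = 0  # number of bytes already compared and found equal
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk)
            b = fb.read(chunk)
            if a != b:
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return pos + i + 1
                # No differing byte in the common part: one file is shorter.
                # Report the position just past the end of the shorter file.
                return pos + min(len(a), len(b)) + 1
            if not a:
                return None  # both files ended with no difference
            pos += len(a)


EIGHT_MIB = 8 * 1024 * 1024   # 8388608
reported = 8388622            # byte number taken from the cmp output above
# Convert to a 0-based offset, then subtract the 8 MiB boundary.
bytes_past_boundary = reported - 1 - EIGHT_MIB
print(bytes_past_boundary)    # -> 13
```

In other words, in all three mismatching files the first 8 MiB plus 13 bytes are identical and the difference starts at the same position. This only restates the cmp output; it does not by itself establish the cause.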

Example of how a test file is created with dd and /dev/urandom:

$ dd if=/dev/urandom of=data.file_9M bs=9k count=1k
1024+0 records in
1024+0 records out
9437184 bytes (9.4 MB) copied, 1.15529 s, 8.2 MB/s
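
The dd invocation above writes 1024 blocks of 9 KiB each, that is 9 x 1024 x 1024 = 9437184 bytes. A rough Python equivalent is sketched below for cases where test files of exact byte sizes are useful, for example to probe sizes close to the 8 MiB mark. The helper names make_random_file and sha256_of are hypothetical, and os.urandom stands in for /dev/urandom.

```python
import hashlib
import os

SIZE_9M = 9 * 1024 * 1024  # 9437184 bytes, same as bs=9k count=1k


def make_random_file(path, size_bytes, block=1 << 20):
    """Write exactly size_bytes of random data to path, one block at a time."""
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(block, remaining)
            f.write(os.urandom(n))
            remaining -= n
    return path


def sha256_of(path, block=1 << 20):
    """Return the hex SHA-256 digest of a file, read in blocks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for piece in iter(lambda: f.read(block), b""):
            h.update(piece)
    return h.hexdigest()
```

For instance, make_random_file("data.file_9M", SIZE_9M) would produce a file of the same size as the dd example, and comparing sha256_of for the uploaded original and the copy downloaded from the secondary zone gives the same yes/no answer as cmp.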

We see this only when encryption is turned on. We are using the "automatic encryption" method, which means adding the following config parameter to ceph.conf for all of the RGWs:

rgw_crypt_default_encryption_key = [base64-encoded 256 bit key]
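
The bracketed placeholder stands for 32 random bytes (256 bits) encoded as base64. One way such a value could be produced is sketched below in Python. This is an assumption about how a key might be generated, not the method used in this report, and new_default_key is a hypothetical helper name.

```python
import base64
import secrets


def new_default_key():
    """Return a base64 string that encodes 32 random bytes (256 bits)."""
    return base64.b64encode(secrets.token_bytes(32)).decode("ascii")


key = new_default_key()
# Print a line in the same form as the ceph.conf parameter shown above.
print(f"rgw_crypt_default_encryption_key = {key}")
```

The resulting string is 44 characters long, including one trailing "=" padding character.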

We first noticed this after turning replication on in version 14.2.4. We subsequently updated to version 14.2.9 to see whether the issue was still present (it is).


Related issues: 3 (0 open, 3 closed)

Copied to rgw - Backport #62321: quincy: File Corruption in Multisite Replication with Encryption (Resolved, Casey Bodley)
Copied to rgw - Backport #62322: pacific: File Corruption in Multisite Replication with Encryption (Rejected, Casey Bodley)
Copied to rgw - Backport #62323: reef: File Corruption in Multisite Replication with Encryption (Resolved, Casey Bodley)