Bug #21772
Closed
multisite: multipart uploads fail to sync
Description
Reported on ceph-users. I added a test case to test_multi.py, and it reproduces the issue.
Files
Updated by Casey Bodley over 6 years ago
- File multipart.bilog added
I've attached the output from `radosgw-admin bilog list` on the source bucket, after a 4-part upload to an object named MULTIPART.
Notable in the output are the entries from the multipart complete operation. Both entries have the same op_tag, but the first is a pending write to 'MULTIPART' and the second is a completed del on the last part object.
{
    "op_id": "00000000011.11.1",
    "op_tag": "f23e6bbc-1ae8-4e7f-8a6f-5b79071c74c4.4109.422",
    "op": "write",
    "object": "MULTIPART",
    "instance": "",
    "state": "pending",
    "index_ver": 11,
    "timestamp": "0.000000",
    "ver": {
        "pool": -1,
        "epoch": 0
    },
    "bilog_flags": 0,
    "versioned": false,
    "owner": "",
    "owner_display_name": "",
    "zones_trace": [
        "f23e6bbc-1ae8-4e7f-8a6f-5b79071c74c4"
    ]
},
{
    "op_id": "00000000012.12.1",
    "op_tag": "f23e6bbc-1ae8-4e7f-8a6f-5b79071c74c4.4109.422",
    "op": "del",
    "object": "_multipart_MULTIPART.2~9IIANYVJ4zyGiaT9YSl3x3ttxbcmKba.4",
    "instance": "",
    "state": "complete",
    "index_ver": 12,
    "timestamp": "2017-10-12 13:49:59.266862311Z",
    "ver": {
        "pool": 7,
        "epoch": 2
    },
    "bilog_flags": 0,
    "versioned": false,
    "owner": "",
    "owner_display_name": "",
    "zones_trace": []
},
We don't attempt to sync MULTIPART, because we never see an entry with state=complete.
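To make that observation concrete, here is a minimal Python sketch that scans bilog entries and reports objects that never get a state=complete entry. It is only an illustration of the symptom, not the actual sync code; the helper name objects_without_complete is hypothetical, and the two entries are abbreviated from the output above to the fields used here.

```python
import json

# Two entries abbreviated from the `radosgw-admin bilog list` output above.
BILOG = json.loads("""
[
  {"op_id": "00000000011.11.1", "op": "write",
   "object": "MULTIPART", "state": "pending"},
  {"op_id": "00000000012.12.1", "op": "del",
   "object": "_multipart_MULTIPART.2~9IIANYVJ4zyGiaT9YSl3x3ttxbcmKba.4",
   "state": "complete"}
]
""")


def objects_without_complete(entries):
    """Hypothetical helper: objects that appear in the log but have no
    entry with state == 'complete'."""
    seen = set()
    completed = set()
    for entry in entries:
        seen.add(entry["object"])
        if entry["state"] == "complete":
            completed.add(entry["object"])
    return sorted(seen - completed)


print(objects_without_complete(BILOG))
```

Run against the attached log, a check like this would flag MULTIPART as the object whose completion entry is missing.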
Updated by Casey Bodley over 6 years ago
Test case in https://github.com/ceph/ceph/pull/18271
Updated by Casey Bodley over 6 years ago
The rgw_bucket_complete_op() cls call for the multipart complete also includes the 4 multipart parts in remove_objs, so they can be removed from the index at the same time.
We also call log_index_operation() to add a bilog entry for each of those removes, but we don't increment header.ver between them. Every one of those entries is therefore written to the same omap key, and each write overwrites the previous one. Only the del for the last part survives, and the completed write to MULTIPART is lost.
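The overwrite can be modelled with a plain dict standing in for the omap. This is a simplified simulation, not the cls_rgw code: make_key and complete_op are hypothetical names, the key format is only loosely modelled on the keys in the OSD log, and bumping the version per entry is shown as one possible remedy, not necessarily what the proposed fix does.

```python
def make_key(index_ver):
    # Hypothetical key format, loosely modelled on the bilog keys in the OSD log.
    return "0_%011d.%d.1" % (index_ver, index_ver)


def complete_op(omap, header_ver, name, remove_objs, bump_per_entry):
    """Simulate logging one complete op plus its remove_objs.

    omap is a dict standing in for the bucket index omap.
    Returns the resulting header version.
    """
    header_ver += 1
    omap[make_key(header_ver)] = ("write", name)
    for part in remove_objs:
        if bump_per_entry:
            # Assumed remedy: give every logged entry its own version.
            header_ver += 1
        omap[make_key(header_ver)] = ("del", part)
    return header_ver


parts = ["_multipart_MULTIPART.%d" % n for n in range(1, 5)]

buggy = {}
complete_op(buggy, 11, "MULTIPART", parts, bump_per_entry=False)
print(len(buggy), list(buggy.values()))

fixed = {}
complete_op(fixed, 11, "MULTIPART", parts, bump_per_entry=True)
print(len(fixed))
```

Without the per-entry bump, the dict ends up holding a single key whose value is the del of the last part, which matches the bilog listing in the first update.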
Some osd log snippets to illustrate:
rgw_bucket_complete_op(): request: op=0 name=MULTIPART instance= ver=7:4 tag=e6094249-4ff0-4912-9ea3-08d0f1b200e3.4109.38
log_index_operation name=MULTIPART key=<80>0_00000000012.12.1 op=0 state=1 tag=e6094249-4ff0-4912-9ea3-08d0f1b200e3.4109.38
rgw_bucket_complete_op(): remove_objs.size()=4
rgw_bucket_complete_op(): removing entries, read_index_entry name=_multipart_MULTIPART.2~vtcUTJNUDdfY4O6pLxOUTXjJjoNlI_Y.1 instance=
rgw_bucket_complete_op(): entry.name=_multipart_MULTIPART.2~vtcUTJNUDdfY4O6pLxOUTXjJjoNlI_Y.1 entry.instance= entry.meta.category=1
log_index_operation name=_multipart_MULTIPART.2~vtcUTJNUDdfY4O6pLxOUTXjJjoNlI_Y.1 key=<80>0_00000000012.12.1 op=1 state=1 tag=e6094249-4ff0-4912-9ea3-08d0f1b200e3.4109.38
rgw_bucket_complete_op(): removing entries, read_index_entry name=_multipart_MULTIPART.2~vtcUTJNUDdfY4O6pLxOUTXjJjoNlI_Y.2 instance=
rgw_bucket_complete_op(): entry.name=_multipart_MULTIPART.2~vtcUTJNUDdfY4O6pLxOUTXjJjoNlI_Y.2 entry.instance= entry.meta.category=1
log_index_operation name=_multipart_MULTIPART.2~vtcUTJNUDdfY4O6pLxOUTXjJjoNlI_Y.2 key=<80>0_00000000012.12.1 op=1 state=1 tag=e6094249-4ff0-4912-9ea3-08d0f1b200e3.4109.38
rgw_bucket_complete_op(): removing entries, read_index_entry name=_multipart_MULTIPART.2~vtcUTJNUDdfY4O6pLxOUTXjJjoNlI_Y.3 instance=
rgw_bucket_complete_op(): entry.name=_multipart_MULTIPART.2~vtcUTJNUDdfY4O6pLxOUTXjJjoNlI_Y.3 entry.instance= entry.meta.category=1
log_index_operation name=_multipart_MULTIPART.2~vtcUTJNUDdfY4O6pLxOUTXjJjoNlI_Y.3 key=<80>0_00000000012.12.1 op=1 state=1 tag=e6094249-4ff0-4912-9ea3-08d0f1b200e3.4109.38
rgw_bucket_complete_op(): removing entries, read_index_entry name=_multipart_MULTIPART.2~vtcUTJNUDdfY4O6pLxOUTXjJjoNlI_Y.4 instance=
rgw_bucket_complete_op(): entry.name=_multipart_MULTIPART.2~vtcUTJNUDdfY4O6pLxOUTXjJjoNlI_Y.4 entry.instance= entry.meta.category=1
log_index_operation name=_multipart_MULTIPART.2~vtcUTJNUDdfY4O6pLxOUTXjJjoNlI_Y.4 key=<80>0_00000000012.12.1 op=1 state=1 tag=e6094249-4ff0-4912-9ea3-08d0f1b200e3.4109.38
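A quick way to confirm the key collision from a log like this is to group the log_index_operation lines by their key= field. The sketch below does that over the five lines from the snippet above, with the tag= suffix dropped for brevity; the regular expression and the group_by_key helper are assumptions made for illustration, not part of any Ceph tooling.

```python
import re
from collections import defaultdict

# The five log_index_operation lines from the OSD log above (tag= omitted).
LOG = """\
log_index_operation name=MULTIPART key=<80>0_00000000012.12.1 op=0 state=1
log_index_operation name=_multipart_MULTIPART.2~vtcUTJNUDdfY4O6pLxOUTXjJjoNlI_Y.1 key=<80>0_00000000012.12.1 op=1 state=1
log_index_operation name=_multipart_MULTIPART.2~vtcUTJNUDdfY4O6pLxOUTXjJjoNlI_Y.2 key=<80>0_00000000012.12.1 op=1 state=1
log_index_operation name=_multipart_MULTIPART.2~vtcUTJNUDdfY4O6pLxOUTXjJjoNlI_Y.3 key=<80>0_00000000012.12.1 op=1 state=1
log_index_operation name=_multipart_MULTIPART.2~vtcUTJNUDdfY4O6pLxOUTXjJjoNlI_Y.4 key=<80>0_00000000012.12.1 op=1 state=1
"""

LINE_RE = re.compile(r"log_index_operation name=(\S+) key=(\S+) op=(\d+) state=(\d+)")


def group_by_key(log_text):
    """Hypothetical helper: map each bilog key to the object names logged under it."""
    by_key = defaultdict(list)
    for match in LINE_RE.finditer(log_text):
        name, key = match.group(1), match.group(2)
        by_key[key].append(name)
    return dict(by_key)


for key, names in group_by_key(LOG).items():
    print(key, len(names))
```

All five operations land on the same key, so only the last one written remains in the index log.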
Updated by Casey Bodley over 6 years ago
- Status changed from 12 to Fix Under Review
Updated https://github.com/ceph/ceph/pull/18271 with a proposed fix.
We may want to go a step further, though, and avoid writing these multipart part entries to the bilog in the first place.
Updated by Casey Bodley over 6 years ago
- Status changed from Fix Under Review to 7
Updated by Casey Bodley over 6 years ago
- Related to Bug #21800: multisite: avoid writing multipart parts to the bucket index log added
Updated by Yuri Weinstein over 6 years ago
Updated by Casey Bodley over 6 years ago
- Status changed from 7 to Pending Backport
Updated by Anonymous over 6 years ago
- Copied to Backport #21816: luminous: multisite: multipart uploads fail to sync added
Updated by Casey Bodley over 6 years ago
- Related to Bug #21591: RGW multisite does not sync all objects added
Updated by Nathan Cutler over 6 years ago
- Status changed from Pending Backport to Resolved