Bug #59696

open

RGW crashes when replication rules are set using PutBucketReplication S3 API

Added by Soumya Koduri 12 months ago. Updated 11 months ago.

Status: Pending Backport
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source:
Tags: multisite backport_processed
Backport: quincy reef
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

RGW crashes some time after replication rules are set on a bucket with the S3 put-bucket-replication API (https://docs.aws.amazon.com/cli/latest/reference/s3api/put-bucket-replication.html).

[root@localhost build]# aws --no-verify-ssl --endpoint-url http://localhost:8101 s3api put-bucket-replication --bucket bucket1 --replication-configuration file://../../scripts/replication.json
[root@localhost build]# aws --no-verify-ssl --endpoint-url http://localhost:8101 s3api get-bucket-replication --bucket bucket1
{
"ReplicationConfiguration": {
"Role": "",
"Rules": [ {
"ID": "rule1",
"Priority": 1,
"Filter": {
"Prefix": "lc"
},
"Status": "Enabled",
"Destination": {
"Bucket": "arn:aws:s3:::bucket1"
}
}
]
}
}
[root@localhost build]#
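
The contents of replication.json are not included in this report. As a minimal sketch, assuming the file mirrors what get-bucket-replication returned above, the Python below builds an equivalent configuration. The helper name is hypothetical, and the result omits the outer "ReplicationConfiguration" wrapper on the assumption that the CLI's --replication-configuration option takes the inner object directly.

```python
import json


def build_replication_config(rule_id, prefix, dest_bucket, priority=1):
    """Return a replication configuration dict with a single enabled rule.

    Field names follow the get-bucket-replication output in this report;
    the helper itself is hypothetical and only illustrates the structure.
    """
    return {
        "Role": "",
        "Rules": [
            {
                "ID": rule_id,
                "Priority": priority,
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Destination": {"Bucket": dest_bucket},
            }
        ],
    }


if __name__ == "__main__":
    # Values taken from the output above; note the destination is bucket1,
    # the same bucket the rule was set on.
    config = build_replication_config("rule1", "lc", "arn:aws:s3:::bucket1")
    print(json.dumps(config, indent=2))
```

Redirecting the printed JSON to a file gives something that could be passed as file://... in the put-bucket-replication call shown above.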

After a while, RGW crashes with the following backtrace:

ceph version 18.0.0-3644-g0f690117439 (0f690117439384827cdfffaf662b71a99f73b34a) reef (dev)
1: /lib64/libc.so.6(+0x3ea70) [0x7f3b9003ea70]
2: /lib64/libc.so.6(+0x1578ee) [0x7f3b901578ee]
3: /root/workspace/ceph_dbstore/build/bin/radosgw(+0x864d83) [0x55bf8994bd83]
4: (rgw_bucket::operator<(rgw_bucket const&) const+0x33) [0x55bf89956253]
5: /root/workspace/ceph_dbstore/build/bin/radosgw(+0xc6cde2) [0x55bf89d53de2]
6: (RGWSI_Bucket_Sync_SObj::handle_bi_update(DoutPrefixProvider const*, RGWBucketInfo&, RGWBucketInfo*, optional_yield)+0x63c) [0x55bf89d55e9c]
7: (RGWSI_Bucket_SObj::store_bucket_instance_info(ptr_wrapper<RGWSI_MetaBackend::Context, 4>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, RGWBucketInfo&, std::optional<RGWBucketInfo*>, bool, std::chrono::time_point<ceph::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > >, optional_yield, DoutPrefixProvider const)+0x21c) [0x55bf89d4f05c]
8: (RGWBucketCtl::do_store_bucket_instance_info(ptr_wrapper<RGWSI_MetaBackend::Context, 4>&, rgw_bucket const&, RGWBucketInfo&, optional_yield, DoutPrefixProvider const*, RGWBucketCtl::BucketInstance::PutParams const&)+0xfd) [0x55bf89e3f66d]
9: /root/workspace/ceph_dbstore/build/bin/radosgw(+0xd586eb) [0x55bf89e3f6eb]
10: (std::_Function_handler<int (RGWSI_MetaBackend_Handler::Op*), RGWBucketInstanceMetadataHandler::call(std::optional<std::variant<RGWSI_MetaBackend_CtxParams_SObj> >, std::function<int (ptr_wrapper<RGWSI_MetaBackend::Context, 4>&)>)::{lambda(RGWSI_MetaBackend_Handler::Op*)#1}>::_M_invoke(std::_Any_data const&, RGWSI_MetaBackend_Handler::Op*&&)+0x30) [0x55bf89e55590]
11: /root/workspace/ceph_dbstore/build/bin/radosgw(+0xc8b5af) [0x55bf89d725af]
12: (RGWSI_MetaBackend_SObj::call(std::optional<std::variant<RGWSI_MetaBackend_CtxParams_SObj> >, std::function<int (RGWSI_MetaBackend::Context*)>)+0x5c) [0x55bf89d7403c]
13: (RGWSI_MetaBackend_Handler::call(std::optional<std::variant<RGWSI_MetaBackend_CtxParams_SObj> >, std::function<int (RGWSI_MetaBackend_Handler::Op*)>)+0x7c) [0x55bf89d729cc]
14: (RGWBucketCtl::store_bucket_instance_info(rgw_bucket const&, RGWBucketInfo&, optional_yield, DoutPrefixProvider const*, RGWBucketCtl::BucketInstance::PutParams const&)+0x15e) [0x55bf89e3e6fe]
15: (RGWRados::put_bucket_instance_info(RGWBucketInfo&, bool, std::chrono::time_point<ceph::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > >, DoutPrefixProvider const, optional_yield)+0x4f) [0x55bf89ba710f]
16: (rgw::sal::RadosBucket::put_info(DoutPrefixProvider const*, bool, std::chrono::time_point<ceph::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >)+0x36) [0x55bf89c2bde6]
17: /root/workspace/ceph_dbstore/build/bin/radosgw(+0x89c7bd) [0x55bf899837bd]
18: (RGWPutBucketReplication::execute(optional_yield)+0x18a) [0x55bf89984c3a]
19: (rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, req_state*, optional_yield, rgw::sal::Driver*, bool)+0xb94) [0x55bf897c1404]
20: (process_request(RGWProcessEnv const&, RGWRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, RGWRestfulIO*, optional_yield, rgw::dmclock::Scheduler*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >, int*)+0x236f) [0x55bf897c3fef]
21: /root/workspace/ceph_dbstore/build/bin/radosgw(+0x63ee5b) [0x55bf89725e5b]
22: /root/workspace/ceph_dbstore/build/bin/radosgw(+0x63fda7) [0x55bf89726da7]
23: make_fcontext()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.


Related issues 3 (1 open, 2 closed)

Copied to rgw - Bug #61369: [reef] RGW crashes when replication rules are set using PutBucketReplication S3 API (Duplicate, Soumya Koduri)
Copied to rgw - Backport #61376: reef: RGW crashes when replication rules are set using PutBucketReplication S3 API (Resolved, Soumya Koduri)
Copied to rgw - Backport #61481: quincy: RGW crashes when replication rules are set using PutBucketReplication S3 API (In Progress, Soumya Koduri)
Actions #1

Updated by Soumya Koduri 12 months ago

  • Assignee set to Soumya Koduri
Actions #2

Updated by Casey Bodley 12 months ago

  • Status changed from New to Triaged
  • Tags set to multisite
Actions #3

Updated by Soumya Koduri 12 months ago

  • Status changed from Triaged to In Progress

Other issues found (in addition to the crash):
2) Replication rules are not synced correctly. For example, Destination/Bucket is not copied into the rules synced to the secondary site.

Primary:
[root@localhost build]# aws --no-verify-ssl --endpoint-url http://localhost:8101 s3api get-bucket-replication --bucket bucket1

{
"ReplicationConfiguration": {
"Role": "",
"Rules": [ {
"ID": "rule1",
"Priority": 1,
"Filter": {
"Prefix": "lc"
},
"Status": "Enabled",
"Destination": {
"Bucket": "arn:aws:s3:::bucket2"
}
....

Secondary:
[root@localhost build]# aws --no-verify-ssl --endpoint-url http://localhost:8201 s3api get-bucket-replication --bucket bucket1

{
"ReplicationConfiguration": {
"Role": "",
"Rules": [ {
"ID": "rule1",
"Priority": 1,
"Filter": {
"Prefix": "lc"
},
"Status": "Enabled",
"Destination": {
"Bucket": ""
}
.....

3) Objects are not synced to the destination bucket set in the rules.

For example:
[root@localhost build]# aws --no-verify-ssl --endpoint-url http://localhost:8101 s3 ls s3://bucket1/
2023-05-12 20:07:10 12 f9
2023-05-12 20:29:23 12 hjk
2023-05-12 20:07:07 12 lc_f9
2023-05-12 20:07:53 12 lc_f97
2023-05-12 20:27:55 12 lc_f98
2023-05-12 20:31:50 12 lc_fhjk
[root@localhost build]# aws --no-verify-ssl --endpoint-url http://localhost:8201 s3 ls s3://bucket1/
2023-05-12 20:07:10 12 f9
2023-05-12 20:07:07 12 lc_f9
2023-05-12 20:07:53 12 lc_f97
2023-05-12 20:27:55 12 lc_f98
2023-05-12 20:31:50 12 lc_fhjk
2023-05-12 20:32:33 12 lc_sec_fhjk
2023-05-12 20:32:37 12 sec_fhjk
[root@localhost build]# aws --no-verify-ssl --endpoint-url http://localhost:8201 s3 ls s3://bucket2/
[root@localhost build]#
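
The listings above can be checked mechanically. The following is a rough sketch, not part of the original report: it parses `aws s3 ls` output (assumed to be date, time, size, key per line) and reports which keys matching the rule prefix are absent from the destination. The function names are hypothetical.

```python
def parse_keys(listing):
    """Extract object keys from `aws s3 ls` output (date, time, size, key)."""
    keys = set()
    for line in listing.strip().splitlines():
        parts = line.split(None, 3)  # key is everything after the size column
        if len(parts) == 4:
            keys.add(parts[3])
    return keys


def missing_from(source_listing, dest_listing, prefix=""):
    """Return sorted keys starting with `prefix` that are in source but not dest."""
    src = {k for k in parse_keys(source_listing) if k.startswith(prefix)}
    return sorted(src - parse_keys(dest_listing))


# Listing of bucket1 on the primary (port 8101), copied from the report.
PRIMARY_BUCKET1 = """
2023-05-12 20:07:10 12 f9
2023-05-12 20:29:23 12 hjk
2023-05-12 20:07:07 12 lc_f9
2023-05-12 20:07:53 12 lc_f97
2023-05-12 20:27:55 12 lc_f98
2023-05-12 20:31:50 12 lc_fhjk
"""

# Listing of bucket2 on the secondary (port 8201) was empty in the report.
SECONDARY_BUCKET2 = ""

if __name__ == "__main__":
    print(missing_from(PRIMARY_BUCKET1, SECONDARY_BUCKET2, prefix="lc"))
```

With the data above, every key under the "lc" prefix is reported as missing from bucket2, which matches observation (3).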

Actions #4

Updated by Soumya Koduri 12 months ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 51511

Issues (2) and (3) seem to be due to misconfiguration. The rules below work as expected:

{
"ReplicationConfiguration": {
"Role": "",
"Rules": [ {
"ID": "rule1",
"Priority": 1,
"Filter": {
"Prefix": "lc"
},
"Status": "Enabled",
"Destination": {
"Bucket": "bucket2"
}
}....
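
The only visible difference between the rules in the previous comment and the working ones is the Destination/Bucket value: an ARN ("arn:aws:s3:::bucket2") earlier, a bare bucket name ("bucket2") here. Assuming that is the misconfiguration meant (the comment does not say so explicitly), a hypothetical helper like the one below could reduce an ARN-style destination to the bare name before the rules are written.

```python
S3_ARN_PREFIX = "arn:aws:s3:::"


def bare_bucket_name(destination):
    """Strip a leading S3 ARN prefix, returning the bare bucket name.

    Hypothetical helper; values without the prefix are returned unchanged.
    """
    if destination.startswith(S3_ARN_PREFIX):
        return destination[len(S3_ARN_PREFIX):]
    return destination


if __name__ == "__main__":
    print(bare_bucket_name("arn:aws:s3:::bucket2"))
    print(bare_bucket_name("bucket2"))
```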

The crash issue is fixed in https://tracker.ceph.com/issues/59696

Actions #5

Updated by Casey Bodley 12 months ago

  • Backport set to reef
Actions #6

Updated by Soumya Koduri 11 months ago

  • Copied to Bug #61369: [reef] RGW crashes when replication rules are set using PutBucketReplication S3 API added
Actions #7

Updated by Soumya Koduri 11 months ago

  • Status changed from Fix Under Review to Pending Backport
Actions #8

Updated by Backport Bot 11 months ago

  • Copied to Backport #61376: reef: RGW crashes when replication rules are set using PutBucketReplication S3 API added
Actions #9

Updated by Backport Bot 11 months ago

  • Tags changed from multisite to multisite backport_processed
Actions #10

Updated by Soumya Koduri 11 months ago

  • Backport changed from reef to quincy reef
Actions #11

Updated by Soumya Koduri 11 months ago

  • Tags changed from multisite backport_processed to multisite
Actions #12

Updated by Backport Bot 11 months ago

  • Copied to Backport #61481: quincy: RGW crashes when replication rules are set using PutBucketReplication S3 API added
Actions #13

Updated by Backport Bot 11 months ago

  • Tags changed from multisite to multisite backport_processed