Activity
From 11/01/2022 to 11/30/2022
11/30/2022
- 04:02 PM Bug #57562: multisite replication issue on Quincy
- Adam Emerson wrote:
> Krunal Chheda wrote:
> > Thanks once again for explanation, i was trying to figure out how t...
- 02:48 AM Bug #57562: multisite replication issue on Quincy
- Krunal Chheda wrote:
> Thanks once again for explanation, i was trying to figure out how the assert_exists() works ...
- 02:15 AM Bug #57562: multisite replication issue on Quincy
- Adam Emerson wrote:
> Krunal Chheda wrote:
> > Regarding the ENOENT, with assert_exists in place now, will the rea...
- 01:38 AM Bug #57562: multisite replication issue on Quincy
- Specifically in cls_fifo_legacy.cc/push_part()
- 01:37 AM Bug #57562: multisite replication issue on Quincy
- Krunal Chheda wrote:
> Regarding the ENOENT, with assert_exists in place now, will the read_part_header on the trim...
- 01:19 AM Bug #57562: multisite replication issue on Quincy
- Adam Emerson wrote:
> I've pushed a commit that uses assert_exists, then fetches metadata on -ENOENT.
Thanks for ...
- 01:07 AM Bug #57562: multisite replication issue on Quincy
- I've pushed a commit that uses assert_exists, then fetches metadata on -ENOENT.
- 03:47 PM Bug #58127 (Resolved): multisite: test_zg_master_zone_delete fails
- ex. http://qa-proxy.ceph.com/teuthology/cbodley-2022-09-29_01:41:10-rgw-wip-rgw-sal-bootstrap-distro-default-smithi/7...
- 01:57 PM Backport #58119 (In Progress): pacific: check-generated.sh failures for rgw_log_entry
- https://github.com/ceph/ceph/pull/49142
- 08:26 AM Bug #58125 (Won't Fix - EOL): In the nautilus version ceph, the notification message "awsRegion" ...
- The content of the message is as follows:
{
"Records": [{
"eventVersion": "2.2",
"eventSource...
11/29/2022
- 11:40 PM Bug #57562: multisite replication issue on Quincy
- This is very useful, thank you, and might explain why it's happening.
So from your logs, does it seem like the par...
- 11:18 PM Bug #57562: multisite replication issue on Quincy
- So coming back to the EIO analysis, what we have found so far is that this issue happens with more than 1 RGW instance running...
- 09:08 PM Bug #57562: multisite replication issue on Quincy
- Adam Emerson wrote:
> Can you point me to the PR for retrying on EIO?
https://github.com/adamemerson/ceph/pull/4/...
- 04:39 PM Bug #57562: multisite replication issue on Quincy
- Can you point me to the PR for retrying on EIO?
- 09:35 PM Backport #58119 (Resolved): pacific: check-generated.sh failures for rgw_log_entry
- 09:35 PM Backport #58118 (In Progress): quincy: check-generated.sh failures for rgw_log_entry
- 09:28 PM Bug #58115 (Pending Backport): check-generated.sh failures for rgw_log_entry
- 06:36 PM Bug #58115 (Fix Under Review): check-generated.sh failures for rgw_log_entry
- 06:35 PM Bug #58115 (Pending Backport): check-generated.sh failures for rgw_log_entry
- ...
- 07:44 PM Bug #57770: RGW (pacific) misplaces index entries after dynamically resharding bucket
- J. Eric Ivancich wrote:
> The code on the PR seems to address the issue. My colleague Mark Kogan ran it through a te...
- 07:21 PM Bug #58111 (Fix Under Review): crash: verify_bucket_owner_or_policy
- 09:42 AM Bug #58111: crash: verify_bucket_owner_or_policy
- ...
- 09:28 AM Bug #58111 (Resolved): crash: verify_bucket_owner_or_policy
- When executing 's3cmd ls s3://a:', rgw was terminated. ...
- 10:40 AM Bug #44660: Multipart re-uploads cause orphan data
- It is a very big problem for us.
We have a lot of big buckets with orphaned parts which use hundreds of TBs of space.
...
- 10:39 AM Bug #16767: RadosGW Multipart Cleanup Failure
- It is a very big problem for us.
We have a lot of big buckets with orphaned parts which use hundreds of TBs of space.
...
- 02:13 AM Bug #54908: crash: double const md_config_t::get_val<double>(ConfigValues const&, std::basic_stri...
- Similar problem.
(gdb) bt full ...
- 01:35 AM Bug #58105 (Won't Fix - EOL): `DeleteBucketPolicy` can not delete policy in slave zonegroup
- 01:11 AM Bug #58104 (Won't Fix - EOL): `putlc` failed in slave zonegroup
- radosgw version: nautilus(14.2.15)
11/28/2022
- 07:29 PM Bug #57562: multisite replication issue on Quincy
- Adam Emerson wrote:
> Yes, I just pushed it. We're still testing for regression, but you're welcome to try it.
Than...
- 05:31 PM Bug #57562: multisite replication issue on Quincy
- Yes, I just pushed it. We're still testing for regression, but you're welcome to try it.
- 05:13 PM Bug #57562: multisite replication issue on Quincy
- Adam Emerson wrote:
> Also the goal is to remove tags as part of the fix.
>
> We had some confusion over where a ...
- 03:36 PM Bug #57562: multisite replication issue on Quincy
- So currently, without the tag changes, we see an issue as mentioned here in "comment":https://tracker.ceph.com/issues/57...
- 03:17 PM Bug #57562: multisite replication issue on Quincy
- Also the goal is to remove tags as part of the fix.
We had some confusion over where a regression is and it's conf...
- 03:17 PM Bug #57562: multisite replication issue on Quincy
- > Also based on our previous test analysis, we think removal of tags will still not completely solve the race conditi...
- 11:42 AM Documentation #58092 (New): rgw_enable_gc_threads / lc_threads not documented on web
- Options rgw_enable_gc_threads and rgw_enable_lc_threads are not rendered for docs.ceph.com.
I would expect those t...
- 09:11 AM Backport #57238 (In Progress): pacific: crash: RGWCoroutinesStack::wakeup()
- 09:11 AM Backport #57237 (In Progress): quincy: crash: RGWCoroutinesStack::wakeup()
- 09:08 AM Backport #55228 (In Progress): pacific: crash: RGWGC::send_chain(cls_rgw_obj_chain&, std::basic_s...
- 09:06 AM Backport #55227 (In Progress): quincy: crash: RGWGC::send_chain(cls_rgw_obj_chain&, std::basic_st...
- 09:05 AM Backport #54497 (In Progress): pacific: bucket index completions may not retry after reshard
- 09:04 AM Backport #54496 (In Progress): quincy: bucket index completions may not retry after reshard
- 09:03 AM Backport #54155 (In Progress): pacific: rgw: "reshard cancel" errors with "invalid argument"
- 09:03 AM Backport #54157 (In Progress): quincy: rgw: "reshard cancel" errors with "invalid argument"
- 09:00 AM Backport #55505 (In Progress): pacific: radosgw rejects some requests without Content-MD5 Header
- 08:59 AM Backport #55506 (In Progress): quincy: radosgw rejects some requests without Content-MD5 Header
- 08:37 AM Backport #57409 (In Progress): pacific: rgw: bucket list operation slow down in special scenario
- 08:32 AM Backport #57410 (In Progress): quincy: rgw: bucket list operation slow down in special scenario
- 08:20 AM Backport #57752 (In Progress): quincy: Log status of individual object deletions for multi-object...
- 07:55 AM Backport #54493 (In Progress): quincy: segmentation fault in UserAsyncRefreshHandler::init_fetch
11/25/2022
- 03:57 PM Backport #58087 (In Progress): quincy: rgw/cloud-tranistion: Issues with MCG cloud endpoint
- 03:56 PM Backport #58087: quincy: rgw/cloud-tranistion: Issues with MCG cloud endpoint
- https://github.com/ceph/ceph/pull/49061
- 03:23 PM Backport #58087 (In Progress): quincy: rgw/cloud-tranistion: Issues with MCG cloud endpoint
- 03:17 PM Bug #57979 (Pending Backport): rgw/cloud-tranistion: Issues with MCG cloud endpoint
11/23/2022
- 06:06 PM Bug #58059 (Resolved): s3tests v2 SignatureDoesNotMatch failures on ubuntu
- 04:19 PM Bug #57853 (Fix Under Review): multisite sync process block after long time running
11/22/2022
- 11:10 PM Bug #58053: bucket is list in s3cmd but can not be queried after deleting
- *ceph-qa-suite* should be RGW
- 06:20 AM Bug #58053: bucket is list in s3cmd but can not be queried after deleting
- Prepare a policy.json file that contains the bucket policy for testing....
- 06:13 AM Bug #58053: bucket is list in s3cmd but can not be queried after deleting
- The bug can be reproduced with the following script:
- 07:48 PM Bug #58059 (Fix Under Review): s3tests v2 SignatureDoesNotMatch failures on ubuntu
- https://github.com/ceph/s3-tests/pull/476
- 05:03 PM Bug #58059: s3tests v2 SignatureDoesNotMatch failures on ubuntu
- bisected botocore versions down to good=botocore-1.27.96 bad=botocore-1.28.0
botocore debug log output from good=b...
- 03:11 PM Bug #58059: s3tests v2 SignatureDoesNotMatch failures on ubuntu
- boto versions from a failing run on ubuntu:...
- 04:42 AM Bug #57562: multisite replication issue on Quincy
- Hi Adam, another question regarding your changes to remove the use of "tags". I'd like to understand your opinion/pla...
- 04:34 AM Bug #57562: multisite replication issue on Quincy
- Did some investigation on the latest failure of the tests on `lastest - 1` "PR":https://github.com/ceph/ceph/pull/486...
11/21/2022
- 09:03 PM Bug #58059 (Resolved): s3tests v2 SignatureDoesNotMatch failures on ubuntu
- from main branch results: https://pulpito.ceph.com/cbodley-2022-11-21_18:00:47-rgw-main-distro-default-smithi/
s3t...
- 08:41 PM Bug #57562: multisite replication issue on Quincy
- Hi Adam,
Wanted to provide you with an update about the testing that we did over the weekend,
We took 2 PR's, one w...
- 03:56 PM Bug #55310 (Duplicate): [pacific] RadosGW instance of Cloud Sync zone crashes when objects are up...
- 12:00 AM Bug #58053 (Need More Info): bucket is list in s3cmd but can not be queried after deleting
- ceph version 16.2.10
There will be a race between s3.DeleteBucket and s3.DeleteBucketPolicy. When the race happens...
11/17/2022
- 03:10 PM Bug #58034 (In Progress): RGW misplaces index entries after dynamically resharding bucket
- 08:30 AM Bug #55498 (Duplicate): "AssertionError: rgw multisite test failures" in upgrade:octopus
- will be fixed as part of: https://tracker.ceph.com/issues/58036
- 08:25 AM Backport #58036 (In Progress): pacific: pubsub test failures
11/16/2022
- 07:30 PM Backport #58036 (Resolved): pacific: pubsub test failures
- https://github.com/ceph/ceph/pull/48928
- 07:09 PM Bug #56572: pubsub test failures
- this should also be backported to pacific, in order to fix the upgrade issues. see: https://tracker.ceph.com/issues/5...
- 06:28 PM Bug #57562: multisite replication issue on Quincy
- Hey Adam,
So after all of our current testing and debugging of issue, the current race condition is that the same pa...
- 04:43 PM Bug #57562: multisite replication issue on Quincy
- Adam Emerson wrote:
> Hold off for now, I've introduced one problem I need to debug.
Ack.
We are currently tes...
- 11:02 AM Bug #57562: multisite replication issue on Quincy
- Hold off for now, I've introduced one problem I need to debug.
- 10:47 AM Bug #57562: multisite replication issue on Quincy
- I have pushed a new commit. It disables all use of the part tags. I believe this should eliminate any remaining diffi...
- 06:08 PM Bug #58020 (Fix Under Review): notifications: zero timestamp in complete multipart upload event
- 06:06 PM Backport #57561 (In Progress): quincy: pubsub test failures
- 03:26 PM Bug #58035 (Fix Under Review): Copying an object to itself crashes de RGW if executed as admin user.
- 10:44 AM Bug #58035 (Pending Backport): Copying an object to itself crashes de RGW if executed as admin user.
- This was observed after executing the test *s3tests_boto3.functional.test_s3:test_object_copy_to_itself* with a user ...
- 03:28 AM Bug #57980 (Fix Under Review): rgw/cloud-transition: transition fails when using MCG Azure Namesp...
- >>> From the http packets -> MCG is returning 403 for HEAD request and 400 for PUT request (failed). Maybe the header...
11/15/2022
- 09:56 PM Bug #58034 (Resolved): RGW misplaces index entries after dynamically resharding bucket
- When RGW reshards buckets with ~250k index entries*, I've noticed some s3:PutObject requests that return 200 end up w...
- 09:54 PM Bug #57770: RGW (pacific) misplaces index entries after dynamically resharding bucket
- The code on the PR seems to address the issue. My colleague Mark Kogan ran it through a test at scale and it behaved ...
- 07:34 PM Bug #58033 (New): multipart copy part: use refcount optimization when possible
- rgw only supports CopyObject[1] for object sizes up to rgw_max_put_size=5GB, and requires multipart with UploadPartCo...
- 07:07 PM Bug #50076 (Fix Under Review): route librdkafka log messages to rgw log
- 05:45 PM Bug #50076 (In Progress): route librdkafka log messages to rgw log
11/14/2022
- 03:57 PM Bug #58020 (Pending Backport): notifications: zero timestamp in complete multipart upload event
- this is a regression due to: https://github.com/ceph/ceph/pull/42266
(original fix was: https://github.com/ceph/ceph...
- 03:13 PM Bug #58014 (Fix Under Review): notifications: metadata does not work for COPY events
- 01:42 PM Bug #57562: multisite replication issue on Quincy
- Adam, I submitted a PR on top of yours. My 8h test has passed with this PR. The same test usually failed on earlier v...
- 01:37 PM Bug #57562: multisite replication issue on Quincy
- Two more race conditions found. Both are on the journal entries.
h3. *Race condition 1:*...
11/13/2022
- 09:37 AM Bug #58014 (Pending Backport): notifications: metadata does not work for COPY events
- this is a regression due to: https://github.com/ceph/ceph/pull/39192/commits/35a4eb4410394a0014648dda7df92642f3b536d3...
11/11/2022
- 01:01 AM Bug #57562: multisite replication issue on Quincy
- It should, thank you. I don't think it's the underlying cause, but it's a good catch.
11/10/2022
- 09:06 PM Bug #57562: multisite replication issue on Quincy
- A potential bug?
https://github.com/ceph/ceph/blob/main/src/cls/fifo/cls_fifo_types.h#L66
Should it be the follow...
- 03:25 PM Bug #57706 (Need More Info): When creating a new user, if the 'uid' is not provided, error report...
- Hi Kevin Wang,
Could I get what version of Ceph this issue occurred on? The issue does seem to be resolved in the ...
- 03:07 PM Bug #57724 (Fix Under Review): Keys returned by Admin API during user creation on secondary zone ...
11/09/2022
- 10:11 PM Bug #57706: When creating a new user, if the 'uid' is not provided, error reported as 'Permission...
- On a branch close to the master branch from a vstart cluster when I try this same scenario I see:
[ali@acadia buil...
- 09:51 PM Bug #57562: multisite replication issue on Quincy
- We also found a place that might potentially cause issues.
Rgw locks the mutex and gets some data from "info" befo...
- 09:22 PM Bug #57562: multisite replication issue on Quincy
- Here is some more detailed explanation on how the -EINVAL(-22) error (hence datalog writing failure) happens based on...
11/08/2022
- 02:49 PM Bug #57911 (Fix Under Review): Segmentation fault when uploading file with bucket policy on Quincy
11/07/2022
- 09:45 PM Bug #57562: multisite replication issue on Quincy
- > I think if the create_part is made exclusive, one of them would fail at part creation and let the other complete pa...
- 06:23 AM Bug #57980: rgw/cloud-transition: transition fails when using MCG Azure Namespacestore with a pre...
- Few observations:
- 2022-11-03T08:42:29.718+0000 7fa1bf7e6640 0 lifecycle: ERROR: failed to check object on the ...
- 06:21 AM Bug #57980 (Pending Backport): rgw/cloud-transition: transition fails when using MCG Azure Namesp...
- Reported by - dparkes@redhat.com
>>>>
Found Errors during cloud transition when using MCG Azure Namespacestore wit...
- 06:07 AM Bug #57979 (Pending Backport): rgw/cloud-tranistion: Issues with MCG cloud endpoint
- Below issues were observed while testing cloud-transition feature using MCG (Noobaa) endpoint
1) Creation of targe...
11/04/2022
- 03:00 PM Bug #57911 (In Progress): Segmentation fault when uploading file with bucket policy on Quincy
11/03/2022
- 07:11 PM Bug #57562: multisite replication issue on Quincy
- We are still testing the latest evidence (HEAD at https://github.com/ceph/ceph/commit/cfc3bde36dbc9c6e0b7182bbb325390...
- 07:02 PM Bug #57936 (Fix Under Review): 'radosgw-admin bucket chown' doesn't set bucket instance owner or ...
- 02:16 PM Bug #57936 (In Progress): 'radosgw-admin bucket chown' doesn't set bucket instance owner or unlin...
- 02:13 PM Bug #57968 (New): Partial fix for XML responses returning different order of XML elements
- Hi
This is a follow up on original problem reported here
https://tracker.ceph.com/issues/52027
I've added my com...
- 02:13 PM Bug #57951 (Fix Under Review): rgw: lc: lc for a single large bucket can run too long
- 02:03 PM Bug #57724 (In Progress): Keys returned by Admin API during user creation on secondary zone not v...
- 08:50 AM Bug #44660: Multipart re-uploads cause orphan data
- As discussed in [1], there is already a WIP PR with a more generic solution [2].
[1] https://github.com/ceph/c...
11/02/2022
- 08:27 PM Feature #57965 (Resolved): Add new zone option to control whether an object's first data stripe i...
- Delete requests are quite slow on clusters that have a data pool backed by HDDs, especially with an EC pool. For exam...
- 06:58 PM Bug #57562: multisite replication issue on Quincy
- Yeah, both those commits are gone, make sure you only have the newest one.
- 06:33 PM Bug #57562: multisite replication issue on Quincy
- Adam Emerson wrote:
> Pushed a new version with what should be a fix for multi-thread and multi-client races.
We ...
- 07:22 AM Bug #57562: multisite replication issue on Quincy
- Pushed a new version with what should be a fix for multi-thread and multi-client races.
- 08:07 AM Bug #57942: rgw leaks rados objects when a part is submitted multiple times in a multipart upload
- FI pull-request https://github.com/ceph/ceph/pull/48704
11/01/2022
- 08:12 PM Bug #57942: rgw leaks rados objects when a part is submitted multiple times in a multipart upload
- FI Working on https://github.com/pgoron/ceph/commits/fix_rgw_rados_leaks_57942 to fix both issues (index entry leaks ...
- 07:25 PM Bug #57562: multisite replication issue on Quincy
- Agree as you mentioned, the other solution could be, secondary not limited to just listening on to orphan part, but co...
- 06:37 PM Bug #57562: multisite replication issue on Quincy
- Ah, I see, I need to update the async lister.
- 06:36 PM Bug #57562: multisite replication issue on Quincy
- That's the point of the commit `rgw/fifo: `part_full` is not a reliable indicator`. There is no 'orphan part' in that...
- 05:39 PM Bug #57562: multisite replication issue on Quincy
- Hey Adam,
Just a heads-up we tested with latest commit and we still see the issue.
The issue is seen when running M...
- 02:11 PM Bug #57562: multisite replication issue on Quincy
- Thank you Adam. We'll test with the latest change.
- 04:27 PM Bug #44660 (Fix Under Review): Multipart re-uploads cause orphan data
- Actually it looks like there is a simpler solution to this problem, which uses the meta object lock when checking if ...