Activity

From 11/08/2022 to 12/07/2022

12/07/2022

11:17 PM Bug #58190: Large RGW GC queue might prevent OSD from starting
Matt Benjamin wrote:
> That's certainly very interesting and, hopefully, trivially tunable?
>
> Matt
Hopefully...
Igor Fedotov
11:16 PM Bug #58190 (Fix Under Review): Large RGW GC queue might prevent OSD from starting
Igor Fedotov
07:18 PM Backport #58212 (In Progress): quincy: Improve performance of multi-object delete by handling ind...
https://github.com/ceph/ceph/pull/50208 Backport Bot
07:18 PM Backport #58211 (Resolved): pacific: Improve performance of multi-object delete by handling indiv...
https://github.com/ceph/ceph/pull/49327 Backport Bot
07:11 PM Feature #57947 (Pending Backport): Improve performance of multi-object delete by handling individ...
Casey Bodley
05:17 PM Backport #56407 (Resolved): pacific: rgw gc object leak when gc omap set entry failed with a larg...
Konstantin Shalygin
04:27 PM Bug #58167 (Fix Under Review): No Authentication/Authorization for creating topics on RGW
Casey Bodley
08:52 AM Bug #58167: No Authentication/Authorization for creating topics on RGW
https://github.com/ceph/ceph/pull/49297, I opened a PR to avoid anonymous authentication when creating a topic. lei cao
04:25 PM Bug #57324 (Fix Under Review): RGWBucketInstanceMetadataObject is set after being passed to base ...
Casey Bodley
09:48 AM Bug #57324: RGWBucketInstanceMetadataObject is set after being passed to base class constructors
https://github.com/ceph/ceph/pull/49298 lei cao
01:48 AM Bug #57562: multisite replication issue on Quincy
Hey Adam,
Quick correction about the testing update for the 2nd PR (using vector), we do see a race condition with the...
Krunal Chheda

12/06/2022

11:46 PM Bug #58190: Large RGW GC queue might prevent OSD from starting
That's certainly very interesting and, hopefully, trivially tunable?
Matt
Matt Benjamin
10:41 PM Bug #58190: Large RGW GC queue might prevent OSD from starting
Matt Benjamin wrote:
> I think this could also be interacting with the issue being addressed here:
>
> https://gi...
Igor Fedotov
05:08 PM Bug #58190: Large RGW GC queue might prevent OSD from starting
I think this could also be interacting with the issue being addressed here:
https://github.com/ceph/ceph/pull/4883...
Matt Benjamin
04:26 PM Bug #58190 (Resolved): Large RGW GC queue might prevent OSD from starting
It looks like rgw_gc_queue_list_entries might cause HDD-based OSD to load the queue for more than half an hour.
Whi...
Igor Fedotov

12/05/2022

11:48 PM Backport #58119: pacific: check-generated.sh failures for rgw_log_entry
Casey Bodley wrote:
> https://github.com/ceph/ceph/pull/49142
merged
Yuri Weinstein
08:52 PM Bug #57562: multisite replication issue on Quincy
Hey Adam,
Quick update on the testing that was done on both the latest PR commits (multimap and vector one), we did ...
Krunal Chheda
06:58 PM Backport #58171 (Resolved): quincy: RGW misplaces index entries after dynamically resharding bucket
https://github.com/ceph/ceph/pull/49795 Backport Bot
06:58 PM Backport #58170 (Duplicate): pacific: RGW misplaces index entries after dynamically resharding bu...
Backport Bot
06:51 PM Bug #58034 (Pending Backport): RGW misplaces index entries after dynamically resharding bucket
J. Eric Ivancich
05:04 PM Bug #58167: No Authentication/Authorization for creating topics on RGW
In my example in the original comment the curl was run on a node inside the Ceph test cluster (of Apple M1 Max VMs).
...
Ulrich Klein
04:34 PM Bug #58167: No Authentication/Authorization for creating topics on RGW
* creating a topic by using curl without any user credential is a critical security issue.
* since topics are global...
Yuval Lifshitz
04:09 PM Bug #58167 (Pending Backport): No Authentication/Authorization for creating topics on RGW
I'm on a containerized Ceph 17.2.5 serving only RGW/S3 clients.
I'm experimenting with notifications for S3 bucket...
Ulrich Klein

12/02/2022

04:22 PM Bug #57562: multisite replication issue on Quincy
Thanks Adam, we looked at the new PR and see that you are using a vector instead of multi-map. And then do a find on ... Krunal Chheda

12/01/2022

05:54 PM Bug #57562: multisite replication issue on Quincy
I have a more thoroughly cleaned up and refactored fix at.
Apart from other changes, it collapses identical journa...
Adam Emerson
03:47 PM Bug #58127 (Fix Under Review): multisite: test_zg_master_zone_delete fails
Casey Bodley
03:12 PM Bug #58104 (Won't Fix - EOL): `putlc` failed in slave zonegroup
Casey Bodley
03:12 PM Bug #58105 (Won't Fix - EOL): `DeleteBucketPolicy` can not delete policy in slave zonegroup
the nautilus release is no longer supported. this was fixed in pacific Casey Bodley
03:08 PM Bug #58125 (Won't Fix - EOL): In the nautilus version ceph, the notification message "awsRegion" ...
the nautilus release is no longer supported so won't receive any more backports Casey Bodley
06:14 AM Bug #58125: In the nautilus version ceph, the notification message "awsRegion" parameter is null
https://tracker.ceph.com/issues/53186, can be backported to N. lei cao
03:07 PM Bug #58136 (Fix Under Review): usage trim has infinite loop problem
Casey Bodley
09:04 AM Bug #58136: usage trim has infinite loop problem

https://github.com/ceph/ceph/pull/49168
lei cao
06:35 AM Bug #58136 (Fix Under Review): usage trim has infinite loop problem
try usage trim only specifying "--bucket", when first MAX_USAGE_TRIM_ENTRIES entries in cls method RGW_USER_USAGE_LOG... lei cao

11/30/2022

04:02 PM Bug #57562: multisite replication issue on Quincy
Adam Emerson wrote:
> Krunal Chheda wrote:
> > Thanks once again for explanation, i was trying to figure out how t...
Krunal Chheda
02:48 AM Bug #57562: multisite replication issue on Quincy
Krunal Chheda wrote:
> Thanks once again for explanation, i was trying to figure out how the assert_exists() works ...
Adam Emerson
02:15 AM Bug #57562: multisite replication issue on Quincy
Adam Emerson wrote:
> Krunal Chheda wrote:
> > Regarding the ENOENT, with assert_exists in place now, will the rea...
Krunal Chheda
01:38 AM Bug #57562: multisite replication issue on Quincy
Specifically in cls_fifo_legacy.cc/push_part() Adam Emerson
01:37 AM Bug #57562: multisite replication issue on Quincy
Krunal Chheda wrote:
> Regarding the ENOENT, with assert_exists in place now, will the read_part_header on the trim...
Adam Emerson
01:19 AM Bug #57562: multisite replication issue on Quincy
Adam Emerson wrote:
> I've pushed a commit that uses assert_exists, then fetches metadata on -ENOENT.
Thanks for ...
Krunal Chheda
01:07 AM Bug #57562: multisite replication issue on Quincy
I've pushed a commit that uses assert_exists, then fetches metadata on -ENOENT. Adam Emerson
03:47 PM Bug #58127 (Resolved): multisite: test_zg_master_zone_delete fails
ex. http://qa-proxy.ceph.com/teuthology/cbodley-2022-09-29_01:41:10-rgw-wip-rgw-sal-bootstrap-distro-default-smithi/7... Casey Bodley
01:57 PM Backport #58119 (In Progress): pacific: check-generated.sh failures for rgw_log_entry
https://github.com/ceph/ceph/pull/49142 Casey Bodley
08:26 AM Bug #58125 (Won't Fix - EOL): In the nautilus version ceph, the notification message "awsRegion" ...
The content of the message is as follows:
{
"Records": [{
"eventVersion": "2.2",
"eventSource...
wang kevin

11/29/2022

11:40 PM Bug #57562: multisite replication issue on Quincy
This is very useful, thank you, and might explain why it's happening.
So from your logs, does it seem like the par...
Adam Emerson
11:18 PM Bug #57562: multisite replication issue on Quincy
So coming back to the EIO analysis, what we have found so far is this issue happens for more than 1 RGW instance running... Krunal Chheda
09:08 PM Bug #57562: multisite replication issue on Quincy
Adam Emerson wrote:
> Can you point me to the PR for retrying on EIO?
https://github.com/adamemerson/ceph/pull/4/...
Krunal Chheda
04:39 PM Bug #57562: multisite replication issue on Quincy
Can you point me to the PR for retrying on EIO? Adam Emerson
09:35 PM Backport #58119 (Resolved): pacific: check-generated.sh failures for rgw_log_entry
Backport Bot
09:35 PM Backport #58118 (In Progress): quincy: check-generated.sh failures for rgw_log_entry
Backport Bot
09:28 PM Bug #58115 (Pending Backport): check-generated.sh failures for rgw_log_entry
Casey Bodley
06:36 PM Bug #58115 (Fix Under Review): check-generated.sh failures for rgw_log_entry
Casey Bodley
06:35 PM Bug #58115 (Pending Backport): check-generated.sh failures for rgw_log_entry
... Casey Bodley
07:44 PM Bug #57770: RGW (pacific) misplaces index entries after dynamically resharding bucket
J. Eric Ivancich wrote:
> The code on the PR seems to address the issue. My colleague Mark Kogan ran it through a te...
Nick Janus
07:21 PM Bug #58111 (Fix Under Review): crash: verify_bucket_owner_or_policy
Casey Bodley
09:42 AM Bug #58111: crash: verify_bucket_owner_or_policy
... Ilsoo Byun
09:28 AM Bug #58111 (Resolved): crash: verify_bucket_owner_or_policy
When executing 's3cmd ls s3://a:', rgw was terminated. ... Ilsoo Byun
10:40 AM Bug #44660: Multipart re-uploads cause orphan data
It is a very big problem for us.
We have a lot of big buckets with orphaned parts which use hundreds of TBs of space.
...
Aleksandr Rudenko
10:39 AM Bug #16767: RadosGW Multipart Cleanup Failure
It is a very big problem for us.
We have a lot of big buckets with orphaned parts which use hundreds of TBs of space.
...
Aleksandr Rudenko
02:13 AM Bug #54908: crash: double const md_config_t::get_val<double>(ConfigValues const&, std::basic_stri...
Similar problem.
(gdb) bt full ...
chao wang
01:35 AM Bug #58105 (Won't Fix - EOL): `DeleteBucketPolicy` can not delete policy in slave zonegroup
Huber ming
01:11 AM Bug #58104 (Won't Fix - EOL): `putlc` failed in slave zonegroup
radosgw version: nautilus(14.2.15) Huber ming

11/28/2022

07:29 PM Bug #57562: multisite replication issue on Quincy
Adam Emerson wrote:
> Yes, I just pushed it. We're still testing for regression, but you're welcome to try it.
Than...
Jane Zhu
05:31 PM Bug #57562: multisite replication issue on Quincy
Yes, I just pushed it. We're still testing for regression, but you're welcome to try it. Adam Emerson
05:13 PM Bug #57562: multisite replication issue on Quincy
Adam Emerson wrote:
> Also the goal is to remove tags as part of the fix.
>
> We had some confusion over where a ...
Jane Zhu
03:36 PM Bug #57562: multisite replication issue on Quincy
So currently without the tag changes we see an issue as mentioned here in "comment":https://tracker.ceph.com/issues/57... Krunal Chheda
03:17 PM Bug #57562: multisite replication issue on Quincy
Also the goal is to remove tags as part of the fix.
We had some confusion over where a regression is and it's conf...
Adam Emerson
03:17 PM Bug #57562: multisite replication issue on Quincy
> Also based on our previous test analysis, we think removal of tags will still not completely solve the race conditi... Adam Emerson
11:42 AM Documentation #58092 (New): rgw_enable_gc_threads / lc_threads not documented on web
Options rgw_enable_gc_threads and rgw_enable_lc_threads are not rendered for docs.ceph.com.
I would expect those t...
Dan van der Ster
09:11 AM Backport #57238 (In Progress): pacific: crash: RGWCoroutinesStack::wakeup()
Cory Snyder
09:11 AM Backport #57237 (In Progress): quincy: crash: RGWCoroutinesStack::wakeup()
Cory Snyder
09:08 AM Backport #55228 (In Progress): pacific: crash: RGWGC::send_chain(cls_rgw_obj_chain&, std::basic_s...
Cory Snyder
09:06 AM Backport #55227 (In Progress): quincy: crash: RGWGC::send_chain(cls_rgw_obj_chain&, std::basic_st...
Cory Snyder
09:05 AM Backport #54497 (In Progress): pacific: bucket index completions may not retry after reshard
Cory Snyder
09:04 AM Backport #54496 (In Progress): quincy: bucket index completions may not retry after reshard
Cory Snyder
09:03 AM Backport #54155 (In Progress): pacific: rgw: "reshard cancel" errors with "invalid argument"
Cory Snyder
09:03 AM Backport #54157 (In Progress): quincy: rgw: "reshard cancel" errors with "invalid argument"
Cory Snyder
09:00 AM Backport #55505 (In Progress): pacific: radosgw rejects some requests without Content-MD5 Header
Cory Snyder
08:59 AM Backport #55506 (In Progress): quincy: radosgw rejects some requests without Content-MD5 Header
Cory Snyder
08:37 AM Backport #57409 (In Progress): pacific: rgw: bucket list operation slow down in special scenario
Cory Snyder
08:32 AM Backport #57410 (In Progress): quincy: rgw: bucket list operation slow down in special scenario
Cory Snyder
08:20 AM Backport #57752 (In Progress): quincy: Log status of individual object deletions for multi-object...
Cory Snyder
07:55 AM Backport #54493 (In Progress): quincy: segmentation fault in UserAsyncRefreshHandler::init_fetch
Cory Snyder

11/25/2022

03:57 PM Backport #58087 (In Progress): quincy: rgw/cloud-tranistion: Issues with MCG cloud endpoint
Soumya Koduri
03:56 PM Backport #58087: quincy: rgw/cloud-tranistion: Issues with MCG cloud endpoint
https://github.com/ceph/ceph/pull/49061 Soumya Koduri
03:23 PM Backport #58087 (In Progress): quincy: rgw/cloud-tranistion: Issues with MCG cloud endpoint
Backport Bot
03:17 PM Bug #57979 (Pending Backport): rgw/cloud-tranistion: Issues with MCG cloud endpoint
Soumya Koduri

11/23/2022

06:06 PM Bug #58059 (Resolved): s3tests v2 SignatureDoesNotMatch failures on ubuntu
Casey Bodley
04:19 PM Bug #57853 (Fix Under Review): multisite sync process block after long time running
Casey Bodley

11/22/2022

11:10 PM Bug #58053: bucket is list in s3cmd but can not be queried after deleting
*ceph-qa-suite* should be RGW Max Gao
06:20 AM Bug #58053: bucket is list in s3cmd but can not be queried after deleting
Prepare a policy.json file that contains the bucket policy for testing.... Max Gao
06:13 AM Bug #58053: bucket is list in s3cmd but can not be queried after deleting
The bug can be reproduced with the following script:
Max Gao
07:48 PM Bug #58059 (Fix Under Review): s3tests v2 SignatureDoesNotMatch failures on ubuntu
https://github.com/ceph/s3-tests/pull/476 Casey Bodley
05:03 PM Bug #58059: s3tests v2 SignatureDoesNotMatch failures on ubuntu
bisected botocore versions down to good=botocore-1.27.96 bad=botocore-1.28.0
botocore debug log output from good=b...
Casey Bodley
03:11 PM Bug #58059: s3tests v2 SignatureDoesNotMatch failures on ubuntu
boto versions from a failing run on ubuntu:... Casey Bodley
04:42 AM Bug #57562: multisite replication issue on Quincy
Hi Adam, another question regarding your changes to remove the use of "tags". I'd like to understand your opinion/pla... Jane Zhu
04:34 AM Bug #57562: multisite replication issue on Quincy
Did some investigation on the latest failure of the tests on `lastest - 1` "PR":https://github.com/ceph/ceph/pull/486... Jane Zhu

11/21/2022

09:03 PM Bug #58059 (Resolved): s3tests v2 SignatureDoesNotMatch failures on ubuntu
from main branch results: https://pulpito.ceph.com/cbodley-2022-11-21_18:00:47-rgw-main-distro-default-smithi/
s3t...
Casey Bodley
08:41 PM Bug #57562: multisite replication issue on Quincy
Hi Adam,
Wanted to provide you with an update about the testing that we did over the weekend,
We took 2 PR's, one w...
Krunal Chheda
03:56 PM Bug #55310 (Duplicate): [pacific] RadosGW instance of Cloud Sync zone crashes when objects are up...
Casey Bodley
12:00 AM Bug #58053 (Need More Info): bucket is list in s3cmd but can not be queried after deleting
ceph version 16.2.10
There will be a race between s3.DeleteBucket and s3.DeleteBucketPolicy. When the race happens...
Max Gao

11/17/2022

03:10 PM Bug #58034 (In Progress): RGW misplaces index entries after dynamically resharding bucket
Casey Bodley
08:30 AM Bug #55498 (Duplicate): "AssertionError: rgw multisite test failures" in upgrade:octopus
will be fixed as part of: https://tracker.ceph.com/issues/58036 Yuval Lifshitz
08:25 AM Backport #58036 (In Progress): pacific: pubsub test failures
Yuval Lifshitz

11/16/2022

07:30 PM Backport #58036 (Resolved): pacific: pubsub test failures
https://github.com/ceph/ceph/pull/48928 Backport Bot
07:09 PM Bug #56572: pubsub test failures
this should also be backported to pacific, in order to fix the upgrade issues. see: https://tracker.ceph.com/issues/5... Yuval Lifshitz
06:28 PM Bug #57562: multisite replication issue on Quincy
Hey Adam,
So after all of our current testing and debugging of the issue, the current race condition is that the same pa...
Krunal Chheda
04:43 PM Bug #57562: multisite replication issue on Quincy
Adam Emerson wrote:
> Hold off for now, I've introduced one problem I need to debug.
Ack.
We are currently tes...
Oguzhan Ozmen
11:02 AM Bug #57562: multisite replication issue on Quincy
Hold off for now, I've introduced one problem I need to debug. Adam Emerson
10:47 AM Bug #57562: multisite replication issue on Quincy
I have pushed a new commit. It disables all use of the part tags. I believe this should eliminate any remaining diffi... Adam Emerson
06:08 PM Bug #58020 (Fix Under Review): notifications: zero timestamp in complete multipart upload event
Yuval Lifshitz
06:06 PM Backport #57561 (In Progress): quincy: pubsub test failures
Yuval Lifshitz
03:26 PM Bug #58035 (Fix Under Review): Copying an object to itself crashes de RGW if executed as admin user.
Casey Bodley
10:44 AM Bug #58035 (Pending Backport): Copying an object to itself crashes de RGW if executed as admin user.
This was observed after executing the test *s3tests_boto3.functional.test_s3:test_object_copy_to_itself* with a user ... Xavi Garcia
03:28 AM Bug #57980 (Fix Under Review): rgw/cloud-transition: transition fails when using MCG Azure Namesp...
>>> From the http packets -> MCG is returning 403 for HEAD request and 400 for PUT request (failed). Maybe the header... Soumya Koduri

11/15/2022

09:56 PM Bug #58034 (Resolved): RGW misplaces index entries after dynamically resharding bucket
When RGW reshards buckets with ~250k index entries*, I've noticed some s3:PutObject requests that return 200 end up w... J. Eric Ivancich
09:54 PM Bug #57770: RGW (pacific) misplaces index entries after dynamically resharding bucket
The code on the PR seems to address the issue. My colleague Mark Kogan ran it through a test at scale and it behaved ... J. Eric Ivancich
07:34 PM Bug #58033 (New): multipart copy part: use refcount optimization when possible
rgw only supports CopyObject[1] for object sizes up to rgw_max_put_size=5GB, and requires multipart with UploadPartCo... Casey Bodley
07:07 PM Bug #50076 (Fix Under Review): route librdkafka log messages to rgw log
Yuval Lifshitz
05:45 PM Bug #50076 (In Progress): route librdkafka log messages to rgw log
Yuval Lifshitz

11/14/2022

03:57 PM Bug #58020 (Pending Backport): notifications: zero timestamp in complete multipart upload event
this is a regression due to: https://github.com/ceph/ceph/pull/42266
(original fix was: https://github.com/ceph/ceph...
Yuval Lifshitz
03:13 PM Bug #58014 (Fix Under Review): notifications: metadata does not work for COPY events
Yuval Lifshitz
01:42 PM Bug #57562: multisite replication issue on Quincy
Adam, I submitted a PR on top of yours. My 8h test has passed with this PR. The same test usually failed on earlier v... Jane Zhu
01:37 PM Bug #57562: multisite replication issue on Quincy
Two more race conditions found. Both involve the journal entries.
h3. *Race condition 1:*...
Jane Zhu

11/13/2022

09:37 AM Bug #58014 (Pending Backport): notifications: metadata does not work for COPY events
this is a regression due to: https://github.com/ceph/ceph/pull/39192/commits/35a4eb4410394a0014648dda7df92642f3b536d3... Yuval Lifshitz

11/11/2022

01:01 AM Bug #57562: multisite replication issue on Quincy
It should, thank you. I don't think it's the underlying cause, but it's a good catch. Adam Emerson

11/10/2022

09:06 PM Bug #57562: multisite replication issue on Quincy
A potential bug?
https://github.com/ceph/ceph/blob/main/src/cls/fifo/cls_fifo_types.h#L66
Should it be the follow...
Jane Zhu
03:25 PM Bug #57706 (Need More Info): When creating a new user, if the 'uid' is not provided, error report...
Hi Kevin Wang,
Could I get what version of Ceph this issue occurred on? The issue does seem to be resolved in the ...
Ali Maredia
03:07 PM Bug #57724 (Fix Under Review): Keys returned by Admin API during user creation on secondary zone ...
Casey Bodley

11/09/2022

10:11 PM Bug #57706: When creating a new user, if the 'uid' is not provided, error reported as 'Permission...
On a branch close to the master branch from a vstart cluster when I try this same scenario I see:
[ali@acadia buil...
Ali Maredia
09:51 PM Bug #57562: multisite replication issue on Quincy
We also found a place that might potentially cause issues.
Rgw locks the mutex and gets some data from "info" befo...
Jane Zhu
09:22 PM Bug #57562: multisite replication issue on Quincy
Here is some more detailed explanation on how the -EINVAL(-22) error (hence datalog writing failure) happens based on... Jane Zhu

11/08/2022

02:49 PM Bug #57911 (Fix Under Review): Segmentation fault when uploading file with bucket policy on Quincy
Daniel Gryniewicz
 
