Activity

From 11/01/2022 to 11/30/2022

11/30/2022

04:02 PM Bug #57562: multisite replication issue on Quincy
Adam Emerson wrote:
> Krunal Chheda wrote:
> > Thanks once again for explanation, i was trying to figure out how t...
Krunal Chheda
02:48 AM Bug #57562: multisite replication issue on Quincy
Krunal Chheda wrote:
> Thanks once again for explanation, i was trying to figure out how the assert_exists() works ...
Adam Emerson
02:15 AM Bug #57562: multisite replication issue on Quincy
Adam Emerson wrote:
> Krunal Chheda wrote:
> > Regarding the ENOENT, with assert_exists in place now, will the rea...
Krunal Chheda
01:38 AM Bug #57562: multisite replication issue on Quincy
Specifically in cls_fifo_legacy.cc/push_part() Adam Emerson
01:37 AM Bug #57562: multisite replication issue on Quincy
Krunal Chheda wrote:
> Regarding the ENOENT, with assert_exists in place now, will the read_part_header on the trim...
Adam Emerson
01:19 AM Bug #57562: multisite replication issue on Quincy
Adam Emerson wrote:
> I've pushed a commit that uses assert_exists, then fetches metadata on -ENOENT.
Thanks for ...
Krunal Chheda
01:07 AM Bug #57562: multisite replication issue on Quincy
I've pushed a commit that uses assert_exists, then fetches metadata on -ENOENT. Adam Emerson
03:47 PM Bug #58127 (Resolved): multisite: test_zg_master_zone_delete fails
ex. http://qa-proxy.ceph.com/teuthology/cbodley-2022-09-29_01:41:10-rgw-wip-rgw-sal-bootstrap-distro-default-smithi/7... Casey Bodley
01:57 PM Backport #58119 (In Progress): pacific: check-generated.sh failures for rgw_log_entry
https://github.com/ceph/ceph/pull/49142 Casey Bodley
08:26 AM Bug #58125 (Won't Fix - EOL): In the nautilus version ceph, the notification message "awsRegion" ...
The content of the message is as follows:
{
"Records": [{
"eventVersion": "2.2",
"eventSource...
wang kevin

11/29/2022

11:40 PM Bug #57562: multisite replication issue on Quincy
This is very useful, thank you, and might explain why it's happening.
So from your logs, does it seem like the par...
Adam Emerson
11:18 PM Bug #57562: multisite replication issue on Quincy
So coming back to EIO analysis, what we have found so far is this issue happens for more than 1 RGW instances running... Krunal Chheda
09:08 PM Bug #57562: multisite replication issue on Quincy
Adam Emerson wrote:
> Can you point me to the PR for retrying on EIO?
https://github.com/adamemerson/ceph/pull/4/...
Krunal Chheda
04:39 PM Bug #57562: multisite replication issue on Quincy
Can you point me to the PR for retrying on EIO? Adam Emerson
09:35 PM Backport #58119 (Resolved): pacific: check-generated.sh failures for rgw_log_entry
Backport Bot
09:35 PM Backport #58118 (In Progress): quincy: check-generated.sh failures for rgw_log_entry
Backport Bot
09:28 PM Bug #58115 (Pending Backport): check-generated.sh failures for rgw_log_entry
Casey Bodley
06:36 PM Bug #58115 (Fix Under Review): check-generated.sh failures for rgw_log_entry
Casey Bodley
06:35 PM Bug #58115 (Pending Backport): check-generated.sh failures for rgw_log_entry
... Casey Bodley
07:44 PM Bug #57770: RGW (pacific) misplaces index entries after dynamically resharding bucket
J. Eric Ivancich wrote:
> The code on the PR seems to address the issue. My colleague Mark Kogan ran it through a te...
Nick Janus
07:21 PM Bug #58111 (Fix Under Review): crash: verify_bucket_owner_or_policy
Casey Bodley
09:42 AM Bug #58111: crash: verify_bucket_owner_or_policy
... Ilsoo Byun
09:28 AM Bug #58111 (Resolved): crash: verify_bucket_owner_or_policy
When executing 's3cmd ls s3://a:', rgw was terminated. ... Ilsoo Byun
10:40 AM Bug #44660: Multipart re-uploads cause orphan data
It is a very big problem for us.
We have a lot of big buckets with orphaned parts which use hundreds of TBs of space.
...
Aleksandr Rudenko
10:39 AM Bug #16767: RadosGW Multipart Cleanup Failure
It is a very big problem for us.
We have a lot of big buckets with orphaned parts which use hundreds of TBs of space.
...
Aleksandr Rudenko
02:13 AM Bug #54908: crash: double const md_config_t::get_val<double>(ConfigValues const&, std::basic_stri...
Similar problem.
(gdb) bt full ...
chao wang
01:35 AM Bug #58105 (Won't Fix - EOL): `DeleteBucketPolicy` can not delete policy in slave zonegroup
Huber ming
01:11 AM Bug #58104 (Won't Fix - EOL): `putlc` failed in slave zonegroup
radosgw version: nautilus(14.2.15) Huber ming

11/28/2022

07:29 PM Bug #57562: multisite replication issue on Quincy
Adam Emerson wrote:
> Yes, I just pushed it. We're still testing for regression, but you're welcome to try it.
Than...
Jane Zhu
05:31 PM Bug #57562: multisite replication issue on Quincy
Yes, I just pushed it. We're still testing for regression, but you're welcome to try it. Adam Emerson
05:13 PM Bug #57562: multisite replication issue on Quincy
Adam Emerson wrote:
> Also the goal is to remove tags as part of the fix.
>
> We had some confusion over where a ...
Jane Zhu
03:36 PM Bug #57562: multisite replication issue on Quincy
So currently without the tag changes we see an issue as mentioned here in "comment":https://tracker.ceph.com/issues/57... Krunal Chheda
03:17 PM Bug #57562: multisite replication issue on Quincy
Also the goal is to remove tags as part of the fix.
We had some confusion over where a regression is and it's conf...
Adam Emerson
03:17 PM Bug #57562: multisite replication issue on Quincy
> Also based on our previous test analysis, we think removal of tags will still not completely solve the race conditi... Adam Emerson
11:42 AM Documentation #58092 (New): rgw_enable_gc_threads / lc_threads not documented on web
Options rgw_enable_gc_threads and rgw_enable_lc_threads are not rendered for docs.ceph.com.
I would expect those t...
Dan van der Ster
09:11 AM Backport #57238 (In Progress): pacific: crash: RGWCoroutinesStack::wakeup()
Cory Snyder
09:11 AM Backport #57237 (In Progress): quincy: crash: RGWCoroutinesStack::wakeup()
Cory Snyder
09:08 AM Backport #55228 (In Progress): pacific: crash: RGWGC::send_chain(cls_rgw_obj_chain&, std::basic_s...
Cory Snyder
09:06 AM Backport #55227 (In Progress): quincy: crash: RGWGC::send_chain(cls_rgw_obj_chain&, std::basic_st...
Cory Snyder
09:05 AM Backport #54497 (In Progress): pacific: bucket index completions may not retry after reshard
Cory Snyder
09:04 AM Backport #54496 (In Progress): quincy: bucket index completions may not retry after reshard
Cory Snyder
09:03 AM Backport #54155 (In Progress): pacific: rgw: "reshard cancel" errors with "invalid argument"
Cory Snyder
09:03 AM Backport #54157 (In Progress): quincy: rgw: "reshard cancel" errors with "invalid argument"
Cory Snyder
09:00 AM Backport #55505 (In Progress): pacific: radosgw rejects some requests without Content-MD5 Header
Cory Snyder
08:59 AM Backport #55506 (In Progress): quincy: radosgw rejects some requests without Content-MD5 Header
Cory Snyder
08:37 AM Backport #57409 (In Progress): pacific: rgw: bucket list operation slow down in special scenario
Cory Snyder
08:32 AM Backport #57410 (In Progress): quincy: rgw: bucket list operation slow down in special scenario
Cory Snyder
08:20 AM Backport #57752 (In Progress): quincy: Log status of individual object deletions for multi-object...
Cory Snyder
07:55 AM Backport #54493 (In Progress): quincy: segmentation fault in UserAsyncRefreshHandler::init_fetch
Cory Snyder

11/25/2022

03:57 PM Backport #58087 (In Progress): quincy: rgw/cloud-tranistion: Issues with MCG cloud endpoint
Soumya Koduri
03:56 PM Backport #58087: quincy: rgw/cloud-tranistion: Issues with MCG cloud endpoint
https://github.com/ceph/ceph/pull/49061 Soumya Koduri
03:23 PM Backport #58087 (In Progress): quincy: rgw/cloud-tranistion: Issues with MCG cloud endpoint
Backport Bot
03:17 PM Bug #57979 (Pending Backport): rgw/cloud-tranistion: Issues with MCG cloud endpoint
Soumya Koduri

11/23/2022

06:06 PM Bug #58059 (Resolved): s3tests v2 SignatureDoesNotMatch failures on ubuntu
Casey Bodley
04:19 PM Bug #57853 (Fix Under Review): multisite sync process block after long time running
Casey Bodley

11/22/2022

11:10 PM Bug #58053: bucket is list in s3cmd but can not be queried after deleting
*ceph-qa-suite* should be RGW Max Gao
06:20 AM Bug #58053: bucket is list in s3cmd but can not be queried after deleting
Prepare a policy.json file that contains the bucket policy for testing.... Max Gao
06:13 AM Bug #58053: bucket is list in s3cmd but can not be queried after deleting
The bug can be reproduced with the following script:
Max Gao
07:48 PM Bug #58059 (Fix Under Review): s3tests v2 SignatureDoesNotMatch failures on ubuntu
https://github.com/ceph/s3-tests/pull/476 Casey Bodley
05:03 PM Bug #58059: s3tests v2 SignatureDoesNotMatch failures on ubuntu
bisected botocore versions down to good=botocore-1.27.96 bad=botocore-1.28.0
botocore debug log output from good=b...
Casey Bodley
03:11 PM Bug #58059: s3tests v2 SignatureDoesNotMatch failures on ubuntu
boto versions from a failing run on ubuntu:... Casey Bodley
04:42 AM Bug #57562: multisite replication issue on Quincy
Hi Adam, another question regarding your changes to remove the use of "tags". I'd like to understand your opinion/pla... Jane Zhu
04:34 AM Bug #57562: multisite replication issue on Quincy
Did some investigation on the latest failure of the tests on `lastest - 1` "PR":https://github.com/ceph/ceph/pull/486... Jane Zhu

11/21/2022

09:03 PM Bug #58059 (Resolved): s3tests v2 SignatureDoesNotMatch failures on ubuntu
from main branch results: https://pulpito.ceph.com/cbodley-2022-11-21_18:00:47-rgw-main-distro-default-smithi/
s3t...
Casey Bodley
08:41 PM Bug #57562: multisite replication issue on Quincy
Hi Adam,
Wanted to provide you with an update about the testing that we did over the weekend,
We took 2 PRs, one w...
Krunal Chheda
03:56 PM Bug #55310 (Duplicate): [pacific] RadosGW instance of Cloud Sync zone crashes when objects are up...
Casey Bodley
12:00 AM Bug #58053 (Need More Info): bucket is list in s3cmd but can not be queried after deleting
ceph version 16.2.10
There will be a race between s3.DeleteBucket and s3.DeleteBucketPolicy. When the race happens...
Max Gao

11/17/2022

03:10 PM Bug #58034 (In Progress): RGW misplaces index entries after dynamically resharding bucket
Casey Bodley
08:30 AM Bug #55498 (Duplicate): "AssertionError: rgw multisite test failures" in upgrade:octopus
will be fixed as part of: https://tracker.ceph.com/issues/58036 Yuval Lifshitz
08:25 AM Backport #58036 (In Progress): pacific: pubsub test failures
Yuval Lifshitz

11/16/2022

07:30 PM Backport #58036 (Resolved): pacific: pubsub test failures
https://github.com/ceph/ceph/pull/48928 Backport Bot
07:09 PM Bug #56572: pubsub test failures
this should also be backported to pacific, in order to fix the upgrade issues. see: https://tracker.ceph.com/issues/5... Yuval Lifshitz
06:28 PM Bug #57562: multisite replication issue on Quincy
Hey Adam,
So after all of our current testing and debugging of issue, the current race condition is that the same pa...
Krunal Chheda
04:43 PM Bug #57562: multisite replication issue on Quincy
Adam Emerson wrote:
> Hold off for now, I've introduced one problem I need to debug.
Ack.
We are currently tes...
Oguzhan Ozmen
11:02 AM Bug #57562: multisite replication issue on Quincy
Hold off for now, I've introduced one problem I need to debug. Adam Emerson
10:47 AM Bug #57562: multisite replication issue on Quincy
I have pushed a new commit. It disables all use of the part tags. I believe this should eliminate any remaining diffi... Adam Emerson
06:08 PM Bug #58020 (Fix Under Review): notifications: zero timestamp in complete multipart upload event
Yuval Lifshitz
06:06 PM Backport #57561 (In Progress): quincy: pubsub test failures
Yuval Lifshitz
03:26 PM Bug #58035 (Fix Under Review): Copying an object to itself crashes de RGW if executed as admin user.
Casey Bodley
10:44 AM Bug #58035 (Pending Backport): Copying an object to itself crashes de RGW if executed as admin user.
This was observed after executing the test *s3tests_boto3.functional.test_s3:test_object_copy_to_itself* with a user ... Xavi Garcia
03:28 AM Bug #57980 (Fix Under Review): rgw/cloud-transition: transition fails when using MCG Azure Namesp...
>>> From the http packets -> MCG is returning 403 for HEAD request and 400 for PUT request (failed). Maybe the header... Soumya Koduri

11/15/2022

09:56 PM Bug #58034 (Resolved): RGW misplaces index entries after dynamically resharding bucket
When RGW reshards buckets with ~250k index entries*, I've noticed some s3:PutObject requests that return 200 end up w... J. Eric Ivancich
09:54 PM Bug #57770: RGW (pacific) misplaces index entries after dynamically resharding bucket
The code on the PR seems to address the issue. My colleague Mark Kogan ran it through a test at scale and it behaved ... J. Eric Ivancich
07:34 PM Bug #58033 (New): multipart copy part: use refcount optimization when possible
rgw only supports CopyObject[1] for object sizes up to rgw_max_put_size=5GB, and requires multipart with UploadPartCo... Casey Bodley
07:07 PM Bug #50076 (Fix Under Review): route librdkafka log messages to rgw log
Yuval Lifshitz
05:45 PM Bug #50076 (In Progress): route librdkafka log messages to rgw log
Yuval Lifshitz

11/14/2022

03:57 PM Bug #58020 (Pending Backport): notifications: zero timestamp in complete multipart upload event
this is a regression due to: https://github.com/ceph/ceph/pull/42266
(original fix was: https://github.com/ceph/ceph...
Yuval Lifshitz
03:13 PM Bug #58014 (Fix Under Review): notifications: metadata does not work for COPY events
Yuval Lifshitz
01:42 PM Bug #57562: multisite replication issue on Quincy
Adam, I submitted a PR on top of yours. My 8h test has passed with this PR. The same test usually failed on earlier v... Jane Zhu
01:37 PM Bug #57562: multisite replication issue on Quincy
Two more race conditions found. Both are on the journal entries.
h3. *Race condition 1:*...
Jane Zhu

11/13/2022

09:37 AM Bug #58014 (Pending Backport): notifications: metadata does not work for COPY events
this is a regression due to: https://github.com/ceph/ceph/pull/39192/commits/35a4eb4410394a0014648dda7df92642f3b536d3... Yuval Lifshitz

11/11/2022

01:01 AM Bug #57562: multisite replication issue on Quincy
It should, thank you. I don't think it's the underlying cause, but it's a good catch. Adam Emerson

11/10/2022

09:06 PM Bug #57562: multisite replication issue on Quincy
A potential bug?
https://github.com/ceph/ceph/blob/main/src/cls/fifo/cls_fifo_types.h#L66
Should it be the follow...
Jane Zhu
03:25 PM Bug #57706 (Need More Info): When creating a new user, if the 'uid' is not provided, error report...
Hi Kevin Wang,
Could I get what version of Ceph this issue occurred on? The issue does seem to be resolved in the ...
Ali Maredia
03:07 PM Bug #57724 (Fix Under Review): Keys returned by Admin API during user creation on secondary zone ...
Casey Bodley

11/09/2022

10:11 PM Bug #57706: When creating a new user, if the 'uid' is not provided, error reported as 'Permission...
On a branch close to the master branch from a vstart cluster when I try this same scenario I see:
[ali@acadia buil...
Ali Maredia
09:51 PM Bug #57562: multisite replication issue on Quincy
We also found a place that might potentially cause issues.
Rgw locks the mutex and gets some data from "info" befo...
Jane Zhu
09:22 PM Bug #57562: multisite replication issue on Quincy
Here is some more detailed explanation on how the -EINVAL(-22) error (hence datalog writing failure) happens based on... Jane Zhu

11/08/2022

02:49 PM Bug #57911 (Fix Under Review): Segmentation fault when uploading file with bucket policy on Quincy
Daniel Gryniewicz

11/07/2022

09:45 PM Bug #57562: multisite replication issue on Quincy
> I think if the create_part is made exclusive, one of them would fail at part creation and let the other complete pa... Oguzhan Ozmen
06:23 AM Bug #57980: rgw/cloud-transition: transition fails when using MCG Azure Namespacestore with a pre...
Few observations:
- 2022-11-03T08:42:29.718+0000 7fa1bf7e6640 0 lifecycle: ERROR: failed to check object on the ...
Soumya Koduri
06:21 AM Bug #57980 (Pending Backport): rgw/cloud-transition: transition fails when using MCG Azure Namesp...
Reported by - dparkes@redhat.com
>>>>
Found Errors during cloud transition when using MCG Azure Namespacestore wit...
Soumya Koduri
06:07 AM Bug #57979 (Pending Backport): rgw/cloud-tranistion: Issues with MCG cloud endpoint
Below issues were observed while testing cloud-transition feature using MCG (Noobaa) endpoint
1) Creation of targe...
Soumya Koduri

11/04/2022

03:00 PM Bug #57911 (In Progress): Segmentation fault when uploading file with bucket policy on Quincy
Daniel Gryniewicz

11/03/2022

07:11 PM Bug #57562: multisite replication issue on Quincy
We are still testing the latest evidence (HEAD at https://github.com/ceph/ceph/commit/cfc3bde36dbc9c6e0b7182bbb325390... Oguzhan Ozmen
07:02 PM Bug #57936 (Fix Under Review): 'radosgw-admin bucket chown' doesn't set bucket instance owner or ...
Daniel Gryniewicz
02:16 PM Bug #57936 (In Progress): 'radosgw-admin bucket chown' doesn't set bucket instance owner or unlin...
Casey Bodley
02:13 PM Bug #57968 (New): Partial fix for XML responses returning different order of XML elements
Hi
This is a follow up on original problem reported here
https://tracker.ceph.com/issues/52027
I've added my com...
Daniel Iwan
02:13 PM Bug #57951 (Fix Under Review): rgw: lc: lc for a single large bucket can run too long
Casey Bodley
02:03 PM Bug #57724 (In Progress): Keys returned by Admin API during user creation on secondary zone not v...
Casey Bodley
08:50 AM Bug #44660: Multipart re-uploads cause orphan data
As it was discussed in [1] there is already a wip PR with more generic solution [2].
[1] https://github.com/ceph/c...
Mykola Golub

11/02/2022

08:27 PM Feature #57965 (Resolved): Add new zone option to control whether an object's first data stripe i...
Delete requests are quite slow on clusters that have a data pool backed by HDDs, especially with an EC pool. For exam... Cory Snyder
06:58 PM Bug #57562: multisite replication issue on Quincy
Yeah, both those commits are gone, make sure you only have the newest one. Adam Emerson
06:33 PM Bug #57562: multisite replication issue on Quincy
Adam Emerson wrote:
> Pushed a new version with what should be a fix for multi-thread and multi-client races.
We ...
Oguzhan Ozmen
07:22 AM Bug #57562: multisite replication issue on Quincy
Pushed a new version with what should be a fix for multi-thread and multi-client races. Adam Emerson
08:07 AM Bug #57942: rgw leaks rados objects when a part is submitted multiple times in a multipart upload
FI pull-request https://github.com/ceph/ceph/pull/48704 Peter Goron

11/01/2022

08:12 PM Bug #57942: rgw leaks rados objects when a part is submitted multiple times in a multipart upload
FI Working on https://github.com/pgoron/ceph/commits/fix_rgw_rados_leaks_57942 to fix both issues (index entry leaks ... Peter Goron
07:25 PM Bug #57562: multisite replication issue on Quincy
Agree as you mentioned, the other solution could be, secondary not limited to just listening on to orphan part, but co... Krunal Chheda
06:37 PM Bug #57562: multisite replication issue on Quincy
Ah, I see, I need to update the async lister. Adam Emerson
06:36 PM Bug #57562: multisite replication issue on Quincy
That's the point of the commit `rgw/fifo: `part_full` is not a reliable indicator`. There is no 'orphan part' in that... Adam Emerson
05:39 PM Bug #57562: multisite replication issue on Quincy
Hey Adam,
Just a heads-up we tested with latest commit and we still see the issue.
The issue is seen when running M...
Krunal Chheda
02:11 PM Bug #57562: multisite replication issue on Quincy
Thank you Adam. We'll test with the latest change. Oguzhan Ozmen
04:27 PM Bug #44660 (Fix Under Review): Multipart re-uploads cause orphan data
Actually it looks like there is a simpler solution to this problem, which uses the meta object lock when checking if ... Mykola Golub
 
