Activity

From 10/17/2022 to 11/15/2022

11/15/2022

09:56 PM Bug #58034 (Resolved): RGW misplaces index entries after dynamically resharding bucket
When RGW reshards buckets with ~250k index entries*, I've noticed some s3:PutObject requests that return 200 end up w... J. Eric Ivancich
09:54 PM Bug #57770: RGW (pacific) misplaces index entries after dynamically resharding bucket
The code on the PR seems to address the issue. My colleague Mark Kogan ran it through a test at scale and it behaved ... J. Eric Ivancich
07:34 PM Bug #58033 (New): multipart copy part: use refcount optimization when possible
rgw only supports CopyObject[1] for object sizes up to rgw_max_put_size=5GB, and requires multipart with UploadPartCo... Casey Bodley
07:07 PM Bug #50076 (Fix Under Review): route librdkafka log messages to rgw log
Yuval Lifshitz
05:45 PM Bug #50076 (In Progress): route librdkafka log messages to rgw log
Yuval Lifshitz

11/14/2022

03:57 PM Bug #58020 (Pending Backport): notifications: zero timestamp in complete multipart upload event
this is a regression due to: https://github.com/ceph/ceph/pull/42266
(original fix was: https://github.com/ceph/ceph...
Yuval Lifshitz
03:13 PM Bug #58014 (Fix Under Review): notifications: metadata does not work for COPY events
Yuval Lifshitz
01:42 PM Bug #57562: multisite replication issue on Quincy
Adam, I submitted a PR on top of yours. My 8h test has passed with this PR. The same test usually failed on earlier v... Jane Zhu
01:37 PM Bug #57562: multisite replication issue on Quincy
Two more race conditions found. Both are on the journal entries.
h3. *Race condition 1:*...
Jane Zhu

11/13/2022

09:37 AM Bug #58014 (Pending Backport): notifications: metadata does not work for COPY events
this is a regression due to: https://github.com/ceph/ceph/pull/39192/commits/35a4eb4410394a0014648dda7df92642f3b536d3... Yuval Lifshitz

11/11/2022

01:01 AM Bug #57562: multisite replication issue on Quincy
It should, thank you. I don't think it's the underlying cause, but it's a good catch. Adam Emerson

11/10/2022

09:06 PM Bug #57562: multisite replication issue on Quincy
A potential bug?
https://github.com/ceph/ceph/blob/main/src/cls/fifo/cls_fifo_types.h#L66
Should it be the follow...
Jane Zhu
03:25 PM Bug #57706 (Need More Info): When creating a new user, if the 'uid' is not provided, error report...
Hi Kevin Wang,
Could I get what version of Ceph this issue occurred on? The issue does seem to be resolved in the ...
Ali Maredia
03:07 PM Bug #57724 (Fix Under Review): Keys returned by Admin API during user creation on secondary zone ...
Casey Bodley

11/09/2022

10:11 PM Bug #57706: When creating a new user, if the 'uid' is not provided, error reported as 'Permission...
On a branch close to the master branch, from a vstart cluster, when I try this same scenario I see:
[ali@acadia buil...
Ali Maredia
09:51 PM Bug #57562: multisite replication issue on Quincy
We also found a place that might potentially cause issues.
Rgw locks the mutex and gets some data from "info" befo...
Jane Zhu
09:22 PM Bug #57562: multisite replication issue on Quincy
Here is some more detailed explanation on how the -EINVAL(-22) error (hence datalog writing failure) happens based on... Jane Zhu

11/08/2022

02:49 PM Bug #57911 (Fix Under Review): Segmentation fault when uploading file with bucket policy on Quincy
Daniel Gryniewicz

11/07/2022

09:45 PM Bug #57562: multisite replication issue on Quincy
> I think if the create_part is made exclusive, one of them would fail at part creation and let the other complete pa... Oguzhan Ozmen
06:23 AM Bug #57980: rgw/cloud-transition: transition fails when using MCG Azure Namespacestore with a pre...
A few observations:
- 2022-11-03T08:42:29.718+0000 7fa1bf7e6640 0 lifecycle: ERROR: failed to check object on the ...
Soumya Koduri
06:21 AM Bug #57980 (Pending Backport): rgw/cloud-transition: transition fails when using MCG Azure Namesp...
Reported by - dparkes@redhat.com
>>>>
Found Errors during cloud transition when using MCG Azure Namespacestore wit...
Soumya Koduri
06:07 AM Bug #57979 (Pending Backport): rgw/cloud-transition: Issues with MCG cloud endpoint
The issues below were observed while testing the cloud-transition feature using an MCG (Noobaa) endpoint:
1) Creation of targe...
Soumya Koduri

11/04/2022

03:00 PM Bug #57911 (In Progress): Segmentation fault when uploading file with bucket policy on Quincy
Daniel Gryniewicz

11/03/2022

07:11 PM Bug #57562: multisite replication issue on Quincy
We are still testing the latest evidence (HEAD at https://github.com/ceph/ceph/commit/cfc3bde36dbc9c6e0b7182bbb325390... Oguzhan Ozmen
07:02 PM Bug #57936 (Fix Under Review): 'radosgw-admin bucket chown' doesn't set bucket instance owner or ...
Daniel Gryniewicz
02:16 PM Bug #57936 (In Progress): 'radosgw-admin bucket chown' doesn't set bucket instance owner or unlin...
Casey Bodley
02:13 PM Bug #57968 (New): Partial fix for XML responses returning different order of XML elements
Hi
This is a follow up on original problem reported here
https://tracker.ceph.com/issues/52027
I've added my com...
Daniel Iwan
02:13 PM Bug #57951 (Fix Under Review): rgw: lc: lc for a single large bucket can run too long
Casey Bodley
02:03 PM Bug #57724 (In Progress): Keys returned by Admin API during user creation on secondary zone not v...
Casey Bodley
08:50 AM Bug #44660: Multipart re-uploads cause orphan data
As discussed in [1], there is already a WIP PR with a more generic solution [2].
[1] https://github.com/ceph/c...
Mykola Golub

11/02/2022

08:27 PM Feature #57965 (Resolved): Add new zone option to control whether an object's first data stripe i...
Delete requests are quite slow on clusters that have a data pool backed by HDDs, especially with an EC pool. For exam... Cory Snyder
06:58 PM Bug #57562: multisite replication issue on Quincy
Yeah, both those commits are gone, make sure you only have the newest one. Adam Emerson
06:33 PM Bug #57562: multisite replication issue on Quincy
Adam Emerson wrote:
> Pushed a new version with what should be a fix for multi-thread and multi-client races.
We ...
Oguzhan Ozmen
07:22 AM Bug #57562: multisite replication issue on Quincy
Pushed a new version with what should be a fix for multi-thread and multi-client races. Adam Emerson
08:07 AM Bug #57942: rgw leaks rados objects when a part is submitted multiple times in a multipart upload
FYI, pull request: https://github.com/ceph/ceph/pull/48704 Peter Goron

11/01/2022

08:12 PM Bug #57942: rgw leaks rados objects when a part is submitted multiple times in a multipart upload
FYI, working on https://github.com/pgoron/ceph/commits/fix_rgw_rados_leaks_57942 to fix both issues (index entry leaks ... Peter Goron
07:25 PM Bug #57562: multisite replication issue on Quincy
Agreed; as you mentioned, the other solution could be for the secondary to not be limited to just listening to the orphan part, but co... Krunal Chheda
06:37 PM Bug #57562: multisite replication issue on Quincy
Ah, I see, I need to update the async lister. Adam Emerson
06:36 PM Bug #57562: multisite replication issue on Quincy
That's the point of the commit `rgw/fifo: `part_full` is not a reliable indicator`. There is no 'orphan part' in that... Adam Emerson
05:39 PM Bug #57562: multisite replication issue on Quincy
Hey Adam,
Just a heads-up we tested with latest commit and we still see the issue.
The issue is seen when running M...
Krunal Chheda
02:11 PM Bug #57562: multisite replication issue on Quincy
Thank you Adam. We'll test with the latest change. Oguzhan Ozmen
04:27 PM Bug #44660 (Fix Under Review): Multipart re-uploads cause orphan data
Actually it looks like there is a simpler solution to this problem, which uses the meta object lock when checking if ... Mykola Golub

10/31/2022

06:56 PM Bug #57562: multisite replication issue on Quincy
Pushed a new version that should make listing list all the objects reliably. Adam Emerson
04:40 PM Bug #57951 (Pending Backport): rgw: lc: lc for a single large bucket can run too long
If this happens, other lc hosts/threads can attempt to process the same bucket, which inflates overhead without any c... Matt Benjamin
09:44 AM Feature #57947 (Pending Backport): Improve performance of multi-object delete by handling individ...
Multi-object deletes are currently quite slow. The handler for this method currently just loops through the list of o... Cory Snyder
08:10 AM Bug #57942: rgw leaks rados objects when a part is submitted multiple times in a multipart upload
After digging more on the issue, I think the root cause is linked to following code:
https://github.com/ceph/ceph/...
Peter Goron

10/29/2022

01:32 AM Bug #57853: multisite sync process block after long time running
I think something is wrong with rgw-coroutine; please check the above PR. zhipeng li
01:30 AM Bug #57853: multisite sync process block after long time running
PR https://github.com/ceph/ceph/pull/48626 zhipeng li

10/28/2022

09:10 PM Bug #57770: RGW (pacific) misplaces index entries after dynamically resharding bucket
J. Eric Ivancich wrote:
> Nick,
>
> I don't know that I have a cluster at my fingertips that might be necessary t...
Nick Janus
07:23 PM Bug #57770: RGW (pacific) misplaces index entries after dynamically resharding bucket
Nick,
I don't know that I have a cluster at my fingertips that might be necessary to test this potential fix. How ...
J. Eric Ivancich
07:21 PM Bug #57770 (Fix Under Review): RGW (pacific) misplaces index entries after dynamically resharding...
J. Eric Ivancich
12:36 PM Bug #57942 (Duplicate): rgw leaks rados objects when a part is submitted multiple times in a mult...
Hello,
Issue presented below affects all ceph versions at least since 14.2 (reproducer tested on 14.2, 15.2, 16.2,...
Peter Goron

10/27/2022

09:21 PM Bug #57562: multisite replication issue on Quincy
Pushed a newer, newer fix that guards all calls to _prepare_new_head behind check/set of preparing. Adam Emerson
04:15 PM Bug #57562: multisite replication issue on Quincy
Pushed a newer fix that does the check in need_new_head() Adam Emerson
02:01 PM Bug #57562: multisite replication issue on Quincy
Hi Adam,
We obtained the extra logging with the fix in place.
I think the contention is not within _prepare_ne...
Oguzhan Ozmen
01:09 AM Bug #57562: multisite replication issue on Quincy
I expect there are multiple problems with sync in Quincy, so I don't expect this to actually make sync work.
But i...
Adam Emerson
12:15 AM Bug #57562: multisite replication issue on Quincy
Pulled the changes in on top of the commit _9056dbcdeaa7f4350b54a69f669982358ec5448e_ (on main branch). Unfortunately... Oguzhan Ozmen
02:31 PM Bug #57928 (Duplicate): Octopus:multisite sync process block after long time running
Casey Bodley
02:31 PM Bug #57927 (Duplicate): pacific:multisite sync process block after long time running
Casey Bodley
07:36 AM Cleanup #57938 (Pending Backport): relying on boost flatmap emplace behavior is risky
see coverity issue: http://folio07.front.sepia.ceph.com/main/ceph-main-98d41855/cov-main-html/3/2253rgw_trim_bilog.cc... Yuval Lifshitz

10/26/2022

10:01 PM Bug #57936 (Pending Backport): 'radosgw-admin bucket chown' doesn't set bucket instance owner or ...
steps to reproduce:
1. start a vstart cluster and create a bucket as user 'testid'...
Casey Bodley
05:08 PM Bug #57562: multisite replication issue on Quincy
Awesome! Thanks for the quick turn around! Will pull and test. Jane Zhu
04:49 PM Bug #57562 (Fix Under Review): multisite replication issue on Quincy
I have a candidate fix at https://github.com/ceph/ceph/pull/48632 Adam Emerson
02:14 PM Bug #57562: multisite replication issue on Quincy
FYI: We pulled in the 2 PRs Casey posted in the tracker https://tracker.ceph.com/issues/57783, and tested again with ... Jane Zhu
12:31 PM Bug #57562: multisite replication issue on Quincy
FWIW, below provides some log snippets with enhanced events. To be specific, some existing log events are added addit... Oguzhan Ozmen
03:48 AM Bug #57853: multisite sync process block after long time running
Quincy, Pacific, Octopus, and Nautilus have the same issue. zhipeng li
03:27 AM Bug #57928 (Duplicate): Octopus:multisite sync process block after long time running
1. Deploy RADOSGW multisite
2. Put a lot of objects
3. Keep it running for a long time
zhipeng li
03:25 AM Bug #57927: pacific:multisite sync process block after long time running
same as https://tracker.ceph.com/issues/57853 zhipeng li
03:24 AM Bug #57927 (Duplicate): pacific:multisite sync process block after long time running
1. Deploy RADOSGW multisite
2. Put a lot of objects
3. Keep it running for a long time
zhipeng li

10/25/2022

10:20 PM Bug #57562 (In Progress): multisite replication issue on Quincy
Small reproducer turned out to not be, but fixing that. Adam Emerson
04:51 PM Bug #57562: multisite replication issue on Quincy
Thank you. Adam Emerson
04:34 PM Bug #57562: multisite replication issue on Quincy
Please see the following FIFO log snippets. And please let me know if you need more.
The creation of data_log.34.n...
Jane Zhu
03:53 PM Bug #57562: multisite replication issue on Quincy
Can we get a more complete log snippet? All the FIFO logging with the relevant TIDs would make tracing what's going o... Adam Emerson
03:12 PM Bug #57562: multisite replication issue on Quincy
thanks, that's very interesting Matt Benjamin
02:59 PM Bug #57562: multisite replication issue on Quincy
We pretty much narrowed down what the problem is: a race condition has been identified in FIFO::_prepare_new_head(..)... Jane Zhu
07:18 AM Bug #57919 (New): bucket can not be resharded after cancelling prior reshard process
Hi,
we run a multisite setup where only the metadata get synced, but not the actual data.
I wanted to reshard a b...
Boris B
05:52 AM Bug #56248: crash: rgw::ARN::ARN(rgw_bucket const&)
Fixed in https://tracker.ceph.com/issues/55765 and https://github.com/ceph/ceph/pull/47194/commits is waiting for rel... Tobias Urdin
05:47 AM Bug #56248: crash: rgw::ARN::ARN(rgw_bucket const&)
We had an RGW crash on this as well some hours ago.... Tobias Urdin

10/24/2022

03:52 PM Bug #19988 (Resolved): RGW: can't stack compression and encryption filters
Casey Bodley
11:37 AM Bug #44660: Multipart re-uploads cause orphan data
Looking at the code. In `MultipartObjectProcessor::process_first_chunk`, if writing the multipart object first chunk ... Mykola Golub

10/23/2022

07:05 PM Bug #57899 (Fix Under Review): admin: cannot use tenant with notification topic
Yuval Lifshitz

10/21/2022

10:39 AM Bug #57911 (Pending Backport): Segmentation fault when uploading file with bucket policy on Quincy
RGW crashes when a file is uploaded and a bucket policy has been set up.
The crash has been "reproduced for latest...
Jan Graichen

10/20/2022

02:24 PM Bug #57770 (Triaged): RGW (pacific) misplaces index entries after dynamically resharding bucket
Casey Bodley
02:24 PM Bug #57770 (New): RGW (pacific) misplaces index entries after dynamically resharding bucket
Casey Bodley
02:21 PM Bug #57783: multisite: data sync reports shards behind after source zone fully trims datalog
related work in https://github.com/ceph/ceph/pull/47682 and https://github.com/ceph/ceph/pull/48397
Casey Bodley
02:20 PM Bug #57804: Enabling sync on bucket not working
I can only recommend running the command until it succeeds. Casey Bodley
02:18 PM Bug #57853 (Need More Info): multisite sync process block after long time running
Casey Bodley
02:16 PM Bug #57901 (Fix Under Review): s3:ListBuckets response limited to 1000 buckets (by default) since...
Casey Bodley
02:11 PM Bug #57231 (Resolved): Valgrind: jump on uninitialized in s3select
Casey Bodley
01:30 PM Bug #57905 (Pending Backport): multisite: terminate called after throwing an instance of 'ceph::b...
example from rgw/multisite suite: http://qa-proxy.ceph.com/teuthology/cbodley-2022-10-19_23:28:37-rgw-wip-cbodley-tes... Casey Bodley
05:57 AM Bug #57562: multisite replication issue on Quincy
We have an example scenario here where one of the objects in a bucket failed to be synced to the secondary.
* Mdlog...
Jane Zhu

10/19/2022

09:28 PM Bug #57901 (Resolved): s3:ListBuckets response limited to 1000 buckets (by default) since Octopus
Since Octopus, s3:ListBuckets is limited to rgw_list_buckets_max_chunk buckets in its response due to loss of truncat... Joshua Baergen
03:05 PM Bug #16767 (In Progress): RadosGW Multipart Cleanup Failure
Matt Benjamin
02:55 PM Bug #16767: RadosGW Multipart Cleanup Failure
Vicki Good wrote:
> I've encountered this bug in Ceph 14 and 15 and it's a pretty big problem for us for the same re...
Rok Jaklic
01:20 PM rgw-testing Bug #54104: test_rgw_datacache.py: s3cmd fails with '403 (SignatureDoesNotMatch)' in ubuntu
ping @Mark, this remains a blocker for enabling ubuntu in the rgw/verify suite. that subsuite contains most of our fu... Casey Bodley
01:11 PM Bug #57899 (Pending Backport): admin: cannot use tenant with notification topic
issue was a regression introduced in: 200f71a90c9e77c91452cec128c2c8be0d3d6f1f
topic notification commands should be...
Yuval Lifshitz

10/18/2022

04:13 PM Backport #57889 (Rejected): pacific: amqp: rgw crash when ca location is used for amqp connections
Backport Bot
04:12 PM Backport #57888 (In Progress): quincy: amqp: rgw crash when ca location is used for amqp connections
https://github.com/ceph/ceph/pull/54170 Backport Bot
04:08 PM Bug #57850 (Pending Backport): amqp: rgw crash when ca location is used for amqp connections
Yuval Lifshitz
03:39 PM Bug #57881 (Fix Under Review): LDAP invalid password resource leak fix
Casey Bodley
09:56 AM Bug #57881: LDAP invalid password resource leak fix
I created a pull request for a possible fix:
https://github.com/ceph/ceph/pull/48509
Johannes Liebl
01:02 PM Bug #57877 (Fix Under Review): rgw: some operations may not have a valid bucket object
Casey Bodley

10/17/2022

12:30 PM Bug #57881 (Pending Backport): LDAP invalid password resource leak fix
I have noticed that in the case a User tries to log in using LDAP with a wrong password, two new LDAP sessions will b... Johannes Liebl
09:19 AM Bug #57877 (Resolved): rgw: some operations may not have a valid bucket object
Some codepaths may not always have a valid bucket, so add checks to detect this. Abhishek Lekshmanan
 
