Activity
From 10/10/2021 to 11/08/2021
11/08/2021
- 11:29 PM Bug #53002 (Fix Under Review): crash BlueStore::Onode::put from BlueStore::TransContext::~TransCo...
- 11:29 PM Bug #53002 (Pending Backport): crash BlueStore::Onode::put from BlueStore::TransContext::~TransCo...
- 11:26 PM Bug #49815 (Resolved): BlueRocksEnv::GetChildren may pass trailing slashes to BlueFS readdir
- 06:00 PM Backport #53196 (Resolved): octopus: fsck/repair uses invalid prefix when removing undecodable Sh...
- https://github.com/ceph/ceph/pull/43883
- 06:00 PM Backport #53195 (Resolved): pacific: fsck/repair uses invalid prefix when removing undecodable Sh...
- 05:57 PM Bug #53011 (Pending Backport): fsck/repair uses invalid prefix when removing undecodable Shared Blob
- 05:50 PM Bug #53011: fsck/repair uses invalid prefix when removing undecodable Shared Blob
- https://github.com/ceph/ceph/pull/43621 merged
- 11:18 AM Bug #53184: failed to start new osd due to SIGSEGV in BlueStore::read()
- Looks like the list of the OSD's collections is empty. I don't know the root cause, but I'm getting pretty much the same effects on...
- 07:33 AM Bug #53184 (Closed): failed to start new osd due to SIGSEGV in BlueStore::read()
- A new OSD failed to start due to SIGSEGV. Here is the backtrace.
```
debug -3> 2021-11-08T07:06:17.324+0000 7...
```
- 11:02 AM Bug #53139: OSD might wrongly attempt to use "slow" device when single device is backing the store
- A. Saber Shenouda wrote:
> In exactly which case can this bug be reproduced?
I failed to reproduce that locally h...
- 09:47 AM Bug #53185 (Resolved): FSCK removes allocation file when called in DEEP mode causing next mount t...
- FSCK removes allocation file when called in DEEP mode causing next mount to do unnecessary full recovery.
11/05/2021
- 12:59 PM Bug #53139: OSD might wrongly attempt to use "slow" device when single device is backing the store
- In exactly which case can this bug be reproduced?
- 10:03 AM Bug #53139: OSD might wrongly attempt to use "slow" device when single device is backing the store
- Adam Kupczyk wrote:
> It looks to me more like a problem with allocator code itself.
> Checking douts from _allocat...
- 09:44 AM Bug #53139 (Fix Under Review): OSD might wrongly attempt to use "slow" device when single device ...
11/04/2021
- 09:32 PM Bug #53139 (In Progress): OSD might wrongly attempt to use "slow" device when single device is ba...
- 08:07 PM Bug #53062 (Resolved): OMAP upgrade to PER-PG format result in ill-formatted OMAP keys.
- 08:07 PM Backport #53124 (Resolved): pacific: OMAP upgrade to PER-PG format result in ill-formatted OMAP k...
- 06:37 PM Backport #53124: pacific: OMAP upgrade to PER-PG format result in ill-formatted OMAP keys.
- Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/43793
merged
- 12:15 PM Bug #50788 (Duplicate): crash in BlueStore::Onode::put()
11/03/2021
- 03:22 PM Bug #53139: OSD might wrongly attempt to use "slow" device when single device is backing the store
- It looks to me more like a problem with allocator code itself.
Checking douts from _allocate() (or rather lack of th...
- 01:49 PM Bug #53139 (Resolved): OSD might wrongly attempt to use "slow" device when single device is backi...
- This looks like a regression introduced by https://github.com/ceph/ceph/pull/42992
Providing RocksDB with an additio...
- 02:47 PM Backport #53124 (In Progress): pacific: OMAP upgrade to PER-PG format result in ill-formatted OMA...
- https://github.com/ceph/ceph/pull/43793
- 01:52 PM Backport #53124: pacific: OMAP upgrade to PER-PG format result in ill-formatted OMAP keys.
- Neha Ojha wrote:
> Igor, can you please take care of the backport?
yeah, in progress atm
- 01:50 PM Backport #53124: pacific: OMAP upgrade to PER-PG format result in ill-formatted OMAP keys.
- Igor, can you please take care of the backport?
11/02/2021
- 09:50 PM Backport #51648: nautilus: Bluestore repair might erroneously remove SharedBlob entries.
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/43365
m...
- 03:54 PM Bug #53129 (Resolved): BlueFS truncate() and poweroff can create corrupted files
- It is possible to create a condition in which BlueFS contains a corrupted file.
It can happen when BlueFS repl...
- 12:12 PM Bug #53002 (In Progress): crash BlueStore::Onode::put from BlueStore::TransContext::~TransContext
11/01/2021
- 07:15 PM Backport #53124 (Resolved): pacific: OMAP upgrade to PER-PG format result in ill-formatted OMAP k...
- https://github.com/ceph/ceph/pull/43793
- 07:13 PM Bug #53062 (Pending Backport): OMAP upgrade to PER-PG format result in ill-formatted OMAP keys.
- 01:54 PM Backport #52934 (In Progress): pacific: os/bluestore: _do_write_small fix head_pad
- https://github.com/ceph/ceph/pull/43756
- 01:53 PM Backport #52935 (In Progress): octopus: os/bluestore: _do_write_small fix head_pad
- https://github.com/ceph/ceph/pull/43757
- 01:45 PM Backport #48477 (In Progress): octopus: osd: fix bluestore avl allocator
- 01:44 PM Backport #53104 (In Progress): octopus: os/bluestore: Improve _block_picker function
- 01:44 PM Backport #53102 (In Progress): octopus: os/bluestore/AvlAllocator: specialize _block_picker() an...
10/29/2021
- 11:58 PM Backport #53100: octopus: os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_* options
- https://github.com/ceph/ceph/pull/43747
- 10:51 PM Backport #53100 (In Progress): octopus: os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_...
- 10:36 PM Backport #53100: octopus: os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_* options
- please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/43747
ceph-backport.sh versi...
- 08:25 PM Backport #53100 (Resolved): octopus: os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_...
- https://github.com/ceph/ceph/pull/43747
- 11:57 PM Backport #53102: octopus: os/bluestore/AvlAllocator: specialize _block_picker() and cleanups
- https://github.com/ceph/ceph/pull/43747
- 08:25 PM Backport #53102 (Resolved): octopus: os/bluestore/AvlAllocator: specialize _block_picker() and c...
- https://github.com/ceph/ceph/pull/43747
- 11:57 PM Backport #53104: octopus: os/bluestore: Improve _block_picker function
- https://github.com/ceph/ceph/pull/43747
- 08:25 PM Backport #53104 (Resolved): octopus: os/bluestore: Improve _block_picker function
- https://github.com/ceph/ceph/pull/43747
- 11:57 PM Backport #48477: octopus: osd: fix bluestore avl allocator
- https://github.com/ceph/ceph/pull/43747
- 10:17 PM Backport #48477 (New): octopus: osd: fix bluestore avl allocator
- 03:01 PM Backport #48477: octopus: osd: fix bluestore avl allocator
- Please set:
- Status to New
Per Igor's feedback in https://tracker.ceph.com/issues/52804#note-16
- 11:19 PM Fix #48272 (Pending Backport): osd: fix bluestore avl allocator
- 03:01 PM Fix #48272: osd: fix bluestore avl allocator
- Please set:
- Backport to Octopus
- Status to Pending Backport
Per Igor's feedback in https://tracker.ceph.com/i...
- 11:16 PM Backport #53105 (In Progress): pacific: os/bluestore: Improve _block_picker function
- 09:21 PM Backport #53105: pacific: os/bluestore: Improve _block_picker function
- https://github.com/ceph/ceph/pull/43745
- 08:25 PM Backport #53105 (Resolved): pacific: os/bluestore: Improve _block_picker function
- https://github.com/ceph/ceph/pull/43745
- 11:14 PM Backport #53103 (In Progress): pacific: os/bluestore/AvlAllocator: specialize _block_picker() an...
- 09:21 PM Backport #53103: pacific: os/bluestore/AvlAllocator: specialize _block_picker() and cleanups
- https://github.com/ceph/ceph/pull/43745
- 08:25 PM Backport #53103 (Resolved): pacific: os/bluestore/AvlAllocator: specialize _block_picker() and c...
- https://github.com/ceph/ceph/pull/43745
- 10:38 PM Bug #52804: pacific: Hybrid Allocator exhibits high tail latency for writes in Octopus
- Octopus backport PR: https://github.com/ceph/ceph/pull/43747
- 09:59 PM Bug #52804: pacific: Hybrid Allocator exhibits high tail latency for writes in Octopus
- Attaching chart with tail latency improvements (Octopus)
- 09:58 PM Bug #52804: pacific: Hybrid Allocator exhibits high tail latency for writes in Octopus
- Backported PRs:
https://github.com/ceph/ceph/pull/38148 (Octopus-only)
https://github.com/ceph/ceph/pull/41398 (O...
- 09:18 PM Bug #52804: pacific: Hybrid Allocator exhibits high tail latency for writes in Octopus
- Pacific backport PR: https://github.com/ceph/ceph/pull/43745
Attaching chart with tail latency improvements.
- 01:00 PM Bug #52804: pacific: Hybrid Allocator exhibits high tail latency for writes in Octopus
- Igor Fedotov wrote:
> If possible it would be great to have Pacific backports ASAP as new minor release is coming in...
- 12:57 PM Bug #52804: pacific: Hybrid Allocator exhibits high tail latency for writes in Octopus
- Mauricio Oliveira wrote:
> @Igor
>
> Right, the key point is to backport the patches to Pacific/Octopus. I'm work...
- 09:12 PM Backport #53101 (In Progress): pacific: os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_...
- 09:09 PM Backport #53101: pacific: os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_* options
- please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/43745
ceph-backport.sh versi...
- 08:25 PM Backport #53101 (Resolved): pacific: os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_...
- https://github.com/ceph/ceph/pull/43745
- 08:21 PM Bug #53085 (Pending Backport): os/bluestore: Improve _block_picker function
- 02:51 PM Bug #53085: os/bluestore: Improve _block_picker function
- Master PR is merged.
Please set Status to Pending Backport.
- 02:50 PM Bug #53085 (Resolved): os/bluestore: Improve _block_picker function
- master tracker issue for https://github.com/ceph/ceph/pull/41398
- 08:21 PM Bug #53086 (Pending Backport): os/bluestore/AvlAllocator: specialize _block_picker() and cleanups
- 02:54 PM Bug #53086: os/bluestore/AvlAllocator: specialize _block_picker() and cleanups
- Master PR is merged.
Please set Status to Pending Backport.
- 02:54 PM Bug #53086 (Resolved): os/bluestore/AvlAllocator: specialize _block_picker() and cleanups
- Master tracker issue for PR https://github.com/ceph/ceph/pull/41825
- 08:20 PM Bug #53087 (Pending Backport): os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_* ...
- 02:58 PM Bug #53087: os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_* options
- Master PR is merged.
Please set Status to Pending Backport.
- 02:58 PM Bug #53087 (Resolved): os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_* options
- Master tracker issue for PR https://github.com/ceph/ceph/pull/41615
- 12:52 PM Backport #51763 (In Progress): pacific: Missed shared block repair doesn't fix related issues
- https://github.com/ceph/ceph/pull/43731
- 12:51 PM Backport #52768 (In Progress): pacific: bluestore repair might cause invalid write
- https://github.com/ceph/ceph/pull/43731
10/28/2021
10/27/2021
- 08:10 PM Bug #52804: pacific: Hybrid Allocator exhibits high tail latency for writes in Octopus
- @Igor
Right, the key point is to backport the patches to Pacific/Octopus. I'm working on it, if that is OK w/ you....
- 12:12 PM Bug #53064: pgmeta onode isn't tagged with FLAG_PGMETA_OMAP if created in pre-mimic are
- Sorry, that's https://tracker.ceph.com/issues/53062 which is revealed by the issue, not https://tracker.ceph.com/issu...
- 11:58 AM Bug #53064 (New): pgmeta onode isn't tagged with FLAG_PGMETA_OMAP if created in pre-mimic are
- Legacy pgmeta onodes might have the flag unset after upgrading from Luminous (and before) to recent Ceph releases. Wh...
- 11:42 AM Bug #53062 (Fix Under Review): OMAP upgrade to PER-PG format result in ill-formatted OMAP keys.
- 11:09 AM Bug #53062 (Resolved): OMAP upgrade to PER-PG format result in ill-formatted OMAP keys.
- Looks like the code appends the full legacy key to the new prefix rather than use the user-provided OMAP name from th...
10/26/2021
10/25/2021
- 03:53 PM Bug #52399: src/os/bluestore/HybridAllocator.cc: FAILED ceph_assert(false)
- We need both of these:
https://github.com/ceph/ceph/pull/43645
https://github.com/ceph/ceph/pull/43583
10/22/2021
10/21/2021
- 11:58 PM Backport #51648 (Resolved): nautilus: Bluestore repair might erroneously remove SharedBlob entries.
- 10:36 PM Backport #51648: nautilus: Bluestore repair might erroneously remove SharedBlob entries.
- Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/43365
merged
- 07:29 PM Bug #53011 (Fix Under Review): fsck/repair uses invalid prefix when removing undecodable Shared Blob
- 03:38 PM Bug #53011 (Resolved): fsck/repair uses invalid prefix when removing undecodable Shared Blob
- 12:16 PM Bug #53002: crash BlueStore::Onode::put from BlueStore::TransContext::~TransContext
- In frame 7 I can print the Onode. Some of the vals look quite strange (but I don't know if that's normal):...
- 09:45 AM Bug #53002: crash BlueStore::Onode::put from BlueStore::TransContext::~TransContext
- More context: the cluster was upgraded from 14.2.20 to 15.2.14 two weeks ago. We've never seen this before today; it ...
- 09:43 AM Bug #53002: crash BlueStore::Onode::put from BlueStore::TransContext::~TransContext
- Igor Fedotov wrote:
> Dan van der Ster wrote:
> > We've just seen this crash in the wild running 15.2.14. Maybe a d...
- 09:39 AM Bug #53002: crash BlueStore::Onode::put from BlueStore::TransContext::~TransContext
- Dan van der Ster wrote:
> We've just seen this crash in the wild running 15.2.14. Maybe a dup of #50788?
I'm pret...
- 08:43 AM Bug #53002 (Duplicate): crash BlueStore::Onode::put from BlueStore::TransContext::~TransContext
- We've just seen this crash in the wild running 15.2.14. Maybe a dup of #50788?...
10/20/2021
- 02:54 PM Bug #52399 (Fix Under Review): src/os/bluestore/HybridAllocator.cc: FAILED ceph_assert(false)
- 02:51 PM Bug #52399: src/os/bluestore/HybridAllocator.cc: FAILED ceph_assert(false)
- PR is ready
10/19/2021
- 08:54 PM Bug #22066: bluestore osd asserts repeatedly with ceph-12.2.1/src/include/buffer.h: 882: FAILED a...
- Eric Nelson wrote:
> No problem, for the time being some of these have been converted to bluestore osds with wal/db ...
- 01:39 PM Bug #52939: lockdep cycle under BlueFS::_compact_log_async_LD_NF_D()
- steps to reproduce:...
- 01:34 PM Bug #52939: lockdep cycle under BlueFS::_compact_log_async_LD_NF_D()
- I tested with the fix from https://github.com/ceph/ceph/pull/43589 (commit 4b23ecfa2967d1df37562c67df028b5ce1700afb),...
- 12:48 PM Bug #52939 (Fix Under Review): lockdep cycle under BlueFS::_compact_log_async_LD_NF_D()
10/18/2021
- 11:11 AM Bug #52804 (Triaged): pacific: Hybrid Allocator exhibits high tail latency for writes in Octopus
- 10:50 AM Bug #52804: pacific: Hybrid Allocator exhibits high tail latency for writes in Octopus
- Additionally, it would be interesting to learn which allocations produce that "latency tail". Would you add some print...
- 10:46 AM Bug #52804 (New): pacific: Hybrid Allocator exhibits high tail latency for writes in Octopus
- @Mauricio - great findings, thanks a lot!
So the major point is that we need to backport all the mentioned patches ...
10/15/2021
- 10:38 PM Bug #52804: pacific: Hybrid Allocator exhibits high tail latency for writes in Octopus
- Part 4)
Fake unit test:
@ ceph.git/src/test/objectstore/Allocator_test.cc...
- 10:34 PM Bug #52804: pacific: Hybrid Allocator exhibits high tail latency for writes in Octopus
- Part 3)
With these commits for the AVL allocator backported to v15.2.12 (at least one is already in the latest 15....
- 10:33 PM Bug #52804: pacific: Hybrid Allocator exhibits high tail latency for writes in Octopus
- Part 2)
This does _not_ seem to be a regression from this commit introduced in v15.2.9:
c25def8 octopus: os/blues...
- 10:33 PM Bug #52804: pacific: Hybrid Allocator exhibits high tail latency for writes in Octopus
- Part 1)
The numbers for the Bitmap and AVL allocators show a long tail on AVL only:
- average of 3 runs
- bitmap...
- 10:32 PM Bug #52804: pacific: Hybrid Allocator exhibits high tail latency for writes in Octopus
- Good progress this week!
We could reproduce the issue with the allocator's state dump and the allocation requests ...
10/14/2021
- 08:07 PM Bug #52939: lockdep cycle under BlueFS::_compact_log_async_LD_NF_D()
- @Adam, would you take a look, please?
- 07:09 PM Bug #52939: lockdep cycle under BlueFS::_compact_log_async_LD_NF_D()
- update: the same vstart test workload is running stably with the commits from that pr reverted...
- 06:29 PM Bug #52939: lockdep cycle under BlueFS::_compact_log_async_LD_NF_D()
- Igor Fedotov wrote:
> @Casey, as far as I understand this one is included in your build: https://github.com/ceph/cep...
- 06:21 PM Bug #52939: lockdep cycle under BlueFS::_compact_log_async_LD_NF_D()
- @Casey, as far as I understand this one is included in your build: https://github.com/ceph/ceph/pull/42099
Could y...
- 06:03 PM Bug #52939 (Rejected): lockdep cycle under BlueFS::_compact_log_async_LD_NF_D()
- reproduces reliably in vstart just after starting s3-tests against rgw on master, at commit c19250ce3cb30a5a0409e3ebd...
- 02:45 PM Backport #52935 (Resolved): octopus: os/bluestore: _do_write_small fix head_pad
- https://github.com/ceph/ceph/pull/43757
- 02:45 PM Backport #52934 (Resolved): pacific: os/bluestore: _do_write_small fix head_pad
- https://github.com/ceph/ceph/pull/43756
- 02:42 PM Fix #52922 (Pending Backport): os/bluestore: _do_write_small fix head_pad
- 03:31 AM Fix #52922 (Resolved): os/bluestore: _do_write_small fix head_pad
- https://github.com/ceph/ceph/pull/43498
check pad head range should be [offset - head_pad, head_pad].
10/12/2021
- 06:30 PM Bug #52234: crash in Throttle::get()
- Maybe this modification is relevant, as it appeared in Pacific and touches throttling mechanics... https://github.com...
- 06:28 PM Bug #52234: crash in Throttle::get()
- So far, kernels 4.18 and 5.11 are mentioned among the affected ones
- 06:16 PM Bug #52234: crash in Throttle::get()
- Relevant thread at ceph-users: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/GNBVS2AWHCOVMOTXPHZNIG...
- 05:40 PM Bug #52816 (Duplicate): Block.db has been migrated with ceph-volume lvm migrate and osd never sta...