Activity
From 01/03/2023 to 02/01/2023
02/01/2023
- 04:35 PM Bug #58530: Pacific: Significant write amplification as compared to Nautilus
- Hi Igor,
Thanks for continuing to dig into this! Some answers to your questions:
> The first question would be ...
- 11:21 AM Bug #58530: Pacific: Significant write amplification as compared to Nautilus
- And one more note:
The latest Pacific release (16.2.11) might show much better behavior in terms of log compaction a...
- 11:12 AM Bug #58530: Pacific: Significant write amplification as compared to Nautilus
- Joshua,
thanks for the log. Something interesting, indeed.
The first question would be - do you have any custom b...
- 10:21 AM Feature #58421 (Fix Under Review): OSD metadata should show the min_alloc_size that each OSD was ...
- 12:46 AM Bug #58022: Fragmentation score rising by seemingly stuck thread
- We got some monitoring on a 3rd cluster. We're seeing it there too, though slower than the other cluster.
I was se...
01/31/2023
- 04:16 PM Bug #58530: Pacific: Significant write amplification as compared to Nautilus
- Hi Igor,
> Now I'm wondering whether that high compaction rate persists permanently?
Ah, sorry, I should have m...
- 03:34 PM Bug #58530: Pacific: Significant write amplification as compared to Nautilus
- Hi Joshua,
good catch.
Now I'm wondering whether that high compaction rate persists permanently?
And if so - cou...
01/30/2023
- 01:21 PM Bug #54019: OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
- Christian Rohmann wrote:
> Thanks for fixing this ... but somehow the link to the ML (https://lists.ceph.io/hyperkit...
01/29/2023
- 09:42 PM Bug #54019: OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
- Thanks for fixing this ... but somehow the link to the ML (https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/C...
01/27/2023
- 05:56 PM Bug #58022: Fragmentation score rising by seemingly stuck thread
- Just looking at the lengths, there seem to be lots of pretty small ones:
[kfox@zathras tmp]$ jq '.extents[].length' osd.3 | s...
- 05:52 PM Bug #58022: Fragmentation score rising by seemingly stuck thread
- Please find dumps attached.
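For reference, a sketch of where such a dump typically comes from and how the truncated jq pipeline above might continue. The osd id, and the assumption that extent lengths are plain integers (on some releases they are hex strings and would need converting first), are mine, not from the ticket:
# dump the free extents tracked by the BlueStore allocator
ceph daemon osd.3 bluestore allocator dump block > osd.3
# sort the free-extent lengths and look at the small end
jq '.extents[].length' osd.3 | sort -n | head
# count free extents smaller than 64 KiB
jq '[.extents[].length | select(. < 65536)] | length' osd.3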
- 10:21 AM Bug #58022: Fragmentation score rising by seemingly stuck thread
- Igor Fedotov wrote:
> One potential explanation can be pretty trivial: allocator keeps tracking a sort of history(hi...
- 02:51 PM Bug #58530: Pacific: Significant write amplification as compared to Nautilus
- Hey Igor, based on the discussion at the perf meeting yesterday we've added some exports for bluefs log stats. Here's...
01/26/2023
- 05:43 PM Bug #58022: Fragmentation score rising by seemingly stuck thread
- The patch implies that the calculation may be wrong? But why would behavior change on restart?
Thanks,
Kevin
- 04:18 PM Bug #58022: Fragmentation score rising by seemingly stuck thread
- https://github.com/ceph/ceph/pull/49885
- 12:09 PM Bug #57507: rocksdb crushed due to checksum mismatch
- Deepika Upadhyay wrote:
> Hey Igor, did this bug get resolved in 16.2.11? Can you share the tracker which might be a...
- 07:30 AM Bug #57507: rocksdb crushed due to checksum mismatch
- Hey Igor, did this bug get resolved in 16.2.11? Can you share the tracker which might be addressing that? Thanks!
- 11:40 AM Backport #58588 (In Progress): quincy: OSD is unable to allocate free space for BlueFS
- https://github.com/ceph/ceph/pull/49884
- 01:01 AM Backport #58588 (Resolved): quincy: OSD is unable to allocate free space for BlueFS
- 01:02 AM Backport #58589 (Rejected): pacific: OSD is unable to allocate free space for BlueFS
- 12:35 AM Bug #53466 (Pending Backport): OSD is unable to allocate free space for BlueFS
01/25/2023
- 11:23 PM Bug #53466: OSD is unable to allocate free space for BlueFS
- https://github.com/ceph/ceph/pull/48854 merged
- 06:27 PM Feature #57785: fragmentation score in metrics
- Ok. Thanks.
- 02:00 PM Feature #57785: fragmentation score in metrics
- Hi Kevin,
We will implement the aligned fragmentation score after we merge https://github.com/ceph/ceph/pull/48854.
- 05:59 PM Bug #58022: Fragmentation score rising by seemingly stuck thread
- For question 1, here are a couple of screenshots with consumed space and fragmentation added to both. Utilization is pr...
- 05:24 PM Bug #58022: Fragmentation score rising by seemingly stuck thread
- Hi Kevin,
I have two questions:
1) Is rising fragmentation score related to change of free space?
If no other...
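A minimal sketch of sampling both numbers together to answer question 1; the osd id is an example and the admin socket command is an assumption about the deployment, not something taken from this ticket:
# fragmentation score for the block allocator (0.0 = none, 1.0 = fully fragmented)
ceph daemon osd.4 bluestore allocator score block
# used/free space per OSD at the same moment
ceph osd df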
01/24/2023
- 11:58 PM Feature #57785: fragmentation score in metrics
- Any updates on this?
Thanks,
Kevin
- 05:43 PM Bug #58022: Fragmentation score rising by seemingly stuck thread
- After restart:
[root@rcs3 ~]# journalctl -u ceph-b15015c8-af07-4973-b35d-28c3bfd2af22@osd.4.service | grep probe
Ja...
- 05:23 PM Bug #58256 (Resolved): ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2: Expected: (logger->ge...
- 05:21 PM Bug #58256: ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2: Expected: (logger->get(l_bluefs_...
- https://github.com/ceph/ceph/pull/49392 merged
01/23/2023
- 05:00 PM Bug #58022: Fragmentation score rising by seemingly stuck thread
- Here's osd4, which was still running away this morning. I just restarted it. Here are the metrics from right before. Will get ...
- 04:39 PM Bug #58530: Pacific: Significant write amplification as compared to Nautilus
- Hey Igor, it just so happens that we've collected some perf dumps from a cluster that we upgraded this weekend. We ha...
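For anyone wanting to reproduce this kind of export, one way (not necessarily what was used here) to pull the BlueFS log statistics out of a perf dump; the counter names come from the bluefs section:
# total bytes written to the bluefs log, and number of log compactions so far
ceph daemon osd.0 perf dump bluefs | jq '{log_bytes: .bluefs.log_bytes, log_compactions: .bluefs.log_compactions}'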
- 02:43 PM Bug #58530: Pacific: Significant write amplification as compared to Nautilus
- Joshua, please share perf counter dumps for a couple of OSDs of each MAS.
01/20/2023
- 10:05 PM Bug #58530 (Triaged): Pacific: Significant write amplification as compared to Nautilus
- After upgrading multiple RBD clusters from 14.2.18 to 16.2.9, we've found that OSDs write significantly more to the u...
- 06:40 PM Feature #58113 (Fix Under Review): BLK/Kernel: Improve protection against running one OSD twice
- 05:38 PM Bug #58022: Fragmentation score rising by seemingly stuck thread
- Here is a runaway one I restarted 2 days ago.
ELAPSED CMD
2-00:09:13 /usr/bin/ceph-osd -n osd.3 -f --setuser...
- 05:06 PM Bug #58022: Fragmentation score rising by seemingly stuck thread
- I can get some more, but here's an initial bit.
osd.4 has been running away for a long time (at least 2 weeks. bas...
- 02:10 PM Bug #58022: Fragmentation score rising by seemingly stuck thread
- At the first step I'd like to see allocation stats probes from OSD logs. Here is an example:
2023-01-20T16:28:41.4...
- 01:59 PM Bug #58022: Fragmentation score rising by seemingly stuck thread
- Vikhyat Umrao wrote:
> Igor/Adam - "But the behavior stops immediately on restart. So feels like some thread in the ...
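The probe lines requested in the 02:10 PM comment above can be pulled from a cephadm-style OSD journal the same way as in the 01/24 comment further up; the unit name and time window here are placeholders:
journalctl -u ceph-<fsid>@osd.3.service --since "2 weeks ago" | grep probe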
01/19/2023
- 10:52 PM Bug #58022: Fragmentation score rising by seemingly stuck thread
- If I strace a runaway osd, it shows up with 59 threads. If I do it to one that is not runaway, it shows up with 59 ...
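A lighter-weight way to get the same thread count without attaching strace; <pid> is the ceph-osd process id:
# number of threads in the process
ps -o nlwp= -p <pid>
# equivalently, count its task directories
ls /proc/<pid>/task | wc -l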
01/18/2023
- 05:16 PM Bug #58022: Fragmentation score rising by seemingly stuck thread
- Here's a picture.
- 05:13 PM Bug #58022: Fragmentation score rising by seemingly stuck thread
- We ended up slowly reformatting all of our osds and re-adding them. Things settled out to a fragmentation score of < ...
01/17/2023
- 12:40 AM Bug #58463 (Fix Under Review): RocksDBTransactionImpl::rm_range_keys doesn't use bound iterator
01/15/2023
- 11:24 PM Bug #58463 (Pending Backport): RocksDBTransactionImpl::rm_range_keys doesn't use bound iterator
- Hence this might cause slow omap enumeration when rocksdb has accumulated tons of tombstones.
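As a general workaround (not one prescribed in this ticket), tombstone buildup of this kind can usually be flushed by a manual compaction:
# trigger a RocksDB compaction on one OSD via the admin socket
ceph daemon osd.0 compact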
01/13/2023
- 04:35 PM Feature #58421: OSD metadata should show the min_alloc_size that each OSD was built with
- Ideally this will be available both via `ceph osd metadata` and the admin socket so as to dovetail into common metric...
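For illustration, the closest thing queryable today, next to where the proposed field would presumably appear; note these config options reflect the configured value, not the build-time value this feature asks for:
# cluster-wide metadata path (the new field would show up in this output)
ceph osd metadata 0
# configured (not on-disk) min_alloc_size values
ceph daemon osd.0 config get bluestore_min_alloc_size_hdd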
- 12:14 PM Bug #53002 (Fix Under Review): crash BlueStore::Onode::put from BlueStore::TransContext::~TransCo...
- 12:14 PM Bug #58439 (Duplicate): octopus osd crash
- 09:57 AM Bug #58439 (Duplicate): octopus osd crash
- Hi,
I was not able to find another bug which looks exactly like this (I found https://tracker.ceph.com/issues/2497...
- 11:59 AM Bug #58441 (New): ceph-bluestore-tool fsck crash with "FAILED ceph_assert(v.length() == p->shard_...
- After OSD crashed with "FAILED ceph_assert(v.length() == p->shard_info->bytes)" (crash report here https://gist.githu...
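For reproduction purposes, an offline fsck of this kind is typically invoked as below (the path is an example; the OSD must be stopped first):
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-0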
- 10:46 AM Bug #58440 (Resolved): BlueFS spillover alert is broken
- Apparently this has been removed by https://github.com/ceph/ceph/commit/d17cd6604b4031ca997deddc5440248aff451269#diff...
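With the alert gone, spillover can still be spotted manually from the BlueFS usage counters; a nonzero slow_used_bytes means DB data has spilled onto the slow device:
ceph daemon osd.0 perf dump bluefs | jq '.bluefs | {db_used_bytes, slow_used_bytes}'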
01/11/2023
- 10:53 PM Feature #58421 (Pending Backport): OSD metadata should show the min_alloc_size that each OSD was ...
To be very clear, the value the OSD was built with, *not* the prevailing value in `ceph.conf` or the central db.
...
- 06:42 AM Bug #53184: failed to start new osd due to SIGSEGV in BlueStore::read()
- Hi Igor
I'm working with Satoru and Yuma, and I was trying to reproduce the problem with Ceph v17.2.5 and Rook v1....
- 02:01 AM Bug #58418 (New): unittest mempool always fail on Arm64 CI node
- 57: /root/ceph/src/test/test_mempool.cc:433: Failure
57: Expected: (missed) < (mempool::num_shards / 2), actual: 28 ...
01/10/2023
- 11:40 AM Bug #56382: ONode ref counting is broken
- Joshua Baergen wrote:
> All of the tickets related to each other for this problem are marked Duplicate. Which should... - 11:38 AM Bug #56382 (Fix Under Review): ONode ref counting is broken
01/09/2023
- 02:40 PM Bug #56382: ONode ref counting is broken
- All of the tickets related to each other for this problem are marked Duplicate. Which should be the main tracker for ...
01/03/2023
- 08:01 AM Bug #58274: BlueStore::collection_list becomes extremely slow due to unbounded rocksdb iteration
- yixing hao wrote:
> We also observed this on our HDD bluestore cluster with tens of billions of objects; the stack is like...
- 07:51 AM Bug #58274: BlueStore::collection_list becomes extremely slow due to unbounded rocksdb iteration
- We also observed this on our HDD bluestore cluster with tens of billions of objects; the stack is like the above.
7ffad9...