Activity

From 10/19/2022 to 11/17/2022

11/17/2022

10:37 PM Feature #57785: fragmentation score in metrics
We have a meeting scheduled for next week to discuss this topic. Laura Flores
06:30 PM Feature #57785: fragmentation score in metrics
❤️ Kevin Fox
06:28 PM Feature #57785: fragmentation score in metrics
Thanks, Kevin. Let me talk this over with Adam and Paul, and we will decide a course of action. Laura Flores
06:15 PM Feature #57785: fragmentation score in metrics
A ceph warning for it would also be quite useful I think.
https://access.redhat.com/documentation/fr-fr/red_hat_ceph...
Kevin Fox
06:09 PM Feature #57785: fragmentation score in metrics
Thanks for sharing this, Kevin. We discussed this Tracker more in the Telemetry huddle, and we are curious if you wou... Laura Flores
05:11 PM Feature #57785: fragmentation score in metrics
We've had to hack a script together to monitor one of our clusters, and it has been useful to catch an issue:
https:...
Kevin Fox
04:25 PM Feature #57785: fragmentation score in metrics
@Kevin I have asked Paul Cuzner to take a look at this tracker and offer his opinion, as he has done a lot of work fo... Laura Flores

11/15/2022

09:56 AM Bug #48827: Ceph Bluestore OSDs fail to start on WAL corruption
In the customer case running Luminous, when the OSD process was run twice (let's skip how), the assert
'file->fnode.ino' ...
Adam Kupczyk

11/14/2022

05:06 PM Bug #58022 (Pending Backport): Fragmentation score rising by seemingly stuck thread
Due to issue https://tracker.ceph.com/issues/57672 we've been monitoring our clusters closely to ensure it doesn't run i... Kevin Fox
12:10 PM Bug #53466 (Fix Under Review): OSD is unable to allocate free space for BlueFS
Igor Fedotov

11/08/2022

09:20 PM Feature #57785: fragmentation score in metrics
@Vikhyat, no worries. Based on Kevin's comment, I think this metric might be better suited for Prometheus than Teleme... Laura Flores
06:37 PM Feature #57785: fragmentation score in metrics
Laura - sorry I missed the update. Can you please ping Adam and Igor? Vikhyat Umrao
07:37 PM Fix #54299 (Need More Info): osd error restart
Igor Fedotov
07:34 PM Bug #57672 (Duplicate): SSD OSD won't start after high framentation score!
Igor Fedotov
07:27 PM Bug #53466 (In Progress): OSD is unable to allocate free space for BlueFS
Igor Fedotov

10/28/2022

01:39 AM Feature #57785: fragmentation score in metrics
Ultimately, I'd like it in Prometheus, so I can set up alerts if it gets too high. Kevin Fox

10/27/2022

05:28 PM Feature #57785: fragmentation score in metrics
Kevin Fox wrote:
> Currently the bluestore fragmentation score does not seem to be exported in metrics. Due to the i...
Yaarit Hatuka

10/24/2022

11:24 AM Bug #57895: OSD crash in Onode::put()
OK, thanks Igor for your confirmation, I'm reviewing your patch, we can discuss over there. dongdong tao
02:48 AM Bug #57855: cannot enable level_compaction_dynamic_level_bytes
db_paths is not compatible with level_compaction_dynamic_level_bytes. Beom-Seok Park

10/21/2022

06:26 PM Bug #53002: crash BlueStore::Onode::put from BlueStore::TransContext::~TransContext
Hi Sven,
Thanks for reporting telemetry! The issue you reported is tracked in https://tracker.ceph.com/issues/5620...
Yaarit Hatuka
04:41 PM Bug #53002: crash BlueStore::Onode::put from BlueStore::TransContext::~TransContext
We have almost daily crashes on our octopus cluster, which are also reported via telemetry, which look like this bug,... Anonymous
10:28 AM Bug #57895: OSD crash in Onode::put()
dongdong tao wrote:
> Yaarit Hatuka wrote:
> > Status changed from "New" to "Duplicate" since this issue duplicates...
Igor Fedotov
12:20 AM Bug #57895: OSD crash in Onode::put()
Yaarit Hatuka wrote:
> Status changed from "New" to "Duplicate" since this issue duplicates https://tracker.ceph.com...
dongdong tao

10/20/2022

10:47 PM Feature #57785: fragmentation score in metrics
I'm just a user so I can't answer some of the questions. I'll fill in what I know though.
1. Not sure
3. No priva...
Kevin Fox
10:26 PM Feature #57785: fragmentation score in metrics
Hey Kevin (and Vikhyat),
I have a few questions regarding the fragmentation score:
1. Where are all the places ...
Laura Flores
01:51 PM Bug #57895 (Duplicate): OSD crash in Onode::put()
Status changed from "New" to "Duplicate" since this issue duplicates https://tracker.ceph.com/issues/56382. Yaarit Hatuka
10:10 AM Bug #57895: OSD crash in Onode::put()
Please help to review this one, https://github.com/ceph/ceph/pull/48566
Here is the related log: https://pastebin....
dongdong tao
10:54 AM Bug #56851: crash: int BlueStore::read_allocation_from_onodes(SimpleBitmap*, BlueStore::read_allo...
@Sudhin - curious if you can reproduce the issue? If so it would be great to get OSD log with debug-bluestore set to ... Igor Fedotov
10:52 AM Bug #52464: FAILED ceph_assert(current_shard->second->valid())
IMO this is rather related to DB sharding stuff introduced by https://github.com/ceph/ceph/pull/34006
Hence reassign...
Igor Fedotov
10:46 AM Bug #52464: FAILED ceph_assert(current_shard->second->valid())
Neha Ojha wrote:
> Gabi, I am assigning it to you for now, since this looks related to NCB.
No, apparently this i...
Igor Fedotov
09:49 AM Bug #57857 (Fix Under Review): KernelDevice::read doesn't translate error codes correctly
Igor Fedotov
09:40 AM Bug #56382 (Fix Under Review): ONode ref counting is broken
Igor Fedotov
09:10 AM Bug #56382 (Pending Backport): ONode ref counting is broken
Igor Fedotov

10/19/2022

01:45 PM Bug #57855: cannot enable level_compaction_dynamic_level_bytes
I found that the level_compaction_dynamic_level_bytes option does not apply if opt.db_paths exists when opening rocks... Beom-Seok Park
01:26 PM Bug #55324: rocksdb omap iterators become extremely slow in the presence of large delete range to...
Benoît Knecht wrote:
> > I see this was backported in: https://github.com/ceph/ceph/pull/45963 but was later reverte...
Igor Fedotov
12:09 PM Bug #55324: rocksdb omap iterators become extremely slow in the presence of large delete range to...
Sven Kieske wrote:
> I assume this was not backported to the last octopus release?
Yes, Octopus is EOL
Konstantin Shalygin
12:04 PM Bug #55324: rocksdb omap iterators become extremely slow in the presence of large delete range to...
> I see this was backported in: https://github.com/ceph/ceph/pull/45963 but was later reverted in https://github.com/... Benoît Knecht
11:21 AM Bug #55324: rocksdb omap iterators become extremely slow in the presence of large delete range to...
Sven Kieske wrote:
> I don't see the PR showing up in any release notes. I assume this was not backported to the las...
Anonymous
11:16 AM Bug #55324: rocksdb omap iterators become extremely slow in the presence of large delete range to...
I don't see the PR showing up in any release notes. I assume this was not backported to the last octopus release? In ... Anonymous
09:06 AM Bug #55324 (Resolved): rocksdb omap iterators become extremely slow in the presence of large dele...
Igor Fedotov
11:49 AM Bug #57895: OSD crash in Onode::put()
This is observed from 15.2.16, but I believe the code defect that causes this kind of race condition is still present on... dongdong tao
11:42 AM Bug #57895 (Duplicate): OSD crash in Onode::put()

This issue happens when an Onode is being trimmed right away after it's unpinned. This is possible when the LRU lis...
dongdong tao
08:46 AM Bug #55328 (Closed): OSD crashed due to checksum error
Igor Fedotov
 
