Activity
From 09/05/2022 to 10/04/2022
10/04/2022
- 05:39 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- Sure. https://tracker.ceph.com/issues/57762
- 03:51 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- Kevin Fox wrote:
> For the record, ssd/ssd or hdd/hdd seems to work fine even though the documentation makes it soun...
- 03:30 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- For the record, ssd/ssd or hdd/hdd seems to work fine even though the documentation makes it sound like it doesn't.
...
- 05:38 PM Documentation #57762 (New): documentation about same hardware class wrong
- The documentation in at least one place:
https://docs.ceph.com/en/pacific/man/8/ceph-bluestore-tool/ bluefs-bdev-mig...
09/30/2022
- 02:34 AM Bug #57672: SSD OSD won't start after high fragmentation score!
- So, it looks like moving the db to a db volume works with ceph-bluestore-tool bluefs-bdev-migrate. So most of the way...
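For reference, a minimal sketch of how such a DB migration is typically driven with ceph-bluestore-tool, assuming the OSD is stopped first; the OSD id, LV name, and paths below are placeholders, not taken from this report:
```
# Hypothetical example: attach a dedicated DB volume to a stopped OSD and
# migrate the BlueFS DB data onto it. OSD id and device paths are placeholders.
systemctl stop ceph-osd@2                      # stop the OSD (however your environment manages OSDs)
ceph-bluestore-tool bluefs-bdev-new-db \
    --path /var/lib/ceph/osd/ceph-2 \
    --dev-target /dev/vg_db/lv_osd2_db         # the new dedicated DB volume
ceph-bluestore-tool bluefs-bdev-migrate \
    --path /var/lib/ceph/osd/ceph-2 \
    --devs-source /var/lib/ceph/osd/ceph-2/block \
    --dev-target /var/lib/ceph/osd/ceph-2/block.db
```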
09/29/2022
- 09:01 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- Igor Fedotov wrote:
> The issue with that fragmentation score is that there is no strong math behind it. Originally ...
- 08:39 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- Igor Fedotov wrote:
> If this is still available - may I ask you to run ceph-bluestore-tool's free-dump command and ...
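For reference, the free-dump command mentioned above is typically run against a stopped OSD roughly as below; the OSD path and output file are placeholders, not taken from this report:
```
# Hypothetical example: dump the allocator's free-extent map of a stopped OSD
# so its fragmentation can be inspected offline. Path and file are placeholders.
ceph-bluestore-tool free-dump --path /var/lib/ceph/osd/ceph-2 > osd.2-free-dump.json
```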
- 08:38 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- Igor Fedotov wrote:
> This is totally irrelevant - these are warnings showing legacy formatted omaps for this OSD....
- 08:15 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- Kevin Fox wrote:
> Random other thing... during repairs, I see:
> [root@pc20 ceph]# ceph-bluestore-tool --log-lev...
- 08:10 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- Kevin Fox wrote:
> Just saw this again, on a small scale. just one of the osds that I had moved the db off to its ow...
- 08:08 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- Kevin Fox wrote:
> One note I see in the rook documentation:
> "Notably, ceph-volume will not use a device of the s... - 08:06 PM Bug #57672: SSD OSD won't start after high framentation score!
- Kevin Fox wrote:
> Hi Igor,
>
>
> Does the fragmentation score alone show how fragmented things are? I still se...
- 04:30 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- Random other thing... during repairs, I see:
[root@pc20 ceph]# ceph-bluestore-tool --log-level 30 --path /var/lib/...
- 04:19 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- Got some more info... during the outage, I had 12 drives that wouldn't recover by moving the db off. Looking back thr...
- 03:51 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- Just saw this again, on a small scale. just one of the osds that I had moved the db off to its own volume, just enter...
09/28/2022
- 08:00 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- One note I see in the rook documentation:
"Notably, ceph-volume will not use a device of the same device class (HDD,... - 07:32 PM Bug #57672: SSD OSD won't start after high framentation score!
- Hi Igor,
Thanks for the details. That makes sense and helps me feel much more comfortable that the hack I put in p...
- 09:07 AM Bug #57672: SSD OSD won't start after high fragmentation score!
- Kevin Fox wrote:
> I can find no evidence that the cluster got full. I've seen it occasionally go up a little past 8...
- 09:21 AM Backport #57688 (In Progress): quincy: unable to read osd superblock on AArch64 with page size 64K
- https://github.com/ceph/ceph/pull/48279
- 09:19 AM Backport #57687 (In Progress): pacific: unable to read osd superblock on AArch64 with page size 64K
- https://github.com/ceph/ceph/pull/48278
09/27/2022
- 06:04 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- Thank you, Igor. I think Kevin has answered with as much background as he had on the issue.
- 04:34 PM Backport #57688 (Resolved): quincy: unable to read osd superblock on AArch64 with page size 64K
- 04:34 PM Backport #57687 (Resolved): pacific: unable to read osd superblock on AArch64 with page size 64K
- 04:18 PM Bug #57537 (Pending Backport): unable to read osd superblock on AArch64 with page size 64K
09/26/2022
- 05:37 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- I can find no evidence that the cluster got full. I've seen it occasionally go up a little past 85 (usually if I'm re...
- 05:34 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- I switched one of the pods to /bin/bash and tried various things to fsck the osds. Every time it hit the point where ...
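For context, an offline check of a stopped OSD with ceph-bluestore-tool usually looks roughly like this sketch; the OSD path is a placeholder, not taken from this report:
```
# Hypothetical example: offline consistency check (and repair) of a stopped OSD.
# The path is a placeholder; a deep fsck that also verifies object data can be
# requested with the --deep option described in the ceph-bluestore-tool man page.
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-2
ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-2   # only if fsck reports errors
```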
- 05:06 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- The cluster involved was provisioned with ceph:v14.2.4-20190917 in Oct 2019. It's been running nautilus until last month....
- 09:12 AM Bug #57672 (Need More Info): SSD OSD won't start after high fragmentation score!
- 09:12 AM Bug #57672: SSD OSD won't start after high fragmentation score!
- @Vikhyat - what Ceph release are we talking about?
- 08:46 AM Bug #57292 (Fix Under Review): Failed to start OSD when upgrading from nautilus to pacific with b...
09/23/2022
- 06:10 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- User question:...
- 06:09 PM Bug #57672: SSD OSD won't start after high fragmentation score!
- The user was not able to capture any debug data because it hit the cluster so hard that it went down.
- 06:07 PM Bug #57672 (Duplicate): SSD OSD won't start after high fragmentation score!
- One of the rook upstream users reported this issue in the upstream rook channel!...
09/21/2022
- 12:06 AM Bug #57507: rocksdb crashed due to checksum mismatch
- Thank you very much Igor! I'll try this workaround and will update to v16.2.11 or later.
09/20/2022
- 09:51 PM Bug #52464: FAILED ceph_assert(current_shard->second->valid())
- Also reported via telemetry:
http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?var-sig_v2=245d...
- 05:21 PM Bug #52464 (New): FAILED ceph_assert(current_shard->second->valid())
- It happened again: https://github.com/rook/rook/issues/10936. For logs see: https://github.com/rook/rook/issues/10936...
09/19/2022
- 04:43 PM Backport #57604 (Resolved): quincy: Log message is a little confusing
- 04:43 PM Backport #57603 (Resolved): pacific: Log message is a little confusing
- 04:34 PM Bug #57271 (Pending Backport): Log message is a little confusing
- 03:34 PM Bug #57271: Log message is a little confusing
- https://github.com/ceph/ceph/pull/47774 merged
- 11:58 AM Backport #57028 (In Progress): quincy: Bluefs might put an orphan op_update record in the log
- https://github.com/ceph/ceph/pull/48171
- 11:54 AM Backport #55301 (In Progress): quincy: Hybrid allocator might return duplicate extents when perfo...
- https://github.com/ceph/ceph/pull/48170
- 11:49 AM Backport #57458 (In Progress): quincy: bluefs fsync doesn't respect file truncate
- https://github.com/ceph/ceph/pull/48169
- 11:03 AM Backport #57027 (In Progress): pacific: Bluefs might put an orphan op_update record in the log
- https://github.com/ceph/ceph/pull/48168
- 10:47 AM Backport #55302 (Rejected): octopus: Hybrid allocator might return duplicate extents when perform...
- Octopus has reached EOL
- 10:47 AM Backport #55300 (In Progress): pacific: Hybrid allocator might return duplicate extents when perf...
- https://github.com/ceph/ceph/pull/48167
- 09:11 AM Backport #55518 (Resolved): pacific: test_cls_rbd.sh: multiple TestClsRbd failures during upgrade...
- 09:10 AM Bug #54465 (Resolved): BlueFS broken sync compaction mode
- 09:10 AM Backport #55024 (Resolved): quincy: BlueFS broken sync compaction mode
- 09:09 AM Bug #54248 (Resolved): BlueFS improperly tracks vselector sizes in _flush_special()
- 09:09 AM Backport #54318 (Resolved): quincy: BlueFS improperly tracks vselector sizes in _flush_special()
- 09:08 AM Bug #53907 (Resolved): BlueStore.h: 4148: FAILED ceph_assert(cur >= p.length)
- 09:03 AM Backport #54209 (Resolved): quincy: BlueStore.h: 4148: FAILED ceph_assert(cur >= p.length)
09/16/2022
- 07:00 PM Bug #53184: failed to start new osd due to SIGSEGV in BlueStore::read()
- Hey Igor, do we have any more insight on this issue?
09/15/2022
- 10:00 AM Bug #57537: unable to read osd superblock on AArch64 with page size 64K
- Well, I can see your PR, hence the root cause is apparently more or less clear. So not much sense in the above question...
- 09:43 AM Bug #57537 (Fix Under Review): unable to read osd superblock on AArch64 with page size 64K
- 09:29 AM Bug #57537: unable to read osd superblock on AArch64 with page size 64K
- Hi @luo - could you please answer the following questions:
- what Ceph release are you using?
- Is this a container...
- 09:11 AM Bug #57507 (Triaged): rocksdb crashed due to checksum mismatch
- 09:10 AM Bug #57507: rocksdb crashed due to checksum mismatch
- I think you might be facing https://tracker.ceph.com/issues/54547
As far as I can see the preconditions for this bug...
09/14/2022
- 11:40 AM Bug #57537 (Resolved): unable to read osd superblock on AArch64 with page size 64K
- On AArch64 with a 64K page size, "OSD::init(): unable to read osd superblock" occasionally occurs when deploying osd...
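As a side note, a quick (hedged) way to confirm the kernel page size on such a node:
```
# Prints the page size in bytes; 65536 indicates 64K pages, 4096 indicates 4K.
getconf PAGESIZE
```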
09/13/2022
- 01:20 AM Bug #57507 (Duplicate): rocksdb crashed due to checksum mismatch
- OSD crashed with the following log.
```
2022-09-09 03:03:05 debug 2022-09-08T18:03:05.665+0000 7f236763e080 1 os...
09/12/2022
- 06:52 AM Bug #53266 (Resolved): default osd_fast_shutdown=true would cause NCB to recover allocation map o...
- 06:50 AM Backport #54523 (Resolved): quincy: default osd_fast_shutdown=true would cause NCB to recover all...
09/08/2022
- 05:25 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- Wow, thank you, this is amazing. And great write-up in that kernel commit!
- 09:06 AM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- Trent, thanks a lot for the update! Mind reposting this information on the ceph-users mailing list? Or I can do that myse...
- 06:54 AM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- At Canonical we tracked down and solved the cause of this bug. Credit to my colleague Mauricio Faria de Oliveira for ...
- 08:21 AM Bug #56456: rook-ceph-v1.9.5: ceph-osd crash randomly
- Igor,
We can't know how large the peak is since we removed the rook osd limits.
Without limit, the biggest OSD is takin...
09/07/2022
- 04:28 PM Bug #56456: rook-ceph-v1.9.5: ceph-osd crash randomly
- Hi Sebastien,
does the above mean that OSDs are using more than 15GB of RAM at some point?
How large is the peak t...
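For reference, a couple of hedged ways to gauge an OSD's memory footprint; the OSD id is a placeholder, not taken from this report:
```
# Hypothetical example: inspect per-pool memory accounting via the OSD's admin
# socket (run on the host where osd.3 lives), and check its configured target.
ceph daemon osd.3 dump_mempools
ceph config get osd.3 osd_memory_target
```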
- 11:54 AM Backport #57458 (Resolved): quincy: bluefs fsync doesn't respect file truncate
- 11:52 AM Bug #55307 (Pending Backport): bluefs fsync doesn't respect file truncate