Activity
From 10/08/2020 to 11/06/2020
11/06/2020
- 04:39 PM Backport #40449: nautilus: "no available blob id" assertion might occur
- @Alexander - it might make sense to open a new bug in the Bluestore project for that, since this one is closed.
- 03:38 PM Backport #40449: nautilus: "no available blob id" assertion might occur
- Alexander Patrakov wrote:
> Nathan Cutler wrote:
> > This update was made using the script "backport-resolve-issue"...
- 02:37 PM Backport #40449: nautilus: "no available blob id" assertion might occur
- Nathan Cutler wrote:
> This update was made using the script "backport-resolve-issue".
> backport PR https://github...
- 12:36 PM Bug #48036: bluefs corrupted in a OSD
- > Unfortunately, after capturing logs, this problem hasn't been reproduced.
More precisely, with setting `debug-bl...
- 11:14 AM Bug #48036: bluefs corrupted in a OSD
- > > Your initial analysis about ino 26 being removed and later reused is very helpful and indicative. Wondering if th...
11/05/2020
- 07:59 PM Backport #47894 (Resolved): nautilus: Compressed blobs lack checksums
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37843
m...
- 05:19 PM Backport #47894: nautilus: Compressed blobs lack checksums
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37843
merged
- 07:59 PM Backport #47707 (Resolved): nautilus: Potential race condition regression around new OSD flock()s
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37842
m...
- 05:19 PM Backport #47707: nautilus: Potential race condition regression around new OSD flock()s
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37842
merged
- 05:59 PM Backport #46008 (Resolved): nautilus: ObjectStore/StoreTestSpecificAUSize.ExcessiveFragmentation/...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37824
m...
- 05:18 PM Backport #46008: nautilus: ObjectStore/StoreTestSpecificAUSize.ExcessiveFragmentation/2 failed
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37824
merged
11/04/2020
- 02:12 PM Backport #46194 (In Progress): nautilus: BlueFS replay log grows without end
- 11:38 AM Backport #46194: nautilus: BlueFS replay log grows without end
- Fixed by https://github.com/ceph/ceph/pull/37948
11/03/2020
- 11:26 AM Backport #48094 (Resolved): octopus: Hybrid allocator might segfault when fallback allocator is p...
- https://github.com/ceph/ceph/pull/38428
- 11:26 AM Backport #48093 (Resolved): nautilus: Hybrid allocator might segfault when fallback allocator is ...
- https://github.com/ceph/ceph/pull/38637
- 11:25 AM Backport #48092 (Rejected): mimic: Hybrid allocator might segfault when fallback allocator is pre...
- 02:09 AM Bug #48025: osd start up failed when osd superblock crc fail
- Igor Fedotov wrote:
> Bo Zhang wrote:
> > Another bug also appears on the same node. (https://tracker.ceph.com/issue...
11/02/2020
- 10:37 PM Backport #46194 (Need More Info): nautilus: BlueFS replay log grows without end
- first attempted backport - https://github.com/ceph/ceph/pull/37833 - was closed
- 03:34 PM Bug #48036: bluefs corrupted in a OSD
- @Satoru,
given you're able to reproduce the issue locally would you be able to collect OSD log (with debug-bluefs = ...
- 12:50 PM Bug #48036: bluefs corrupted in a OSD
- As you suspect, `bluefs-bdev-expand` seems to be the first sensor. After running the reproducer with my custom Rook, ...
- 12:25 PM Bug #48036: bluefs corrupted in a OSD
- > Nevertheless I'm not completely sure whether bluefs-bdev-expand is a trigger for the issue or it's just the first "...
- 12:10 PM Bug #48036: bluefs corrupted in a OSD
- Hi Satoru,
thanks for the update.
Nevertheless I'm not completely sure whether bluefs-bdev-expand is a trigger for ...
- 11:27 AM Bug #48036: bluefs corrupted in a OSD
- I succeeded in reproducing this problem in my Rook/Ceph cluster.
https://github.com/rook/rook/issues/6530
I gue...
- 12:01 AM Bug #48036: bluefs corrupted in a OSD
- > As far as I can see you're attempting to expand DB volume, weren't you? Any rationale for that?
> Wasn't that a vo...
- 02:18 PM Bug #48070 (New): Wrong bluefs db usage value (doubled) returned by `perf dump` when option `blue...
- During some tests we discovered that OSD db usage returned by `ceph daemon osd.num perf dump` tool is twice the real ...
- 12:26 PM Bug #47751 (Pending Backport): Hybrid allocator might segfault when fallback allocator is present
- 11:35 AM Bug #48025: osd start up failed when osd superblock crc fail
- Bo Zhang wrote:
> Another bug also appears on the same node. (https://tracker.ceph.com/issues/48061)
This another ...
- 02:38 AM Bug #48025: osd start up failed when osd superblock crc fail
- Another bug also appears on the same node. (https://tracker.ceph.com/issues/48061)
- 02:09 AM Bug #48025: osd start up failed when osd superblock crc fail
- Igor Fedotov wrote:
> Bo Jang, I haven't got your last comments on disabled WAL, please elaborate.
>
> From Rocks...
- 02:34 AM Bug #48061 (New): .sst block checksum mismatch
- 【version】
14.2.8
【trigger operation】
Under normal operation of the cluster, power down the equipment manually, and...
11/01/2020
- 05:12 PM Bug #48002: Compaction error: Corruption: block checksum mismatch:
- Starting rebuild of osd.0.
- 06:56 AM Bug #48002: Compaction error: Corruption: block checksum mismatch:
- Will start the recreation of osd.0 tomorrow (roughly 10 hours from now). Will check this bug report before doing so.
10/30/2020
- 10:56 AM Bug #48036: bluefs corrupted in a OSD
- Igor Fedotov wrote:
>
> Please set debug-bluestore & debug-bluefs to 20 and collect OSD startup log.
Never mind...
- 10:41 AM Bug #48036: bluefs corrupted in a OSD
- As far as I can see you're attempting to expand DB volume, weren't you? Any rationale for that?
Wasn't that a volum...
- 10:41 AM Bug #48036: bluefs corrupted in a OSD
- Both
https://tracker.ceph.com/issues/46886
and https://github.com/ceph/ceph/pull/36745
were following up the http...
- 10:28 AM Bug #48025: osd start up failed when osd superblock crc fail
- Bo Jang, I haven't got your last comments on disabled WAL, please elaborate.
From RocksDB config line I don't see ...
- 10:13 AM Bug #48047: osd: fix bluestore stupid allocator
- IMO bdev_block_size should be marked with FLAG_STARTUP (or even FLAG_CREATE) and hence protected from the modificatio...
- 03:33 AM Bug #48047 (Rejected): osd: fix bluestore stupid allocator
- In StupidAllocator::_choose_bin, it uses cct->_conf->bdev_block_size, which can be changed while the allocator is running, but...
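A minimal standalone sketch of the pattern described in this report, not the actual Ceph code: the `Config` type, bin count, and values below are illustrative assumptions. It contrasts re-reading a runtime-mutable block size on every `_choose_bin`-style call (so a concurrent config change can shift which bin a length maps to) with capturing the value once at allocator construction, which is the effect of guarding the option with a startup/create flag as suggested above:
```
#include <algorithm>
#include <atomic>
#include <cstdint>
#include <iostream>

// Illustrative stand-in for a runtime-mutable config value.
struct Config {
  std::atomic<uint64_t> bdev_block_size{4096};
};

// Racy variant: re-reads the (mutable) block size on every call,
// so concurrent config changes can shift which bin a length maps to.
unsigned choose_bin_racy(const Config& conf, uint64_t len) {
  uint64_t units = len / conf.bdev_block_size.load();
  unsigned bin = 0;
  while (units >>= 1) ++bin;          // floor(log2(units))
  return std::min(bin, 9u);           // clamp to a fixed number of bins
}

// Startup-only variant: the block size is captured once at construction,
// mirroring the idea of protecting the option with a startup/create flag.
class BinChooser {
  const uint64_t block_size;
public:
  explicit BinChooser(const Config& conf)
      : block_size(conf.bdev_block_size.load()) {}
  unsigned choose_bin(uint64_t len) const {
    uint64_t units = len / block_size;
    unsigned bin = 0;
    while (units >>= 1) ++bin;
    return std::min(bin, 9u);
  }
};

int main() {
  Config conf;
  BinChooser chooser(conf);           // captures 4096 once
  conf.bdev_block_size = 65536;       // simulate a runtime change
  std::cout << choose_bin_racy(conf, 1 << 20) << " vs "
            << chooser.choose_bin(1 << 20) << "\n";   // prints "4 vs 8"
}
```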
10/29/2020
- 10:59 PM Bug #48002: Compaction error: Corruption: block checksum mismatch:
- I'm planning to zap and rebuild the OSD (@osd.0@) this weekend. Please let me know if there's any information you'd ...
- 02:35 PM Bug #47330 (Fix Under Review): ceph-osd can't start when CURRENT file does not end with newline o...
- 02:33 PM Bug #47453 (Can't reproduce): checksum failures lead to assert on OSD shutdown in lab tests
- 02:26 PM Bug #47874 (Need More Info): Allocation error even though the block has 50 GB free
- 02:24 PM Bug #47883 (Need More Info): bluefs _allocate failed to allocate bdev 1 and 2,cause ceph_assert(r...
- Still waiting for https://tracker.ceph.com/issues/47883#note-5
- 06:20 AM Bug #48036 (Closed): bluefs corrupted in a OSD
- I hit a problem that is very similar to the following issue/PR in v15.2.4.
upgrade/nautilus-x-master: bluefs mount...
- 01:37 AM Bug #48025: osd start up failed when osd superblock crc fail
- Bo Zhang wrote:
> Igor Fedotov wrote:
> > Just in case - don't you have any custom settings for RocksDB, e.g. disab...
- 01:36 AM Bug #48025: osd start up failed when osd superblock crc fail
- Igor Fedotov wrote:
> Just in case - don't you have any custom settings for RocksDB, e.g. disabled WAL?
NOT disab...
- 01:30 AM Bug #48025: osd start up failed when osd superblock crc fail
- Igor Fedotov wrote:
> Just in case - don't you have any custom settings for RocksDB, e.g. disabled WAL?
Has been ...
10/28/2020
- 03:16 PM Bug #46490: osds crashing during deep-scrub
- It seems that the ceph-bluestore-tool repair only temporarily resolves the issue for us.
We ran the repair tool on e...
- 10:51 AM Bug #48025: osd start up failed when osd superblock crc fail
- Just in case - don't you have any custom settings for RocksDB, e.g. disabled WAL?
- 09:52 AM Bug #48025 (New): osd start up failed when osd superblock crc fail
- 【version】
14.2.8
【trigger operation】
Under normal operation of the cluster, power down the equipment ...
- 10:47 AM Bug #47985: When WAL is closed, osd cannot be restarted
- I doubt it will work this way as there would be no onode's metadata consistency guarantee any more... In your case su...
- 06:04 AM Bug #47985: When WAL is closed, osd cannot be restarted
- Hi Igor:
1. we've found disabling WAL would reduce latency (measured by P99.9 latency), as we've tested rgw put worklo...
10/27/2020
- 02:42 PM Bug #48002: Compaction error: Corruption: block checksum mismatch:
- > At the moment I don't see anything else one can retrieve from this daemon. But suggest to keep it for additional...
- 09:48 AM Bug #48002: Compaction error: Corruption: block checksum mismatch:
- Jamin Collins wrote:
> Nothing suspicious about it either. It's the DB device for all the OSDs on that host and is ...
- 09:39 AM Bug #48002: Compaction error: Corruption: block checksum mismatch:
- Jamin Collins wrote:
> Also, should I continue to keep the OSD in its failed state, is there any information that ca...
- 09:37 AM Bug #48002: Compaction error: Corruption: block checksum mismatch:
- Jamin Collins wrote:
> What about the error coinciding precisely with the log volume filling? Any chance that's the...
- 11:11 AM Bug #47985: When WAL is closed, osd cannot be restarted
- In addition, I have also tried to deploy osd first, and then modify the bluestore_rocksdb_options in the configuratio...
- 11:03 AM Bug #47985: When WAL is closed, osd cannot be restarted
- In some application scenarios, I want to close wal in order to get lower latency and higher IOPS. After closing wal, ...
- 09:22 AM Bug #47985: When WAL is closed, osd cannot be restarted
- I haven't investigated this deeper but what's the rationale to disableWAL? Generally this introduces a breach to data...
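For reference on the option under discussion, a minimal standalone RocksDB sketch (this is not Ceph's bluestore_rocksdb_options plumbing; the path, key, and explicit flush are illustrative assumptions): writes issued with `disableWAL` set bypass the write-ahead log, so any update not yet flushed into an SST file is lost if the process crashes, which is the consistency concern raised here.
```
#include <cassert>
#include <rocksdb/db.h>
#include <rocksdb/options.h>

int main() {
  rocksdb::DB* db = nullptr;
  rocksdb::Options options;
  options.create_if_missing = true;
  rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/waltest", &db);
  assert(s.ok());

  // WAL-less write: lower latency, but the update lives only in the
  // memtable until a flush; a crash before that flush loses it.
  rocksdb::WriteOptions wo;
  wo.disableWAL = true;
  s = db->Put(wo, "key", "value");
  assert(s.ok());

  // Only an explicit (or background) flush makes the write durable.
  s = db->Flush(rocksdb::FlushOptions());
  assert(s.ok());

  delete db;
}
```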
- 01:40 AM Bug #47985: When WAL is closed, osd cannot be restarted
- The detailed steps to deploy the cluster are as follows:
1. deploy a cluster without osd
`MON=1 OSD=0 MDS=0 MGR...`
- 09:47 AM Backport #47892 (In Progress): octopus: Compressed blobs lack checksums
- 09:46 AM Backport #47708 (In Progress): octopus: Potential race condition regression around new OSD flock()s
- 08:36 AM Backport #47894 (In Progress): nautilus: Compressed blobs lack checksums
- 08:35 AM Backport #47707 (In Progress): nautilus: Potential race condition regression around new OSD flock()s
- 08:27 AM Backport #47669 (Need More Info): nautilus: Some structs aren't bound to mempools properly
- Not immediately clear how to backport this.
- 08:07 AM Backport #46194 (In Progress): nautilus: BlueFS replay log grows without end
10/26/2020
- 10:31 PM Bug #48002: Compaction error: Corruption: block checksum mismatch:
- Also, should I continue to keep the OSD in its failed state, is there any information that can be retrieved from it t...
- 09:57 PM Bug #48002: Compaction error: Corruption: block checksum mismatch:
- Nothing suspicious about it either. It's the DB device for all the OSDs on that host and is the same as in the previ...
- 09:25 PM Bug #48002: Compaction error: Corruption: block checksum mismatch:
- So there is no spillover to main(HDD) device. Hence the issue is rather not related to this device.
Anything suspi...
- 05:39 PM Bug #48002: Compaction error: Corruption: block checksum mismatch:
- > Additionally for OSDs running on the same host (I presume you haven't restarted them for a while, have you?) please...
- 05:16 PM Bug #48002: Compaction error: Corruption: block checksum mismatch:
- Could you please provide a report for ceph-bluestore-tool's bluefs-bdev-sizes command. Wondering if this OSD has any ...
- 04:42 PM Bug #48002 (New): Compaction error: Corruption: block checksum mismatch:
- I appear to have run into https://tracker.ceph.com/issues/37282 again.
Same AMD based host...
- 09:55 PM Backport #46599 (Resolved): octopus: Rescue procedure for extremely large bluefs log
- 09:51 PM Backport #46008 (In Progress): nautilus: ObjectStore/StoreTestSpecificAUSize.ExcessiveFragmentati...
- 12:04 PM Backport #47671 (In Progress): octopus: Hybrid allocator might cause duplicate admin socket comma...
- https://github.com/ceph/ceph/pull/37794
- 12:03 PM Backport #47672 (In Progress): nautilus: Hybrid allocator might cause duplicate admin socket comm...
- https://github.com/ceph/ceph/pull/37793
- 11:18 AM Bug #47985 (Need More Info): When WAL is closed, osd cannot be restarted
- It's not clear what you meant by "close bluestore wal during deployment, and place bluestore wal/db and block o...
- 09:57 AM Bug #47985 (Need More Info): When WAL is closed, osd cannot be restarted
- Compile the master branch source code, use vstart to deploy the cluster, close bluestore wal during deployment, and p...
- 10:52 AM Bug #38272: "no available blob id" assertion might occur
- Nathan Cutler wrote:
> Jiang Yu wrote:
> > I encountered the same problem in ceph 12.2.2, but found that there is n...
- 10:36 AM Bug #38272: "no available blob id" assertion might occur
- Jiang Yu wrote:
> I encountered the same problem in ceph 12.2.2, but found that there is no patch available in ceph ...
- 01:24 AM Bug #38272: "no available blob id" assertion might occur
- Hello everyone,
I encountered the same problem in ceph 12.2.2, but found that there is no patch available in ceph 1...
10/20/2020
- 12:43 PM Bug #47883: bluefs _allocate failed to allocate bdev 1 and 2,cause ceph_assert(r == 0)
- Igor Fedotov wrote:
> chunsong feng wrote:
> Hi Feng,
>
> you're using master Ceph branch, right?
right
> Am I...
- 10:56 AM Bug #47883: bluefs _allocate failed to allocate bdev 1 and 2,cause ceph_assert(r == 0)
- Also prior to redeploying the cluster/OSDs you might want to learn current fragmentation rating using:
ceph-bluestor...
- 10:38 AM Bug #47883: bluefs _allocate failed to allocate bdev 1 and 2,cause ceph_assert(r == 0)
- chunsong feng wrote:
Hi Feng,
you're using master Ceph branch, right?
Am I getting properly OSD is unable to res...
- 01:01 AM Bug #47883: bluefs _allocate failed to allocate bdev 1 and 2,cause ceph_assert(r == 0)
- Test Allocator=stupid and hybrid respectively.
When creating an OSD, only data is used. Block-wal and block-db are n...
- 11:40 AM Bug #47874: Allocation error even though the block has 50 GB free
- Hi Fabian,
Bluestore is unable to allocate space for additional BlueFS data. This line:
2020-10-15 18:36:36.526 7f8...
- 11:24 AM Bug #45519: OSD asserts during block allocation for BlueFS
- You might want to use free-dump command to inspect actual free extents layout:
ceph-bluestore-tool --path <> free-...
10/19/2020
- 08:33 AM Backport #47895 (Rejected): mimic: Compressed blobs lack checksums
- 08:33 AM Backport #47894 (Resolved): nautilus: Compressed blobs lack checksums
- https://github.com/ceph/ceph/pull/37843
- 08:33 AM Backport #47893 (Rejected): luminous: Compressed blobs lack checksums
- 08:33 AM Backport #47892 (Resolved): octopus: Compressed blobs lack checksums
- https://github.com/ceph/ceph/pull/37861
- 02:16 AM Bug #47883: bluefs _allocate failed to allocate bdev 1 and 2,cause ceph_assert(r == 0)
- The problem recurs within 10 minutes after a test is performed on more than 32 KB concurrent random writes.
- 01:38 AM Bug #47883 (Resolved): bluefs _allocate failed to allocate bdev 1 and 2,cause ceph_assert(r == 0)
- 2020-10-17T18:02:17.658+0800 ffff7eaac980 1 bluefs _allocate failed to allocate 0x4d0000 on bdev 1, free 0x19c210800...
- 01:31 AM Bug #47243: bluefs _allocate failed then assert
- 2020-10-17T18:02:17.658+0800 ffff7eaac980 1 bluefs _allocate failed to allocate 0x4d0000 on bdev 1, free 0x19c210800...
10/18/2020
- 03:14 PM Bug #45519: OSD asserts during block allocation for BlueFS
- I just redeployed an OSD that was created at 2020-10-16 16:35:49.954951487 +0000, it is currently Sun Oct 18 15:13:07...
10/16/2020
- 04:19 PM Bug #45519: OSD asserts during block allocation for BlueFS
- > Are you using v14.2.11 from the beginning or OSDs suffering from high fragmentation were deployed (and used) with t...
- 03:05 PM Bug #45519: OSD asserts during block allocation for BlueFS
- Mohammed Naser wrote:
> Thanks for responding, this is actually running Nautilus with the default allocator (so `hyb...
- 02:35 PM Bug #45519: OSD asserts during block allocation for BlueFS
- Thanks for responding, this is actually running Nautilus with the default allocator (so `hybrid`). Perhaps we should...
- 04:11 AM Bug #47740: OSD crash when increase pg_num
- Igor Fedotov wrote:
> Definitely it's not expected to crash but from my experience it's a bad practice to put DB dat...
10/15/2020
- 06:46 PM Bug #47874 (Need More Info): Allocation error even though the block has 50 GB free
- We started seeing our ceph cluster going down when OSD hit 20GB usage. Currently all our OSDs have 70 GB disks attache...
- 02:59 PM Bug #46658 (Rejected): Ceph-OSD nautilus/octopus memory leak ?
- 02:59 PM Bug #46658: Ceph-OSD nautilus/octopus memory leak ?
- No problem.
- 02:51 PM Bug #46658: Ceph-OSD nautilus/octopus memory leak ?
- Thanks for your help :)
- 02:46 PM Bug #46658: Ceph-OSD nautilus/octopus memory leak ?
- Igor Fedotov wrote:
> Christophe Hauquiert wrote:
> > I just realized that osd_memory_target was set to ~20GB (a...
- 02:13 PM Bug #46658: Ceph-OSD nautilus/octopus memory leak ?
- Christophe Hauquiert wrote:
> I just realized that osd_memory_target was set to ~20GB (as Adam Kupczyk already no...
- 02:32 PM Bug #45519: OSD asserts during block allocation for BlueFS
- Mohammed Naser wrote:
> I did a whole bunch of digging today and found out that for some reason, there is a non-triv...
- 01:59 PM Bug #47740: OSD crash when increase pg_num
- 玮文 胡 wrote:
> Igor Fedotov wrote:
> > 1) Are osd_op_thread's timeouts observed in OSD log after its restart? If...
- 06:28 AM Bug #47661: Cannot allocate memory appears when using io_uring osd
- The kernel panic problem can be solved by upgrading to 5.4.0-49.
But ceph osd will crash abnormally after running fo...
- 06:22 AM Bug #45765: BlueStore::_collection_list causes huge latency growth pg deletion
- > Besides recently we switched back to direct IO for bluefs, see https://github.com/ceph/ceph/pull/34297
> Likely ...
10/14/2020
- 02:26 PM Bug #47475 (Pending Backport): Compressed blobs lack checksums
- 01:50 AM Bug #47661: Cannot allocate memory appears when using io_uring osd
- @
[Tue Oct 13 10:26:03 2020] pps pps0: new PPS source ptp2
[Tue Oct 13 10:26:03 2020] ixgbe 0000:04:00.0: registere...
10/13/2020
- 04:26 PM Bug #47740: OSD crash when increase pg_num
- Igor Fedotov wrote:
> 1) Are osd_op_thread's timeouts observed in OSD log after its restart? If so - could you p...
10/10/2020
- 11:41 AM Bug #47740: OSD crash when increase pg_num
- Neha Ojha wrote:
> Igor, does this look like one of the other _trim_to crashes we had seen in our teuthology runs an...
10/09/2020
- 09:44 PM Bug #47740: OSD crash when increase pg_num
- Igor, does this look like one of the other _trim_to crashes we had seen in our teuthology runs and are now fixed by y...