Activity
From 01/27/2021 to 02/25/2021
02/25/2021
- 10:15 PM Bug #49394: another terminate called after throwing an instance of 'std::bad_alloc'
- Per #49387 (and an email from Casey) could be an issue with the tcmalloc version.
- 08:52 PM Backport #48282 (In Progress): nautilus: osd: fix bluestore bitmap allocator
- 05:16 PM Backport #49100: pacific: crash in BlueStore::Onode::put()
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39228
m...
- 05:16 PM Backport #49097: pacific: FAILED ceph_assert(o->pinned) in BlueStore::Collection::split_cache(Blu...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39228
m...
- 04:55 PM Backport #48193 (In Progress): nautilus: bufferlist c_str() sometimes clears assignment to mempool
- 04:24 PM Bug #49285: OSD Crash: Compaction error: Corruption: block checksum mismatch
- Chris K wrote:
> Yes, both nodes are using swap. Only an 8g file on the / filesystem though. It's backed by a raid ...
- 04:19 PM Backport #49481 (In Progress): octopus: Bluefs improperly handles huge (>4GB) writes which causes...
- https://github.com/ceph/ceph/pull/39701
- 06:25 AM Backport #49481 (Resolved): octopus: Bluefs improperly handles huge (>4GB) writes which causes da...
- https://github.com/ceph/ceph/pull/39701
- 02:58 PM Backport #49479 (In Progress): pacific: Bluefs improperly handles huge (>4GB) writes which causes...
- https://github.com/ceph/ceph/pull/39688
- 06:25 AM Backport #49479 (Resolved): pacific: Bluefs improperly handles huge (>4GB) writes which causes da...
- https://github.com/ceph/ceph/pull/39688
- 02:57 PM Backport #49480 (In Progress): nautilus: Bluefs improperly handles huge (>4GB) writes which cause...
- https://github.com/ceph/ceph/pull/39698
- 06:25 AM Backport #49480 (Resolved): nautilus: Bluefs improperly handles huge (>4GB) writes which causes d...
- https://github.com/ceph/ceph/pull/39698
- 01:24 PM Backport #48950: pacific: ObjectStore/StoreTest hangs
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/38989
m...
- 06:25 AM Backport #49478 (Rejected): luminous: Bluefs improperly handles huge (>4GB) writes which causes d...
- 06:25 AM Backport #49477 (Rejected): mimic: Bluefs improperly handles huge (>4GB) writes which causes data...
- 06:22 AM Bug #49168 (Pending Backport): Bluefs improperly handles huge (>4GB) writes which causes data cor...
02/24/2021
- 08:59 PM Bug #49285: OSD Crash: Compaction error: Corruption: block checksum mismatch
- Yes, both nodes are using swap. Only an 8g file on the / filesystem though. It's backed by a raid 10 array provided ...
- 05:48 PM Bug #49285: OSD Crash: Compaction error: Corruption: block checksum mismatch
- Wondering if you have swap enabled for your OSD nodes? And maybe excessive RSS memory (much above configured osd-mem...
02/19/2021
- 09:14 PM Bug #49394 (Resolved): another terminate called after throwing an instance of 'std::bad_alloc'
- ...
- 03:27 PM Bug #49383: BlueFS reads might improperly rebuild internal buffer under a shared lock
- Presumably caused by: https://github.com/ceph/ceph/commit/054355934a59bf4c08aa994fbab97a0f96cab31c
- 03:24 PM Bug #49383 (Resolved): BlueFS reads might improperly rebuild internal buffer under a shared lock
- Both the read and read_random methods in BlueFS call the bufferlist::c_str() method against a shared buffer under a read lock.
...
- 03:25 PM Backport #49386 (Resolved): octopus: BlueFS reads might improperly rebuild internal buffer under ...
- https://github.com/ceph/ceph/pull/39884
- 03:25 PM Backport #49385 (Resolved): nautilus: BlueFS reads might improperly rebuild internal buffer under...
- https://github.com/ceph/ceph/pull/39883
- 03:25 PM Backport #49384 (Resolved): pacific: BlueFS reads might improperly rebuild internal buffer under ...
- https://github.com/ceph/ceph/pull/39881
02/17/2021
- 10:29 PM Bug #49138: blk/kernel/KernelDevice.cc: void KernelDevice::_aio_thread() Unexpected IO error
- And prior log I dissected showed 4916555776 bytes allocated shortly before the crash.
- 10:23 PM Bug #49138: blk/kernel/KernelDevice.cc: void KernelDevice::_aio_thread() Unexpected IO error
- Neha Ojha wrote:
> https://pulpito.ceph.com/ksirivad-2021-02-16_08:28:04-rados:mgr-wip-fix-test-turn-off-module-dist...
- 08:57 PM Bug #49138: blk/kernel/KernelDevice.cc: void KernelDevice::_aio_thread() Unexpected IO error
- https://pulpito.ceph.com/ksirivad-2021-02-16_08:28:04-rados:mgr-wip-fix-test-turn-off-module-distro-basic-smithi/
02/16/2021
- 07:14 PM Backport #49099: octopus: crash in BlueStore::Onode::put()
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39230
m...
- 07:14 PM Backport #49098: octopus: FAILED ceph_assert(o->pinned) in BlueStore::Collection::split_cache(Blu...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39230
m...
02/15/2021
- 06:19 PM Bug #49285: OSD Crash: Compaction error: Corruption: block checksum mismatch
- Hello Igor,
These OSDs can be restarted and it seems they resync and step back in line without trouble.
I'll ...
- 06:15 PM Bug #49285: OSD Crash: Compaction error: Corruption: block checksum mismatch
- Igor Fedotov wrote:
> Are these failures volatile? I mean is affected OSD able to startup successfully after a while...
- 06:08 PM Bug #49285: OSD Crash: Compaction error: Corruption: block checksum mismatch
- Are these failures volatile? I mean is affected OSD able to startup successfully after a while?
- 06:05 PM Bug #48781 (Resolved): crash in BlueStore::Onode::put()
- 06:05 PM Backport #49099 (Resolved): octopus: crash in BlueStore::Onode::put()
- 04:47 PM Backport #49099: octopus: crash in BlueStore::Onode::put()
- Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/39230
merged
- 06:04 PM Bug #48966 (Resolved): FAILED ceph_assert(o->pinned) in BlueStore::Collection::split_cache(BlueSt...
- 06:04 PM Backport #49098 (Resolved): octopus: FAILED ceph_assert(o->pinned) in BlueStore::Collection::spli...
- 04:48 PM Backport #49098: octopus: FAILED ceph_assert(o->pinned) in BlueStore::Collection::split_cache(Blu...
- https://github.com/ceph/ceph/pull/39230 merged
02/14/2021
- 04:11 PM Bug #45519: OSD asserts during block allocation for BlueFS
- I faced the same issue in 14.2.14
My cluster was recovering the degraded PGs and one of my OSDs' db got full! After ...
02/12/2021
- 07:59 PM Bug #49285 (Closed): OSD Crash: Compaction error: Corruption: block checksum mismatch
- We're encountering periodic OSD crashes during test runs on some new hardware.
For this report I'm using osd 101 a...
- 04:58 PM Backport #49100 (Resolved): pacific: crash in BlueStore::Onode::put()
- 04:17 PM Backport #49100: pacific: crash in BlueStore::Onode::put()
- Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/39228
merged
- 04:57 PM Backport #49097 (Resolved): pacific: FAILED ceph_assert(o->pinned) in BlueStore::Collection::spli...
- https://github.com/ceph/ceph/pull/39228
- 04:17 PM Backport #49097: pacific: FAILED ceph_assert(o->pinned) in BlueStore::Collection::split_cache(Blu...
- Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/39228
merged
- 01:32 PM Bug #49138: blk/kernel/KernelDevice.cc: void KernelDevice::_aio_thread() Unexpected IO error
- Yesterday I observed similar issue with the same error code for my vstart cluster. The root cause (in my case) was th...
02/11/2021
- 11:32 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
- Chris K wrote:
> I think I have encountered this same issue on 15.2.5 running ubuntu 18.04.5. I reproduced the prob...
- 09:32 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
- I think I have encountered this same issue on 15.2.5 running ubuntu 18.04.5. I reproduced the problem with debug_roc...
- 07:02 PM Bug #49256 (Can't reproduce): src/os/bluestore/BlueStore.cc: FAILED ceph_assert(!c)
- ...
- 10:04 AM Bug #49138: blk/kernel/KernelDevice.cc: void KernelDevice::_aio_thread() Unexpected IO error
- Neha Ojha wrote:
> could this be related to the recent out of space issues in the lab?
>
> [...]
It's impossible...
02/10/2021
- 07:24 PM Bug #49138: blk/kernel/KernelDevice.cc: void KernelDevice::_aio_thread() Unexpected IO error
- could this be related to the recent out of space issues in the lab?...
02/09/2021
02/08/2021
- 10:18 PM Backport #48478: octopus: bluefs _allocate failed to allocate bdev 1 and 2,cause ceph_assert(r == 0)
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/38474
m...
- 04:19 PM Backport #48478 (Resolved): octopus: bluefs _allocate failed to allocate bdev 1 and 2,cause ceph_...
- 04:17 PM Backport #48478: octopus: bluefs _allocate failed to allocate bdev 1 and 2,cause ceph_assert(r == 0)
- Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/38474
merged
- 04:19 PM Bug #47883 (Resolved): bluefs _allocate failed to allocate bdev 1 and 2,cause ceph_assert(r == 0)
- 04:07 PM Bug #49138: blk/kernel/KernelDevice.cc: void KernelDevice::_aio_thread() Unexpected IO error
- So using teuthology-2021-01-07_07:01:02-rados-master-distro-basic-smithi/5762380.
The crashes are apparently at
...
- 01:25 PM Bug #49110 (Triaged): BlueFS.cc: 1542: FAILED assert(r == 0)
- Once fresher Luminous build (highly likely to be done on your own) is obtained you might try recovery procedure provi...
- 01:20 PM Bug #49110: BlueFS.cc: 1542: FAILED assert(r == 0)
- Given Ceph version provided and huge BlueFS log file size:
-3> 2021-02-06 09:39:29.927409 7ff72040dec0 10 bluefs _re...
02/07/2021
- 09:26 PM Bug #49110: BlueFS.cc: 1542: FAILED assert(r == 0)
- Here is the last file: ceph-osd.108.log.10.xz
- 09:25 PM Bug #49110: BlueFS.cc: 1542: FAILED assert(r == 0)
- I now have a more complete dump.
Upfront I ran...
02/05/2021
- 11:13 AM Bug #49168 (In Progress): Bluefs improperly handles huge (>4GB) writes which causes data corruption
02/04/2021
- 06:43 PM Bug #45903 (Resolved): BlueFS replay log grows without end
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 04:36 PM Bug #49138: blk/kernel/KernelDevice.cc: void KernelDevice::_aio_thread() Unexpected IO error
- ...
- 03:16 PM Bug #49170: BlueFS might end-up with huge WAL files when upgrading OMAPs
- Here is a sample of such a huge (7+GB) WAL file:
0x0: op_file_update file(ino 431989 size 0x1cf0c7888 mtime 202...
- 03:13 PM Bug #49170: BlueFS might end-up with huge WAL files when upgrading OMAPs
- And I believe we've heard about some more cases when BlueFS got corrupted after upgrade to Octopus....
- 03:08 PM Bug #49170: BlueFS might end-up with huge WAL files when upgrading OMAPs
- Huge WAL files aren't good by themselves but BlueFS has a bug in handling >4GB writes which presumably cause data cor...
- 03:06 PM Bug #49170 (Resolved): BlueFS might end-up with huge WAL files when upgrading OMAPs
- When performing OMAP on-disk format upgrade BlueStore's fsck might flood BlueFS with converting transactions which re...
- 02:42 PM Bug #49168: Bluefs improperly handles huge (>4GB) writes which causes data corruption
- The above looks like 32-bit value overflow and indeed BlueFS::_flush() uses FileWriter::get_buffer_length() which is ...
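The 32-bit overflow class described in #49168 can be sketched as follows; the helper names and the example value are hypothetical, not the actual BlueFS::_flush() / FileWriter code. The point is that a pending-write length above 4 GiB silently loses its high bits when narrowed through a 32-bit type, so the flushed length no longer matches what the writer buffered.

```cpp
#include <cstdint>

// Hypothetical helper: what happens when a 64-bit byte count is passed
// through a 32-bit type, as in the buffer-length handling behind #49168.
uint32_t narrowed_length(uint64_t pending_bytes) {
    return static_cast<uint32_t>(pending_bytes);  // drops bits 32..63
}

// True only while the pending write fits in 32 bits, i.e. is < 4 GiB.
bool length_survives(uint64_t pending_bytes) {
    return narrowed_length(pending_bytes) == pending_bytes;
}
```

For example, a buffer of 4 GiB + 1 byte (0x100000001) narrows to a length of 1, which matches the symptom of flush offsets/lengths disagreeing with the data actually buffered.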
- 02:37 PM Bug #49168 (Resolved): Bluefs improperly handles huge (>4GB) writes which causes data corruption
- Here is the symptomatic log snippet, please note the length(9136e44b) in _flush() call and offset/length in subsequen...
- 09:56 AM Bug #49110: BlueFS.cc: 1542: FAILED assert(r == 0)
- My first try with ...
02/03/2021
- 06:31 PM Bug #49138: blk/kernel/KernelDevice.cc: void KernelDevice::_aio_thread() Unexpected IO error
- Neha Ojha wrote:
> I would have thought so too, but it is weird that I am seeing it on multiple smithi machines now.... - 06:28 PM Bug #49138: blk/kernel/KernelDevice.cc: void KernelDevice::_aio_thread() Unexpected IO error
- I would have thought so too, but it is weird that I am seeing it on multiple smithi machines now.
/a/yuriw-2021-01...
- 05:56 PM Bug #49138: blk/kernel/KernelDevice.cc: void KernelDevice::_aio_thread() Unexpected IO error
- From my experience this highly likely means real H/W problems... I'd suggest checking with dmesg and/or smartctl.
- 04:54 PM Bug #49138 (New): blk/kernel/KernelDevice.cc: void KernelDevice::_aio_thread() Unexpected IO error
- ...
- 05:37 PM Backport #48281: octopus: osd: fix bluestore bitmap allocator
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/38430
m...
- 12:39 PM Backport #48281 (Resolved): octopus: osd: fix bluestore bitmap allocator
- 12:43 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
- Adam has submitted a PR which might be helpful in detecting transient read errors:
https://github.com/ceph/ceph/pu...
- 12:09 PM Bug #40300: ceph-osd segfault: "rocksdb: Corruption: file is too short"
- Nautilus backport:
https://github.com/ceph/ceph/pull/39254
Octopus backport:
https://github.com/ceph/ceph/pull/3...
02/02/2021
- 09:40 PM Bug #49110: BlueFS.cc: 1542: FAILED assert(r == 0)
- Could you please set debug-bluefs to 20, start OSD again and share the relevant OSD log... Or at least last 20000 lin...
- 06:47 PM Bug #49110 (Won't Fix): BlueFS.cc: 1542: FAILED assert(r == 0)
- All the SSD based OSDs in my ceph cluster crashed.
The initial error was:... - 04:12 PM Backport #48281: octopus: osd: fix bluestore bitmap allocator
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/38430
merged
- 03:24 PM Backport #49098 (In Progress): octopus: FAILED ceph_assert(o->pinned) in BlueStore::Collection::s...
- 11:11 AM Backport #49098 (Resolved): octopus: FAILED ceph_assert(o->pinned) in BlueStore::Collection::spli...
- https://github.com/ceph/ceph/pull/39230
- 03:24 PM Backport #49097: pacific: FAILED ceph_assert(o->pinned) in BlueStore::Collection::split_cache(Blu...
- https://github.com/ceph/ceph/pull/39228
- 03:22 PM Backport #49097 (In Progress): pacific: FAILED ceph_assert(o->pinned) in BlueStore::Collection::s...
- 11:11 AM Backport #49097 (Resolved): pacific: FAILED ceph_assert(o->pinned) in BlueStore::Collection::spli...
- https://github.com/ceph/ceph/pull/39228
- 03:23 PM Backport #49100 (In Progress): pacific: crash in BlueStore::Onode::put()
- https://github.com/ceph/ceph/pull/39228
- 11:12 AM Backport #49100 (Resolved): pacific: crash in BlueStore::Onode::put()
- https://github.com/ceph/ceph/pull/39228
- 03:22 PM Backport #49099 (In Progress): octopus: crash in BlueStore::Onode::put()
- https://github.com/ceph/ceph/pull/39230
- 11:12 AM Backport #49099 (Resolved): octopus: crash in BlueStore::Onode::put()
- https://github.com/ceph/ceph/pull/39230
- 11:12 AM Bug #48781 (Pending Backport): crash in BlueStore::Onode::put()
- 09:15 AM Bug #48781 (Fix Under Review): crash in BlueStore::Onode::put()
- @Tom - thanks a lot.
I presume the root cause for the bug is an improper (too early) nref decrement in Onode::put me...
- 11:10 AM Bug #48966 (Pending Backport): FAILED ceph_assert(o->pinned) in BlueStore::Collection::split_cach...
- 10:33 AM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
- Igor Fedotov wrote:
> @Christian - thanks for the update. Could you please keep monitoring these counters on a per-d...
- 10:20 AM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
- @Christian - thanks for the update. Could you please keep monitoring these counters on a per-day basis for a while?
...
- 09:57 AM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
- Igor Fedotov wrote:
> Hi @Christian,
> sorry for the long analysis. But again nothing very interesting...
Too ba...
02/01/2021
- 07:37 PM Backport #46194 (Resolved): nautilus: BlueFS replay log grows without end
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37948
m...
- 05:26 PM Backport #46194: nautilus: BlueFS replay log grows without end
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37948
merged
- 12:03 PM Bug #48781: crash in BlueStore::Onode::put()
- Extra logs
- 11:59 AM Bug #48781: crash in BlueStore::Onode::put()
- Here you go (output from cephadm logs)
This crash is the first one now after 1 week.
- 11:26 AM Bug #48781: crash in BlueStore::Onode::put()
- Tom Myny wrote:
> Here is a dump of our latest crash
@Tom, may I have additional 10000 lines of the log preceding...
- 10:41 AM Bug #48781: crash in BlueStore::Onode::put()
- Here is a dump of our latest crash
01/29/2021
- 07:41 AM Bug #46780 (Triaged): BlueFS Spillover without db being full
- Seena, this is fixed in 14.2.11, and the default in 14.2.12
01/28/2021
- 03:27 PM Bug #48256 (Can't reproduce): Many4KWritesNoCSumTest fails on nautilus [ FAILED ] ObjectStore/S...
- 03:24 PM Bug #48218 (Can't reproduce): ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompressionAlgor...
- 02:43 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
- Christian Rohmann wrote:
> I was able to dump all of the output of the osds from journald now, properly timestamped ... - 01:11 PM Bug #48781 (Need More Info): crash in BlueStore::Onode::put()
- 01:10 PM Backport #49039 (Resolved): octopus: Cannot allocate memory appears when using io_uring osd
- https://github.com/ceph/ceph/pull/39899
- 01:10 PM Backport #49038 (Resolved): pacific: Cannot allocate memory appears when using io_uring osd
- https://github.com/ceph/ceph/pull/39898
- 01:09 PM Bug #47661 (Pending Backport): Cannot allocate memory appears when using io_uring osd
- 12:52 PM Bug #48776 (Resolved): ObjectStore/StoreTest hangs
- 12:51 PM Backport #48950 (Resolved): pacific: ObjectStore/StoreTest hangs
- https://github.com/ceph/ceph/pull/38989
01/27/2021
- 11:51 PM Bug #48776: ObjectStore/StoreTest hangs
- Neha Ojha wrote:
> pacific backport - https://github.com/ceph/ceph/pull/38989
merged - 08:11 PM Bug #20870 (Resolved): OSD compression: incorrect display of the used disk space
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 08:09 PM Bug #46411 (Rejected): mimic: Disks associated to osds have small write io even on an idle ceph c...
- mimic EOL
- 08:08 PM Bug #38150 (Resolved): KernelDevice exclusive lock broken
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 08:06 PM Feature #40704 (Resolved): BlueStore tool to check fragmentation
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 08:06 PM Bug #41188 (Resolved): incorrect RW_IO_MAX
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 08:05 PM Bug #41901 (Resolved): bluestore: unused calculation is broken
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 08:05 PM Bug #42091 (Resolved): bluefs: sync_metadata leaks dirty files if log_t is empty
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 08:01 PM Bug #45788 (Resolved): ObjectStore/StoreTestSpecificAUSize.ExcessiveFragmentation/2 failed
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 08:00 PM Bug #46552 (Resolved): Rescue procedure for extremely large bluefs log
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 08:00 PM Bug #47475 (Resolved): Compressed blobs lack checksums
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 07:58 PM Backport #43086 (Rejected): mimic: bluefs: sync_metadata leaks dirty files if log_t is empty
- mimic EOL
- 07:57 PM Backport #47895 (Rejected): mimic: Compressed blobs lack checksums
- 07:57 PM Backport #46192 (Rejected): mimic: BlueFS replay log grows without end
- 07:57 PM Backport #46713 (Rejected): mimic: Rescue procedure for extremely large bluefs log
- 07:57 PM Backport #46010 (Rejected): mimic: ObjectStore/StoreTestSpecificAUSize.ExcessiveFragmentation/2 f...
- 07:57 PM Backport #45062 (Rejected): mimic: bluestore: unused calculation is broken
- 07:57 PM Backport #41280 (Rejected): mimic: BlueStore tool to check fragmentation
- 07:57 PM Backport #41461 (Rejected): mimic: incorrect RW_IO_MAX
- 07:57 PM Backport #38161 (Rejected): mimic: KernelDevice exclusive lock broken
- 07:57 PM Backport #36641 (Rejected): mimic: Unable to recover from ENOSPC in BlueFS
- 07:57 PM Backport #37564 (Rejected): mimic: OSD compression: incorrect display of the used disk space
- 07:27 PM Backport #47893 (Rejected): luminous: Compressed blobs lack checksums
- luminous EOL
- 07:26 PM Backport #36640 (Rejected): luminous: Unable to recover from ENOSPC in BlueFS
- luminous EOL
- 07:26 PM Backport #38160 (Rejected): luminous: KernelDevice exclusive lock broken
- luminous EOL
- 07:25 PM Backport #41462 (Rejected): luminous: incorrect RW_IO_MAX
- luminous EOL
- 06:25 PM Bug #47551 (Resolved): Some structs aren't bound to mempools properly
- 06:25 PM Backport #47670 (Rejected): mimic: Some structs aren't bound to mempools properly
- mimic EOL