Activity

From 01/04/2021 to 02/02/2021

02/02/2021

09:40 PM Bug #49110: BlueFS.cc: 1542: FAILED assert(r == 0)
Could you please set debug-bluefs to 20, start OSD again and share the relevant OSD log... Or at least last 20000 lin... Igor Fedotov
06:47 PM Bug #49110 (Won't Fix): BlueFS.cc: 1542: FAILED assert(r == 0)
All the SSD based OSDs in my ceph cluster crashed.
The initial error was:...
Andreas Buschmann
04:12 PM Backport #48281: octopus: osd: fix bluestore bitmap allocator
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/38430
merged
Yuri Weinstein
03:24 PM Backport #49098 (In Progress): octopus: FAILED ceph_assert(o->pinned) in BlueStore::Collection::s...
Igor Fedotov
11:11 AM Backport #49098 (Resolved): octopus: FAILED ceph_assert(o->pinned) in BlueStore::Collection::spli...
https://github.com/ceph/ceph/pull/39230 Igor Fedotov
03:24 PM Backport #49097: pacific: FAILED ceph_assert(o->pinned) in BlueStore::Collection::split_cache(Blu...
https://github.com/ceph/ceph/pull/39228 Igor Fedotov
03:22 PM Backport #49097 (In Progress): pacific: FAILED ceph_assert(o->pinned) in BlueStore::Collection::s...
Igor Fedotov
11:11 AM Backport #49097 (Resolved): pacific: FAILED ceph_assert(o->pinned) in BlueStore::Collection::spli...
https://github.com/ceph/ceph/pull/39228 Igor Fedotov
03:23 PM Backport #49100 (In Progress): pacific: crash in BlueStore::Onode::put()
https://github.com/ceph/ceph/pull/39228 Igor Fedotov
11:12 AM Backport #49100 (Resolved): pacific: crash in BlueStore::Onode::put()
https://github.com/ceph/ceph/pull/39228 Igor Fedotov
03:22 PM Backport #49099 (In Progress): octopus: crash in BlueStore::Onode::put()
https://github.com/ceph/ceph/pull/39230 Igor Fedotov
11:12 AM Backport #49099 (Resolved): octopus: crash in BlueStore::Onode::put()
https://github.com/ceph/ceph/pull/39230 Igor Fedotov
11:12 AM Bug #48781 (Pending Backport): crash in BlueStore::Onode::put()
Igor Fedotov
09:15 AM Bug #48781 (Fix Under Review): crash in BlueStore::Onode::put()
@Tom - thanks a lot.
I presume the root cause for the bug is an improper (too early) nref decrement in Onode::put me...
Igor Fedotov
11:10 AM Bug #48966 (Pending Backport): FAILED ceph_assert(o->pinned) in BlueStore::Collection::split_cach...
Igor Fedotov
10:33 AM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
Igor Fedotov wrote:
> @Christian - thanks for the update. Could you please keep monitoring these counters on a per-d...
Christian Rohmann
10:20 AM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
@Christian - thanks for the update. Could you please keep monitoring these counters on a per-day basis for a while?
...
Igor Fedotov
09:57 AM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
Igor Fedotov wrote:
> Hi @Christian,
> sorry for the long analysis. But again nothing very interesting...
Too ba...
Christian Rohmann

02/01/2021

07:37 PM Backport #46194 (Resolved): nautilus: BlueFS replay log grows without end
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37948
m...
Nathan Cutler
05:26 PM Backport #46194: nautilus: BlueFS replay log grows without end
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37948
merged
Yuri Weinstein
12:03 PM Bug #48781: crash in BlueStore::Onode::put()
Extra logs Tom Myny
11:59 AM Bug #48781: crash in BlueStore::Onode::put()
Here you go (output from cephadm logs)
This crash is the first one now after 1 week.
Tom Myny
11:26 AM Bug #48781: crash in BlueStore::Onode::put()
Tom Myny wrote:
> Here is a dump of our latest crash
@Tom, may I have additional 10000 lines of the log preceding...
Igor Fedotov
10:41 AM Bug #48781: crash in BlueStore::Onode::put()
Here is a dump of our latest crash Tom Myny

01/29/2021

07:41 AM Bug #46780 (Triaged): BlueFS Spillover without db being full
Seena, this was fixed in 14.2.11 and is the default in 14.2.12 Konstantin Shalygin

01/28/2021

03:27 PM Bug #48256 (Can't reproduce): Many4KWritesNoCSumTest fails on nautilus [ FAILED ] ObjectStore/S...
Neha Ojha
03:24 PM Bug #48218 (Can't reproduce): ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompressionAlgor...
Neha Ojha
02:43 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
Christian Rohmann wrote:
> I was able to dump all of the output of the osds from journald now, properly timestamped ...
Igor Fedotov
01:11 PM Bug #48781 (Need More Info): crash in BlueStore::Onode::put()
Igor Fedotov
01:10 PM Backport #49039 (Resolved): octopus: Cannot allocate memory appears when using io_uring osd
https://github.com/ceph/ceph/pull/39899 Backport Bot
01:10 PM Backport #49038 (Resolved): pacific: Cannot allocate memory appears when using io_uring osd
https://github.com/ceph/ceph/pull/39898 Backport Bot
01:09 PM Bug #47661 (Pending Backport): Cannot allocate memory appears when using io_uring osd
Igor Fedotov
12:52 PM Bug #48776 (Resolved): ObjectStore/StoreTest hangs
Igor Fedotov
12:51 PM Backport #48950 (Resolved): pacific: ObjectStore/StoreTest hangs
https://github.com/ceph/ceph/pull/38989 Igor Fedotov

01/27/2021

11:51 PM Bug #48776: ObjectStore/StoreTest hangs
Neha Ojha wrote:
> pacific backport - https://github.com/ceph/ceph/pull/38989
merged
Yuri Weinstein
08:11 PM Bug #20870 (Resolved): OSD compression: incorrect display of the used disk space
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
08:09 PM Bug #46411 (Rejected): mimic: Disks associated to osds have small write io even on an idle ceph c...
mimic EOL Nathan Cutler
08:08 PM Bug #38150 (Resolved): KernelDevice exclusive lock broken
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
08:06 PM Feature #40704 (Resolved): BlueStore tool to check fragmentation
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
08:06 PM Bug #41188 (Resolved): incorrect RW_IO_MAX
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
08:05 PM Bug #41901 (Resolved): bluestore: unused calculation is broken
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
08:05 PM Bug #42091 (Resolved): bluefs: sync_metadata leaks dirty files if log_t is empty
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
08:01 PM Bug #45788 (Resolved): ObjectStore/StoreTestSpecificAUSize.ExcessiveFragmentation/2 failed
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
08:00 PM Bug #46552 (Resolved): Rescue procedure for extremely large bluefs log
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
08:00 PM Bug #47475 (Resolved): Compressed blobs lack checksums
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
07:58 PM Backport #43086 (Rejected): mimic: bluefs: sync_metadata leaks dirty files if log_t is empty
mimic EOL Nathan Cutler
07:57 PM Backport #47895 (Rejected): mimic: Compressed blobs lack checksums
Nathan Cutler
07:57 PM Backport #46192 (Rejected): mimic: BlueFS replay log grows without end
Nathan Cutler
07:57 PM Backport #46713 (Rejected): mimic: Rescue procedure for extremely large bluefs log
Nathan Cutler
07:57 PM Backport #46010 (Rejected): mimic: ObjectStore/StoreTestSpecificAUSize.ExcessiveFragmentation/2 f...
Nathan Cutler
07:57 PM Backport #45062 (Rejected): mimic: bluestore: unused calculation is broken
Nathan Cutler
07:57 PM Backport #41280 (Rejected): mimic: BlueStore tool to check fragmentation
Nathan Cutler
07:57 PM Backport #41461 (Rejected): mimic: incorrect RW_IO_MAX
Nathan Cutler
07:57 PM Backport #38161 (Rejected): mimic: KernelDevice exclusive lock broken
Nathan Cutler
07:57 PM Backport #36641 (Rejected): mimic: Unable to recover from ENOSPC in BlueFS
Nathan Cutler
07:57 PM Backport #37564 (Rejected): mimic: OSD compression: incorrect display of the used disk space
Nathan Cutler
07:27 PM Backport #47893 (Rejected): luminous: Compressed blobs lack checksums
luminous EOL Nathan Cutler
07:26 PM Backport #36640 (Rejected): luminous: Unable to recover from ENOSPC in BlueFS
luminous EOL Nathan Cutler
07:26 PM Backport #38160 (Rejected): luminous: KernelDevice exclusive lock broken
luminous EOL Nathan Cutler
07:25 PM Backport #41462 (Rejected): luminous: incorrect RW_IO_MAX
luminous EOL Nathan Cutler
06:25 PM Bug #47551 (Resolved): Some structs aren't bound to mempools properly
Nathan Cutler
06:25 PM Backport #47670 (Rejected): mimic: Some structs aren't bound to mempools properly
mimic EOL Nathan Cutler

01/26/2021

10:24 AM Bug #47661: Cannot allocate memory appears when using io_uring osd
Jiang Yu wrote:
> The kernel panic problem can be solved by upgrading to 5.4.0-49.
> But ceph osd will crash abnorm...
Yanhu Cao

01/23/2021

05:48 PM Bug #48966 (Fix Under Review): FAILED ceph_assert(o->pinned) in BlueStore::Collection::split_cach...
Igor Fedotov

01/22/2021

08:24 PM Bug #48966 (Resolved): FAILED ceph_assert(o->pinned) in BlueStore::Collection::split_cache(BlueSt...
... Neha Ojha

01/21/2021

10:23 PM Bug #48036: bluefs corrupted in a OSD
I've tried to reproduce this problem of multiple ceph-osds sharing the same device (with cephadm) and the second ceph... Sage Weil
11:20 AM Backport #48950 (Resolved): pacific: ObjectStore/StoreTest hangs
https://github.com/ceph/ceph/pull/38989 Nathan Cutler

01/20/2021

05:16 PM Bug #48776: ObjectStore/StoreTest hangs
pacific backport - https://github.com/ceph/ceph/pull/38989 Neha Ojha
01:56 AM Bug #48776 (Pending Backport): ObjectStore/StoreTest hangs
Neha Ojha
08:12 AM Bug #48036 (Closed): bluefs corrupted in a OSD
Igor Fedotov
01:32 AM Bug #48036: bluefs corrupted in a OSD
This issue is fixed in Rook.
https://github.com/rook/rook/pull/6793
Satoru Takeuchi

01/19/2021

06:35 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
I was able to dump all of the output of the osds from journald now, properly timestamped - see the attached file jour... Christian Rohmann
02:01 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
This doesn't have the preceding rocksdb log output, hence it makes little sense.
Sorry it looks like a dead end from ro...
Igor Fedotov
10:16 AM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
I attached stack traces in file crash_stacktraces.log.
Unfortunately since this was shipped from syslog -> elastic...
Christian Rohmann
02:09 PM Bug #48827: Ceph Bluestore OSDs fail to start on WAL corruption
William Law wrote:
> Hi Igor -
>
> Thank you for your help and two options; we will discuss.
>
> Given this, i...
Igor Fedotov

01/18/2021

10:27 PM Backport #48478: octopus: bluefs _allocate failed to allocate bdev 1 and 2,cause ceph_assert(r == 0)
Fabio Martins wrote:
> Has this been merged into Octopus?
Not yet; this will happen after the merge to master
Igor Fedotov
09:38 PM Backport #48478: octopus: bluefs _allocate failed to allocate bdev 1 and 2,cause ceph_assert(r == 0)
Has this been merged into Octopus? Fabio Martins
04:30 PM Bug #48827: Ceph Bluestore OSDs fail to start on WAL corruption
Hi Igor -
Thank you for your help and two options; we will discuss.
Given this, is there any way we could (rela...
William Law
03:35 PM Bug #48827: Ceph Bluestore OSDs fail to start on WAL corruption
William Law wrote:
> Hi Igor – thanks for the clarification. We actually did that but for whatever reason, we upload...
Igor Fedotov
02:45 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
Actual OSDs logs prior to the assertion are what I'd really like to see. The attached crash info isn't enough - it's ... Igor Fedotov
02:29 PM Backport #48479: nautilus: bluefs _allocate failed to allocate bdev 1 and 2,cause ceph_assert(r =...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/38475
m...
Nathan Cutler

01/17/2021

03:35 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
I did a... Christian Rohmann

01/16/2021

05:06 PM Bug #48776 (Fix Under Review): ObjectStore/StoreTest hangs
Igor Fedotov
01:35 AM Bug #48776: ObjectStore/StoreTest hangs
IMO this PR is the culprit: https://github.com/ceph/ceph/pull/30027
Will try to fix after weekend.
Igor Fedotov
12:22 AM Bug #48776: ObjectStore/StoreTest hangs
Here's a test that's hung, logs in ubuntu@smithi013.... Neha Ojha
05:01 PM Bug #48876 (Duplicate): osd crash in bluestore code
Despite different symptoms, the root cause is pretty much the same - osr locking regression caused by https://github.com/ce... Igor Fedotov
12:23 AM Bug #48819: fsck error: found stray (per-pg) omap data on omap_head
https://github.com/ceph/ceph/pull/38929 this will let us see this in teuthology logs at least Josh Durgin
12:08 AM Bug #48819: fsck error: found stray (per-pg) omap data on omap_head
Igor, you can log on to ubuntu@smithi092 right now and check out osd.4.
The test isn't doing much special, it's th...
Josh Durgin
12:05 AM Bug #48819: fsck error: found stray (per-pg) omap data on omap_head
This looks related to my recent PR introducing per-pg omap naming scheme: https://github.com/ceph/ceph/pull/38651
...
Igor Fedotov

01/15/2021

11:58 PM Bug #48819: fsck error: found stray (per-pg) omap data on omap_head
Cause appears to be a bluestore issue - manually running the objectstore command with --log-to-stderr we can see it's... Josh Durgin
10:47 PM Bug #48819: fsck error: found stray (per-pg) omap data on omap_head
This is happening due to an exception running ceph-objectstore-tool:... Josh Durgin
11:40 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
Christian Rohmann wrote:
> The fsck using
>
> [...]
>
> just returned
>
> [...]
>
>
> So this is of n...
Igor Fedotov
10:00 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
The fsck using... Christian Rohmann
03:00 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
Clarification: All OSDs that crashed came back up; I manually (!) took a single one offline to do fsck. Until then th... Christian Rohmann
02:59 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)

Igor Fedotov wrote:
> So at this point I need some clarification on what you'd like to do first:
> 1) Recover OSD...
Christian Rohmann
02:55 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
Igor Fedotov wrote:
> Christian Rohmann wrote:
> > We offlined an OSD and ran an _fsck --deep_, which resulted in
...
Christian Rohmann
02:41 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
So at this point I need some clarification on what you'd like to do first:
1) Recover OSDs. It's still unclear wheth...
Igor Fedotov
02:28 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
Christian Rohmann wrote:
> Could there be an issue with a recent RocksDB update as observed by the Gentoo folks: htt...
Igor Fedotov
02:20 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
Christian Rohmann wrote:
> We offlined an OSD and ran an _fsck --deep_, which resulted in
>
> [...]
>
> error...
Igor Fedotov
05:55 PM Bug #48827: Ceph Bluestore OSDs fail to start on WAL corruption
Hi Igor – thanks for the clarification. We actually did that but for whatever reason, we uploaded the wrong file, sor... William Law
09:31 AM Bug #48827: Ceph Bluestore OSDs fail to start on WAL corruption
Hi William,
this should be 0xedf022~21 indeed.
But extract-337-modified doesn't look correct in any case.
There ...
Igor Fedotov
02:54 PM Bug #48389 (Rejected): _do_read bdev-read failed
Igor Fedotov
02:53 PM Bug #48781: crash in BlueStore::Onode::put()
Could you please share another 10000 lines of log preceding the ones from crash.zip?
Igor Fedotov
01:22 PM Bug #48876: osd crash in bluestore code
Unfortunately, I don't have the rest of the log after all. I'm OOTO for a few days, but should be back on Monday. I'l... Jeff Layton
01:20 PM Bug #48876: osd crash in bluestore code
@Jeff - would you please share yet another 10000 lines of log prior to the one you've already attached. Igor Fedotov

01/14/2021

11:11 PM Bug #48827: Ceph Bluestore OSDs fail to start on WAL corruption
Hi Igor -
Thanks again for your kind and patient assistance. I think we got mixed up a little and hope you can he...
William Law
03:12 PM Bug #48827: Ceph Bluestore OSDs fail to start on WAL corruption
William Law wrote:
> OK great! How do we do that? And we have at least 4 other OSDs this happened to; should it be ...
Igor Fedotov
02:13 PM Bug #48827: Ceph Bluestore OSDs fail to start on WAL corruption
OK great! How do we do that? And we have at least 4 other OSDs this happened to; should it be the same or how do we ... William Law
01:00 PM Bug #48827: Ceph Bluestore OSDs fail to start on WAL corruption
So intermediate summary on the issue.
Bluefs log contains a transaction (seq = 3790994) with improperly ordered oper...
Igor Fedotov
05:48 PM Backport #48479 (Resolved): nautilus: bluefs _allocate failed to allocate bdev 1 and 2,cause ceph...
Igor Fedotov
04:15 PM Backport #48479: nautilus: bluefs _allocate failed to allocate bdev 1 and 2,cause ceph_assert(r =...
Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/38475
merged
Yuri Weinstein
04:48 PM Bug #46800 (Duplicate): Octopus OSD died and fails to start with FAILED ceph_assert(is_valid_io(o...
Igor Fedotov
04:45 PM Bug #48276 (Duplicate): OSD Crash with ceph_assert(is_valid_io(off, len))
Igor Fedotov
04:07 PM Bug #47751 (Resolved): Hybrid allocator might segfault when fallback allocator is present
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
03:43 PM Bug #48876 (Duplicate): osd crash in bluestore code
OSD crash seen when doing some cephfs testing with some experimental MDS and client patches. Build was based on top o... Jeff Layton
07:37 AM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
I ran the fsck with bluestore debug level 20 again, in case you might need more details. The whole file is about 18 G... Christian Rohmann

01/13/2021

05:08 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
Could there be an issue with a recent RocksDB update as observed by the Gentoo folks: https://bugs.gentoo.org/764221 ... Christian Rohmann
02:48 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
Igor Fedotov wrote:
> Wondering if you had experienced any recent OSD crashes prior to this failure?
Not recently...
Christian Rohmann

01/12/2021

05:06 PM Bug #48729 (Triaged): Bluestore memory leak on srub operations
It looks like high RAM usage is caused by improper onode cache trimming inside BlueStore. Which in turn might be caus... Igor Fedotov
10:55 AM Bug #48729: Bluestore memory leak on srub operations
@Igor
here you are:
https://cf2.cloudferro.com:8080/swift/v1/AUTH_5b9ea421deb745bfb4dab930cebe153f/ceph-sharings/...
Rafal Wadolowski
02:02 PM Bug #48827: Ceph Bluestore OSDs fail to start on WAL corruption
Thank you thank you. They are attached.
Best,
Will
William Law
11:43 AM Bug #48827: Ceph Bluestore OSDs fail to start on WAL corruption
@Will - to make the block.db extract, just use:
dd if=block.db ibs=1 skip=15589376 count=32768 of=dump.out
Igor Fedotov
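The `skip` and `count` values in the command above match the hex offset 0xede000 and length 0x8000 requested in the 01/11/2021 entry, converted to decimal bytes. A minimal sketch of that conversion follows; the helper name `dd_extract_args` is hypothetical and only the `dd` options already shown in the comment are used.

```python
def dd_extract_args(offset_hex: str, length_hex: str) -> str:
    """Build the dd command line for a byte-range extract of block.db.

    Hypothetical helper for illustration: it only converts the hex
    offset and length to the decimal values dd expects with ibs=1.
    """
    offset = int(offset_hex, 16)  # bytes to skip from the start of the file
    length = int(length_hex, 16)  # number of 1-byte input blocks to copy
    return f"dd if=block.db ibs=1 skip={offset} count={length} of=dump.out"


print(dd_extract_args("0xede000", "0x8000"))
```

With `ibs=1`, both `skip` and `count` are counted in single bytes, which is why the hex values can be passed through directly after conversion.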
01:19 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
Wondering if you had experienced any recent OSD crashes prior to this failure?
You might also want to check for HW...
Igor Fedotov
12:43 PM Bug #48849: BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
BTW, I looked through other reported issues and found https://tracker.ceph.com/issues/48002 or https://tracker.ceph.c... Christian Rohmann
12:41 PM Bug #48849 (Need More Info): BlueStore.cc: 11380: FAILED ceph_assert(r == 0)
We experienced a few OSD crashes all with the same signature in the logs:
--- cut ---
2021-01-08 06:13:54.946 7f3...
Christian Rohmann
11:43 AM Backport #48194 (Resolved): octopus: bufferlist c_str() sometimes clears assignment to mempool
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/38429
m...
Nathan Cutler
11:43 AM Backport #48094 (Resolved): octopus: Hybrid allocator might segfault when fallback allocator is p...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/38428
m...
Nathan Cutler
11:41 AM Backport #48093: nautilus: Hybrid allocator might segfault when fallback allocator is present
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/38637
m...
Nathan Cutler
11:40 AM Backport #47672: nautilus: Hybrid allocator might cause duplicate admin socket command registrati...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37793
m...
Nathan Cutler
10:23 AM Bug #42928: ceph-bluestore-tool bluefs-bdev-new-db does not update lv tags
to answer my question - head -n 2 /dev/vg/lv will give the block device uuid Glen Baars
09:44 AM Bug #42928: ceph-bluestore-tool bluefs-bdev-new-db does not update lv tags
Any way to determine the correct DB->Block arrangement after they are lost? I have a host that has hit this bug and a... Glen Baars
01:19 AM Bug #48776: ObjectStore/StoreTest hangs
... Neha Ojha

01/11/2021

09:19 PM Bug #48827: Ceph Bluestore OSDs fail to start on WAL corruption
Hi Igor -
I feel like I did something wrong as hexdump returned nothing... My apologies, we haven't slept much
@ro...
William Law
08:33 PM Bug #48827: Ceph Bluestore OSDs fail to start on WAL corruption
@Will, would you please share the hex dump of the block.db file starting at offset 0xede000 with length 0x8000.
Latest startup...
Igor Fedotov
05:00 PM Bug #48827: Ceph Bluestore OSDs fail to start on WAL corruption
Igor, thank you! It's attached.
Will
William Law
04:24 PM Bug #48827: Ceph Bluestore OSDs fail to start on WAL corruption
@William - would you please share OSD startup log with debug-bluefs set to 20? Igor Fedotov
04:03 PM Bug #48827 (Duplicate): Ceph Bluestore OSDs fail to start on WAL corruption
Hi -
I posted a note to the Ceph user list also, but we've run into this bug and it unfortunately hit 5 OSDs at th...
William Law
07:59 PM Bug #48729: Bluestore memory leak on srub operations
Presuming mem utilization is still that high, could you please temporarily set debug_bluestore to 20 for the osd in ques... Igor Fedotov
10:25 AM Bug #48729: Bluestore memory leak on srub operations
Unfortunately, that's not the case. After 4 days, some of the OSDs took >10 GB of RAM.
For example:...
Rafal Wadolowski
07:55 PM Backport #48194: octopus: bufferlist c_str() sometimes clears assignment to mempool
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/38429
merged
Yuri Weinstein
07:55 PM Backport #48094: octopus: Hybrid allocator might segfault when fallback allocator is present
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/38428
merged
Yuri Weinstein
04:50 PM Bug #47443 (Resolved): Hybrid allocator might cause duplicate admin socket command registration.
Igor Fedotov
04:49 PM Backport #47672 (Resolved): nautilus: Hybrid allocator might cause duplicate admin socket command...
Igor Fedotov
04:43 PM Backport #47672: nautilus: Hybrid allocator might cause duplicate admin socket command registrati...
Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/37793
merged
Yuri Weinstein
04:48 PM Backport #48093 (Resolved): nautilus: Hybrid allocator might segfault when fallback allocator is ...
Igor Fedotov
04:44 PM Backport #48093: nautilus: Hybrid allocator might segfault when fallback allocator is present
Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/38637
merged
Yuri Weinstein
04:13 AM Bug #48819 (New): fsck error: found stray (per-pg) omap data on omap_head
/a/kchai-2021-01-10_13:20:22-rados-master-distro-basic-smithi/ Kefu Chai

01/08/2021

01:08 PM Bug #48781: crash in BlueStore::Onode::put()
and on the last host:
Jan 7 07:34:17 ceph2 kernel: [107054.315343] tp_osd_tp[20519]: segfault at 0 ip 00007efd3db...
Tom Myny
01:04 PM Bug #48781: crash in BlueStore::Onode::put()
On another system we see the following too:
Jan 7 10:02:32 ceph1 kernel: [114774.759038] tp_osd_tp[17449]: segfaul...
Tom Myny
01:02 PM Bug #48781: crash in BlueStore::Onode::put()
We also see the following in our OS logs:
[119268.259883] tp_osd_tp[32332]: segfault at 0 ip 00007f8ccce40733 sp 0...
Tom Myny

01/07/2021

09:34 PM Bug #48776: ObjectStore/StoreTest hangs
... Neha Ojha
12:38 AM Bug #48776: ObjectStore/StoreTest hangs
/a/teuthology-2021-01-05_07:01:02-rados-master-distro-basic-smithi/5755704 Neha Ojha
12:38 AM Bug #48776 (Resolved): ObjectStore/StoreTest hangs
... Neha Ojha
02:45 PM Bug #48781: crash in BlueStore::Onode::put()
Download file in attachment with extra logs Tom Myny
02:21 PM Bug #48781: crash in BlueStore::Onode::put()
Here is some extra information regarding this problem:
{
"backtrace": [
"(()+0x12b20) [0x7f0afc7a8b2...
Tom Myny
09:26 AM Bug #48781 (Resolved): crash in BlueStore::Onode::put()
Following the earlier issue reported in #48778, I now see frequent OSD crashes. I'm not sure both are related.
<pr...
Gerry D

01/06/2021

11:14 PM Bug #45765: BlueStore::_collection_list causes huge latency growth pg deletion
No problem, and thanks for confirming! Joshua Baergen
11:12 PM Bug #45765: BlueStore::_collection_list causes huge latency growth pg deletion
Joshua Baergen wrote:
> Interesting, thanks. Is that 14.2.17 change this one: https://tracker.ceph.com/issues/47044 ...
Dan van der Ster
11:10 PM Bug #45765: BlueStore::_collection_list causes huge latency growth pg deletion
Interesting, thanks. Is that 14.2.17 change this one: https://tracker.ceph.com/issues/47044 ?
FWIW, what I'm seein...
Joshua Baergen
11:07 PM Bug #45765: BlueStore::_collection_list causes huge latency growth pg deletion
Joshua Baergen wrote:
> Hey Dan/Eric, did either of you see a big increase in the number of writes hitting your disk...
Dan van der Ster
11:00 PM Bug #45765: BlueStore::_collection_list causes huge latency growth pg deletion
Hey Dan/Eric, did either of you see a big increase in the number of writes hitting your disks when buffered mode was ... Joshua Baergen

01/05/2021

03:06 PM Bug #47751: Hybrid allocator might segfault when fallback allocator is present
Fixing this error:... Nathan Cutler
03:02 PM Bug #46124 (Resolved): Potential race condition regression around new OSD flock()s
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
10:41 AM Bug #46490: osds crashing during deep-scrub
We seem to hit the same behaviour after upgrading our ceph cluster from 12.2.12 to 14.2.11.
Since then we have quite...
Maximilian Stinsky
02:48 AM Support #48747: which version support spdk perfect?
... hg liu
02:45 AM Support #48747: which version support spdk perfect?
On 13.2.13 I encounter this failure when ceph-osd reads 2048 LBAs of data... hg liu
02:42 AM Support #48747 (Closed): which version support spdk perfect?
I tried 12.2.12, 12.2.13 and 13.2.10; none of them runs stably because of
crashes on write or read. Which ver...
hg liu
 
