Activity

From 02/08/2018 to 03/09/2018

03/09/2018

08:12 PM Bug #22534 (Fix Under Review): Debian's bluestore *rocksdb* supports neither fast CRC nor...
Pull requests:
* https://github.com/ceph/rocksdb/pull/35,
* https://github.com/ceph/ceph/pull/20825.
Radoslaw Zarzynski

03/08/2018

10:07 PM Bug #22977: High CPU load caused by operations on onode_map
Awesome, that fixed it, thanks :)... Paul Emmerich
01:07 PM Bug #23246: [OSD bug] KernelDevice.cc: 539: FAILED assert(r == 0)
Ah, yeah I see that. Oops.
Corrected, and I set bdev_aio_max_queue_depth = 65536 and all OSDs are up!
Christoffer Lilja
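For reference, the workaround reported here goes into ceph.conf on the OSD host. This is a sketch of the setting only; 65536 is the value Christoffer reports as working on his cluster, not a general recommendation:

```ini
[osd]
# Raise the kernel AIO queue depth used by BlueStore's KernelDevice.
# 65536 is the value reported to work in this ticket; the shipped
# default is much lower, so tune to your workload.
bdev_aio_max_queue_depth = 65536
```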
12:44 PM Bug #23246: [OSD bug] KernelDevice.cc: 539: FAILED assert(r == 0)
Hmm, that's surprising. Setting extreme values should cause *EINVAL* during *io_setup* because they exceed a system limit... Radoslaw Zarzynski

03/07/2018

11:57 PM Bug #22977: High CPU load caused by operations on onode_map
Paul,
Just improved hashing. Please test.
https://shaman.ceph.com/builds/ceph/wip-22977-inspect-onode-map/
Adam Kupczyk
02:27 PM Bug #22977: High CPU load caused by operations on onode_map
... Paul Emmerich
02:23 PM Bug #22977: High CPU load caused by operations on onode_map
Yes, perf top now shows the new hash_helper struct as key in the table. Paul Emmerich
02:22 PM Bug #22977: High CPU load caused by operations on onode_map
Paul,
Have you been using the latest builds, IDs 93123 - 93126?
This is the only build that actually tries to fix has...
Adam Kupczyk
01:56 PM Bug #22977: High CPU load caused by operations on onode_map
CPU load is ~30% lower than my "control group" now, but it's still pretty bad (as in: > 90% of the CPU time is spent ... Paul Emmerich
08:53 PM Bug #23246: [OSD bug] KernelDevice.cc: 539: FAILED assert(r == 0)
Hi,
Thank you for looking into this.
I'd set bdev_aio_max_queue_depth higher and increased it until I reached 6871...
Christoffer Lilja
04:03 PM Bug #23246: [OSD bug] KernelDevice.cc: 539: FAILED assert(r == 0)
Yeah, saw that. However, your _IOContext_ is so large it exceeds even that. :-)
I would try bigger values just as a ...
Radoslaw Zarzynski
03:16 PM Bug #23246: [OSD bug] KernelDevice.cc: 539: FAILED assert(r == 0)
As I mentioned in the initial bug report :-)
"I've tested with "bdev_aio_max_queue_depth = 4096" as some gave as a wo...
Christoffer Lilja
03:16 PM Bug #23246: [OSD bug] KernelDevice.cc: 539: FAILED assert(r == 0)
Radoslaw Zarzynski
02:48 PM Bug #23246: [OSD bug] KernelDevice.cc: 539: FAILED assert(r == 0)
Thanks for taking the log once again! It looks like the *IOContext* carried a really huge number of operations:... Radoslaw Zarzynski
12:24 PM Bug #23246: [OSD bug] KernelDevice.cc: 539: FAILED assert(r == 0)
Here you have an OSD log with debug_bluestore=20 and debug_bdev=20:
https://drive.google.com/open?id=11oW6yAG0M6rdMSz...
Christoffer Lilja
12:17 PM Bug #23246: [OSD bug] KernelDevice.cc: 539: FAILED assert(r == 0)
It still occurs and I'll come back with the logs asap. Christoffer Lilja
12:09 PM Bug #23246: [OSD bug] KernelDevice.cc: 539: FAILED assert(r == 0)
Could be that the kernel was just reasonably rejecting the requests because of a HW issue.
I would need even more logs...
Radoslaw Zarzynski
06:59 AM Bug #23246: [OSD bug] KernelDevice.cc: 539: FAILED assert(r == 0)
Now I remember that the SATA controller that this drive was connected to had a glitch and took down the drives for a ... Christoffer Lilja
04:59 PM Bug #23266 (Won't Fix): "terminate called after throwing an instance of 'std::bad_alloc'" in upgr...
Out of memory?
Run: http://pulpito.ceph.com/teuthology-2018-03-07_03:25:02-upgrade:kraken-x-luminous-distro-basic...
Yuri Weinstein
02:39 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
We didn't run any deep scrubs the last day or so due to a longer backfill, will report back later. Paul Emmerich
01:45 PM Bug #23251 (Rejected): ceph daemon osd.NNN slow_used_bytes and slow_total_bytes wrong?
BlueStore has a BlueFS rebalance feature that dynamically reserves some amount of space for BlueFS at the 'slow' device -... Igor Fedotov
01:24 PM Bug #23251: ceph daemon osd.NNN slow_used_bytes and slow_total_bytes wrong?
Thanks for responding. I didn't realize that; I thought from looking at the code that it was used for data as well. You ca... Ben England
09:31 AM Bug #23251: ceph daemon osd.NNN slow_used_bytes and slow_total_bytes wrong?
"slow_total_bytes" and "slow_used_bytes" are under the "BlueFS" section and denote just a fraction of the BlueStore block de... Igor Fedotov

03/06/2018

08:41 PM Bug #22977: High CPU load caused by operations on onode_map
Daniel,
Version for 12.2.4:
https://shaman.ceph.com/builds/ceph/wip-22977-inspect-onode-map-12-2-4/
Adam Kupczyk
03:53 PM Bug #22977: High CPU load caused by operations on onode_map
Thanks! I've updated one host and will report back tomorrow.
This is what the output looked like after ~20 hours w...
Paul Emmerich
03:31 PM Bug #22977: High CPU load caused by operations on onode_map
Adam, can you get me a build for 12.2.4? My results are pretty immediate.
Daniel Pryor
09:47 AM Bug #22977: High CPU load caused by operations on onode_map
Hi Paul,
An attempt to rectify the hash problem is here:
https://shaman.ceph.com/builds/ceph/wip-22977-inspect-onode...
Adam Kupczyk
09:17 AM Bug #22977: High CPU load caused by operations on onode_map
Paul Emmerich wrote:
> Thank you!
>
> I'll run this on one OSD and report back tomorrow as it takes some time for...
Daniel Pryor
09:14 AM Bug #22977: High CPU load caused by operations on onode_map
Paul Emmerich wrote:
> Thank you!
>
> I'll run this on one OSD and report back tomorrow as it takes some time for...
Daniel Pryor
07:52 PM Bug #23251 (Rejected): ceph daemon osd.NNN slow_used_bytes and slow_total_bytes wrong?
version: ceph-osd-12.2.1-34.el7cp.x86_64 = RHCS 3.0z1
In trying to understand ceph daemon osd.NNN perf dump counte...
Ben England
07:50 PM Bug #23246: [OSD bug] KernelDevice.cc: 539: FAILED assert(r == 0)
Hi,
Here comes a logfile with debug info, it's pretty big so I share it through Google Drive:
https://drive.googl...
Christoffer Lilja
06:36 PM Bug #23246 (Need More Info): [OSD bug] KernelDevice.cc: 539: FAILED assert(r == 0)
Radoslaw Zarzynski
06:35 PM Bug #23246: [OSD bug] KernelDevice.cc: 539: FAILED assert(r == 0)
Yeah, looks like _io_submit()_ was constantly returning *EAGAIN* and the number of retries (16) was exhausted. L... Radoslaw Zarzynski
03:08 PM Bug #23246 (Resolved): [OSD bug] KernelDevice.cc: 539: FAILED assert(r == 0)
Hi,
Got a bug at one of my OSD's, see a snippet below. Full logfile also attached.
I've tested with "bdev_aio_max_...
Christoffer Lilja
01:50 PM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
Packages with the paranoid checker "are available":https://shaman.ceph.com/builds/ceph/wip-bug22102-paranoid-checker-... Radoslaw Zarzynski
11:59 AM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
@Rory
I can't find a call to _RocksDBStore::get()_ in the attached trace. The process died also because of _S...
Radoslaw Zarzynski
09:30 AM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Hi Paul,
Currently, I am continuing research on "crc 0x6706be76" issue. Its relations to deep-scrub errors will b...
Adam Kupczyk

03/05/2018

09:47 PM Bug #22534 (In Progress): Debian's bluestore *rocksdb* supports neither fast CRC nor comp...
The build args are all coming from ceph.spec.in or debian/rules, and should match up with the builds you see in shama... Sage Weil
08:15 PM Bug #22534: Debian's bluestore *rocksdb* supports neither fast CRC nor compression
It really looks like our official packages don't provide the FastCRC32 support in RocksDB. The report from verification is... Radoslaw Zarzynski
06:29 AM Bug #22534: Debian's bluestore *rocksdb* supports neither fast CRC nor compression
Yes, downloaded from download.ceph.com Марк Коренберг
06:37 PM Bug #22977: High CPU load caused by operations on onode_map
Thank you!
I'll run this on one OSD and report back tomorrow as it takes some time for the problem to appear.
B...
Paul Emmerich
09:35 AM Bug #22977: High CPU load caused by operations on onode_map
Hi Paul,
I made a development build, based on 12.2.2.
It will dump stats from onode_map, a container we suspect t...
Adam Kupczyk
05:03 PM Backport #23226 (Resolved): luminous: bluestore_cache_data uses too much memory
https://github.com/ceph/ceph/pull/21059 Nathan Cutler
03:51 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
I think I'm having a similar problem in my 3-node Ceph cluster.
It's installed on Proxmox nodes, each with 3x1TB HDD ...
Marco Baldini

03/04/2018

04:15 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Used the wrong OSDs for the previous post (these servers mostly had the "head candidate had a read error" scrub error... Paul Emmerich

03/03/2018

08:14 PM Bug #23206: ceph-osd daemon crashes - *** Caught signal (Aborted) **
This is a repeating/ongoing issue; please tell me how I can help investigate this. Anonymous
08:13 PM Bug #23206 (Rejected): ceph-osd daemon crashes - *** Caught signal (Aborted) **
Upfront: sorry for the title, I couldn't think of a better one.
One of our OSDs on a machine is constantly up/down due to crashin...
Anonymous
02:55 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
TL;DR: retrying the read works.
I've also been running one server with that patch for a few days and got a log for...
Paul Emmerich

03/02/2018

04:34 AM Backport #23173 (In Progress): luminous: BlueFS reports rotational journals if BDEV_WAL is not set
Nathan Cutler
02:45 AM Bug #22957: [bluestore]bstore_kv_final thread seems deadlock
Adam Kupczyk wrote:
> Hi Zhou,
> 1) Could you next time attach with gdb and "bt" of threads bstore_kv_final and fi...
zhou yang
02:28 AM Bug #22957: [bluestore]bstore_kv_final thread seems deadlock
Sage Weil wrote:
> I'm pretty sure this is #21470, fixed in 12.2.2. Please upgrade!
Thanks a lot, I will upgrade...
zhou yang
12:22 AM Bug #21259: bluestore: segv in BlueStore::TwoQCache::_trim
And another slightly different log:... Paul Emmerich
12:13 AM Bug #21259: bluestore: segv in BlueStore::TwoQCache::_trim
I'm also seeing this on a completely unrelated 12.2.4 cluster. Only thing they have in common is: they use erasure co... Paul Emmerich

03/01/2018

05:48 PM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
I seem to be getting something like this as well. It knocked out 8 OSDs in our cluster across multiple hosts. We were ... rory shcramm
03:42 PM Bug #22534: Debian's bluestore *rocksdb* supports neither fast CRC nor compression
Марк Коренберг wrote:
> I use official .deb packages, and 12.2.1 exactly. (Maybe I tested on 12.2.2; I'm not sure.)
...
Sage Weil

02/28/2018

07:37 PM Backport #23173: luminous: BlueFS reports rotational journals if BDEV_WAL is not set
https://github.com/ceph/ceph/pull/20651 Greg Farnum
11:20 AM Backport #23173 (Resolved): luminous: BlueFS reports rotational journals if BDEV_WAL is not set
https://github.com/ceph/ceph/pull/20651 Nathan Cutler
03:13 PM Bug #23141: BlueFS reports rotational journals if BDEV_WAL is not set
https://github.com/ceph/ceph/pull/20602 Greg Farnum
01:35 AM Bug #23141 (Pending Backport): BlueFS reports rotational journals if BDEV_WAL is not set
Kefu Chai
02:00 PM Bug #22616 (Pending Backport): bluestore_cache_data uses too much memory
Kefu Chai
08:56 AM Bug #23165: OSD used for Metadata / MDS storage constantly entering heartbeat timeout
After a night of waiting and some million files written and deleted, the situation has stabilized.
The space-usage...
Oliver Freyermuth

02/27/2018

10:33 PM Bug #23165: OSD used for Metadata / MDS storage constantly entering heartbeat timeout
Ah, and this is Luminous 12.2.4, just upgraded from 12.2.3 a few hours ago. Oliver Freyermuth
10:31 PM Bug #23165: OSD used for Metadata / MDS storage constantly entering heartbeat timeout
It also seems:... Oliver Freyermuth
10:29 PM Bug #23165: OSD used for Metadata / MDS storage constantly entering heartbeat timeout
To clarify, since this was maybe not clear from the ticket text:
Those 4 OSDs were only used for the metadata pool. ...
Oliver Freyermuth
10:28 PM Bug #23165 (Resolved): OSD used for Metadata / MDS storage constantly entering heartbeat timeout
After our stress test creating 100,000,000 small files on CephFS, and after finally deleting all those files, now 2 of ... Oliver Freyermuth
08:50 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
I recently upgraded one of my test nodes; now two of my three nodes have 16 GB RAM with 4 OSDs (4 TB HDD each), the 3rd... Martin Preuss
05:08 PM Bug #22977: High CPU load caused by operations on onode_map
Sure, thanks for looking into it :) Paul Emmerich

02/26/2018

11:22 PM Bug #23120: OSDs continuously crash during recovery
Actually, looks like your crashing ops are different from mine. I'll just open a new bug. Peter Woodman
11:21 PM Bug #23120: OSDs continuously crash during recovery
Yeah, I've got some of that. Problem is, I'm not seeing debug log messages that should be there based on the failure,... Peter Woodman
09:33 PM Bug #23120: OSDs continuously crash during recovery
The bad news (for the ticket) is that the problem vanished after restarting all crashing OSDs often enough,
and temp...
Oliver Freyermuth
09:03 PM Bug #23120 (Need More Info): OSDs continuously crash during recovery
Can you reproduce the crash on one or more OSDs with 'debug osd = 20' and 'debug bluestore = 20'?
Also, can you ch...
Sage Weil
08:56 AM Bug #23120: OSDs continuously crash during recovery
@Peter Woodman: Since the system recovered after many OSD restarts (see my previous comment) and I did not think abou... Oliver Freyermuth
02:47 AM Bug #23120: OSDs continuously crash during recovery
Hey, I might be seeing the same bug. Can you paste in the operation dump that shows up right before that crash, and m... Peter Woodman
09:41 PM Bug #23141 (Fix Under Review): BlueFS reports rotational journals if BDEV_WAL is not set
Greg Farnum
09:33 PM Bug #23141: BlueFS reports rotational journals if BDEV_WAL is not set
BlueFS::wal_is_rotational() returns true if (!bdev[BDEV_WAL]). Updating this to fall back through BDEV_DB and BDEV_SLOW. Greg Farnum
09:32 PM Bug #23141 (Resolved): BlueFS reports rotational journals if BDEV_WAL is not set
This came in from two different users on the mailing list (https://www.spinics.net/lists/ceph-users/msg42873.html, ht... Greg Farnum
09:00 PM Bug #22977 (Need More Info): High CPU load caused by operations on onode_map
Sage Weil
09:00 PM Bug #22977: High CPU load caused by operations on onode_map
Hi Paul,
I'm having trouble viewing the perf report files (crashes on my box; doesn't crash but shows nothing on M...
Sage Weil

02/25/2018

11:55 PM Bug #23120: OSDs continuously crash during recovery
All HDD-OSDs have 4 TB, while the SSDs used for the metadata pool have 240 GB. Oliver Freyermuth
11:51 PM Bug #23120: OSDs continuously crash during recovery
Here's a ceph osd tree due to popular request:... Oliver Freyermuth
07:27 PM Bug #23120: OSDs continuously crash during recovery
Cluster has mostly recovered, looks good.
Still, hopefully the stacktrace and logs can help to track down the under...
Oliver Freyermuth
05:26 PM Bug #23120: OSDs continuously crash during recovery
After many restarts of all OSDs, and temporarily lowering min_size, they now stay up. I'll watch and see if the clust... Oliver Freyermuth
05:09 PM Bug #23120: OSDs continuously crash during recovery
Here's another log of another OSD:
7de1dddf-27d4-4b6b-9128-0138bfaf85cf
backtrace looks similar.
Oliver Freyermuth
05:00 PM Bug #23120: OSDs continuously crash during recovery
It might be that this OSD was subject to OOM at some point in the last 24 hours.
It seems OSDs are using 2-3 times ...
Oliver Freyermuth
04:23 PM Bug #23120 (Can't reproduce): OSDs continuously crash during recovery
I have several OSDs continuously crashing during recovery. This is Luminous 12.2.3. ... Oliver Freyermuth

02/23/2018

05:16 AM Backport #23074 (In Progress): luminous: bluestore: statfs available can go negative
https://github.com/ceph/ceph/pull/20554 Prashant D

02/22/2018

11:15 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Hi Martin,
I am not sure yet what causes the problem with the 0x6706be76 crc.
To pinpoint, I added debug code to close i...
Adam Kupczyk

02/21/2018

09:25 PM Backport #23074 (Resolved): luminous: bluestore: statfs available can go negative
https://github.com/ceph/ceph/pull/20554 Nathan Cutler
04:34 PM Bug #23040 (Pending Backport): bluestore: statfs available can go negative
Sage Weil
11:09 AM Backport #23063 (Resolved): luminous: osd: BlueStore.cc: BlueStore::_balance_bluefs_freespace: as...
https://github.com/ceph/ceph/pull/21394 Nathan Cutler

02/20/2018

06:44 PM Bug #22534: Debian's bluestore *rocksdb* supports neither fast CRC nor compression
I use official .deb packages, and 12.2.1 exactly. (Maybe I tested on 12.2.2; I'm not sure.) Марк Коренберг
02:33 PM Bug #22534: Debian's bluestore *rocksdb* supports neither fast CRC nor compression
Yeah, I agree. Is there a build log from the debian build farm?
The packages we build upstream *do* appear to h...
Sage Weil
03:34 PM Bug #21781 (Can't reproduce): bluestore: ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixNoCsu...
Sage Weil
03:33 PM Feature #21741 (In Progress): os/bluestore: multi-tier support in BlueStore
Sage Weil
03:32 PM Bug #22044 (Can't reproduce): rocksdb log replay - corruption: missing start of fragmented record
Please let us know and we can reopen if this is still an issue with the latest code. Sage Weil
03:32 PM Bug #21550 (Can't reproduce): PG errors reappearing after OSD node rebooted on Luminous
Sage Weil
03:01 PM Bug #21550: PG errors reappearing after OSD node rebooted on Luminous
Hi Sage,
No, I have not seen the problem on this test cluster since rebuilding it with 12.2.2, and the system has ...
Eric Eastman
03:26 PM Bug #22510 (Pending Backport): osd: BlueStore.cc: BlueStore::_balance_bluefs_freespace: assert(0 ...
https://github.com/ceph/ceph/pull/18494 is the fix in master; should be backported to luminous Sage Weil
03:24 PM Bug #22796: bluestore gets to ENOSPC with small devices
The full checks rely on a (slow) feedback loop. For small devices, it's easy to go faster than the "set the full fla... Sage Weil
03:21 PM Bug #21809 (Can't reproduce): Raw Used space is 70x higher than actually used space (maybe orphan...
Sage Weil
03:12 PM Bug #22957: [bluestore]bstore_kv_final thread seems deadlock
Hi Zhou,
1) Could you next time attach with gdb and "bt" of threads bstore_kv_final and finisher.
2) Are you worki...
Adam Kupczyk
02:18 PM Bug #22957 (Duplicate): [bluestore]bstore_kv_final thread seems deadlock
I'm pretty sure this is #21470, fixed in 12.2.2. Please upgrade! Sage Weil
02:35 PM Bug #22616 (Fix Under Review): bluestore_cache_data uses too much memory
https://github.com/ceph/ceph/pull/20498 Sage Weil
02:29 PM Feature #20801 (Rejected): ability to rebuild BlueStore WAL journals is missing
The WAL or journal is an integral part of the store. The data store cannot be reconstructed without it. Sage Weil
02:28 PM Bug #21068 (Won't Fix): ceph-disk deploy bluestore fails to create correct block symlink for mult...
focusing on ceph-volume instead of ceph-disk for bluestore support. Sage Weil
02:26 PM Bug #22245 (Can't reproduce): [segfault] ceph-bluestore-tool bluefs-log-dump
Sage Weil
02:26 PM Bug #22609 (Can't reproduce): thrash-eio + bluestore fails with "reached maximum tries (3650) aft...
Haven't seen this failure in quite a while... I think it may be resolved! Reopen if it reappears. Sage Weil
02:24 PM Feature #21306 (Rejected): Reduce RBD filestore/bluestore fragmentation through fallocate
We already have a function like this for filestore: librbd does a hint and filestore calls the xfs ioctl to set the d... Sage Weil
02:21 PM Feature #22159 (In Progress): allow tracking of bluestore compression ratio by pool
see https://github.com/ceph/ceph/pull/19454 Sage Weil
02:20 PM Bug #21040 (Resolved): bluestore: multiple objects (clones?) referencing same blocks (on all repl...
The original bug here is fixed. Meanwhile, Igor is working on a repair function for ceph-bluestore-tool that will co... Sage Weil
02:16 PM Bug #20870 (In Progress): OSD compression: incorrect display of the used disk space
The problem is that currently the RAW USED stat is just USED * (replication or EC factor).
Igor is working on ...
Sage Weil
01:06 PM Bug #20385 (Won't Fix): jemalloc+Bluestore+BlueFS causes unexpected RSS Memory usage growth
We don't care about jemalloc at this point. Sage Weil

02/19/2018

11:22 PM Bug #20997 (Can't reproduce): bluestore_types.h: 739: FAILED assert(p != extents.end())
Marek, if you still see this (or a related issue) please ping me. I dropped the ball on this bug but we're newly foc... Sage Weil
05:49 PM Bug #21550 (Need More Info): PG errors reappearing after OSD node rebooted on Luminous
Hi Eric,
Do you still see this problem? I haven't seen anything like it so I'm hoping this is an artifact of 12.2.0
Sage Weil
05:47 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Sage Weil
05:47 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Martin Preuss wrote:
> I would like to try with filestore, but since the introduction of this ceph-volume stuff crea...
Sage Weil
04:24 PM Bug #22796: bluestore gets to ENOSPC with small devices
Yes, the 22GB is correct; the 16PB is not. I created a quick set of SSD OSDs to test new crush rules from what the OSD... David Turner
04:10 PM Bug #22796 (Need More Info): bluestore gets to ENOSPC with small devices
... Sage Weil
04:16 PM Bug #23040 (Fix Under Review): bluestore: statfs available can go negative
https://github.com/ceph/ceph/pull/20487 Sage Weil
04:14 PM Bug #23040 (Resolved): bluestore: statfs available can go negative
see https://tracker.ceph.com/issues/22796 Sage Weil
03:41 PM Bug #21259: bluestore: segv in BlueStore::TwoQCache::_trim
Sage Weil
09:53 AM Bug #21259: bluestore: segv in BlueStore::TwoQCache::_trim
There is another one issue reproduction at https://tracker.ceph.com/issues/22977 Igor Fedotov
09:52 AM Bug #22977: High CPU load caused by operations on onode_map
The last backtrace seems similar to the one from
https://tracker.ceph.com/issues/21259#change-107555
and not sur...
Igor Fedotov

02/16/2018

11:58 PM Bug #21259: bluestore: segv in BlueStore::TwoQCache::_trim
Having this issue multiple times on ceph version 12.2.2 with this patch https://github.com/ceph/ceph/pull/18805 alrea... Petr Ilyin
05:32 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Hi Paul,
Paul Emmerich wrote:
> Martin, let's compare hardware. The only cluster I'm seeing this on is HP servers...
Martin Preuss
05:24 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Hi,
now I see an inconsistent PG for which both OSDs report that HEAD has a read error, but in this case no read e...
Martin Preuss

02/14/2018

06:20 AM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
Sage Weil wrote:
> If this is something you can reproduce that would be extermely helpful.
We can't reproduce the bug...
Artemy Kapitula
06:06 AM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
Sage Weil wrote:
> Have you seen this bug occur since you filed the bug? One other user was seeing it but we've bee...
Artemy Kapitula

02/13/2018

06:16 PM Bug #22977: High CPU load caused by operations on onode_map
I've just observed an OSD crashing with the following log.
Feb 13 09:36:03 ceph-XXX-osd-a10 ceph-osd[24106]: *...
Paul Emmerich

02/12/2018

09:25 PM Bug #22977: High CPU load caused by operations on onode_map
32 OSDs, so ~650k objects/EC chunks per OSD. Paul Emmerich
11:51 AM Bug #22977: High CPU load caused by operations on onode_map
The cluster is running with default settings except for osd_max_backfills.
Sorry, I completely forgot about the me...
Paul Emmerich
10:35 AM Bug #22977: High CPU load caused by operations on onode_map
Hi Paul.
Could you please share your Ceph settings?
Also please collect mempool statistics on a saturated OSD using...
Igor Fedotov
07:10 PM Bug #22102 (Need More Info): BlueStore crashed on rocksdb checksum mismatch
Sage Weil
07:10 PM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
Have you seen this bug occur since you filed the bug? One other user was seeing it but we've been able to generate l... Sage Weil
09:25 AM Bug #22978 (Duplicate): assert in Throttle.cc on primary OSD with Bluestore and an erasure coded ...
Looks like a duplicate of http://tracker.ceph.com/issues/22539. Should be fixed in v12.2.3 Igor Fedotov
02:58 AM Bug #21312: occaionsal ObjectStore/StoreTestSpecificAUSize.Many4KWritesTest/2 failure
... Kefu Chai

02/11/2018

05:27 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Martin, let's compare hardware. The only cluster I'm seeing this on is HP servers with smartarray controllers and lot... Paul Emmerich

02/10/2018

10:17 PM Bug #22977: High CPU load caused by operations on onode_map
Here's a "perf dump" from an OSD suffering from this.
Potentially relevant onode data that looks similar on all OS...
Paul Emmerich
12:04 AM Bug #22978 (Duplicate): assert in Throttle.cc on primary OSD with Bluestore and an erasure coded ...
An assert is being hit in Throttle.cc with BlueStore, when a large object is being written on the primary OSD in an e... Subhachandra Chandra

02/09/2018

09:14 PM Bug #22977 (Resolved): High CPU load caused by operations on onode_map
I'm investigating performance on a cluster that shows an unusually high CPU load.
The setup is BlueStore OSDs running m...
Paul Emmerich

02/08/2018

03:50 PM Bug #22616: bluestore_cache_data uses too much memory
Ok, I think the thing to do here is make the bluestore trimming a bit more frequent, and have this as a known caveat ... Sage Weil
07:52 AM Bug #22957 (Duplicate): [bluestore]bstore_kv_final thread seems deadlock
ceph 12.2.1
ec overwrite
cephfs performance test
_pool 2 'fs_data' erasure size 3 min_size 3 crush_rule 1 obj...
zhou yang
 
