Activity
From 01/21/2018 to 02/19/2018
02/19/2018
- 11:22 PM Bug #20997 (Can't reproduce): bluestore_types.h: 739: FAILED assert(p != extents.end())
- Marek, if you still see this (or a related issue) please ping me. I dropped the ball on this bug but we're newly foc...
- 05:49 PM Bug #21550 (Need More Info): PG errors reappearing after OSD node rebooted on Luminous
- Hi Eric,
Do you still see this problem? I haven't seen anything like it so I'm hoping this is an artifact of 12.2.0
- 05:47 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- Martin Preuss wrote:
> I would like to try with filestore, but since the introduction of this ceph-volume stuff crea...
- 04:24 PM Bug #22796: bluestore gets to ENOSPC with small devices
- Yes, the 22GB is correct; the 16PB is not. I created a quick set of SSD OSDs to test new crush rules from what the OSD...
- 04:10 PM Bug #22796 (Need More Info): bluestore gets to ENOSPC with small devices
- ...
- 04:16 PM Bug #23040 (Fix Under Review): bluestore: statfs available can go negative
- https://github.com/ceph/ceph/pull/20487
- 04:14 PM Bug #23040 (Resolved): bluestore: statfs available can go negative
- see https://tracker.ceph.com/issues/22796
- 03:41 PM Bug #21259: bluestore: segv in BlueStore::TwoQCache::_trim
- 09:53 AM Bug #21259: bluestore: segv in BlueStore::TwoQCache::_trim
There is another reproduction of this issue at https://tracker.ceph.com/issues/22977
- 09:52 AM Bug #22977: High CPU load caused by operations on onode_map
- The last backtrace seems similar to the one from
https://tracker.ceph.com/issues/21259#change-107555
and not sur...
02/16/2018
- 11:58 PM Bug #21259: bluestore: segv in BlueStore::TwoQCache::_trim
- Having this issue multiple times on ceph version 12.2.2 with this patch https://github.com/ceph/ceph/pull/18805 alrea...
- 05:32 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- Hi Paul,
Paul Emmerich wrote:
> Martin, let's compare hardware. The only cluster I'm seeing this on is HP servers...
- 05:24 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- Hi,
now I see an inconsistent PG for which both OSDs report that HEAD has a read error, but in this case no read e...
02/14/2018
- 06:20 AM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
- Sage Weil wrote:
> If this is something you can reproduce that would be extremely helpful.
We can't reproduce bug...
- 06:06 AM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
- Sage Weil wrote:
> Have you seen this bug occur since you filed the bug? One other user was seeing it but we've bee...
02/13/2018
- 06:16 PM Bug #22977: High CPU load caused by operations on onode_map
- I've just observed an OSD crashing with the following log.
Feb 13 09:36:03 ceph-XXX-osd-a10 ceph-osd[24106]: *...
02/12/2018
- 09:25 PM Bug #22977: High CPU load caused by operations on onode_map
- 32 OSDs, so ~650k objects/ec chunks per OSD.
- 11:51 AM Bug #22977: High CPU load caused by operations on onode_map
- The cluster is running with default settings except for osd_max_backfills.
Sorry, I completely forgot about the me...
- 10:35 AM Bug #22977: High CPU load caused by operations on onode_map
- Hi Paul.
Could you please share your Ceph settings?
Also please collect mempool statistics on a saturated OSD using...
- 07:10 PM Bug #22102 (Need More Info): BlueStore crashed on rocksdb checksum mismatch
- 07:10 PM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
- Have you seen this bug occur since you filed the bug? One other user was seeing it but we've been able to generate l...
- 09:25 AM Bug #22978 (Duplicate): assert in Throttle.cc on primary OSD with Bluestore and an erasure coded ...
- Looks like a duplicate of http://tracker.ceph.com/issues/22539. Should be fixed in v12.2.3
- 02:58 AM Bug #21312: occaionsal ObjectStore/StoreTestSpecificAUSize.Many4KWritesTest/2 failure
- ...
02/11/2018
- 05:27 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- Martin, let's compare hardware. The only cluster I'm seeing this on is HP servers with smartarray controllers and lot...
02/10/2018
- 10:17 PM Bug #22977: High CPU load caused by operations on onode_map
- Here's a "perf dump" from an OSD suffering from this.
Potentially relevant onode data that looks similar on all OS...
- 12:04 AM Bug #22978 (Duplicate): assert in Throttle.cc on primary OSD with Bluestore and an erasure coded ...
- An assert is being hit in Throttle.cc with BlueStore, when a large object is being written on the primary OSD in an e...
02/09/2018
- 09:14 PM Bug #22977 (Resolved): High CPU load caused by operations on onode_map
- I'm investigating performance on a cluster that shows an unusually high CPU load.
Setup are Bluestore OSDs running m...
02/08/2018
- 03:50 PM Bug #22616: bluestore_cache_data uses too much memory
- Ok, I think the thing to do here is make the bluestore trimming a bit more frequent, and have this as a known caveat ...
- 07:52 AM Bug #22957 (Duplicate): [bluestore]bstore_kv_final thread seems deadlock
- ceph 12.2.1
ec overwrite
cephfs performance test
_pool 2 'fs_data' erasure size 3 min_size 3 crush_rule 1 obj...
02/07/2018
- 07:38 AM Bug #22285 (Resolved): _read_bdev_label unable to decode label at offset
- 07:38 AM Backport #22892 (Resolved): luminous: _read_bdev_label unable to decode label at offset
02/06/2018
- 10:42 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- Update: Now at least one other host starts giving me these crc errors, too...
So I have now at least two out of th...
- 10:19 PM Backport #22892: luminous: _read_bdev_label unable to decode label at offset
- merged https://github.com/ceph/ceph/pull/20326
02/05/2018
- 08:41 PM Backport #22892 (In Progress): luminous: _read_bdev_label unable to decode label at offset
- 06:26 PM Backport #22892: luminous: _read_bdev_label unable to decode label at offset
- http://tracker.ceph.com/issues/22892
02/03/2018
- 07:24 AM Bug #22535 (Resolved): OSD crushes with FAILED assert(used_blocks.size() > count) during the firs...
- 07:24 AM Backport #22633 (Resolved): luminous: OSD crushes with FAILED assert(used_blocks.size() > count) ...
- 07:14 AM Backport #22698 (Resolved): luminous: bluestore: New OSD - Caught signal - bstore_kv_sync
- 01:08 AM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- I just re-created all 3 OSDs on ceph1 (the host which had the read errors).
Now the errors occur less often, but t...
02/02/2018
- 10:37 PM Backport #22633: luminous: OSD crushes with FAILED assert(used_blocks.size() > count) during the ...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/19888
merged
- 10:32 PM Backport #22698: luminous: bluestore: New OSD - Caught signal - bstore_kv_sync
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/19995
merged
- 01:29 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- It doesn't seem to happen on all servers, it's only 5 out of 15.
But there is nothing special about the affected ser...
- 03:08 AM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- I'm also seeing this on one cluster. Bluestore and CephFS, replicated pools, no compression, HDDs.
It happens random...
- 06:03 AM Bug #21312: occaionsal ObjectStore/StoreTestSpecificAUSize.Many4KWritesTest/2 failure
02/01/2018
- 11:44 PM Backport #22892 (Resolved): luminous: _read_bdev_label unable to decode label at offset
- https://github.com/ceph/ceph/pull/20326
- 11:31 PM Bug #22161 (Resolved): bluestore: do not crash on over-large objects
- 11:30 PM Backport #22507 (Resolved): luminous: bluestore: do not crash on over-large objects
- 11:07 PM Backport #22507: luminous: bluestore: do not crash on over-large objects
- Shinobu Kinjo wrote:
> https://github.com/ceph/ceph/pull/19630
merged
- 04:11 PM Bug #22285: _read_bdev_label unable to decode label at offset
- master pr: https://github.com/ceph/ceph/pull/20090
- 02:51 PM Bug #22285 (Pending Backport): _read_bdev_label unable to decode label at offset
- 12:01 PM Bug #21312: occaionsal ObjectStore/StoreTestSpecificAUSize.Many4KWritesTest/2 failure
- https://github.com/ceph/ceph/pull/20230
- 07:54 AM Bug #20557: segmentation fault with rocksdb|BlueStore and jemalloc
- Hi, just wanted to report I'm hitting the same issue on centos 7 with jemalloc-3.6.0-1.el7 and ceph 12.2.2
01/31/2018
- 08:23 AM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- I have the same problem on my cluster. Periodically I got pg inconsistent only on bluestore osd with this type of mes...
01/29/2018
- 10:43 AM Bug #21312: occaionsal ObjectStore/StoreTestSpecificAUSize.Many4KWritesTest/2 failure
- actual_allocated_size - expected_allocated_size = 4259840 - 4194304 = 0x10000
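The arithmetic above can be checked directly in a shell; both the decimal difference and its hex form come out as quoted (65536 bytes, i.e. 0x10000):

```shell
# Verify the allocation-size delta quoted above:
# 4259840 - 4194304 = 65536 bytes, which is 0x10000 (64 KiB).
echo $(( 4259840 - 4194304 ))
printf '0x%x\n' $(( 4259840 - 4194304 ))
```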
- 04:25 AM Bug #21312: occaionsal ObjectStore/StoreTestSpecificAUSize.Many4KWritesTest/2 failure
- http://pulpito.ceph.com/yuriw-2018-01-26_18:13:44-rados-wip_yuri_master_1.26.18-distro-basic-smithi/2112995/...
- 02:27 AM Bug #22796: bluestore gets to ENOSPC with small devices
- David Turner wrote:
> I was able to resolve this issue by using the ceph-objectstore-tool to remove copies of PGs so...
01/28/2018
- 03:54 PM Bug #22796: bluestore gets to ENOSPC with small devices
- I was able to resolve this issue by using the ceph-objectstore-tool to remove copies of PGs so the osds could start. ...
01/27/2018
- 05:55 PM Bug #22102 (In Progress): BlueStore crashed on rocksdb checksum mismatch
- full logs at 5e38cf1e-532a-4aa4-8289-5b9e9c59632a
01/26/2018
- 01:27 PM Bug #22796: bluestore gets to ENOSPC with small devices
- This might be a red herring. I think Nick Fisk on the ML found the problem. Originally the output of `ceph osd df` s...
- 01:24 PM Bug #22796: bluestore gets to ENOSPC with small devices
- debug bluestore = 20 log for the same OSD as before.
ceph-post-file: 06b467b7-4a91-4263-85e0-c89268b694e3
- 01:16 PM Bug #22796: bluestore gets to ENOSPC with small devices
- Please use ceph-post-file to upload the full logs.
01/25/2018
- 02:35 PM Bug #20557: segmentation fault with rocksdb|BlueStore and jemalloc
- The arch is x86_64. Ceph was installed from the eu.ceph.com deb repo. This issue isn't current for me anymore as the clus...
- 02:24 PM Bug #20557: segmentation fault with rocksdb|BlueStore and jemalloc
- Hi Mikko,
What architecture are you running on?
I tried to match your callstacks with binaries for x86_64 for "ceph...
- 02:10 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- Martin,
For "device location [0x6d76b40000~1000]" it would be:
dd bs=4096 if=/var/lib/ceph/osd/ceph-1/block skip=...
- 11:11 AM Bug #22796: bluestore gets to ENOSPC with small devices
- David Turner wrote:
> Here's a log with `debug bluestore = 5`.
- 11:10 AM Bug #22796: bluestore gets to ENOSPC with small devices
- Here's a log with `debug bluestore 5`.
- 11:00 AM Bug #22796: bluestore gets to ENOSPC with small devices
- Can you attach logs with lower debug level? E.g. debug bluestore = 5
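One plausible way to set that level is a ceph.conf fragment on the OSD hosts (a sketch; the `[osd]` placement and the 5/5 memory/log pair are illustrative, and the daemon needs a restart to pick it up):

```ini
# ceph.conf fragment: raise bluestore logging on the OSDs.
# Debug levels are "memory/log" pairs; 5/5 shown here for illustration.
[osd]
    debug bluestore = 5/5
```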
- 10:51 AM Bug #22796 (Resolved): bluestore gets to ENOSPC with small devices
- I have a 3 node cluster with mon, mds, mgr, and osds all running on each. The steps I've recently performed on my cl...
01/23/2018
- 10:24 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- Hi,
how do I translate the given location, e.g. to a "dd" argument?
Meanwhile I found out that only the first m...
- 12:23 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- Martin, your logs show places where data is located, for example: "device location [0x6d76b40000~1000]".
Is it possi...
- 04:05 PM Bug #22285: _read_bdev_label unable to decode label at offset
- 10:21 AM Backport #22698: luminous: bluestore: New OSD - Caught signal - bstore_kv_sync
- @Prashant Please fix the cherry-pick conflict resolution as suggested by Igor in the PR.
- 12:16 AM Bug #22427 (Resolved): osd_fsid does not exist, fsid is generated instead
01/22/2018
- 08:17 PM Bug #22427 (Fix Under Review): osd_fsid does not exist, fsid is generated instead
- PR at https://github.com/ceph/ceph/pull/20059
- 03:51 PM Bug #22427 (Triaged): osd_fsid does not exist, fsid is generated instead
- 03:53 PM Bug #22510: osd: BlueStore.cc: BlueStore::_balance_bluefs_freespace: assert(0 == "allocate failed...
- 03:51 PM Bug #22245 (Need More Info): [segfault] ceph-bluestore-tool bluefs-log-dump
- Can you still reproduce this? Do you have (or can you generate) a core file? The log doesn't tell us where it faile...
- 03:45 PM Bug #22115 (Duplicate): OSD SIGABRT on bluestore_prefer_deferred_size = 104857600: assert(_buffer...
- see #21932
- 03:43 PM Bug #22543 (Can't reproduce): OSDs can not start after shutdown, killed by OOM killer during PGs ...
- 03:40 PM Bug #22066 (Duplicate): bluestore osd asserts repeatedly with ceph-12.2.1/src/include/buffer.h: 8...
- see #21932, pending backport, should be in 12.2.3
- 03:34 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- Martin, can you check your dmesg/kernel log and see if there are any media errors? The crc value is for a single blo...
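A quick way to scan the kernel log for media errors is grep with a pattern like the one below. The input lines here are fabricated samples so the filter itself is visible; on a live host you would pipe `dmesg` into the same grep:

```shell
# Filter for typical block-layer media-error messages (sample input is
# made up for illustration; replace the printf with `dmesg` on a real host).
printf '%s\n' \
  'blk_update_request: critical medium error, dev sdb, sector 123456' \
  'EXT4-fs (sda1): mounted filesystem with ordered data mode' \
| grep -iE 'medium error|i/o error|uncorrect'
```

Only the first sample line matches; healthy-filesystem chatter like the EXT4 line is filtered out.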
- 03:16 PM Backport #22264 (Resolved): luminous: bluestore: db.slow used when db is not full
- 03:02 PM Backport #22264: luminous: bluestore: db.slow used when db is not full
- luminous cherry-pick is merged.
- 06:00 AM Bug #22616: bluestore_cache_data uses too much memory
- I did some test with bluestore_default_buffered_read = false
The bluestore_cache_data now only use around a fe...