Activity

From 01/21/2018 to 02/19/2018

02/19/2018

11:22 PM Bug #20997 (Can't reproduce): bluestore_types.h: 739: FAILED assert(p != extents.end())
Marek, if you still see this (or a related issue) please ping me. I dropped the ball on this bug but we're newly foc... Sage Weil
05:49 PM Bug #21550 (Need More Info): PG errors reappearing after OSD node rebooted on Luminous
Hi Eric,
Do you still see this problem? I haven't seen anything like it so I'm hoping this is an artifact of 12.2.0
Sage Weil
05:47 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Sage Weil
05:47 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Martin Preuss wrote:
> I would like to try with filestore, but since the introduction of this ceph-volume stuff crea...
Sage Weil
04:24 PM Bug #22796: bluestore gets to ENOSPC with small devices
Yes, the 22GB is correct; the 16PB is not. I created a quick set of SSD OSDs to test new crush rules from what the OSD... David Turner
04:10 PM Bug #22796 (Need More Info): bluestore gets to ENOSPC with small devices
... Sage Weil
04:16 PM Bug #23040 (Fix Under Review): bluestore: statfs available can go negative
https://github.com/ceph/ceph/pull/20487 Sage Weil
04:14 PM Bug #23040 (Resolved): bluestore: statfs available can go negative
see https://tracker.ceph.com/issues/22796 Sage Weil
03:41 PM Bug #21259: bluestore: segv in BlueStore::TwoQCache::_trim
Sage Weil
09:53 AM Bug #21259: bluestore: segv in BlueStore::TwoQCache::_trim
There is another reproduction of this issue at https://tracker.ceph.com/issues/22977 Igor Fedotov
09:52 AM Bug #22977: High CPU load caused by operations on onode_map
The last backtrace seems similar to the one from
https://tracker.ceph.com/issues/21259#change-107555
and not sur...
Igor Fedotov

02/16/2018

11:58 PM Bug #21259: bluestore: segv in BlueStore::TwoQCache::_trim
Having this issue multiple times on ceph version 12.2.2 with this patch https://github.com/ceph/ceph/pull/18805 alrea... Petr Ilyin
05:32 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Hi Paul,
Paul Emmerich wrote:
> Martin, let's compare hardware. The only cluster I'm seeing this on is HP servers...
Martin Preuss
05:24 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Hi,
now I see an inconsistent PG for which both OSDs report that HEAD has a read error, but in this case no read e...
Martin Preuss

02/14/2018

06:20 AM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
Sage Weil wrote:
> If this is something you can reproduce that would be extremely helpful.
We can't reproduce bug...
Artemy Kapitula
06:06 AM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
Sage Weil wrote:
> Have you seen this bug occur since you filed the bug? One other user was seeing it but we've bee...
Artemy Kapitula

02/13/2018

06:16 PM Bug #22977: High CPU load caused by operations on onode_map
I've just observed an OSD crashing with the following log.
Feb 13 09:36:03 ceph-XXX-osd-a10 ceph-osd[24106]: *...
Paul Emmerich

02/12/2018

09:25 PM Bug #22977: High CPU load caused by operations on onode_map
32 OSDs, so ~650k objects/ec chunks per OSD. Paul Emmerich
11:51 AM Bug #22977: High CPU load caused by operations on onode_map
The cluster is running with default settings except for osd_max_backfills.
Sorry, I completely forgot about the me...
Paul Emmerich
10:35 AM Bug #22977: High CPU load caused by operations on onode_map
Hi Paul.
Could you please share your Ceph settings?
Also please collect mempool statistics on a saturated OSD using...
Igor Fedotov
07:10 PM Bug #22102 (Need More Info): BlueStore crashed on rocksdb checksum mismatch
Sage Weil
07:10 PM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
Have you seen this bug occur since you filed the bug? One other user was seeing it but we've been able to generate l... Sage Weil
09:25 AM Bug #22978 (Duplicate): assert in Throttle.cc on primary OSD with Bluestore and an erasure coded ...
Looks like a duplicate of http://tracker.ceph.com/issues/22539. Should be fixed in v12.2.3 Igor Fedotov
02:58 AM Bug #21312: occaionsal ObjectStore/StoreTestSpecificAUSize.Many4KWritesTest/2 failure
... Kefu Chai

02/11/2018

05:27 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Martin, let's compare hardware. The only cluster I'm seeing this on is HP servers with smartarray controllers and lot... Paul Emmerich

02/10/2018

10:17 PM Bug #22977: High CPU load caused by operations on onode_map
Here's a "perf dump" from an OSD suffering from this.
Potentially relevant onode data that looks similar on all OS...
Paul Emmerich
12:04 AM Bug #22978 (Duplicate): assert in Throttle.cc on primary OSD with Bluestore and an erasure coded ...
An assert is being hit in Throttle.cc with BlueStore, when a large object is being written on the primary OSD in an e... Subhachandra Chandra

02/09/2018

09:14 PM Bug #22977 (Resolved): High CPU load caused by operations on onode_map
I'm investigating performance on a cluster that shows an unusually high CPU load.
The setup is Bluestore OSDs running m...
Paul Emmerich

02/08/2018

03:50 PM Bug #22616: bluestore_cache_data uses too much memory
Ok, I think the thing to do here is make the bluestore trimming a bit more frequent, and have this as a known caveat ... Sage Weil
07:52 AM Bug #22957 (Duplicate): [bluestore]bstore_kv_final thread seems deadlock
ceph 12.2.1
ec overwrite
cephfs performance test
_pool 2 'fs_data' erasure size 3 min_size 3 crush_rule 1 obj...
zhou yang

02/07/2018

07:38 AM Bug #22285 (Resolved): _read_bdev_label unable to decode label at offset
Nathan Cutler
07:38 AM Backport #22892 (Resolved): luminous: _read_bdev_label unable to decode label at offset
Nathan Cutler

02/06/2018

10:42 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Update: Now at least one other host starts giving me these crc errors, too...
So I have now at least two out of th...
Martin Preuss
10:19 PM Backport #22892: luminous: _read_bdev_label unable to decode label at offset
merged https://github.com/ceph/ceph/pull/20326 Yuri Weinstein

02/05/2018

08:41 PM Backport #22892 (In Progress): luminous: _read_bdev_label unable to decode label at offset
Nathan Cutler
06:26 PM Backport #22892: luminous: _read_bdev_label unable to decode label at offset
http://tracker.ceph.com/issues/22892 Abhishek Lekshmanan

02/03/2018

07:24 AM Bug #22535 (Resolved): OSD crushes with FAILED assert(used_blocks.size() > count) during the firs...
Nathan Cutler
07:24 AM Backport #22633 (Resolved): luminous: OSD crushes with FAILED assert(used_blocks.size() > count) ...
Nathan Cutler
07:14 AM Backport #22698 (Resolved): luminous: bluestore: New OSD - Caught signal - bstore_kv_sync
Nathan Cutler
01:08 AM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
I just re-created all 3 OSDs on ceph1 (the host which had the read errors).
Now the errors occur less often, but t...
Martin Preuss

02/02/2018

10:37 PM Backport #22633: luminous: OSD crushes with FAILED assert(used_blocks.size() > count) during the ...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/19888
merged
Yuri Weinstein
10:32 PM Backport #22698: luminous: bluestore: New OSD - Caught signal - bstore_kv_sync
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/19995
merged
Yuri Weinstein
01:29 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
It doesn't seem to happen on all servers, it's only 5 out of 15.
But there is nothing special about the affected ser...
Paul Emmerich
03:08 AM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
I'm also seeing this on one cluster. Bluestore and CephFS, replicated pools, no compression, HDDs.
It happens random...
Paul Emmerich
06:03 AM Bug #21312: occaionsal ObjectStore/StoreTestSpecificAUSize.Many4KWritesTest/2 failure
Nathan Cutler

02/01/2018

11:44 PM Backport #22892 (Resolved): luminous: _read_bdev_label unable to decode label at offset
https://github.com/ceph/ceph/pull/20326 Nathan Cutler
11:31 PM Bug #22161 (Resolved): bluestore: do not crash on over-large objects
Nathan Cutler
11:30 PM Backport #22507 (Resolved): luminous: bluestore: do not crash on over-large objects
Nathan Cutler
11:07 PM Backport #22507: luminous: bluestore: do not crash on over-large objects
Shinobu Kinjo wrote:
> https://github.com/ceph/ceph/pull/19630
merged
Yuri Weinstein
04:11 PM Bug #22285: _read_bdev_label unable to decode label at offset
master pr: https://github.com/ceph/ceph/pull/20090 Abhishek Lekshmanan
02:51 PM Bug #22285 (Pending Backport): _read_bdev_label unable to decode label at offset
Sage Weil
12:01 PM Bug #21312: occaionsal ObjectStore/StoreTestSpecificAUSize.Many4KWritesTest/2 failure
https://github.com/ceph/ceph/pull/20230 Igor Fedotov
07:54 AM Bug #20557: segmentation fault with rocksdb|BlueStore and jemalloc
Hi, just wanted to report I'm hitting the same issue on centos 7 with jemalloc-3.6.0-1.el7 and ceph 12.2.2 Nikola Ciprich

01/31/2018

08:23 AM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
I have the same problem on my cluster. Periodically I get inconsistent PGs, only on bluestore OSDs, with this type of mes... Nicolas Drufin

01/29/2018

10:43 AM Bug #21312: occaionsal ObjectStore/StoreTestSpecificAUSize.Many4KWritesTest/2 failure
actual_allocated_size - expected_allocated_size = 4259840 - 4194304 = 0x10000 Kefu Chai
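The arithmetic in the comment above can be checked with a short sketch. The variable names follow the comment; expressing the excess in KiB and in 4K blocks is an addition made here for illustration, not something stated in the ticket.

```python
# Values quoted in the comment above.
actual_allocated_size = 4259840
expected_allocated_size = 4194304

excess = actual_allocated_size - expected_allocated_size

# Show the excess in hex, in KiB, and as a count of 4 KiB blocks.
print(hex(excess), excess // 1024, "KiB", excess // 4096, "blocks of 4 KiB")
```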
04:25 AM Bug #21312: occaionsal ObjectStore/StoreTestSpecificAUSize.Many4KWritesTest/2 failure
http://pulpito.ceph.com/yuriw-2018-01-26_18:13:44-rados-wip_yuri_master_1.26.18-distro-basic-smithi/2112995/... Kefu Chai
02:27 AM Bug #22796: bluestore gets to ENOSPC with small devices
David Turner wrote:
> I was able to resolve this issue by using the ceph-objectstore-tool to remove copies of PGs so...
Brad Hubbard

01/28/2018

03:54 PM Bug #22796: bluestore gets to ENOSPC with small devices
I was able to resolve this issue by using the ceph-objectstore-tool to remove copies of PGs so the osds could start. ... David Turner

01/27/2018

05:55 PM Bug #22102 (In Progress): BlueStore crashed on rocksdb checksum mismatch
full logs at 5e38cf1e-532a-4aa4-8289-5b9e9c59632a Sage Weil

01/26/2018

01:27 PM Bug #22796: bluestore gets to ENOSPC with small devices
This might be a red herring. I think Nick Fisk on the ML found the problem. Originally the output of `ceph osd df` s... David Turner
01:24 PM Bug #22796: bluestore gets to ENOSPC with small devices
debug bluestore = 20 log for the same OSD as before.
ceph-post-file: 06b467b7-4a91-4263-85e0-c89268b694e3
David Turner
01:16 PM Bug #22796: bluestore gets to ENOSPC with small devices
Please use ceph-post-file to upload the full logs. Greg Farnum

01/25/2018

02:35 PM Bug #20557: segmentation fault with rocksdb|BlueStore and jemalloc
The arch is x86_64. Ceph was installed from eu.ceph.com deb repo. This issue isn't current for me anymore as the clus... Mikko Tanner
02:24 PM Bug #20557: segmentation fault with rocksdb|BlueStore and jemalloc
Hi Mikko,
What architecture are you running on?
I tried to match your callstacks with binaries for x86_64 for "ceph...
Adam Kupczyk
02:10 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Martin,
For "device location [0x6d76b40000~1000]" it would be:
dd bs=4096 if=/var/lib/ceph/osd/ceph-1/block skip=...
Adam Kupczyk
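As a rough illustration of the translation described above, the sketch below converts a logged extent of the form `0xOFFSET~LENGTH` into `skip` and `count` values for `dd` with `bs=4096`. It assumes both fields are hexadecimal byte values and that the extent is 4096-byte aligned. The function name and the output file path are made up for this example, and the printed command is not necessarily the one in the truncated comment.

```python
BLOCK_SIZE = 4096  # matches the bs=4096 used in the dd command above


def extent_to_dd_args(extent, block_size=BLOCK_SIZE):
    """Convert '0xOFFSET~LENGTH' (assumed hex bytes) to (skip, count) in blocks."""
    offset_text, length_text = extent.strip("[]").split("~")
    offset = int(offset_text, 16)
    length = int(length_text, 16)
    if offset % block_size or length % block_size:
        raise ValueError("extent is not aligned to the block size")
    return offset // block_size, length // block_size


skip, count = extent_to_dd_args("[0x6d76b40000~1000]")
# /tmp/extent.bin is a hypothetical output path chosen for this sketch.
print(
    f"dd bs={BLOCK_SIZE} if=/var/lib/ceph/osd/ceph-1/block "
    f"of=/tmp/extent.bin skip={skip} count={count}"
)
```

Working in whole blocks keeps `skip` and `count` in the same unit as `bs`, which is why the sketch rejects unaligned extents instead of rounding them.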
11:11 AM Bug #22796: bluestore gets to ENOSPC with small devices
David Turner wrote:
> Here's a log with `debug bluestore = 5`.
David Turner
11:10 AM Bug #22796: bluestore gets to ENOSPC with small devices
Here's a log with `debug bluestore = 5`. David Turner
11:00 AM Bug #22796: bluestore gets to ENOSPC with small devices
Can you attach logs with lower debug level? E.g. debug bluestore = 5 Igor Fedotov
10:51 AM Bug #22796 (Resolved): bluestore gets to ENOSPC with small devices
I have a 3 node cluster with mon, mds, mgr, and osds all running on each. The steps I've recently performed on my cl... David Turner

01/23/2018

10:24 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Hi,
how do I translate the given location, e.g. to a "dd" argument?
Meanwhile I found out that only the first m...
Martin Preuss
12:23 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Martin, your logs show places where data is located, for example: "device location [0x6d76b40000~1000]".
Is it possi...
Adam Kupczyk
04:05 PM Bug #22285: _read_bdev_label unable to decode label at offset
Alfredo Deza
10:21 AM Backport #22698: luminous: bluestore: New OSD - Caught signal - bstore_kv_sync
@Prashant Please fix the cherry-pick conflict resolution as suggested by Igor in the PR. Nathan Cutler
12:16 AM Bug #22427 (Resolved): osd_fsid does not exist, fsid is generated instead
Sage Weil

01/22/2018

08:17 PM Bug #22427 (Fix Under Review): osd_fsid does not exist, fsid is generated instead
PR at https://github.com/ceph/ceph/pull/20059 Alfredo Deza
03:51 PM Bug #22427 (Triaged): osd_fsid does not exist, fsid is generated instead
Sage Weil
03:53 PM Bug #22510: osd: BlueStore.cc: BlueStore::_balance_bluefs_freespace: assert(0 == "allocate failed...
Sage Weil
03:51 PM Bug #22245 (Need More Info): [segfault] ceph-bluestore-tool bluefs-log-dump
Can you still reproduce this? Do you have (or can you generate) a core file? The log doesn't tell us where it faile... Sage Weil
03:45 PM Bug #22115 (Duplicate): OSD SIGABRT on bluestore_prefer_deferred_size = 104857600: assert(_buffer...
see #21932 Sage Weil
03:43 PM Bug #22543 (Can't reproduce): OSDs can not start after shutdown, killed by OOM killer during PGs ...
Sage Weil
03:40 PM Bug #22066 (Duplicate): bluestore osd asserts repeatedly with ceph-12.2.1/src/include/buffer.h: 8...
see #21932, pending backport, should be in 12.2.3 Sage Weil
03:34 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Martin, can you check your dmesg/kernel log and see if there are any media errors? The crc value is for a single blo... Sage Weil
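To check whether a checksum value could correspond to an all-zero block, one could compute the CRC-32C of 4096 zero bytes. The sketch below is a plain table-driven CRC-32C (Castagnoli polynomial). It assumes, without confirmation from this thread, that the checksum is seeded with 0xffffffff and not inverted at the end; the function names are invented for the example. If the convention differs, the printed value will not match the one in the bug title.

```python
def _make_table():
    """Build the 256-entry lookup table for reflected CRC-32C."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c_raw(seed, data):
    """CRC-32C update with an explicit seed and no final inversion."""
    crc = seed & 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc


def crc32c(data):
    """Standard CRC-32C: seed 0xffffffff and final inversion."""
    return crc32c_raw(0xFFFFFFFF, data) ^ 0xFFFFFFFF


# Assumed convention for this sketch: seed 0xffffffff, no final inversion.
zero_block = bytes(4096)
print(hex(crc32c_raw(0xFFFFFFFF, zero_block)))
```

Comparing the printed value with the one reported in the logs would indicate whether the failing reads returned zeros rather than randomly corrupted data.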
03:16 PM Backport #22264 (Resolved): luminous: bluestore: db.slow used when db is not full
Igor Fedotov
03:02 PM Backport #22264: luminous: bluestore: db.slow used when db is not full
luminous cherry-pick is merged. Sage Weil
06:00 AM Bug #22616: bluestore_cache_data uses too much memory
I did some test with bluestore_default_buffered_read = false
The bluestore_cache_data now only use around a fe...
frank lin