Activity

From 06/20/2018 to 07/19/2018

07/19/2018

11:02 PM Bug #25001: Crashing OSDs after going from 12.2.5 -> 12.2.6 -> 13.2.0
Attaching thread dump. Brad Hubbard
10:48 PM Bug #25001: Crashing OSDs after going from 12.2.5 -> 12.2.6 -> 13.2.0
Looks like we are passing a bad bluestore_pextent_t into txc->released.insert.... Brad Hubbard
03:25 PM Bug #25001 (Can't reproduce): Crashing OSDs after going from 12.2.5 -> 12.2.6 -> 13.2.0
This bug has been opened following on from http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-July/028232.html
...
Troy Ablan
08:43 PM Bug #25006 (Can't reproduce): bad csum during upgrade test
Run: http://pulpito.ceph.com/yuriw-2018-07-18_21:43:34-upgrade:luminous-x-mimic-distro-basic-smithi/
Job: 2796264
L...
Yuri Weinstein
08:14 PM Bug #24968: Compaction error: Corruption: block checksum mismatch
Markus,
before redeploying the OSDs can you monitor current memory usage with top and/or free tools for a while?
Ju...
Igor Fedotov
03:22 PM Bug #24968: Compaction error: Corruption: block checksum mismatch
Hi Igor,
thanks for your feedback. The ceph servers have 128GB RAM with 10 1.8TB HDD and 3 1.92TB SSD plus a 640G...
Markus Stockhausen
09:51 AM Bug #24968: Compaction error: Corruption: block checksum mismatch
Markus,
first of all - I think 'improper' device size reporting is unrelated to this issue. This report contains jus...
Igor Fedotov
12:05 PM Backport #24799 (Resolved): mimic: FAILED assert(0 == "can't mark unloaded shard dirty") with com...
Igor Fedotov
09:28 AM Bug #24639: [segfault] segfault in BlueFS::read
Rowan,
do you remember what the BlueFS volume sizes were for the OSDs that were breaking?
Igor Fedotov
09:15 AM Bug #24560 (Resolved): BitmapAllocator::_mark_allocated parameter overflow.
Igor Fedotov

07/18/2018

07:31 PM Bug #24968: Compaction error: Corruption: block checksum mismatch
One strange thing from the detail log (out) is a size mismatch for bdev 2:
At the beginning of the log we see:
...
Markus Stockhausen
02:56 PM Backport #24887 (Resolved): mimic: Multiple races related to destruction of SharedBlob and BlueSt...
Nathan Cutler
02:17 PM Backport #24887: mimic: Multiple races related to destruction of SharedBlob and BlueStore::split_...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23065
merged
Yuri Weinstein

07/17/2018

07:35 PM Bug #24968: Compaction error: Corruption: block checksum mismatch
Ceph.conf... Markus Stockhausen
07:31 PM Bug #24968: Compaction error: Corruption: block checksum mismatch
System is Centos 7.5 with longterm kernel 4.14.52 (kernel-ml spinoff from ELRepo) Markus Stockhausen
07:23 PM Bug #24968: Compaction error: Corruption: block checksum mismatch
Log of
CEPH_ARGS="--debug-bluestore 20 --debug-bluefs 20 --err-to-stderr --log-file out" ceph-bluestore-tool fsck ...
Markus Stockhausen
07:17 PM Bug #24968 (Closed): Compaction error: Corruption: block checksum mismatch
I've been running ceph luminous 12.2.5 for a few weeks now. Until now only very light usage. Today we started our first ceph... Markus Stockhausen

07/16/2018

03:19 AM Backport #24887 (In Progress): mimic: Multiple races related to destruction of SharedBlob and Blu...
https://github.com/ceph/ceph/pull/23065 Prashant D
02:35 AM Backport #24886 (In Progress): luminous: Multiple races related to destruction of SharedBlob and ...
https://github.com/ceph/ceph/pull/23064 Prashant D

07/13/2018

04:51 PM Bug #24903 (Fix Under Review): Update 12.2.5 -> 12.2.6: block.db symlink exists but target unusable
https://github.com/ceph/ceph/pull/23031 Sage Weil
02:59 PM Bug #24903: Update 12.2.5 -> 12.2.6: block.db symlink exists but target unusable
What is happening here is that ceph-volume was relying on bluestore to set these links, and then when bluestore chang... Alfredo Deza
10:07 AM Bug #24903 (Resolved): Update 12.2.5 -> 12.2.6: block.db symlink exists but target unusable
After updating from 12.2.5 to 12.2.6, BlueStore OSDs with a separate block.db device will not restart:
2018-07-13 ...
Robert Sander
10:54 AM Bug #24906 (Closed): fio with bluestore crushed
... Honggang Yang
08:03 AM Bug #24901 (Resolved): Client reads fail due to bad CRC under high memory pressure on OSDs
I've seen problems with read failures due to CRC mismatches on two completely independent clusters with different har... Paul Emmerich

07/12/2018

10:49 AM Bug #22977 (Resolved): High CPU load caused by operations on onode_map
Nathan Cutler
10:49 AM Backport #24720 (Resolved): mimic: High CPU load caused by operations on onode_map
Nathan Cutler
12:05 AM Backport #24720: mimic: High CPU load caused by operations on onode_map
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/22777
merged
Yuri Weinstein
10:36 AM Backport #24769 (Resolved): mimic: set correctly shard for existed Collection.
Nathan Cutler
10:17 AM Backport #24887 (Resolved): mimic: Multiple races related to destruction of SharedBlob and BlueSt...
https://github.com/ceph/ceph/pull/23065 Nathan Cutler
10:17 AM Backport #24886 (Resolved): luminous: Multiple races related to destruction of SharedBlob and Blu...
https://github.com/ceph/ceph/pull/23064 Nathan Cutler
03:02 AM Bug #24859: Multiple races related to destruction of SharedBlob and BlueStore::split_cache()
https://github.com/ceph/ceph/pull/22972 Sage Weil
03:02 AM Bug #24859 (Pending Backport): Multiple races related to destruction of SharedBlob and BlueStore:...
Sage Weil
12:00 AM Backport #24799: mimic: FAILED assert(0 == "can't mark unloaded shard dirty") with compression en...
Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/22910
merged
Yuri Weinstein

07/11/2018

08:11 PM Backport #24769: mimic: set correctly shard for existed Collection.
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22859
merged
Yuri Weinstein

07/10/2018

08:12 PM Bug #24859 (Resolved): Multiple races related to destruction of SharedBlob and BlueStore::split_c...
Radoslaw Zarzynski

07/09/2018

02:00 PM Bug #24715: FAILED assert(0 == "put on missing extent (nothing before)")
These crashes could be explained based on:
* the race in *SharedBlob::put()* that was fixed in v12.2.6,
* memory r...
Radoslaw Zarzynski

07/06/2018

09:53 PM Backport #24261 (Resolved): mimic: bluestore: flush_commit is racy
Nathan Cutler
09:48 PM Backport #24261: mimic: bluestore: flush_commit is racy
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22382
merged
Yuri Weinstein
09:21 PM Bug #23540: FAILED assert(0 == "can't mark unloaded shard dirty") with compression enabled
@Igor - please use "Copied To" instead of "Related To" for the links to the backport issues. All good otherwise, than... Nathan Cutler
06:43 PM Bug #23540 (Pending Backport): FAILED assert(0 == "can't mark unloaded shard dirty") with compres...
Sage Weil
09:16 PM Backport #24799: mimic: FAILED assert(0 == "can't mark unloaded shard dirty") with compression en...
mimic backport for https://tracker.ceph.com/issues/23540 Nathan Cutler
04:29 PM Backport #24799: mimic: FAILED assert(0 == "can't mark unloaded shard dirty") with compression en...
https://github.com/ceph/ceph/pull/22910 Igor Fedotov
04:25 PM Backport #24799 (Resolved): mimic: FAILED assert(0 == "can't mark unloaded shard dirty") with com...
https://github.com/ceph/ceph/pull/22910 Igor Fedotov
09:16 PM Backport #24798: luminous: FAILED assert(0 == "can't mark unloaded shard dirty") with compression...
luminous backport for https://tracker.ceph.com/issues/23540 Nathan Cutler
04:24 PM Backport #24798: luminous: FAILED assert(0 == "can't mark unloaded shard dirty") with compression...
https://github.com/ceph/ceph/pull/22909 Igor Fedotov
04:11 PM Backport #24798 (Resolved): luminous: FAILED assert(0 == "can't mark unloaded shard dirty") with ...
https://github.com/ceph/ceph/pull/22909 Igor Fedotov
01:22 PM Backport #24260 (In Progress): luminous: bluestore: flush_commit is racy
https://github.com/ceph/ceph/pull/22904 Igor Fedotov
04:26 AM Backport #24260: luminous: bluestore: flush_commit is racy
So it looks like ch was introduced by https://github.com/ceph/ceph/commit/abd58ad0b9a1a1564f96e2e2b8e1d2d7c832f7cf (w... Nathan Cutler
04:20 AM Backport #24260: luminous: bluestore: flush_commit is racy
According to @Prashant this depends on https://github.com/ceph/ceph/pull/20173 which seems too big to backport to lum... Nathan Cutler

07/05/2018

11:46 AM Bug #23540 (Fix Under Review): FAILED assert(0 == "can't mark unloaded shard dirty") with compres...
https://github.com/ceph/ceph/pull/22873 Igor Fedotov

07/04/2018

11:01 PM Backport #24770 (In Progress): luminous: set correctly shard for existed Collection.
Nathan Cutler
10:52 PM Backport #24770 (Resolved): luminous: set correctly shard for existed Collection.
https://github.com/ceph/ceph/pull/22860 Nathan Cutler
11:01 PM Backport #24769 (In Progress): mimic: set correctly shard for existed Collection.
Nathan Cutler
10:52 PM Backport #24769 (Resolved): mimic: set correctly shard for existed Collection.
https://github.com/ceph/ceph/pull/22859 Nathan Cutler
07:22 PM Bug #24761 (Pending Backport): set correctly shard for existed Collection.
Sage Weil

07/03/2018

11:29 PM Bug #24761: set correctly shard for existed Collection.
https://github.com/ceph/ceph/pull/22733 jianpeng ma
11:29 PM Bug #24761 (Resolved): set correctly shard for existed Collection.
For an existing Collection, the constructor is called in _open_collections.
But m_finisher_num can't setup when enable b...
jianpeng ma
04:06 PM Bug #23540: FAILED assert(0 == "can't mark unloaded shard dirty") with compression enabled
ceph-post-file: 1b1d42bb-6cae-430a-8fe7-974ce077b8dc
May (or may not) help, it's around loglevel5 I guess.
Peter Gervai

07/02/2018

01:47 PM Bug #23540: FAILED assert(0 == "can't mark unloaded shard dirty") with compression enabled
Much better now :)
Thanks a lot!!!
Igor Fedotov
01:45 PM Bug #23540: FAILED assert(0 == "can't mark unloaded shard dirty") with compression enabled
arghh. my mistake, http://77.247.180.45/download/ceph-osd.11.log.debug.gz
Igor Fedotov wrote:
> Yohay,
> I ca...
Yohay Azulay
01:42 PM Bug #23540: FAILED assert(0 == "can't mark unloaded shard dirty") with compression enabled
Yohay,
I can't access the file at the link you provided, "Not found" returned..
Igor Fedotov

06/30/2018

07:28 AM Bug #23540: FAILED assert(0 == "can't mark unloaded shard dirty") with compression enabled
Got it.. here it is: http://77.247.180.45/ceph-osd.11.log.debug.gz
Sage Weil wrote:
> Hi everyone,
>
> Is so...
Yohay Azulay

06/29/2018

07:08 PM Backport #24720 (In Progress): mimic: High CPU load caused by operations on onode_map
Patrick Donnelly
07:08 PM Backport #24720 (Resolved): mimic: High CPU load caused by operations on onode_map
https://github.com/ceph/ceph/pull/22777 Patrick Donnelly
07:05 PM Bug #22977 (Pending Backport): High CPU load caused by operations on onode_map
https://github.com/ceph/ceph/pull/22294 Patrick Donnelly
04:56 PM Bug #24715: FAILED assert(0 == "put on missing extent (nothing before)")
Hmm, some time ago there was an issue with "racy `SharedBlob::put()`":http://tracker.ceph.com/issues/24211. Looks wor... Radoslaw Zarzynski
04:33 PM Bug #24715 (In Progress): FAILED assert(0 == "put on missing extent (nothing before)")
Radoslaw Zarzynski
04:25 PM Bug #24715 (Duplicate): FAILED assert(0 == "put on missing extent (nothing before)")
Reported on ceph-users by Dyweni (see http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-June/027632.html).
...
Radoslaw Zarzynski
09:37 AM Documentation #24712 (New): Memory recommendations for bluestore
We configured the following setting for our luminous/bluestore osds.... Marc Schöchlin

06/26/2018

02:21 AM Bug #24598 (Resolved): ceph_test_objecstore: bluefs mount fail with overlapping op_alloc_add
Sage Weil

06/25/2018

03:01 PM Bug #24598 (Fix Under Review): ceph_test_objecstore: bluefs mount fail with overlapping op_alloc_add
https://github.com/ceph/ceph/pull/22691 Igor Fedotov

06/23/2018

11:32 AM Bug #24639: [segfault] segfault in BlueFS::read
I'm about to see if I can get a debug build reproducing it; but in the meantime I'm pretty sure this is the line:
...
Rowan James
11:13 AM Bug #24639 (Can't reproduce): [segfault] segfault in BlueFS::read
Via ceph-deploy on my admin host, I created two encrypted bluestore OSDs which, after between 4 and 24 hours, started p... Rowan James

06/22/2018

04:44 PM Bug #24598: ceph_test_objecstore: bluefs mount fail with overlapping op_alloc_add
A bit more detail:... Sage Weil
04:31 PM Bug #24598: ceph_test_objecstore: bluefs mount fail with overlapping op_alloc_add
This looks like a BitmapAllocator bug:
First, during mount,...
Sage Weil
08:57 AM Bug #24319 (Resolved): ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixNoCsum/2 failed fsck wi...
Nathan Cutler
08:57 AM Bug #24550 (Resolved): collection sequencers are not reused; delete and create collection reordered
Nathan Cutler
08:57 AM Backport #24581 (Resolved): mimic: collection sequencers are not reused; delete and create collec...
Nathan Cutler
08:54 AM Backport #24502 (Resolved): mimic: ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixNoCsum/2 fa...
Nathan Cutler
08:46 AM Backport #24503 (Resolved): luminous: ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixNoCsum/2...
Nathan Cutler
12:30 AM Backport #24503: luminous: ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixNoCsum/2 failed fsc...
merged https://github.com/ceph/ceph/pull/22650 Yuri Weinstein

06/20/2018

07:54 PM Bug #24598 (Resolved): ceph_test_objecstore: bluefs mount fail with overlapping op_alloc_add
... Sage Weil
07:46 PM Backport #24582 (Rejected): luminous: collection sequencers are not reused; delete and create col...
affected code not in luminous Sage Weil
 
