Activity

From 02/20/2019 to 03/21/2019

03/21/2019

12:18 AM Feature #38816: Deferred writes do not work for random writes
I want bluestore to be able to buffer (defer), say, 30 seconds of random writes in RocksDB at SSD speed. I expect back... Марк Коренберг

03/20/2019

10:55 AM Bug #38738: ceph ssd osd latency increase over time, until restart
hoan nv wrote:
> Do you have a temporary solution for this issue?
>
> I tried moving the device class from ssd to hdd bu...
Igor Fedotov
08:48 AM Bug #38738: ceph ssd osd latency increase over time, until restart
Do you have a temporary solution for this issue?
I tried moving the device class from ssd to hdd but no luck.
My clust...
hoan nv

03/19/2019

07:16 PM Feature #38816 (In Progress): Deferred writes do not work for random writes
Well, how to reproduce:
osd.11 is a bluestore OSD with RocksDB on SSD, and main data on HDD.
ceph osd pool cr...
Марк Коренберг
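For illustration, a rough reproduction sketch along the lines of the steps above (pool name, PG count and I/O sizes are hypothetical; the deferred-write counters are part of BlueStore's perf counters):

    # create a small test pool and drive small 4K writes at it
    ceph osd pool create testpool 1 1
    rados -p testpool bench 60 write -b 4096 -t 16 --no-cleanup

    # check on the OSD whether the writes were actually deferred
    ceph daemon osd.11 perf dump | grep -E 'deferred_write_(ops|bytes)'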
05:51 PM Bug #38795: fsck on mkfs breaks ObjectStore/StoreTestSpecificAUSize.BlobReuseOnOverwrite
The issue persists until the second cache rebalance occurs after fsck completion. So a workaround for the UT might be to wa... Igor Fedotov

03/18/2019

03:20 PM Bug #38738: ceph ssd osd latency increase over time, until restart
hoan nv wrote:
> I have the same issue.
The Ceph version is 13.2.2.
hoan nv
03:19 PM Bug #38738: ceph ssd osd latency increase over time, until restart
I have the same issue.
hoan nv
12:20 PM Bug #38795 (Resolved): fsck on mkfs breaks ObjectStore/StoreTestSpecificAUSize.BlobReuseOnOverwrite
If bluestore_fsck_on_mkfs is enabled, the test case fails in Mimic and Luminous:
[ RUN ] ObjectStore/StoreTestSp...
Igor Fedotov

03/15/2019

03:16 PM Backport #38779 (In Progress): mimic: ceph_test_objecstore: bluefs mount fail with overlapping op...
Nathan Cutler
03:14 PM Backport #38779 (Resolved): mimic: ceph_test_objecstore: bluefs mount fail with overlapping op_al...
https://github.com/ceph/ceph/pull/26983 Nathan Cutler
03:15 PM Backport #38778 (In Progress): luminous: ceph_test_objecstore: bluefs mount fail with overlapping...
Nathan Cutler
03:14 PM Backport #38778 (Resolved): luminous: ceph_test_objecstore: bluefs mount fail with overlapping op...
https://github.com/ceph/ceph/pull/26979 Nathan Cutler
03:13 PM Bug #24598 (Pending Backport): ceph_test_objecstore: bluefs mount fail with overlapping op_alloc_add
Nathan Cutler
12:52 PM Bug #38761 (Fix Under Review): Bitmap allocator might fail to return contiguous chunk despite hav...
Igor Fedotov
11:16 AM Bug #38761 (Resolved): Bitmap allocator might fail to return contiguous chunk despite having enou...
This happens when the allocator has only contiguous 4GB-aligned chunks to allocate from. The internal logic searching for fre... Igor Fedotov
12:51 PM Bug #38760 (Fix Under Review): BlueFS might request more space from slow device than is actually ...
Igor Fedotov
11:09 AM Bug #38760 (Resolved): BlueFS might request more space from slow device than is actually needed
When expanding the slow device, BlueFS has two sizes: one that it actually needs for the current action and one that is a ... Igor Fedotov

03/14/2019

10:08 PM Bug #38745 (In Progress): spillover that doesn't make sense
... Sage Weil
11:52 AM Bug #38738: ceph ssd osd latency increase over time, until restart
Anton,
there is a thread named "ceph osd commit latency increase over time, until
restart" at ceph-users mail li...
Igor Fedotov
10:38 AM Bug #38738 (Resolved): ceph ssd osd latency increase over time, until restart
We observe that disk latency for VMs on the SSD pool increases over time.
The VM disk latency is normally 0.5-3 ms.
The VM ...
Anton Usanov
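For reference, the per-OSD commit/apply latencies discussed in this ticket can be watched cluster-wide, and in more detail per daemon (the OSD id below is just an example):

    ceph osd perf                                  # commit/apply latency per OSD, in ms
    ceph daemon osd.3 perf dump | grep -i latency  # detailed per-daemon latency counters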
09:55 AM Bug #38363: Failure in assert when calling: ceph-volume lvm prepare --bluestore --data /dev/sdg
I tested more with exactly the same hardware (PowerEdge R730xd). I tried to set up ceph luminous on Ubuntu 16.04 and ... Rainer Krienke

03/13/2019

09:16 AM Support #38707 (Closed): Ceph OSD Down & Out - can't bring back up - Caught signal (Segmentation ...
I noticed that in my 3-node, 12-osd cluster (3 OSD per Node), one node has all 3 of its OSDs marked "Down" and "Out".... Liam Retrams

03/12/2019

03:39 PM Bug #38559: 50-100% iops lost due to bluefs_preextend_wal_files = false
Yes, I've thought of that but I haven't tested it... However, this is rather strange then. Who does the fsync if BlueF... Vitaliy Filippov
03:00 PM Bug #38559: 50-100% iops lost due to bluefs_preextend_wal_files = false
Sage Weil
02:59 PM Bug #38559: 50-100% iops lost due to bluefs_preextend_wal_files = false
This goes away after you write more metadata into rocksdb and it starts overwriting previous wal files. The purpose o... Sage Weil
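For illustration, a sketch of how the option referenced in this ticket could be inspected or toggled for testing (whether enabling it is appropriate depends on the Ceph version):

    # show the current value on a running OSD via the admin socket
    ceph daemon osd.0 config get bluefs_preextend_wal_files

    # or set it for all OSDs through the config database (Mimic and later)
    ceph config set osd bluefs_preextend_wal_files true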
12:13 PM Bug #38574 (Resolved): mimic: Unable to recover from ENOSPC in BlueFS
Nathan Cutler
02:31 AM Backport #38586 (In Progress): luminous: OSD crashes in get_str_map while creating with ceph-volume
https://github.com/ceph/ceph/pull/26900 Prashant D

03/11/2019

08:18 PM Bug #38272 (Fix Under Review): "no available blob id" assertion might occur
Igor Fedotov
07:54 PM Bug #38395 (Resolved): luminous: write following remove might access previous onode
Igor Fedotov
07:41 PM Bug #38395: luminous: write following remove might access previous onode
https://github.com/ceph/ceph/pull/26540 merged Yuri Weinstein
04:57 PM Backport #38663 (Resolved): luminous: mimic: Unable to recover from ENOSPC in BlueFS
https://github.com/ceph/ceph/pull/26866 Neha Ojha
01:45 PM Backport #38663 (In Progress): luminous: mimic: Unable to recover from ENOSPC in BlueFS
Nathan Cutler
01:41 PM Backport #38663 (Resolved): luminous: mimic: Unable to recover from ENOSPC in BlueFS
https://github.com/ceph/ceph/pull/26866 Nathan Cutler

03/08/2019

08:41 PM Bug #38574 (Pending Backport): mimic: Unable to recover from ENOSPC in BlueFS
Neha Ojha
08:39 PM Bug #38574: mimic: Unable to recover from ENOSPC in BlueFS
https://github.com/ceph/ceph/pull/26735 merged Yuri Weinstein
01:32 PM Bug #38329: OSD crashes in get_str_map while creating with ceph-volume
FYI and FWIW, Boris Ranto put 14.0.1 into F30/rawhide. It's sort of Standard Operating Procedure (SOP) to put early r... Kaleb KEITHLEY
10:34 AM Bug #38637: BlueStore::ExtentMap::fault_range() assert
Can you make sure the underlying device is OK as a first step? This error might indicate corruption. It may also be b... Brad Hubbard
09:17 AM Bug #38637 (Won't Fix): BlueStore::ExtentMap::fault_range() assert
Hi,
I have Rook with ceph-12.2.4,
3 mons, 5 OSDs.
For the last few hours one of my OSDs has been in a crash loop.
<...
Karol Chrapek

03/07/2019

01:49 PM Bug #38557 (Closed): pkg dependency issues upgrading from 12.2.y to 14.x.y
Nathan Cutler
07:14 AM Backport #38587 (In Progress): mimic: OSD crashes in get_str_map while creating with ceph-volume
Ashish Singh

03/06/2019

09:55 PM Bug #38489: bluestore_prefer_deferred_size_hdd units are not clear
So that's why write_big operations may also be deferred, just like write_small ones. OK, thank you very much, it's clear now. Vitaliy Filippov
08:29 PM Bug #38489: bluestore_prefer_deferred_size_hdd units are not clear
It's not deferring because at the layer where deferring happens, we're talking about blobs (not writes), and the blobs... Sage Weil
04:08 PM Bug #38489: bluestore_prefer_deferred_size_hdd units are not clear
Forgot to mention, this was Ceph 14.1.0 Vitaliy Filippov
09:25 AM Bug #38489: bluestore_prefer_deferred_size_hdd units are not clear
I've just tried to set
[osd]
bluestore_prefer_deferred_size_hdd = 4194304
on a test HDD plugged into my laptop. ...
Vitaliy Filippov
06:54 PM Bug #38557: pkg dependency issues upgrading from 12.2.y to 14.x.y
Accidentally opened against bluestore. You may close it.
See https://tracker.ceph.com/issues/38612 instead.
Kaleb KEITHLEY

03/05/2019

05:45 PM Backport #38587 (Resolved): mimic: OSD crashes in get_str_map while creating with ceph-volume
https://github.com/ceph/ceph/pull/26810 Nathan Cutler
05:45 PM Backport #38586 (Resolved): luminous: OSD crashes in get_str_map while creating with ceph-volume
https://github.com/ceph/ceph/pull/26900 Nathan Cutler
03:18 PM Bug #38489: bluestore_prefer_deferred_size_hdd units are not clear
I've just verified deferred write behavior for 4M writes using the objectstore FIO plugin.
Indeed, bluestore splits writ...
Igor Fedotov
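For reference, a minimal fio job along the lines of the objectstore plugin's sample job file; the plugin-specific options (ioengine path, conf, directory) are as recalled from that sample and may differ between versions, and ceph-bluestore.conf is assumed to carry the bluestore_prefer_deferred_size_hdd setting under test:

    [global]
    ioengine=external:libfio_ceph_objectstore.so  # must be on LD_LIBRARY_PATH
    conf=ceph-bluestore.conf                      # ceph config used by the plugin
    directory=./fio-bluestore                     # scratch osd_data directory
    rw=randwrite
    bs=4m
    time_based=1
    runtime=30s

    [bluestore]
    nr_files=16
    size=256m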
08:10 AM Bug #38489: bluestore_prefer_deferred_size_hdd units are not clear
Sage Weil wrote:
> > all writes of size 4MB with bluestore_prefer_deferred_size_hdd < 524288 go HDD directly. >= 524...
Марк Коренберг
11:46 AM Bug #38363: Failure in assert when calling: ceph-volume lvm prepare --bluestore --data /dev/sdg
I finally found the extended debug log in /var/log/ceph/ceph-osd.0.log. I attached the log output file (44k) to this ... Rainer Krienke

03/04/2019

11:23 PM Bug #38489: bluestore_prefer_deferred_size_hdd units are not clear
> all writes of size 4MB with bluestore_prefer_deferred_size_hdd < 524288 go HDD directly. >= 524288 through SSD (I m... Sage Weil
05:55 PM Bug #38574 (Resolved): mimic: Unable to recover from ENOSPC in BlueFS
This is the same issue as https://tracker.ceph.com/issues/36268.
We have an alternate fix for mimic, which will be backpor...
Neha Ojha
03:36 PM Bug #38329 (Pending Backport): OSD crashes in get_str_map while creating with ceph-volume
Sage Weil
03:23 PM Bug #36268 (Resolved): Unable to recover from ENOSPC in BlueFS
Alternative fix for mimic and luminous: https://github.com/ceph/ceph/pull/26735 Sage Weil

03/03/2019

08:07 PM Bug #38559 (Resolved): 50-100% iops lost due to bluefs_preextend_wal_files = false
Hi.
I was investigating why RocksDB performance is so bad in terms of random 4K iops. I was looking at strace and ...
Vitaliy Filippov
01:30 PM Bug #25077: Occasional assertion in ObjectStore/StoreTest.HashCollisionTest/2
We have upgraded to 12.2.11. During reboots the following would pass by:
[16:20:59] @ bitrot: osd.17 [ERR] 7...
Stefan Kooman
11:55 AM Bug #38557 (Closed): pkg dependency issues upgrading from 12.2.y to 14.x.y
Description of problem:
With respect to https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/...
Kaleb KEITHLEY

03/02/2019

02:28 PM Bug #38554 (Duplicate): ObjectStore/StoreTestSpecificAUSize.TooManyBlobsTest/2 fail, Expected: (r...
... Sage Weil

03/01/2019

04:44 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
FYI, I think I hit another case of this in the last two weeks.
An RGW-only case where, if you would list...
Wido den Hollander
03:27 PM Bug #36455 (Resolved): BlueStore: ENODATA not fully handled
Nathan Cutler
03:27 PM Backport #37825 (Resolved): luminous: BlueStore: ENODATA not fully handled
Nathan Cutler
03:10 PM Backport #36641 (New): mimic: Unable to recover from ENOSPC in BlueFS
Nathan Cutler
03:10 PM Backport #36640 (New): luminous: Unable to recover from ENOSPC in BlueFS
Nathan Cutler
03:09 PM Bug #36268 (Pending Backport): Unable to recover from ENOSPC in BlueFS
Sage, did you mean to cancel the mimic and luminous backports when you changed the status to Resolved? Nathan Cutler
10:46 AM Bug #25077 (New): Occasional assertion in ObjectStore/StoreTest.HashCollisionTest/2
Igor Fedotov
10:46 AM Bug #25077: Occasional assertion in ObjectStore/StoreTest.HashCollisionTest/2
Stefan Kooman wrote:
> @Igor Fedotov:
>
> We are using the ceph balancer to get PGs balanced across the cluster. The...
Igor Fedotov
07:56 AM Bug #38363: Failure in assert when calling: ceph-volume lvm prepare --bluestore --data /dev/sdg
I tried but the output of ceph-volume remains the same....
I added this to /etc/ceph/ceph.conf on my testing nod...
Rainer Krienke

02/28/2019

07:30 PM Backport #37825: luminous: BlueStore: ENODATA not fully handled
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/25855
merged
Yuri Weinstein
05:15 PM Bug #37282: rocksdb: submit_transaction_sync error: Corruption: block checksum mismatch code = 2
We're not sure how to proceed without being able to reproduce the crash, and we have never seen this.
1. Would it...
Sage Weil
04:41 PM Bug #38329 (Fix Under Review): OSD crashes in get_str_map while creating with ceph-volume
Reproduced this and got a core.
I think the problem is an empty string passed to trim() in str_map.cc. Fix here: h...
Sage Weil
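For illustration only, a self-contained example of how a naive trim can blow up on an empty string; this is hypothetical code, not the actual str_map.cc implementation:

    #include <iostream>
    #include <string>

    // Naive trim: strip leading/trailing whitespace.
    // For an empty (or all-whitespace) input both searches return npos,
    // so substr() is called with an out-of-range position.
    std::string naive_trim(const std::string& s) {
      size_t b = s.find_first_not_of(" \t");
      size_t e = s.find_last_not_of(" \t");
      return s.substr(b, e - b + 1);  // throws std::out_of_range when b == npos
    }

    int main() {
      std::cout << naive_trim("  key = value ") << "\n";  // fine: "key = value"
      std::cout << naive_trim("") << "\n";                // blows up: b == npos
    }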
03:37 PM Bug #23206 (Rejected): ceph-osd daemon crashes - *** Caught signal (Aborted) **
Not enough info. Sage Weil
03:35 PM Bug #24639 (Can't reproduce): [segfault] segfault in BlueFS::read
Sounds like a hardware problem then! Sage Weil
03:34 PM Bug #25098: Bluestore OSD failed to start with `bluefs_types.h: 54: FAILED assert(pos <= end)`
Current status:
We want a more concrete source of truth for whether the db and/or wal partitions should exist--som...
Sage Weil
03:31 PM Bug #34526 (Duplicate): OSD crash in KernelDevice::direct_read_unaligned while scrubbing
Sage Weil
09:55 AM Bug #34526: OSD crash in KernelDevice::direct_read_unaligned while scrubbing
IMO this is BlueStore (or, more precisely, BlueFS and/or RocksDB) related.
And I think it's a duplicate of #36482.
O...
Igor Fedotov
03:30 PM Bug #36268 (Resolved): Unable to recover from ENOSPC in BlueFS
Sage Weil
03:30 PM Bug #36331 (Need More Info): FAILED ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixNoCsum/2 ...
This was an Ubuntu 18.04 kernel. Maybe this was the pread vs swap zeroed pages kernel bug?
I think we need anothe...
Sage Weil
03:27 PM Bug #36364 (Can't reproduce): Bluestore OSD IO Hangs near Flush (flush in 90.330556)
Sage Weil
03:27 PM Bug #38049 (Resolved): random osds failing in thread_name:bstore_kv_final
Sage Weil
03:23 PM Bug #38250 (Need More Info): assert failure crash prevents ceph-osd from running
Is the errno EIO in this case?
On read error we do crash and fail the OSD. There is generally no recovery path fo...
Sage Weil
03:18 PM Bug #38272 (In Progress): "no available blob id" assertion might occur
Sage Weil
03:16 PM Bug #38363 (Need More Info): Failure in assert when calling: ceph-volume lvm prepare --bluestore ...
Can you reproduce this with debug_bluestore=20, debug_bluefs=20, debug_bdev=20?
Thanks!
Sage Weil
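For example, the requested levels could be raised in /etc/ceph/ceph.conf on the node running ceph-volume before retrying (a sketch; [global] would work as well):

    [osd]
    debug_bluestore = 20
    debug_bluefs = 20
    debug_bdev = 20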
03:14 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
- it looks like implementing readahead in bluefs would help
- we think newer rocksdb does its own readahead
Sage Weil
02:41 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
We've got another occurrence of this issue too.
Omap listing for a specific onode consistently takes ~2 mins while d...
Igor Fedotov
02:19 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
I think this is the same issue:
https://marc.info/?l=ceph-devel&m=155134206210976&w=2
Igor Fedotov
03:04 PM Bug #37914 (Can't reproduce): bluestore: segmentation fault
No logs or core. Hoping it was the hypercombined bufferlist memory corruption issue. Sage Weil
09:57 AM Bug #25077: Occasional assertion in ObjectStore/StoreTest.HashCollisionTest/2
@Igor Fedotov:
We are using the ceph balancer to get PGs balanced across the cluster. The day after the crashes, the ...
Stefan Kooman

02/27/2019

11:18 AM Bug #25077: Occasional assertion in ObjectStore/StoreTest.HashCollisionTest/2
Check, I'll collect the needed information. Note, during the restarts of the storage servers the *same* OSDs crashed ... Stefan Kooman
07:09 AM Feature #38494: Bluestore: issue discards on everything non-discarded during deep-scrubs
The included link is just the related PR. Марк Коренберг
07:07 AM Feature #38494: Bluestore: issue discards on everything non-discarded during deep-scrubs
The text formatting of the previous message is wrong. I did not want to strike out the text. Марк Коренберг
07:07 AM Feature #38494 (New): Bluestore: issue discards on everything non-discarded during deep-scrubs
Yes, we have bdev_enable_discard and bdev_async_discard, but they are not documented.
Ubuntu issues ...
Марк Коренберг
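For reference, the undocumented options mentioned above would be enabled along these lines (a sketch; defaults and discard behavior differ between releases):

    [osd]
    bdev_enable_discard = true
    bdev_async_discard = true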

02/26/2019

09:25 PM Bug #38489 (Resolved): bluestore_prefer_deferred_size_hdd units are not clear
I have done an experiment. I made a pool with one PG of size 1. Next I ran this command:
rados bench -p qwe -b 4M ...
Марк Коренберг
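For illustration, a sketch of a similar experiment; this is not the reporter's exact (truncated) command, and the OSD id is just an example:

    # one-PG pool of size 1, as described above
    ceph osd pool create qwe 1 1
    ceph osd pool set qwe size 1

    # 4 MiB writes, then check whether they were deferred on the OSD
    rados -p qwe bench 30 write -b 4194304 -t 1 --no-cleanup
    ceph daemon osd.0 perf dump | grep -E 'deferred_write_(ops|bytes)'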
02:43 PM Bug #25077: Occasional assertion in ObjectStore/StoreTest.HashCollisionTest/2
Stefan, to make sure (at our best) this is exactly the same bug, could you please check PG states with ceph-objectsto... Igor Fedotov

02/22/2019

04:05 PM Bug #37733 (Resolved): os/bluestore: fixup access a destroy cond cause deadlock or undefine behav...
Nathan Cutler
04:05 PM Backport #38142 (Resolved): luminous: os/bluestore: fixup access a destroy cond cause deadlock or...
Nathan Cutler
03:32 PM Bug #38329: OSD crashes in get_str_map while creating with ceph-volume
Added related-to link to #38144 where the GCC 9 FTBFS is being discussed. A patch has been proposed there, but it inc... Nathan Cutler

02/21/2019

09:52 PM Backport #38142: luminous: os/bluestore: fixup access a destroy cond cause deadlock or undefine b...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/26261
merged
Yuri Weinstein
03:49 PM Bug #38329: OSD crashes in get_str_map while creating with ceph-volume
(original reporter here)
I have the following customisation in ceph.conf:...
Tomasz Torcz
01:00 PM Bug #25077: Occasional assertion in ObjectStore/StoreTest.HashCollisionTest/2
We *think* we have hit this same issue "in the field" on a Luminous 12.2.8 cluster:
2019-02-20 18:42:45.261357 7fd...
Stefan Kooman

02/20/2019

11:15 AM Bug #38395 (Fix Under Review): luminous: write following remove might access previous onode
Igor Fedotov
10:35 AM Bug #38395: luminous: write following remove might access previous onode
Igor Fedotov
10:25 AM Bug #38395 (Resolved): luminous: write following remove might access previous onode
So the sequence is as follows:
T1:
  remove A
T2:
  touch A
  write A
In Luminous there is a chance that A is rem...
Igor Fedotov
 
