Activity
From 02/20/2019 to 03/21/2019
03/21/2019
- 12:18 AM Feature #38816: Deferred writes do not work for random writes
- I want bluestore to be able to buffer (defer), say, 30 seconds of random writes in RocksDB at SSD speed. I expect back...
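For context, the existing deferred-write control is a size threshold rather than a time window; a minimal ceph.conf sketch, with an illustrative value only (see Bug #38489 below for how the threshold is applied per blob):
[osd]
# writes whose blobs are at or below this size take the deferred (RocksDB WAL) path on HDD-backed OSDs
bluestore_prefer_deferred_size_hdd = 65536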
03/20/2019
- 10:55 AM Bug #38738: ceph ssd osd latency increase over time, until restart
- hoan nv wrote:
> Do you have any temporary solution for this issue?
>
> I tried moving the device class from ssd to hdd bu...
- 08:48 AM Bug #38738: ceph ssd osd latency increase over time, until restart
- Do you have any temporary solution for this issue?
I tried moving the device class from ssd to hdd but no luck.
My clust...
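For reference, reassigning an OSD's device class is normally done with the CRUSH device-class commands; a minimal sketch, with osd.3 as a placeholder id:
ceph osd crush rm-device-class osd.3
ceph osd crush set-device-class hdd osd.3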
03/19/2019
- 07:16 PM Feature #38816 (In Progress): Deferred writes do not work for random writes
- Well, how to reproduce:
osd.11 is a bluestore OSD with RocksDB on SSD, and main data on HDD.
ceph osd pool cr...
- 05:51 PM Bug #38795: fsck on mkfs breaks ObjectStore/StoreTestSpecificAUSize.BlobReuseOnOverwrite
- The issue persists until the second cache rebalance occurs after fsck completion. So a workaround for the UT might be to wa...
03/18/2019
- 03:20 PM Bug #38738: ceph ssd osd latency increase over time, until restart
- hoan nv wrote:
> I have the same issue.
The Ceph version is 13.2.2.
- 03:19 PM Bug #38738: ceph ssd osd latency increase over time, until restart
- I have the same issue.
- 12:20 PM Bug #38795 (Resolved): fsck on mkfs breaks ObjectStore/StoreTestSpecificAUSize.BlobReuseOnOverwrite
- If bluestore_fsck_on_mkfs is enabled, the test case fails in Mimic and Luminous:
[ RUN ] ObjectStore/StoreTestSp...
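A sketch of how the failing case can be run in isolation; the binary name and filter are assumed from the test title, and the triggering option can be forced on the command line:
ceph_test_objectstore --gtest_filter='ObjectStore/StoreTestSpecificAUSize.BlobReuseOnOverwrite/*' --bluestore_fsck_on_mkfs=true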
03/15/2019
- 03:16 PM Backport #38779 (In Progress): mimic: ceph_test_objecstore: bluefs mount fail with overlapping op...
- 03:14 PM Backport #38779 (Resolved): mimic: ceph_test_objecstore: bluefs mount fail with overlapping op_al...
- https://github.com/ceph/ceph/pull/26983
- 03:15 PM Backport #38778 (In Progress): luminous: ceph_test_objecstore: bluefs mount fail with overlapping...
- 03:14 PM Backport #38778 (Resolved): luminous: ceph_test_objecstore: bluefs mount fail with overlapping op...
- https://github.com/ceph/ceph/pull/26979
- 03:13 PM Bug #24598 (Pending Backport): ceph_test_objecstore: bluefs mount fail with overlapping op_alloc_add
- 12:52 PM Bug #38761 (Fix Under Review): Bitmap allocator might fail to return contiguous chunk despite hav...
- 11:16 AM Bug #38761 (Resolved): Bitmap allocator might fail to return contiguous chunk despite having enou...
- This happens when the allocator has only contiguous 4GB-aligned chunks to allocate from. Internal stuff searching for fre...
- 12:51 PM Bug #38760 (Fix Under Review): BlueFS might request more space from slow device than is actually ...
- 11:09 AM Bug #38760 (Resolved): BlueFS might request more space from slow device than is actually needed
- When expanding the slow device, BlueFS has two sizes - one that it actually needs for the current action and one that is a ...
03/14/2019
- 10:08 PM Bug #38745 (In Progress): spillover that doesn't make sense
- ...
- 11:52 AM Bug #38738: ceph ssd osd latency increase over time, until restart
- Anton,
There is a thread named "ceph osd commit latency increase over time, until
restart" at the ceph-users mail li...
- 10:38 AM Bug #38738 (Resolved): ceph ssd osd latency increase over time, until restart
- We observe that disk latency for VMs on the SSD pool increases over time.
The VM disk latency is normally 0.5-3 ms.
The VM ...
- 09:55 AM Bug #38363: Failure in assert when calling: ceph-volume lvm prepare --bluestore --data /dev/sdg
- I tested more with exactly the same hardware (PowerEdge R730xd). I tried to set up Ceph Luminous on Ubuntu 16.04 and ...
03/13/2019
- 09:16 AM Support #38707 (Closed): Ceph OSD Down & Out - can't bring back up - Caught signal (Segmentation ...
- I noticed that in my 3-node, 12-osd cluster (3 OSD per Node), one node has all 3 of its OSDs marked "Down" and "Out"....
03/12/2019
- 03:39 PM Bug #38559: 50-100% iops lost due to bluefs_preextend_wal_files = false
- Yes, I've thought of that but I haven't tested it... However, this is rather strange then. Who does the fsync if BlueF...
- 03:00 PM Bug #38559: 50-100% iops lost due to bluefs_preextend_wal_files = false
- 02:59 PM Bug #38559: 50-100% iops lost due to bluefs_preextend_wal_files = false
- This goes away after you write more metadata into RocksDB and it starts overwriting previous WAL files. The purpose o...
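For anyone checking what a running OSD is actually using, the value can be read over the admin socket; a minimal sketch with osd.0 as a placeholder:
ceph daemon osd.0 config get bluefs_preextend_wal_files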
- 12:13 PM Bug #38574 (Resolved): mimic: Unable to recover from ENOSPC in BlueFS
- 02:31 AM Backport #38586 (In Progress): luminous: OSD crashes in get_str_map while creating with ceph-volume
- https://github.com/ceph/ceph/pull/26900
03/11/2019
- 08:18 PM Bug #38272 (Fix Under Review): "no available blob id" assertion might occur
- 07:54 PM Bug #38395 (Resolved): luminous: write following remove might access previous onode
- 07:41 PM Bug #38395: luminous: write following remove might access previous onode
- https://github.com/ceph/ceph/pull/26540 merged
- 04:57 PM Backport #38663 (Resolved): luminous: mimic: Unable to recover from ENOSPC in BlueFS
- https://github.com/ceph/ceph/pull/26866
- 01:45 PM Backport #38663 (In Progress): luminous: mimic: Unable to recover from ENOSPC in BlueFS
- 01:41 PM Backport #38663 (Resolved): luminous: mimic: Unable to recover from ENOSPC in BlueFS
- https://github.com/ceph/ceph/pull/26866
03/08/2019
- 08:41 PM Bug #38574 (Pending Backport): mimic: Unable to recover from ENOSPC in BlueFS
- 08:39 PM Bug #38574: mimic: Unable to recover from ENOSPC in BlueFS
- https://github.com/ceph/ceph/pull/26735 merged
- 01:32 PM Bug #38329: OSD crashes in get_str_map while creating with ceph-volume
- FYI and FWIW, Boris Ranto put 14.0.1 into F30/rawhide. It's sort of Standard Operating Procedure (SOP) to put early r...
- 10:34 AM Bug #38637: BlueStore::ExtentMap::fault_range() assert
- Can you make sure the underlying device is OK as a first step? This error might indicate corruption. It may also be b...
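One possible way to do that first check (the device name is a placeholder):
smartctl -a /dev/sdX            # look for reallocated/pending sector counts
dmesg | grep -i 'i/o error'     # kernel-level read/write errors on the device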
- 09:17 AM Bug #38637 (Won't Fix): BlueStore::ExtentMap::fault_range() assert
- Hi,
I have Rook with Ceph 12.2.4,
3 Mon, 5 OSD.
For the last few hours one of my OSDs has been in a crash loop.
<...
03/07/2019
- 01:49 PM Bug #38557 (Closed): pkg dependency issues upgrading from 12.2.y to 14.x.y
- 07:14 AM Backport #38587 (In Progress): mimic: OSD crashes in get_str_map while creating with ceph-volume
03/06/2019
- 09:55 PM Bug #38489: bluestore_prefer_deferred_size_hdd units are not clear
- So that's why write_big operations may also be deferred, just like write_small ones. OK, thank you very much, it's clear now.
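A minimal sketch of how to confirm which threshold is in effect and whether writes actually took the deferred path, with osd.0 as a placeholder:
ceph daemon osd.0 config get bluestore_prefer_deferred_size_hdd
ceph daemon osd.0 perf dump | grep -i deferred    # deferred_write_ops / deferred_write_bytes counters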
- 08:29 PM Bug #38489: bluestore_prefer_deferred_size_hdd units are not clear
- It's not deferring because at the layer where deferring happens, we're talking about blobs (not writes), and the blobs...
- 04:08 PM Bug #38489: bluestore_prefer_deferred_size_hdd units are not clear
- Forgot to mention, this was Ceph 14.1.0
- 09:25 AM Bug #38489: bluestore_prefer_deferred_size_hdd units are not clear
- I've just tried to set
[osd]
bluestore_prefer_deferred_size_hdd = 4194304
On a test HDD plugged into my laptop. ...
- 06:54 PM Bug #38557: pkg dependency issues upgrading from 12.2.y to 14.x.y
- Accidentally opened against bluestore. You may close it.
See https://tracker.ceph.com/issues/38612 instead.
03/05/2019
- 05:45 PM Backport #38587 (Resolved): mimic: OSD crashes in get_str_map while creating with ceph-volume
- https://github.com/ceph/ceph/pull/26810
- 05:45 PM Backport #38586 (Resolved): luminous: OSD crashes in get_str_map while creating with ceph-volume
- https://github.com/ceph/ceph/pull/26900
- 03:18 PM Bug #38489: bluestore_prefer_deferred_size_hdd units are not clear
- I've just verified deferred write behavior for 4M writes using the objectstore FIO plugin.
Indeed bluestore splits writ...
- 08:10 AM Bug #38489: bluestore_prefer_deferred_size_hdd units are not clear
- Sage Weil wrote:
> > all writes of size 4MB with bluestore_prefer_deferred_size_hdd < 524288 go HDD directly. >= 524...
- 11:46 AM Bug #38363: Failure in assert when calling: ceph-volume lvm prepare --bluestore --data /dev/sdg
- I finally found the extended debug log in /var/log/ceph/ceph-osd.0.log. I attached the log output file (44k) to this ...
03/04/2019
- 11:23 PM Bug #38489: bluestore_prefer_deferred_size_hdd units are not clear
- > all writes of size 4MB with bluestore_prefer_deferred_size_hdd < 524288 go HDD directly. >= 524288 through SSD (I m...
- 05:55 PM Bug #38574 (Resolved): mimic: Unable to recover from ENOSPC in BlueFS
- This is the same issue as https://tracker.ceph.com/issues/36268.
We have an alternate fix for mimic, which will be backpor...
- 03:36 PM Bug #38329 (Pending Backport): OSD crashes in get_str_map while creating with ceph-volume
- 03:23 PM Bug #36268 (Resolved): Unable to recover from ENOSPC in BlueFS
- Alternative fix for mimic and luminous: https://github.com/ceph/ceph/pull/26735
03/03/2019
- 08:07 PM Bug #38559 (Resolved): 50-100% iops lost due to bluefs_preextend_wal_files = false
- Hi.
I was investigating why RocksDB performance is so bad in terms of random 4K iops. I was looking at strace and ...
- 01:30 PM Bug #25077: Occasional assertion in ObjectStore/StoreTest.HashCollisionTest/2
- We have upgraded to 12.2.11. During reboots the following would pass by:
[16:20:59] @ bitrot: osd.17 [ERR] 7...
- 11:55 AM Bug #38557 (Closed): pkg dependency issues upgrading from 12.2.y to 14.x.y
- Description of problem:
With respect to https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/...
03/02/2019
- 02:28 PM Bug #38554 (Duplicate): ObjectStore/StoreTestSpecificAUSize.TooManyBlobsTest/2 fail, Expected: (r...
- ...
03/01/2019
- 04:44 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
- FYI, I think I hit another case of this in the last two weeks.
An RGW-only case where if you would list...
- 03:27 PM Bug #36455 (Resolved): BlueStore: ENODATA not fully handled
- 03:27 PM Backport #37825 (Resolved): luminous: BlueStore: ENODATA not fully handled
- 03:10 PM Backport #36641 (New): mimic: Unable to recover from ENOSPC in BlueFS
- 03:10 PM Backport #36640 (New): luminous: Unable to recover from ENOSPC in BlueFS
- 03:09 PM Bug #36268 (Pending Backport): Unable to recover from ENOSPC in BlueFS
- Sage, did you mean to cancel the mimic and luminous backports when you changed the status to Resolved?
- 10:46 AM Bug #25077 (New): Occasional assertion in ObjectStore/StoreTest.HashCollisionTest/2
- 10:46 AM Bug #25077: Occasional assertion in ObjectStore/StoreTest.HashCollisionTest/2
- Stefan Kooman wrote:
> @Igor Fedotov:
>
> We are using the ceph balancer to get PGs balanced across the cluster. The...
- 07:56 AM Bug #38363: Failure in assert when calling: ceph-volume lvm prepare --bluestore --data /dev/sdg
- I tried but the output of ceph-volume remains the same....
I added this to /etc/ceph/ceph.conf on my testing nod...
02/28/2019
- 07:30 PM Backport #37825: luminous: BlueStore: ENODATA not fully handled
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/25855
merged
- 05:15 PM Bug #37282: rocksdb: submit_transaction_sync error: Corruption: block checksum mismatch code = 2
- We're not sure how to proceed without being able to reproduce the crash, and we have never seen this.
1. Would it...
- 04:41 PM Bug #38329 (Fix Under Review): OSD crashes in get_str_map while creating with ceph-volume
- Reproduced this and got a core.
I think the problem is an empty string passed to trim() in str_map.cc. Fix here: h...
- 03:37 PM Bug #23206 (Rejected): ceph-osd daemon crashes - *** Caught signal (Aborted) **
- Not enough info.
- 03:35 PM Bug #24639 (Can't reproduce): [segfault] segfault in BlueFS::read
- Sounds like a hardware problem then!
- 03:34 PM Bug #25098: Bluestore OSD failed to start with `bluefs_types.h: 54: FAILED assert(pos <= end)`
- Current status:
We want a more concrete source of truth for whether the db and/or wal partitions should exist--som...
- 03:31 PM Bug #34526 (Duplicate): OSD crash in KernelDevice::direct_read_unaligned while scrubbing
- 09:55 AM Bug #34526: OSD crash in KernelDevice::direct_read_unaligned while scrubbing
- IMO this is BlueStore (or more precisely BlueFS and/or RocksDB) related.
And I think it's a duplicate of #36482.
O...
- 03:30 PM Bug #36268 (Resolved): Unable to recover from ENOSPC in BlueFS
- 03:30 PM Bug #36331 (Need More Info): FAILED ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixNoCsum/2 ...
- This was an Ubuntu 18.04 kernel. Maybe this was the pread vs swap zeroed pages kernel bug?
I think we need anothe...
- 03:27 PM Bug #36364 (Can't reproduce): Bluestore OSD IO Hangs near Flush (flush in 90.330556)
- 03:27 PM Bug #38049 (Resolved): random osds failing in thread_name:bstore_kv_final
- 03:23 PM Bug #38250 (Need More Info): assert failure crash prevents ceph-osd from running
- Is the errno EIO in this case?
On read error we do crash and fail the OSD. There is generally no recovery path fo...
- 03:18 PM Bug #38272 (In Progress): "no available blob id" assertion might occur
- 03:16 PM Bug #38363 (Need More Info): Failure in assert when calling: ceph-volume lvm prepare --bluestore ...
- Can you reproduce this with debug_bluestore=20, debug_bluefs=20, debug_bdev=20?
Thanks!
- 03:14 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
- - it looks like implementing readahead in bluefs would help
- we think newer rocksdb does its own readahead
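A possible way to watch the BlueFS read counters such a listing drives, assuming the admin socket accepts a logger name (osd.0 is a placeholder):
ceph daemon osd.0 perf dump bluefs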
- 02:41 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
- We've got another occurrence of this issue too.
Omap listing for a specific onode consistently takes ~2 mins while d...
- 02:19 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
- I think this is the same issue:
https://marc.info/?l=ceph-devel&m=155134206210976&w=2
- 03:04 PM Bug #37914 (Can't reproduce): bluestore: segmentation fault
- No logs or core. Hoping it was the hypercombined bufferlist memory corruption issue.
- 09:57 AM Bug #25077: Occasional assertion in ObjectStore/StoreTest.HashCollisionTest/2
- @Igor Fedotov:
We are using the ceph balancer to get PGs balanced across the cluster. The day after the crashes, the ...
02/27/2019
- 11:18 AM Bug #25077: Occasional assertion in ObjectStore/StoreTest.HashCollisionTest/2
- Check, I'll collect the needed information. Note, during the restarts of the storage servers the *same* OSDs crashed ...
- 07:09 AM Feature #38494: Bluestore: issue discards on everything non-discarded during deep-scrubs
- The included link is just a related PR.
- 07:07 AM Feature #38494: Bluestore: issue discards on everything non-discarded during deep-scrubs
- The text formatting of the previous message is wrong. I did not want to strike out the text.
- 07:07 AM Feature #38494 (New): Bluestore: issue discards on everything non-discarded during deep-scrubs
- Yes, we have bdev_enable_discard and bdev_async_discard, but they are not documented.
Ubuntu issues ...
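A minimal ceph.conf sketch of the two undocumented options mentioned above; their exact behavior may differ between releases:
[osd]
bdev_enable_discard = true
bdev_async_discard = true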
02/26/2019
- 09:25 PM Bug #38489 (Resolved): bluestore_prefer_deferred_size_hdd units are not clear
- I have done an experiment. I made a pool with one PG of size 1. Next I ran this command:
rados bench -p qwe -b 4M ...
- 02:43 PM Bug #25077: Occasional assertion in ObjectStore/StoreTest.HashCollisionTest/2
- Stefan, to make sure (as best we can) this is exactly the same bug, could you please check PG states with ceph-objectsto...
02/22/2019
- 04:05 PM Bug #37733 (Resolved): os/bluestore: fixup access a destroy cond cause deadlock or undefine behav...
- 04:05 PM Backport #38142 (Resolved): luminous: os/bluestore: fixup access a destroy cond cause deadlock or...
- 03:32 PM Bug #38329: OSD crashes in get_str_map while creating with ceph-volume
- Added related-to link to #38144 where the GCC 9 FTBFS is being discussed. A patch has been proposed there, but it inc...
02/21/2019
- 09:52 PM Backport #38142: luminous: os/bluestore: fixup access a destroy cond cause deadlock or undefine b...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/26261
merged
- 03:49 PM Bug #38329: OSD crashes in get_str_map while creating with ceph-volume
- (original reporter here)
I have the following customisation in ceph.conf:...
- 01:00 PM Bug #25077: Occasional assertion in ObjectStore/StoreTest.HashCollisionTest/2
- We *think* we have hit this same issue "in the field" on a Luminous 12.2.8 cluster:
2019-02-20 18:42:45.261357 7fd...
02/20/2019
- 11:15 AM Bug #38395 (Fix Under Review): luminous: write following remove might access previous onode
- 10:35 AM Bug #38395: luminous: write following remove might access previous onode
- 10:25 AM Bug #38395 (Resolved): luminous: write following remove might access previous onode
- So the sequence is as follows:
T1:
remove A
T2:
touch A
write A
In Luminous there is a chance that A is rem...
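At the client level the racing pattern corresponds roughly to a delete followed immediately by a recreate of the same object; a hedged sketch with placeholder pool/object names:
rados -p rbd put objA /tmp/data    # object exists
rados -p rbd rm objA               # T1: remove A
rados -p rbd put objA /tmp/data    # T2: recreate and write A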