Activity

From 05/03/2020 to 06/01/2020

06/01/2020

10:21 PM Bug #45519: OSD asserts during block allocation for BlueFS
For the record, the following thread has some helpful info on the assertion: h->file->fnode.ino != 1
I did som...
Igor Fedotov
10:07 PM Bug #45519: OSD asserts during block allocation for BlueFS
Aleksei Zakharov wrote:
>
> Turning on the stupid allocator with a "bad" OSD made it impossible to start this ...
Igor Fedotov
10:47 AM Bug #45519: OSD asserts during block allocation for BlueFS
We added some new OSDs. During the backfill process we were affected by another issue (https://tracker.ceph.com/issues/45... Aleksei Zakharov
10:16 PM Bug #45765: BlueStore::_collection_list causes huge latency growth pg deletion
Anyway suggesting to compact DB manually.
Besides, we recently switched back to direct IO for bluefs, see https:/...
Igor Fedotov
10:18 AM Bug #45765: BlueStore::_collection_list causes huge latency growth pg deletion
Nope, we use NVMe SSDs.
The issue starts with high page cache usage. Bluefs uses buffered IO, it reads a lot from...
Aleksei Zakharov
09:30 PM Bug #45613 (Resolved): ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompression/2 failed
Igor Fedotov
03:48 PM Bug #45613: ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompression/2 failed
https://github.com/ceph/ceph/pull/35293 - Fix for octopus has been merged and released in 15.2.3. Neha Ojha

05/30/2020

03:29 PM Bug #44359: Raw usage reported by 'ceph osd df' incorrect when using WAL/DB on another drive
Same here. Fresh Cluster - completely empty. "Raw Use" corresponds to Size of DB+WAL/DB Partition located on separate... Tobias Fischer

05/29/2020

07:32 PM Bug #45788 (Fix Under Review): ObjectStore/StoreTestSpecificAUSize.ExcessiveFragmentation/2 failed
Igor Fedotov
07:32 PM Bug #45788 (Pending Backport): ObjectStore/StoreTestSpecificAUSize.ExcessiveFragmentation/2 failed
Igor Fedotov
07:14 PM Bug #45788 (Resolved): ObjectStore/StoreTestSpecificAUSize.ExcessiveFragmentation/2 failed
Originally appeared in https://tracker.ceph.com/issues/45519
but cloning to another ticket since it looks unrelated.
...
Igor Fedotov
07:17 PM Bug #45519: OSD asserts during block allocation for BlueFS
@Neha, @Kefu - made another ticket for QA test case failure, IMO unrelated.
https://tracker.ceph.com/issues/45788
Igor Fedotov
04:20 PM Bug #45519: OSD asserts during block allocation for BlueFS
We have similar issues on our cluster. ~1 billion objects in EC (8+3) pools, 540 OSDs, Nautilus 14.2.8. No tuning o... Simon Leinen
04:32 PM Bug #45765: BlueStore::_collection_list causes huge latency growth pg deletion
Yeah, that's a known issue with RocksDB/BlueStore.
Manual compaction using "ceph-kvstore-tool bluestore-kv <path-t...
Igor Fedotov
12:18 PM Bug #45765 (Resolved): BlueStore::_collection_list causes huge latency growth pg deletion
Hi!
We have a ceph v14.2.7 cluster with about 2 billion objects. Each object is less than 4K in size. One PG has abo...
Aleksei Zakharov

05/28/2020

08:30 PM Bug #45519: OSD asserts during block allocation for BlueFS
... Neha Ojha
02:25 PM Bug #44213: Erasure coded pool might need much more disk space than expected
We need to run some performance tests with 4k min_alloc_size. Neha Ojha
12:50 AM Bug #43147 (New): segv in LruOnodeCacheShard::_pin
Reopening this since I have seen it in /a/yuriw-2020-05-24_19:30:40-rados-wip-yuri-master_5.24.20-distro-basic-smithi... Brad Hubbard

05/27/2020

01:47 PM Bug #45519: OSD asserts during block allocation for BlueFS
... Kefu Chai

05/26/2020

04:53 AM Bug #45703 (New): LruOnodeCacheShard::_add: Assertion `!safemode_or_autounlink || node_algorithms...
/a/yuriw-2020-05-22_19:55:53-rados-wip-yuri-master_5.22.20-distro-basic-smithi/5083216... Brad Hubbard

05/25/2020

02:26 PM Bug #45110 (Resolved): Extent leak after main device expand
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:05 AM Backport #45126 (Resolved): nautilus: Extent leak after main device expand
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34711
m...
Nathan Cutler

05/24/2020

09:05 PM Backport #45684 (Resolved): nautilus: Large (>=2 GB) writes are incomplete when bluefs_buffered_i...
https://github.com/ceph/ceph/pull/35404 Nathan Cutler
09:05 PM Backport #45683 (Rejected): mimic: Large (>=2 GB) writes are incomplete when bluefs_buffered_io =...
Nathan Cutler
09:05 PM Backport #45682 (Resolved): octopus: Large (>=2 GB) writes are incomplete when bluefs_buffered_io...
https://github.com/ceph/ceph/pull/35446 Nathan Cutler

05/22/2020

07:37 PM Bug #45613: ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompression/2 failed
OK thanks. I'm also just confirming that nautilus would be immune, even though 14.2.10 will change bluefs_bufferio_io... Dan van der Ster
05:30 PM Bug #45613: ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompression/2 failed
At the same time IIUC it's OSD restart which reveals data corruption - while OSD is running it doesn't read from WAL ... Igor Fedotov
05:26 PM Bug #45613: ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompression/2 failed
@Dan - right. Setting bluefs_preextend_wal_files seems better to me (one can even try to do that on the fly) but I'd ... Igor Fedotov
04:27 PM Bug #45613: ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompression/2 failed
@Igor: So setting bluefs_preextend_wal_files=false and/or bluefs_buffered_io=true should work around the issue until t... Dan van der Ster
04:02 PM Bug #45613: ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompression/2 failed
So the bug is caused by submitting overlapping write requests for BlueFS WAL via libaio. BlueFS::_flush_range might e... Igor Fedotov
03:53 PM Bug #45613 (Fix Under Review): ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompression/2 f...
Igor Fedotov
03:45 PM Bug #45657: false positive check in KernelDevice::aio_log_start
Also the checking for overlapped io is currently broken due to preextended WAL prefilling in BlueFS::_flush_range. Wr... Igor Fedotov
02:23 PM Bug #45657 (New): false positive check in KernelDevice::aio_log_start
When enabling bdev_debug_inflight_ios KernelDevice might improperly detect conflicts for inflight read ops. This is c... Igor Fedotov
04:31 AM Bug #45337 (Pending Backport): Large (>=2 GB) writes are incomplete when bluefs_buffered_io = true
Kefu Chai
04:30 AM Bug #45337 (Resolved): Large (>=2 GB) writes are incomplete when bluefs_buffered_io = true
Kefu Chai
04:29 AM Bug #45335 (Resolved): cephadm upgrade: OSD.0 is not coming back after restart: rocksdb: verify_...
Kefu Chai

05/21/2020

10:35 PM Bug #45613: ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompression/2 failed
[Redacted] Brad Hubbard
10:00 PM Bug #45613: ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompression/2 failed
/a/bhubbard-2020-05-20_04:16:41-rados-wip-badone-testing-7-distro-basic-smithi/5071455 Brad Hubbard
01:18 AM Bug #45613: ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompression/2 failed
Reproduces pretty easily on master: /a/nojha-2020-05-20_19:45:21-rados-master-distro-basic-smithi/5073678. Haven't be... Neha Ojha
08:22 PM Bug #44937: bluestore rocksdb max_background_compactions regression in 12.2.13
@Igor Yeah, you are correct. The pr#3926 commits landed after Mimic:... Dan Hill

05/20/2020

03:16 PM Backport #45126: nautilus: Extent leak after main device expand
Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/34711
merged
Yuri Weinstein
12:14 PM Bug #45613: ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompression/2 failed
store_test's log contains:
-17> 2020-05-19T10:00:14.304+0000 7f8dfa42f240 3 rocksdb: [db/db_impl_open.cc:518] db...
Igor Fedotov
12:18 AM Bug #45613 (Resolved): ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompression/2 failed
... Neha Ojha
11:41 AM Backport #45064 (Resolved): nautilus: bluestore: unused calculation is broken
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34794
m...
Nathan Cutler

05/19/2020

03:33 PM Backport #45064: nautilus: bluestore: unused calculation is broken
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/34794
merged
Yuri Weinstein
09:35 AM Bug #43814 (Resolved): common/bl: claim_append() corrupts memory when a bl consecutively has at l...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler

05/18/2020

09:50 AM Backport #43920 (Resolved): nautilus: common/bl: claim_append() corrupts memory when a bl consecu...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34516
m...
Nathan Cutler
09:50 AM Backport #43087 (Resolved): nautilus: bluefs: sync_metadata leaks dirty files if log_t is empty
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34515
m...
Nathan Cutler

05/15/2020

11:15 PM Backport #43920: nautilus: common/bl: claim_append() corrupts memory when a bl consecutively has ...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/34516
merged
Yuri Weinstein
11:14 PM Backport #43087: nautilus: bluefs: sync_metadata leaks dirty files if log_t is empty
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/34515
merged
Yuri Weinstein
03:52 PM Bug #44937 (Need More Info): bluestore rocksdb max_background_compactions regression in 12.2.13
Neha Ojha
03:27 PM Bug #44937: bluestore rocksdb max_background_compactions regression in 12.2.13
Users have been using >=2 (up to 8) background_compactions in rocksdb long before we set the default to 2, without an... Josh Durgin
02:37 PM Bug #45519: OSD asserts during block allocation for BlueFS
It looks like someone has edited my previous message :)
Yes, this check in _flush_and_sync_log doesn't work.
I've t...
Aleksei Zakharov

05/13/2020

03:13 PM Backport #45354 (Need More Info): octopus: ceph_test_objectstore: src/os/bluestore/bluestore_type...
Nathan Cutler
02:41 PM Bug #45519: OSD asserts during block allocation for BlueFS
>We also see some new OSDs asserting:... Aleksei Zakharov
10:33 AM Bug #45519: OSD asserts during block allocation for BlueFS
>Igor, thanks for your answers!
>If I understand right: high bluestore fragmentation makes it impossible to alloca...
Aleksei Zakharov

05/12/2020

05:41 PM Bug #45519: OSD asserts during block allocation for BlueFS
Alexei, it looks like your OSDs have pretty fragmented free space (and presumably quite high utilization) which resul... Igor Fedotov
02:33 PM Bug #45519 (New): OSD asserts during block allocation for BlueFS
Hi all.
We use ceph as the rados object storage for small (<=4K) objects. We have about 2 billion objects in one...
Aleksei Zakharov

05/11/2020

02:23 PM Bug #44774 (Resolved): ceph-bluestore-tool --command bluefs-bdev-new-wal may damage bluefs
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:20 PM Bug #45250 (Resolved): check-generated.sh finds error in ceph-dencoder
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler

05/10/2020

04:10 AM Bug #43068: on disk size (81292) does not match object info size (81237)
Igor Fedotov wrote:
> And is this correct that Rados/BlueStore is valid (and is in line with the content) but CephFS...
Patrick Donnelly

05/08/2020

10:17 PM Bug #45335: cephadm upgrade: OSD.0 is not coming back after restart: rocksdb: verify_sharding mi...
https://github.com/ceph/ceph/pull/34970 Sebastian Wagner
10:05 PM Bug #45335: cephadm upgrade: OSD.0 is not coming back after restart: rocksdb: verify_sharding mi...
This also blocks getting a valid test run for any upgrade PR in cephadm. Michael Fritch
03:32 PM Bug #45335: cephadm upgrade: OSD.0 is not coming back after restart: rocksdb: verify_sharding mi...
Because of this issue, affected Pulpito runs now take about 12h before they're killed, instead of the usual 1h or so. Sebastian Wagner
02:00 PM Bug #45335: cephadm upgrade: OSD.0 is not coming back after restart: rocksdb: verify_sharding mi...
http://pulpito.ceph.com/teuthology-2020-05-03_07:01:02-rados-master-distro-basic-smithi/5018156/ Sebastian Wagner
07:28 AM Bug #45335: cephadm upgrade: OSD.0 is not coming back after restart: rocksdb: verify_sharding mi...
http://pulpito.ceph.com/swagner-2020-05-07_16:10:50-rados-wip-swagner2-testing-2020-05-07-1308-distro-basic-smithi/50... Sebastian Wagner

05/07/2020

05:46 PM Bug #44757 (Resolved): perf regression due to bluefs_buffered_io=true
Nathan Cutler
09:50 AM Backport #45426 (In Progress): octopus: ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
https://github.com/ceph/ceph/pull/34943 Igor Fedotov
09:37 AM Backport #45426 (Resolved): octopus: ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
https://github.com/ceph/ceph/pull/34943 Igor Fedotov
06:33 AM Bug #44880: ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
Need that octopus backport.
/a/yuriw-2020-05-05_15:20:13-rados-wip-yuri8-testing-2020-05-04-2117-octopus-distro-ba...
Brad Hubbard

05/06/2020

02:05 PM Bug #45133 (Resolved): BlueStore asserting on fs upgrade tests
Igor Fedotov
01:00 PM Backport #45127 (Resolved): octopus: Extent leak after main device expand
Nathan Cutler

05/05/2020

04:30 PM Backport #45330 (Resolved): nautilus: check-generated.sh finds error in ceph-dencoder
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34832
m...
Nathan Cutler
04:30 PM Backport #45045 (Resolved): nautilus: ceph-bluestore-tool --command bluefs-bdev-new-wal may damag...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34796
m...
Nathan Cutler
04:30 PM Backport #45123: nautilus: OSD might fail to recover after ENOSPC crash
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34611
m...
Nathan Cutler
09:08 AM Backport #45123 (Resolved): nautilus: OSD might fail to recover after ENOSPC crash
Igor Fedotov
04:26 PM Backport #45348 (Resolved): octopus: BlueStore asserting on fs upgrade tests
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34610
m...
Nathan Cutler
04:26 PM Backport #45122: octopus: OSD might fail to recover after ENOSPC crash
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34610
m...
Nathan Cutler
09:06 AM Backport #45122 (Resolved): octopus: OSD might fail to recover after ENOSPC crash
Igor Fedotov
04:25 PM Backport #44819 (Resolved): octopus: perf regression due to bluefs_buffered_io=true
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34353
m...
Nathan Cutler
04:25 PM Backport #45044 (Resolved): octopus: ceph-bluestore-tool --command bluefs-bdev-new-wal may damage...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34795
m...
Nathan Cutler
04:24 PM Backport #45063 (Resolved): octopus: bluestore: unused calculation is broken
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34793
m...
Nathan Cutler
09:08 AM Bug #45112 (Resolved): OSD might fail to recover after ENOSPC crash
Igor Fedotov

05/04/2020

08:43 PM Backport #45330: nautilus: check-generated.sh finds error in ceph-dencoder
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/34832
merged
Yuri Weinstein
08:43 PM Backport #45045: nautilus: ceph-bluestore-tool --command bluefs-bdev-new-wal may damage bluefs
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/34796
merged
Yuri Weinstein
08:37 PM Backport #45123: nautilus: OSD might fail to recover after ENOSPC crash
Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/34611
merged
Yuri Weinstein
08:33 PM Backport #45348: octopus: BlueStore asserting on fs upgrade tests
Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/34610
merged
Yuri Weinstein
08:33 PM Backport #45122: octopus: OSD might fail to recover after ENOSPC crash
Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/34610
merged
Yuri Weinstein
07:53 PM Bug #44880 (Pending Backport): ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
/a/yuriw-2020-05-02_20:02:46-rados-wip-yuri6-testing-2020-04-30-2259-octopus-distro-basic-smithi/5016613 Neha Ojha
07:34 PM Backport #45044: octopus: ceph-bluestore-tool --command bluefs-bdev-new-wal may damage bluefs
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/34795
merged
Yuri Weinstein
09:07 AM Bug #45335: cephadm upgrade: OSD.0 is not coming back after restart: rocksdb: verify_sharding mi...
http://pulpito.ceph.com/swagner-2020-04-30_09:15:46-rados-wip-swagner-testing-2020-04-29-1246-distro-basic-smithi/500... Sebastian Wagner
 
