Activity
From 06/16/2020 to 07/15/2020
07/15/2020
- 10:58 PM Bug #46490: osds crashing during deep-scrub
- The whole logfile is 60G. The file 'osd-164-fsck.out.gz' that I uploaded with the last update is the console output o...
- 08:37 PM Bug #46490: osds crashing during deep-scrub
- Hi Lawrence
looks like some data corruption (multiple objects referring to the same disk extent) which causes decomp...
- 08:02 AM Bug #46490: osds crashing during deep-scrub
- Hi Igor, thanks for looking into this.
You were right about the corrupted backtrace. Although select_prefer_bdef ca...
- 09:28 PM Bug #43147: segv in LruOnodeCacheShard::_pin
- /a/yuriw-2020-07-13_23:06:23-rados-wip-yuri5-testing-2020-07-13-1944-octopus-distro-basic-smithi/5224399
- 05:35 PM Bug #46552: Rescue procedure for extremely large bluefs log
- Neha Ojha wrote:
> Octopus: -https://github.com/ceph/ceph/pull/36112-
https://github.com/ceph/ceph/pull/36123
> ...
- 04:18 PM Bug #46552 (Resolved): Rescue procedure for extremely large bluefs log
- This feature was developed on luminous before being merged into master.
Luminous: https://github.com/ceph/ceph/pul...
- 09:51 AM Bug #46411: mimic: Disks associated to osds have small write io even on an idle ceph cluster
- I have fixed this bug in PR https://github.com/ceph/ceph/pull/36108; can you help review it? @Igor Fedotov...
- 08:51 AM Bug #46411: mimic: Disks associated to osds have small write io even on an idle ceph cluster
- I am sorry, there is a problem with the PR link given above (this abnormal phenomenon is caused by the PR (https://git...
- 08:36 AM Bug #46411: mimic: Disks associated to osds have small write io even on an idle ceph cluster
- hongsong wu wrote:
> Affected Versions: v12.2.12~~v12.2.13, v13.2.5 ~~ v13.2.10
07/14/2020
- 03:31 AM Bug #46525 (Need More Info): osd crush
- my env:
ceph version 14.2.10 (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)
CentOS Linux release 7....
07/13/2020
- 09:37 AM Bug #46490: osds crashing during deep-scrub
- First of all IMO the Nautilus back trace isn't valid - there is no RocksDBBlueFSVolumeSelector::select_prefer_bdev ca...
07/11/2020
- 03:08 PM Bug #46490 (Need More Info): osds crashing during deep-scrub
- During scrubbing osds from our 8+3 EC-pool seem to be randomly crashing with the backtrace:...
- 08:52 AM Backport #46009 (In Progress): octopus: ObjectStore/StoreTestSpecificAUSize.ExcessiveFragmentatio...
07/10/2020
- 05:54 PM Bug #44880 (Resolved): ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 02:50 PM Bug #38745: spillover that doesn't make sense
- And also here in graphs you can see the bluefs db used is decreasing after slow used increases but still slow bytes i...
- 12:58 PM Bug #38745: spillover that doesn't make sense
- Thanks @Igor for your help again.
I saw a new behavior now and I don't see any level gets score 1.0 but ceph says th...
- 06:07 AM Backport #45426 (Resolved): octopus: ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34943
m...
07/08/2020
- 07:23 PM Bug #46055: ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
- https://github.com/ceph/ceph/pull/34943 merged
- 07:22 PM Backport #45426: octopus: ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
- https://github.com/ceph/ceph/pull/34943 merged
- 09:59 AM Bug #46411: mimic: Disks associated to osds have small write io even on an idle ceph cluster
- Adam, mind taking a look?
- 06:44 AM Bug #46411: mimic: Disks associated to osds have small write io even on an idle ceph cluster
- Affected Versions: v12.2.12~~v12.2.13, v13.2.5 ~~ v13.2.9
- 06:09 AM Bug #46411 (Rejected): mimic: Disks associated to osds have small write io even on an idle ceph c...
- * 1. Anomalies
When the ceph cluster is idle, you can find that the disks associated to osds have small write io e...
- 09:33 AM Bug #46124: Potential race condition regression around new OSD flock()s
- I'm able to reproduce the issue with src/objectstore/store_test and pretty powerful HW:
-10> 2020-07-08T09:28:28....
07/06/2020
- 10:33 PM Bug #46124 (New): Potential race condition regression around new OSD flock()s
- 10:30 PM Bug #46124: Potential race condition regression around new OSD flock()s
- @Neha Ojha:
I think the bug remains as real as it gets; I did not retract that this is a bug. With my last comment...
07/03/2020
- 03:37 PM Backport #46350 (Resolved): octopus: ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompressi...
- https://github.com/ceph/ceph/pull/37373
07/02/2020
- 02:09 PM Bug #46124 (Closed): Potential race condition regression around new OSD flock()s
- Please feel free to reopen if you find a real bug somewhere.
- 01:59 PM Bug #45613 (Pending Backport): ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompression/2 f...
07/01/2020
- 12:56 PM Bug #46054 (Resolved): RocksDBResharding: rocksdb::ColumnFamilySet::~ColumnFamilySet(): Assertion...
06/30/2020
- 09:45 PM Bug #44880: ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
- Backporting note: needs to be backported together with follow-on fix. See the octopus backport PR and #45426
- 09:44 PM Bug #46055 (Resolved): ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
- backport tracked via https://tracker.ceph.com/issues/44880
- 02:21 AM Bug #46055: ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
- Being backported in https://github.com/ceph/ceph/pull/34943
- 02:20 AM Bug #46055 (Pending Backport): ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
- 04:48 PM Bug #46270: mimic:osd can not start
- This just looks like bluefs is running out of space. Mimic is EOL; I'd recommend upgrading and reporting back if yo...
- 06:40 AM Bug #46270 (Can't reproduce): mimic:osd can not start
- My env:
[root@mon1 test]# ceph -v
ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic (stable)
[r...
06/27/2020
- 02:52 PM Bug #46054 (Fix Under Review): RocksDBResharding: rocksdb::ColumnFamilySet::~ColumnFamilySet(): A...
- 02:51 PM Bug #46054: RocksDBResharding: rocksdb::ColumnFamilySet::~ColumnFamilySet(): Assertion `last_ref'...
- Hi Adam, I am working on this issue, as I've run into it twice and feel obliged to fix it, as I failed to identify...
- 08:18 AM Bug #46054: RocksDBResharding: rocksdb::ColumnFamilySet::~ColumnFamilySet(): Assertion `last_ref'...
- /a//kchai-2020-06-27_07:37:00-rados-wip-kefu-testing-2020-06-27-1407-distro-basic-smithi/5183643
06/26/2020
- 05:35 PM Bug #46054: RocksDBResharding: rocksdb::ColumnFamilySet::~ColumnFamilySet(): Assertion `last_ref'...
- /a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176446
06/24/2020
- 08:43 PM Backport #46195 (Resolved): luminous: BlueFS replay log grows without end
- https://github.com/ceph/ceph/pull/35776
- 08:43 PM Backport #46194 (Resolved): nautilus: BlueFS replay log grows without end
- https://github.com/ceph/ceph/pull/37948
- 08:43 PM Backport #46193 (Resolved): octopus: BlueFS replay log grows without end
- https://github.com/ceph/ceph/pull/36621
- 08:43 PM Backport #46192 (Rejected): mimic: BlueFS replay log grows without end
- 02:35 PM Bug #45903 (Pending Backport): BlueFS replay log grows without end
- It will be good to get the fix into luminous and mimic for affected users.
- 10:37 AM Backport #45682 (Resolved): octopus: Large (>=2 GB) writes are incomplete when bluefs_buffered_io...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35446
m...
06/23/2020
- 08:12 PM Backport #45682: octopus: Large (>=2 GB) writes are incomplete when bluefs_buffered_io = true
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35446
merged
06/21/2020
- 09:09 PM Bug #46124: Potential race condition regression around new OSD flock()s
- > I suspect that Ceph starts other threads (using clone() on Linux) while the lock is held
Sorry, this should be t...
- 05:27 PM Bug #46124: Potential race condition regression around new OSD flock()s
- From the strace above, we can see that there's always a `close()` after a matching `flock()` within the same PID, so ...
- 01:53 PM Bug #46124: Potential race condition regression around new OSD flock()s
- Another question:
Would it not be better to use OFD locks (Open File Description locks), that is via ...
- 03:18 AM Bug #46124: Potential race condition regression around new OSD flock()s
- In case it helps, here are `strace` invocations, each showing slightly different behaviour and error messages, that i...
- 03:08 AM Bug #46124: Potential race condition regression around new OSD flock()s
- I did not experience that in Mimic.
- 03:07 AM Bug #46124 (Resolved): Potential race condition regression around new OSD flock()s
- In #38150 and PR https://github.com/ceph/ceph/pull/26245, a new `flock()` approach was introduced.
When I use `ce...
- 03:08 AM Bug #38150: KernelDevice exclusive lock broken
- I suspect this may have introduced a regression: #46124
06/17/2020
- 04:06 PM Bug #46055 (Resolved): ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
- ...
- 03:55 PM Bug #46054 (Resolved): RocksDBResharding: rocksdb::ColumnFamilySet::~ColumnFamilySet(): Assertion...
- ...
- 03:46 PM Bug #45765: BlueStore::_collection_list causes huge latency growth pg deletion
- We don't use RGW, we use self-written client which operates with small objects (~500-700-byte objects). Load is not t...
06/16/2020
- 09:25 PM Bug #46027: bufferlist c_str() sometimes clears assignment to mempool
- PR: https://github.com/ceph/ceph/pull/35584.
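The bug class in #46027 can be illustrated with a minimal sketch using hypothetical types (Segment/Buffer), not Ceph's actual bufferlist: when c_str() coalesces a fragmented buffer into a freshly allocated contiguous one, any per-allocation mempool assignment must be carried onto the new allocation, or the accounting is silently lost.

```cpp
// Hypothetical illustration of losing a mempool assignment when a
// multi-segment buffer is rebuilt into one contiguous allocation.
#include <string>
#include <utility>
#include <vector>

struct Segment {
  std::string data;
  int mempool = 0;   // which mempool this allocation is charged to
};

struct Buffer {
  std::vector<Segment> segs;

  void append(std::string s) { segs.push_back({std::move(s), 0}); }

  void reassign_to_mempool(int pool) {
    for (auto &s : segs) s.mempool = pool;
  }

  // Coalesce into one contiguous segment, as c_str() must when the
  // buffer is fragmented. Dropping the line marked "the fix" would
  // reproduce the reported bug: the rebuilt allocation silently
  // lands back in the default mempool.
  const std::string &c_str() {
    if (segs.size() > 1) {
      Segment joined;
      joined.mempool = segs.front().mempool;  // the fix: keep assignment
      for (auto &s : segs) joined.data += s.data;
      segs.clear();
      segs.push_back(std::move(joined));
    }
    return segs.front().data;
  }
};
```

In the real code the assignment lives on buffer::raw, the backing allocation, which is why a rebuild inside c_str() can drop it even though the bufferlist object itself is unchanged.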
- 08:02 AM Bug #46027 (Resolved): bufferlist c_str() sometimes clears assignment to mempool
- Sometimes c_str() needs to rebuild underlying buffer::raw.
In that case the original assignment to mempool is lost.
- 09:23 AM Bug #45994 (Triaged): OSD crash - in thread tp_osd_tp
- 09:20 AM Bug #45994: OSD crash - in thread tp_osd_tp
- Increasing suicide timeout doesn't look like the proper way of dealing with this issue.
I presume you're suffering...
- 06:04 AM Bug #45994: OSD crash - in thread tp_osd_tp
- Hi,
We have seen the issue to be caused by a heartbeat timeout, resolved by increasing the timer. Hence can this tick...