Activity
From 06/19/2020 to 07/18/2020
07/18/2020
- 12:13 PM Backport #46584 (Need More Info): octopus: os/bluestore: simplify Onode pin/unpin logic.
- this backport is on hold because the fix is baking in master
- 12:12 PM Bug #43147 (Pending Backport): segv in LruOnodeCacheShard::_pin
- Neha - is it OK to backport this to Octopus now?
07/17/2020
- 05:35 PM Bug #38554: ObjectStore/StoreTestSpecificAUSize.TooManyBlobsTest/2 fail, Expected: (res_stat.allo...
- Igor, I am seeing this failure on latest nautilus....
- 12:58 PM Bug #46490: osds crashing during deep-scrub
- ... Sorry, the preceding "921747443:" in line 6 is just a remnant line number of the grep -n I did initially and forgo...
- 12:52 PM Bug #46490: osds crashing during deep-scrub
- The output for grep -e "_verify_csum bad" -e "fsck error" on the log file is:...
- 12:47 PM Bug #46490: osds crashing during deep-scrub
- I'm sorry, I wasn't clear about that. Yes, this is the only output. (The line "-9999>[...]" appears only in the console ...
- 10:45 AM Bug #46490: osds crashing during deep-scrub
- Lawrence Smith wrote:
> The whole logfile is 60G. The file 'osd-164-fsck.out.gz' that I uploaded with the last updat...
- 12:22 PM Backport #46599 (In Progress): octopus: Rescue procedure for extremely large bluefs log
- 12:22 PM Backport #46599 (Resolved): octopus: Rescue procedure for extremely large bluefs log
- https://github.com/ceph/ceph/pull/36123
- 12:17 PM Backport #46598 (Resolved): luminous: Rescue procedure for extremely large bluefs log
- 12:16 PM Backport #46598 (Resolved): luminous: Rescue procedure for extremely large bluefs log
- https://github.com/ceph/ceph/pull/35776
(Note: this is not a cherry-pick from master. Rather, the master PR is bas...
- 11:14 AM Backport #46584 (Resolved): octopus: os/bluestore: simplify Onode pin/unpin logic.
- https://github.com/ceph/ceph/pull/36795
- 11:13 AM Backport #45684 (Resolved): nautilus: Large (>=2 GB) writes are incomplete when bluefs_buffered_i...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35404
m...
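A minimal sketch of the failure mode behind #45684, assuming the usual Linux behaviour that a single pwrite() transfers at most about 0x7ffff000 bytes, so a write of 2 GB or more silently comes back short unless the caller loops. Illustrative only, not the actual BlueFS fix:

```cpp
// Hypothetical helper, not Ceph code: write the whole buffer at the
// given offset, retrying short writes. On Linux a single pwrite()
// moves at most ~0x7ffff000 bytes, so a 2 GB request is always short.
#include <unistd.h>
#include <cerrno>
#include <cstdint>

static int pwrite_all(int fd, const char* buf, uint64_t len, uint64_t off) {
  while (len > 0) {
    ssize_t r = ::pwrite(fd, buf, len, off);
    if (r < 0) {
      if (errno == EINTR) continue;  // interrupted before any bytes moved
      return -errno;                 // real I/O error
    }
    buf += r;  // advance past what the kernel actually accepted
    off += r;
    len -= r;
  }
  return 0;
}
```

A caller that treats one pwrite() return value as all-or-nothing sees exactly the reported symptom: the write "succeeds" but the tail of the buffer never reaches disk.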
07/16/2020
- 05:11 PM Backport #45684: nautilus: Large (>=2 GB) writes are incomplete when bluefs_buffered_io = true
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35404
merged
- 03:24 PM Bug #46575: os/bluestore: simplify Onode pin/unpin logic.
- Should fix issues like https://tracker.ceph.com/issues/43147 that we've been seeing.
- 03:23 PM Bug #46575 (Resolved): os/bluestore: simplify Onode pin/unpin logic.
- We want to let this bake in master before backporting to octopus.
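For readers following #46575 and #43147: below is a minimal sketch, with hypothetical names, of the invariant an LRU onode shard wants to keep (pinned entries leave the eviction list, unpin re-inserts them, and every transition happens under the shard lock). No claim that this mirrors BlueStore's actual LruOnodeCacheShard:

```cpp
#include <list>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for an onode; only the fields the sketch needs.
struct Onode {
  std::string key;
  int pin_count = 0;   // guarded by the shard lock
  bool in_lru = false;
};

class LruShard {
  std::mutex lock;
  // front = most recently used; only *unpinned* onodes live here,
  // so trim() can never evict something a caller still holds pinned.
  std::list<std::shared_ptr<Onode>> lru;

public:
  void pin(const std::shared_ptr<Onode>& o) {
    std::lock_guard<std::mutex> l(lock);
    if (o->pin_count++ == 0 && o->in_lru) {
      lru.remove(o);  // O(n) for brevity; real code would keep an iterator
      o->in_lru = false;
    }
  }
  void unpin(const std::shared_ptr<Onode>& o) {
    std::lock_guard<std::mutex> l(lock);
    if (--o->pin_count == 0) {
      lru.push_front(o);  // back into eviction order
      o->in_lru = true;
    }
  }
  void trim(size_t max_unpinned) {
    std::lock_guard<std::mutex> l(lock);
    while (lru.size() > max_unpinned) {
      lru.back()->in_lru = false;
      lru.pop_back();  // dropping the shared_ptr may free the onode
    }
  }
};
```

One plausible reading of the segv in _pin is a pin/unpin transition racing with eviction; funneling every transition through one shard lock, as above, is the kind of simplification #46575 describes.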
07/15/2020
- 10:58 PM Bug #46490: osds crashing during deep-scrub
- The whole logfile is 60G. The file 'osd-164-fsck.out.gz' that I uploaded with the last update is the console output o...
- 08:37 PM Bug #46490: osds crashing during deep-scrub
- Hi Lawrence,
looks like some data corruption (multiple objects referring to the same disk extent) which causes decomp...
- 08:02 AM Bug #46490: osds crashing during deep-scrub
- Hi Igor, thanks for looking into this.
You were right about the corrupted backtrace. Although select_prefer_bdev ca...
- 09:28 PM Bug #43147: segv in LruOnodeCacheShard::_pin
- /a/yuriw-2020-07-13_23:06:23-rados-wip-yuri5-testing-2020-07-13-1944-octopus-distro-basic-smithi/5224399
- 05:35 PM Bug #46552: Rescue procedure for extremely large bluefs log
- Neha Ojha wrote:
> Octopus: -https://github.com/ceph/ceph/pull/36112-
https://github.com/ceph/ceph/pull/36123
> ...
- 04:18 PM Bug #46552 (Resolved): Rescue procedure for extremely large bluefs log
- This feature was developed on luminous before being merged into master.
Luminous: https://github.com/ceph/ceph/pul...
- 09:51 AM Bug #46411: mimic: Disks associated to osds have small write io even on an idle ceph cluster
- I have fixed this bug in PR https://github.com/ceph/ceph/pull/36108; could you help review it? @Igor Fedotov...
- 08:51 AM Bug #46411: mimic: Disks associated to osds have small write io even on an idle ceph cluster
- I am sorry, there is a problem with the PR link given above (this abnormal phenomenon is caused by the PR (https://git...
- 08:36 AM Bug #46411: mimic: Disks associated to osds have small write io even on an idle ceph cluster
- hongsong wu wrote:
> Affected Versions: v12.2.12 ~~ v12.2.13, v13.2.5 ~~ v13.2.10
07/14/2020
- 03:31 AM Bug #46525 (Need More Info): osd crush
- my env:
ceph version 14.2.10 (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)
CentOS Linux release 7....
07/13/2020
- 09:37 AM Bug #46490: osds crashing during deep-scrub
- First of all, IMO the Nautilus backtrace isn't valid - there is no RocksDBBlueFSVolumeSelector::select_prefer_bdev ca...
07/11/2020
- 03:08 PM Bug #46490 (Need More Info): osds crashing during deep-scrub
- During scrubbing osds from our 8+3 EC-pool seem to be randomly crashing with the backtrace:...
- 08:52 AM Backport #46009 (In Progress): octopus: ObjectStore/StoreTestSpecificAUSize.ExcessiveFragmentatio...
07/10/2020
- 05:54 PM Bug #44880 (Resolved): ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 02:50 PM Bug #38745: spillover that doesn't make sense
- Also, here in the graphs you can see that bluefs db used is decreasing after slow used increases, but still slow bytes i...
- 12:58 PM Bug #38745: spillover that doesn't make sense
- Thanks @Igor for your help again.
I am seeing a new behavior now: I don't see any level get a score of 1.0, but ceph says th...
(RocksDB level sizing is sketched below, after this day's entries.)
- 06:07 AM Backport #45426 (Resolved): octopus: ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34943
m...
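Context for the #38745 spillover discussion above: a back-of-the-envelope sketch, assuming RocksDB's default 256 MB max_bytes_for_level_base and 10x level multiplier, of why a whole level can land on the slow device even when the DB device does not look full. Arithmetic only, not BlueFS's actual allocator logic:

```cpp
#include <cstdint>
#include <cstdio>

int main() {
  const uint64_t MB = 1ull << 20, GB = 1ull << 30;
  const uint64_t base = 256 * MB;  // max_bytes_for_level_base (default)
  uint64_t size = base, cumulative = 0;
  for (int level = 1; level <= 4; ++level) {
    cumulative += size;
    std::printf("L%d needs ~%6.2f GB; L1..L%d together ~%6.2f GB\n",
                level, double(size) / GB, level, double(cumulative) / GB);
    size *= 10;  // default level size multiplier
  }
  // Example: a 30 GB DB partition fits L1..L3 (~28 GB) but not L4
  // (~250 GB), so L4 spills to the slow device wholesale -- which is
  // why "db used" can shrink while "slow used" grows.
  return 0;
}
```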
07/08/2020
- 07:23 PM Bug #46055: ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
- https://github.com/ceph/ceph/pull/34943 merged
- 07:22 PM Backport #45426: octopus: ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
- https://github.com/ceph/ceph/pull/34943 merged
- 09:59 AM Bug #46411: mimic: Disks associated to osds have small write io even on an idle ceph cluster
- Adam, mind taking a look?
- 06:44 AM Bug #46411: mimic: Disks associated to osds have small write io even on an idle ceph cluster
- Affected Versions: v12.2.12 ~~ v12.2.13, v13.2.5 ~~ v13.2.9
- 06:09 AM Bug #46411 (Rejected): mimic: Disks associated to osds have small write io even on an idle ceph c...
- 1. Anomalies
When the ceph cluster is idle, you can find that the disks associated to osds have small write io e...
- 09:33 AM Bug #46124: Potential race condition regression around new OSD flock()s
- I'm able to reproduce the issue with src/objectstore/store_test and pretty powerful HW:
-10> 2020-07-08T09:28:28....
07/06/2020
- 10:33 PM Bug #46124 (New): Potential race condition regression around new OSD flock()s
- 10:30 PM Bug #46124: Potential race condition regression around new OSD flock()s
- @Neha Ojha:
I think the bug remains as real as it gets; I did not retract that this is a bug. With my last comment...
07/03/2020
- 03:37 PM Backport #46350 (Resolved): octopus: ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompressi...
- https://github.com/ceph/ceph/pull/37373
07/02/2020
- 02:09 PM Bug #46124 (Closed): Potential race condition regression around new OSD flock()s
- Please feel free to reopen if you find a real bug somewhere.
- 01:59 PM Bug #45613 (Pending Backport): ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixCompression/2 f...
07/01/2020
- 12:56 PM Bug #46054 (Resolved): RocksDBResharding: rocksdb::ColumnFamilySet::~ColumnFamilySet(): Assertion...
06/30/2020
- 09:45 PM Bug #44880: ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
- Backporting note: needs to be backported together with follow-on fix. See the octopus backport PR and #45426
- 09:44 PM Bug #46055 (Resolved): ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
- backport tracked via https://tracker.ceph.com/issues/44880
- 02:21 AM Bug #46055: ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
- Being backported in https://github.com/ceph/ceph/pull/34943
- 02:20 AM Bug #46055 (Pending Backport): ObjectStore/StoreTestSpecificAUSize.SpilloverTest/2 failed
- 04:48 PM Bug #46270: mimic:osd can not start
- This just looks like bluefs running out of space. Mimic is EOL; I'd recommend upgrading and reporting back if yo...
- 06:40 AM Bug #46270 (Can't reproduce): mimic:osd can not start
- My env:
[root@mon1 test]# ceph -v
ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic (stable)
[r...
06/27/2020
- 02:52 PM Bug #46054 (Fix Under Review): RocksDBResharding: rocksdb::ColumnFamilySet::~ColumnFamilySet(): A...
- 02:51 PM Bug #46054: RocksDBResharding: rocksdb::ColumnFamilySet::~ColumnFamilySet(): Assertion `last_ref'...
- Hi Adam, I am working on this issue, as I've run into it twice and feel obliged to fix it. As I failed to identify...
(The usual RocksDB column-family-handle lifetime rule is sketched below, after this day's entries.)
- 08:18 AM Bug #46054: RocksDBResharding: rocksdb::ColumnFamilySet::~ColumnFamilySet(): Assertion `last_ref'...
- /a/kchai-2020-06-27_07:37:00-rados-wip-kefu-testing-2020-06-27-1407-distro-basic-smithi/5183643
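Background on the `last_ref' assertion referenced above: rocksdb::ColumnFamilySet::~ColumnFamilySet() fires it when column family handles are still referenced as the DB shuts down. A self-contained sketch of the usual RocksDB handle-lifetime rule (path and column family names are illustrative; no claim this is the exact resharding bug):

```cpp
#include <cassert>
#include <vector>
#include "rocksdb/db.h"

int main() {
  rocksdb::Options options;
  options.create_if_missing = true;
  options.create_missing_column_families = true;

  std::vector<rocksdb::ColumnFamilyDescriptor> cfs = {
      {rocksdb::kDefaultColumnFamilyName, rocksdb::ColumnFamilyOptions()},
      {"meta", rocksdb::ColumnFamilyOptions()},  // illustrative CF name
  };
  std::vector<rocksdb::ColumnFamilyHandle*> handles;
  rocksdb::DB* db = nullptr;
  rocksdb::Status s = rocksdb::DB::Open(rocksdb::DBOptions(options),
                                        "/tmp/testdb", cfs, &handles, &db);
  assert(s.ok());

  // Every handle must be released before the DB object is destroyed;
  // a handle that is still alive at that point is what trips the
  // `last_ref' assertion in ~ColumnFamilySet.
  for (auto* h : handles)
    db->DestroyColumnFamilyHandle(h);
  delete db;
  return 0;
}
```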
06/26/2020
- 05:35 PM Bug #46054: RocksDBResharding: rocksdb::ColumnFamilySet::~ColumnFamilySet(): Assertion `last_ref'...
- /a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176446
06/24/2020
- 08:43 PM Backport #46195 (Resolved): luminous: BlueFS replay log grows without end
- https://github.com/ceph/ceph/pull/35776
- 08:43 PM Backport #46194 (Resolved): nautilus: BlueFS replay log grows without end
- https://github.com/ceph/ceph/pull/37948
- 08:43 PM Backport #46193 (Resolved): octopus: BlueFS replay log grows without end
- https://github.com/ceph/ceph/pull/36621
- 08:43 PM Backport #46192 (Rejected): mimic: BlueFS replay log grows without end
- 02:35 PM Bug #45903 (Pending Backport): BlueFS replay log grows without end
- It will be good to get the fix into luminous and mimic for affected users.
- 10:37 AM Backport #45682 (Resolved): octopus: Large (>=2 GB) writes are incomplete when bluefs_buffered_io...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35446
m...
06/23/2020
- 08:12 PM Backport #45682: octopus: Large (>=2 GB) writes are incomplete when bluefs_buffered_io = true
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35446
merged
06/21/2020
- 09:09 PM Bug #46124: Potential race condition regression around new OSD flock()s
- > I suspect that Ceph starts other threads (using clone() on Linux) while the lock is held
Sorry, this should be t...
- 05:27 PM Bug #46124: Potential race condition regression around new OSD flock()s
- From the strace above, we can see that there's always a `close()` after a matching `flock()` within the same PID, so ...
- 01:53 PM Bug #46124: Potential race condition regression around new OSD flock()s
- Another question:
Would it not be better to use OFD locks (Open File Description locks), that is via ...
(An OFD-lock sketch follows after this day's entries.)
- 03:18 AM Bug #46124: Potential race condition regression around new OSD flock()s
- In case it helps, here are `strace` invocations, each showing slightly different behaviour and error messages, that i...
- 03:08 AM Bug #46124: Potential race condition regression around new OSD flock()s
- I did not experience that in Mimic.
- 03:07 AM Bug #46124 (Resolved): Potential race condition regression around new OSD flock()s
- In #38150 and PR https://github.com/ceph/ceph/pull/26245, a new `flock()` approach was introduced.
When I use `ce...
- 03:08 AM Bug #38150: KernelDevice exclusive lock broken
- I suspect this may have introduced a regression: #46124
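For the OFD-lock question above: a standalone sketch contrasting fcntl() F_OFD_SETLK with the flock() approach from PR 26245. Both lock kinds belong to the open file description (so they survive fork()/clone() through the shared description), but OFD locks also conflict between two independent opens inside one process and, unlike classic POSIX F_SETLK record locks, are not dropped when the process closes some other descriptor for the same file. Linux-specific; the lock-file path is illustrative:

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>

// Try to take an exclusive OFD lock on the whole file; true on success.
static bool try_ofd_lock(int fd) {
  struct flock fl = {};
  fl.l_type = F_WRLCK;     // exclusive
  fl.l_whence = SEEK_SET;
  fl.l_start = 0;
  fl.l_len = 0;            // 0 length = lock the whole file
  fl.l_pid = 0;            // must be 0 for OFD locks
  return ::fcntl(fd, F_OFD_SETLK, &fl) == 0;
}

int main() {
  int a = ::open("/tmp/osd.lock", O_RDWR | O_CREAT, 0644);
  int b = ::open("/tmp/osd.lock", O_RDWR);  // second, independent open
  std::printf("lock via first open:  %s\n", try_ofd_lock(a) ? "ok" : "failed");
  std::printf("lock via second open: %s\n", try_ofd_lock(b) ? "ok" : "failed");
  // Expected: first succeeds, second fails -- the two descriptions
  // conflict even within a single process, which is exactly the
  // "did another daemon already claim this device?" check wanted here.
  ::close(b);
  ::close(a);
  return 0;
}
```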