Activity
From 09/01/2022 to 09/30/2022
09/30/2022
- 07:13 PM Bug #17170 (Fix Under Review): mon/monclient: update "unable to obtain rotating service keys when...
- 04:49 PM Bug #57105: quincy: ceph osd pool set <pool> size math error
- Looks like in both cases something is being subtracted from a zero-valued unsigned int64 and overflowing.
2^64 − ...
- 03:37 PM Bug #57105: quincy: ceph osd pool set <pool> size math error
- Setting the size (from 3) to 2, then setting it to 1 works......
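A minimal C++ sketch (not Ceph code) of the unsigned wraparound described in the 04:49 PM comment above: subtracting from a zero-valued uint64_t does not go negative, it wraps modulo 2^64.
    #include <cstdint>
    #include <iostream>
    int main() {
      uint64_t size = 0;
      size -= 1;                     // wraps around to 2^64 - 1
      std::cout << size << '\n';     // prints 18446744073709551615
    }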
- 03:38 AM Bug #57105: quincy: ceph osd pool set <pool> size math error
- I created a new cluster today to do a very specific test and ran into this (or something like it) again today. In th...
- 10:40 AM Bug #49777 (Resolved): test_pool_min_size: 'check for active or peered' reached maximum tries (5)...
- 10:39 AM Backport #57022 (Resolved): pacific: test_pool_min_size: 'check for active or peered' reached max...
- 09:28 AM Bug #50192 (Resolved): FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_...
- 09:27 AM Backport #50274 (Resolved): pacific: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get...
- 09:27 AM Bug #53516 (Resolved): Disable health warning when autoscaler is on
- 09:27 AM Backport #53644 (Resolved): pacific: Disable health warning when autoscaler is on
- 09:27 AM Bug #51942 (Resolved): src/osd/scrub_machine.cc: FAILED ceph_assert(state_cast<const NotActive*>())
- 09:26 AM Backport #53339 (Resolved): pacific: src/osd/scrub_machine.cc: FAILED ceph_assert(state_cast<cons...
- 09:26 AM Bug #55001 (Resolved): rados/test.sh: Early exit right after LibRados global tests complete
- 09:26 AM Backport #57029 (Resolved): pacific: rados/test.sh: Early exit right after LibRados global tests ...
- 09:26 AM Bug #57119 (Resolved): Heap command prints with "ceph tell", but not with "ceph daemon"
- 09:25 AM Backport #57313 (Resolved): pacific: Heap command prints with "ceph tell", but not with "ceph dae...
- 05:18 AM Backport #57372 (Resolved): quincy: segfault in librados via libcephsqlite
- 04:23 AM Bug #57532: Notice discrepancies in the performance of mclock built-in profiles
- As Sridhar has mentioned in the BZ, the Case 2 results are due to the max limit setting for best effort clients. This...
- 02:19 AM Bug #49888: rados/singleton: radosbench.py: teuthology.exceptions.MaxWhileTries: reached maximum ...
- /a/yuriw-2022-09-27_23:37:28-rados-wip-yuri2-testing-2022-09-27-1455-distro-default-smithi/7046230/
09/29/2022
- 08:37 PM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
- - This was visible again in LRC upgrade today....
- 07:31 PM Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
- yuriw-2022-09-27_23:37:28-rados-wip-yuri2-testing-2022-09-27-1455-distro-default-smithi/7046253
- 07:21 PM Bug #53768: timed out waiting for admin_socket to appear after osd.2 restart in thrasher/defaults...
- yuriw-2022-09-27_23:37:28-rados-wip-yuri2-testing-2022-09-27-1455-distro-default-smithi/7046234
- 06:02 PM Bug #55435 (Resolved): mon/Elector: notify_ranked_removed() does not properly erase dead_ping in ...
- 06:01 PM Backport #56550 (Resolved): pacific: mon/Elector: notify_ranked_removed() does not properly erase...
- 03:55 PM Bug #54611 (Resolved): prometheus metrics shows incorrect ceph version for upgraded ceph daemon
- 03:54 PM Backport #55309 (Resolved): pacific: prometheus metrics shows incorrect ceph version for upgraded...
- 02:52 PM Bug #57727: mon_cluster_log_file_level option doesn't take effect
- Yes. I was trying to close it as a duplicate after editing my comment. Thank you for closing it.
- 02:50 PM Bug #57727 (Duplicate): mon_cluster_log_file_level option doesn't take effect
- Ah, you edited your comment to say "Closing this tracker as a duplicate of 57049".
- 02:48 PM Bug #57727 (Fix Under Review): mon_cluster_log_file_level option doesn't take effect
- 02:41 PM Bug #57727: mon_cluster_log_file_level option doesn't take effect
- Hi Ilya,
I had a PR#47480 opened for this issue but closed it in favor of PR#47502. We have an old tracker 57049 fo...
- 02:00 PM Bug #57727 (Duplicate): mon_cluster_log_file_level option doesn't take effect
- This appears to be a regression introduced in quincy in https://github.com/ceph/ceph/pull/42014:...
- 02:44 PM Bug #57049: cluster logging does not adhere to mon_cluster_log_file_level
- I had a PR#47480 opened for this issue but closed it in favor of PR#47502. The PR#47502 addresses this issue along wi...
- 02:15 PM Backport #56735 (Resolved): octopus: unessesarily long laggy PG state
- 02:14 PM Bug #50806 (Resolved): osd/PrimaryLogPG.cc: FAILED ceph_assert(attrs || !recovery_state.get_pg_lo...
- 02:13 PM Backport #50893 (Resolved): pacific: osd/PrimaryLogPG.cc: FAILED ceph_assert(attrs || !recovery_s...
- 02:07 PM Bug #55158 (Resolved): mon/OSDMonitor: properly set last_force_op_resend in stretch mode
- 02:07 PM Backport #55281 (Resolved): pacific: mon/OSDMonitor: properly set last_force_op_resend in stretch...
- 11:58 AM Bug #57699: slow osd boot with valgrind (reached maximum tries (50) after waiting for 300 seconds)
- I was not able to reproduce it with more debug messages. I created a PR with the debug message and will wait for re...
- 07:28 AM Bug #56289 (Duplicate): crash: void PeeringState::check_past_interval_bounds() const: abort
- 07:28 AM Bug #54710 (Duplicate): crash: void PeeringState::check_past_interval_bounds() const: abort
- 07:28 AM Bug #54709 (Duplicate): crash: void PeeringState::check_past_interval_bounds() const: abort
- 07:21 AM Bug #54708 (Duplicate): crash: void PeeringState::check_past_interval_bounds() const: abort
- 07:02 AM Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
- Radoslaw Zarzynski wrote:
> A note from the bug scrub: work in progress.
WIP: https://gist.github.com/Matan-B/ca5...
- 02:47 AM Bug #57532: Notice discrepancies in the performance of mclock built-in profiles
- Hi Bharath, could you also add the mClock configuration values from osd config show command here?
09/28/2022
- 06:03 PM Bug #53806 (New): unessesarily long laggy PG state
- Reopening b/c the original fix had to be reverted: https://github.com/ceph/ceph/pull/44499#issuecomment-1247315820.
- 05:54 PM Bug #57618: rados/test.sh hang and pkilled (LibRadosWatchNotifyEC.WatchNotify)
- Note from a scrub: might be worth talking about.
- 05:51 PM Bug #57650 (In Progress): mon-stretch: reweighting an osd to a big number, then back to original ...
- 05:51 PM Bug #57678 (Fix Under Review): Mon fail to send pending metadata through MMgrUpdate after an upgr...
- 05:50 PM Bug #57698: osd/scrub: "scrub a chunk" requests are sent to the wrong set of replicas
- What are symptoms? How bad is it? A hang maybe? I'm asking to understand the impact.
- 05:48 PM Bug #57698 (In Progress): osd/scrub: "scrub a chunk" requests are sent to the wrong set of replicas
- IIRC Ronen has mentioned the scrub code interchanges @get_acting_set()@ and @get_acting_recovery_backfill()@.
- 01:40 PM Bug #57698 (Resolved): osd/scrub: "scrub a chunk" requests are sent to the wrong set of replicas
- The Primary registers its intent to scrub with the 'get_actingset()', as it should.
But the actual chunk requests ar...
- 05:45 PM Bug #57699 (In Progress): slow osd boot with valgrind (reached maximum tries (50) after waiting f...
- Marking WIP per our morning talk.
- 01:58 PM Bug #57699 (Pending Backport): slow osd boot with valgrind (reached maximum tries (50) after wait...
- /a/yuriw-2022-09-23_20:38:59-rados-wip-yuri6-testing-2022-09-23-1008-quincy-distro-default-smithi/7042504 ...
- 05:44 PM Backport #57705 (Resolved): pacific: mon/MonMap.h: FAILED ceph_assert(m < ranks.size()) when redu...
- 05:44 PM Backport #57704 (Resolved): quincy: mon/MonMap.h: FAILED ceph_assert(m < ranks.size()) when reduc...
- 05:43 PM Bug #57529 (In Progress): mclock backfill is getting higher priority than WPQ
- Marking as WIP as IIRC Sridhar was talking about this issue during core standups.
- 05:42 PM Bug #57573 (In Progress): intrusive_lru leaking memory when
- As I understood:
1. @evict()@ intends to not free too much (which makes sense).
2. The dtor reuses @evict()@ for c...
- 05:39 PM Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
- A note from the bug scrub: work in progress.
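A hypothetical sketch of the leak pattern described in the 05:42 PM Bug #57573 comment above; the names (Cache, unreferenced, target_size) are illustrative and this is not the actual src/common/intrusive_lru.h code. An evict() that deliberately keeps some unreferenced entries cached will, if simply reused by the destructor, leave those entries allocated forever.
    #include <cstddef>
    #include <list>
    struct Cache {
      std::list<int*> unreferenced;   // cached entries not pinned by any user
      std::size_t target_size = 16;   // how many unreferenced entries to keep around
      void evict() {                  // frees only the surplus beyond target_size
        while (unreferenced.size() > target_size) {
          delete unreferenced.back();
          unreferenced.pop_back();
        }
      }
      ~Cache() { evict(); }           // bug pattern: up to target_size entries are never freed
    };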
- 05:35 PM Bug #50089 (Pending Backport): mon/MonMap.h: FAILED ceph_assert(m < ranks.size()) when reducing n...
- 11:06 AM Bug #50089: mon/MonMap.h: FAILED ceph_assert(m < ranks.size()) when reducing number of monitors i...
- ...
- 11:03 AM Bug #50089: mon/MonMap.h: FAILED ceph_assert(m < ranks.size()) when reducing number of monitors i...
- I am seeing the same crash in ceph version 16.2.10 and just noticed that the PR linked in this thread is merged...
- 01:10 PM Backport #57696 (Resolved): quincy: ceph log last command fail to log by verbosity level
- https://github.com/ceph/ceph/pull/50407
- 01:04 PM Feature #52424 (Resolved): [RFE] Limit slow request details to mgr log
- 01:03 PM Bug #57340 (Pending Backport): ceph log last command fail to log by verbosity level
09/27/2022
- 01:02 PM Bug #17170 (New): mon/monclient: update "unable to obtain rotating service keys when osd init" to...
- 01:02 PM Bug #17170 (Closed): mon/monclient: update "unable to obtain rotating service keys when osd init"...
- This report can technically have other causes, but it's just always because the OSDs are too far out of clock sync wi...
- 03:12 AM Bug #57678 (Resolved): Mon fail to send pending metadata through MMgrUpdate after an upgrade resu...
- The prometheus metrics are still showing the older ceph version for an upgraded mon. This issue is observed if we upgrade cluste...
09/26/2022
- 02:44 PM Bug #51688 (In Progress): "stuck peering for" warning is misleading
- 02:44 PM Bug #51688: "stuck peering for" warning is misleading
- Shreyansh Sancheti is working on this bug.
- 01:11 PM Backport #57258 (In Progress): pacific: Assert in Ceph messenger
- 12:29 PM Backport #56722 (In Progress): pacific: osd thread deadlock
- 09:20 AM Backport #55633: octopus: ceph-osd takes all memory before oom on boot
- Konstantin Shalygin wrote:
> Igor, seems when `version` field is not set it's possible to change issue `status`
>
...
09/24/2022
- 08:08 AM Bug #56495 (Resolved): Log at 1 when Throttle::get_or_fail() fails
- 08:08 AM Backport #56641 (Resolved): quincy: Log at 1 when Throttle::get_or_fail() fails
- 08:07 AM Backport #56642 (Resolved): pacific: Log at 1 when Throttle::get_or_fail() fails
- 08:04 AM Backport #57257 (Resolved): quincy: Assert in Ceph messenger
- 08:03 AM Backport #56723 (Resolved): quincy: osd thread deadlock
- 07:58 AM Backport #55633: octopus: ceph-osd takes all memory before oom on boot
- Igor, seems when `version` field is not set it's possible to change issue `status`
Radoslaw, what is the current s...
- 07:57 AM Backport #55633 (In Progress): octopus: ceph-osd takes all memory before oom on boot
- 07:56 AM Backport #55631 (Resolved): pacific: ceph-osd takes all memory before oom on boot
- Now PR merged, set resolved
09/22/2022
- 08:30 PM Backport #56642: pacific: Log at 1 when Throttle::get_or_fail() fails
- Radoslaw Zarzynski wrote:
> https://github.com/ceph/ceph/pull/47764
merged
- 05:11 PM Bug #57650: mon-stretch: reweighting an osd to a big number, then back to original causes uneven ...
- ceph osd tree:...
- 05:09 PM Bug #57650 (In Progress): mon-stretch: reweighting an osd to a big number, then back to original ...
- Reweight an osd from 0.0900 to 0.7000
and then reweight back to 0.0900. Causes uneven weights between
two zones rep...
- 03:03 PM Bug #57628: osd:PeeringState.cc: FAILED ceph_assert(info.history.same_interval_since != 0)
- The same issue was reported in telemetry also on version 15.0.0:
http://telemetry.front.sepia.ceph.com:4000/d/jByk5H...
- 02:07 PM Bug #57570 (Fix Under Review): mon-stretched_cluster: Site weights are not monitored post stretch...
- 12:43 PM Bug #57632 (In Progress): test_envlibrados_for_rocksdb: free(): invalid pointer
- 06:44 AM Bug #57632 (Closed): test_envlibrados_for_rocksdb: free(): invalid pointer
- /a/kchai-2022-08-23_13:19:39-rados-wip-kefu-testing-2022-08-22-2243-distro-default-smithi/6987883/...
- 06:45 AM Bug #57163 (Resolved): free(): invalid pointer
- test_envlibrados_for_rocksdb failure will be tracked here: https://tracker.ceph.com/issues/57632
- 05:23 AM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
- Thanks for the reproducer Laura, I'm looking into the failures.
09/21/2022
- 10:28 PM Bug #57628: osd:PeeringState.cc: FAILED ceph_assert(info.history.same_interval_since != 0)
- Telemetry also caught this on v14.1.1. Copying that link here to provide the full picture:
http://telemetry.front....
- 10:00 PM Bug #57628: osd:PeeringState.cc: FAILED ceph_assert(info.history.same_interval_since != 0)
- Caught by Telemetry, happened twice on one 16.2.7 cluster:
http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/...
- 09:59 PM Bug #57628: osd:PeeringState.cc: FAILED ceph_assert(info.history.same_interval_since != 0)
- Might be Tracker #39659, but there aren't any logs anymore, so no way to be sure.
- 09:58 PM Bug #57628 (In Progress): osd:PeeringState.cc: FAILED ceph_assert(info.history.same_interval_sinc...
- /a/yuriw-2022-09-09_14:59:25-rados-wip-yuri2-testing-2022-09-06-1007-pacific-distro-default-smithi/7022809...
- 03:18 PM Bug #51688: "stuck peering for" warning is misleading
- Peering PGs can be simulated in a vstart cluster by marking an OSD down with `./bin/ceph osd down <id>`.
- 02:42 PM Bug #51688: "stuck peering for" warning is misleading
- The relevant code would be in `src/mon/PGMap.cc` and `src/mon/PGMap.h`.
- 11:07 AM Bug #57618: rados/test.sh hang and pkilled (LibRadosWatchNotifyEC.WatchNotify)
- It will only happen with EC pools; the hang will happen when not all OSDs are up, but still, I'm not sure if we suppos...
- 06:28 AM Bug #57618 (Pending Backport): rados/test.sh hang and pkilled (LibRadosWatchNotifyEC.WatchNotify)
- Job stopped with...
- 10:39 AM Bug #57616 (Resolved): osd/scrub: on_replica_init() cannot be called twice
09/20/2022
- 12:23 PM Bug #57616 (Resolved): osd/scrub: on_replica_init() cannot be called twice
- on_replica_init() may be called twice for a specific scrub-chunk request from a replica.
But after 30facb0f2b, it st...
- 09:00 AM Backport #57373 (In Progress): pacific: segfault in librados via libcephsqlite
09/19/2022
- 03:37 PM Bug #57340: ceph log last command fail to log by verbosity level
- https://github.com/ceph/ceph/pull/47873 merged
- 03:09 PM Bug #57600 (New): thrash-erasure-code: AssertionError: wait_for_recovery timeout due to "active+r...
- /a/yuriw-2022-08-24_16:39:47-rados-wip-yuri4-testing-2022-08-24-0707-pacific-distro-default-smithi/6990392
Descripti...
- 03:08 PM Bug #57599 (New): thrash-erasure-code: AssertionError: wait_for_recovery timeout due to "recoveri...
- /a/yuriw-2022-06-23_21:29:45-rados-wip-yuri4-testing-2022-06-22-1415-pacific-distro-default-smithi/6895209
Descrip...
- 02:45 PM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
- /a/yuriw-2022-09-14_13:16:11-rados-wip-yuri6-testing-2022-09-13-1352-distro-default-smithi/7032356
- 02:45 PM Bug #56149: thrash-erasure-code: AssertionError: wait_for_recovery timeout due to "active+recover...
- Matan Breizman wrote:
> /a/yuriw-2022-09-14_13:16:11-rados-wip-yuri6-testing-2022-09-13-1352-distro-default-smithi/7...
- 12:27 PM Bug #56149: thrash-erasure-code: AssertionError: wait_for_recovery timeout due to "active+recover...
- /a/yuriw-2022-09-14_13:16:11-rados-wip-yuri6-testing-2022-09-13-1352-distro-default-smithi/7032356
09/16/2022
- 10:22 PM Cleanup #57587 (Fix Under Review): mon: fix Elector warnings
- 07:33 PM Cleanup #57587 (Resolved): mon: fix Elector warnings
- `ninja mon -j$(nproc)` on the latest main branch....
- 05:27 PM Bug #57585 (Fix Under Review): ceph versions : mds : remove empty list entries from ceph versions
- 05:25 PM Bug #57585 (Pending Backport): ceph versions : mds : remove empty list entries from ceph versions
- Downstream BZ https://bugzilla.redhat.com/show_bug.cgi?id=2110933
- 03:31 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- /a/yuriw-2022-09-15_17:53:16-rados-quincy-release-distro-default-smithi/7034203/
- 03:29 PM Bug #36304: FAILED ceph_assert(p != pg_slots.end()) in OSDShard::register_and_wake_split_child(PG*)
- /a/yuriw-2022-09-15_17:53:16-rados-quincy-release-distro-default-smithi/7034166
- 11:19 AM Fix #57577 (Resolved): osd: Improve osd bench accuracy by using buffers with random patterns
- The osd bench currently uses buffers filled with the same character
for all the writes issued. Buffers can be filled...
- 06:22 AM Cleanup #52752: fix warnings
- May be evident with "ninja common -j$(nproc)".
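A minimal sketch of the idea behind Fix #57577 above, illustrative only (the helper name make_random_buffer is made up; this is not the osd bench code): filling write buffers from a PRNG instead of with one repeated character, so compression or deduplication on the data path cannot inflate the measured throughput.
    #include <cstddef>
    #include <cstdint>
    #include <random>
    #include <vector>
    std::vector<char> make_random_buffer(std::size_t len, uint64_t seed) {
      std::mt19937_64 rng(seed);
      std::uniform_int_distribution<int> byte(0, 255);
      std::vector<char> buf(len);
      for (auto &b : buf)
        b = static_cast<char>(byte(rng));  // random pattern, unlike a buffer of identical bytes
      return buf;
    }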
- 06:20 AM Cleanup #52754: windows warnings
- Should be evident when running "ninja client -j$(nproc)".
09/15/2022
- 09:34 PM Cleanup #52754: windows warnings
- New link: https://jenkins.ceph.com/job/ceph-dev-build/ARCH=x86_64,AVAILABLE_ARCH=x86_64,AVAILABLE_DIST=windows,DIST=w...
- 09:32 PM Bug #53251: compiler warning about deprecated fmt::format_to()
- Check by running `ninja mon -j$(nproc)` under the ceph/build directory
- 08:15 PM Bug #57573 (Pending Backport): intrusive_lru leaking memory when
- Values allocated during inserts in the lru defined in
src/common/intrusive_lru.h that are
unreferenced are sometim...
- 03:42 PM Feature #57557: Ability to roll-back the enabled stretch-cluster configuration
- As discussed in https://bugzilla.redhat.com/show_bug.cgi?id=2094016, this hasn't been implemented yet.
- 12:47 PM Feature #57557 (New): Ability to roll-back the enabled stretch-cluster configuration
- We have enabled a stretch-cluster configuration on a pre-production system with several already existing and used poo...
- 03:32 PM Bug #57570 (Pending Backport): mon-stretched_cluster: Site weights are not monitored post stretch...
- Site weights are not monitored post-stretch mode deployment.
Basically, after we successfully enabled stretch mode,...
- 02:33 PM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
- Running the reproducer to see whether this bug also occurs on main:
http://pulpito.front.sepia.ceph.com/lflores-2022...
- 04:50 AM Backport #57545 (In Progress): quincy: CommandFailedError: Command failed (workunit test rados/te...
- 04:48 AM Backport #57544 (In Progress): pacific: CommandFailedError: Command failed (workunit test rados/t...
- 12:58 AM Backport #57313 (In Progress): pacific: Heap command prints with "ceph tell", but not with "ceph ...
09/14/2022
- 09:18 PM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
- Quincy revert PR https://github.com/ceph/ceph/pull/48104
Not sure if we want to put this as a "fix".
- 09:03 PM Bug #57546 (Fix Under Review): rados/thrash-erasure-code: wait_for_recovery timeout due to "activ...
- When testing the Quincy RC for 17.2.4, we discovered this failure:
Description: rados/thrash-erasure-code/{ceph cl...
- 07:24 PM Backport #57545 (Resolved): quincy: CommandFailedError: Command failed (workunit test rados/test_...
- https://github.com/ceph/ceph/pull/48113
- 07:24 PM Backport #57544 (Resolved): pacific: CommandFailedError: Command failed (workunit test rados/test...
- https://github.com/ceph/ceph/pull/48112
- 07:16 PM Bug #45721 (Pending Backport): CommandFailedError: Command failed (workunit test rados/test_pytho...
- 03:01 PM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
- /a/yuriw-2022-09-10_14:05:53-rados-quincy-release-distro-default-smithi/7024401...
- 09:44 AM Bug #49524 (In Progress): ceph_test_rados_delete_pools_parallel didn't start
- 09:44 AM Bug #49524: ceph_test_rados_delete_pools_parallel didn't start
- My theory is that fork failed, which caused all the tests not to run; this is the only place we won't get any printing...
- 09:35 AM Bug #45702 (Fix Under Review): PGLog::read_log_and_missing: ceph_assert(miter == missing.get_item...
- 07:10 AM Bug #57533 (Resolved): Able to modify the mclock reservation, weight and limit parameters when bu...
[ceph: root@magna086 /]# ceph config get osd osd_mclock_scheduler_client_res
1
[ceph: root@magna086 /]# ceph conf...
- 06:27 AM Bug #57532: Notice discrepancies in the performance of mclock built-in profiles
- From the following data, I noticed that -
1. In the case-1, for all profiles the IO reservations for high_clien...
- 06:23 AM Bug #57532 (Duplicate): Notice discrepancies in the performance of mclock built-in profiles
- Downstream BZ- https://bugzilla.redhat.com/show_bug.cgi?id=2126274
09/13/2022
- 09:16 PM Bug #57529 (Resolved): mclock backfill is getting higher priority than WPQ
- Downstream BZ - https://bugzilla.redhat.com/show_bug.cgi?id=2126559
Version - 17.2.1
09/12/2022
- 11:39 PM Bug #48840 (Closed): Octopus: Assert failure: test_ceph_osd_pool_create_utf8
- Closing, as this was only reported in Octopus, which is EOL.
- 08:15 PM Bug #57310: StriperTest: The futex facility returned an unexpected error code
- @Radek yes, thanks, this issue is still ongoing.
- 07:13 PM Bug #57310: StriperTest: The futex facility returned an unexpected error code
- According to "Patrick's comment in PR #47841":https://github.com/ceph/ceph/pull/47841 it doesn't address the prob...
- 06:57 PM Bug #52624: qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
- Let's move back to it next week.
- 06:54 PM Bug #57467 (Resolved): EncodingException.Macros fails on make check on quincy
- 06:51 PM Bug #51194: PG recovery_unfound after scrub repair failed on primary
- Would be great to have a reproducer for this. Let's see whether we can make a standalone test exercising the sequence...
- 06:33 PM Bug #43268: Restrict admin socket commands more from the Ceph tool
- Tagging as medium-hanging-fruit as, IIUC, we would need to:
0. (only if necessary): introduce a config variable to...
- 02:34 PM Feature #53050 (Resolved): Support blocklisting a CIDR range
- 02:34 PM Backport #55747 (Resolved): pacific: Support blocklisting a CIDR range
- 02:14 PM Bug #53729 (Pending Backport): ceph-osd takes all memory before oom on boot
- 02:14 PM Backport #55631: pacific: ceph-osd takes all memory before oom on boot
- https://github.com/ceph/ceph/pull/47701
- 11:07 AM Bug #49524: ceph_test_rados_delete_pools_parallel didn't start
- The printing will be flushed only after the process completes; in the case of ceph_test_rados_delete_pools_parallel, ...
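A minimal standalone illustration of the behavior described in the comment above (this is ordinary POSIX/stdio behavior, not the test's actual code): with stdout redirected to a pipe or file, printf() output from a forked child is block-buffered and only shows up once the child's buffers are flushed at exit, and if fork() itself fails nothing is printed at all.
    #include <cstdio>
    #include <sys/wait.h>
    #include <unistd.h>
    int main() {
      pid_t pid = fork();
      if (pid < 0) {
        std::perror("fork");                  // if fork() fails, no test output ever appears
        return 1;
      }
      if (pid == 0) {
        std::printf("child: test output\n");  // buffered while stdout is a pipe/file
        sleep(5);                             // nothing visible yet...
        return 0;                             // ...normal exit flushes the stdio buffers
      }
      waitpid(pid, nullptr, 0);
      return 0;
    }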
- 06:53 AM Bug #54558: malformed json in a Ceph RESTful API call can stop all ceph-mon services
- Ilya Dryomov wrote:
> I don't think https://github.com/ceph/ceph/pull/45547 is a complete fix, see my comment in the...
09/11/2022
- 03:42 PM Bug #51194: PG recovery_unfound after scrub repair failed on primary
- Just hit this in a v15.2.15 cluster too. Michel, which version does your cluster run?
- 05:05 AM Backport #57496 (In Progress): quincy: Invalid read of size 8 in handle_recovery_delete()
- 05:03 AM Backport #57496 (Resolved): quincy: Invalid read of size 8 in handle_recovery_delete()
- https://github.com/ceph/ceph/pull/48039
09/09/2022
- 06:50 PM Backport #57257: quincy: Assert in Ceph messenger
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/47931
merged
- 04:59 PM Bug #49524: ceph_test_rados_delete_pools_parallel didn't start
- Nitzan, can you please take a look at this issue? It seems intermittent, but still exists.
- 04:44 PM Bug #49524: ceph_test_rados_delete_pools_parallel didn't start
- These failures are diagnosed by noting the failed pid (in this case 59576), and backtracking to see which test it was...
- 04:09 PM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
- /a/yuriw-2022-09-05_13:59:13-rados-wip-yuri10-testing-2022-09-04-0811-quincy-distro-default-smithi/7012481
Needs a...
09/08/2022
- 04:38 PM Backport #57209: quincy: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/47932
merged
- 04:37 PM Bug #57467 (Fix Under Review): EncodingException.Macros fails on make check on quincy
- 11:18 AM Bug #57467: EncodingException.Macros fails on make check on quincy
- Since this has probably fallen off of Kefu's radar, I went ahead and opened https://github.com/ceph/ceph/pull/48016.
- 03:26 PM Documentation #57448 (Resolved): Doc: Update release notes on the fix for high CPU usage during r...
- 03:26 PM Backport #57461 (Resolved): quincy: Doc: Update release notes on the fix for high CPU usage durin...
09/07/2022
- 10:09 PM Bug #57467: EncodingException.Macros fails on make check on quincy
- There was an attempt to fix this issue here: https://github.com/ceph/ceph/pull/47938
- 10:07 PM Bug #57467 (Resolved): EncodingException.Macros fails on make check on quincy
- irvingi07: https://jenkins.ceph.com/job/ceph-pull-requests/103416...
- 06:12 PM Bug #54558: malformed json in a Ceph RESTful API call can stop all ceph-mon services
- I don't think https://github.com/ceph/ceph/pull/45547 is a complete fix, see my comment in the PR.
- 03:01 PM Bug #55233: librados C++ API requires C++17 to build
- https://github.com/ceph/ceph/pull/46005 merged
- 02:57 PM Backport #56736: quincy: unessesarily long laggy PG state
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/47901
merged
- 02:55 PM Backport #55297: quincy: malformed json in a Ceph RESTful API call can stop all ceph-mon services
- nikhil kshirsagar wrote:
> please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/...
- 02:12 PM Backport #57461 (In Progress): quincy: Doc: Update release notes on the fix for high CPU usage du...
- 02:03 PM Backport #57461 (Resolved): quincy: Doc: Update release notes on the fix for high CPU usage durin...
- https://github.com/ceph/ceph/pull/48004
- 12:47 PM Bug #46847: Loss of placement information on OSD reboot
- The PR https://github.com/ceph/ceph/pull/40849 for adding the test was marked stale. I left a comment and it would be...
- 12:10 PM Bug #52624: qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
- Took a look at why peering was happening in the first place. Looking at PG 7.16 logs below, we can see that the balan...
- 05:25 AM Backport #57346: quincy: expected valgrind issues and found none
- /a/yuriw-2022-09-03_14:52:22-rados-wip-yuri-testing-2022-09-02-0945-quincy-distro-default-smithi/7009611
- 02:57 AM Bug #42884: OSDMapTest.CleanPGUpmaps failure
- adami03: https://jenkins.ceph.com/job/ceph-pull-requests/103395/console
09/06/2022
- 08:45 PM Backport #55309: pacific: prometheus metrics shows incorrect ceph version for upgraded ceph daemon
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/47693
merged
- 08:42 PM Backport #55305: quincy: Manager is failing to keep updated metadata in daemon_state for upgraded...
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46559
merged
- 05:57 PM Bug #56574: rados/valgrind-leaks: cluster [WRN] Health check failed: 2 osds down (OSD_DOWN)" in c...
- /a/yuriw-2022-08-22_16:21:19-rados-wip-yuri8-testing-2022-08-22-0646-distro-default-smithi/6985175
- 03:19 PM Documentation #57448 (Resolved): Doc: Update release notes on the fix for high CPU usage during r...
- 02:15 PM Backport #57312 (Resolved): quincy: Heap command prints with "ceph tell", but not with "ceph daemon"
- 12:59 PM Backport #55156 (Resolved): pacific: mon: config commands do not accept whitespace style config name
- 12:29 PM Backport #55308 (Resolved): pacific: Manager is failing to keep updated metadata in daemon_state ...
- 05:56 AM Backport #57443 (In Progress): quincy: osd: Update osd's IOPS capacity using async Context comple...
- 05:09 AM Backport #57443 (Resolved): quincy: osd: Update osd's IOPS capacity using async Context completio...
- https://github.com/ceph/ceph/pull/47983
- 04:42 AM Fix #57040 (Pending Backport): osd: Update osd's IOPS capacity using async Context completion ins...
09/05/2022
- 02:10 PM Backport #56641: quincy: Log at 1 when Throttle::get_or_fail() fails
- Radoslaw Zarzynski wrote:
> https://github.com/ceph/ceph/pull/47765
merged
- 02:02 PM Backport #57372: quincy: segfault in librados via libcephsqlite
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/47909
merged
09/04/2022
- 02:22 PM Bug #53000: OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_d...
- ...
- 08:08 AM Backport #57346: quincy: expected valgrind issues and found none
- /a/yuriw-2022-09-02_15:23:14-rados-wip-yuri6-testing-2022-09-01-1034-quincy-distro-default-smithi/7008140/
/a/yuri...
09/02/2022
- 05:38 PM Backport #57117: quincy: mon: race condition between `mgr fail` and MgrMonitor::prepare_beacon()
- Should be backported with https://github.com/ceph/ceph/pull/47834.
- 05:08 PM Backport #57346 (In Progress): quincy: expected valgrind issues and found none
- 05:06 PM Backport #57209 (In Progress): quincy: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
- 04:59 PM Backport #55972 (Resolved): quincy: found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10...
- 04:59 PM Backport #55972: quincy: found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10121515-14:e...
- Already in quincy. See https://github.com/ceph/ceph/pull/46498.
- 04:49 PM Backport #57257 (In Progress): quincy: Assert in Ceph messenger
- 04:48 PM Backport #56723 (In Progress): quincy: osd thread deadlock
- 04:47 PM Backport #56655 (In Progress): quincy: rados/test.sh hangs while running LibRadosTwoPoolsPP.TierF...
- 04:46 PM Backport #56602 (In Progress): quincy: ceph report missing osdmap_clean_epochs if answered by peon
- 04:34 PM Backport #55543 (In Progress): quincy: should use TCMalloc for better performance
- 04:34 PM Backport #55282 (In Progress): quincy: osd: add scrub duration for scrubs after recovery
- 04:31 PM Backport #56648 (In Progress): quincy: [Progress] Do not show NEW PG_NUM value for pool if autosc...
- 04:18 PM Backport #57312: quincy: Heap command prints with "ceph tell", but not with "ceph daemon"
- Laura Flores wrote:
> https://github.com/ceph/ceph/pull/47825
merged
- 02:20 PM Bug #54172 (Resolved): ceph version 16.2.7 PG scrubs not progressing
- 02:19 PM Backport #56409 (Resolved): pacific: ceph version 16.2.7 PG scrubs not progressing
- 02:12 PM Feature #54600 (Resolved): Add scrub_duration to pg dump json format
- 02:12 PM Backport #54602 (Duplicate): quincy: Add scrub_duration to pg dump json format
- 02:10 PM Backport #54601 (Resolved): quincy: Add scrub_duration to pg dump json format
- 02:10 PM Backport #55065 (Rejected): quincy: osd_fast_shutdown_notify_mon option should be true by default
- 01:59 PM Backport #56551 (Resolved): quincy: mon/Elector: notify_ranked_removed() does not properly erase ...
- 01:53 PM Backport #57030 (Resolved): quincy: rados/test.sh: Early exit right after LibRados global tests c...
- 01:45 PM Backport #57289 (Rejected): quincy: OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<Parallel...
- This backport ticket is a result of a thinko. Rejecting.
- 01:44 PM Backport #57288 (Rejected): pacific: OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<Paralle...
- https://github.com/ceph/ceph/pull/45582 is NOT the fix. This backport ticket is a result of a thinko. Rejecting.
- 01:40 PM Bug #53000 (New): OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<ParallelPGMapper::Item>::_...
- Sorry, moving back to @New@.
- 01:31 PM Bug #53740 (Resolved): mon: all mon daemon always crash after rm pool
- No need for backporting to quincy – the fix is already there (see the comment in the backport ticket). Resolving.
- 01:30 PM Backport #53977 (Rejected): quincy: mon: all mon daemon always crash after rm pool
- 01:29 PM Backport #53977: quincy: mon: all mon daemon always crash after rm pool
- The fix is already in quincy:...
- 01:02 PM Backport #56408 (Resolved): quincy: ceph version 16.2.7 PG scrubs not progressing
- 01:01 PM Backport #55157 (Resolved): quincy: mon: config commands do not accept whitespace style config name
- 12:55 PM Backport #55632 (Resolved): quincy: ceph-osd takes all memory before oom on boot
- The last missing part (the online dups trimming) is merged.
- 02:47 AM Bug #57119 (Pending Backport): Heap command prints with "ceph tell", but not with "ceph daemon"
- 02:39 AM Bug #57165: expected valgrind issues and found none
- Quincy runs:
https://pulpito.ceph.com/yuriw-2022-09-01_16:26:28-rados-wip-lflores-testing-2-2022-08-26-2240-quincy...
09/01/2022
- 11:03 PM Bug #57119: Heap command prints with "ceph tell", but not with "ceph daemon"
- https://github.com/ceph/ceph/pull/47650 merged
- 04:13 PM Backport #57372 (In Progress): quincy: segfault in librados via libcephsqlite
- 04:05 PM Backport #57372 (Resolved): quincy: segfault in librados via libcephsqlite
- https://github.com/ceph/ceph/pull/47909
- 04:05 PM Backport #57373 (Resolved): pacific: segfault in librados via libcephsqlite
- https://github.com/ceph/ceph/pull/48187
- 04:00 PM Bug #57152 (Pending Backport): segfault in librados via libcephsqlite
- 01:39 PM Backport #56736 (In Progress): quincy: unessesarily long laggy PG state
- 01:35 PM Bug #57163 (Fix Under Review): free(): invalid pointer
- >Many thanks to Josh for suggesting we may be dealing with a compiler mismatch here and sorry if you were working on ...
- 12:30 PM Bug #57163: free(): invalid pointer
- /a/yuriw-2022-09-01_00:21:36-rados-wip-yuri7-testing-2022-08-31-0841-distro-default-smithi/7003413
- 01:24 PM Backport #56734 (In Progress): pacific: unessesarily long laggy PG state
- 10:36 AM Bug #49231: MONs unresponsive over extended periods of time
- OK, I did some more work and it looks like I can trigger the issue with some certainty by failing an MDS that was up ...