Activity

From 07/27/2023 to 08/25/2023

08/25/2023

06:41 PM Bug #48896: osd/OSDMap.cc: FAILED ceph_assert(osd_weight.count(i.first))
http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?var-sig_v2=15d62a24ad12be22753ffcc0a78cd90cf7... Laura Flores
05:43 PM Bug #62588 (New): ceph config set allows WHO to be osd.*, which is misleading
We came across a customer cluster that uses `ceph config set osd.* ...`, thinking it would apply to *all* OSDs.
In fa...
Dan van der Ster
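For context, a minimal sketch of the WHO/mask syntax involved (the option name, hostname, and device class below are illustrative):

    # Accepted but, per this bug, misleading: 'osd.*' is not expanded as a glob
    ceph config set osd.* osd_max_backfills 2
    # To target all OSDs, use the bare daemon type as WHO
    ceph config set osd osd_max_backfills 2
    # Or narrow the scope with a host or device-class mask
    ceph config set osd/host:ceph-node1 osd_max_backfills 2
    ceph config set osd/class:ssd osd_max_backfills 2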

08/24/2023

08:26 PM Bug #62578 (New): mon: osd pg-upmap-items command causes PG_DEGRADED warnings
... Patrick Donnelly
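For reference, a minimal sketch of the command named in the title (the PG id and OSD ids are illustrative):

    # Remap pg 2.7 so that osd.0 is replaced by osd.3 in its up set
    ceph osd pg-upmap-items 2.7 0 3
    # Remove the mapping again
    ceph osd rm-pg-upmap-items 2.7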
12:16 PM Bug #62568: Coredump in rados_aio_write_op_operate
Is there any recommendation to proceed with the latest version to overcome this coredump? Nokia ceph-users
12:12 PM Bug #62568 (New): Coredump in rados_aio_write_op_operate
We are facing a crash issue in the function rados_aio_write_op_operate(). Please find the stack trace below,
current ...
Nokia ceph-users
04:11 AM Backport #59537 (Resolved): quincy: osd/scrub: verify SnapMapper consistency not backported
Konstantin Shalygin

08/23/2023

10:18 PM Bug #49727: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
/a/yuriw-2023-08-21_23:10:07-rados-pacific-release-distro-default-smithi/7375579 Laura Flores
09:53 PM Bug #62557: rados: Teuthology test failure due to "MDS_CLIENTS_LAGGY" warning
Description: rados/dashboard/{centos_8.stream_container_tools clusters/{2-node-mgr} debug/mgr mon_election/connectivi... Laura Flores
09:45 PM Bug #62557 (New): rados: Teuthology test failure due to "MDS_CLIENTS_LAGGY" warning
Description: rados/dashboard/{centos_8.stream_container_tools clusters/{2-node-mgr} debug/mgr mon_election/classic ra... Laura Flores
12:03 PM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
/a/yuriw-2023-08-22_18:16:03-rados-wip-yuri10-testing-2023-08-17-1444-distro-default-smithi/7376687 Matan Breizman

08/22/2023

06:05 PM Bug #47838: mon/test_mon_osdmap_prune.sh: first_pinned != trim_to
/a/yuriw-2023-08-17_21:18:20-rados-wip-yuri11-testing-2023-08-17-0823-distro-default-smithi/7372203 Laura Flores
05:16 PM Bug #49689 (Resolved): osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch...
Matan Breizman
05:14 PM Backport #61149 (Resolved): pacific: osd/PeeringState.cc: ceph_abort_msg("past_interval start int...
Matan Breizman
04:20 PM Bug #62529 (New): PrimaryLogPG::log_op_stats uses `now` time vs op end time when calculating op l...
* With `debug_osd` 15 or higher, a log line called `log_op_stats` is output, which ends with a reported latency value... Michael Kidd
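A minimal sketch of surfacing these lines, assuming a default log location (the daemon id and log path are illustrative):

    # Raise the OSD debug level at runtime so log_op_stats lines are emitted
    ceph tell osd.0 config set debug_osd 15
    # The reported latency appears at the end of each log_op_stats line
    grep log_op_stats /var/log/ceph/ceph-osd.0.log | tail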
09:54 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
# ceph daemon osd.0 config show | grep _message_
"osd_client_message_cap": "0",
"osd_client_message_size_ca...
jianwei zhang
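A sketch of observing and loosening the throttle in question at runtime (the daemon id and cap value are illustrative):

    # Inspect the throttle counters, including get_or_fail_fail
    ceph daemon osd.0 perf dump throttle-osd_client_messages
    # Loosen the per-OSD client message-count throttle
    ceph tell osd.* config set osd_client_message_cap 10000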
09:36 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...

Continuing to increase client traffic:
When the osd_client_message_size_cap upper limit is reached, it will still...
jianwei zhang
09:29 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
After loosening the restriction on osd_client_message_cap, msgr-worker CPU is reduced to 1%.
But the throttle-osd_c...
jianwei zhang
09:25 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
... jianwei zhang
09:24 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
... jianwei zhang
09:07 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
reproduce :... jianwei zhang
08:03 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
... jianwei zhang
07:58 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
!https://tracker.ceph.com/attachments/download/6631/osd-cpu-history-change.jpg! jianwei zhang
07:57 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
!osd-cpu-history-change!
As shown in the figure,
16:00 ~ 20:00: osd_client_message_cap=256, CPU as high as 200%
af...
jianwei zhang
07:45 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
!osd-cpu-history-change! jianwei zhang
07:36 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
Question:
osd throttle-osd_client_messages has been producing a lot of get_or_fail_fail.
top -p <osd.pid> see th...
jianwei zhang
07:29 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
!https://tracker.ceph.com/attachments/download/6628/top-osd-cpu.jpg!
!https://tracker.ceph.com/attachments/download/...
jianwei zhang
07:28 AM Bug #62512 (New): osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_f...
Problem:
OSD high CPU
!top-osd-cpu!
!top-H-msgr-worker-cpu!...
jianwei zhang

08/17/2023

10:19 PM Bug #55809: "Leak_IndirectlyLost" valgrind report on mon.c
/a/yuriw-2023-08-16_22:44:42-rados-wip-yuri7-testing-2023-08-16-1309-pacific-distro-default-smithi/7371564/remote/smi... Laura Flores
04:35 PM Backport #62479 (In Progress): quincy: ceph status does not report an application is not enabled ...
Prashant D
05:50 AM Backport #62479 (Resolved): quincy: ceph status does not report an application is not enabled on ...
https://github.com/ceph/ceph/pull/53042 Backport Bot
04:33 PM Backport #62478 (In Progress): reef: ceph status does not report an application is not enabled on...
Prashant D
05:50 AM Backport #62478 (Resolved): reef: ceph status does not report an application is not enabled on th...
https://github.com/ceph/ceph/pull/53041 Backport Bot
01:29 PM Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados...
Seen in a Pacific run : /a/yuriw-2023-08-16_22:40:18-rados-wip-yuri2-testing-2023-08-16-1142-pacific-distro-default-s... Aishwarya Mathuria
05:50 AM Bug #57097 (Pending Backport): ceph status does not report an application is not enabled on the p...
Prashant D
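For context, the warning this fix surfaces (POOL_APP_NOT_ENABLED) clears once the pool is tagged with an application; a minimal sketch with illustrative pool and application names:

    # Associate an application with the pool
    ceph osd pool application enable mypool rbd
    # Confirm the association
    ceph osd pool application get mypool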

08/16/2023

05:23 PM Bug #62470 (Need More Info): Rook: OSD Crash Looping / Caught signal (Aborted) / thread_name:tp_o...
After a discussion on Rook GitHub (https://github.com/rook/rook/discussions/12713#discussioncomment-6730118), they ... Richard Durso

08/15/2023

08:17 PM Bug #53768: timed out waiting for admin_socket to appear after osd.2 restart in thrasher/defaults...
/a/yuriw-2023-08-10_20:19:11-rados-wip-yuri2-testing-2023-08-08-0755-pacific-distro-default-smithi/7366113
This on...
Laura Flores

08/14/2023

05:55 PM Bug #62382 (Fix Under Review): mon/MonClient: ms_handle_fast_authentication return value ignored
Patrick Donnelly

08/11/2023

07:28 PM Bug #62248: upstream Quincy incorrectly reporting pgs backfill_toofull
We have seen this behavior for an EC 2+2 pool, and from the OSD debug logs the PGs were marked as backfill_toofull based o... Prashant D
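When triaging reports like this one, the thresholds driving the backfill_toofull decision can be checked directly; a minimal sketch:

    # Cluster-wide full / backfillfull / nearfull thresholds
    ceph osd dump | grep ratio
    # Per-OSD utilization, to compare against backfillfull_ratio
    ceph osd df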
05:31 PM Backport #62031 (Resolved): pacific: ceph config set using osd/host mask not working
Konstantin Shalygin
03:28 PM Backport #62031: pacific: ceph config set using osd/host mask not working
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/52468
merged
Yuri Weinstein
03:27 PM Backport #61488: pacific: ceph: osd blocklist does not accept v2/v1: prefix for addr
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/51812
merged
Yuri Weinstein
03:21 PM Backport #59537: quincy: osd/scrub: verify SnapMapper consistency not backported
Ronen Friedman wrote:
> https://github.com/ceph/ceph/pull/52256 is pending review
merged
Yuri Weinstein
03:20 PM Backport #61569: quincy: the mgr, osd version information missing in "ceph versions" command duri...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/52161
merged
Yuri Weinstein
07:43 AM Bug #58156 (Resolved): Monitors do not permit OSD to join after upgrading to Quincy
Igor Fedotov
07:43 AM Backport #59455 (Resolved): pacific: Monitors do not permit OSD to join after upgrading to Quincy
Igor Fedotov

08/10/2023

10:32 PM Bug #50371: Segmentation fault (core dumped) ceph_test_rados_api_watch_notify_pp
/a/yuriw-2023-08-02_20:21:03-rados-wip-yuri3-testing-2023-08-01-0825-pacific-distro-default-smithi/7358894 Laura Flores
08:42 PM Bug #62400 (New): test_pool_min_size: wait_for_clean passed with 0 PGs
/a/ksirivad-2023-08-10_16:54:25-rados:thrash-erasure-code-reef-release-distro-default-smithi/7364891... Kamoltat (Junior) Sirivadhna
04:14 AM Bug #62379: mon: "config dump" command's output (normal vs json) is not consistent in terms of di...
Test Results With Fix:... Sridhar Seshasayee
12:19 AM Bug #62382 (Fix Under Review): mon/MonClient: ms_handle_fast_authentication return value ignored
https://github.com/ceph/ceph/blob/fd7710f37dce0fa9b9c7bf38dad70fcf96c2e626/src/mon/MonClient.cc#L1605
and
https...
Patrick Donnelly

08/09/2023

09:05 PM Bug #62338 (Fix Under Review): osd: choose_async_recovery_ec may select an acting set < min_size
Radoslaw Zarzynski
04:16 PM Bug #62248: upstream Quincy incorrectly reporting pgs backfill_toofull
All ceph logs have been copied to the root dir of each node in /root/230809-1607_ceph. So regardless if the cluster i... Tim Wilkinson
02:25 PM Bug #62248: upstream Quincy incorrectly reporting pgs backfill_toofull
This issue was reproduced on one of the clusters. A PG dump snapshot was taken at the time and both _debug_osd_ and _... Tim Wilkinson
03:20 PM Bug #44089 (Fix Under Review): mon: --format=json does not work for config get or show
Leonid Usov
02:23 PM Bug #62379 (Fix Under Review): mon: "config dump" command's output (normal vs json) is not consis...
Sridhar Seshasayee
01:02 PM Bug #62379 (Resolved): mon: "config dump" command's output (normal vs json) is not consistent in ...
The "ceph config dump" command without the json formatted output shows
the localized option names and their values. ...
Sridhar Seshasayee
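The discrepancy is easy to see by comparing the two output formats; a minimal sketch:

    # Plain output: localized option names and their values
    ceph config dump
    # JSON output, which this bug reports as inconsistent with the above
    ceph config dump --format json-pretty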

08/08/2023

05:03 PM Bug #44089: mon: --format=json does not work for config get or show
as for the mysterious newline in front of the json output, here's the root cause from @src/ceph.in:1275@... Leonid Usov
03:56 PM Bug #61762: PGs are stuck in creating+peering when starting up OSDs
Looking at one of the PGs that is in creating+peering, we can see that
it is blocked by OSD.2...
Kamoltat (Junior) Sirivadhna
03:54 PM Bug #61762: PGs are stuck in creating+peering when starting up OSDs
Changing the title to a more accurate one.
Kamoltat (Junior) Sirivadhna
04:08 AM Bug #44715 (Resolved): common/TrackedOp.cc: 163: FAILED ceph_assert((sharded_in_flight_list.back(...
Konstantin Shalygin
04:06 AM Backport #52791 (Resolved): pacific: common/TrackedOp.cc: 163: FAILED ceph_assert((sharded_in_fli...
Konstantin Shalygin

08/07/2023

09:41 PM Bug #61815 (Fix Under Review): PgScrubber cluster warning is misspelled
Laura Flores
08:24 PM Backport #52791: pacific: common/TrackedOp.cc: 163: FAILED ceph_assert((sharded_in_flight_list.ba...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/51249
merged
Yuri Weinstein
05:55 PM Bug #62248: upstream Quincy incorrectly reporting pgs backfill_toofull
It appears to be consistent. Does debug_osd=20 need to be enabled prior to the issue, or can we enable it for some... Tim Wilkinson
05:25 PM Bug #62248 (Need More Info): upstream Quincy incorrectly reporting pgs backfill_toofull
1. How reproducible is this issue?
2. May we have logs with @debug_osd=20@?
Radoslaw Zarzynski
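The requested logs can be captured without restarting the OSDs; a minimal sketch:

    # Raise OSD debug logging on all OSDs at runtime
    ceph tell osd.* config set debug_osd 20
    # Revert once the reproducer has been captured
    ceph tell osd.* config set debug_osd 1/5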
05:14 PM Bug #62167 (Fix Under Review): FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missi...
Radoslaw Zarzynski
05:11 PM Bug #59172 (In Progress): test_pool_min_size: AssertionError: wait_for_clean: failed before timeo...
Radoslaw Zarzynski
02:26 PM Bug #59172: test_pool_min_size: AssertionError: wait_for_clean: failed before timeout expired due...
For case1, the rerun suggests that we can recover with k+1 OSDs, therefore, I think the problem might be more to do w... Kamoltat (Junior) Sirivadhna
02:22 PM Bug #59172: test_pool_min_size: AssertionError: wait_for_clean: failed before timeout expired due...
case 2 basically is related to https://tracker.ceph.com/issues/61762 Kamoltat (Junior) Sirivadhna
02:17 PM Bug #59172: test_pool_min_size: AssertionError: wait_for_clean: failed before timeout expired due...
The issue is not so easily reproduced since:
EC pool case rerun: https://pulpito.ceph.com/ksirivad-2023-08-04_14:3...
Kamoltat (Junior) Sirivadhna
01:51 PM Bug #59172: test_pool_min_size: AssertionError: wait_for_clean: failed before timeout expired due...
There are two cases of failure in this tracker:
1. EC pool OSD thrash with 4+2 profile with 8 OSDs, we fail 3/8 O...
Kamoltat (Junior) Sirivadhna
05:10 PM Bug #62119 (In Progress): timeout on reserving replicas
Radoslaw Zarzynski
02:55 PM Bug #57570: mon-stretched_cluster: Site weights are not monitored post stretch mode deployment
https://github.com/ceph/ceph/pull/52457 merged Yuri Weinstein
05:21 AM Bug #59291: pg_pool_t version compatibility issue
Radoslaw Zarzynski wrote:
> The proposal we discussed:
>
> [...]
Looks good to me, thanks for sharing.
jianwei zhang

08/05/2023

12:49 AM Bug #62338: osd: choose_async_recovery_ec may select an acting set < min_size
The workaround for this issue is to set osd_async_recovery_min_cost to a very large value.... Prashant D
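A sketch of that workaround (the value is illustrative; anything large enough that async recovery is never selected will do):

    # Steer peering away from async recovery by raising its cost threshold
    ceph config set osd osd_async_recovery_min_cost 1099511627776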

08/04/2023

10:11 PM Bug #62338 (Pending Backport): osd: choose_async_recovery_ec may select an acting set < min_size
choose_async_recovery_ec may remove OSDs from the acting set as long as PeeringState::recoverable evaluates to true. ... Samuel Just
03:19 PM Backport #59456 (Resolved): quincy: Monitors do not permit OSD to join after upgrading to Quincy
Konstantin Shalygin
03:08 PM Backport #59456: quincy: Monitors do not permit OSD to join after upgrading to Quincy
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/51102
merged
Yuri Weinstein
05:10 AM Bug #62167: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_missing(soi...
https://github.com/ceph/ceph/pull/52806 Myoungwon Oh

08/03/2023

11:51 AM Bug #43887 (Resolved): ceph_test_rados_delete_pools_parallel failure
Nitzan Mordechai
11:46 AM Backport #58613 (Resolved): pacific: pglog growing unbounded on EC with copy by ref
Nitzan Mordechai

08/02/2023

10:14 PM Bug #59172: test_pool_min_size: AssertionError: wait_for_clean: failed before timeout expired due...
Update:
/a/yuriw-2023-07-28_23:11:59-rados-reef-release-distro-default-smithi/7355972/teuthology.log
Right befo...
Kamoltat (Junior) Sirivadhna
08:26 PM Bug #59172: test_pool_min_size: AssertionError: wait_for_clean: failed before timeout expired due...
taking a look ... Kamoltat (Junior) Sirivadhna
05:41 AM Bug #61824: mgr/prometheus: Prometheus metrics type counter decreasing
Yes, I can confirm that the metrics went wrong only in this scenario:
1. Added new OSD nodes to the pool
2. Increased pg...
Jonas Nemeikšis
05:09 AM Bug #52136 (Resolved): Valgrind reports memory "Leak_DefinitelyLost" errors.
Nitzan Mordechai
05:08 AM Backport #58314 (Resolved): pacific: Valgrind reports memory "Leak_DefinitelyLost" errors.
Nitzan Mordechai
05:07 AM Backport #58611 (Resolved): pacific: api_watch_notify_pp: LibRadosWatchNotifyPPTests/LibRadosWatc...
Nitzan Mordechai

08/01/2023

06:52 PM Bug #59813 (Fix Under Review): crash: void PaxosService::propose_pending(): assert(have_pending)
Patrick Donnelly
06:39 PM Bug #59813 (In Progress): crash: void PaxosService::propose_pending(): assert(have_pending)
Patrick Donnelly
04:02 PM Bug #59172: test_pool_min_size: AssertionError: wait_for_clean: failed before timeout expired due...
/a/yuriw-2023-07-28_23:11:59-rados-reef-release-distro-default-smithi/7355972 Laura Flores
03:58 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
/a/yuriw-2023-07-28_23:11:59-rados-reef-release-distro-default-smithi/7356250 Laura Flores
03:13 PM Backport #58611: pacific: api_watch_notify_pp: LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.W...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/49943
merged
Yuri Weinstein
03:13 PM Backport #58314: pacific: Valgrind reports memory "Leak_DefinitelyLost" errors.
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/49521
merged
Yuri Weinstein
03:12 PM Bug #62248: upstream Quincy incorrectly reporting pgs backfill_toofull
In time the cluster settled and completed its recovery without issue. The same issue has been observed on a different... Tim Wilkinson
02:36 PM Bug #62119: timeout on reserving replicas
PGScrubResourcesOK gets queued but is dequeued very late (almost 6 seconds later). This is causing replica reservatio... Aishwarya Mathuria
02:21 PM Bug #62119: timeout on reserving replicas
Reproduced in: /a/amathuri-2023-07-28_11:16:38-rados:thrash-main-distro-default-smithi/7355161/ with debug_osd 30 logs. Aishwarya Mathuria
07:51 AM Bug #57915: LibRadosWatchNotify.AioNotify - error callback ceph_assert(ref > 0)
didn't reoccur yet Nitzan Mordechai
07:50 AM Feature #61788: Adding missing types to ceph-dencoder
The following PRs will add the missing types:
https://github.com/ceph/ceph/pull/52482
https://github.com/ceph/ceph/...
Nitzan Mordechai
07:48 AM Feature #61788 (Fix Under Review): Adding missing types to ceph-dencoder
Nitzan Mordechai
06:54 AM Bug #61962: trim_maps - possible leak on `skip_maps`
The wip PR is currently blocked by https://github.com/ceph/ceph/pull/52339 Matan Breizman
05:13 AM Backport #62252 (In Progress): quincy: qa/standalone/osd/divergent-priors.sh fails in test TEST_d...
Nitzan Mordechai
05:12 AM Backport #62251 (In Progress): reef: qa/standalone/osd/divergent-priors.sh fails in test TEST_div...
Nitzan Mordechai
12:45 AM Bug #62214: crush: a crush rule with multiple choose steps will not retry an earlier step if it c...
I haven't been able to find anything in OSDMonitor that would reweight the bucket as the OSDs are marked out -- it wo... Samuel Just
12:22 AM Bug #62214: crush: a crush rule with multiple choose steps will not retry an earlier step if it c...
Yes, this assessment looks correct to me. It's a big part of why chooseleaf exists — the CRUSH state machine just doe... Greg Farnum

07/31/2023

07:08 PM Bug #61824: mgr/prometheus: Prometheus metrics type counter decreasing
We've turned off the autoscaler in all clusters. Yes, freshly added OSDs map into the affected pool. Maybe it is rela... Jonas Nemeikšis
06:41 PM Bug #61824 (Need More Info): mgr/prometheus: Prometheus metrics type counter decreasing
I was looking for @num_rd_kb@ in the OSD code. It looks to me like it almost never goes down, but there is the logic in @sp... Radoslaw Zarzynski
06:45 PM Bug #61962: trim_maps - possible leak on `skip_maps`
Bump up. Radoslaw Zarzynski
06:43 PM Bug #62007 (Closed): ninja fails in Ubuntu 22.04 - "g++-11: fatal error: Killed signal terminated...
This looks like OOM killed compiler. Radoslaw Zarzynski
06:39 PM Bug #61820 (Fix Under Review): mon: segfault on rocksdb opening
Neha Ojha
06:26 PM Bug #56650: ceph df reports invalid MAX AVAIL value for stretch mode crush rule
The fix is undergoing extended testing. Radoslaw Zarzynski
06:21 PM Bug #62169 (Fix Under Review): Decimal values truncated for osd_op_history_slow_op_threshold
Neha Ojha
06:20 PM Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
We agreed it's not a blocker for Reef. Radoslaw Zarzynski
06:19 PM Bug #62171 (In Progress): All OSD shards should use the same scheduler type when osd_op_queue=deb...
Radoslaw Zarzynski
06:18 PM Bug #62167 (In Progress): FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing()....
Radoslaw Zarzynski
06:18 PM Bug #61861: LRC Erasure Coding profile not working as expected
Pinged Yaarit. Radoslaw Zarzynski
06:15 PM Bug #61968 (In Progress): rados::connect() gets segmentation fault
Radoslaw Zarzynski
06:14 PM Bug #62205 (Won't Fix): objects degraded inaccurate
Nautilus is EOL. Closing :-(. Radoslaw Zarzynski
06:13 PM Bug #62209: can not promote object at readonly tier mode
Cache-tiering got deprecated in Reef. Radoslaw Zarzynski
06:12 PM Bug #62213: crush: choose leaf with type = 0 may incorrectly map out osds
This tracker got added to the agenda of 8/8/2023 RADOS Team Meeting. Radoslaw Zarzynski
06:11 PM Bug #62214: crush: a crush rule with multiple choose steps will not retry an earlier step if it c...
This tracker got added to the agenda of 8/8/2023 RADOS Team Meeting. Radoslaw Zarzynski
06:09 PM Backport #62252 (Resolved): quincy: qa/standalone/osd/divergent-priors.sh fails in test TEST_dive...
https://github.com/ceph/ceph/pull/52722 Backport Bot
06:09 PM Backport #62251 (Resolved): reef: qa/standalone/osd/divergent-priors.sh fails in test TEST_diverg...
https://github.com/ceph/ceph/pull/52721 Backport Bot
06:06 PM Bug #62225: pacific upgrade test fails on 'ceph versions | jq -e' command
Radoslaw Zarzynski wrote:
> Was it seen on octopus2quincy or pacific2quincy? Asking b/c pacific goes EOL.
Octopus...
Laura Flores
06:03 PM Bug #62225: pacific upgrade test fails on 'ceph versions | jq -e' command
Was it seen on octopus2quincy or pacific2quincy? Asking b/c pacific goes EOL. Radoslaw Zarzynski
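For context, the failing check is of this general shape (the exact jq filter used by the test is an assumption here):

    # After the upgrade, every daemon should report the same version
    ceph versions | jq -e '.overall | length == 1'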
06:06 PM Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
Bump up. Radoslaw Zarzynski
06:01 PM Bug #56034 (Pending Backport): qa/standalone/osd/divergent-priors.sh fails in test TEST_divergent...
Radoslaw Zarzynski
01:51 PM Bug #56034: qa/standalone/osd/divergent-priors.sh fails in test TEST_divergent_3()
/a/yuriw-2023-07-28_14:25:29-rados-wip-yuri7-testing-2023-07-27-1336-quincy-distro-default-smithi/7355532 Aishwarya Mathuria
06:00 PM Bug #62076 (In Progress): reef: Test failure: test_grow_shrink (tasks.cephfs.test_failover.TestMu...
Radoslaw Zarzynski
05:58 PM Bug #61948 (Need More Info): Failed assert "pg_upmap_primaries.empty()" in the read balancer
@Need more info@ for now, but in the longer term we might close it if reproduction is impossible. Radoslaw Zarzynski
05:56 PM Bug #62119: timeout on reserving replicas
Important but not urgent. Radoslaw Zarzynski
03:37 PM Bug #62248 (Need More Info): upstream Quincy incorrectly reporting pgs backfill_toofull
Our fault-insertion testing with 17.2.6 is showing a cluster status reporting PGs toofull, but when we dig into the PG... Tim Wilkinson
11:50 AM Bug #49961: scrub/osd-recovery-scrub.sh: TEST_recovery_scrub_1 failed
Seen in a Pacific run:
/a/yuriw-2023-07-26_15:54:22-rados-wip-yuri6-testing-2023-07-24-0819-pacific-distro-default-s...
Sridhar Seshasayee
08:44 AM Bug #62235 (New): Pacific: Assert failure: test_ceph_osd_pool_create_utf8
FAIL: test_rados.TestCommand.test_ceph_osd_pool_create_utf8
Was observed in Octopus originally --> https://tracker...
Sridhar Seshasayee

07/28/2023

10:05 PM Bug #62076: reef: Test failure: test_grow_shrink (tasks.cephfs.test_failover.TestMultiFilesystems)
So the upmap command mapped pg 56.19 from osd.4 to osd.6. Then on osd.6 it went into a laggy state. Laura Flores
09:55 PM Bug #62076: reef: Test failure: test_grow_shrink (tasks.cephfs.test_failover.TestMultiFilesystems)
From the above tests, it didn't reproduce. So, we know the issue is transient. Laura Flores
09:46 PM Bug #62076: reef: Test failure: test_grow_shrink (tasks.cephfs.test_failover.TestMultiFilesystems)
... Laura Flores
08:00 PM Bug #62076: reef: Test failure: test_grow_shrink (tasks.cephfs.test_failover.TestMultiFilesystems)
The laggy PG (56.19) is only in osd.6:
/a/yuriw-2023-07-14_23:37:57-fs-wip-yuri8-testing-2023-07-14-0803-reef-distro...
Laura Flores
07:58 PM Bug #62076: reef: Test failure: test_grow_shrink (tasks.cephfs.test_failover.TestMultiFilesystems)
Rerunning the same test 10x here to check for any reproducers:
http://pulpito.front.sepia.ceph.com/lflores-2023-07-2...
Laura Flores
07:52 PM Bug #62076: reef: Test failure: test_grow_shrink (tasks.cephfs.test_failover.TestMultiFilesystems)
Snippet from the cluster log, which is basically what Patrick already described above:... Laura Flores
07:42 PM Bug #62076: reef: Test failure: test_grow_shrink (tasks.cephfs.test_failover.TestMultiFilesystems)
Hey Venky, Patrick, has this failure reproduced at all since the tracker was opened? Laura Flores
04:36 PM Bug #62225: pacific upgrade test fails on 'ceph versions | jq -e' command
/a/yuriw-2023-07-20_14:33:07-rados-wip-yuri11-testing-2023-07-18-0927-pacific-distro-default-smithi/7344744 Laura Flores
04:25 PM Bug #62225 (New): pacific upgrade test fails on 'ceph versions | jq -e' command
/a/yuriw-2023-07-19_14:33:14-rados-wip-yuri11-testing-2023-07-18-0927-pacific-distro-default-smithi/7343484... Laura Flores
04:01 PM Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
/a/yuriw-2023-07-19_14:33:14-rados-wip-yuri11-testing-2023-07-18-0927-pacific-distro-default-smithi/7343461 Laura Flores
04:17 AM Bug #62214 (New): crush: a crush rule with multiple choose steps will not retry an earlier step i...
Debatably, this isn't a bug, but it does appear to be at least a counterintuitive behavior.
In the case of a crush...
Samuel Just
03:47 AM Bug #62213 (New): crush: choose leaf with type = 0 may incorrectly map out osds
The motivating example was:
    rule ecpool-86 {
        id 86
        type erasure
        step set_chooseleaf_tries 5
        st...
Samuel Just

07/27/2023

03:16 PM Bug #59656: pg_upmap_primary timeout
Ah, I see. The majority of the primary balancer is OSDMap code, which is made up of ldout functions. So, all the rele... Laura Flores
02:43 PM Bug #59656: pg_upmap_primary timeout
Yes, ldout is only used in the OSDMap; I was instead talking about other source files, like OSD.cc. Kevin NGUETCHOUANG
02:41 PM Bug #59656: pg_upmap_primary timeout
Pretty sure only ldout is used in the OSDMap code. Laura Flores
02:35 PM Bug #59656: pg_upmap_primary timeout
Even for dout? Because it works for ldout functions, but with dout I don't see it working. Kevin NGUETCHOUANG
02:31 PM Bug #59656: pg_upmap_primary timeout
Hi Kevin, you can use debug_osd=10 to see high level information, and debug_osd=20 to see all the nitty-gritty stuff. Laura Flores
01:54 PM Bug #59656: pg_upmap_primary timeout
Can I also have the logs enabled (from the dout function) for PrimaryLogPG if I set debug_osd to 10? Kevin NGUETCHOUANG
02:31 PM Bug #62209 (New): can not promote object at readonly tier mode

Steps (see the sketch after this entry):
* set a cache pool for a data pool, in readonly mode
* put an object to the data pool
* get the object from the data pool
* ...
Jack Lv
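A minimal sketch of the reported setup, with illustrative pool names (note the comment above that cache tiering was deprecated in Reef):

    # Attach a readonly cache tier to a data pool
    ceph osd tier add datapool cachepool
    ceph osd tier cache-mode cachepool readonly --yes-i-really-mean-it
    ceph osd tier set-overlay datapool cachepool
    # Put an object, then read it back through the data pool
    rados -p datapool put obj1 /etc/hosts
    rados -p datapool get obj1 /tmp/obj1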
11:57 AM Bug #62205: objects degraded inaccurate
This is the mgr log with debug_mgr=20. xueyong lu
11:38 AM Bug #62205 (Won't Fix): objects degraded inaccurate
My Ceph version is 14.2.8.
I have a pool (public.rgw.buckets.ia.data) with 0 objects, but when I out an OSD in the cur...
xueyong lu
05:24 AM Bug #59531: quincy: "OSD bench result of 228617.361065 IOPS exceeded the threshold limit of 500.0...
FWIW - seen in a main branch run:
https://pulpito.ceph.com/vshankar-2023-07-26_04:54:56-fs-wip-vshankar-testing-2...
Venky Shankar
 
