Activity

From 08/16/2023 to 09/14/2023

09/14/2023

09:02 PM Backport #59702 (Resolved): reef: mon: FAILED ceph_assert(osdmon()->is_writeable())
https://github.com/ceph/ceph/pull/51409
Merged
Kamoltat (Junior) Sirivadhna
08:59 PM Bug #58894 (Resolved): [pg-autoscaler][mgr] does not throw warn to increase PG count on pools wit...
All backports have been resolved. Kamoltat (Junior) Sirivadhna
08:58 PM Backport #62820 (Resolved): reef: [pg-autoscaler][mgr] does not throw warn to increase PG count o...
Merged Kamoltat (Junior) Sirivadhna
07:28 PM Bug #46437 (Closed): Admin Socket leaves behind .asok files after daemons (ex: RGW) shut down gra...
Ali Maredia
06:54 PM Backport #51195 (In Progress): pacific: [rfe] increase osd_max_write_op_reply_len default value t...
Konstantin Shalygin
05:24 PM Bug #62836: CEPH zero iops after upgrade to Reef and manual read balancer
Mosharaf Hossain wrote:
> Stefan Kooman wrote:
> > @Mosharaf Hossain:
> >
> > Do you also have a performance ove...
Laura Flores
05:17 PM Bug #62836: CEPH zero iops after upgrade to Reef and manual read balancer

Stefan Kooman wrote:
> @Mosharaf Hossain:
>
> Do you also have a performance overview when you were runni...
Mosharaf Hossain
03:36 PM Bug #62836: CEPH zero iops after upgrade to Reef and manual read balancer
After running `ceph osd dump`, you should see entries like this at the end of the output, which indicate each pg you ... Laura Flores
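For reference, a quick way to check for those entries, assuming a Reef cluster where the read balancer has applied pg-upmap-primary mappings (the exact field names can vary by release):
<pre>
# Dump the osdmap and filter for upmap entries; read-balancer primary
# mappings show up as pg_upmap_primary lines near the end of the output.
ceph osd dump | grep -i pg_upmap
</pre>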
03:13 PM Bug #62836: CEPH zero iops after upgrade to Reef and manual read balancer
Hi Mosharaf, as Stefan wrote above, you can get your osdmap file by running the following command, where "osdmap" is ... Laura Flores
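A minimal sketch of the workflow being described, assuming the Reef read balancer; the file names and the pool name ("mypool") are placeholders:
<pre>
# Save the cluster's current osdmap to a local file named "osdmap".
ceph osd getmap -o osdmap

# Ask osdmaptool for read-balancing suggestions for one pool; the output
# file contains "ceph osd pg-upmap-primary ..." commands to review.
osdmaptool osdmap --read out.txt --read-pool mypool
</pre>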
10:10 AM Bug #62836: CEPH zero iops after upgrade to Reef and manual read balancer
@Mosharaf Hossain:
What kind of client do you use to access the VM storage (i.e. kernel client rbd, krbd, or librb...
Stefan Kooman
09:52 AM Bug #62836: CEPH zero iops after upgrade to Reef and manual read balancer
The dashboard values show all "0", but the graph indicates it's still doing IO, as does "ceph -s". It might (also) be... Stefan Kooman
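To cross-check client IO independently of the dashboard, something like the following can be used (both commands report cluster-wide and per-pool client throughput):
<pre>
# Overall cluster status, including the client io line.
ceph -s

# Per-pool client IO rates.
ceph osd pool stats
</pre>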
09:47 AM Bug #62836: CEPH zero iops after upgrade to Reef and manual read balancer
@Mosharaf Hossain:
Do you also have a performance overview when you were running Quincy? Quincy would then be the ...
Stefan Kooman
04:31 AM Bug #62836 (Need More Info): CEPH zero iops after upgrade to Reef and manual read balancer
We've recently performed an upgrade on our Cephadm cluster, transitioning from Ceph Quincy to Reef. However, followi... Mosharaf Hossain
01:54 PM Backport #50697 (In Progress): pacific: common: the dump of thread IDs is in dec instead of hex
Konstantin Shalygin
01:52 PM Backport #56649 (In Progress): pacific: [Progress] Do not show NEW PG_NUM value for pool if autos...
Konstantin Shalygin
01:50 PM Backport #56648 (Resolved): quincy: [Progress] Do not show NEW PG_NUM value for pool if autoscale...
Konstantin Shalygin
01:50 PM Backport #50831 (In Progress): pacific: pacific ceph-mon: mon initial failed on aarch64
Konstantin Shalygin
10:29 AM Bug #62826 (Fix Under Review): crushmap holds the previous rule for an EC pool created with name ...
Nitzan Mordechai
07:44 AM Bug #62839 (New): Teuthology failure in LibRadosTwoPoolsPP.HitSetWrite

Branch tested: wip-rf-fshut ('main' b690343128 as of 12.9.23 + changes to one shutdown function (commit 210dbd4ff19...
Ronen Friedman

09/13/2023

10:58 PM Bug #62833: [Reads Balancer] osdmaptool with --read option creates suggestions for primary O...
The symptom is an error like this after applying a pg-upmap-primary command:... Laura Flores
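For context, the suggestions generated by osdmaptool are applied with commands of this form; the PG id and OSD id below are placeholders, and the error described above shows up after such a command is applied:
<pre>
# Pin the primary of PG 1.0 to osd.2 (placeholder values).
ceph osd pg-upmap-primary 1.0 2

# Remove the mapping again if needed.
ceph osd rm-pg-upmap-primary 1.0
</pre>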
10:56 PM Bug #62833 (Fix Under Review): [Reads Balancer] osdmaptool with --read option creates sugges...
Laura Flores
06:07 PM Bug #62833 (Resolved): [Reads Balancer] osdmaptool with --read option creates suggestions fo...
See the BZ for more details: https://bugzilla.redhat.com/show_bug.cgi?id=2237574 Laura Flores
09:02 PM Backport #61569 (Resolved): quincy: the mgr, osd version information missing in "ceph versions" c...
Prashant D
06:57 PM Bug #62568: Coredump in rados_aio_write_op_operate
... Radoslaw Zarzynski
06:52 PM Bug #62704: Cephfs: different monitors have a different LAST_DEEP_SCRUB state
Is this consistent? Was there a third attempt, on mon node 1, showing @1586:7591@ again?
And BTW, this is mgr-handle...
Radoslaw Zarzynski
06:43 PM Bug #62213: crush: choose leaf with type = 0 may incorrectly map out osds
Bump up. Radoslaw Zarzynski
06:42 PM Bug #62769 (Duplicate): ninja fails during build osd.cc
Neha Ojha
06:34 PM Bug #62776: rados: cluster [WRN] overall HEALTH_WARN - do not have an application enabled
Not a priority. Radoslaw Zarzynski
06:33 PM Bug #62777: rados/valgrind-leaks: expected valgrind issues and found none
Yeah, we have a test intentionally causing a leak just to ensure valgrind truly works.
I wonder what might happen if this t...
Radoslaw Zarzynski
06:28 PM Bug #62788 (Rejected): mon: mon store db loss file
This is likely filesystem corruption or a hardware error.... Radoslaw Zarzynski
06:19 PM Bug #62119: timeout on reserving replicas
Bumping this up. Radoslaw Zarzynski
06:18 PM Bug #50245: TEST_recovery_scrub_2: Not enough recovery started simultaneously
note from scrub: let's observe. Radoslaw Zarzynski
06:17 PM Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados...
The fix is approved but waiting for QA. Bumping this up. Radoslaw Zarzynski
03:28 PM Bug #62832 (Pending Backport): common: config_proxy deadlock during shutdown (and possibly other ...
Saw this deadlock in teuthology where I was doing parallel `ceph config set` commands:... Patrick Donnelly
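A rough sketch of the kind of parallel `ceph config set` load mentioned above; the option name and values are arbitrary placeholders, not the exact commands from the teuthology job:
<pre>
# Issue several config-set commands concurrently, then wait for them.
for v in 1 5 10 20; do
    ceph config set osd debug_ms "$v" &
done
wait
</pre>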
01:50 PM Bug #62812: osd: Is it necessary to unconditionally increase osd_bandwidth_cost_per_io in mClockS...
Responses to your questions.
Q: how to test and get osd_mclock_max_sequential_bandwidth_hdd and osd_mclock_max_ca...
Sridhar Seshasayee
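One way to inspect the settings in question and to run the built-in OSD bench that mClock derives them from; osd.0 is a placeholder:
<pre>
# Show the effective mClock capacity/bandwidth values for one OSD.
ceph config show osd.0 osd_mclock_max_capacity_iops_hdd
ceph config show osd.0 osd_mclock_max_sequential_bandwidth_hdd

# Run the built-in OSD benchmark with default parameters.
ceph tell osd.0 bench
</pre>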
12:21 PM Bug #62826 (Fix Under Review): crushmap holds the previous rule for an EC pool created with name ...
BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2224324
Description of problem:
If an EC pool is created with...
Nitzan Mordechai

09/12/2023

07:03 PM Bug #62669 (Fix Under Review): Pacific: multiple scrub and deep-scrub start message repeating for...
Prashant D
05:38 PM Backport #62820 (Resolved): reef: [pg-autoscaler][mgr] does not throw warn to increase PG count o...
Kamoltat (Junior) Sirivadhna
05:08 PM Bug #58894 (Pending Backport): [pg-autoscaler][mgr] does not throw warn to increase PG count on p...
Oops, looks like this tracker missed a reef backport (the patches are absent in v18.2.0). Radoslaw Zarzynski
04:53 PM Backport #62819 (In Progress): reef: osd: choose_async_recovery_ec may select an acting set < min...
https://github.com/ceph/ceph/pull/54550 Backport Bot
04:53 PM Backport #62818 (In Progress): pacific: osd: choose_async_recovery_ec may select an acting set < ...
https://github.com/ceph/ceph/pull/54548 Backport Bot
04:53 PM Backport #62817 (In Progress): quincy: osd: choose_async_recovery_ec may select an acting set < m...
https://github.com/ceph/ceph/pull/54549 Backport Bot
04:51 PM Bug #62338 (Pending Backport): osd: choose_async_recovery_ec may select an acting set < min_size
Radoslaw Zarzynski
03:09 PM Bug #62812: osd: Is it necessary to unconditionally increase osd_bandwidth_cost_per_io in mClockS...
A revised version has been posted; please review.
For cost calculation, the core idea is to take the larger value of user i...
jianwei zhang
07:35 AM Bug #62812: osd: Is it necessary to unconditionally increase osd_bandwidth_cost_per_io in mClockS...
!https://tracker.ceph.com/attachments/download/6655/rados_bench_pr_52809.png!
osd/scheduler/mClockScheduler: Use s...
jianwei zhang
07:18 AM Bug #62812: osd: Is it necessary to unconditionally increase osd_bandwidth_cost_per_io in mClockS...
!https://tracker.ceph.com/attachments/download/6653/tell_bench.png!
!https://tracker.ceph.com/attachments/download...
jianwei zhang
07:16 AM Bug #62812: osd: Is it necessary to unconditionally increase osd_bandwidth_cost_per_io in mClockS...
Another question:
how to test and get osd_mclock_max_sequential_bandwidth_hdd and osd_mclock_max_capacity_iops_hdd...
jianwei zhang
05:54 AM Bug #62812: osd: Is it necessary to unconditionally increase osd_bandwidth_cost_per_io in mClockS...
jianwei zhang wrote:
> One is not to add osd_bandwidth_cost_per_io cost:
> !https://tracker.ceph.com/attachments/do...
jianwei zhang
05:37 AM Bug #62812: osd: Is it necessary to unconditionally increase osd_bandwidth_cost_per_io in mClockS...
Please help me check whether there are any errors in the cost calculation process.
If nothing is wrong,
Please disc...
jianwei zhang
05:35 AM Bug #62812: osd: Is it necessary to unconditionally increase osd_bandwidth_cost_per_io in mClockS...
Incremental step calculation method:... jianwei zhang
05:30 AM Bug #62812: osd: Is it necessary to unconditionally increase osd_bandwidth_cost_per_io in mClockS...
One is not to add osd_bandwidth_cost_per_io cost:
!https://tracker.ceph.com/attachments/download/6650/add_osd_bandwi...
jianwei zhang
05:29 AM Bug #62812 (Pending Backport): osd: Is it necessary to unconditionally increase osd_bandwidth_cos...
In this PR, the IOPS-based QoS cost calculation method is removed and the Bandwidth-based QoS cost calculation method... jianwei zhang
09:32 AM Bug #57628 (In Progress): osd:PeeringState.cc: FAILED ceph_assert(info.history.same_interval_sinc...
Matan Breizman
09:31 AM Bug #57628: osd:PeeringState.cc: FAILED ceph_assert(info.history.same_interval_since != 0)
WIP: https://gist.github.com/Matan-B/40b5a7ee30e9e73d20c052594365aae8
This seems to be highly related to map gap e...
Matan Breizman
05:25 AM Bug #62811: PGs stuck in backfilling state after their primary OSD is removed by setting its crus...
Analysis of the issue was performed by taking a single PG (6.15a) on osd.34 for which backfill didn't start.
I hav...
Sridhar Seshasayee
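For reference, a minimal sketch of the scenario and of how such a stuck PG can be inspected; the OSD and PG ids are taken from the description above:
<pre>
# Remove osd.34 from data placement by zeroing its CRUSH weight
# (the operation described in the report).
ceph osd crush reweight osd.34 0

# Query the affected PG; the recovery_state section usually shows why
# backfill has not started (e.g. waiting on reservations).
ceph pg 6.15a query
</pre>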
03:43 AM Bug #62811 (New): PGs stuck in backfilling state after their primary OSD is removed by setting it...
I am pasting the problem description from the original BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2233777
<pr...
Sridhar Seshasayee

09/11/2023

07:22 PM Bug #57628: osd:PeeringState.cc: FAILED ceph_assert(info.history.same_interval_since != 0)
selected by holding the ALT key :-) Yaarit Hatuka
06:49 PM Bug #57628: osd:PeeringState.cc: FAILED ceph_assert(info.history.same_interval_since != 0)
If anyone knows how to properly select multiple affected versions, please go ahead.
v18.0.0, v14.0.0, v15.0.0, and...
Laura Flores
06:43 PM Bug #57628: osd:PeeringState.cc: FAILED ceph_assert(info.history.same_interval_since != 0)
/a/lflores-2023-09-08_20:36:06-rados-wip-lflores-testing-2-2023-09-08-1755-distro-default-smithi/7391621 Laura Flores
07:39 AM Bug #62788 (Rejected): mon: mon store db loss file
... yite gu

09/08/2023

11:54 PM Bug #62669: Pacific: multiple scrub and deep-scrub start message repeating for a same PG
Thanks to Cory and David for providing the osd.426 debug logs to find out the reason for scrub getting initiated for ... Prashant D
05:38 AM Bug #62669: Pacific: multiple scrub and deep-scrub start message repeating for a same PG
The multiple scrub start messages for the same PG indicate that there is a problem with scrubbing in the pacific re... Prashant D
10:05 PM Bug #62777 (New): rados/valgrind-leaks: expected valgrind issues and found none
rados/valgrind-leaks/{1-start 2-inject-leak/mon centos_latest}
/a/yuriw-2023-08-11_02:49:40-rados-wip-yuri4-testin...
Laura Flores
07:43 PM Bug #62776 (New): rados: cluster [WRN] overall HEALTH_WARN - do not have an application enabled
Description: rados/basic/{ceph clusters/{fixed-2 openstack} mon_election/connectivity msgr-failures/few msgr/async-v1... Laura Flores
01:08 PM Bug #62769 (Duplicate): ninja fails during build osd.cc
[124/210] Building CXX object src/osd/CMakeFiles/osd.dir/OSD.cc.o
FAILED: src/osd/CMakeFiles/osd.dir/OSD.cc.o
/usr...
MOHIT AGRAWAL

09/07/2023

04:41 PM Feature #61788 (Resolved): Adding missing types to ceph-dencoder
J. Eric Ivancich

09/06/2023

09:24 AM Bug #62596 (Closed): osd: Remove leaked clone objects (SnapMapper malformed key)
Matan Breizman
08:25 AM Bug #62704: Cephfs: different monitors have a different LAST_DEEP_SCRUB state
Unrelated to cephfs - moving to RADOS component. Venky Shankar

09/05/2023

10:00 PM Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados...
/a/yuriw-2023-09-01_19:14:47-rados-wip-batrick-testing-20230831.124848-pacific-distro-default-smithi/7386290 Laura Flores
08:20 PM Bug #50245 (New): TEST_recovery_scrub_2: Not enough recovery started simultaneously
/a/yuriw-2023-08-15_18:58:56-rados-wip-yuri3-testing-2023-08-15-0955-distro-default-smithi/7369212
Worth looking i...
Laura Flores
08:10 PM Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
/a/yuriw-2023-08-16_18:39:08-rados-wip-yuri3-testing-2023-08-15-0955-distro-default-smithi/7370286/ Kamoltat (Junior) Sirivadhna
08:10 PM Bug #62119: timeout on reserving replicas
/a/yuriw-2023-08-15_18:58:56-rados-wip-yuri3-testing-2023-08-15-0955-distro-default-smithi/7369280
/a/yuriw-2023-08-...
Laura Flores
07:38 PM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
/a/yuriw-2023-08-15_18:58:56-rados-wip-yuri3-testing-2023-08-15-0955-distro-default-smithi/7369175 Laura Flores
07:17 AM Bug #62704 (Closed): Cephfs: different monitors have a different LAST_DEEP_SCRUB state
I ran 'ceph pg dump pgs' on each monitor node and found that the values of REPORTED are inconsistent. For example:
Th...
fuchen ma

09/04/2023

11:44 AM Bug #59531: quincy: "OSD bench result of 228617.361065 IOPS exceeded the threshold limit of 500.0...
https://pulpito.ceph.com/rishabh-2023-08-25_06:38:25-fs-wip-rishabh-2023aug3-b5-testing-default-smithi/7379315 Rishabh Dave
08:08 AM Backport #59676: reef: osd:tick checking mon for new map
https://github.com/ceph/ceph/pull/53269 yite gu
08:03 AM Bug #62568: Coredump in rados_aio_write_op_operate
Hi, any feedback? Nokia ceph-users

09/03/2023

05:30 AM Documentation #62680 (In Progress): Docs for setting up multisite RGW don't work
This procedure is expected to be tested during the first week of September 2023. Zac Dover
05:29 AM Documentation #62680 (In Progress): Docs for setting up multisite RGW don't work
An email from Petr Bena:
Hello,
My goal is to set up multisite RGW with 2 separate Ceph clusters in separate dat...
Zac Dover
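For orientation only, a heavily abbreviated sketch of the usual first steps on the primary zone of a multisite setup; all realm/zonegroup/zone names and the endpoint are placeholders, and the documentation under review remains the authoritative procedure:
<pre>
# On the primary cluster: create the realm, master zonegroup and master zone.
radosgw-admin realm create --rgw-realm=myrealm --default
radosgw-admin zonegroup create --rgw-zonegroup=us --endpoints=http://rgw1:8080 --master --default
radosgw-admin zone create --rgw-zonegroup=us --rgw-zone=us-east --endpoints=http://rgw1:8080 --master --default
radosgw-admin period update --commit
</pre>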

09/01/2023

06:55 PM Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
/teuthology/pdonnell-2023-08-31_15:31:51-fs-wip-batrick-testing-20230831.124848-pacific-distro-default-smithi/7385689... Patrick Donnelly

08/31/2023

11:14 PM Bug #62669: Pacific: multiple scrub and deep-scrub start message repeating for a same PG
The scrub "starts" message should be logged when remotes are reserved and the scrubber is initiated for the PG. Checking ... Prashant D
07:56 PM Bug #62669: Pacific: multiple scrub and deep-scrub start message repeating for a same PG
The "starts" message was reintroduced in pacific with PR https://github.com/ceph/ceph/pull/48070. The multiple scrub ... Prashant D
07:51 PM Bug #62669 (Resolved): Pacific: multiple scrub and deep-scrub start message repeating for a same PG
The ceph cluster log reports "scrub starts" and "deep-scrub starts" for the same PG multiple times within sh... Prashant D
09:53 AM Bug #61140: crash: int OSD::shutdown(): assert(end_time - start_time_func < cct->_conf->osd_fast_...
We just observed this "noise" for quite a few OSDs on rolling reboots.... it would be nice to have this "not" treated an... Christian Rohmann
05:54 AM Bug #62568: Coredump in rados_aio_write_op_operate
We are considering using quincy/reef to test. BTW, even if the memory is exhausted, the expected response from mal... Nokia ceph-users

08/30/2023

08:28 AM Bug #61962 (Fix Under Review): trim_maps - possible leak on `skip_maps`
Matan Breizman

08/29/2023

08:26 PM Bug #62400: test_pool_min_size: wait_for_clean passed with 0 PGs
Yep, normal priority is fine; at first I set it to urgent because I thought it was blocking a release Kamoltat (Junior) Sirivadhna
12:26 AM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
Taking a fresh look at this, thanks Radek. Brad Hubbard

08/28/2023

06:57 PM Bug #61762: PGs are stuck in creating+peering when starting up OSDs
Seems worth looking at osd.2's log to determine why it's the blocker. Radoslaw Zarzynski
06:57 PM Bug #62588: ceph config set allows WHO to be osd.*, which is misleading
Dan van der Ster wrote:
> @Laura thanks!
>
> One comment. We shouldn't prevent creating config options for daemon...
Laura Flores
06:46 PM Bug #62588: ceph config set allows WHO to be osd.*, which is misleading
@Laura thanks!
One comment. We shouldn't prevent creating config options for daemons that do not exist -- e.g. we ...
Dan van der Ster
06:45 PM Bug #62588: ceph config set allows WHO to be osd.*, which is misleading
Verified that this also occurs on reef/quincy. Laura Flores
06:42 PM Bug #62588: ceph config set allows WHO to be osd.*, which is misleading
h3. Steps to reproduce:
1. Build a vstart cluster on the main branch
2. Run a `config set` command on an osd with...
Laura Flores
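A sketch of what those reproduction steps look like in practice; the option and value are placeholders, and vstart defaults are assumed:
<pre>
# From the build directory of a main-branch checkout, start a vstart cluster.
MON=1 OSD=3 MDS=0 ../src/vstart.sh -n -d

# The command is accepted even though no daemon is literally named "osd.*",
# which is the misleading behaviour being reported.
./bin/ceph config set 'osd.*' osd_max_backfills 2
./bin/ceph config dump | grep osd_max_backfills
</pre>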
06:09 PM Bug #62588: ceph config set allows WHO to be osd.*, which is misleading
Thanks Radek! I'll bring it to the Grace Hopper Open Source Day! Laura Flores
05:54 PM Bug #62588: ceph config set allows WHO to be osd.*, which is misleading
This is an ideal task for a beginner / hackathon. Radoslaw Zarzynski
06:49 PM Bug #62248: upstream Quincy incorrectly reporting pgs backfill_toofull
We talked about this bug during the last BZ scrub. IIRC the idea there was to document this behavior. Radoslaw Zarzynski
06:44 PM Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados...
Let's keep an eye on the PR. Radoslaw Zarzynski
06:44 PM Bug #62529: PrimaryLogPG::log_op_stats uses `now` time vs op end time when calculating op latency
I'll bring this to Grace Hopper Open Source Day! Laura Flores
06:40 PM Bug #62529: PrimaryLogPG::log_op_stats uses `now` time vs op end time when calculating op latency
IIUC the linked method is being called asynchronously, after the op got completed.
However, the debug there claims t...
Radoslaw Zarzynski
06:34 PM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
Added to https://pad.ceph.com/p/performance_weekly. Radoslaw Zarzynski
06:32 PM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
Oops, perhaps (CHECK ME!) we aren't randomizing the retry times nor enlarging them with each failed attempt.
Anyway, ...
Radoslaw Zarzynski
06:18 PM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
This time it's CentOS!... Radoslaw Zarzynski
06:11 PM Bug #62568: Coredump in rados_aio_write_op_operate
At this stage I'm not sure it's actually a bug (a coredump does not always indicate a bug).
The crash was caused by callin...
Radoslaw Zarzynski
06:04 PM Bug #62578: mon: osd pg-upmap-items command causes PG_DEGRADED warnings
From the bug scrub: the @message@ mentions @acting [7,3]@ but the @osd pg-upmap-items@ requested @[0, 7]@.
Isn't...
Radoslaw Zarzynski
05:50 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
bump up (TODO(rzarzynski): review the 2nd PR). Radoslaw Zarzynski
05:44 PM Bug #59291: pg_pool_t version compatibility issue
Well, this case has grown so much that, I'm afraid, there is no single PR. Pasting my notes:... Radoslaw Zarzynski
05:41 PM Bug #62400: test_pool_min_size: wait_for_clean passed with 0 PGs
This looks to be a test issue. If we're wrong, please change the priority back. Radoslaw Zarzynski
01:50 PM Backport #62610 (Rejected): quincy: mon/OSDMonitor: do not propose on error in prepare_update
Due to numerous conflicts and this mostly being a minor performance fix, I'm closing this as rejected. Patrick Donnelly
01:34 PM Backport #62610 (Rejected): quincy: mon/OSDMonitor: do not propose on error in prepare_update
Backport Bot
01:37 PM Backport #62611 (In Progress): reef: mon/OSDMonitor: do not propose on error in prepare_update
Patrick Donnelly
01:34 PM Backport #62611 (In Progress): reef: mon/OSDMonitor: do not propose on error in prepare_update
https://github.com/ceph/ceph/pull/53186 Backport Bot
01:29 PM Bug #58972 (Pending Backport): mon/OSDMonitor: do not propose on error in prepare_update
Patrick Donnelly

08/27/2023

09:22 AM Bug #59478 (Closed): osd/scrub: verify SnapMapper consistency not backported
Wout van Heeswijk wrote:
> Is there any update on this backport? This seems to be causing corruption in some of our c...
Matan Breizman
09:18 AM Bug #62596 (Closed): osd: Remove leaked clone objects (SnapMapper malformed key)
Clusters affected by the SnapMapper malformed key conversion [1] (which was fixed) may still suffer from space leak c... Matan Breizman

08/26/2023

03:03 PM Bug #59291: pg_pool_t version compatibility issue
Radoslaw Zarzynski wrote:
> The proposal we discussed:
>
> [...]
Hello rzarzyns, any update? Please paste the ...
Honggang Yang

08/25/2023

06:41 PM Bug #48896: osd/OSDMap.cc: FAILED ceph_assert(osd_weight.count(i.first))
http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?var-sig_v2=15d62a24ad12be22753ffcc0a78cd90cf7... Laura Flores
05:43 PM Bug #62588 (New): ceph config set allows WHO to be osd.*, which is misleading
We came across a customer cluster that uses `ceph config set osd.* ...` thinking it would apply to *all* OSDs.
In fa...
Dan van der Ster
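To illustrate the confusion being reported (option and value are placeholders): the first command stores a setting under the literal section name "osd.*" rather than globbing, while the second is the form that actually applies to every OSD:
<pre>
# Misleading: creates an entry for a daemon literally named "osd.*".
ceph config set 'osd.*' osd_max_backfills 2

# Intended: the bare "osd" section applies to all OSD daemons.
ceph config set osd osd_max_backfills 2
</pre>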

08/24/2023

08:26 PM Bug #62578 (New): mon: osd pg-upmap-items command causes PG_DEGRADED warnings
... Patrick Donnelly
12:16 PM Bug #62568: Coredump in rados_aio_write_op_operate
Is there any recommendation to proceed with the latest version to overcome this coredump? Nokia ceph-users
12:12 PM Bug #62568 (New): Coredump in rados_aio_write_op_operate
We are facing a crash in the function rados_aio_write_op_operate(). Please find the stack trace below,
current ...
Nokia ceph-users
04:11 AM Backport #59537 (Resolved): quincy: osd/scrub: verify SnapMapper consistency not backported
Konstantin Shalygin

08/23/2023

10:18 PM Bug #49727: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
/a/yuriw-2023-08-21_23:10:07-rados-pacific-release-distro-default-smithi/7375579 Laura Flores
09:53 PM Bug #62557: rados: Teuthology test failure due to "MDS_CLIENTS_LAGGY" warning
Description: rados/dashboard/{centos_8.stream_container_tools clusters/{2-node-mgr} debug/mgr mon_election/connectivi... Laura Flores
09:45 PM Bug #62557 (New): rados: Teuthology test failure due to "MDS_CLIENTS_LAGGY" warning
Description: rados/dashboard/{centos_8.stream_container_tools clusters/{2-node-mgr} debug/mgr mon_election/classic ra... Laura Flores
12:03 PM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
/a/yuriw-2023-08-22_18:16:03-rados-wip-yuri10-testing-2023-08-17-1444-distro-default-smithi/7376687 Matan Breizman

08/22/2023

06:05 PM Bug #47838: mon/test_mon_osdmap_prune.sh: first_pinned != trim_to
/a/yuriw-2023-08-17_21:18:20-rados-wip-yuri11-testing-2023-08-17-0823-distro-default-smithi/7372203 Laura Flores
05:16 PM Bug #49689 (Resolved): osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch...
Matan Breizman
05:14 PM Backport #61149 (Resolved): pacific: osd/PeeringState.cc: ceph_abort_msg("past_interval start int...
Matan Breizman
04:20 PM Bug #62529 (New): PrimaryLogPG::log_op_stats uses `now` time vs op end time when calculating op l...
* With `debug_osd` 15 or higher, a log line called `log_op_stats` is output, which ends with a reported latency value... Michael Kidd
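A sketch of how to surface those lines for one OSD; osd.0 and the log path are placeholders (paths differ for containerized deployments):
<pre>
# Raise the OSD debug level so log_op_stats lines are emitted.
ceph tell osd.0 config set debug_osd 15/15

# Look for the reported latency values in the OSD log.
grep log_op_stats /var/log/ceph/ceph-osd.0.log | tail
</pre>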
09:54 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
# ceph daemon osd.0 config show | grep _message_
"osd_client_message_cap": "0",
"osd_client_message_size_ca...
jianwei zhang
09:36 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...

Continuing to increase client traffic:
When the osd_client_message_size_cap upper limit is reached, it will still...
jianwei zhang
09:29 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
After loosening the restriction on osd_client_message_cap, msgr-worker cpu is reduced to 1%.
But the throttle-osd_c...
jianwei zhang
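A sketch of the checks involved; osd.0 is a placeholder, and setting osd_client_message_cap to 0 disables the count-based limit as described above:
<pre>
# Inspect the client-message throttle counters, including get_or_fail_fail.
ceph daemon osd.0 perf dump throttle-osd_client_messages

# Loosen the cap (0 = no count-based limit).
ceph config set osd osd_client_message_cap 0
</pre>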
09:25 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
... jianwei zhang
09:24 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
... jianwei zhang
09:07 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
reproduce :... jianwei zhang
08:03 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
... jianwei zhang
07:58 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
!https://tracker.ceph.com/attachments/download/6631/osd-cpu-history-change.jpg! jianwei zhang
07:57 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
!osd-cpu-history-change!
As shown in the figure,
16:00 ~ 20:00: osd_client_message_cap=256, CPU up to 200%
af...
jianwei zhang
07:45 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
!osd-cpu-history-change! jianwei zhang
07:36 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
Question:
The OSD throttle-osd_client_messages has been producing a lot of get_or_fail_fail.
Running top -p <osd.pid> shows th...
jianwei zhang
07:29 AM Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (o...
!https://tracker.ceph.com/attachments/download/6628/top-osd-cpu.jpg!
!https://tracker.ceph.com/attachments/download/...
jianwei zhang
07:28 AM Bug #62512 (New): osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_f...
Problem:
high OSD CPU usage
!top-osd-cpu!
!top-H-msgr-worker-cpu!...
jianwei zhang

08/17/2023

10:19 PM Bug #55809: "Leak_IndirectlyLost" valgrind report on mon.c
/a/yuriw-2023-08-16_22:44:42-rados-wip-yuri7-testing-2023-08-16-1309-pacific-distro-default-smithi/7371564/remote/smi... Laura Flores
04:35 PM Backport #62479 (In Progress): quincy: ceph status does not report an application is not enabled ...
Prashant D
05:50 AM Backport #62479 (Resolved): quincy: ceph status does not report an application is not enabled on ...
https://github.com/ceph/ceph/pull/53042 Backport Bot
04:33 PM Backport #62478 (In Progress): reef: ceph status does not report an application is not enabled on...
Prashant D
05:50 AM Backport #62478 (Resolved): reef: ceph status does not report an application is not enabled on th...
https://github.com/ceph/ceph/pull/53041 Backport Bot
01:29 PM Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados...
Seen in a Pacific run: /a/yuriw-2023-08-16_22:40:18-rados-wip-yuri2-testing-2023-08-16-1142-pacific-distro-default-s... Aishwarya Mathuria
05:50 AM Bug #57097 (Pending Backport): ceph status does not report an application is not enabled on the p...
Prashant D

08/16/2023

05:23 PM Bug #62470 (New): Rook: OSD Crash Looping / Caught signal (Aborted) / thread_name:tp_osd_tp
After a discussion on the Rook GitHub: [[https://github.com/rook/rook/discussions/12713#discussioncomment-6730118]], they ... Richard Durso
 
