Project

General

Profile

Activity

From 05/18/2022 to 06/16/2022

06/16/2022

03:28 PM Bug #53685: Assertion `HAVE_FEATURE(features, SERVER_OCTOPUS)' failed.
/a/yuriw-2022-06-11_02:24:12-rados-quincy-release-distro-default-smithi/6873771 Laura Flores
12:40 PM Backport #53338: pacific: osd/scrub: src/osd/scrub_machine.cc: 55: FAILED ceph_assert(state_cast<...
It looks like this backport has been merged in https://github.com/ceph/ceph/pull/45374, and released in 16.2.8, so I ... Benoît Knecht
09:38 AM Bug #55141: thrashers/fastread: assertion failure: rollback_info_trimmed_to == head
+*Observed this in a pacific run:*+
/a/yuriw-2022-06-15_18:29:33-rados-wip-yuri4-testing-2022-06-15-1000-pacific-dis...
Sridhar Seshasayee
09:23 AM Bug #49777: test_pool_min_size: 'check for active or peered' reached maximum tries (5) after wait...
/a/yuriw-2022-06-15_18:29:33-rados-wip-yuri4-testing-2022-06-15-1000-pacific-distro-default-smithi/6881131
Descrip...
Sridhar Seshasayee
09:14 AM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
/a/yuriw-2022-06-15_18:29:33-rados-wip-yuri4-testing-2022-06-15-1000-pacific-distro-default-smithi/6881215 Sridhar Seshasayee
08:05 AM Bug #55726: Drained OSDs are still ACTIVE_PRIMARY - causing high IO latency on clients
... Denis Polom
05:04 AM Bug #55153 (Fix Under Review): Make the mClock config options related to [res, wgt, lim] modifiab...
Sridhar Seshasayee
01:54 AM Bug #55750: mon: slow request of very long time
Neha Ojha wrote:
> yite gu wrote:
> > It appears that this mon request has been completed, but it has not been erased from...
yite gu

06/15/2022

06:56 PM Bug #51904: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to...
This `wait_for_clean` assertion failure is happening with the minsize_recovery thrasher, which is used by rados/thras... Laura Flores
06:21 PM Bug #55750: mon: slow request of very long time
yite gu wrote:
> It appears that this mon request has been completed, but it has not been erased from ops_in_flight_sharded...
Neha Ojha
06:11 PM Bug #55776: octopus: map exx had wrong cluster addr
... Radoslaw Zarzynski
05:54 PM Bug #55726: Drained OSDs are still ACTIVE_PRIMARY - causing high IO latency on clients
Could you please provide the output from @ceph osd lspools@ as well? Radoslaw Zarzynski
05:51 PM Bug #47300 (Resolved): mount.ceph fails to understand AAAA records from SRV record
Matan Breizman
05:50 PM Backport #55513 (Resolved): quincy: mount.ceph fails to understand AAAA records from SRV record
Matan Breizman
05:50 PM Backport #55514 (Resolved): pacific: mount.ceph fails to understand AAAA records from SRV record
Matan Breizman
05:06 PM Backport #55514: pacific: mount.ceph fails to understand AAAA records from SRV record
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46112
merged
Yuri Weinstein
05:03 PM Backport #55296: pacific: malformed json in a Ceph RESTful API call can stop all ceph-mon services
nikhil kshirsagar wrote:
> please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/...
Yuri Weinstein
10:05 AM Bug #56057 (Fix Under Review): Add health error if one or more OSDs registered v1/v2 public ip ad...
Prashant D
07:13 AM Bug #56057 (Pending Backport): Add health error if one or more OSDs registered v1/v2 public ip ad...
In a containerized environment, after an OSD node reboot, some OSDs registered their public v1/v2 addresses on cluster ... Prashant D
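The entry above proposes a health error when OSDs register public v1/v2 addresses on the wrong network. A minimal sketch of such a check (an illustration only; `misregistered_osds` and the sample networks are invented, not the actual Ceph health-check code):

```python
# Sketch (assumption, not Ceph source) of the proposed health check: flag
# any OSD whose registered "public" address actually lies in the cluster
# network rather than the public network.
import ipaddress

def misregistered_osds(osd_addrs, public_net, cluster_net):
    """osd_addrs: dict of osd name -> registered public address."""
    pub = ipaddress.ip_network(public_net)
    clu = ipaddress.ip_network(cluster_net)
    bad = []
    for osd, addr in osd_addrs.items():
        ip = ipaddress.ip_address(addr)
        if ip in clu and ip not in pub:
            bad.append(osd)
    return bad

addrs = {"osd.0": "10.1.0.5",      # in the public network: OK
         "osd.1": "192.168.0.7"}   # in the cluster network: should raise a health error
assert misregistered_osds(addrs, "10.1.0.0/24", "192.168.0.0/24") == ["osd.1"]
```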
09:10 AM Backport #56059 (In Progress): pacific: Assertion failure (ceph_assert(have_pending)) when creati...
Sridhar Seshasayee
08:55 AM Backport #56059 (Resolved): pacific: Assertion failure (ceph_assert(have_pending)) when creating ...
https://github.com/ceph/ceph/pull/46691 Backport Bot
09:07 AM Backport #56060 (In Progress): quincy: Assertion failure (ceph_assert(have_pending)) when creatin...
Sridhar Seshasayee
08:55 AM Backport #56060 (Resolved): quincy: Assertion failure (ceph_assert(have_pending)) when creating n...
https://github.com/ceph/ceph/pull/46689 Backport Bot
08:51 AM Bug #55773 (Pending Backport): Assertion failure (ceph_assert(have_pending)) when creating new OS...
Sridhar Seshasayee

06/14/2022

09:40 PM Bug #49777: test_pool_min_size: 'check for active or peered' reached maximum tries (5) after wait...
Running some tests to try and reproduce the issue and get a sense of how frequently it fails. This has actually been ... Laura Flores
09:05 PM Backport #51287 (In Progress): pacific: LibRadosService.StatusFormat failed, Expected: (0) != (re...
Laura Flores
08:08 PM Bug #53855: rados/test.sh hangs while running LibRadosTwoPoolsPP.ManifestFlushDupCount
@Myoungwon Oh does this look like the same thing to you? Perhaps your fix needs to be backported to Pacific.
/a/yu...
Laura Flores
03:03 PM Bug #52316: qa/tasks/mon_thrash.py: _do_thrash AssertionError len(s['quorum']) == len(mons)
/a/yuriw-2022-06-13_16:36:31-rados-wip-yuri7-testing-2022-06-13-0706-distro-default-smithi/6876523
Description: ra...
Laura Flores
02:37 PM Bug #56034: qa/standalone/osd/divergent-priors.sh fails in test TEST_divergent_3()
Another detail to note is that this particular test has the pg autoscaler enabled, as opposed to TEST_divergent_2(), ... Laura Flores
10:48 AM Bug #56034 (Resolved): qa/standalone/osd/divergent-priors.sh fails in test TEST_divergent_3()
/a/yuriw-2022-06-13_16:36:31-rados-wip-yuri7-testing-2022-06-13-0706-distro-default-smithi/6876516
Also historical...
Sridhar Seshasayee
06:22 AM Bug #56030: frequently bringing an OSD down and up may cause recovery not to be asynchronous
I set osd_async_recovery_min_cost = 0, hoping for async recovery anyway zhouyue zhou
03:57 AM Bug #56030 (Fix Under Review): frequently bringing an OSD down and up may cause recovery not to be asynchronous
ceph version: octopus 15.2.13
In my test cluster there are 6 OSDs, 3 for the bucket index pool, 3 for other pools; there ar...
zhouyue zhou
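The two entries above concern the osd_async_recovery_min_cost threshold. A simplified model of the decision it gates (an illustration only; `choose_async_recovery` and its inputs are invented, not the OSD's actual code path):

```python
# Toy model (assumption, not Ceph source): peers whose approximate
# recovery cost exceeds osd_async_recovery_min_cost are kept out of the
# acting set and recovered asynchronously; cheaper peers are recovered
# synchronously, which blocks client I/O on the affected objects.
def choose_async_recovery(candidates, osd_async_recovery_min_cost):
    """candidates: dict of osd id -> approximate number of missing objects."""
    return {osd for osd, cost in candidates.items()
            if cost > osd_async_recovery_min_cost}

# With the default threshold, an OSD that was only briefly down (few
# missing objects) is recovered synchronously:
assert choose_async_recovery({0: 5}, 100) == set()
# Setting the option to 0, as in the comment above, favors async
# recovery whenever anything at all is missing:
assert choose_async_recovery({0: 5}, 0) == {0}
```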

06/13/2022

10:40 PM Bug #56028 (New): thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.vers...
This assertion is resurfacing in Pacific runs. The last fix for this was tracked in #46323, but this test branch incl... Laura Flores
10:27 PM Bug #52737: osd/tests: stat mismatch
@Ronen I'm pretty sure this is a duplicate of #50222 Laura Flores
10:26 PM Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
/a/yuriw-2022-06-07_19:48:58-rados-wip-yuri6-testing-2022-06-07-0955-pacific-distro-default-smithi/6866688 Laura Flores
03:12 AM Bug #52948: osd: fails to come up: "teuthology.misc:7 of 8 OSDs are up"
/a/yuriw-2022-06-09_03:58:30-smoke-quincy-release-distro-default-smithi/6869659/
Test description: smoke/basic/{clus...
Aishwarya Mathuria

06/10/2022

05:38 PM Bug #52948: osd: fails to come up: "teuthology.misc:7 of 8 OSDs are up"
Sridhar Seshasayee wrote:
> *+Quick Update+*
> This was again hit recently in
> /a/yuriw-2022-06-09_03:58:30-smoke...
Yuri Weinstein
05:10 PM Bug #52948: osd: fails to come up: "teuthology.misc:7 of 8 OSDs are up"
verifying again
http://pulpito.front.sepia.ceph.com/yuriw-2022-06-11_02:22:38-smoke-quincy-release-distro-default-sm...
Yuri Weinstein
03:47 PM Bug #52948: osd: fails to come up: "teuthology.misc:7 of 8 OSDs are up"
*+Quick Update+*
This was again hit recently in
/a/yuriw-2022-06-09_03:58:30-smoke-quincy-release-distro-default-sm...
Sridhar Seshasayee
05:20 PM Backport #55981: quincy: don't trim excessive PGLog::IndexedLog::dups entries on-line
Radoslaw Zarzynski wrote:
> https://github.com/ceph/ceph/pull/46605
merged
Yuri Weinstein
04:52 PM Bug #55001: rados/test.sh: Early exit right after LibRados global tests complete
/a/yuriw-2022-06-10_03:10:47-rados-wip-yuri4-testing-2022-06-09-1510-quincy-distro-default-smithi/6871955
Coredump...
Laura Flores
04:48 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
Nitzan Mordechai wrote:
> That could work, but when we have socket failure injection, the error callback will not be...
Laura Flores
04:44 PM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
/a/yuriw-2022-06-10_03:10:47-rados-wip-yuri4-testing-2022-06-09-1510-quincy-distro-default-smithi/6872050 Laura Flores
04:39 PM Feature #55982: log the numbers of dups in PG Log
https://github.com/ceph/ceph/pull/46607 merged Yuri Weinstein
08:18 AM Bug #55995 (New): OSD Crash: /lib64/libpthread.so.0(+0x12ce0) [0x7f94cdcbbce0]
Hi,
I recently upgraded my ceph cluster from 14.2.x to 16.2.7 and switched to docker deployment. Since then, I see...
Kilian Ries

06/09/2022

09:42 PM Backport #55981: quincy: don't trim excessive PGLog::IndexedLog::dups entries on-line
https://github.com/ceph/ceph/pull/46605 Radoslaw Zarzynski
06:36 PM Backport #55981 (Resolved): quincy: don't trim excessive PGLog::IndexedLog::dups entries on-line
Radoslaw Zarzynski
08:42 PM Backport #55985 (In Progress): octopus: log the numbers of dups in PG Log
https://github.com/ceph/ceph/pull/46609 Radoslaw Zarzynski
08:35 PM Backport #55985 (Resolved): octopus: log the numbers of dups in PG Log
Backport Bot
08:40 PM Backport #55984 (In Progress): pacific: log the numbers of dups in PG Log
https://github.com/ceph/ceph/pull/46608 Radoslaw Zarzynski
08:35 PM Backport #55984 (Resolved): pacific: log the numbers of dups in PG Log
Backport Bot
08:38 PM Backport #55983 (In Progress): quincy: log the numbers of dups in PG Log
https://github.com/ceph/ceph/pull/46607 Radoslaw Zarzynski
08:35 PM Backport #55983 (Resolved): quincy: log the numbers of dups in PG Log
Backport Bot
08:32 PM Feature #55982 (Pending Backport): log the numbers of dups in PG Log
Approved for `main`, QA is going on. Switching to _Pending backport_ before the merge to unblock backports. Radoslaw Zarzynski
08:03 PM Feature #55982 (Fix Under Review): log the numbers of dups in PG Log
Radoslaw Zarzynski
07:59 PM Feature #55982 (Resolved): log the numbers of dups in PG Log
This is a feature request that is critical for investigating / verifying the dups inflation issue. Radoslaw Zarzynski
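Several entries above concern pg log dups inflation and on-line trimming. A toy model of what trimming back to the tracked bound looks like (an illustration only; `trim_dups` and the numbers are invented, not the PGLog implementation):

```python
# Sketch (assumption, not Ceph source): dup entries record recently
# trimmed pg log entries so replayed client requests can be detected.
# The "dups inflation" issue was this list growing far past its intended
# bound; trimming drops the oldest entries until at most
# osd_pg_log_dups_tracked remain.
from collections import deque

def trim_dups(dups, tracked=3000):
    """Drop the oldest dup entries until at most `tracked` remain."""
    while len(dups) > tracked:
        dups.popleft()
    return dups

dups = deque(range(10_000))   # an inflated dups list
trim_dups(dups, tracked=3000)
assert len(dups) == 3000
assert dups[0] == 7000        # the oldest entries were dropped first
```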
01:28 PM Backport #55747: pacific: Support blocklisting a CIDR range
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46470
merged
Yuri Weinstein

06/08/2022

06:23 PM Bug #52969 (Fix Under Review): use "ceph df" command found pool max avail increase when there are...
Radoslaw Zarzynski
06:22 PM Backport #55973 (Rejected): pacific: found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi1...
Backport Bot
06:22 PM Backport #55972 (Resolved): quincy: found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10...
Backport Bot
06:16 PM Bug #49525 (Pending Backport): found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi1012151...
Neha Ojha
06:13 PM Bug #55407 (Rejected): quincy osd's fail to boot and crash
Closing this ticket. The new crash is tracked independently (https://tracker.ceph.com/issues/55698). Radoslaw Zarzynski
06:10 PM Bug #55851: Assert in Ceph messenger
From Neha:
* http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?var-sig_v2=12eed3bdd041d05365...
Radoslaw Zarzynski
06:04 PM Bug #45318: Health check failed: 2/6 mons down, quorum b,a,c,e (MON_DOWN)" in cluster log running...
This isn't octopus-specific, as we saw it in pacific as well. Radoslaw Zarzynski
05:52 PM Bug #49888: rados/singleton: radosbench.py: teuthology.exceptions.MaxWhileTries: reached maximum ...
Not high priority. Possibly a test issue. Radoslaw Zarzynski
05:49 PM Bug #49777: test_pool_min_size: 'check for active or peered' reached maximum tries (5) after wait...
Maybe let's talk about that in one of the RADOS Team meetings. Radoslaw Zarzynski
03:25 PM Backport #55309: pacific: prometheus metrics shows incorrect ceph version for upgraded ceph daemon
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46429
merged
Yuri Weinstein
03:25 PM Backport #55308: pacific: Manager is failing to keep updated metadata in daemon_state for upgrade...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46427
merged
Yuri Weinstein
03:12 PM Bug #52724 (Duplicate): octopus: 1/3 mons down, quorum a,c (MON_DOWN)" in cluster log'
Laura Flores
03:09 PM Bug #53855 (Resolved): rados/test.sh hangs while running LibRadosTwoPoolsPP.ManifestFlushDupCount
Laura Flores
02:38 PM Bug #51076 (Resolved): "wait_for_recovery: failed before timeout expired" during thrashosd test w...
Laura Flores
02:38 PM Backport #55743 (Resolved): octopus: "wait_for_recovery: failed before timeout expired" during th...
Laura Flores
07:53 AM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
That could work, but when we have socket failure injection, the error callback will not be called in the python API ... Nitzan Mordechai
06:17 AM Bug #55836 (Fix Under Review): add an asok command for pg log investigations
Nitzan Mordechai
02:24 AM Backport #55305 (In Progress): quincy: Manager is failing to keep updated metadata in daemon_stat...
Prashant D

06/07/2022

05:42 PM Bug #53729 (Resolved): ceph-osd takes all memory before oom on boot
Neha Ojha
05:42 PM Bug #54296 (Resolved): OSDs using too much memory
Neha Ojha
05:41 PM Backport #55633 (Resolved): octopus: ceph-osd takes all memory before oom on boot
Neha Ojha
05:41 PM Backport #55631 (Resolved): pacific: ceph-osd takes all memory before oom on boot
Neha Ojha
05:13 PM Bug #49777: test_pool_min_size: 'check for active or peered' reached maximum tries (5) after wait...
/a/yuriw-2022-05-31_21:35:41-rados-wip-yuri2-testing-2022-05-31-1300-pacific-distro-default-smithi/6856269
Descrip...
Laura Flores
04:04 PM Backport #53972: pacific: BufferList.rebuild_aligned_size_and_memory failure
Radoslaw Zarzynski wrote:
> https://github.com/ceph/ceph/pull/46215
merged
Yuri Weinstein
04:03 PM Bug #50806: osd/PrimaryLogPG.cc: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_mis...
https://github.com/ceph/ceph/pull/46120 merged Yuri Weinstein
04:02 PM Backport #55281: pacific: mon/OSDMonitor: properly set last_force_op_resend in stretch mode
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/45870
merged
Yuri Weinstein
03:42 PM Bug #49888: rados/singleton: radosbench.py: teuthology.exceptions.MaxWhileTries: reached maximum ...
/a/yuriw-2022-06-02_00:50:42-rados-wip-yuri4-testing-2022-06-01-1350-pacific-distro-default-smithi/6859734
Descrip...
Laura Flores
03:30 PM Bug #48965: qa/standalone/osd/osd-force-create-pg.sh: TEST_reuse_id: return 1
/a/yuriw-2022-06-02_00:50:42-rados-wip-yuri4-testing-2022-06-01-1350-pacific-distro-default-smithi/6859929 Laura Flores
03:29 PM Bug #55906: cephfs/metrics/Types.h: In function 'std::ostream& operator<<(std::ostream&, const Cl...
Oops, updated the wrong Tracker. Laura Flores
06:01 AM Bug #55906: cephfs/metrics/Types.h: In function 'std::ostream& operator<<(std::ostream&, const Cl...
This has been fixed by https://tracker.ceph.com/issues/50822 Xiubo Li
05:55 AM Bug #55906 (New): cephfs/metrics/Types.h: In function 'std::ostream& operator<<(std::ostream&, co...
/home/teuthworker/archive/yuriw-2022-06-02_14:44:32-rados-wip-yuri4-testing-2022-06-01-1350-pacific-distro-default-sm... Nitzan Mordechai
03:20 PM Bug #45318: Health check failed: 2/6 mons down, quorum b,a,c,e (MON_DOWN)" in cluster log running...
/a/yuriw-2022-06-02_00:50:42-rados-wip-yuri4-testing-2022-06-01-1350-pacific-distro-default-smithi/6859916 Laura Flores
02:01 PM Backport #55298: octopus: malformed json in a Ceph RESTful API call can stop all ceph-mon services
nikhil kshirsagar wrote:
> please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/...
Yuri Weinstein
01:56 AM Bug #55905 (New): Failed to build rados.cpython-310-x86_64-linux-gnu.so
I build ceph on Ubuntu22.04, but I meet the error. And under my research, I found a way to solve the error, but I don... Hualong Feng

06/06/2022

08:06 PM Bug #55836: add an asok command for pg log investigations
It'd be nice if we could retrieve pg log dups length by means of an existing command. FWIW, we log the "approx pg log... Neha Ojha
06:38 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
Tested with the fixed version and now it is working fine!... Vikhyat Umrao
04:56 PM Bug #51076 (Pending Backport): "wait_for_recovery: failed before timeout expired" during thrashos...
Laura Flores
04:55 PM Bug #51076 (Resolved): "wait_for_recovery: failed before timeout expired" during thrashosd test w...
Laura Flores
04:55 PM Backport #55745 (Resolved): pacific: "wait_for_recovery: failed before timeout expired" during th...
Laura Flores
04:55 PM Bug #50842 (Resolved): pacific: recovery does not complete because of rw_manager lock not being ...
Laura Flores
02:58 PM Backport #55746: quincy: Support blocklisting a CIDR range
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46469
merged
Yuri Weinstein

06/05/2022

10:18 AM Bug #55407: quincy osd's fail to boot and crash
Radoslaw Zarzynski wrote:
> This looks like something new and unrelated to other crashes in this ticket, so created ...
Gonzalo Aguilar Delgado

06/03/2022

10:14 PM Bug #46877: mon_clock_skew_check: expected MON_CLOCK_SKEW but got none
Spotted in Quincy:
/a/yuriw-2022-06-02_20:24:42-rados-wip-yuri5-testing-2022-06-02-0825-quincy-distro-default-smit...
Laura Flores
02:50 PM Bug #55851 (Resolved): Assert in Ceph messenger
Context:
Ceph balancer was busy balancing: PG remaps...
Stefan Kooman
01:23 AM Backport #55306 (Resolved): quincy: prometheus metrics shows incorrect ceph version for upgraded ...
should be fixed in 17.2.1 Adam King

06/02/2022

07:26 PM Bug #55836 (Resolved): add an asok command for pg log investigations
The rationale is that @ceph-objectstore-tool --op log@ requires stopping the OSD, and thus is intrusive.
This feature i...
Radoslaw Zarzynski
09:39 AM Backport #55767 (In Progress): octopus: rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
Nitzan Mordechai
09:38 AM Backport #55768 (In Progress): pacific: rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
Nitzan Mordechai
08:35 AM Bug #54004 (Rejected): When creating erasure-code-profile incorrectly set parameters, it can be c...
Nitzan Mordechai
08:35 AM Bug #54004: When creating erasure-code-profile incorrectly set parameters, it can be created succ...
The profile can be created, but that doesn't mean that you will use it yet.
As long as you are not using it, no err...
Nitzan Mordechai
06:32 AM Bug #54172: ceph version 16.2.7 PG scrubs not progressing

We've seen this at a customer's cluster as well. A simple re-peer of the PG gets it unstuck. We've not investigated a...
Wout van Heeswijk

06/01/2022

02:52 PM Backport #55631: pacific: ceph-osd takes all memory before oom on boot
This PR is ready to merge. Can it be merged so this change ends up in the next Pacific release? Wout van Heeswijk
10:47 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
jianwei zhang wrote:
> https://github.com/ceph/ceph/pull/46478
test result
jianwei zhang
10:40 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
https://github.com/ceph/ceph/pull/46478 jianwei zhang
08:15 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
... jianwei zhang
07:18 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
The original intention of raising this question is that testers (users) are confused as to why MAX_AVAIL does not dec... jianwei zhang
06:17 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
step4 vs step5:
4. kill 9 osd.0.pid - OSD.0 OUT unset nobackfill --> recovery HEALTH_OK
STORED = 1.1G ///increase 1...
jianwei zhang
06:15 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
5. remove out osd.0... jianwei zhang
06:03 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
for Problem2 step1 vs step4:
osd.0 already out and recovery complete HEALTH_OK, but STORED/(DATA) 1.0G increase ...
jianwei zhang
05:59 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
for ceph df detail commands
I don't think raw_used_rate should be adjusted:...
jianwei zhang
05:51 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
Regarding the MAX AVAIL field, I think the down or out osd.0 should be excluded... jianwei zhang
05:44 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
Problem1 step1 vs step2:
1. ceph cluster initial state
STORED = 1.0G
(DATA) = 1.0G
MAX AVAIL = 260G
2. ...
jianwei zhang
05:30 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
ceph v15.2.13
I found same problem
1. ceph cluster initial state...
jianwei zhang
06:23 AM Fix #54565 (Resolved): Add snaptrim stats to the existing PG stats.
Sridhar Seshasayee
06:23 AM Backport #54612 (Resolved): quincy: Add snaptrim stats to the existing PG stats.
Sridhar Seshasayee
06:22 AM Bug #55186 (Resolved): Doc: Update mclock release notes regarding an existing issue.
Sridhar Seshasayee
06:21 AM Backport #55219 (Resolved): quincy: Doc: Update mclock release notes regarding an existing issue.
Sridhar Seshasayee
06:19 AM Feature #51984 (Resolved): [RFE] Provide warning when the 'require-osd-release' flag does not mat...
Sridhar Seshasayee
06:18 AM Backport #53549 (Rejected): nautilus: [RFE] Provide warning when the 'require-osd-release' flag d...
The backport to nautilus was deemed not needed. See BZ https://bugzilla.redhat.com/show_bug.cgi?id=2033078 for more d... Sridhar Seshasayee
05:57 AM Backport #53550 (Resolved): octopus: [RFE] Provide warning when the 'require-osd-release' flag do...
Sridhar Seshasayee
04:58 AM Bug #49525 (Fix Under Review): found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi1012151...
Indeed caused by scrub starting while the PG is being snap-trimmed.
Ronen Friedman
04:51 AM Bug #55794 (Duplicate): scrub: scrub is not prevented from started while snap-trimming is in prog...
Laura Flores wrote:
> @Ronen is this already tracked in #49525?
Yes. Thanks. I will mark as duplicate.
Ronen Friedman

05/31/2022

11:52 PM Bug #54316 (Resolved): mon/MonCommands.h: target_size_ratio range is incorrect
Kamoltat (Junior) Sirivadhna
11:51 PM Backport #54567 (Resolved): pacific: mon/MonCommands.h: target_size_ratio range is incorrect
Kamoltat (Junior) Sirivadhna
11:50 PM Backport #54568 (Resolved): octopus: mon/MonCommands.h: target_size_ratio range is incorrect
Kamoltat (Junior) Sirivadhna
11:33 PM Backport #55747 (In Progress): pacific: Support blocklisting a CIDR range
Greg Farnum
11:18 PM Backport #55746 (In Progress): quincy: Support blocklisting a CIDR range
Greg Farnum
10:26 PM Bug #55794: scrub: scrub is not prevented from started while snap-trimming is in progress
@Ronen is this already tracked in #49525? Laura Flores
09:38 PM Bug #55809: "Leak_IndirectlyLost" valgrind report on mon.c
Laura Flores wrote:
> /a/yuriw-2022-05-27_21:59:17-rados-wip-yuri-testing-2022-05-27-0934-distro-default-smithi/6851...
Laura Flores
09:35 PM Bug #55809 (New): "Leak_IndirectlyLost" valgrind report on mon.c
/a/yuriw-2022-05-27_21:59:17-rados-wip-yuri-testing-2022-05-27-0934-distro-default-smithi/6851271/remote/smithi085/lo... Laura Flores
06:13 PM Backport #53971 (Resolved): octopus: BufferList.rebuild_aligned_size_and_memory failure
Neha Ojha
06:07 PM Backport #53971: octopus: BufferList.rebuild_aligned_size_and_memory failure
Radoslaw Zarzynski wrote:
> https://github.com/ceph/ceph/pull/46216
merged
Yuri Weinstein
03:10 PM Bug #51904: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to...
Other reported instances of this `wait_for_clean` assertion failure where the pgmap has a pg stuck in recovery have l... Laura Flores
03:04 PM Bug #55726: Drained OSDs are still ACTIVE_PRIMARY - causing high IO latency on clients
Hi,
I set debug mode on OSDs and MONs but didn't find the string 'choose_acting'.
Also, what I found: our EC profile ...
Denis Polom
02:42 PM Bug #39150 (Resolved): mon: "FAILED ceph_assert(session_map.sessions.empty())" when out of quorum
Neha Ojha
02:41 PM Bug #50659 (Resolved): Segmentation fault under Pacific 16.2.1 when using a custom crush location...
Neha Ojha
02:39 PM Bug #53306 (Resolved): ceph -s mon quorum age negative number
Neha Ojha
02:38 PM Backport #55280 (Resolved): quincy: mon/OSDMonitor: properly set last_force_op_resend in stretch ...
Neha Ojha
02:37 PM Bug #53327 (Resolved): osd: osd_fast_shutdown_notify_mon not quite right and enable osd_fast_shut...
Neha Ojha
02:34 PM Backport #55632 (Resolved): quincy: ceph-osd takes all memory before oom on boot
Neha Ojha
12:52 PM Bug #55435 (Fix Under Review): mon/Elector: notify_ranked_removed() does not properly erase dead_...
Kamoltat (Junior) Sirivadhna
05:18 AM Bug #55798 (Fix Under Review): scrub starts message missing in cluster log
Prashant D
05:15 AM Bug #55798 (Pending Backport): scrub starts message missing in cluster log
We used to log "scrub starts" and "deep-scrub starts" messages if the scrub/deep-scrub process has been started for the pg... Prashant D

05/30/2022

01:27 PM Bug #55773 (Fix Under Review): Assertion failure (ceph_assert(have_pending)) when creating new OS...
Sridhar Seshasayee
01:16 PM Backport #55309 (In Progress): pacific: prometheus metrics shows incorrect ceph version for upgra...
Prashant D
01:13 PM Backport #55308 (In Progress): pacific: Manager is failing to keep updated metadata in daemon_sta...
Prashant D
12:24 PM Bug #55794 (Duplicate): scrub: scrub is not prevented from started while snap-trimming is in prog...
The scrub code only tests the target PG for 'active' & 'clean', and snap-trimming PGs are 'clean'.
For example:
http...
Ronen Friedman
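The gap described in the entry above can be sketched as a predicate that checks only 'active' and 'clean' (an illustration only; `scrub_allowed` and its signature are invented, not the actual scrub scheduler code):

```python
# Sketch (assumption, not Ceph source): a scrub precondition that only
# requires 'active' and 'clean' will start on a PG that is still
# snap-trimming, because trimming PGs remain 'clean'.
def scrub_allowed(pg_state, also_block_on=()):
    required = {"active", "clean"}
    return required <= pg_state and not (pg_state & set(also_block_on))

trimming_pg = {"active", "clean", "snaptrim"}
# The current check lets scrub race with snap-trimming:
assert scrub_allowed(trimming_pg)
# A stricter check would also refuse to start while 'snaptrim' is set:
assert not scrub_allowed(trimming_pg, also_block_on=["snaptrim"])
```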
09:25 AM Backport #55792 (Rejected): octopus: CEPH Graylog Logging Missing "host" Field
Konstantin Shalygin
09:25 AM Backport #55791 (Rejected): pacific: CEPH Graylog Logging Missing "host" Field
Konstantin Shalygin

05/27/2022

10:29 PM Bug #55787 (New): mon/crush_ops.sh: Error ENOENT: item osd.7 does not exist
Found in an Octopus teuthology run:
/a/yuriw-2022-05-14_14:30:10-rados-wip-yuri5-testing-2022-05-13-1402-octopus-d...
Laura Flores
03:59 PM Bug #55383 (Resolved): monitor cluster logs(ceph.log) appear empty until rotated
Vikhyat Umrao
03:59 PM Backport #55742 (Resolved): quincy: monitor cluster logs(ceph.log) appear empty until rotated
Vikhyat Umrao
03:50 PM Backport #55742: quincy: monitor cluster logs(ceph.log) appear empty until rotated
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46374
merged
Yuri Weinstein
03:35 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
/a/yuriw-2022-05-26_23:23:48-rados-wip-yuri2-testing-2022-05-26-1430-quincy-distro-default-smithi/6849426$... Laura Flores

05/26/2022

10:56 PM Bug #55776 (New): octopus: map exx had wrong cluster addr
Description: rados/objectstore/{backends/ceph_objectstore_tool supported-random-distro$/{ubuntu_18.04}}
/a/yuriw-2...
Laura Flores
10:33 PM Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
/a/yuriw-2022-05-13_14:13:55-rados-wip-yuri3-testing-2022-05-12-1609-octopus-distro-default-smithi/6832544
Descrip...
Laura Flores
05:06 PM Bug #55773: Assertion failure (ceph_assert(have_pending)) when creating new OSDs during OSD deplo...
+*ANALYSIS*+
Note that the analysis is for the first crash when the leader was: mon.f25-h23-000-r730xd.rdu2.scalel...
Sridhar Seshasayee
04:54 PM Bug #55773 (Resolved): Assertion failure (ceph_assert(have_pending)) when creating new OSDs durin...
See https://bugzilla.redhat.com/show_bug.cgi?id=2086419 for more details.
+*Assertion Failure*+...
Sridhar Seshasayee
08:52 AM Bug #55355: osd thread deadlock
I think this problem may be a problem with ProtocolV2... jianwei zhang
01:56 AM Bug #55750: mon: slow request of very long time
https://github.com/ceph/ceph/pull/41516
https://github.com/ceph/ceph/commit/a124ee85b03e15f4ea371358008ecac65f9f4e50...
yite gu

05/25/2022

08:37 PM Bug #55750: mon: slow request of very long time
Radoslaw Zarzynski wrote:
> Could you please provide an info on which version of Ceph this issue happened?
# ceph -...
yite gu
06:19 PM Bug #55750 (Need More Info): mon: slow request of very long time
Could you please provide an info on which version of Ceph this issue happened? Radoslaw Zarzynski
08:17 PM Bug #53895 (Resolved): Unable to format `ceph config dump` command output in yaml using `-f yaml`
Laura Flores
06:46 PM Bug #51904: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to...
Not urgent, perhaps not low-hanging-fruit but still good as a training issue. Radoslaw Zarzynski
06:41 PM Bug #55726 (Need More Info): Drained OSDs are still ACTIVE_PRIMARY - causing high IO latency on c...
It would be really helpful to compare logs around @choose_acting@ from Nautilus vs Octopus. Radoslaw Zarzynski
06:32 PM Backport #55768 (Resolved): pacific: rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
https://github.com/ceph/ceph/pull/46499 Backport Bot
06:32 PM Backport #55767 (Rejected): octopus: rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
https://github.com/ceph/ceph/pull/46500 Backport Bot
06:28 PM Bug #45868 (Pending Backport): rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
Neha Ojha
06:27 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
Let me paste Laura's comment from https://github.com/ceph/ceph/pull/45825:
> @NitzanMordhai perhaps similar logi...
Radoslaw Zarzynski
06:11 PM Bug #46847: Loss of placement information on OSD reboot
Notes from the bug scrub:
1. There is a theoretical way to enter backfill instead of recovery in such a scenario.
...
Radoslaw Zarzynski
05:57 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
https://tracker.ceph.com/issues/53685 shows the issue is not restricted just to @MOSDPGLog@. Radoslaw Zarzynski
05:56 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
The investigation doc: https://docs.google.com/document/d/1s-Vzv3yLTMSO8Hz_MHMg5ix1v53P4jlN6dX1L06yyls/edit#. Radoslaw Zarzynski
03:38 PM Backport #55743 (In Progress): octopus: "wait_for_recovery: failed before timeout expired" during...
Laura Flores
03:37 PM Backport #55745 (In Progress): pacific: "wait_for_recovery: failed before timeout expired" during...
Laura Flores
03:33 PM Backport #55744 (Resolved): quincy: "wait_for_recovery: failed before timeout expired" during thr...
Laura Flores
03:01 PM Backport #55624 (Resolved): quincy: Unable to format `ceph config dump` command output in yaml us...
Laura Flores
12:12 PM Feature #55764 (New): Adaptive mon_warn_pg_not_deep_scrubbed_ratio according to actual scrub thro...
This request comes from the Science Users Working Group https://pad.ceph.com/p/Ceph_Science_User_Group_20220524
Fo...
Dan van der Ster
06:54 AM Bug #55355: osd thread deadlock
... jianwei zhang
06:53 AM Bug #55355: osd thread deadlock
910583--wait-->910587: (gdb) t 32 wait MgrClient::lock (owner=910587)
910587--wait-->910429: (gdb) t 62 hold MgrClie...
jianwei zhang
03:43 AM Bug #55355: osd thread deadlock
thread-35 : holding AsyncMessenger::lock, waiting AsyncConnection::lock
thread-3: holding AsyncConnection::lock, wai...
jianwei zhang
03:12 AM Bug #55355: osd thread deadlock
... jianwei zhang
03:06 AM Bug #55355: osd thread deadlock
ceph v15.2.13
I found an almost identical stack waiting for a lock...
jianwei zhang

05/24/2022

03:51 PM Backport #55744 (In Progress): quincy: "wait_for_recovery: failed before timeout expired" during ...
Laura Flores
10:40 AM Bug #55662 (Rejected): EC: Clay assert fail ../src/osd/ECBackend.cc: 685: FAILED ceph_assert(pop....
Nitzan Mordechai
10:39 AM Bug #55662: EC: Clay assert fail ../src/osd/ECBackend.cc: 685: FAILED ceph_assert(pop.data.length...
The test needed osd_read_ec_check_for_errors to be set to true; when it is set, the EIO error is ignored and we can g... Nitzan Mordechai
03:04 AM Bug #55750: mon: slow request of very long time
It appears that this mon request has completed, but it has not been erased from ops_in_flight_sharded?
yite gu
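The comment above suspects an op that finished but was never erased from the in-flight tracker. A toy model of that failure mode — this is an illustrative sketch, not Ceph's TrackedOp/OpTracker code; all names here are hypothetical:

```python
class OpTracker:
    """Toy in-flight op tracker: ops must be explicitly erased on
    completion, or they keep aging and surface as 'slow request' warnings."""
    def __init__(self, slow_threshold_sec=30.0):
        self.slow_threshold = slow_threshold_sec
        self.ops_in_flight = {}  # op_id -> start timestamp

    def register(self, op_id, now):
        self.ops_in_flight[op_id] = now

    def complete(self, op_id):
        # The erase step the bug report suspects is being skipped.
        self.ops_in_flight.pop(op_id, None)

    def slow_ops(self, now):
        return [op for op, start in self.ops_in_flight.items()
                if now - start > self.slow_threshold]

t = OpTracker()
t.register("osd_pgtemp", now=0.0)
t.register("log_entry", now=0.0)
t.complete("log_entry")      # properly erased on completion
# "osd_pgtemp" also finished, but complete() was never called for it,
# so it keeps aging and eventually reads as a very old slow request:
assert t.slow_ops(now=100.0) == ["osd_pgtemp"]
```

This is consistent with the symptom in the ticket: a request that is actually done but still reported as a slow op of "very long time", because only the erase step was missed.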
02:47 AM Bug #55750: mon: slow request of very long time
... yite gu
02:45 AM Bug #55750 (Need More Info): mon: slow request of very long time
... yite gu
02:38 AM Bug #50462: OSDs crash in osd/osd_types.cc: FAILED ceph_assert(clone_overlap.count(clone))
We are seeing this bug on a replicated pool in Nautilus 14.2.15 to 14.2.22.
Two of our OSDs are stuck in a crash loop ...
Justin Mammarella

05/23/2022

11:56 PM Backport #55747 (Resolved): pacific: Support blocklisting a CIDR range
https://github.com/ceph/ceph/pull/46470 Backport Bot
11:56 PM Backport #55746 (Resolved): quincy: Support blocklisting a CIDR range
https://github.com/ceph/ceph/pull/46469 Backport Bot
11:52 PM Feature #53050: Support blocklisting a CIDR range
The Backport field was empty, therefore no backport tickets were created. Neha Ojha
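Feature #53050 above is about blocklisting an entire CIDR range rather than individual addresses. The core membership check can be sketched in a few lines of Python with the standard `ipaddress` module — this is an illustration of the semantics only, not Ceph's C++ implementation, and the function name is an assumption:

```python
import ipaddress

def is_blocklisted(client_addr, blocklist):
    """Illustrative sketch: a client is rejected if its address falls inside
    any blocklisted CIDR range. Plain addresses act as single-host ranges
    (/32 for IPv4, /128 for IPv6)."""
    addr = ipaddress.ip_address(client_addr)
    return any(addr in ipaddress.ip_network(entry, strict=False)
               for entry in blocklist)

blocklist = ["192.168.1.0/24", "10.0.0.5"]
assert is_blocklisted("192.168.1.77", blocklist)     # inside the /24 range
assert not is_blocklisted("192.168.2.1", blocklist)  # outside every entry
assert is_blocklisted("10.0.0.5", blocklist)         # exact host match
```

Blocklisting a range in one entry avoids enumerating every client address individually, which is the motivation for the feature.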
11:32 PM Backport #55745 (Resolved): pacific: "wait_for_recovery: failed before timeout expired" during th...
https://github.com/ceph/ceph/pull/46391 Backport Bot
11:32 PM Backport #55744 (Resolved): quincy: "wait_for_recovery: failed before timeout expired" during thr...
https://github.com/ceph/ceph/pull/46384 Backport Bot
11:31 PM Backport #55743 (Resolved): octopus: "wait_for_recovery: failed before timeout expired" during th...
https://github.com/ceph/ceph/pull/46392 Backport Bot
11:26 PM Bug #51076 (Pending Backport): "wait_for_recovery: failed before timeout expired" during thrashos...
Neha Ojha
09:24 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
/a/yuriw-2022-05-19_18:50:25-rados-wip-yuri4-testing-2022-05-19-0831-quincy-distro-default-smithi/6841763
Descriptio...
Laura Flores
08:16 PM Backport #55742 (In Progress): quincy: monitor cluster logs(ceph.log) appear empty until rotated
Vikhyat Umrao
07:55 PM Backport #55742 (Resolved): quincy: monitor cluster logs(ceph.log) appear empty until rotated
https://github.com/ceph/ceph/pull/46374 Backport Bot
07:50 PM Bug #55383 (Pending Backport): monitor cluster logs(ceph.log) appear empty until rotated
Vikhyat Umrao

05/21/2022

11:45 PM Backport #55306 (In Progress): quincy: prometheus metrics shows incorrect ceph version for upgrad...
Adam King
11:41 PM Backport #55306: quincy: prometheus metrics shows incorrect ceph version for upgraded ceph daemon
including this in https://github.com/ceph/ceph/pull/46360 Adam King

05/20/2022

03:33 PM Bug #55726: Drained OSDs are still ACTIVE_PRIMARY - causing high IO latency on clients
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/NQKDCBJ2SH3DTUCMV6KU4T3EGKOSCGJV/ Ilya Dryomov
02:14 PM Bug #55726 (Need More Info): Drained OSDs are still ACTIVE_PRIMARY - causing high IO latency on c...
Hi,
I have observed high latencies and mount points hanging since the Octopus release,
and it's still observed on Pacific l...
Denis Polom
03:24 PM Bug #46847: Loss of placement information on OSD reboot
We also encounter a similar issue with an EC pool during rebalance; sometimes (OSD overload or PG peering crash), th... Yao Ning

05/19/2022

10:01 PM Bug #51076 (Fix Under Review): "wait_for_recovery: failed before timeout expired" during thrashos...
Laura Flores
09:19 PM Bug #51904: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to...
Neha Ojha wrote:
> Laura Flores wrote:
> > /a/yuriw-2022-03-25_18:42:52-rados-wip-yuri7-testing-2022-03-24-1341-pac...
Laura Flores
01:55 PM Bug #55711 (Fix Under Review): mon: race condition between `mgr fail` and MgrMonitor::prepare_bea...
Radoslaw Zarzynski
01:51 PM Bug #55711 (Resolved): mon: race condition between `mgr fail` and MgrMonitor::prepare_beacon()
https://gist.github.com/rzarzynski/25ac59c8422e9ad0b1710a765a77f19a#the-race-condition Radoslaw Zarzynski
06:01 AM Bug #55708 (Fix Under Review): Reducing 2 Monitors Causes Stray Daemon
Example of the problem:
Roles:
smithi001: mon.a
smithi002: mon.b
smithi070: mon.c
smithi100: mon.d
smithi2...
Kamoltat (Junior) Sirivadhna
05:27 AM Bug #55662: EC: Clay assert fail ../src/osd/ECBackend.cc: 685: FAILED ceph_assert(pop.data.length...
i used /qa/standalone/erasure-code/test-erasure-eio.sh, the test that failed is TEST_ec_object_attr_read_error when i... Nitzan Mordechai

05/18/2022

09:09 PM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
/a/yuriw-2022-05-13_14:13:55-rados-wip-yuri3-testing-2022-05-12-1609-octopus-distro-default-smithi/6832699 Laura Flores
08:56 PM Bug #52316: qa/tasks/mon_thrash.py: _do_thrash AssertionError len(s['quorum']) == len(mons)
/a/yuriw-2022-05-13_14:13:55-rados-wip-yuri3-testing-2022-05-12-1609-octopus-distro-default-smithi/6832711... Laura Flores
07:37 PM Bug #53485 (Fix Under Review): monstore: logm entries are not garbage collected
Neha Ojha
01:37 PM Bug #53485: monstore: logm entries are not garbage collected
PR https://github.com/ceph/ceph/pull/44511 Daniel Poelzleithner
06:26 PM Bug #55662: EC: Clay assert fail ../src/osd/ECBackend.cc: 685: FAILED ceph_assert(pop.data.length...
Can you please add the test that helped you discover this issue? I believe the same test was passing with other EC pl... Neha Ojha
06:17 PM Bug #55407: quincy osd's fail to boot and crash
This looks like something new and unrelated to other crashes in this ticket, so created a new one: https://tracker.ce... Radoslaw Zarzynski
06:17 PM Bug #51858: octopus: rados/test_crash.sh failure
/a/nojha-2022-05-17_22:38:06-rados-wip-lrc-fix-pacific-distro-basic-smithi/6839177 Laura Flores
06:17 PM Bug #55698 (New): osd: segfault at boot up
In the https://tracker.ceph.com/issues/55407#note-14 an OSD crash during early boot up is reported:... Radoslaw Zarzynski
06:09 PM Bug #55559: osd-backfill-stats.sh fails in TEST_backfill_ec_prim_out
The common theme between these failures (this one and #47026) is @check()@ function of @qa/standalone/osd-backfill/os... Radoslaw Zarzynski
03:02 PM Bug #55695: Shutting down a monitor forces Paxos to restart and sometimes disregard subsequent co...
https://docs.google.com/document/d/1ucVz54vMlm26oiqQoqJ2upUPmiROd4AmwSwbVM_s2A0/edit# Kamoltat (Junior) Sirivadhna
03:01 PM Bug #55695 (Fix Under Review): Shutting down a monitor forces Paxos to restart and sometimes disr...
*Problem:*
mon.a
mon.b
mon.c
mon.d
mon.e
ceph -a stop mon.d
ceph mon remove d
.
.
mon.d is down...
Kamoltat (Junior) Sirivadhna
 
