Activity

From 06/25/2022 to 07/24/2022

07/24/2022

08:09 AM Bug #56661: Quincy: OSD crashing one after another with data loss with ceph_assert_fail
Myoungwon Oh, can you please take a look? Nitzan Mordechai
07:36 AM Bug #56661: Quincy: OSD crashing one after another with data loss with ceph_assert_fail
Sadly I don't have any logs anymore, as I had to destroy the Ceph cluster - getting it back in working order was top prio... Chris Kul
05:28 AM Bug #56661: Quincy: OSD crashing one after another with data loss with ceph_assert_fail
@Chris Kul, I'm trying to understand the sequence of failing OSDs; can you please upload the logs of the OSDs that failed?
...
Nitzan Mordechai

07/21/2022

08:29 PM Bug #55836: add an asok command for pg log investigations
https://github.com/ceph/ceph/pull/46561 merged Yuri Weinstein
07:19 PM Bug #56530 (Fix Under Review): Quincy: High CPU and slow progress during backfill
Sridhar Seshasayee
06:58 PM Bug #56530: Quincy: High CPU and slow progress during backfill
The issue is currently addressed in Ceph's main branch. Please see the linked PR. This will be back-ported to Quincy ... Sridhar Seshasayee
02:59 PM Bug #56574: rados/valgrind-leaks: cluster [WRN] Health check failed: 2 osds down (OSD_DOWN)" in c...
Just a note: I was able to recreate it with vstart, without error injection but with valgrind
as soon as we step in...
Nitzan Mordechai
02:00 PM Bug #56574: rados/valgrind-leaks: cluster [WRN] Health check failed: 2 osds down (OSD_DOWN)" in c...
Ah, thanks Sridhar. I will compare the two Trackers and mark this one as a duplicate if needed. Laura Flores
02:57 AM Bug #56574: rados/valgrind-leaks: cluster [WRN] Health check failed: 2 osds down (OSD_DOWN)" in c...
This looks similar to https://tracker.ceph.com/issues/52948. See comment https://tracker.ceph.com/issues/52948#note-5... Sridhar Seshasayee
02:57 PM Backport #56664 (In Progress): quincy: mgr/DaemonServer:: adjust_pgs gap > max_pg_num_change shou...
https://github.com/ceph/ceph/pull/47210 Kamoltat (Junior) Sirivadhna
02:45 PM Backport #56664 (Resolved): quincy: mgr/DaemonServer:: adjust_pgs gap > max_pg_num_change should ...
Backport Bot
02:49 PM Backport #56663: pacific: mgr/DaemonServer:: adjust_pgs gap > max_pg_num_change should be gap >= ...
https://github.com/ceph/ceph/pull/47211 Kamoltat (Junior) Sirivadhna
02:45 PM Backport #56663 (Resolved): pacific: mgr/DaemonServer:: adjust_pgs gap > max_pg_num_change should...
Backport Bot
02:40 PM Bug #56151 (Pending Backport): mgr/DaemonServer:: adjust_pgs gap > max_pg_num_change should be ga...
Kamoltat (Junior) Sirivadhna
01:34 PM Bug #56661: Quincy: OSD crashing one after another with data loss with ceph_assert_fail
BTW, the initial version was 17.2.0; we tried to update to 17.2.1 in the hope that this bug was fixed, sadly without luck. Chris Kul
01:33 PM Bug #56661 (Need More Info): Quincy: OSD crashing one after another with data loss with ceph_asse...
Two weeks after an upgrade from an Octopus setup to Quincy, the SSD pool reported one OSD down in the middle of ... Chris Kul
09:05 AM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
Looks like a race condition. Does a @Context@ make a dependency on a @RefCountedObj@ (e.g. @TrackedOp@) but forget... Radoslaw Zarzynski
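The lifetime hazard suspected in the comment above can be sketched outside Ceph. The following is a minimal, hypothetical Python model (not Ceph's actual C++ @Context@/@RefCountedObj@ classes; all names here are illustrative), showing why a completion context that captures a refcounted object must take its own reference:

```python
class RefCountedObj:
    """Toy stand-in for a refcounted object (e.g. a TrackedOp-like type)."""
    def __init__(self):
        self.nref = 1
        self.freed = False

    def get(self):
        self.nref += 1
        return self

    def put(self):
        self.nref -= 1
        if self.nref == 0:
            self.freed = True  # stand-in for `delete this`


class SafeContext:
    """Correct pattern: hold a reference for the context's lifetime.
    The suspected bug is the variant that stores the object WITHOUT
    calling get(), so the object can be freed before finish() runs."""
    def __init__(self, obj: RefCountedObj):
        self.obj = obj.get()       # take a ref when capturing

    def finish(self):
        assert not self.obj.freed  # object guaranteed alive here
        self.obj.put()             # drop the ref when done


obj = RefCountedObj()
ctx = SafeContext(obj)
obj.put()         # original owner drops its ref (e.g. the op completes)
ctx.finish()      # the context still kept the object alive
assert obj.freed  # freed only after the last reference is dropped
```

Without the @get()@ in the constructor, the owner's @put()@ would free the object first and @finish()@ would read freed memory, which is exactly the shape of an "invalid read" valgrind report.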

07/20/2022

11:33 PM Bug #44089 (New): mon: --format=json does not work for config get or show
This would be a good issue for Open Source Day if someone would be willing to take over the closed PR: https://github... Laura Flores
09:40 PM Bug #56530: Quincy: High CPU and slow progress during backfill
ceph-users discussion - https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/Z7AILAXZDBIT6IIF2E6M3BLUE6B7L... Vikhyat Umrao
07:45 PM Bug #56574: rados/valgrind-leaks: cluster [WRN] Health check failed: 2 osds down (OSD_DOWN)" in c...
Found another occurrence here: /a/yuriw-2022-07-18_18:20:02-rados-wip-yuri8-testing-2022-07-18-0918-distro-default-sm... Laura Flores
06:11 PM Bug #56574 (Need More Info): rados/valgrind-leaks: cluster [WRN] Health check failed: 2 osds down...
Watching for more reoccurrences. Radoslaw Zarzynski
10:25 AM Bug #56574: rados/valgrind-leaks: cluster [WRN] Health check failed: 2 osds down (OSD_DOWN)" in c...
osd.0 is still down..
The valgrind output for osd.0 shows:...
Nitzan Mordechai
06:25 PM Bug #51168: ceph-osd state machine crash during peering process
Yao Ning wrote:
> Radoslaw Zarzynski wrote:
> > The PG was in @ReplicaActive@ so we shouldn't see any backfill acti...
Neha Ojha
06:06 PM Backport #56656 (New): pacific: rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDur...
Backport Bot
06:06 PM Backport #56655 (Resolved): quincy: rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlus...
https://github.com/ceph/ceph/pull/47929 Backport Bot
06:03 PM Bug #53294 (Pending Backport): rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDuri...
Neha Ojha
03:20 PM Bug #53294: rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDuringFlush
/a/yuriw-2022-07-19_23:25:12-rados-wip-yuri2-testing-2022-07-15-0755-pacific-distro-default-smithi/6939431... Laura Flores
06:02 PM Bug #49754: osd/OSD.cc: ceph_abort_msg("abort() called") during OSD::shutdown()
Notes from the scrub:
1. It looks like this happens mostly (only?) on Pacific.
2. In at least two of the replications Valg...
Radoslaw Zarzynski
05:56 PM Bug #49754: osd/OSD.cc: ceph_abort_msg("abort() called") during OSD::shutdown()
... Radoslaw Zarzynski
03:58 PM Bug #49754: osd/OSD.cc: ceph_abort_msg("abort() called") during OSD::shutdown()
/a/yuriw-2022-07-19_23:25:12-rados-wip-yuri2-testing-2022-07-15-0755-pacific-distro-default-smithi/6939660 Laura Flores
04:42 PM Backport #56408: quincy: ceph version 16.2.7 PG scrubs not progressing
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46844
merged
Yuri Weinstein
04:40 PM Backport #56060: quincy: Assertion failure (ceph_assert(have_pending)) when creating new OSDs dur...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46689
merged
Yuri Weinstein
04:40 PM Bug #49525: found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10121515-14:e4 snaps missi...
https://github.com/ceph/ceph/pull/46498 merged Yuri Weinstein
04:08 PM Bug #55809: "Leak_IndirectlyLost" valgrind report on mon.c
/a/yuriw-2022-07-19_23:25:12-rados-wip-yuri2-testing-2022-07-15-0755-pacific-distro-default-smithi/6939513 Laura Flores
04:07 PM Bug #53767 (Duplicate): qa/workunits/cls/test_cls_2pc_queue.sh: killing an osd during thrashing c...
Same failure on test_cls_2pc_queue.sh, but this one came with remote logs. I suspect this is a duplicate of #55809.
...
Laura Flores
03:43 PM Bug #43584: MON_DOWN during mon_join process
/a/yuriw-2022-07-19_23:25:12-rados-wip-yuri2-testing-2022-07-15-0755-pacific-distro-default-smithi/6939512 Laura Flores
02:50 PM Bug #56650: ceph df reports invalid MAX AVAIL value for stretch mode crush rule
Before applying PR#47189, MAX AVAIL for stretch_rule pools is incorrect :... Prashant D
02:07 PM Bug #56650 (Fix Under Review): ceph df reports invalid MAX AVAIL value for stretch mode crush rule
Prashant D
01:26 PM Bug #56650 (Fix Under Review): ceph df reports invalid MAX AVAIL value for stretch mode crush rule
If we define a crush rule for a stretch mode cluster with multiple 'take' steps, then MAX AVAIL for pools associated with crush ru... Prashant D
01:15 PM Backport #56649 (Resolved): pacific: [Progress] Do not show NEW PG_NUM value for pool if autoscal...
https://github.com/ceph/ceph/pull/53464 Backport Bot
01:15 PM Backport #56648 (Resolved): quincy: [Progress] Do not show NEW PG_NUM value for pool if autoscale...
https://github.com/ceph/ceph/pull/47925 Backport Bot
01:14 PM Bug #56136 (Pending Backport): [Progress] Do not show NEW PG_NUM value for pool if autoscaler is ...
Prashant D

07/19/2022

09:20 PM Backport #56642 (Resolved): pacific: Log at 1 when Throttle::get_or_fail() fails
Backport Bot
09:20 PM Backport #56641 (Resolved): quincy: Log at 1 when Throttle::get_or_fail() fails
Backport Bot
09:18 PM Bug #56495 (Pending Backport): Log at 1 when Throttle::get_or_fail() fails
Brad Hubbard
02:07 PM Bug #56495: Log at 1 when Throttle::get_or_fail() fails
https://github.com/ceph/ceph/pull/47019 merged Yuri Weinstein
04:24 PM Bug #50222 (In Progress): osd: 5.2s0 deep-scrub : stat mismatch
Thanks Rishabh, I am having a look into this. Laura Flores
04:11 PM Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
This error showed up in QA runs -
http://pulpito.front.sepia.ceph.com/rishabh-2022-07-08_23:53:34-fs-wip-rishabh-tes...
Rishabh Dave
10:25 AM Bug #55001 (Fix Under Review): rados/test.sh: Early exit right after LibRados global tests complete
Nitzan Mordechai
08:28 AM Bug #55001: rados/test.sh: Early exit right after LibRados global tests complete
the core dump showing:... Nitzan Mordechai
08:28 AM Bug #49689 (Fix Under Review): osd/PeeringState.cc: ceph_abort_msg("past_interval start interval ...
PR is marked as draft for now. Matan Breizman
08:26 AM Backport #56580 (Resolved): octopus: snapshots will not be deleted after upgrade from nautilus to...
Matan Breizman
12:48 AM Bug #50853 (Can't reproduce): libcephsqlite: Core dump while running test_libcephsqlite.sh.
Patrick Donnelly

07/18/2022

08:43 PM Backport #56580: octopus: snapshots will not be deleted after upgrade from nautilus to pacific
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/47108
merged
Yuri Weinstein
01:52 PM Bug #49777: test_pool_min_size: 'check for active or peered' reached maximum tries (5) after wait...
I was able to reproduce the problem after modifying qa/tasks/ceph_manager.py: https://github.com/ceph/ceph/pull/46931... Kamoltat (Junior) Sirivadhna
12:44 PM Bug #49777 (Fix Under Review): test_pool_min_size: 'check for active or peered' reached maximum t...
Kamoltat (Junior) Sirivadhna
01:50 PM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
/a/yuriw-2022-07-13_19:41:18-rados-wip-yuri7-testing-2022-07-11-1631-distro-default-smithi/6929396/remote/smithi204/... Aishwarya Mathuria
01:47 PM Bug #55001: rados/test.sh: Early exit right after LibRados global tests complete
We have coredump and the console_log showing:
smithi042.log:[ 852.382596] ceph_test_rados[110223]: segfault at 0 ip...
Nitzan Mordechai
01:42 PM Backport #56604 (Resolved): pacific: ceph report missing osdmap_clean_epochs if answered by peon
https://github.com/ceph/ceph/pull/51258 Backport Bot
01:42 PM Backport #56603 (Rejected): octopus: ceph report missing osdmap_clean_epochs if answered by peon
Backport Bot
01:42 PM Backport #56602 (Resolved): quincy: ceph report missing osdmap_clean_epochs if answered by peon
https://github.com/ceph/ceph/pull/47928 Backport Bot
01:37 PM Bug #47273 (Pending Backport): ceph report missing osdmap_clean_epochs if answered by peon
Dan van der Ster
01:34 PM Bug #54511: test_pool_min_size: AssertionError: not clean before minsize thrashing starts
I was able to reproduce the problem after modifying qa/tasks/ceph_manager.py: https://github.com/ceph/ceph/pull/46931... Kamoltat (Junior) Sirivadhna
12:44 PM Bug #54511 (Fix Under Review): test_pool_min_size: AssertionError: not clean before minsize thras...
Kamoltat (Junior) Sirivadhna
01:16 PM Bug #51904: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to...
I was able to reproduce the problem after modifying qa/tasks/ceph_manager.py: https://github.com/ceph/ceph/pull/46931... Kamoltat (Junior) Sirivadhna
12:44 PM Bug #51904 (Fix Under Review): test_pool_min_size:AssertionError:wait_for_clean:failed before tim...
Kamoltat (Junior) Sirivadhna
10:18 AM Bug #56575 (Fix Under Review): test_cls_lock.sh: ClsLock.TestExclusiveEphemeralStealEphemeral fai...
Nitzan Mordechai

07/17/2022

01:16 PM Bug #55001: rados/test.sh: Early exit right after LibRados global tests complete
/a/yuriw-2022-07-15_19:06:53-rados-wip-yuri-testing-2022-07-15-0950-octopus-distro-default-smithi/6932690 Matan Breizman
01:04 PM Bug #52621: cephx: verify_authorizer could not decrypt ticket info: error: bad magic in decode_de...
/a/yuriw-2022-07-15_19:06:53-rados-wip-yuri-testing-2022-07-15-0950-octopus-distro-default-smithi/6932687 Matan Breizman
09:03 AM Backport #56579 (In Progress): pacific: snapshots will not be deleted after upgrade from nautilus...
Matan Breizman
09:02 AM Backport #56578 (In Progress): quincy: snapshots will not be deleted after upgrade from nautilus ...
Matan Breizman
06:51 AM Bug #56575: test_cls_lock.sh: ClsLock.TestExclusiveEphemeralStealEphemeral fails from "method loc...
The lock expired, so the next ioctx.stat won't return -2 (-ENOENT); we need to change that as well based on r1 that re... Nitzan Mordechai

07/16/2022

03:18 PM Bug #56147: snapshots will not be deleted after upgrade from nautilus to pacific
This issue is fixed (including a unit test) and will be backported in order to prevent future clusters upgrades from ... Matan Breizman

07/15/2022

09:17 PM Cleanup #56581 (Fix Under Review): mon: fix ElectionLogic warnings
Laura Flores
09:06 PM Cleanup #56581 (Resolved): mon: fix ElectionLogic warnings
h3. Problem: compilation warnings in the ElectionLogic code... Laura Flores
08:58 PM Backport #56580 (In Progress): octopus: snapshots will not be deleted after upgrade from nautilus...
Neha Ojha
08:55 PM Backport #56580 (Resolved): octopus: snapshots will not be deleted after upgrade from nautilus to...
https://github.com/ceph/ceph/pull/47108 Backport Bot
08:55 PM Backport #56579 (Resolved): pacific: snapshots will not be deleted after upgrade from nautilus to...
https://github.com/ceph/ceph/pull/47134 Backport Bot
08:55 PM Backport #56578 (Resolved): quincy: snapshots will not be deleted after upgrade from nautilus to ...
https://github.com/ceph/ceph/pull/47133 Backport Bot
08:51 PM Bug #56147 (Pending Backport): snapshots will not be deleted after upgrade from nautilus to pacific
Neha Ojha
07:31 PM Bug #56574: rados/valgrind-leaks: cluster [WRN] Health check failed: 2 osds down (OSD_DOWN)" in c...
/a/nojha-2022-07-15_14:45:04-rados-snapshot_key_conversion-distro-default-smithi/6932156 Laura Flores
07:23 PM Bug #56574 (Need More Info): rados/valgrind-leaks: cluster [WRN] Health check failed: 2 osds down...
Description: rados/valgrind-leaks/{1-start 2-inject-leak/osd centos_latest}
/a/nojha-2022-07-14_20:32:09-rados-sn...
Laura Flores
07:29 PM Bug #56575 (Pending Backport): test_cls_lock.sh: ClsLock.TestExclusiveEphemeralStealEphemeral fai...
/a/nojha-2022-07-14_20:32:09-rados-snapshot_key_conversion-distro-default-smithi/6930848... Laura Flores
12:09 PM Bug #56565 (Won't Fix): Not upgraded nautilus mons crash if upgraded pacific mon updates fsmap
I was just told there is a step in the upgrade documentation to set mon_mds_skip_sanity param before upgrade [1], whi... Mykola Golub
10:07 AM Bug #51168: ceph-osd state machine crash during peering process
Radoslaw Zarzynski wrote:
> The PG was in @ReplicaActive@ so we shouldn't see any backfill activity. A delayed event...
Yao Ning

07/14/2022

12:19 PM Bug #56565 (Won't Fix): Not upgraded nautilus mons crash if upgraded pacific mon updates fsmap
I have no idea if this needs to be fixed but at least the case looks worth reporting.
We faced the issue when upgr...
Mykola Golub

07/13/2022

07:48 PM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
Noticed that this PR was newly included in 17.2.1, and it makes a change to GetApproximateSizes: https://github.com/c... Laura Flores
07:10 PM Backport #56551: quincy: mon/Elector: notify_ranked_removed() does not properly erase dead_ping i...
https://github.com/ceph/ceph/pull/47086 Kamoltat (Junior) Sirivadhna
06:55 PM Backport #56551 (Resolved): quincy: mon/Elector: notify_ranked_removed() does not properly erase ...
Backport Bot
07:09 PM Backport #56550 (In Progress): pacific: mon/Elector: notify_ranked_removed() does not properly er...
https://github.com/ceph/ceph/pull/47087 Kamoltat (Junior) Sirivadhna
06:55 PM Backport #56550 (Resolved): pacific: mon/Elector: notify_ranked_removed() does not properly erase...
Backport Bot
06:51 PM Bug #56034: qa/standalone/osd/divergent-priors.sh fails in test TEST_divergent_3()
This looks like a test failure, so not terribly high priority. Radoslaw Zarzynski
06:49 PM Bug #53342: Exiting scrub checking -- not all pgs scrubbed
Sridhar Seshasayee wrote:
> /a/yuriw-2022-06-29_18:22:37-rados-wip-yuri2-testing-2022-06-29-0820-distro-default-smit...
Neha Ojha
06:42 PM Bug #56438 (Need More Info): found snap mapper error on pg 3.bs0> oid 3:d81a0fb3:::smithi10749189...
Waiting for reoccurrences. Radoslaw Zarzynski
06:38 PM Bug #56439: mon/crush_ops.sh: Error ENOENT: no backward-compatible weight-set
Let's observe whether there will be reoccurrences. Radoslaw Zarzynski
06:33 PM Bug #55450 (Resolved): [DOC] stretch_rule defined in the doc needs updation
Kamoltat (Junior) Sirivadhna
06:33 PM Bug #55450: [DOC] stretch_rule defined in the doc needs updation
Open-source contributor, GitHub username: elacunza. Created https://github.com/ceph/ceph/pull/46170 and has resolved ... Kamoltat (Junior) Sirivadhna
06:33 PM Bug #56147 (Fix Under Review): snapshots will not be deleted after upgrade from nautilus to pacific
Radoslaw Zarzynski
06:31 PM Bug #56463 (Triaged): osd nodes with NVME try to run `smartctl` and `nvme` even when the tools ar...
They are called from @block_device_get_metrics()@ in @common/blkdev.cc@. Radoslaw Zarzynski
06:26 PM Bug #54485 (Resolved): doc/rados/operations/placement-groups/#automated-scaling: --bulk invalid c...
Kamoltat (Junior) Sirivadhna
06:23 PM Backport #54505 (Resolved): pacific: doc/rados/operations/placement-groups/#automated-scaling: --...
Kamoltat (Junior) Sirivadhna
06:22 PM Backport #54506 (Resolved): quincy: doc/rados/operations/placement-groups/#automated-scaling: --b...
Kamoltat (Junior) Sirivadhna
06:22 PM Bug #54576 (Resolved): cache tier set proxy faild
Fix merged. Radoslaw Zarzynski
06:19 PM Bug #55665 (Fix Under Review): osd: osd_fast_fail_on_connection_refused will cause the mon to con...
Radoslaw Zarzynski
06:11 PM Bug #51168: ceph-osd state machine crash during peering process
Nautilus is EOL now and it is also possible that we may have fixed such a bug after 14.2.18.
Can you tell me the P...
Neha Ojha
06:08 PM Bug #51168: ceph-osd state machine crash during peering process
The PG was in @ReplicaActive@ so we shouldn't see any backfill activity. A delayed event maybe? Radoslaw Zarzynski
06:04 PM Bug #51168: ceph-osd state machine crash during peering process
... Radoslaw Zarzynski
06:02 PM Bug #51904: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to...
/a/ksirivad-2022-07-01_21:00:49-rados:thrash-erasure-code-main-distro-default-smithi/6910169/ first timed out and the... Neha Ojha
06:01 PM Bug #56192: crash: virtual Monitor::~Monitor(): assert(session_map.sessions.empty())
Reoccurence reported in https://tracker.ceph.com/issues/51904#note-21. See also the replies:
* https://tracker.cep...
Radoslaw Zarzynski
05:57 PM Bug #49777 (In Progress): test_pool_min_size: 'check for active or peered' reached maximum tries ...
Kamoltat (Junior) Sirivadhna
05:46 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
Hello Aishwarya! How about coworking on this? Ping me when you have time. Radoslaw Zarzynski
05:42 PM Bug #50242: test_repair_corrupted_obj fails with assert not inconsistent
Hello Ronen. It looks to be somehow scrub-related. Mind taking a look? Nothing urgent. Radoslaw Zarzynski
05:38 PM Bug #56392 (Resolved): ceph build warning: comparison of integer expressions of different signedness
Kamoltat (Junior) Sirivadhna
12:07 PM Feature #56543 (New): About the performance improvement of ceph's erasure code storage pool
Hello everyone:
Although I know that the erasure code storage pool is not suitable for use in scenarios with many ran...
Sheng Xie
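The cost the feature request above is asking about can be made concrete with a rough model: on a k+m erasure-coded pool, a small (sub-stripe) overwrite is a read-modify-write across the stripe, whereas a replicated pool just writes the copies. A simplified sketch (my own illustration; it ignores partial-stripe optimizations, parity delta updates, and caching):

```python
def ec_small_write_ops(k: int, m: int) -> dict:
    """Approximate per-overwrite device ops on a k+m EC pool under the
    naive full-stripe read-modify-write model."""
    return {
        "reads": k,       # read the data chunks to reconstruct the stripe
        "writes": k + m,  # rewrite the data chunks plus the parity chunks
    }


def replica_small_write_ops(size: int) -> dict:
    """The same small overwrite on a replicated pool of the given size."""
    return {"reads": 0, "writes": size}
```

For example, a 4+2 EC pool costs roughly 4 reads and 6 writes per small random overwrite in this model, versus 0 reads and 3 writes on a size-3 replicated pool, which is why EC pools favor large sequential or append-style workloads.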

07/12/2022

10:29 PM Bug #56495 (Fix Under Review): Log at 1 when Throttle::get_or_fail() fails
Neha Ojha
01:57 PM Bug #56386: Writes to a cephfs after metadata pool snapshot causes inconsistent objects
Greg Farnum wrote:
> That said, I wouldn’t expect anything useful from running this — pool snaps are hard to use wel...
Dan van der Ster
01:06 PM Bug #56386: Writes to a cephfs after metadata pool snapshot causes inconsistent objects
That said, I wouldn’t expect anything useful from running this — pool snaps are hard to use well. What were you tryin... Greg Farnum
12:59 PM Bug #56386: Writes to a cephfs after metadata pool snapshot causes inconsistent objects
AFAICT this is just a RADOS issue? Greg Farnum
01:30 PM Backport #53339: pacific: src/osd/scrub_machine.cc: FAILED ceph_assert(state_cast<const NotActive...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46767
merged
Yuri Weinstein
12:41 PM Bug #56530: Quincy: High CPU and slow progress during backfill
Thanks for looking at this. Answers to your questions:
1. Backfill started at around 4-5 objects per second, and t...
Chris Palmer
11:56 AM Bug #56530: Quincy: High CPU and slow progress during backfill
While we look into this, I have a couple of questions:
1. Did the recovery rate stay at 1 object/sec throughout? I...
Sridhar Seshasayee
11:16 AM Bug #56530 (Resolved): Quincy: High CPU and slow progress during backfill
I'm seeing a similar problem on a small cluster just upgraded from Pacific 16.2.9 to Quincy 17.2.1 (non-cephadm). The... Chris Palmer

07/11/2022

09:18 PM Bug #54396 (Resolved): Setting osd_pg_max_concurrent_snap_trims to 0 prematurely clears the snapt...
Neha Ojha
09:17 PM Feature #55982 (Resolved): log the numbers of dups in PG Log
Neha Ojha
09:17 PM Backport #55985 (Resolved): octopus: log the numbers of dups in PG Log
Neha Ojha
01:35 PM Bug #54172: ceph version 16.2.7 PG scrubs not progressing
https://github.com/ceph/ceph/pull/46845 merged Yuri Weinstein
01:31 PM Backport #51287: pacific: LibRadosService.StatusFormat failed, Expected: (0) != (retry), actual: ...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46677
merged
Yuri Weinstein

07/08/2022

05:27 AM Backport #56498 (In Progress): quincy: Make the mClock config options related to [res, wgt, lim] ...
Sridhar Seshasayee
04:50 AM Backport #56498 (Resolved): quincy: Make the mClock config options related to [res, wgt, lim] mod...
https://github.com/ceph/ceph/pull/47020 Backport Bot
04:46 AM Bug #55153 (Pending Backport): Make the mClock config options related to [res, wgt, lim] modifiab...
Sridhar Seshasayee
01:48 AM Bug #56495 (Resolved): Log at 1 when Throttle::get_or_fail() fails
When trying to debug a throttle failure we currently need to set debug_ms=20 which can delay troubleshooting due to t... Brad Hubbard
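The change described above can be sketched as follows. This is a hypothetical Python model of a throttle (not Ceph's real C++ @Throttle@ class; names are illustrative), where @get_or_fail()@ logs its own refusal at a low level so that @debug_ms=20@ is no longer needed just to see throttling:

```python
import logging

log = logging.getLogger("throttle")


class Throttle:
    """Toy token throttle; logs whenever get_or_fail() refuses a request."""

    def __init__(self, name: str, maximum: int):
        self.name, self.max, self.count = name, maximum, 0

    def get_or_fail(self, c: int = 1) -> bool:
        if self.count + c > self.max:
            # Log the failure itself (here at INFO as a stand-in for
            # "level 1") so it is visible without verbose messenger debug.
            log.info("%s: get_or_fail refused %d (count=%d max=%d)",
                     self.name, c, self.count, self.max)
            return False
        self.count += c
        return True

    def put(self, c: int = 1) -> None:
        self.count -= c
```

The point of the tracker is only the logging placement: the refusal path emits its own message, so troubleshooting does not depend on raising unrelated debug subsystems.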
01:00 AM Bug #54509: FAILED ceph_assert due to issue manifest API to the original object
I'll take a look Myoungwon Oh

07/07/2022

08:51 PM Bug #53294: rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDuringFlush
Potential Pacific occurrence? Although this one is catching on LibRadosTwoPoolsPP.CachePin rather than LibRadosTwoPoo... Laura Flores
03:44 PM Bug #55153: Make the mClock config options related to [res, wgt, lim] modifiable during runtime f...
https://github.com/ceph/ceph/pull/46700 merged Yuri Weinstein
01:21 AM Bug #56487: Error EPERM: problem getting command descriptions from mon, when execute "ceph -s".
similar issue as:
https://tracker.ceph.com/issues/36300
liqun zhang
01:20 AM Bug #56487: Error EPERM: problem getting command descriptions from mon, when execute "ceph -s".
in the case, cephx is disabled. test script as below:
#/usr/bin/bash
while true
do
echo `date` >> /tmp/o.log
r...
liqun zhang
01:18 AM Bug #56487 (New): Error EPERM: problem getting command descriptions from mon, when execute "ceph ...
version 15.2.13
disable cephx, and execute "ceph -s" every 1 second;
there is a great chance to reproduce this error. Log as ...
liqun zhang

07/06/2022

11:19 PM Bug #36300: Clients receive "wrong fsid" error when CephX is disabled
Can you make a new ticket with your details and link to this one? We may have recreated a similar issue but the detai... Greg Farnum
03:14 PM Bug #54509: FAILED ceph_assert due to issue manifest API to the original object
@Myoungwon Oh - can you take a look at
http://pulpito.front.sepia.ceph.com/rfriedma-2022-07-05_18:14:55-rados-wip-...
Ronen Friedman
11:12 AM Bug #51904: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to...
Ronen Friedman wrote:
> Kamoltat Sirivadhna wrote:
> > /a/ksirivad-2022-07-01_21:00:49-rados:thrash-erasure-code-ma...
Ronen Friedman
11:10 AM Bug #51904: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to...
Kamoltat Sirivadhna wrote:
> /a/ksirivad-2022-07-01_21:00:49-rados:thrash-erasure-code-main-distro-default-smithi/69...
Ronen Friedman
10:47 AM Bug #51168: ceph-osd state machine crash during peering process
ceph-osd log on crashed osd uploaded Yao Ning

07/05/2022

02:32 PM Bug #51904: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to...
/a/ksirivad-2022-07-01_21:00:49-rados:thrash-erasure-code-main-distro-default-smithi/6910169/ Kamoltat (Junior) Sirivadhna
02:06 PM Bug #54511: test_pool_min_size: AssertionError: not clean before minsize thrashing starts
/a/ksirivad-2022-07-01_21:00:49-rados:thrash-erasure-code-main-distro-default-smithi/6910103/ Kamoltat (Junior) Sirivadhna
09:17 AM Bug #56386: Writes to a cephfs after metadata pool snapshot causes inconsistent objects
Dan van der Ster wrote:
> Venky Shankar wrote:
> > Hi Dan,
> >
> > I need to check, but does the inconsistent ob...
Venky Shankar
07:24 AM Bug #55559: osd-backfill-stats.sh fails in TEST_backfill_ec_prim_out
Looks like we don't have the correct primary (was osd.1, changed to osd.4, and after the wait_for_clean was back to o... Nitzan Mordechai
01:51 AM Bug #36300: Clients receive "wrong fsid" error when CephX is disabled
#/usr/bin/bash
while true
do
echo `date` >> /tmp/o.log
ret=`ceph -s >> /tmp/o.log 2>&1 `
sleep 1
echo '' >> /t...
liqun zhang
01:28 AM Bug #36300: Clients receive "wrong fsid" error when CephX is disabled
version 15.2.13
disable cephx, and execute "ceph -s" every 1 second;
there is a great chance to reproduce this error. Log as ...
liqun zhang
01:24 AM Bug #36300: Clients receive "wrong fsid" error when CephX is disabled
Mon Jul 4 15:31:19 CST 2022
2022-07-04T15:31:20.219+0800 7f8595551700 10 monclient: get_monmap_and_config
2022-07-0...
liqun zhang

07/04/2022

08:54 PM Backport #55981 (Resolved): quincy: don't trim excessive PGLog::IndexedLog::dups entries on-line
Ilya Dryomov
08:18 PM Bug #56463 (Triaged): osd nodes with NVME try to run `smartctl` and `nvme` even when the tools ar...
Using debian packages:
ceph-osd 17.2.1-1~bpo11+1
ceph-volume 17.2.1-1~bpo11+1
Every day some job runs wh...
Matthew Darwin
07:53 PM Backport #55746 (Resolved): quincy: Support blocklisting a CIDR range
Ilya Dryomov
05:48 PM Feature #55693 (Fix Under Review): Limit the Health Detail MSG log size in cluster logs
Prashant D

07/03/2022

12:49 PM Bug #56147: snapshots will not be deleted after upgrade from nautilus to pacific
Radoslaw Zarzynski wrote:
> Hello Matan! Does this snapshot issue ring a bell?
Introduced here:
https://github.c...
Matan Breizman

07/01/2022

05:36 PM Backport #54386: octopus: [RFE] Limit slow request details to mgr log
Ponnuvel P wrote:
> please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/45154
...
Yuri Weinstein
04:17 PM Bug #56439 (New): mon/crush_ops.sh: Error ENOENT: no backward-compatible weight-set
/a/yuriw-2022-06-23_16:06:40-rados-wip-yuri7-testing-2022-06-23-0725-octopus-distro-default-smithi/6894952... Laura Flores
01:51 PM Bug #56392: ceph build warning: comparison of integer expressions of different signedness
Note: this warning was caused by merging https://github.com/ceph/ceph/pull/46029/ Kamoltat (Junior) Sirivadhna
01:40 PM Bug #55435 (Pending Backport): mon/Elector: notify_ranked_removed() does not properly erase dead_...
Kamoltat (Junior) Sirivadhna
01:17 PM Bug #55435 (Resolved): mon/Elector: notify_ranked_removed() does not properly erase dead_ping in ...
Kamoltat (Junior) Sirivadhna
01:16 PM Bug #55708 (Fix Under Review): Reducing 2 Monitors Causes Stray Daemon
Kamoltat (Junior) Sirivadhna
12:55 PM Bug #56438 (Need More Info): found snap mapper error on pg 3.bs0> oid 3:d81a0fb3:::smithi10749189...
/a/yuriw-2022-06-29_18:22:37-rados-wip-yuri2-testing-2022-06-29-0820-distro-default-smithi/6906226
The error looks...
Sridhar Seshasayee
12:29 PM Bug #53342: Exiting scrub checking -- not all pgs scrubbed
/a/yuriw-2022-06-29_18:22:37-rados-wip-yuri2-testing-2022-06-29-0820-distro-default-smithi/6906076
/a/yuriw-2022-06-...
Sridhar Seshasayee
09:21 AM Cleanup #52753 (Rejected): rbd cls : centos 8 warning
Ilya Dryomov
09:20 AM Cleanup #52753: rbd cls : centos 8 warning
Looks like this warning is no longer there with a newer g++:
https://jenkins.ceph.com/job/ceph-dev-new-build/ARCH=...
Ilya Dryomov
12:12 AM Backport #55983 (Resolved): quincy: log the numbers of dups in PG Log
Neha Ojha

06/30/2022

07:53 PM Bug #56034: qa/standalone/osd/divergent-priors.sh fails in test TEST_divergent_3()
/a/yuriw-2022-06-29_13:30:16-rados-wip-yuri3-testing-2022-06-28-1737-distro-default-smithi/6905537 Kamoltat (Junior) Sirivadhna
07:41 PM Bug #50242 (New): test_repair_corrupted_obj fails with assert not inconsistent
Kamoltat (Junior) Sirivadhna
07:41 PM Bug #50242: test_repair_corrupted_obj fails with assert not inconsistent
/a/yuriw-2022-06-29_13:30:16-rados-wip-yuri3-testing-2022-06-28-1737-distro-default-smithi/6905523/ Kamoltat (Junior) Sirivadhna
07:30 PM Bug #55001: rados/test.sh: Early exit right after LibRados global tests complete
/a/yuriw-2022-06-29_13:30:16-rados-wip-yuri3-testing-2022-06-28-1737-distro-default-smithi/6905499 Kamoltat (Junior) Sirivadhna
12:44 PM Bug #56147: snapshots will not be deleted after upgrade from nautilus to pacific
Here I have a PR, which should fix the conversion on update
https://github.com/ceph/ceph/pull/46908
But what is w...
Manuel Lausch

06/29/2022

06:29 PM Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
Not a terribly high priority. Radoslaw Zarzynski
06:24 PM Bug #48029: Exiting scrub checking -- not all pgs scrubbed.
The code that generated the exception is (from the @main@ branch):... Radoslaw Zarzynski
06:13 PM Bug #56392 (Fix Under Review): ceph build warning: comparison of integer expressions of different...
Neha Ojha
06:12 PM Bug #56393: failed to complete snap trimming before timeout
Could it be scrub related? Radoslaw Zarzynski
06:08 PM Bug #56147 (New): snapshots will not be deleted after upgrade from nautilus to pacific
Hello Matan! Does this snapshot issue ring a bell? Radoslaw Zarzynski
06:03 PM Bug #46889: librados: crashed in service_daemon_update_status
Lowering the priority to match the BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2101415#c9. Radoslaw Zarzynski
05:55 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
Yeah, this clearly looks like a race condition (likely around lifetime management).
Lowering to High as it happen...
Radoslaw Zarzynski
05:50 PM Bug #56101 (Need More Info): Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function saf...
Well, it seems the logs on @dell-per320-4.gsslab.pnq.redhat.com:/home/core/tracker56101@ are on the default levels. S... Radoslaw Zarzynski
03:37 PM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
A Telemetry contact was able to provide their OSD log. There was not a coredump available anymore, but they were able... Laura Flores
03:48 PM Bug #56420 (New): ceph-object-store: there is no chunking in --op log
The current implementation assumes that huge amounts of memory are always available.... Radoslaw Zarzynski
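The chunking the ticket above asks for can be sketched generically. A hypothetical Python outline (the real ceph-objectstore-tool is C++; the reader callback and names here are illustrative) of streaming a large log in bounded pieces instead of materializing it whole:

```python
def iter_chunks(read_at, total_len: int, chunk: int = 4 << 20):
    """Yield a large object in fixed-size pieces.

    read_at(offset, length) is a caller-supplied reader; peak memory use
    is bounded by `chunk` instead of by the object's total size.
    """
    off = 0
    while off < total_len:
        n = min(chunk, total_len - off)
        yield read_at(off, n)
        off += n
```

Each chunk can be decoded and printed before the next one is fetched, so a @--op log@ style dump would no longer require the whole log to fit in memory at once.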

06/28/2022

08:06 PM Bug #53294: rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDuringFlush
Thank you Myoungwon! Laura Flores
08:01 PM Bug #53294 (Fix Under Review): rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDuri...
Neha Ojha
05:13 AM Bug #53294: rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDuringFlush
https://github.com/ceph/ceph/pull/46866
I found that there is no reply if sending message with invalid pool inform...
Myoungwon Oh
02:22 AM Bug #53294: rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDuringFlush
I'll take a closer look. Myoungwon Oh
07:23 PM Bug #49777: test_pool_min_size: 'check for active or peered' reached maximum tries (5) after wait...
/a/lflores-2022-06-27_23:44:12-rados:thrash-erasure-code-wip-yuri2-testing-2022-04-26-1132-octopus-distro-default-smi... Laura Flores
04:31 PM Backport #54614 (Resolved): quincy: support truncation sequences in sparse reads
Jeff Layton
02:06 PM Backport #56408 (In Progress): quincy: ceph version 16.2.7 PG scrubs not progressing
Cory Snyder
02:00 PM Backport #56408 (Resolved): quincy: ceph version 16.2.7 PG scrubs not progressing
https://github.com/ceph/ceph/pull/46844 Backport Bot
02:05 PM Backport #56409 (In Progress): pacific: ceph version 16.2.7 PG scrubs not progressing
Cory Snyder
02:01 PM Backport #56409 (Resolved): pacific: ceph version 16.2.7 PG scrubs not progressing
https://github.com/ceph/ceph/pull/46845 Backport Bot
01:55 PM Bug #54172 (Pending Backport): ceph version 16.2.7 PG scrubs not progressing
Cory Snyder
01:10 PM Backport #50910 (Rejected): octopus: PGs always go into active+clean+scrubbing+deep+repair in the...
Will not be fixed on Octopus.
For future ref:
Fixed in main branch by 41258.
Ronen Friedman
12:36 PM Bug #56386: Writes to a cephfs after metadata pool snapshot causes inconsistent objects
Venky Shankar wrote:
> Hi Dan,
>
> I need to check, but does the inconsistent object warning show up only after r...
Dan van der Ster
10:01 AM Bug #56386: Writes to a cephfs after metadata pool snapshot causes inconsistent objects
Hi Dan,
I need to check, but does the inconsistent object warning show up only after reducing max_mds?
Venky Shankar
11:02 AM Bug #56147: snapshots will not be deleted after upgrade from nautilus to pacific
It seems to be a failure on conversion after upgrade
in the omap dump before the update with one deleted object in...
Manuel Lausch
06:25 AM Bug #46889: librados: crashed in service_daemon_update_status
Josh Durgin wrote:
> Are there any logs or coredump available? What version was this?
Sorry, I think I have misse...
Xiubo Li

06/27/2022

07:32 PM Bug #51904: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to...
Found an instance where this does not occur with minsize_recovery. It's possible that it's a different root cause, bu... Laura Flores
07:09 PM Bug #53294: rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDuringFlush
Myoungwon Oh wrote:
> I think this is the same issue as https://tracker.ceph.com/issues/53855.
I thought so too, ...
Laura Flores
06:53 PM Bug #56393 (New): failed to complete snap trimming before timeout
Description: rados/thrash-erasure-code-big/{ceph cluster/{12-osds openstack} mon_election/connectivity msgr-failures/... Laura Flores
06:19 PM Bug #56392 (Resolved): ceph build warning: comparison of integer expressions of different signedness
../src/mon/Elector.cc: In member function ‘void Elector::notify_rank_removed(int)’:
../src/mon/Elector.cc:733:20: wa...
Kamoltat (Junior) Sirivadhna
 
