Activity
From 02/29/2024 to 03/29/2024
Today
- 03:05 AM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- Closing https://github.com/ceph/ceph/pull/55596 in favour of https://github.com/ceph/ceph/pull/56574
03/28/2024
- 11:14 AM Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
- The OSDMap CRC issue is clearly there but I'm not sure / I doubt it can explain the scrub error.
Let's ask Ronen for...
- 11:03 AM Bug #65186: OSDs unreachable in upgrade test
- ...
- 11:02 AM Backport #65198: squid: Failed to encode map X with expected CRC
- https://github.com/ceph/ceph/pull/56553
- 10:31 AM Backport #65198 (In Progress): squid: Failed to encode map X with expected CRC
- 10:29 AM Backport #65198 (In Progress): squid: Failed to encode map X with expected CRC
- 07:03 AM Bug #64824: mon: ceph-16.2.14/src/mon/Monitor.cc: 5661: FAILED ceph_assert(err == 0)
- ...
03/27/2024
- 10:35 PM Bug #64972: qa: "ceph tell 4.3a deep-scrub" command not found
- Laura Flores wrote:
> Strange, the syntax in the text snippet works in a vstart cluster:
> [...]
The issue, I be...
- 10:09 PM Bug #64972: qa: "ceph tell 4.3a deep-scrub" command not found
- Strange, the syntax in the text snippet works in a vstart cluster:...
- 08:33 PM Bug #65186: OSDs unreachable in upgrade test
- /a/teuthology-2024-03-22_02:08:13-upgrade-squid-distro-default-smithi/7615991
- 08:29 PM Bug #65186: OSDs unreachable in upgrade test
- Possibly a dupe of the related tracker (crc encoding issues)
- 08:28 PM Bug #65186 (New): OSDs unreachable in upgrade test
- /a/teuthology-2024-03-22_02:08:13-upgrade-squid-distro-default-smithi/7616011/remote/smithi087/log/a8e8c570-e819-11ee...
- 08:31 PM Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
- Laura Flores wrote:
> /a/teuthology-2024-03-22_02:08:13-upgrade-squid-distro-default-smithi/7616025/remote/smithi098...
- 08:21 PM Bug #65185 (New): OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
- /a/teuthology-2024-03-22_02:08:13-upgrade-squid-distro-default-smithi/7616025/remote/smithi098/log/b1f19696-e81a-11ee...
- 04:43 PM Bug #65183 (Fix Under Review): Overriding an EC pool needs the "--yes-i-really-mean-it" flag in a...
- 04:23 PM Bug #65183: Overriding an EC pool needs the "--yes-i-really-mean-it" flag in addition to "force"
- Likely coming from this change:
https://github.com/ceph/ceph/pull/56287
- 04:23 PM Bug #65183 (Fix Under Review): Overriding an EC pool needs the "--yes-i-really-mean-it" flag in a...
- /a/yuriw-2024-03-26_14:32:05-rados-wip-yuri8-testing-2024-03-25-1419-distro-default-smithi/7623454...
- 12:07 PM Bug #51725 (Resolved): make bufferlist::c_str() skip rebuild when it isn't necessary
- 12:06 PM Backport #52595 (Rejected): pacific: make bufferlist::c_str() skip rebuild when it isn't necessary
- Pacific is EOL
- 12:06 PM Bug #51843 (Resolved): osd/scrub: OSD crashes at PG removal
- 12:06 PM Backport #53340 (Rejected): pacific: osd/scrub: OSD crashes at PG removal
- Pacific is EOL
- 12:05 PM Bug #53294 (Resolved): rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDuringFlush
- 12:05 PM Bug #49525 (Resolved): found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10121515-14:e4 ...
- 12:04 PM Backport #55973 (Rejected): pacific: found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi1...
- Pacific is EOL
- 12:04 PM Backport #56656 (Rejected): pacific: rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlu...
- Pacific is EOL
- 12:02 PM Backport #64672 (Rejected): pacific: test_pool_min_size: AssertionError: wait_for_clean: failed b...
- Pacific is EOL
- 12:01 PM Backport #64410 (In Progress): quincy: map eXX had wrong heartbeat addr
- 12:00 PM Backport #64412 (In Progress): reef: map eXX had wrong heartbeat addr
- 11:58 AM Backport #64411 (Rejected): pacific: map eXX had wrong heartbeat addr
- Pacific is EOL
- 11:56 AM Backport #64407 (Rejected): pacific: Expected warnings that need to be whitelisted cause rados/ce...
- Pacific is EOL
- 11:56 AM Backport #64157 (Rejected): pacific: CommandFailedError (rados/test_python.sh): "RADOS object not...
- Pacific is EOL
- 11:55 AM Backport #59675 (Rejected): pacific: osd:tick checking mon for new map
- Pacific is EOL
- 11:54 AM Backport #58870 (Rejected): pacific: ClsLock.TestExclusiveEphemeralStealEphemeral failed
- Pacific is EOL
- 11:16 AM Bug #57061: Use single cluster log level (mon_cluster_log_level) config to control verbosity of c...
- In QA.
- 11:15 AM Bug #64258: osd/PrimaryLogPG.cc: FAILED ceph_assert(inserted)
- Sent to QA.
- 09:19 AM Bug #54744: crash: void MonMap::add(const mon_info_t&): assert(addr_mons.count(a) == 0)
- The priority level is set to "minor" ... when the time comes that messenger v1 is deprecated ... operators will disab...
- 09:17 AM Bug #54744: crash: void MonMap::add(const mon_info_t&): assert(addr_mons.count(a) == 0)
- This should be fixed indeed. I wanted to disable msgv1 on this cluster. I already had set the flag "ceph config set m...
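For illustration, disabling msgr v1 cluster-wide looks roughly like this (a sketch; assuming the truncated flag above is ms_bind_msgr1, which the comment does not confirm):
ceph config set mon ms_bind_msgr1 false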
- 02:17 AM Feature #65163 (New): Rados:Provide options for data compression levels, specified with -l, to en...
03/26/2024
- 04:37 PM Bug #54439: LibRadosWatchNotify.WatchNotify2Multi fails
- /a/yuriw-2024-03-20_18:33:32-rados-wip-yuri6-testing-2024-03-18-1406-squid-distro-default-smithi/7613112...
- 04:26 PM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- /a/yuriw-2024-03-20_18:33:32-rados-wip-yuri6-testing-2024-03-18-1406-squid-distro-default-smithi/7613235
- 04:20 PM Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
- /a/yuriw-2024-03-20_18:33:32-rados-wip-yuri6-testing-2024-03-18-1406-squid-distro-default-smithi/7613108
- 02:45 PM Bug #64519: OSD/MON: No snapshot metadata keys trimming
- Adding 53545 as a candidate for fixing this issue; this will require additional documentation on how to use the tool ...
- 09:06 AM Bug #64519 (In Progress): OSD/MON: No snapshot metadata keys trimming
- https://tracker.ceph.com/issues/62983 should help with avoiding the gaps in the purged snap id intervals. As a resu...
- 12:57 PM Backport #65150 (In Progress): squid: cluster log: Cluster log level string representation missin...
- 11:59 AM Backport #65150 (In Progress): squid: cluster log: Cluster log level string representation missin...
- https://github.com/ceph/ceph/pull/56478
- 12:55 PM Backport #65151 (In Progress): squid: singleton/ec-inconsistent-hinfo.yaml: Include a possible be...
- 12:50 PM Backport #65151 (In Progress): squid: singleton/ec-inconsistent-hinfo.yaml: Include a possible be...
- https://github.com/ceph/ceph/pull/56477
- 11:57 AM Fix #64573 (Pending Backport): singleton/ec-inconsistent-hinfo.yaml: Include a possible benign cl...
- 04:32 AM Fix #64573 (Resolved): singleton/ec-inconsistent-hinfo.yaml: Include a possible benign cluster lo...
- 11:56 AM Bug #58436: ceph cluster log reporting log level in numeric format for the clog messages
- Radoslaw Zarzynski wrote:
> Do we need to backport?
Yes, this along with https://tracker.ceph.com/issues/64314 ne...
- 11:56 AM Bug #64314 (Pending Backport): cluster log: Cluster log level string representation missing in th...
- 11:30 AM Backport #65141 (In Progress): reef: osd: modify PG deletion cost for mClock scheduler
- 07:27 AM Backport #65141 (In Progress): reef: osd: modify PG deletion cost for mClock scheduler
- https://github.com/ceph/ceph/pull/56475
- 11:26 AM Backport #65140 (In Progress): squid: osd: modify PG deletion cost for mClock scheduler
- 07:27 AM Backport #65140 (In Progress): squid: osd: modify PG deletion cost for mClock scheduler
- https://github.com/ceph/ceph/pull/56474
- 09:01 AM Bug #62983 (Resolved): OSD/MON: purged snap keys are not merged
- An alternative solution is to not increment the snapid on removal, which avoids the gaps.
- 07:20 AM Bug #65139 (Pending Backport): osd: modify PG deletion cost for mClock scheduler
- With the osd_delete_sleep_ssd and osd_delete_sleep_hdd options disabled with mClock, it was noticed that PG deletion ...
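For reference, a quick way to confirm the scheduler in use and the delete-sleep values (a sketch, assuming an osd.0 exists):
ceph config show osd.0 | grep osd_op_queue
ceph config get osd osd_delete_sleep_ssd
ceph config get osd osd_delete_sleep_hdd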
- 04:35 AM Bug #62171 (Resolved): All OSD shards should use the same scheduler type when osd_op_queue=debug_...
- 04:34 AM Backport #63874 (Resolved): reef: All OSD shards should use the same scheduler type when osd_op_q...
- 04:32 AM Backport #64881 (Resolved): reef: singleton/ec-inconsistent-hinfo.yaml: Include a possible benign...
03/25/2024
- 09:37 PM Bug #63066: rados/objectstore - application not enabled on pool '.mgr'
- /a/yuriw-2024-03-22_13:10:42-rados-wip-yuri7-testing-2024-03-20-1625-quincy-distro-default-smithi/7616976
- 09:24 PM Backport #64881: reef: singleton/ec-inconsistent-hinfo.yaml: Include a possible benign cluster lo...
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56151
merged
- 09:24 PM Bug #64725: rados/singleton: application not enabled on pool 'rbd'
- /a/yuriw-2024-03-22_13:10:42-rados-wip-yuri7-testing-2024-03-20-1625-quincy-distro-default-smithi/7616657
- 09:23 PM Backport #63874: reef: All OSD shards should use the same scheduler type when osd_op_queue=debug_...
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/54981
merged
- 09:05 PM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- /a/yuriw-2024-03-22_13:09:48-rados-wip-yuri11-testing-2024-03-21-0851-reef-distro-default-smithi/7616706
- 08:57 PM Bug #59057 (Resolved): rados/test_envlibrados_for_rocksdb.sh: No rule to make target 'rocksdb_env...
- 06:39 PM Bug #64854: decoding chunk_refs_by_hash_t return wrong values
- In QA.
- 06:38 PM Bug #63891 (Fix Under Review): mon/AuthMonitor: fix potential repeated global_id
- Bump up.
- 06:37 PM Bug #64997 (Need More Info): There is always an osd process that takes up high cpu
- Note from bugscrub: need a summary here.
- 06:36 PM Bug #64670: LibRadosAioEC.RoundTrip2 hang and pkill
- Looks like starvation?
- 06:28 PM Bug #64972: qa: "ceph tell 4.3a deep-scrub" command not found
- Radoslaw Zarzynski wrote:
> Patrick, are you posting the PR as a culprit?
yes, is it not?
- 05:50 PM Bug #64972: qa: "ceph tell 4.3a deep-scrub" command not found
- Patrick, are you posting the PR as a culprit?
- 06:18 PM Bug #63198: rados/thrash: AssertionError: wait_for_recovery: failed before timeout expired
- Bump up but not terribly high prio.
- 06:14 PM Bug #65013: replica_read not available on most recently updated objects in each PG
- Bump up.
- 06:12 PM Bug #65044: osd/scrub: must disable reservation timeout for reserver-based requests
- Bump up.
- 06:03 PM Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
- A command for local replication from Ronen (thanks!):...
- 08:38 AM Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
- /a/yuriw-2024-03-25_00:22:23-rados-wip-yuri3-testing-2024-03-24-1519-distro-default-smithi/7620817
- 06:02 PM Bug #62209: can not promote object at readonly tier mode
- If the fix helps, we can reopen and merge.
- 03:49 AM Bug #62209: can not promote object at readonly tier mode
- Okay, thanks, I will try it. By the way, does ceph have other caching solutions now?
- 05:57 PM Backport #65081 (Resolved): squid: mon: MON_DOWN warnings when mons are first booting
- 05:57 PM Bug #56393: failed to complete snap trimming before timeout
- I will merge the PR mentioned by Matan above.
- 10:08 AM Bug #56393: failed to complete snap trimming before timeout
- Radoslaw Zarzynski wrote:
> Hi Matan,
> would you mind taking a look? Not a high priority.
I suspect that the ne...
- 05:52 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- Review in progress.
- 05:51 PM Bug #53240: full-object read crc is mismatch, because truncate modify oi.size and forget to clear...
- Bump up.
- 05:48 PM Backport #65121 (New): reef: PG autoscaler tuning => catastrophic ceph cluster crash
- 05:48 PM Backport #65120 (New): squid: PG autoscaler tuning => catastrophic ceph cluster crash
- 05:48 PM Backport #65119 (New): quincy: PG autoscaler tuning => catastrophic ceph cluster crash
- 05:45 PM Bug #64333 (Pending Backport): PG autoscaler tuning => catastrophic ceph cluster crash
- Zac has already created some backports.
- 03:12 PM Bug #64802 (Fix Under Review): rados: generalize stretch mode pg temp handling to be usable witho...
- 02:44 PM Bug #64802: rados: generalize stretch mode pg temp handling to be usable without stretch mode
- Okay so the latest change that I added will have two commands:...
- 01:45 PM Bug #63881 (Fix Under Review): Inaccurate pg splits/merges and pool deletion/creation on OSD mapgap
03/24/2024
- 12:05 PM Backport #65097 (In Progress): squid: ceph osd pool rmsnap clone object leak
- 11:58 AM Backport #65097 (In Progress): squid: ceph osd pool rmsnap clone object leak
- https://github.com/ceph/ceph/pull/56432
- 12:05 PM Backport #65096 (In Progress): reef: ceph osd pool rmsnap clone object leak
- 11:58 AM Backport #65096 (In Progress): reef: ceph osd pool rmsnap clone object leak
- https://github.com/ceph/ceph/pull/56431
- 12:04 PM Backport #65095 (In Progress): quincy: ceph osd pool rmsnap clone object leak
- 11:57 AM Backport #65095 (In Progress): quincy: ceph osd pool rmsnap clone object leak
- https://github.com/ceph/ceph/pull/56430
- 11:55 AM Bug #64646 (Pending Backport): ceph osd pool rmsnap clone object leak
- 09:46 AM Bug #64917 (Fix Under Review): SnapMapperTest.CheckObjectKeyFormat object key changed
03/23/2024
- 03:50 PM Bug #65044 (Fix Under Review): osd/scrub: must disable reservation timeout for reserver-based req...
03/22/2024
- 06:00 PM Bug #65090 (New): rados: the object most recently written on a pg won't be available for replica ...
- In practice, this means that at any time at least one object on each PG won't be available for replica read. On a po...
- 05:59 PM Bug #65086 (New): rados: replicas do not initialize their mlcod value upon activation, replica re...
- ...
- 05:55 PM Bug #65085 (New): rados: replica mlcod tends to lag by two cycles rather than one limiting replic...
- The replica and the primary populate RepModify::last_complete and RepGather::pg_local_last_complete prior to doing Pr...
- 04:08 PM Backport #65082 (In Progress): reef: mon: MON_DOWN warnings when mons are first booting
- 03:59 PM Backport #65082 (In Progress): reef: mon: MON_DOWN warnings when mons are first booting
- https://github.com/ceph/ceph/pull/56408
- 04:06 PM Backport #65081 (In Progress): squid: mon: MON_DOWN warnings when mons are first booting
- 03:59 PM Backport #65081 (Resolved): squid: mon: MON_DOWN warnings when mons are first booting
- https://github.com/ceph/ceph/pull/56407
- 03:56 PM Bug #64968 (Pending Backport): mon: MON_DOWN warnings when mons are first booting
- 02:41 PM Backport #62921 (In Progress): quincy: mon/MonmapMonitor: do not propose on error in prepare_update
- 02:39 PM Backport #62923 (In Progress): reef: mon/MonmapMonitor: do not propose on error in prepare_update
- 02:34 PM Backport #62922 (Rejected): pacific: mon/MonmapMonitor: do not propose on error in prepare_update
- EOL
- 01:31 PM Bug #65044 (In Progress): osd/scrub: must disable reservation timeout for reserver-based requests
- 11:41 AM Backport #65072 (New): squid: rados/thrash: slow reservation response from 1 (115547ms) in cluste...
- 11:34 AM Bug #64869 (Pending Backport): rados/thrash: slow reservation response from 1 (115547ms) in clust...
- 01:06 AM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- Looking at the above crash which is referred to in https://github.com/ceph/ceph/pull/55596#issuecomment-2011798771
...
03/21/2024
- 11:51 PM Bug #65013: replica_read not available on most recently updated objects in each PG
- 'last_complete_ondisk is updated to' only happens on the primary from PeeringState::calc_min_last_complete_ondisk(), ...
- 02:56 PM Bug #65013: replica_read not available on most recently updated objects in each PG
- Also note, pgs from the data pool that couldn't serve from replica:...
- 02:52 PM Bug #65013: replica_read not available on most recently updated objects in each PG
- Reproduced the issue again on a new dev environment.
Created env:...
- 12:20 AM Bug #65013: replica_read not available on most recently updated objects in each PG
- Some definitions:
pg_info_t::last_update: most recent update seen by an OSD (primary or replica), should be the sa...
- 03:11 PM Bug #65044 (Fix Under Review): osd/scrub: must disable reservation timeout for reserver-based req...
- 03:00 PM Backport #65042 (New): squid: librados: use CEPH_OSD_FLAG_FULL_FORCE for IoCtxImpl::remove
- 03:00 PM Backport #65041 (New): quincy: librados: use CEPH_OSD_FLAG_FULL_FORCE for IoCtxImpl::remove
- 03:00 PM Backport #65040 (New): reef: librados: use CEPH_OSD_FLAG_FULL_FORCE for IoCtxImpl::remove
- 03:00 PM Bug #64558 (Pending Backport): librados: use CEPH_OSD_FLAG_FULL_FORCE for IoCtxImpl::remove
- 09:23 AM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- /a/yuriw-2024-03-19_00:09:45-rados-wip-yuri5-testing-2024-03-18-1144-distro-default-smithi/7609959
- 09:22 AM Bug #64917: SnapMapperTest.CheckObjectKeyFormat object key changed
- /a/yuriw-2024-03-19_00:09:45-rados-wip-yuri5-testing-2024-03-18-1144-distro-default-smithi/7609912
- 07:47 AM Bug #63198: rados/thrash: AssertionError: wait_for_recovery: failed before timeout expired
- /a/yuriw-2024-03-19_00:09:45-rados-wip-yuri5-testing-2024-03-18-1144-distro-default-smithi/7609848
- 07:35 AM Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
- /a/yuriw-2024-03-19_00:09:45-rados-wip-yuri5-testing-2024-03-18-1144-distro-default-smithi/7609843
03/20/2024
- 11:20 PM Backport #63879: quincy: tools/ceph_objectstore_tool: Support get/set/superblock
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/55014
merged
- 11:15 PM Bug #65013: replica_read not available on most recently updated objects in each PG
- In addition to the observation that mlcod is lagging by more than it should, there's a second detail:...
- 11:04 PM Bug #65013: replica_read not available on most recently updated objects in each PG
- ...
- 10:47 PM Bug #65013: replica_read not available on most recently updated objects in each PG
- A 4G image comprises 1024 4M objects. The above script creates 64 pgs. The best possible case with the c...
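Spelling out the arithmetic from the numbers above: 4 GiB / 4 MiB = 1024 objects spread over 64 pgs, i.e. roughly 16 objects per pg; if the most recently written object in each pg can't be served from a replica, that's 64/1024 ≈ 6% of objects affected at any time.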
- 10:26 PM Bug #65013: replica_read not available on most recently updated objects in each PG
- The following tests are done on ad0cb1eb1609caa646abbbdf6ebccd4dfda0b417 from https://github.com/ceph/ceph/pull/56180...
- 08:57 PM Bug #65013: replica_read not available on most recently updated objects in each PG
- Is that branch https://github.com/ceph/ceph/pull/56180/files ad0cb1eb1609caa646abbbdf6ebccd4dfda0b417 ?
- 08:40 PM Bug #65013: replica_read not available on most recently updated objects in each PG
- From that log line, looks like last_update is 162'9763 and mlcod is 162'9761 -- that looks like 2 updates behind?
- 08:25 PM Bug #65013 (New): replica_read not available on most recently updated objects in each PG
- In my dev environment, when trying to read from replica (leveraging crush_location), the osd rejects the requests and...
- 09:57 PM Backport #65014 (New): reef: rados/singleton: application not enabled on pool 'rbd'
- 09:51 PM Bug #64725 (Pending Backport): rados/singleton: application not enabled on pool 'rbd'
- 03:04 PM Bug #62209: can not promote object at readonly tier mode
- You can refer to this: https://github.com/ceph/ceph/pull/52672
- 02:59 PM Bug #64869: rados/thrash: slow reservation response from 1 (115547ms) in cluster log
- Updating the backport field per today's core-sync.
- 11:41 AM Bug #64869 (Fix Under Review): rados/thrash: slow reservation response from 1 (115547ms) in clust...
- 02:34 PM Bug #65008 (New): EC pool - PGs down even if min size is satisfied
- Hello, I've been evaluating an erasure-coded Ceph setup with the following requirements:
- k+m 7+5
- 3 racks
- 5 hosts ...
- 12:35 PM Bug #64670: LibRadosAioEC.RoundTrip2 hang and pkill
- ...
- 10:02 AM Bug #64866 (Fix Under Review): rados/test.sh: LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.Wa...
- 09:24 AM Bug #64866: rados/test.sh: LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.WatchNotify3/1 failed
- After checking more deeply, only watch_check will give us the timeout return code.
the client log also shows that we wa...
- 03:25 AM Bug #64997 (Need More Info): There is always an osd process that takes up high cpu
- refer to: https://github.com/rook/rook/issues/13901
- 03:24 AM Bug #63891: mon/AuthMonitor: fix potential repeated global_id
- does anyone see this issue?
03/19/2024
- 02:53 PM Backport #64396 (Resolved): quincy: mon: health store size growing infinitely
- 02:47 PM Backport #64396: quincy: mon: health store size growing infinitely
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/55549
merged
- 02:45 PM Backport #63843: quincy: Add health error if one or more OSDs registered v1/v2 public ip addresse...
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/55698
merged
- 01:51 PM Bug #64333: PG autoscaler tuning => catastrophic ceph cluster crash
- During code reading I found that @--force@ allows overruling the stripe alignment rules. @--yes-i-really-mean-it@ is p...
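For illustration, the command shape in question would be something like this (a sketch with a hypothetical profile name; assuming, per #65183, that the flags apply to overriding an existing EC profile):
ceph osd erasure-code-profile set myprofile k=4 m=2 --force --yes-i-really-mean-it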
- 01:14 PM Bug #64869 (In Progress): rados/thrash: slow reservation response from 1 (115547ms) in cluster log
- The problem is: how to differentiate between instances where one of the scrub reservation messages is queued as 'wait...
- 12:48 PM Bug #64866 (In Progress): rados/test.sh: LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.WatchNo...
- the client log shows cookie 94576533816384...
- 12:20 PM Bug #64978 (New): from rgw suite: HEALTH_WARN Reduced data availability: 1 pg inactive, 1 pg peering
- from https://qa-proxy.ceph.com/teuthology/cbodley-2024-03-19_01:03:50-rgw-wip-cbodley-testing-distro-default-smithi/7...
- 05:28 AM Bug #64854 (Fix Under Review): decoding chunk_refs_by_hash_t return wrong values
- 02:20 AM Bug #64824: mon: ceph-16.2.14/src/mon/Monitor.cc: 5661: FAILED ceph_assert(err == 0)
- Radoslaw Zarzynski wrote:
> Would need logs with @debug_mon=20@ and @debug_rocksdb=20@ from period before the assert... - 02:01 AM Bug #62209: can not promote object at readonly tier mode
- In the scenario of cache tiering, are there any other solutions?
03/18/2024
- 07:29 PM Bug #64972: qa: "ceph tell 4.3a deep-scrub" command not found
- and https://github.com/ceph/ceph/pull/54214
- 07:29 PM Bug #64972 (New): qa: "ceph tell 4.3a deep-scrub" command not found
- ...
- 07:24 PM Bug #63967 (Resolved): qa/tasks/ceph.py: "ceph tell <pgid> deep_scrub" fails
- 06:56 PM Bug #64646: ceph osd pool rmsnap clone object leak
- In QA.
- 06:56 PM Bug #64854: decoding chunk_refs_by_hash_t return wrong values
- Hmm, I guess I saw a PR for that.
- 06:55 PM Bug #64824: mon: ceph-16.2.14/src/mon/Monitor.cc: 5661: FAILED ceph_assert(err == 0)
- Would need logs with @debug_mon=20@ and @debug_rocksdb=20@ from period before the assertion.
- 06:51 PM Bug #64670: LibRadosAioEC.RoundTrip2 hang and pkill
- Nothing new but still observing. Bump up.
- 06:50 PM Bug #64866: rados/test.sh: LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.WatchNotify3/1 failed
- Hi Nitzan! Would you mind taking a look?
- 06:49 PM Bug #64863: rados/thrash-old-clients: Health detail: HEALTH_WARN 1/3 mons down, quorum a,c in clu...
- Hmm, I think I saw Laura's PR for @MON_DOWN@.
- 06:44 PM Bug #58436: ceph cluster log reporting log level in numeric format for the clog messages
- Do we need to backport?
- 06:43 PM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- In QA.
- 06:36 PM Bug #64558: librados: use CEPH_OSD_FLAG_FULL_FORCE for IoCtxImpl::remove
- Sent to QA.
- 06:28 PM Bug #57782 (Fix Under Review): [mon] high cpu usage by fn_monstore thread
- The fix awaits QA.
- 06:26 PM Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
- Passed QA.
- 06:25 PM Bug #64938: Pool created with single PG splits into many on single OSD causes OSD to hit max_pgs_...
- Reviewed.
- 06:21 PM Bug #62992: Heartbeat crash in reset_timeout and clear_timeout
- https://github.com/ceph/ceph/pull/54492 merged
- 06:12 PM Bug #64968 (Fix Under Review): mon: MON_DOWN warnings when mons are first booting
- 04:11 PM Bug #64968 (Pending Backport): mon: MON_DOWN warnings when mons are first booting
- ...
- 05:58 PM Bug #56393: failed to complete snap trimming before timeout
- Hi Matan,
would you mind taking a look? Not a high priority.
- 01:53 PM Bug #56393: failed to complete snap trimming before timeout
- /a/yuriw-2024-03-15_19:59:43-rados-wip-yuri6-testing-2024-03-15-0709-distro-default-smithi/7603381/...
- 05:52 PM Bug #64347: src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps)
- In QA.
- 04:26 PM Bug #64347: src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps)
- /a/yuriw-2024-03-15_19:59:43-rados-wip-yuri6-testing-2024-03-15-0709-distro-default-smithi/7603610/
- 05:48 PM Bug #64917: SnapMapperTest.CheckObjectKeyFormat object key changed
- I think this is already tackled by https://github.com/ceph/ceph/pull/56142.
Assigning to Matan for confirmation. I...
- 04:31 PM Bug #64917: SnapMapperTest.CheckObjectKeyFormat object key changed
- /a/yuriw-2024-03-15_19:59:43-rados-wip-yuri6-testing-2024-03-15-0709-distro-default-smithi/7603418/
/a/yuriw-2024-03...
- 05:43 PM Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
- Bump up.
- 05:12 PM Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
- /a/yuriw-2024-03-15_19:59:43-rados-wip-yuri6-testing-2024-03-15-0709-distro-default-smithi/7603349
- 05:41 PM Bug #53240: full-object read crc is mismatch, because truncate modify oi.size and forget to clear...
- In QA.
- 05:40 PM Bug #64333: PG autoscaler tuning => catastrophic ceph cluster crash
- I'm going to propose a patch removing the @--force@.
- 05:39 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- Bump up.
03/16/2024
03/15/2024
- 10:49 PM Bug #64802: rados: generalize stretch mode pg temp handling to be usable without stretch mode
- I recently created a draft PR https://github.com/ceph/ceph/pull/56233/, adding the additional arguments peering_bucke...
- 10:14 PM Bug #64802: rados: generalize stretch mode pg temp handling to be usable without stretch mode
- WIP PR: https://github.com/ceph/ceph/pull/56233
- 09:02 AM Bug #56393: failed to complete snap trimming before timeout
- /a/yuriw-2024-03-13_19:25:03-rados-wip-yuri6-testing-2024-03-12-0858-distro-default-smithi/7597884
/a/yuriw-2024-03-...
- 08:06 AM Bug #64942 (New): rados/verify: valgrind reports "Invalid read of size 8" error.
- /a/yuriw-2024-03-08_16:20:46-rados-wip-yuri4-testing-2024-03-05-0854-distro-default-smithi/7587319
/a/yuriw-2024-03-...
- 01:01 AM Bug #64938 (Fix Under Review): Pool created with single PG splits into many on single OSD causes ...
- 12:51 AM Bug #64938 (Fix Under Review): Pool created with single PG splits into many on single OSD causes ...
- With autoscale mode ON, if a new pool is created without specifying the pg_num/pgp_num values then the pool gets crea...
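A quick way to observe the initial pg count (a sketch with a hypothetical pool name):
ceph osd pool create testpool
ceph osd pool autoscale-status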
03/14/2024
- 02:18 PM Bug #64802: rados: generalize stretch mode pg temp handling to be usable without stretch mode
- peering_crush_bucket_[count|target|barrier]
- 01:47 PM Bug #64802: rados: generalize stretch mode pg temp handling to be usable without stretch mode
- Don't forget that there is also pg_pool_t::peering_crush_bucket_count that directly requires a minimum number of high...
- 12:38 PM Bug #64802: rados: generalize stretch mode pg temp handling to be usable without stretch mode
- My current plan/script to set up a vstart cluster to test out the above hypothesis:...
- 01:17 PM Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
- /a/yuriw-2024-03-13_19:26:09-rados-wip-yuri-testing-2024-03-12-1240-reef-distro-default-smithi/7598397
/a/yuriw-2024...
- 12:57 PM Backport #63559: reef: Heartbeat crash in osd
- /a/yuriw-2024-03-13_19:26:09-rados-wip-yuri-testing-2024-03-12-1240-reef-distro-default-smithi/7598201
- 11:00 AM Bug #64917 (Fix Under Review): SnapMapperTest.CheckObjectKeyFormat object key changed
- /a/yuriw-2024-03-12_18:29:22-rados-wip-yuri8-testing-2024-03-11-1138-distro-default-smithi/7594695...
03/13/2024
- 04:44 PM Bug #57782: [mon] high cpu usage by fn_monstore thread
- Hi,
Thanks to this article https://blog.palark.com/sre-troubleshooting-ceph-systemd-containerd/, I think the root caus...
- 01:34 PM Bug #64735: OSD/MON: rollback_to snap the latest overlap is not right
- Ilya Dryomov wrote:
> No, snap2 would continue to exist and one should be able to "rollback" to it. Rollback is rea...
- 10:16 AM Bug #64735: OSD/MON: rollback_to snap the latest overlap is not right
- Matan Breizman wrote:
> Ilya Dryomov wrote:
> > Put another way: rollback is a destructive operation. One isn't ex...
- 10:00 AM Bug #64735: OSD/MON: rollback_to snap the latest overlap is not right
- Ilya Dryomov wrote:
> Put another way: rollback is a destructive operation. One isn't expected to be able to go bac...
- 01:15 PM Bug #64897 (New): unittest_ceph_crypto - valgrind failed
- running unit-test with valgrind:
ctest -R unittest_ceph_crypto -T memcheck... - 01:14 PM Bug #64895 (New): unittest_perf_counters_cache - valgrind failed
- running unit-test with valgrind:
ctest -R unittest_perf_counters_cache -T memcheck...
- 01:13 PM Bug #64893 (New): unittest_bufferlist - valgrind failed
- running unit-test with valgrind:
ctest -R unittest_bufferlist -T memcheck... - 01:11 PM Bug #64892 (New): unittest_ipaddr - valgrind failed
- running unit-test with valgrind:
ctest -R unittest_ipaddr -T memcheck... - 01:08 PM Bug #64891 (New): unittest_admin_socket - valgrind failed
- running unit-test with valgrind:
ctest -R unittest_admin_socket -T memcheck... - 08:08 AM Backport #64881 (In Progress): reef: singleton/ec-inconsistent-hinfo.yaml: Include a possible ben...
- 07:34 AM Backport #64881 (Resolved): reef: singleton/ec-inconsistent-hinfo.yaml: Include a possible benign...
- https://github.com/ceph/ceph/pull/56151
- 07:32 AM Bug #64314 (Resolved): cluster log: Cluster log level string representation missing in the cluste...
- 07:30 AM Fix #64573 (Pending Backport): singleton/ec-inconsistent-hinfo.yaml: Include a possible benign cl...
- 05:26 AM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- Brad Hubbard wrote:
> Nitzan Mordechai wrote:
> > now the segfault happens on check_one function where we also have...
- 02:27 AM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- Nitzan Mordechai wrote:
> now the segfault happens on check_one function where we also have pre-regex to truncate th...
03/12/2024
- 08:29 PM Bug #64725 (Fix Under Review): rados/singleton: application not enabled on pool 'rbd'
- 01:48 PM Bug #64725: rados/singleton: application not enabled on pool 'rbd'
- /a/yuriw-2024-03-08_16:20:46-rados-wip-yuri4-testing-2024-03-05-0854-distro-default-smithi/7587549
- 06:21 PM Bug #58436: ceph cluster log reporting log level in numeric format for the clog messages
- https://github.com/ceph/ceph/pull/49730 merged
- 05:03 PM Bug #64735: OSD/MON: rollback_to snap the latest overlap is not right
- Ilya Dryomov wrote:
> This is because rollback discards all changes made to image HEAD and makes it identical to the...
- 04:30 PM Bug #64735: OSD/MON: rollback_to snap the latest overlap is not right
- Matan Breizman wrote:
> the suggested change here suggests that the disk usage should actually be:
> NAME ...
- 04:13 PM Bug #64735: OSD/MON: rollback_to snap the latest overlap is not right
- Hi Matan,
We are able to roll back and forth between arbitrary snapshots, and the suggested change in https://...
- 02:24 PM Bug #64735 (Need More Info): OSD/MON: rollback_to snap the latest overlap is not right
- We should first understand whether this is a bug or intentional behavior, given the following order of operations:
<...
- 03:35 PM Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
- /a/yuriw-2024-03-08_16:19:51-rados-wip-yuri2-testing-2024-03-01-1606-distro-default-smithi/7587184
- 01:20 PM Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
- /a/yuriw-2024-03-08_16:20:46-rados-wip-yuri4-testing-2024-03-05-0854-distro-default-smithi/7587334
- 03:33 PM Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
- /a/yuriw-2024-03-08_16:19:51-rados-wip-yuri2-testing-2024-03-01-1606-distro-default-smithi/7587174/
- 01:18 PM Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
- /a/yuriw-2024-03-08_16:20:46-rados-wip-yuri4-testing-2024-03-05-0854-distro-default-smithi/7587531
/a/yuriw-2024-03-...
- 02:15 PM Bug #64869 (Pending Backport): rados/thrash: slow reservation response from 1 (115547ms) in clust...
- /a/yuriw-2024-03-08_16:20:46-rados-wip-yuri4-testing-2024-03-05-0854-distro-default-smithi/7587833
The cluster log...
- 01:27 PM Bug #64866 (Fix Under Review): rados/test.sh: LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.Wa...
- /a/yuriw-2024-03-08_16:20:46-rados-wip-yuri4-testing-2024-03-05-0854-distro-default-smithi/7587349
There was a sim...
- 01:19 PM Bug #62832 (Resolved): common: config_proxy deadlock during shutdown (and possibly other times)
- 01:19 PM Backport #63457 (Resolved): quincy: common: config_proxy deadlock during shutdown (and possibly o...
- 12:44 PM Bug #64863 (New): rados/thrash-old-clients: Health detail: HEALTH_WARN 1/3 mons down, quorum a,c ...
- The following tests in the rados suite failed with the warning:
/a/yuriw-2024-03-08_16:20:46-rados-wip-yuri4-testi...
- 12:21 PM Bug #52624: qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
- /a/yuriw-2024-03-08_16:20:46-rados-wip-yuri4-testing-2024-03-05-0854-distro-default-smithi/7587455
- 11:26 AM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- now the segfault happens in the check_one function, where we also have a pre-regex to truncate the output that is causing the segfa...
- 07:55 AM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- according to the console logs:...
- 04:22 AM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- Radoslaw Zarzynski wrote:
> The fix isn't merged yet which could explain the reoccurrence above
The run mentioned...
- 08:28 AM Bug #64514 (Duplicate): LibRadosTwoPoolsPP.PromoteSnapScrub test failed
- Closing as this is a duplicate.
- 08:27 AM Bug #64646: ceph osd pool rmsnap clone object leak
- Radoslaw Zarzynski wrote:
> Need a squid backport as well.
Awaiting main merge (https://github.com/ceph/ceph/pull...
- 06:31 AM Bug #64854 (Fix Under Review): decoding chunk_refs_by_hash_t return wrong values
- When running the ceph dencoder test on a clang-14 compiled build, the JSON dump of chunk_refs_by_hash_t will show:...
- 06:02 AM Bug #56393: failed to complete snap trimming before timeout
- /a/yuriw-2024-03-08_16:20:46-rados-wip-yuri4-testing-2024-03-05-0854-distro-default-smithi/7587430
/a/yuriw-2024-03-...
- 02:08 AM Bug #64824: mon: ceph-16.2.14/src/mon/Monitor.cc: 5661: FAILED ceph_assert(err == 0)
- Radoslaw Zarzynski wrote:
> Looks like a mon-scrub failure. This can be caused by a HW issue or by a corruption.
> ...
03/11/2024
- 08:55 PM Bug #64438: NeoRadosWatchNotify.WatchNotifyTimeout times out along with FAILED ceph_assert(op->se...
- Fails here in the neorados test:...
- 07:18 PM Feature #64849 (New): rados: Support read_from_replica everywhere
- The Objecter supports read-from-replica if you pass in the LOCALIZE_READS flag. If we want to serve all read IO from ...
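For context, librbd already exposes this flag through a config option; setting it to localize makes librbd reads carry LOCALIZE_READS (a sketch of existing behavior, not part of this feature):
ceph config set client rbd_read_from_replica_policy localize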
- 06:40 PM Bug #64735: OSD/MON: rollback_to snap the latest overlap is not right
- There is PR posted: https://github.com/ceph/ceph/pull/55991
- 06:06 PM Bug #64735: OSD/MON: rollback_to snap the latest overlap is not right
- Hi Matan! Would you mind taking a look?
- 06:18 PM Bug #64670: LibRadosAioEC.RoundTrip2 hang and pkill
- Bump up.
- 06:16 PM Bug #54182: OSD_TOO_MANY_REPAIRS cannot be cleared in >=Octopus
- Review in progress.
- 06:15 PM Bug #64514: LibRadosTwoPoolsPP.PromoteSnapScrub test failed
- Bump up.
- 06:09 PM Bug #64725: rados/singleton: application not enabled on pool 'rbd'
- Fix is to add this to the ignorelist.
- 06:02 PM Bug #64646: ceph osd pool rmsnap clone object leak
- Need a squid backport as well.
- 06:00 PM Bug #64824 (Need More Info): mon: ceph-16.2.14/src/mon/Monitor.cc: 5661: FAILED ceph_assert(err =...
- Looks like a mon-scrub failure. This can be caused by a HW issue or by a corruption.
Is there a sign of malfunctioni... - 08:24 AM Bug #64824 (Need More Info): mon: ceph-16.2.14/src/mon/Monitor.cc: 5661: FAILED ceph_assert(err =...
- -1> 2024-03-11T02:29:03.716+0000 7f6600eaf700 -1 /root/rpmbuild/BUILD/ceph-16.2.14/src/mon/Monitor.cc: In functio...
- 05:55 PM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- The fix isn't merged yet which could explain the reoccurrence above
- 02:45 PM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- /a/yuriw-2024-03-08_16:20:46-rados-wip-yuri4-testing-2024-03-05-0854-distro-default-smithi/7587684
/a/yuriw-2024-03-... - 05:51 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- Bump up.
- 05:50 PM Bug #64333: PG autoscaler tuning => catastrophic ceph cluster crash
- 1. I'm still not sure we need @--force@. 2. If it turns out to be justified, shouldn't it be @--yes-i-really-really-mean-it@?
- 05:42 PM Bug #64314: cluster log: Cluster log level string representation missing in the cluster logs.
- Still in testing.
03/10/2024
- 07:37 AM Bug #64657 (Rejected): Ceph test cases starting cluster not waiting for OSDs to join fully
- 茁野 鲍, thanks for letting us know!
I'll reject that bug.
03/08/2024
- 11:50 PM Bug #64804 (Duplicate): gcc-13 apparently breaks SafeTimer
- 04:07 AM Bug #64804 (Duplicate): gcc-13 apparently breaks SafeTimer
- https://github.com/ceph/ceph/pull/55886
Probably related to https://bugzilla.redhat.com/show_bug.cgi?id=2241339 . - 10:19 AM Bug #62338: osd: choose_async_recovery_ec may select an acting set < min_size
- Hello again.
Apparently I got a tiny little bit too excited.
I tested the case described above with 16.2.15 and...
- 12:26 AM Bug #64802 (Fix Under Review): rados: generalize stretch mode pg temp handling to be usable witho...
- PeeringState::calc_replicated_acting_stretch encodes special behavior for stretch clusters which prohibits the primar...
03/07/2024
- 12:17 PM Bug #64788 (Fix Under Review): EpollDriver::del_event() crashes when the nic is unplugged
- 11:48 AM Bug #64788 (Fix Under Review): EpollDriver::del_event() crashes when the nic is unplugged
- librbd uses msgr to talk to its Ceph cluster. If the client's NIC is hot-unplugged, there is a chance that @EpollDriver...
- 09:04 AM Bug #64657: Ceph test cases starting cluster not waiting for OSDs to join fully
- Thank you for addressing this issue. I appreciate your effort in fixing it.
I apologize for the oversight o...
03/06/2024
- 07:15 PM Bug #64726: LibRadosAioEC.MultiWritePP hang and pkill
- ...
- 07:14 PM Bug #64726: LibRadosAioEC.MultiWritePP hang and pkill
- I think the direct reason behind the test's hang is the death of @osd.5@:...
- 08:22 AM Bug #64726: LibRadosAioEC.MultiWritePP hang and pkill
- removed the "Related issues"
- 08:21 AM Bug #64726: LibRadosAioEC.MultiWritePP hang and pkill
- the last op that LibRadosAioEC.MultiWritePP tries to do is writing the oid_MultiWritePP_ obj:...
- 03:20 PM Bug #63389: Failed to encode map X with expected CRC
- The problem arose from a commit that introduced the commented-out check for @SERVER_REEF@ in @OSDMap::encode()@....
- 08:21 AM Bug #64735 (Need More Info): OSD/MON: rollback_to snap the latest overlap is not right
- When rolling back to a snap, we use the latest clone's current overlap to intersection_of the older snapshot's clone overlap.
... - 07:43 AM Bug #62338: osd: choose_async_recovery_ec may select an acting set < min_size
- Hello. Just FYI, this fixes a very nasty issue in my EC setup.
Here are some details.
The EC setup and crush rule...
03/05/2024
- 10:50 PM Bug #62992: Heartbeat crash in reset_timeout and clear_timeout
- /a/yuriw-2024-03-04_20:52:58-rados-reef-release-distro-default-smithi/7581448
- 10:47 PM Bug #64726 (New): LibRadosAioEC.MultiWritePP hang and pkill
- /a/yuriw-2024-03-04_20:52:58-rados-reef-release-distro-default-smithi/7581519...
- 10:42 PM Bug #55141: thrashers/fastread: assertion failure: rollback_info_trimmed_to == head
- /a/yuriw-2024-03-04_20:52:58-rados-reef-release-distro-default-smithi/7581575
- 10:33 PM Bug #64725 (Pending Backport): rados/singleton: application not enabled on pool 'rbd'
- /a/yuriw-2024-03-04_20:52:58-rados-reef-release-distro-default-smithi/7581526...
- 10:24 PM Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
- /a/yuriw-2024-03-04_20:52:58-rados-reef-release-distro-default-smithi/7581722
/a/yuriw-2024-03-04_20:52:58-rados-ree... - 10:24 PM Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
- Update on this: The PR is ready to be reviewed again.
- 01:04 PM Bug #64514 (In Progress): LibRadosTwoPoolsPP.PromoteSnapScrub test failed
- 01:04 PM Bug #64514: LibRadosTwoPoolsPP.PromoteSnapScrub test failed
- This may be related to the bug fixed in https://tracker.ceph.com/issues/64347. However, the outcome here is different whi...
- 08:08 AM Bug #64657: Ceph test cases starting cluster not waiting for OSDs to join fully
- Without the full log it will be hard to tell whether the symptoms I see are exactly what 茁野 鲍 sees, but we are missing t...
03/04/2024
- 09:19 PM Backport #63526 (Resolved): quincy: crash: int OSD::shutdown(): assert(end_time - start_time_func...
- 08:45 PM Bug #61140: crash: int OSD::shutdown(): assert(end_time - start_time_func < cct->_conf->osd_fast_...
- https://github.com/ceph/ceph/pull/55134 merged
- 08:07 PM Backport #58337 (Rejected): pacific: mon-stretched_cluster: degraded stretched mode lead to Monit...
- 08:06 PM Backport #58337 (Duplicate): pacific: mon-stretched_cluster: degraded stretched mode lead to Moni...
- pacific is EOL
- 08:07 PM Bug #59271 (Resolved): mon: FAILED ceph_assert(osdmon()->is_writeable())
- 08:07 PM Backport #59700 (Rejected): pacific: mon: FAILED ceph_assert(osdmon()->is_writeable())
- pacific is EOL
- 08:06 PM Bug #57017 (Resolved): mon-stretched_cluster: degraded stretched mode lead to Monitor crash
- 08:00 PM Bug #64657: Ceph test cases starting cluster not waiting for OSDs to join fully
- Hi Nitzan! Would you mind taking a look?
- 07:59 PM Bug #64637: LeakPossiblyLost in BlueStore::_do_write_small() in osd
- Looks like a typical symptom of (CPU/memory) starvation.
- 07:59 PM Bug #64646: ceph osd pool rmsnap clone object leak
- note from bug scrub: reviewed, went to QA.
- 07:58 PM Bug #64514: LibRadosTwoPoolsPP.PromoteSnapScrub test failed
- Bump up.
- 07:56 PM Bug #54182: OSD_TOO_MANY_REPAIRS cannot be cleared in >=Octopus
- note from bug scrub: reviewed, changes requested.
- 07:55 PM Bug #64670: LibRadosAioEC.RoundTrip2 hang and pkill
- Might be something new. Bump up and observe.
- 07:53 PM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- note from scrub: the PR is approved. Needs-qa.
- 07:51 PM Bug #64674 (Resolved): src/scripts/ceph-backport.sh
- I guess we don't need to backport anything.
- 07:49 PM Bug #64258: osd/PrimaryLogPG.cc: FAILED ceph_assert(inserted)
- note from bug scrub: reviewed.
- 01:40 PM Bug #64258 (Fix Under Review): osd/PrimaryLogPG.cc: FAILED ceph_assert(inserted)
- 07:49 PM Bug #64695: Aborted signal starting in AsyncConnection::send_message()
- ...
- 05:39 PM Bug #64695 (New): Aborted signal starting in AsyncConnection::send_message()
- /a/yuriw-2024-03-01_16:47:30-rados-wip-yuri11-testing-2024-02-28-0950-reef-distro-default-smithi/7577623...
- 07:44 PM Bug #64314: cluster log: Cluster log level string representation missing in the cluster logs.
- Still in QA. Bump up.
- 07:36 PM Bug #64333: PG autoscaler tuning => catastrophic ceph cluster crash
- Thank you very, very much for the scenario! This sheds a lot of light on what has happened.
I'm not sure whether th...
- 07:32 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- note from bug scrub: Aishwarya is addressing the review's comments.
- 06:27 PM Bug #53240: full-object read crc is mismatch, because truncate modify oi.size and forget to clear...
- The fix goes into QA.
- 12:18 AM Bug #63066: rados/objectstore - application not enabled on pool '.mgr'
- /a/yuriw-2024-02-28_15:47:41-rados-wip-yuri4-testing-2024-02-27-1111-quincy-distro-default-smithi/7575815
/a/yuriw-2...
03/01/2024
- 11:19 PM Bug #64674: src/scripts/ceph-backport.sh
- revert PR: https://github.com/ceph/ceph/pull/55884
will fix this
- 11:16 PM Bug #64674 (Resolved): src/scripts/ceph-backport.sh
- src/script/ceph-backport.sh: line 1737: ../../../ceph/.github/pull_request_template.md: No such file or directory
... - 11:01 PM Backport #64673 (In Progress): quincy: test_pool_min_size: AssertionError: wait_for_clean: failed...
- 10:58 PM Backport #64673 (In Progress): quincy: test_pool_min_size: AssertionError: wait_for_clean: failed...
- https://github.com/ceph/ceph/pull/55882
- 10:58 PM Backport #64672 (Rejected): pacific: test_pool_min_size: AssertionError: wait_for_clean: failed b...
- 10:58 PM Backport #64671 (New): reef: test_pool_min_size: AssertionError: wait_for_clean: failed before ti...
- 10:55 PM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- /a/yuriw-2024-02-28_22:53:11-rados-wip-yuri2-testing-2024-02-16-0829-reef-distro-default-smithi/7576306
- 10:54 PM Bug #62992: Heartbeat crash in reset_timeout and clear_timeout
- /a/yuriw-2024-02-28_22:53:11-rados-wip-yuri2-testing-2024-02-16-0829-reef-distro-default-smithi/7576311
- 10:53 PM Bug #62992: Heartbeat crash in reset_timeout and clear_timeout
- /a/yuriw-2024-02-28_22:53:11-rados-wip-yuri2-testing-2024-02-16-0829-reef-distro-default-smithi/7576314
- 09:30 PM Bug #62992: Heartbeat crash in reset_timeout and clear_timeout
- /a/yuriw-2024-02-28_22:53:11-rados-wip-yuri2-testing-2024-02-16-0829-reef-distro-default-smithi/7576298
- 10:53 PM Bug #59172 (Pending Backport): test_pool_min_size: AssertionError: wait_for_clean: failed before ...
- 10:51 PM Bug #64670 (New): LibRadosAioEC.RoundTrip2 hang and pkill
- /a/yuriw-2024-02-28_22:53:11-rados-wip-yuri2-testing-2024-02-16-0829-reef-distro-default-smithi/7576303...
- 12:11 PM Backport #64649 (In Progress): quincy: min_last_epoch_clean is not updated, causing osdmap to be ...
- 12:00 PM Backport #64650 (In Progress): reef: min_last_epoch_clean is not updated, causing osdmap to be un...
- 11:44 AM Backport #64651 (In Progress): squid: min_last_epoch_clean is not updated, causing osdmap to be u...
- 09:19 AM Bug #64657: Ceph test cases starting cluster not waiting for OSDs to join fully
- e.g., to reproduce the issue:
diff slicer-src/src/test/osd/safe-to-destroy.sh
function run() {
@@ -32,18 +32,3... - 09:12 AM Bug #64657 (Rejected): Ceph test cases starting cluster not waiting for OSDs to join fully
- I've identified an issue in the Ceph testing framework where, after starting a temporary cluster using functions like...
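For illustration, the standalone tests usually guard against this by blocking on a clean state before asserting (a sketch, assuming the wait_for_clean helper from qa/standalone/ceph-helpers.sh is in scope):
wait_for_clean || return 1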
02/29/2024
- 09:25 PM Backport #64406: reef: Failed to encode map X with expected CRC
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/55712
merged
- 09:00 PM Bug #64637: LeakPossiblyLost in BlueStore::_do_write_small() in osd
- Laura Flores wrote:
> /a/yuriw-2024-02-22_21:33:08-rados-wip-yuri8-testing-2024-02-22-0734-reef-distro-default-smith...
- 09:00 PM Bug #64637 (New): LeakPossiblyLost in BlueStore::_do_write_small() in osd
- 08:57 PM Bug #64637 (Duplicate): LeakPossiblyLost in BlueStore::_do_write_small() in osd
- 08:54 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- /a/yuriw-2024-02-28_22:39:54-rados-wip-yuri8-testing-2024-02-22-0734-reef-distro-default-smithi/7576288
- 08:42 PM Bug #62992: Heartbeat crash in reset_timeout and clear_timeout
- /a/yuriw-2024-02-28_22:39:54-rados-wip-yuri8-testing-2024-02-22-0734-reef-distro-default-smithi/7576292
- 06:26 PM Backport #64651 (In Progress): squid: min_last_epoch_clean is not updated, causing osdmap to be u...
- https://github.com/ceph/ceph/pull/55865
- 06:15 PM Backport #64650 (In Progress): reef: min_last_epoch_clean is not updated, causing osdmap to be un...
- https://github.com/ceph/ceph/pull/55867
- 06:15 PM Backport #64649 (In Progress): quincy: min_last_epoch_clean is not updated, causing osdmap to be ...
- https://github.com/ceph/ceph/pull/55868
- 06:08 PM Bug #63883 (Pending Backport): min_last_epoch_clean is not updated, causing osdmap to be unable t...
- 02:46 PM Bug #64646 (Pending Backport): ceph osd pool rmsnap clone object leak
- There are two ways to remove pool snaps: the rados tool or the mon command (ceph osd pool rmsnap).
It seems that the monitor c...
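For reference, the two command forms (a sketch with hypothetical pool/snap names):
rados -p mypool rmsnap mysnap
ceph osd pool rmsnap mypool mysnap
- 07:02 AM Bug #53342: Exiting scrub checking -- not all pgs scrubbed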
- Radoslaw Zarzynski wrote:
> Ronen, do we need any backporting?
No. The fix (55478) made it in time for Squid.