Activity

From 04/09/2024 to 05/08/2024

Today

09:31 PM rbd Backport #65587 (Resolved): squid: insufficient randomness for group and group snapshot IDs
Ilya Dryomov
08:49 PM Infrastructure Bug #63531: Error authenticating with smithiXXX.front.sepia.ceph.com: SSHException('No existing session') (No SSH private key found!)
/a/lflores-2024-05-08_14:59:36-rados-wip-lflores-testing-2-2024-05-07-1606-squid-distro-default-smithi/7697531 Laura Flores
08:34 PM Ceph Backport #65871 (In Progress): quincy: common/StackStringStream: update pointer to newly allocated memory in overflow()
Patrick Donnelly
08:32 PM Ceph Backport #65871 (In Progress): quincy: common/StackStringStream: update pointer to newly allocated memory in overflow()
https://github.com/ceph/ceph/pull/57363 Patrick Donnelly
08:34 PM Ceph Backport #65870 (In Progress): reef: common/StackStringStream: update pointer to newly allocated memory in overflow()
Patrick Donnelly
08:31 PM Ceph Backport #65870 (In Progress): reef: common/StackStringStream: update pointer to newly allocated memory in overflow()
https://github.com/ceph/ceph/pull/57362 Patrick Donnelly
08:33 PM Ceph Backport #65869 (In Progress): squid: common/StackStringStream: update pointer to newly allocated memory in overflow()
Patrick Donnelly
08:31 PM Ceph Backport #65869 (In Progress): squid: common/StackStringStream: update pointer to newly allocated memory in overflow()
https://github.com/ceph/ceph/pull/57361 Patrick Donnelly
08:32 PM Ceph Bug #65805: common/StackStringStream: update pointer to newly allocated memory in overflow()
Rongqi Sun wrote in #note-4:
> Seems like backport bot doesn't work?
Fixed it (you were not in the "Ceph Develope...
Patrick Donnelly
06:07 AM Ceph Bug #65805: common/StackStringStream: update pointer to newly allocated memory in overflow()
Seems like backport bot doesn't work? Rongqi Sun
02:51 AM Ceph Bug #65805 (Pending Backport): common/StackStringStream: update pointer to newly allocated memory in overflow()
Patrick Donnelly
08:25 PM Ceph Bug #63557: NVMe-oF gateway prometheus endpoints
@pcuzner ok to close this now? Ken Dreyer
08:06 PM CephFS Tasks #64165 (In Progress): Fix warnings in read_sync()
Christopher Hoffman
07:51 PM teuthology Bug #65868: [Dependencies] Ansible Galaxy: 'CustomHTTPSConnection' object has no attribute 'cert_file'. 'CustomHTTPSConnection' object has no attribute 'cert_file'
PR that fixes this issue: https://github.com/ceph/teuthology/pull/1937 Kamoltat (Junior) Sirivadhna
07:50 PM teuthology Bug #65868 (New): [Dependencies] Ansible Galaxy: 'CustomHTTPSConnection' object has no attribute 'cert_file'. 'CustomHTTPSConnection' object has no attribute 'cert_file'
During docker-compose script.... Kamoltat (Junior) Sirivadhna
07:12 PM CephFS Bug #65618 (In Progress): qa: fsstress: cannot execute binary file: Exec format error
Looks to be related to the kclient and inline data. Patrick Donnelly
06:39 PM Ceph QA QA Run #65867 (QA Testing): wip-pdonnell-testing-20240508.183908-debug
* "PR #57334":https://github.com/ceph/ceph/pull/57334 -- mds: remove erroneous debug message
* "PR #57329":https://g...
Patrick Donnelly
05:52 PM Ceph QA QA Run #65594 (QA Approved): wip-yuriw11-testing-20240501.200505-squid
Laura Flores
05:32 PM RADOS Bug #63198: rados/thrash: AssertionError: wait_for_recovery: failed before timeout expired
/a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-smithi/7686989 Laura Flores
05:26 PM RADOS Bug #50371: Segmentation fault (core dumped) ceph_test_rados_api_watch_notify_pp
/a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-smithi/7686880 Laura Flores
05:18 PM CephFS Tasks #63295 (Resolved): Access semantics
SetPolicyNonDir test is now done. It passes as it should. Christopher Hoffman
04:47 PM rgw Backport #65821 (Resolved): squid: rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)
Casey Bodley
04:18 PM RADOS Bug #63789: LibRadosIoEC test failure
/a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-smithi/7687027 Laura Flores
03:16 PM rgw Backport #65244 (Resolved): squid: RGW/s3select : several issues, s3select related, some caused a crash.
Casey Bodley
03:15 PM rgw Backport #65666 (Resolved): squid: rgw/lc: A few buckets stuck in UNINITIAL state
Casey Bodley
03:13 PM Ceph QA QA Run #65859 (QA Testing): wip-lflores-testing-2-2024-05-07-1606-squid
Laura Flores
03:09 PM RADOS Bug #62934: unittest_osdmap (Subprocess aborted) during OSDMapTest.BUG_42485
seeing @unittest_osdmap (Subprocess aborted)@ failures on squid too, tagged for backport Casey Bodley
03:02 PM rgw Bug #65866 (New): reef: cannot build arrow with CMAKE_BUILD_TYPE=Debug
... Patrick Donnelly
01:09 PM CephFS Bug #63538 (Can't reproduce): mds: src/mds/Locker.cc: 2357: FAILED ceph_assert(!cap->is_new())
Patrick Donnelly
01:09 PM CephFS Bug #61950 (Can't reproduce): mds/OpenFileTable: match MAX_ITEMS_PER_OBJ does not honor osd_deep_scrub_large_omap_object_key_threshold
Patrick Donnelly
12:55 PM CephFS Feature #63468 (Pending Backport): mds/purgequeue: add l_pq_executed_ops counter
Patrick Donnelly
12:53 PM CephFS Feature #61903: pybind/mgr/volumes: add config to turn off subvolume deletion
Where is the fix that is under review? @rishabh-d-dave Patrick Donnelly
12:42 PM CephFS Bug #65865 (New): MDS/MDSMonitor: retval of mds fail cmd when non-existent MDS name is passed is zero
Passing a non-existent MDS's name to the command @ceph mds fail@ returns zero. It should be returning non-zero value ... Rishabh Dave
12:40 PM CephFS Bug #65864 (New): qa/cephfs: tests in TestMDSFail passes FS name to mds fail command
Rishabh Dave
12:31 PM CephFS Bug #60986: crash: void MDCache::rejoin_send_rejoins(): assert(auth >= 0)
These are the current actions tried:... Robert Sander
12:27 PM CephFS Bug #60986: crash: void MDCache::rejoin_send_rejoins(): assert(auth >= 0)
Can we please bump the severity a few levels?
There is loss of production as the MDS are currently not running. Ev...
Robert Sander
07:49 AM CephFS Bug #60986: crash: void MDCache::rejoin_send_rejoins(): assert(auth >= 0)
Robert Sander wrote in #note-6:
> Xiubo Li wrote in #note-4:
> > A new report from the ceph-user mail list: https:/...
Xiubo Li
11:22 AM Dashboard Bug #65863 (New): ceph-mixin - CephPGImbalance alert not honoring osd device class
CephPGImbalance alert expression... Andrew Mitroshin
11:08 AM CephFS Bug #65795 (Fix Under Review): cephfs_mirror: daemon status shows KeyError: 'directory_count'
Jos Collin
10:49 AM rgw Bug #65862 (Fix Under Review): rgw/cloud-transition: crash with notify->publish_commit
Soumya Koduri
10:47 AM rgw Bug #65862 (Fix Under Review): rgw/cloud-transition: crash with notify->publish_commit
Below crash is observed while running cloud-transition tests at scale -
Thread 594 "wp_thrd: 1, 0" received signal...
Soumya Koduri
10:04 AM rgw Bug #65861 (New): notifications: report an error when persistent queue deletion failed
assuming the topic deletion was successful, we cannot send an error if the queue deletion failed.
since any consequ...
Yuval Lifshitz
09:47 AM CephFS Bug #65705 (Fix Under Review): qa: snaptest-multiple-capsnaps.sh failure
The ceph patch link: https://patchwork.kernel.org/project/ceph-devel/list/?series=851489&archive=both Xiubo Li
09:25 AM CephFS Bug #64730 (Fix Under Review): fs/misc/multiple_rsync.sh workunit times out
Venky Shankar
09:25 AM Dashboard Bug #65788 (Resolved): mgr/dashboard: add prometheus federation config for mullti-cluster monitoring
Aashish Sharma
09:25 AM Dashboard Backport #65790 (Resolved): squid: mgr/dashboard: add prometheus federation config for mullti-cluster monitoring
Aashish Sharma
07:56 AM crimson Bug #65753 (Duplicate): [crimson] OSD deployment fails
Matan Breizman
07:53 AM crimson Bug #65857: osd: user_version is inconsistent between object_info and log entries
I suspect this is related to how we handle acting set changes which trigger `ClientRequest::Orderer::requeue`.
See: h...
Matan Breizman
07:32 AM CephFS Bug #65778: qa: valgrind error: Leak_StillReachable malloc malloc strdup
@vshankar How do I interpret this ?
I don't see any frame pointing to a ceph file apart from the initial finger poi...
Milind Changire
07:03 AM crimson Bug #65635: Crimson seastore unit test random failure on AARCH64 (DEADLYSIGNAL by caused by a READ memory access)
> Not sure if reactor stall related?
I think not, reactor stall is a warning from seastar if a continuation takes ...
Yingxin Cheng
06:46 AM crimson Bug #65635: Crimson seastore unit test random failure on AARCH64 (DEADLYSIGNAL by caused by a READ memory access)
Yingxin Cheng wrote in #note-6:
> > Indeed, but all about seastore.
>
> Seems correct, revised the title and cate...
Rongqi Sun
06:39 AM crimson Bug #65635: Crimson seastore unit test random failure on AARCH64 (DEADLYSIGNAL by caused by a READ memory access)
> Indeed, but all about seastore.
Seems correct, revised the title and category.
Yingxin Cheng
06:03 AM crimson Bug #65635: Crimson seastore unit test random failure on AARCH64 (DEADLYSIGNAL by caused by a READ memory access)
Yingxin Cheng wrote in #note-3:
> Seems it can fail anytime regardless of the specific crimson test.
>
> Examples...
Rongqi Sun
05:13 AM crimson Bug #65635: Crimson seastore unit test random failure on AARCH64 (DEADLYSIGNAL by caused by a READ memory access)
Seems it can fail anytime regardless of the specific crimson test.
Examples:...
Yingxin Cheng
02:32 AM crimson Bug #65635: Crimson seastore unit test random failure on AARCH64 (DEADLYSIGNAL by caused by a READ memory access)
Attach full make check log. Rongqi Sun
06:19 AM CephFS Bug #65841 (Fix Under Review): qa: dead job from `tasks.cephfs.test_admin.TestFSFail.test_with_health_warn_oversize_cache`
Venky Shankar
05:56 AM CephFS Bug #65770: qa: failed to be set on mds daemons: {'mds.imported', 'mds.exported'}
Jos, start by checking if the workload isn't heavy enough to trigger subtree export/import (which would then update t... Venky Shankar
03:29 AM teuthology Bug #64828 (In Progress): teuthology-suite: -n option does not sync --ceph/sha1 and --suite-branch/sha1
Aishwarya Mathuria
02:35 AM RADOS Bug #59670: Ceph status shows PG recovering when norecover flag is set
Radoslaw Zarzynski wrote in #note-5:
> The fix has been merged on 5 Jan 2024, so this could fit. It has been bacport...
Wes Dillingham
12:38 AM mgr Bug #65860 (New): Upgrade test re-opts into new telemetry collections too late
The test failed from:
/a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-sm...
Laura Flores
12:33 AM mgr Backport #65117 (Resolved): squid: rados/upgrade/parallel: [WRN] TELEMETRY_CHANGED: Telemetry requires re-opt-in
Laura Flores

05/07/2024

10:21 PM Ceph Bug #55859: Radosgw-admin: illegal instruction, running on commodity hardware
I recently updated my cluster (AMD Opteron 6134) from Pacific to Reef and ran into this, too. I believe the problem w... jae beller
09:10 PM Ceph QA QA Run #65859 (QA Building): wip-lflores-testing-2-2024-05-07-1606-squid
Laura Flores
09:09 PM Ceph QA QA Run #65859 (QA Testing): wip-lflores-testing-2-2024-05-07-1606-squid
* "PR #57303":https://github.com/ceph/ceph/pull/57303 -- squid:osd/scrub: reinstate scrub reservation queuing Laura Flores
09:08 PM CephFS Bug #65389 (Fix Under Review): The ceph_readdir function in libcephfs returns incorrect d_reclen value
Patrick Donnelly
08:48 PM CephFS Bug #65858 (New): ceph.in: make `ceph tell mds.<fsname>:<rank> help` give help output
Right now it gives an error:... Patrick Donnelly
08:12 PM crimson Bug #65857 (New): osd: user_version is inconsistent between object_info and log entries
Steps to reproduce:
- create vstart cluster...
Samuel Just
08:07 PM CephFS Backport #65854: quincy: mds: upgrade to MDS enforcing CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK with client having root_squash in any MDS cap causes eviction for all file systems the client has caps for
https://github.com/ceph/ceph/pull/54469#issuecomment-2099212048 Patrick Donnelly
07:49 PM CephFS Backport #65854 (New): quincy: mds: upgrade to MDS enforcing CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK with client having root_squash in any MDS cap causes eviction for all file systems the client has caps for
Backport Bot
08:04 PM Ceph QA QA Run #65856 (QA Testing): wip-pdonnell-testing-20240508.150423-reef
* "PR #57357":https://github.com/ceph/ceph/pull/57357 -- reef: ceph.spec.in: remove command-with-macro line
* "PR #5...
Patrick Donnelly
08:02 PM CephFS Backport #65855 (In Progress): reef: mds: upgrade to MDS enforcing CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK with client having root_squash in any MDS cap causes eviction for all file systems the client has caps for
Patrick Donnelly
07:49 PM CephFS Backport #65855 (In Progress): reef: mds: upgrade to MDS enforcing CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK with client having root_squash in any MDS cap causes eviction for all file systems the client has caps for
https://github.com/ceph/ceph/pull/57343 Backport Bot
07:59 PM CephFS Backport #65853 (In Progress): squid: mds: upgrade to MDS enforcing CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK with client having root_squash in any MDS cap causes eviction for all file systems the client has caps for
Patrick Donnelly
07:49 PM CephFS Backport #65853 (In Progress): squid: mds: upgrade to MDS enforcing CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK with client having root_squash in any MDS cap causes eviction for all file systems the client has caps for
https://github.com/ceph/ceph/pull/57342 Backport Bot
07:56 PM CephFS Backport #65844 (In Progress): squid: qa: Health detail: HEALTH_WARN Degraded data redundancy: 40/348 objects degraded (11.494%), 9 pgs degraded" in cluster log
Patrick Donnelly
02:41 PM CephFS Backport #65844 (In Progress): squid: qa: Health detail: HEALTH_WARN Degraded data redundancy: 40/348 objects degraded (11.494%), 9 pgs degraded" in cluster log
https://github.com/ceph/ceph/pull/57341 Backport Bot
07:53 PM CephFS Backport #65843 (In Progress): squid: qa: quiesce cache/ops dump not world readable
Patrick Donnelly
02:40 PM CephFS Backport #65843 (In Progress): squid: qa: quiesce cache/ops dump not world readable
https://github.com/ceph/ceph/pull/57340 Backport Bot
07:49 PM CephFS Bug #65733 (Pending Backport): mds: upgrade to MDS enforcing CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK with client having root_squash in any MDS cap causes eviction for all file systems the client has caps for
Patrick Donnelly
07:47 PM RADOS Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
/a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-smithi/7686875 Laura Flores
07:45 PM Ceph Bug #65852: ceph_test_rados command hits ceph_abort when trying to delete op
Looks similar to https://tracker.ceph.com/issues/48764 Laura Flores
07:45 PM Ceph Bug #65852 (New): ceph_test_rados command hits ceph_abort when trying to delete op
/a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-smithi/7686996... Laura Flores
05:28 PM CephFS Bug #65851 (New): MDS Squid Metadata Performance Regression
Found during 21 MDS IO500 runs comparing v18.2.2 to v19.0.0.
| Ceph Version | Measurement | v1...
Mark Nelson
05:08 PM rgw-testing Backport #65850 (New): reef: notifications: test hangs when http notification fails
Backport Bot
05:08 PM rgw-testing Backport #65849 (New): squid: notifications: test hangs when http notification fails
Backport Bot
05:06 PM rgw-testing Bug #65848 (Pending Backport): notifications: test hangs when http notification fails
Yuval Lifshitz
05:06 PM rgw-testing Bug #65848 (Pending Backport): notifications: test hangs when http notification fails
this is a regression caused by the fix of: https://tracker.ceph.com/issues/63909
so, before doing the backport to sq...
Yuval Lifshitz
04:31 PM Orchestrator Bug #65017: cephadm: log_channel(cephadm) log [ERR] : Failed to connect to smithi090 (10.0.0.9). Permission denied
/a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-smithi/7686876 Laura Flores
03:44 PM sepia Support #65847 (New): Sepia Lab Access Request
1) Do you just need VPN access or will you also be running teuthology jobs?
VPN access only
2) Desired Username:
...
Omer Goshen
03:31 PM CephFS Bug #60986: crash: void MDCache::rejoin_send_rejoins(): assert(auth >= 0)
While trying to export the journal the following error shows up:... Robert Sander
07:56 AM CephFS Bug #60986: crash: void MDCache::rejoin_send_rejoins(): assert(auth >= 0)

Xiubo Li wrote in #note-4:
> A new report from the ceph-user mail list: https://lists.ceph.io/hyperkitty/list/ceph...
Robert Sander
12:42 AM CephFS Bug #60986: crash: void MDCache::rejoin_send_rejoins(): assert(auth >= 0)
Another one https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/NDWFYV5XFDCUW5EBRWXEDQFGVFL5HAIV/:
<pr...
Xiubo Li
12:37 AM CephFS Bug #60986: crash: void MDCache::rejoin_send_rejoins(): assert(auth >= 0)
A new report from the ceph-user mail list: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/GOAZLA6NQH... Xiubo Li
03:12 PM rgw Backport #64425 (Resolved): quincy: rgw: rados objects wrongly deleted
Casey Bodley
03:11 PM rgw Backport #64539 (Resolved): quincy: metadata cache races on deletes
Casey Bodley
03:11 PM rgw Backport #64599 (Resolved): quincy: unittest_rgw_dmclock_scheduler fails for arm64
Casey Bodley
03:08 PM rgw Feature #65050 (Fix Under Review): Add alternative way for providing user name/password for Kafka endpoint authentication
Casey Bodley
03:07 PM CephFS Bug #65846 (Fix Under Review): mds: "invalid message type: 501"
Patrick Donnelly
02:51 PM CephFS Bug #65846 (Fix Under Review): mds: "invalid message type: 501"
... Patrick Donnelly
02:41 PM CephFS Backport #65845 (New): reef: qa: Health detail: HEALTH_WARN Degraded data redundancy: 40/348 objects degraded (11.494%), 9 pgs degraded" in cluster log
Backport Bot
02:40 PM crimson Bug #65842: unittest-seastore (Failed) on arm64
similar arm64 failure from @unittest-staged-fltree@ in https://jenkins.ceph.com/job/ceph-pull-requests-arm64/55826/co... Casey Bodley
02:36 PM crimson Bug #65842 (New): unittest-seastore (Failed) on arm64
from https://jenkins.ceph.com/job/ceph-pull-requests-arm64/56090/consoleFull#772176351e840cee4-f4a4-4183-81dd-4285561... Casey Bodley
02:34 PM CephFS Bug #65701 (Pending Backport): qa: quiesce cache/ops dump not world readable
Patrick Donnelly
02:34 PM CephFS Bug #65700 (Pending Backport): qa: Health detail: HEALTH_WARN Degraded data redundancy: 40/348 objects degraded (11.494%), 9 pgs degraded" in cluster log
Patrick Donnelly
02:19 PM rgw Bug #65664 (Fix Under Review): Crash observed in boost::asio module related to stream.async_shutdown()
Casey Bodley
05:57 AM rgw Bug #65664: Crash observed in boost::asio module related to stream.async_shutdown()
Updating:
Managed to repro the crash repeatedly and verify that the fix PR does resolve the issue.
Details:
Befo...
Mark Kogan
02:15 PM Ceph QA QA Run #65771 (QA Approved): wip-pdonnell-testing-20240503.163550-debug
https://tracker.ceph.com/projects/cephfs/wiki/Main#2024-05-03wip-pdonnell-testing-20240503163550-debug Patrick Donnelly
02:06 PM CephFS Bug #65802 (In Progress): Quiesce and rename aren't properly syncrhonized
Leonid Usov
11:07 AM CephFS Bug #65802: Quiesce and rename aren't properly syncrhonized
Update: having implemented the above I realized that it's just an optimization. The real issue we had was due to the ... Leonid Usov
10:18 AM CephFS Bug #65802: Quiesce and rename aren't properly syncrhonized
With the help of @kotresh we have the picture of the deadlock:
1. the dest auth mds xlocks the linklock on both th...
Leonid Usov
02:00 PM rgw Bug #59488 (Fix Under Review): [RGW][Notification][Kafka]: event name received as "Noncurrent" instead of "NonCurrent"
Casey Bodley
01:29 PM CephFS Bug #65020: qa: Scrub error on inode 0x1000000356c (/volumes/qa/sv_0/2f8f6bb4-3ea9-47a0-bd79-a0f50dc149d5/client.0/tmp/clients/client7/~dmtmp/PARADOX) see mds.b log and `damage ls` output for details" in cluster log
Milind Changire wrote in #note-6:
> Venky Shankar wrote in #note-5:
> > Isn't this same as: https://tracker.ceph.co...
Milind Changire
01:24 PM CephFS Bug #65020: qa: Scrub error on inode 0x1000000356c (/volumes/qa/sv_0/2f8f6bb4-3ea9-47a0-bd79-a0f50dc149d5/client.0/tmp/clients/client7/~dmtmp/PARADOX) see mds.b log and `damage ls` output for details" in cluster log
Patrick Donnelly wrote:
> https://pulpito.ceph.com/pdonnell-2024-03-20_18:16:52-fs-wip-batrick-testing-20240320.1457...
Milind Changire
01:23 PM CephFS Bug #65020: qa: Scrub error on inode 0x1000000356c (/volumes/qa/sv_0/2f8f6bb4-3ea9-47a0-bd79-a0f50dc149d5/client.0/tmp/clients/client7/~dmtmp/PARADOX) see mds.b log and `damage ls` output for details" in cluster log
Venky Shankar wrote in #note-5:
> Isn't this same as: https://tracker.ceph.com/issues/48562 ?
"object missing on ...
Milind Changire
01:00 PM CephFS Bug #65020: qa: Scrub error on inode 0x1000000356c (/volumes/qa/sv_0/2f8f6bb4-3ea9-47a0-bd79-a0f50dc149d5/client.0/tmp/clients/client7/~dmtmp/PARADOX) see mds.b log and `damage ls` output for details" in cluster log
Isn't this same as: https://tracker.ceph.com/issues/48562 ? Venky Shankar
11:17 AM CephFS Bug #65020: qa: Scrub error on inode 0x1000000356c (/volumes/qa/sv_0/2f8f6bb4-3ea9-47a0-bd79-a0f50dc149d5/client.0/tmp/clients/client7/~dmtmp/PARADOX) see mds.b log and `damage ls` output for details" in cluster log
Patrick Donnelly wrote in #note-1:
> Maybe also related: https://pulpito.ceph.com/pdonnell-2024-03-20_18:16:52-fs-wi...
Milind Changire
12:50 PM rgw Bug #65794: Ceph Reef RGW error response fails to be parsed during awscli create-bucket
Yep, this is behaviour of boto, it parses xml response with... Peter Razumovsky
12:16 PM CephFS Bug #65841 (Fix Under Review): qa: dead job from `tasks.cephfs.test_admin.TestFSFail.test_with_health_warn_oversize_cache`
/teuthology/pdonnell-2024-05-07_01:13:22-fs-wip-pdonnell-testing-20240503.163550-debug-distro-default-smithi/7695097/... Patrick Donnelly
12:14 PM Dashboard Backport #65840 (New): reef: mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
Aashish Sharma
12:12 PM Dashboard Backport #65839 (New): reef: mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
Aashish Sharma
12:12 PM Dashboard Backport #65838 (New): squid: mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
Aashish Sharma
12:10 PM CephFS Bug #65837 (Fix Under Review): qa: dead job from waiting to unmount client on deliberately damaged fs
Patrick Donnelly
12:09 PM CephFS Bug #65837 (Fix Under Review): qa: dead job from waiting to unmount client on deliberately damaged fs
https://pulpito.ceph.com/pdonnell-2024-05-07_01:13:22-fs-wip-pdonnell-testing-20240503.163550-debug-distro-default-sm... Patrick Donnelly
11:59 AM CephFS Bug #65616: pybind/mgr/snap_schedule: 1m scheduled snaps not reliably executed (RuntimeError: The following counters failed to be set on mds daemons: {'mds_server.req_rmsnap_latency.avgcount'})
Milind Changire wrote in #note-4:
> @pdonnell Do you want me to check the continuity in the timestamps in the snap ...
Patrick Donnelly
11:06 AM CephFS Bug #65616: pybind/mgr/snap_schedule: 1m scheduled snaps not reliably executed (RuntimeError: The following counters failed to be set on mds daemons: {'mds_server.req_rmsnap_latency.avgcount'})
@pdonnell Do you want me to check the continuity in the timestamps in the snap dir names? Milind Changire
11:46 AM mgr Bug #65836 (New): ceph-mgr cephadm's service discovery not starting
Hello !
I've seen the https://tracker.ceph.com/issues/63388 issue, and I'm currently facing something a bit simila...
Cyril Duval
10:55 AM CephFS Bug #65580: mds/client: add dummy client feature to test client eviction
Dhairya Parmar wrote in #note-8:
> Venky Shankar wrote in #note-7:
> > Dhairya Parmar wrote in #note-6:
> > > Venk...
Venky Shankar
10:54 AM Linux kernel client Bug #65835 (New): File reads/writes hang during ceph_llseek with misrouted OSD
Using v17.2.7 with linux kernel version 5.15.0-105-generic, we've been having issues with rsync hanging on writes, an... Samuel Wein
10:38 AM Ceph Bug #65834 (New): reef: cephadm: ceph-common package installation with cephadm fails due to the activation of OracleLinux EPEL repository
Installation of ceph-common package on Oracle Linux 9 with cephadm fails due to the activation of the OracleLinux EPE... Philippe Bidault
10:36 AM Ceph Bug #65833 (New): No binaries for reef for el9 / aarch64
Hi, I've noticed that no binaries appear to have been released for reef for el9: https://download.ceph.com/rpm-reef/e... Dan Whitehouse
10:23 AM CephFS Bug #65829: qa: qa/suites/fs/functional/subvol_versions/ multiplies all jobs in fs:function by 2
@pdonnell I understood the reorg bit up to v1,v2 ... but not the last "test" part. What's the "test" part ? Milind Changire
12:17 AM CephFS Bug #65829 (New): qa: qa/suites/fs/functional/subvol_versions/ multiplies all jobs in fs:function by 2
This change:
https://github.com/ceph/ceph/pull/53999/files#diff-e00804e3b70b5d89f530c963e9dfa38f43587ae6be9d94687d...
Patrick Donnelly
10:04 AM crimson Bug #65832 (New): crimson osd clone_overlap calculate error
... Xuehan Xu
09:56 AM Dashboard Bug #63686 (In Progress): mgr/dashboard: adapt service creation form to support nvmeof creation
Afreen Misbah
09:36 AM sepia Support #65831 (New): Sepia Lab access
1) Do you just need VPN access or will you also be running teuthology jobs?
I need VPN access and will also be run...
Harsh Kumar
09:19 AM rgw Feature #65830 (New): rgw: allow send bucket notification to multiple brokers of kafka cluster
Currently, rgw can send messages to only one Kafka node.
Add a parameter to configure the broker list and support send mes...
Hoai-Thu Vuong
09:04 AM crimson Bug #65752: [crimson] OSD deployment fails

> Hi Matan,
>
> Thank you for your response!
> Have these changes been introduced recently?
>
> Asking becau...
Matan Breizman
08:59 AM crimson Bug #65752: [crimson] OSD deployment fails
Matan Breizman wrote in #note-1:
> Hey Harsh,
> Looks like the OSD crashes in "crimson::os::AlienStore::start()".
...
Harsh Kumar
08:46 AM crimson Bug #65752 (Need More Info): [crimson] OSD deployment fails
Hey Harsh,
Looks like the OSD crashes in "crimson::os::AlienStore::start()".
I suspect this is about missing essent...
Matan Breizman
08:41 AM rbd Backport #65814 (In Progress): squid: [pybind] expose CLONE_FORMAT and FLATTEN image options
Ilya Dryomov
08:39 AM rbd Backport #65816 (In Progress): reef: [pybind] expose CLONE_FORMAT and FLATTEN image options
Ilya Dryomov
08:39 AM Ceph QA QA Run #65793: wip-rishabh-testing-20240503.134948
There are more than 100 failures related to cephadm. These failures occurred again in re-run which confirms this issu... Rishabh Dave
08:33 AM rbd Backport #65815 (In Progress): quincy: [pybind] expose CLONE_FORMAT and FLATTEN image options
Ilya Dryomov
08:31 AM Ceph QA QA Run #65792: wip-rishabh-testing-20240501.193033
There are plenty of related failures. This PR needs to be fixed and tested again. It'll take a new build for it and therefo... Rishabh Dave
08:31 AM Ceph QA QA Run #65792 (QA Closed): wip-rishabh-testing-20240501.193033
Rishabh Dave
08:29 AM crimson Bug #65635: Crimson seastore unit test random failure on AARCH64 (DEADLYSIGNAL by caused by a READ memory access)
https://jenkins.ceph.com/job/ceph-pull-requests-arm64/56052/consoleFull#-14170402636733401c-e9d0-4737-9832-6594c5da0a... Rongqi Sun
08:09 AM rbd Backport #65817 (In Progress): squid: rbd-mirror daemon in ERROR state, require manual restart
Ilya Dryomov
08:08 AM rbd Backport #65819 (In Progress): reef: rbd-mirror daemon in ERROR state, require manual restart
Ilya Dryomov
08:06 AM rbd Backport #65818 (In Progress): quincy: rbd-mirror daemon in ERROR state, require manual restart
Ilya Dryomov
07:59 AM CephFS Bug #65388: The MDS_SLOW_REQUEST warning is flapping even though the slow requests don't go away
Venky, no, not yet. I haven't gotten back to this with the quiesce work that keeps coming. I'll try to continue where... Leonid Usov
07:18 AM Ceph Bug #63494 (Pending Backport): all: daemonizing may release CephContext:: _fork_watchers_lock when its already unlocked
Venky Shankar
07:16 AM ceph-volume Bug #64260 (Resolved): ceph-volume lvm migrate could assert on AttributeError:'NoneType' object has no attribute 'path'
Guillaume Abrioux
07:16 AM ceph-volume Backport #64356 (Resolved): quincy: ceph-volume lvm migrate could assert on AttributeError:'NoneType' object has no attribute 'path'
Guillaume Abrioux
06:55 AM rgw Bug #65436: Getting Object Crashing radosgw services
Reid Guyett wrote in #note-8:
> What did you do to fix it at the proxy layer? Strip the parameters from the URL?
...
hoan nv
06:25 AM CephFS Bug #64209 (Duplicate): snaptest-multiple-capsnaps.sh fails with "got remote process result: 1"
This is the same issue as https://tracker.ceph.com/issues/65705. Xiubo Li
05:47 AM CephFS Bug #65705: qa: snaptest-multiple-capsnaps.sh failure
Venky Shankar wrote in #note-4:
> Xiubo Li wrote in #note-3:
> > Venky Shankar wrote in #note-2:
> > > Xiubo, this...
Xiubo Li
04:58 AM CephFS Bug #65705: qa: snaptest-multiple-capsnaps.sh failure
Xiubo Li wrote in #note-3:
> Venky Shankar wrote in #note-2:
> > Xiubo, this is using the distro kernel. Maybe the ...
Venky Shankar
04:01 AM CephFS Bug #65705: qa: snaptest-multiple-capsnaps.sh failure
Venky Shankar wrote in #note-2:
> Xiubo, this is using the distro kernel. Maybe the relevant kclient fixes haven't ye...
Xiubo Li
03:51 AM CephFS Bug #65705: qa: snaptest-multiple-capsnaps.sh failure
Xiubo, this is using the distro kernel. Maybe the relevant kclient fixes haven't yet landed in the distro kernel? Venky Shankar
04:38 AM Dashboard Bug #64321 (Pending Backport): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
Aashish Sharma
04:38 AM Dashboard Bug #64321 (Fix Under Review): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
Aashish Sharma
04:37 AM RADOS Bug #65768: rados/verify: Health check failed: 1 osds down (OSD_DOWN)" in cluster log
@rzarzynski I found this during a review of a squid run that included a couple of my PRs.
I wasn't working on this, ...
Sridhar Seshasayee
04:26 AM RADOS Bug #65737 (Fix Under Review): pg-split-merge.sh -
Nitzan Mordechai
04:24 AM RADOS Bug #65737: pg-split-merge.sh -
Radoslaw Zarzynski wrote in #note-1:
> Hi Nitzan! Are you working on this tracker maybe?
yes, I'm researching it.
Nitzan Mordechai

05/06/2024

11:02 PM rgw Bug #65828 (New): radosgw process killed with "Out of memory" while executing query "select * from s3object limit 1" on a 12GB parquet file

(copied from https://bugzilla.redhat.com/show_bug.cgi?id=2275323)
Description of problem:
radosgw process kill...
Gal Salomon
10:35 PM rgw Bug #65794: Ceph Reef RGW error response fails to be parsed during awscli create-bucket
Peter Razumovsky wrote:
>  It seems this is due to RGW starts returning message tag with empty body since Ceph Reef:...
Casey Bodley
10:26 PM rgw Backport #59614 (Resolved): reef: s3 error response missing Message field
Casey Bodley
09:32 PM RADOS Bug #56770: crash: void OSDShard::register_and_wake_split_child(PG*): assert(p != pg_slots.end())
/a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-smithi/7686929 Laura Flores
09:28 PM Orchestrator Bug #65732: rados/cephadm/osds: job times out during nvme_loop interval
/a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-smithi/7687095 Laura Flores
08:56 PM RADOS Bug #65749: osd_max_pg_per_osd_hard_ratio 3 is set too low for real life
Radoslaw Zarzynski wrote in #note-4:
> Is there any trace of autoscaler-induced PG splitting visible during the situ...
Dan van der Ster
07:32 PM RADOS Bug #65749: osd_max_pg_per_osd_hard_ratio 3 is set too low for real life
Is there any trace of autoscaler-induced PG splitting visible during the situation? Radoslaw Zarzynski
08:56 PM rgw Bug #59380: rados/singleton-nomsgr: test failing from "Health check failed: 1 full osd(s) (OSD_FULL)" and "Health check failed: 1 filesystem is offline (MDS_ALL_DOWN)"
/a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-smithi/7687043 Laura Flores
07:36 PM Orchestrator Bug #65827 (New): qa/tasks/cephadm: logrotation is not done every 15 minutes as in the ceph.py task
This can lead to massive logs which are not compressed until the very end of the test:... Patrick Donnelly
07:21 PM Ceph QA QA Run #65349 (QA Closed): wip-yuri3-testing-2024-04-05-0825
Yuri Weinstein
07:21 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
@ksirivad also pls add a link into the PRS next time
like in https://github.com/ceph/ceph/pull/56515#issuecomment-...
Yuri Weinstein
07:11 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
@ksirivad pls assign to me and/or change the status to "QA Approved" in the future, TIA Yuri Weinstein
06:52 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
RADOS APPROVED https://tracker.ceph.com/projects/rados/wiki/MAIN#httpstrackercephcomissues65349
@yuriw
Kamoltat (Junior) Sirivadhna
06:59 PM CephFS Tasks #64164 (In Progress): verify st_blocks is correct
It appears the logic the #warning is addressing matches the behavior of non-fscrypt directories. This logic has been here since ... Christopher Hoffman
06:56 PM RADOS Bug #65737: pg-split-merge.sh -
Hi Nitzan! Are you working on this tracker maybe? Radoslaw Zarzynski
06:47 PM RADOS Bug #65826 (New): test_default_progress_test (tasks.mgr.test_progress.TestProgress) remove_pool assert pool_name in self.pool
/a/yuriw-2024-05-01_22:15:10-rados-wip-yuri3-testing-2024-04-05-0825-distro-default-smithi/7684757/... Kamoltat (Junior) Sirivadhna
06:46 PM RADOS Bug #53544: src/test/osd/RadosModel.h: ceph_abort_msg("racing read got wrong version") in thrash_cache_writeback_proxy_none tests
Cache tiering is unsupported. Radoslaw Zarzynski
06:44 PM RADOS Bug #65765: squid: rados/test.sh: LibRadosWatchNotifyECPP.WatchNotify test of api_watch_notify_pp suite didn't complete.
Hi Nitzan! Would you mind taking a look? Radoslaw Zarzynski
06:42 PM RADOS Bug #65686: ECBackend doesn't pass CEPH_OSD_OP_FLAG_BYPASS_CLEAN_CACHE flag when scrubbing
Thanks for finding it, Mohit! Radoslaw Zarzynski
06:26 PM RADOS Bug #65825 (New): test_python.sh TestIoctx.test_locator failes: Classic

/a/yuriw-2024-05-01_22:15:10-rados-wip-yuri3-testing-2024-04-05-0825-distro-default-smithi/7684635...
Kamoltat (Junior) Sirivadhna
06:14 PM Orchestrator Bug #65824 (New): rados/thrash-old-clients: cluster [WRN] Health detail: HEALTH_WARN noscrub flag(s) set" in cluster log
/a/yuriw-2024-05-01_22:15:10-rados-wip-yuri3-testing-2024-04-05-0825-distro-default-smithi/7684599/
Just need to w...
Kamoltat (Junior) Sirivadhna
06:05 PM CephFS Bug #65823 (Fix Under Review): qa/tasks/quiescer: dump ops in parallel
Patrick Donnelly
06:03 PM CephFS Bug #65823 (Fix Under Review): qa/tasks/quiescer: dump ops in parallel
Since this --flags=locks takes the mds_lock and dumps thousands of ops, this
may take a long time to complete for ea...
Patrick Donnelly
06:00 PM RADOS Bug #65768: rados/verify: Health check failed: 1 osds down (OSD_DOWN)" in cluster log
Sridhar, are you working on this? Radoslaw Zarzynski
05:59 PM RADOS Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
IIRC Ronen has a hypothesis this is another incarnation of https://tracker.ceph.com/issues/65185.
The has been merge...
Radoslaw Zarzynski
05:54 PM RADOS Bug #63198 (In Progress): rados/thrash: AssertionError: wait_for_recovery: failed before timeout expired
Radoslaw Zarzynski
05:50 PM RADOS Bug #55750: mon: slow request of very long time
Hi Dan! Do you have the corresponding mon's log by any chance? Radoslaw Zarzynski
05:50 PM RADOS Bug #55750: mon: slow request of very long time
... Radoslaw Zarzynski
05:49 PM CephFS Tasks #65811 (Resolved): Make dbench work on fscrypt
dbench completes successfully without any errors. Christopher Hoffman
01:51 PM CephFS Tasks #65811 (Resolved): Make dbench work on fscrypt
Ensure dbench will work on top of fuse w/fscrypt. Christopher Hoffman
05:34 PM rgw Backport #65822 (In Progress): reef: rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)
Casey Bodley
05:19 PM rgw Backport #65822 (In Progress): reef: rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)
https://github.com/ceph/ceph/pull/57301 Casey Bodley
05:31 PM CephFS Tasks #65812 (Resolved): pwrite failure on overwrite
The issue was a fix I did in an earlier commit. The reproducer should do the read in the start block. The bool for st... Christopher Hoffman
02:00 PM CephFS Tasks #65812 (Resolved): pwrite failure on overwrite
Failure on pwrite on overwrite when end of write is past previous end of file.
Steps to reproduce:...
Christopher Hoffman
05:21 PM rgw Backport #65821 (In Progress): squid: rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)
Casey Bodley
05:19 PM rgw Backport #65821 (Resolved): squid: rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)
https://github.com/ceph/ceph/pull/57300 Casey Bodley
05:19 PM rgw Bug #65746 (Pending Backport): rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)
Casey Bodley
05:12 PM mgr Tasks #47108 (Resolved): mgr/restful: Document deprecation of the restful module in favor of the Ceph Dashboard REST API in the restful documentation
Done in the parent task: https://tracker.ceph.com/issues/47066 Ernesto Puerta
05:11 PM mgr Tasks #47067 (Resolved): mgr/restful: communicate the deprecation of the restful module in favor of the Ceph Dashboard REST API
Ernesto Puerta
05:10 PM mgr Tasks #47066 (Fix Under Review): mgr/restful: Deprecate the "restful" module in favor of the Ceph Dashboard REST API
Ernesto Puerta
04:53 PM CephFS Bug #65820 (New): qa/tasks/fwd_scrub: Traceback in teuthology.log for normal exit condition
... Patrick Donnelly
04:40 PM rgw Bug #65436: Getting Object Crashing radosgw services
What did you do to fix it at the proxy layer? Strip the parameters from the URL? Reid Guyett
03:45 PM rbd Backport #65819 (In Progress): reef: rbd-mirror daemon in ERROR state, require manual restart
https://github.com/ceph/ceph/pull/57306 Backport Bot
03:45 PM rbd Backport #65818 (In Progress): quincy: rbd-mirror daemon in ERROR state, require manual restart
https://github.com/ceph/ceph/pull/57305 Backport Bot
03:45 PM rbd Backport #65817 (In Progress): squid: rbd-mirror daemon in ERROR state, require manual restart
https://github.com/ceph/ceph/pull/57307 Backport Bot
03:44 PM rbd Backport #65816 (In Progress): reef: [pybind] expose CLONE_FORMAT and FLATTEN image options
https://github.com/ceph/ceph/pull/57309 Backport Bot
03:44 PM rbd Backport #65815 (In Progress): quincy: [pybind] expose CLONE_FORMAT and FLATTEN image options
https://github.com/ceph/ceph/pull/57308 Backport Bot
03:44 PM rbd Backport #65814 (In Progress): squid: [pybind] expose CLONE_FORMAT and FLATTEN image options
https://github.com/ceph/ceph/pull/57310 Backport Bot
03:44 PM rbd Feature #65624 (Pending Backport): [pybind] expose CLONE_FORMAT and FLATTEN image options
Ilya Dryomov
03:42 PM rbd Bug #65487 (Pending Backport): rbd-mirror daemon in ERROR state, require manual restart
Ilya Dryomov
03:33 PM rbd Bug #65813 (New): [test] fsx can call posix_memalign() with size == 0
While legal, it's specified as implementation-defined:
> If the size of the space requested is 0, the behavior is im...
Ilya Dryomov
03:09 PM Ceph QA QA Run #65454: wip-vshankar-testing-20240411.061452
Dropped a couple of offending PRs and one more that got merged by another dev. Venky Shankar
02:41 PM RADOS Bug #58461 (Closed): osd/scrub: replica-response timeout is handled without locking the PG
Ronen Friedman
02:41 PM RADOS Bug #58461: osd/scrub: replica-response timeout is handled without locking the PG
49687 was never merged. Instead, the whole timeout implementation was discarded (Squid) Ronen Friedman
02:38 PM RADOS Bug #61457 (Can't reproduce): PgScrubber: shard blocked on an object for too long
Ronen Friedman
02:36 PM RADOS Backport #63370 (Resolved): quincy: use-after-move in OSDService::build_incremental_map_msg()
Ronen Friedman
02:35 PM RADOS Bug #63310 (Resolved): use-after-move in OSDService::build_incremental_map_msg()
Ronen Friedman
02:32 PM RADOS Bug #63509 (Resolved): osd/scrub: some replica states specified incorrectly
Ronen Friedman
02:31 PM RADOS Bug #63509: osd/scrub: some replica states specified incorrectly
54460 was made obsolete by https://github.com/ceph/ceph/pull/54482.
Ronen Friedman
02:29 PM RADOS Bug #64346 (In Progress): TEST_dump_scrub_schedule fails from "key is query_is_future: negation:0 # expected: false # in actual: true"
A test script error. In progress Ronen Friedman
02:27 PM RADOS Bug #64972 (Resolved): qa: "ceph tell 4.3a deep-scrub" command not found
Ronen Friedman
02:25 PM RADOS Backport #65072 (Resolved): squid: rados/thrash: slow reservation response from 1 (115547ms) in cluster log
Ronen Friedman
02:24 PM RADOS Backport #65374 (Resolved): squid: qa: "ceph tell 4.3a deep-scrub" command not found
Ronen Friedman
02:24 PM Dashboard Bug #64321 (Pending Backport): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
Pedro González Gómez
02:23 PM Dashboard Bug #64321 (Resolved): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
Pedro González Gómez
12:08 PM Dashboard Bug #64321 (Pending Backport): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
Pedro González Gómez
12:08 PM Dashboard Bug #64321 (New): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
Pedro González Gómez
11:06 AM Dashboard Bug #64321 (Pending Backport): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
Aashish Sharma
11:06 AM Dashboard Bug #64321 (Fix Under Review): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
Aashish Sharma
08:46 AM Dashboard Bug #64321 (Pending Backport): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
Aashish Sharma
08:46 AM Dashboard Bug #64321 (Fix Under Review): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
Aashish Sharma
02:23 PM RADOS Backport #65646 (Resolved): squid: osd/scrub: must disable reservation timeout for reserver-based requests
Ronen Friedman
02:09 PM Ceph QA QA Run #65688: wip-yuri4-testing-2024-04-29-0642
@lflores will your PR address these? Yuri Weinstein
01:05 PM Ceph QA QA Run #65688 (QA Needs Rerun/Rebuilt): wip-yuri4-testing-2024-04-29-0642
@yuriw There is still a high number of failures (26 on RADOS)
related to the infra issue mentioned in #note-3 on...
Sridhar Seshasayee
01:51 PM Orchestrator Bug #65810 (New): mgr/cephadm: update PROMETHEUS_API_HOST if prometheus fails over to another node
We need to update the PROMETHEUS_API_HOST in dashboard config if prometheus fails over to another node. This process ... Aashish Sharma
01:46 PM Ceph Feature #64335 (Resolved): Add alerts to ceph monitoring stack for the nvmeof gateways
Aashish Sharma
01:46 PM Ceph Backport #65539 (Resolved): squid: Add alerts to ceph monitoring stack for the nvmeof gateways
Aashish Sharma
01:45 PM Ceph Backport #65540 (Resolved): reef: Add alerts to ceph monitoring stack for the nvmeof gateways
Aashish Sharma
01:16 PM CephFS Bug #65580: mds/client: add dummy client feature to test client eviction
Venky Shankar wrote in #note-7:
> Dhairya Parmar wrote in #note-6:
> > Venky Shankar wrote in #note-5:
> > > Dhair...
Dhairya Parmar
12:48 PM CephFS Bug #65580: mds/client: add dummy client feature to test client eviction
Dhairya Parmar wrote in #note-6:
> Venky Shankar wrote in #note-5:
> > Dhairya Parmar wrote in #note-4:
> > > @vsh...
Venky Shankar
12:38 PM CephFS Bug #65580: mds/client: add dummy client feature to test client eviction
Venky Shankar wrote in #note-5:
> Dhairya Parmar wrote in #note-4:
> > @vshankar @patrick can you update this track...
Dhairya Parmar
09:31 AM CephFS Bug #65580: mds/client: add dummy client feature to test client eviction
Dhairya Parmar wrote in #note-4:
> @vshankar @patrick can you update this tracker with the discussion you guys had o...
Venky Shankar
01:10 PM CephFS Bug #65783 (Duplicate): qa: cluster [WRN] Health detail: HEALTH_WARN 1 osds down; Degraded data redundancy
Duplicate of https://tracker.ceph.com/issues/65700 Venky Shankar
01:09 PM CephFS Bug #65803: mds: some asok commands wait with asok thread blocked
The actual issue is that the `flush journal` command is synchronous: it's blocking the admin socket thread in the mds:
...
Patrick Donnelly
01:09 PM CephFS Bug #65777 (Duplicate): qa: error during scrub thrashing
Venky Shankar
01:06 PM CephFS Bug #65782: qa: test_flag_scrub_mdsdir (tasks.cephfs.test_scrub_checks.TestScrubChecks)
This does not show up in the main branch run. The test branch was testing the uninline-data[0] feature.
[0]: https://githu...
Venky Shankar
01:06 PM CephFS Bug #65780: qa: valgrind error: Leak_StillReachable calloc calloc _dl_check_map_versions
Duplicate of 65779 Kotresh Hiremath Ravishankar
01:05 PM CephFS Bug #65780 (Duplicate): qa: valgrind error: Leak_StillReachable calloc calloc _dl_check_map_versions
Kotresh Hiremath Ravishankar
01:02 PM Orchestrator Bug #65809 (New): cephadm: ignore NVMEoF daemon after mons in staggered upgrade
This PR https://github.com/ceph/ceph/pull/54671 is going to make the NVMEoF daemon dependent on the mons. That means ... Adam King
12:53 PM CephFS Bug #65801 (Triaged): mgr/snap_schedule: restrict retention spec multiplier set
Venky Shankar
12:50 PM CephFS Bug #65808 (New): Test failure: test_idem_unaffected_root_squash (tasks.cephfs.test_admin.TestFsAuthorizeUpdate)
... Rishabh Dave
12:49 PM CephFS Bug #65781: qa: ceph version 17.2.7-917.gd69ee407 was not installed, found 17.2.7-913.g8c431824.el8.
How is this cephfs related, Milind? Venky Shankar
12:44 PM CephFS Feature #65637: mds: continue sending heartbeats during recovery when MDS journal is large
I'm taking this one (since I already own https://tracker.ceph.com/issues/61863) Venky Shankar
12:18 PM CephFS Bug #65388: The MDS_SLOW_REQUEST warning is flapping even though the slow requests don't go away
Leonid, any updates on this? Venky Shankar
12:17 PM CephFS Bug #65604 (Triaged): dbench.sh workload times out after 3h when run with-quiescer
Venky Shankar
12:15 PM CephFS Bug #65604: dbench.sh workload times out after 3h when run with-quiescer
Venky Shankar wrote in #note-2:
> There is already a tracker for this. Will dig it up and link.
I guess the tracker was...
Venky Shankar
10:18 AM CephFS Bug #65807 (New): qa failure: test_adding_multiple_caps (tasks.cephfs.test_admin.TestFsAuthorize)
From https://pulpito.ceph.com/vshankar-2024-05-01_17:34:00-fs-wip-vshankar-testing-20240430.111407-debug-testing-defa... Rishabh Dave
09:25 AM Dashboard Backport #65758 (Resolved): squid: mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
Pedro González Gómez
09:24 AM Dashboard Backport #65789 (Resolved): reef: mgr/dashboard: add prometheus federation config for mullti-cluster monitoring
Pedro González Gómez
08:40 AM crimson Bug #65806 (New): IO hangs when issuing balanced/localized reads to replica crimson osds while the pg is still peering
The IO request in the following log never ends because it's waiting for the pg to be active, but the current... Xuehan Xu
07:27 AM Ceph QA QA Run #65680: wip-mchangir-testing-20240429.064231-main-debug
Milind Changire wrote in #note-2:
> 41 Jobs Failed out of 206
> Most of them are known (documented issues).
> Few ...
Venky Shankar
07:25 AM CephFS Bug #65766: qa: perm denied for runing find on cephtest dir
The error starts right after generic/099 finishes.... Venky Shankar
06:46 AM CephFS Bug #63514: mds: avoid sending inode/stray counters as part of health warning for standby-replay
Rishabh Dave wrote in #note-6:
> Venky, should we skip backporting this from Quincy? Or is it still valid?
Let's ...
Venky Shankar
05:57 AM CephFS Bug #50719: xattr returning from the dead (sic!)
Matthew Hutchinson wrote in #note-27:
> HI, Can I get an update on this?
Hi Matthew,
Please try to reproduce it and...
Xiubo Li
05:55 AM Linux kernel client Bug #65563: WARNING: CPU: 7 PID: 40807 at mm/page_alloc.c:4545 __alloc_pages+0x1e7/0x270
It's a kclient side bug. Xiubo Li
05:49 AM CephFS Bug #65647: Evicted kernel client may get stuck after reconnect
Mykola Golub wrote in #note-9:
> Xiubo Li wrote in #note-6:
> > Venky Shankar wrote in #note-5:
>
> > > The trac...
Xiubo Li
04:42 AM Ceph Bug #65805 (Fix Under Review): common/StackStringStream: update pointer to newly allocated memory in overflow()
Kefu Chai
03:14 AM Ceph Bug #65805 (Pending Backport): common/StackStringStream: update pointer to newly allocated memory in overflow()
When the sanitizer is enabled, unittest_log fails as follows... Rongqi Sun
02:10 AM crimson Bug #65804 (New): CEPH_OSD_OP_CHECKSUM got "invalid argument"
crimson-osd's handling of CEPH_OSD_OP_CHECKSUM is not idempotent: if a client request got interrupted and requeued af... Xuehan Xu
02:01 AM CephFS Bug #64572: workunits/fsx.sh failure
This is a follow-up fix for it: https://github.com/ceph/ceph/pull/57275 Xiubo Li
01:52 AM crimson Bug #62162: local_shared_foreign_ptr: Assertion `ptr && *ptr' failed
See https://github.com/ceph/ceph/blob/16021434f3f18d548e35cad33faea4e5978ffe4f/src/crimson/mgr/client.cc#L102-L105
...
Yingxin Cheng

05/05/2024

09:22 PM CephFS Bug #65803 (Fix Under Review): mds: some asok commands wait with asok thread blocked
Leonid Usov
08:23 PM CephFS Bug #65803 (Fix Under Review): mds: some asok commands wait with asok thread blocked
The Teuthology script often runs slowly, so much so that it cannot keep timed events, arriving late and finding a ... Leonid Usov
08:09 PM CephFS Bug #65802 (In Progress): Quiesce and rename aren't properly syncrhonized
Detected in this run: https://pulpito.ceph.com/pdonnell-2024-05-03_22:48:16-fs-wip-pdonnell-testing-20240503.163550-d... Leonid Usov
12:48 PM Ceph Bug #65791: Enable Ceph to benefit from a faster CRC32 implementation
Small correction to the above -- this should be the PCLMUL instruction set, and not SSE4.1. Tyler Stachecki
09:28 AM RADOS Bug #50608 (Fix Under Review): ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
Still relevant:
http://telemetry.front.sepia.ceph.com:4000/d/Nvj6XTaMk/spec-search?orgId=1&var-substr_1=PrimaryLog...
Matan Breizman

05/04/2024

04:48 PM Ceph QA QA Run #65798 (QA Needs Approval): wip-yuriw-testing-20240503.213524- (wip-yuri7-testing)
Yuri Weinstein
02:26 PM Ceph QA QA Run #65798: wip-yuriw-testing-20240503.213524- (wip-yuri7-testing)
retriggered centos8 https://jenkins.ceph.com/job/ceph-dev-new/80454/ Yuri Weinstein
10:57 AM CephFS Bug #65801 (Triaged): mgr/snap_schedule: restrict retention spec multiplier set
The accepted retention spec multiplier set is a union of [a-z] and [A-Z].
This causes confusion with too many meanin...
Milind Changire
10:50 AM Ceph QA QA Run #65680: wip-mchangir-testing-20240429.064231-main-debug
41 Jobs Failed out of 206
Most of them are known (documented issues).
Few of them are new and have been added to th...
Milind Changire
07:59 AM rbd Documentation #65800: Improve "rbd flatten" documentation
https://tracker.ceph.com/issues/40486 - "rbd migration" command documentation tracker Zac Dover
07:56 AM rbd Documentation #65800 (In Progress): Improve "rbd flatten" documentation
build/src/pybind/mgr/dashboard/frontend/dist/en-US/default-src_app_ceph_block_block_module_ts.js: titleTex... Zac Dover
01:25 AM rgw Bug #65436: Getting Object Crashing radosgw services
Reid Guyett wrote in #note-6:
> We are also blocked by https://tracker.ceph.com/issues/64308 in moving to 17.2.7.
...
hoan nv

05/03/2024

11:49 PM Ceph QA QA Run #65796 (QA Needs Approval): wip-yuriw-testing-20240503.181540-main (wrong wip name, should be wip-yuri2-testing*)
Yuri Weinstein
06:15 PM Ceph QA QA Run #65796 (QA Needs Approval): wip-yuriw-testing-20240503.181540-main (wrong wip name, should be wip-yuri2-testing*)
* "PR #57216":https://github.com/ceph/ceph/pull/57216 -- mgr/balancer: set upmap_max_deviation to 1
* "PR #56937":ht...
Yuri Weinstein
11:48 PM Ceph QA QA Run #65797 (QA Needs Approval): wip-yuriw-testing-20240503.213344-main (wip-yuri5-testing)
Yuri Weinstein
09:34 PM Ceph QA QA Run #65797 (QA Needs Approval): wip-yuriw-testing-20240503.213344-main (wip-yuri5-testing)
* "PR #57015":https://github.com/ceph/ceph/pull/57015 -- bluefs: bluefs alloc unit should only be shrink
* "PR #5698...
Yuri Weinstein
11:45 PM Ceph QA QA Run #65798 (QA Testing): wip-yuriw-testing-20240503.213524- (wip-yuri7-testing)
Yuri Weinstein
11:43 PM Ceph QA QA Run #65798 (QA Needs Approval): wip-yuriw-testing-20240503.213524- (wip-yuri7-testing)
Yuri Weinstein
09:35 PM Ceph QA QA Run #65798 (QA Needs Approval): wip-yuriw-testing-20240503.213524- (wip-yuri7-testing)
* "PR #57137":https://github.com/ceph/ceph/pull/57137 -- osd: CEPH_OSD_OP_FLAG_BYPASS_CLEAN_CACHE flag is passed from... Yuri Weinstein
11:41 PM Orchestrator Bug #65799 (Fix Under Review): cephadm: [progress WARNING root] complete: ev {UUID} does not exist
Prashant D
11:28 PM Orchestrator Bug #65799: cephadm: [progress WARNING root] complete: ev {UUID} does not exist
some upstream users reported this issue earlier : https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/VQT... Prashant D
11:27 PM Orchestrator Bug #65799 (Fix Under Review): cephadm: [progress WARNING root] complete: ev {UUID} does not exist
The cephadm module, while applying service specs, creates a progress event for the daemons to be added or deleted fro... Prashant D
07:51 PM RADOS Bug #55750: mon: slow request of very long time
Still getting this on reef 18.2.1 Dan van der Ster
06:49 PM rgw Bug #65436: Getting Object Crashing radosgw services
We are also blocked by https://tracker.ceph.com/issues/64308 in moving to 17.2.7. Reid Guyett
12:58 PM rgw Bug #65436: Getting Object Crashing radosgw services
So the solution is to upgrade RGW, delete and recreate the bucket?
Since we do not own or control the data being u...
Reid Guyett
08:48 AM rgw Bug #65436: Getting Object Crashing radosgw services
Did you try with the error file on the old bucket?
The error file can't be fixed by upgrading Ceph. You need to delete the error file or all ...
hoan nv
04:02 PM RADOS Bug #52657 (Pending Backport): MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
Igor Fedotov
01:39 PM RADOS Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
https://github.com/ceph/ceph/pull/49619 merged Yuri Weinstein
03:58 PM CephFS Bug #65795 (Fix Under Review): cephfs_mirror: daemon status shows KeyError: 'directory_count'
ceph fs snapshot mirror daemon status gives KeyError: 'directory_count' when mirroring is disabled and enabled repeat... Jos Collin
03:38 PM Ceph QA QA Run #65641 (QA Needs Approval): wip-yuriw8-testing-20240424.000125-main
Yuri Weinstein
03:12 PM rgw Backport #63857: quincy: notification: etag is missing in CompleteMultipartUpload event
please hold off on this backport until https://tracker.ceph.com/issues/65746 is resolved. these should be backported ... Casey Bodley
03:07 PM rgw Bug #65746 (Fix Under Review): rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)
Casey Bodley
04:05 AM rgw Bug #65746 (Triaged): rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)
i think this was a regression from https://github.com/ceph/ceph/pull/54569, which moved @meta_obj->delete_object()@ f... Casey Bodley
03:05 PM rgw Bug #65794 (New): Ceph Reef RGW error response fails to be parsed during awscli create-bucket
Using Ceph Reef v18.2.2 (with Rook v1.13.5 but it is not important here). Running tests with RGW bucket quota exceed ... Peter Razumovsky
02:01 PM Ceph QA QA Run #65793 (QA Testing): wip-rishabh-testing-20240503.134948
https://github.com/ceph/ceph/pull/55956 Rishabh Dave
01:59 PM Ceph QA QA Run #65792 (QA Closed): wip-rishabh-testing-20240501.193033
https://github.com/ceph/ceph/pull/55144 Rishabh Dave
01:58 PM Ceph QA QA Run #65764 (QA Closed): wip-rishabh-testing-20240426.111959
Rishabh Dave
07:09 AM Ceph QA QA Run #65764: wip-rishabh-testing-20240426.111959
Rishabh Dave wrote in #note-2:
> There were lots of new failures along the usual ones. These new failures were cause...
Rishabh Dave
01:45 PM bluestore Bug #58274 (Pending Backport): BlueStore::collection_list becomes extremely slow due to unbounded rocksdb iteration
Konstantin Shalygin
01:38 PM bluestore Bug #58274: BlueStore::collection_list becomes extremely slow due to unbounded rocksdb iteration
https://github.com/ceph/ceph/pull/49438 merged Yuri Weinstein
01:43 PM CephFS Bug #65246: qa/cephfs: test_multifs_single_path_rootsquash (tasks.cephfs.test_admin.TestFsAuthorize)
Venky, no need to backport to Quincy, right? The PR that introduced this bug hasn't been backported at all. Rishabh Dave
01:43 PM CephFS Bug #65246 (Pending Backport): qa/cephfs: test_multifs_single_path_rootsquash (tasks.cephfs.test_admin.TestFsAuthorize)
Rishabh Dave
01:42 PM Ceph QA QA Run #65560 (QA Closed): wip-yuri5-testing-2024-04-17-1400
Yuri Weinstein
07:16 AM Ceph QA QA Run #65560 (QA Approved): wip-yuri5-testing-2024-04-17-1400
Rados approved: https://tracker.ceph.com/projects/rados/wiki/MAIN#httpstrackercephcomissues65560 Aishwarya Mathuria
01:10 PM Ceph Bug #65791 (New): Enable Ceph to benefit from a faster CRC32 implementation
ISA-L, a component in Ceph (https://github.com/ceph/isa-l/tree/bee5180a1517f8b5e70b02fcd66790c623536c5d) provides mul... Tyler Stachecki
11:58 AM Dashboard Backport #65789 (In Progress): reef: mgr/dashboard: add prometheus federation config for mullti-cluster monitoring
Aashish Sharma
11:54 AM Dashboard Backport #65789 (Resolved): reef: mgr/dashboard: add prometheus federation config for mullti-cluster monitoring
https://github.com/ceph/ceph/pull/57255 Backport Bot
11:56 AM Dashboard Backport #65790 (In Progress): squid: mgr/dashboard: add prometheus federation config for mullti-cluster monitoring
Aashish Sharma
11:54 AM Dashboard Backport #65790 (Resolved): squid: mgr/dashboard: add prometheus federation config for mullti-cluster monitoring
https://github.com/ceph/ceph/pull/57254 Backport Bot
11:49 AM Dashboard Bug #65788 (Resolved): mgr/dashboard: add prometheus federation config for mullti-cluster monitoring
Introduce Prometheus federation in the Ceph dashboard. This is done by adding a federate job to the prometheus configurat... Aashish Sharma
11:16 AM Dashboard Backport #65787 (New): squid: mgr/dashboard: fix cluster filter typo in multi-cluster-overview grafana dashboard
Backport Bot
11:16 AM Dashboard Backport #65786 (New): reef: mgr/dashboard: fix cluster filter typo in multi-cluster-overview grafana dashboard
Backport Bot
11:07 AM Dashboard Backport #65785 (New): squid: mgr/dashboard: fix cluster filter typo in multi-cluster-overview grafana dashboard
Backport Bot
11:05 AM Dashboard Bug #65760 (Pending Backport): mgr/dashboard: fix cluster filter typo in multi-cluster-overview grafana dashboard
Aashish Sharma
10:45 AM Orchestrator Feature #53562: cephadm doesn't support osd crush_location_hook
Any update on this feature request? We use custom crush location hooks as well. Eugen Block
09:03 AM Orchestrator Feature #65784 (Fix Under Review): bump loki/promtail to 3.0.0
We currently ship with old versions of loki/promtail.
loki/promtail released 3.0.0 recently.
Guillaume Abrioux
08:56 AM Ceph QA QA Run #65454 (QA Needs Approval): wip-vshankar-testing-20240411.061452
Venky Shankar
08:36 AM CephFS Bug #65350 (Pending Backport): mgr/snap_schedule: restore yearly spec from uppercase Y to lowercase y
Rishabh Dave
07:58 AM CephFS Bug #63514: mds: avoid sending inode/stray counters as part of health warning for standby-replay
Venky, should we skip backporting this from Quincy? Or is it still valid? Rishabh Dave
07:57 AM CephFS Bug #63514 (Pending Backport): mds: avoid sending inode/stray counters as part of health warning for standby-replay
Rishabh Dave
07:44 AM CephFS Feature #61866 (Pending Backport): MDSMonitor: require --yes-i-really-mean-it when failing an MDS with MDS_HEALTH_TRIM or MDS_HEALTH_CACHE_OVERSIZED health warnings
Rishabh Dave
07:42 AM CephFS Bug #65314: valgrind error: Leak_PossiblyLost posix_memalign UnknownInlinedFun ceph::buffer::v15_2_0::list::refill_append_space(unsigned int)
main:
https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-debug-testi...
Milind Changire
07:40 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
main:
https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-debug-testi...
Milind Changire
07:32 AM Linux kernel client Bug #65563: WARNING: CPU: 7 PID: 40807 at mm/page_alloc.c:4545 __alloc_pages+0x1e7/0x270
Rishabh Dave wrote in #note-5:
> @Xiubo @Venky, The PR has been merged since QA run was successful. IMO this issue w...
Venky Shankar
07:30 AM Linux kernel client Bug #65563: WARNING: CPU: 7 PID: 40807 at mm/page_alloc.c:4545 __alloc_pages+0x1e7/0x270
@Xiubo @Venky, The PR has been merged since QA run was successful. IMO this issue will need backports too. Please che... Rishabh Dave
07:31 AM CephFS Bug #64927: qa/cephfs: test_cephfs_mirror_blocklist raises "KeyError: 'rados_inst'"
main:
https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-debug-testi...
Milind Changire
07:25 AM CephFS Bug #65647: Evicted kernel client may get stuck after reconnect
Mykola, thank you for the update. It's very likely these config settings have exposed a bug in the MDS. I did a quick s... Venky Shankar
07:21 AM CephFS Bug #65783: qa: cluster [WRN] Health detail: HEALTH_WARN 1 osds down; Degraded data redundancy
main:
https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-debug-testi...
Milind Changire
07:07 AM CephFS Bug #65783 (Duplicate): qa: cluster [WRN] Health detail: HEALTH_WARN 1 osds down; Degraded data redundancy
"Teuthology Job":https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-d... Milind Changire
07:19 AM teuthology Bug #62937: Command failed on smithi027 with status 3: 'sudo logrotate /etc/logrotate.d/ceph-test.conf'
main:
https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-debug-testi...
Milind Changire
07:18 AM CephFS Fix #65579: mds: use _exit for QA killpoints rather than SIGABRT
Neeraj, please take this one. Venky Shankar
02:28 AM CephFS Fix #65579: mds: use _exit for QA killpoints rather than SIGABRT
Patrick Donnelly wrote in #note-2:
> Venky Shankar wrote in #note-1:
> > @pdonnell Are you talking about TestShutdo...
Venky Shankar
07:11 AM Orchestrator Bug #65017: cephadm: log_channel(cephadm) log [ERR] : Failed to connect to smithi090 (10.0.0.9). Permission denied
/a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7681133/ Aishwarya Mathuria
07:07 AM Orchestrator Bug #64118: cephadm: RuntimeError: Failed command: apt-get update: E: The repository 'https://download.ceph.com/debian-quincy jammy Release' does not have a Release file.
/a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7681072/... Aishwarya Mathuria
07:01 AM CephFS Bug #65564: Test failure: test_snap_schedule_subvol_and_group_arguments_08 (tasks.cephfs.test_snap_schedules.TestSnapSchedulesSubvolAndGroupArguments)
main:
http://qa-proxy.ceph.com/teuthology/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-...
Milind Changire
06:56 AM CephFS Bug #65782 (New): qa: test_flag_scrub_mdsdir (tasks.cephfs.test_scrub_checks.TestScrubChecks)
... Milind Changire
06:50 AM teuthology Bug #61576: teuthology.exceptions.AnsibleFailedError - Failed to manage policy for boolean nagios_run_sudo
/a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7681077/ Aishwarya Mathuria
06:47 AM CephFS Bug #65781 (New): qa: ceph version 17.2.7-917.gd69ee407 was not installed, found 17.2.7-913.g8c431824.el8.
... Milind Changire
06:44 AM Orchestrator Bug #63784: qa/standalone/mon/mkfs.sh:'mkfs/a' already exists and is not empty: monitor may already exist
/a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7681112/ Aishwarya Mathuria
06:42 AM RADOS Bug #65517: rados/thrash-erasure-code-crush-4-nodes: ceph task fails at getting monitors
/a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7681014
/a/yuriw-2024-04-...
Aishwarya Mathuria
06:40 AM CephFS Bug #65780 (Duplicate): qa: valgrind error: Leak_StillReachable calloc calloc _dl_check_map_versions
... Milind Changire
06:38 AM CephFS Bug #65779 (New): qa: valgrind error: Leak_StillReachable calloc calloc _dl_check_map_versions
... Milind Changire
06:37 AM rgw Bug #59380: rados/singleton-nomsgr: test failing from "Health check failed: 1 full osd(s) (OSD_FULL)" and "Health check failed: 1 filesystem is offline (MDS_ALL_DOWN)"
/a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7681095/
/a/yuriw-2024-04-...
Aishwarya Mathuria
06:30 AM CephFS Bug #65778 (New): qa: valgrind error: Leak_StillReachable malloc malloc strdup
... Milind Changire
06:25 AM RADOS Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
/a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7680976/
/a/yuriw-2024-04...
Aishwarya Mathuria
06:23 AM CephFS Bug #57677: qa: "1 MDSs behind on trimming (MDS_TRIM)"
main:
https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-debug-testi...
Milind Changire
06:20 AM CephFS Bug #62658: error during scrub thrashing: reached maximum tries (31) after waiting for 900 seconds
main:
https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-debug-testi...
Milind Changire
06:11 AM CephFS Bug #57676: qa: error during scrub thrashing: rank damage found: {'backtrace'}
main:
https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-debug-testi...
Milind Changire
06:00 AM CephFS Bug #65777 (Duplicate): qa: error during scrub thrashing
... Milind Changire
05:40 AM RADOS Bug #63198: rados/thrash: AssertionError: wait_for_recovery: failed before timeout expired
/a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7680980/
/a/yuriw-2024-04-...
Aishwarya Mathuria
05:38 AM devops Backport #65776 (New): reef: ceph-mgr-dashboard RPM requires python3-werkzeug
Backport Bot
05:38 AM devops Backport #65775 (New): squid: ceph-mgr-dashboard RPM requires python3-werkzeug
Backport Bot
05:31 AM RADOS Bug #65183: Overriding an EC pool needs the "--yes-i-really-mean-it" flag in addition to "force"
/a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7680989
/a/yuriw-2024-04-...
Aishwarya Mathuria
05:30 AM devops Bug #65693 (Pending Backport): ceph-mgr-dashboard RPM requires python3-werkzeug
Nizamudeen A
05:27 AM RADOS Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
/a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7680957/
/a/yuriw-2024-04...
Aishwarya Mathuria
05:25 AM Dashboard Bug #65774 (New): mgr/dashboard: Filter alerts based on cluster fsid and do not allow to connect clusters with version less than hub cluster in multi-cluster

1. Since we have a new cluster variable in the Prometheus metrics, we need to filter the alerts based on the cluste...
Aashish Sharma
05:20 AM Orchestrator Bug #64871: rados/cephadm/workunits: Health check failed: 1 failed cephadm daemon(s) (CEPHADM_FAILED_DAEMON)" in cluster log
/a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7681108/ Aishwarya Mathuria
05:16 AM RADOS Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
/a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7680978 Aishwarya Mathuria
04:34 AM bluestore Bug #65735: OSDs failed to restart when doing crimson-osd:thrash tests
This was caused by mishandling of CEPH_OSD_OP_CREATE in crimson; a new issue has been created: https://tracker.ce... Xuehan Xu
04:30 AM crimson Bug #65773 (New): OSDs failed to restart when doing crimson-rados:thrash tests
... Xuehan Xu
03:56 AM rgw Bug #65772 (New): Bucket lifecycle not working while bucket versioning is suspended
How to reproduce:
1. Create bucket, enable bucket versioning then suspend bucket versioning.
2. Upload file to b...
hoan nv
01:07 AM Ceph QA QA Run #65771 (QA Approved): wip-pdonnell-testing-20240503.163550-debug
* "PR #57226":https://github.com/ceph/ceph/pull/57226 -- common: mark assert-only variables as unused
* "PR #57192":...
Patrick Donnelly

05/02/2024

11:59 PM Ceph QA QA Run #65594 (QA Needs Approval): wip-yuriw11-testing-20240501.200505-squid
Yuri Weinstein
08:09 PM Ceph QA QA Run #65594: wip-yuriw11-testing-20240501.200505-squid
https://shaman.ceph.com/builds/ceph/wip-yuriw11-testing-20240501.200505-squid/f273489c6dcd4bc88409993babd09dd99491162c/ Yuri Weinstein
07:44 PM Ceph QA QA Run #65594: wip-yuriw11-testing-20240501.200505-squid
rebasing Yuri Weinstein
07:39 PM Ceph QA QA Run #65594: wip-yuriw11-testing-20240501.200505-squid
Laura Flores wrote in #note-12:
> @yuriw I'm sorry to request another rebase- there has been a new commit added to h...
Yuri Weinstein
07:06 PM Ceph QA QA Run #65594: wip-yuriw11-testing-20240501.200505-squid
@yuriw I'm sorry to request another rebase- there has been a new commit added to https://github.com/ceph/ceph/pull/57... Laura Flores
08:50 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
I did 400 deployments (2 runs of 200 deployments on separate vms) with 0 ceph issues.
Using your image pushed to m...
Nir Soffer
08:18 PM rgw Backport #65636 (Resolved): squid: release note for rgw_realm init
Casey Bodley
07:12 PM rgw Bug #65668 (Fix Under Review): Notification: Persistent queue not deleted when topic is deleted via radosgw-admin
Casey Bodley
07:09 PM Ceph QA QA Run #65688 (QA Needs Approval): wip-yuri4-testing-2024-04-29-0642
Yuri Weinstein
07:09 PM Ceph QA QA Run #65688: wip-yuri4-testing-2024-04-29-0642
@sseshasa rerunning
Pls assign it back to me in the future if you need a rerun or else
Yuri Weinstein
03:56 PM Ceph QA QA Run #65688 (QA Needs Rerun/Rebuilt): wip-yuri4-testing-2024-04-29-0642
@yuriw I updated the Rados analysis here: https://tracker.ceph.com/projects/rados/wiki/SQUID#httpstrackercephcomissue... Sridhar Seshasayee
05:40 PM rgw Bug #65436: Getting Object Crashing radosgw services
Hello,
I was able to test in 17.2.7 and the rgw service is still crashing with the same error message....
Reid Guyett
05:27 PM rgw Backport #65640 (Resolved): squid: [rgw][accounts] bucket quota management at account-level
Casey Bodley
05:17 PM CephFS Bug #65770 (New): qa: failed to be set on mds daemons: {'mds.imported', 'mds.exported'}
This issue has been seen in QA runs for a couple of months but it incorrectly got marked as known issue. https://trac... Rishabh Dave
03:47 PM RADOS Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
Observed on Squid:
/a/yuriw-2024-04-30_03:21:19-rados-wip-yuri4-testing-2024-04-29-0642-distro-default-smithi/7680132
Sridhar Seshasayee
03:44 PM Ceph Backport #65391: squid: osd/scrub: "reservation requested while still reserved" error in cluster log
Just for tracking - Lots of failures reported on the following squid run (22 failures):
https://pulpito.ceph.com/yu...
Sridhar Seshasayee
03:40 PM RADOS Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
Observed on Squid:
/a/yuriw-2024-04-30_03:21:19-rados-wip-yuri4-testing-2024-04-29-0642-distro-default-smithi/768019...
Sridhar Seshasayee
03:40 PM RADOS Bug #65183: Overriding an EC pool needs the "--yes-i-really-mean-it" flag in addition to "force"
Observed on Squid:
/a/yuriw-2024-04-30_03:21:19-rados-wip-yuri4-testing-2024-04-29-0642-distro-default-smithi/768015...
Sridhar Seshasayee
03:39 PM RADOS Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
Observed on Squid:
/a/yuriw-2024-04-30_03:21:19-rados-wip-yuri4-testing-2024-04-29-0642-distro-default-smithi/768024...
Sridhar Seshasayee
03:37 PM rgw Feature #65769: rgw: make incomplete multipart upload part of bucket check efficient
quincy backport ready to go -- https://github.com/ceph/ceph/pull/57244 J. Eric Ivancich
03:36 PM rgw Feature #65769 (Fix Under Review): rgw: make incomplete multipart upload part of bucket check efficient
Previously the incomplete multipart portion of bucket check would list all entries in the multipart namespace across ... J. Eric Ivancich
03:35 PM RADOS Bug #65768 (New): rados/verify: Health check failed: 1 osds down (OSD_DOWN)" in cluster log
This is observed on squid. I couldn't find a tracker on main related to this test.
A more proper analysis on whether...
Sridhar Seshasayee
03:31 PM bluestore Backport #65358 (Fix Under Review): quincy: BlueFS log runway space exhausted
Md Mahamudur Rahaman Sajib
03:31 PM bluestore Backport #65356 (Fix Under Review): reef: BlueFS log runway space exhausted
Md Mahamudur Rahaman Sajib
03:30 PM CephFS Bug #65716: mds: dir merge can't progress due to fragment nested pins, blocking the quiesce_path and causing a quiesce timeout
See also:... Patrick Donnelly
03:27 PM CephFS Bug #65716: mds: dir merge can't progress due to fragment nested pins, blocking the quiesce_path and causing a quiesce timeout
... Patrick Donnelly
08:52 AM CephFS Bug #65716: mds: dir merge can't progress due to fragment nested pins, blocking the quiesce_path and causing a quiesce timeout
@pdonnell I'm not ready to investigate why the freezing takes so long. Maybe it's one of those cases you mentioned to... Leonid Usov
08:31 AM CephFS Bug #65716: mds: dir merge can't progress due to fragment nested pins, blocking the quiesce_path and causing a quiesce timeout
I made some progress.
the directory represented by the inode 0x1000000003 is owned by rank 0, we are the replica (...
Leonid Usov
03:29 PM rgw Backport #65767 (In Progress): squid: rgw_multi.tests.test_topic_notification_sync: PutBucketNotificationConfiguration fails with ConcurrentModification
Casey Bodley
02:58 PM rgw Backport #65767 (In Progress): squid: rgw_multi.tests.test_topic_notification_sync: PutBucketNotificationConfiguration fails with ConcurrentModification
https://github.com/ceph/ceph/pull/57242 Backport Bot
03:16 PM bluestore Backport #65357 (Resolved): squid: BlueFS log runway space exhausted
Md Mahamudur Rahaman Sajib
02:58 PM rgw Bug #65590 (Pending Backport): rgw_multi.tests.test_topic_notification_sync: PutBucketNotificationConfiguration fails with ConcurrentModification
Casey Bodley
02:39 PM Dashboard Feature #50327 (In Progress): mgr/dashboard: add/edit lifecycle policy
Pedro González Gómez
02:38 PM RADOS Bug #65686 (Fix Under Review): ECBackend doesn't pass CEPH_OSD_OP_FLAG_BYPASS_CLEAN_CACHE flag when scrubbing
Igor Fedotov
02:16 PM crimson Bug #62162: local_shared_foreign_ptr: Assertion `ptr && *ptr' failed
This issue impacts most test runs lately, bumping up. Matan Breizman
02:16 PM crimson Bug #62162: local_shared_foreign_ptr: Assertion `ptr && *ptr' failed
Looking at:
> OSDs 2 and 3:
> https://pulpito.ceph.com/matan-2024-05-02_11:41:00-crimson-rados-wip-crimson-only-coh...
Matan Breizman
01:26 PM crimson Bug #62162: local_shared_foreign_ptr: Assertion `ptr && *ptr' failed
OSDs 2 and 3:
https://pulpito.ceph.com/matan-2024-05-02_11:41:00-crimson-rados-wip-crimson-only-coherent-log-and-at_...
Matan Breizman
01:52 PM CephFS Bug #65766 (New): qa: perm denied for runing find on cephtest dir
After the test suite finishes running, during teardown/unwinding running @find /home/ubuntu/cephtest@ unexpectedly pr... Rishabh Dave
01:51 PM CephFS Bug #50719: xattr returning from the dead (sic!)
Hi, can I get an update on this? Matthew Hutchinson
01:50 PM CephFS Bug #62664 (Fix Under Review): ceph-fuse: failed to remount for kernel dentry trimming; quitting!
Jakob Haufe wrote in #note-4:
> I created a minimal PR implementing this: https://github.com/ceph/ceph/pull/57170
>...
Venky Shankar
01:17 PM CephFS Fix #65579: mds: use _exit for QA killpoints rather than SIGABRT
Venky Shankar wrote in #note-1:
> @pdonnell Are you talking about TestShutdownKillpoints() in test_failover? If yes,...
Patrick Donnelly
10:17 AM CephFS Fix #65579: mds: use _exit for QA killpoints rather than SIGABRT
@pdonnell Are you talking about TestShutdownKillpoints() in test_failover? If yes, you are suggesting changing, e.g.:... Venky Shankar
01:13 PM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
Venky Shankar wrote in #note-40:
> It's not the metrics stuff but a bug when the MDS sends back a client_session(ope...
Patrick Donnelly
12:08 PM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
It's not the metrics stuff but a bug when the MDS sends back a client_session(open) during client reconnect (post mds... Venky Shankar
12:55 PM RADOS Bug #65765 (New): squid: rados/test.sh: LibRadosWatchNotifyECPP.WatchNotify test of api_watch_notify_pp suite didn't complete.
The following failure was seen on Squid:
/a/yuriw-2024-04-30_03:21:19-rados-wip-yuri4-testing-2024-04-29-0642-dist...
Sridhar Seshasayee
12:47 PM Dashboard Bug #47612: ERROR: setUpClass (tasks.mgr.dashboard.test_health.HealthTest)
Suggest updating your tearDown/setUp procedures to mirror what CephFSTestCase is doing. Patrick Donnelly
12:29 PM Orchestrator Backport #65763 (In Progress): reef: cephadm: set "osd - profile rbd" for nvmeof service
Adam King
12:09 PM Orchestrator Backport #65763 (In Progress): reef: cephadm: set "osd - profile rbd" for nvmeof service
https://github.com/ceph/ceph/pull/57234 Backport Bot
12:28 PM Orchestrator Backport #65762 (In Progress): squid: cephadm: set "osd - profile rbd" for nvmeof service
Adam King
12:09 PM Orchestrator Backport #65762 (In Progress): squid: cephadm: set "osd - profile rbd" for nvmeof service
https://github.com/ceph/ceph/pull/57233 Backport Bot
12:26 PM CephFS Bug #65364: Provide metrics support for the Target Cluster Disconnection status
Copying from the bz update:
I had a chat about this with Greg. Unfortunately, the messenger layer isn't the most a...
Venky Shankar
12:25 PM CephFS Bug #65564 (Fix Under Review): Test failure: test_snap_schedule_subvol_and_group_arguments_08 (tasks.cephfs.test_snap_schedules.TestSnapSchedulesSubvolAndGroupArguments)
Venky Shankar
12:12 PM Ceph QA QA Run #65764: wip-rishabh-testing-20240426.111959
There were lots of new failures along with the usual ones. These new failures were caused by - https://github.com/ceph/cep... Rishabh Dave
12:10 PM Ceph QA QA Run #65764 (QA Closed): wip-rishabh-testing-20240426.111959
* https://github.com/ceph/ceph/pull/56981
* https://github.com/ceph/ceph/pull/56846
* https://github.com/ceph/ceph/...
Rishabh Dave
12:04 PM Orchestrator Bug #65691 (Pending Backport): cephadm: set "osd - profile rbd" for nvmeof service
Adam King
12:02 PM CephFS Bug #65761: valgrind error: Leak_StillReachable calloc calloc _dl_check_map_versions
https://pulpito.ceph.com/rishabh-2024-04-28_11:41:23-fs-wip-rishabh-testing-20240426.111959-testing-default-smithi/76... Rishabh Dave
12:00 PM CephFS Bug #65761 (New): valgrind error: Leak_StillReachable calloc calloc _dl_check_map_versions
First saw these failures in a QA run for CephFS PRs, when I ran failed jobs from that run against main branch version... Rishabh Dave
12:00 PM Orchestrator Bug #65739 (In Progress): Cephadm adopt doesn't support "--no-cgroups-split" flag
Adam King
11:47 AM Dashboard Bug #65760 (Pending Backport): mgr/dashboard: fix cluster filter typo in multi-cluster-overview grafana dashboard
Fix extra braces in multi cluster overview grafana json Aashish Sharma
11:00 AM rgw Backport #65244 (In Progress): squid: RGW/s3select : several issues, s3select related, some caused a crash.
Gal Salomon
10:45 AM CephFS Bug #65604: dbench.sh workload times out after 3h when run with-quiescer
There is already a tracker for this. Will dig it up and link. Venky Shankar
10:45 AM rgw Backport #65245 (In Progress): reef: RGW/s3select : several issues, s3select related, some caused a crash.
Gal Salomon
10:19 AM CephFS Bug #65572: Command failed (workunit test fs/snaps/untar_snap_rm.sh) on smithi155 with status 1
Lowered priority since this seems to be infra noise. Venky Shankar
10:13 AM RADOS Bug #65227 (Need More Info): noscrub cluster flag prevents deep-scrubs from starting
I am not able to reproduce the problem. Can you attach debug logs (including of the commands used to recreate the sce... Ronen Friedman
10:06 AM Dashboard Bug #65218 (Pending Backport): mgr/dashboard: Grafana ceph-cluster.json doesn't support cluster label
Aashish Sharma
10:02 AM CephFS Backport #65406 (In Progress): quincy: mds: Reduce log level for messages when mds is stopping
Kotresh Hiremath Ravishankar
09:54 AM CephFS Backport #65405 (In Progress): reef: mds: Reduce log level for messages when mds is stopping
Kotresh Hiremath Ravishankar
09:49 AM bluestore Fix #58759: BlueFS log runway space exhausted
I found that the commits have already been processed in squid. In that case we can remove it. Md Mahamudur Rahaman Sajib
09:23 AM Dashboard Bug #64321 (Pending Backport): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
Aashish Sharma
07:43 AM CephFS Backport #65404 (In Progress): squid: mds: Reduce log level for messages when mds is stopping
Kotresh Hiremath Ravishankar
07:39 AM CephFS Bug #65660: mds: drop client metrics during recovery
Venky Shankar wrote in #note-5:
> Patrick Donnelly wrote in #note-4:
> > Christopher Hoffman wrote in #note-2:
> >...
Dhairya Parmar
05:53 AM CephFS Bug #65660: mds: drop client metrics during recovery
Patrick Donnelly wrote in #note-4:
> Christopher Hoffman wrote in #note-2:
> > >there's little reason to record his...
Venky Shankar
07:31 AM CephFS Bug #65580: mds/client: add dummy client feature to test client eviction
@vshankar @patrick can you update this tracker with the discussion you guys had on the call post standup on tuesday? ... Dhairya Parmar
06:38 AM Dashboard Backport #65759 (In Progress): quincy: mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
Nizamudeen A
06:28 AM Dashboard Backport #65759 (In Progress): quincy: mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
https://github.com/ceph/ceph/pull/57221 Backport Bot
06:34 AM Dashboard Backport #65758 (In Progress): squid: mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
Nizamudeen A
06:28 AM Dashboard Backport #65758 (Resolved): squid: mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
https://github.com/ceph/ceph/pull/57220 Backport Bot
06:33 AM Dashboard Backport #65756 (In Progress): reef: mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
Nizamudeen A
06:17 AM Dashboard Backport #65756 (In Progress): reef: mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
https://github.com/ceph/ceph/pull/57219 Backport Bot
06:17 AM Dashboard Backport #65757 (New): squid: mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
Backport Bot
06:06 AM Dashboard Bug #65698 (Pending Backport): mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
Nizamudeen A
05:56 AM Dashboard Backport #65755 (New): reef: mgr/dashboard: grafana dashboad doesn't exist when anonymous_access is enabled
Backport Bot
05:55 AM Dashboard Backport #65754 (New): squid: mgr/dashboard: grafana dashboad doesn't exist when anonymous_access is enabled
Backport Bot
05:52 AM Dashboard Bug #64080 (Resolved): mgr/dashboard: In rgw multisite, during zone creation acess/secret key should not be compulsory provide an edit option to set these keys
Nizamudeen A
05:52 AM Dashboard Backport #64791 (Resolved): squid: mgr/dashboard: In rgw multisite, during zone creation acess/secret key should not be compulsory provide an edit option to set these keys
Nizamudeen A
05:49 AM Dashboard Bug #65534 (Pending Backport): mgr/dashboard: grafana dashboad doesn't exist when anonymous_access is enabled
Nizamudeen A
02:37 AM crimson Bug #65753: [crimson] OSD deployment fails
Duplicate of https://tracker.ceph.com/issues/65752
Do not have permission to delete, please ignore
Harsh Kumar
02:34 AM crimson Bug #65753 (Duplicate): [crimson] OSD deployment fails
While deploying OSDs on a Crimson cluster, the following error was observed... Harsh Kumar
02:33 AM crimson Bug #65752 (Need More Info): [crimson] OSD deployment fails
While deploying OSDs on a Crimson cluster, the following error was observed... Harsh Kumar

05/01/2024

11:39 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
Partial fix for some of the warnings: https://github.com/ceph/ceph/pull/57218 Laura Flores
08:01 AM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
/a/yuriw-2024-04-20_01:10:46-rados-wip-yuri7-testing-2024-04-18-1351-reef-distro-default-smithi/7664127
/a/yuriw-2024...
Matan Breizman
10:48 PM Ceph QA QA Run #65641: wip-yuriw8-testing-20240424.000125-main
rebased and pushed a new branch wip-yuri8-testing-2024-05-01-1547 meanwhile
https://shaman.ceph.com/builds/ceph/wip-...
Yuri Weinstein
10:13 PM Ceph QA QA Run #65641: wip-yuriw8-testing-20240424.000125-main
still can't schedule
asked for help https://ceph-storage.slack.com/archives/C04SYTAN25P/p1714601019211669
Yuri Weinstein
10:47 PM RADOS Tasks #65751 (New): Add OS type and version to "ceph tell mon.* sessions" json dump
Related BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2154808
Context from BZ:...
Laura Flores
10:26 PM teuthology Bug #65750 (New): "RuntimeError: Read beyond file size detected, file is corrupted."
I have lately seen errors (see below) during suite scheduling.
Here is the run and a log snippet:
https://pulpi...
Yuri Weinstein
10:17 PM Ceph QA QA Run #65349 (QA Needs Approval): wip-yuri3-testing-2024-04-05-0825
@ksirivad pls review when ready Yuri Weinstein
08:26 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
https://shaman.ceph.com/builds/ceph/wip-yuri3-testing-2024-04-05-0825/a53b05d03701e4d0ba0c9aadc7431842129aabf9/ Yuri Weinstein
03:17 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
Kamoltat (Junior) Sirivadhna wrote in #note-17:
> @yuriw Rebase and re-run please, I think now infra might be fixed...
Yuri Weinstein
03:16 PM Ceph QA QA Run #65349 (QA Needs Rerun/Rebuilt): wip-yuri3-testing-2024-04-05-0825
Yuri Weinstein
02:56 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
@yuriw Rebase and re-run please, I think now infra might be fixed, but rebasing because it has been a while since the ... Kamoltat (Junior) Sirivadhna
10:12 PM RADOS Bug #65749 (Fix Under Review): osd_max_pg_per_osd_hard_ratio 3 is set too low for real life
Dan van der Ster
10:00 PM RADOS Bug #65749 (In Progress): osd_max_pg_per_osd_hard_ratio 3 is set too low for real life
Dan van der Ster
09:56 PM RADOS Bug #65749 (Fix Under Review): osd_max_pg_per_osd_hard_ratio 3 is set too low for real life
In the field this issue comes up very often. It is quite disruptive because PGs are stuck in activating state and the... Joshua Blanch
09:44 PM CephFS Bug #65716: mds: dir merge can't progress due to fragment nested pins, blocking the quiesce_path and causing a quiesce timeout
The problem here doesn't seem to be quiesce-related.
We can see that the path traversal didn't attempt to authpin ...
Leonid Usov
12:31 PM CephFS Bug #65716: mds: dir merge can't progress due to fragment nested pins, blocking the quiesce_path and causing a quiesce timeout
Leonid Usov wrote in #note-1:
> @pdonnell, so is this a deadlock between the operations, or just an unfortunate timi...
Patrick Donnelly
10:34 AM CephFS Bug #65716: mds: dir merge can't progress due to fragment nested pins, blocking the quiesce_path and causing a quiesce timeout
@pdonnell, so is this a deadlock between the operations, or just an unfortunate timing of the quiesce which would suc... Leonid Usov
09:38 PM mgr Bug #65748 (Fix Under Review): Change default upmap_max_deviation to 1
Dan van der Ster
09:07 PM mgr Bug #65748 (Fix Under Review): Change default upmap_max_deviation to 1
Field experience shows that the default upmap_max_deviation of 5 is not effective for reaching a well-balanced cluster. This is ... Joshua Blanch
07:37 PM Ceph QA QA Run #65594 (QA Needs Rerun/Rebuilt): wip-yuriw11-testing-20240501.200505-squid
@yuriw can you rebase this branch? There are way too many failures due to:... Laura Flores
02:58 PM Ceph QA QA Run #65594: wip-yuriw11-testing-20240501.200505-squid
I am rescheduling the rados suite because everything died in the last run. :/ Laura Flores
07:11 PM Ceph Feature #65747 (In Progress): common/admin_socket: support saving json output to a file local to the daemon
Patrick Donnelly
07:08 PM Ceph Feature #65747 (In Progress): common/admin_socket: support saving json output to a file local to the daemon
The @ceph tell mds.X cache dump@ and @ceph tell mds.X ops@ commands have a useful @--path@ argument that directs the ... Patrick Donnelly
06:26 PM RADOS Backport #63400 (Resolved): reef: pybind: ioctx.get_omap_keys asserts if start_after parameter is non-empty
Igor Fedotov
02:28 PM RADOS Backport #63400: reef: pybind: ioctx.get_omap_keys asserts if start_after parameter is non-empty
Igor Fedotov wrote in #note-2:
> https://github.com/ceph/ceph/pull/54358
merged
Yuri Weinstein
06:04 PM rgw Bug #65746 (Pending Backport): rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)

+ # XXXX re-trying the complete is failing in RGW due to an internal error that appears not caused
+ # check...
Matt Benjamin
05:50 PM CephFS Tasks #64133 (Resolved): Make pjd work on fscrypt
pjd completes and passes all tests. Christopher Hoffman
05:38 PM CephFS Tasks #65745 (Resolved): RMW fail when on end of block or file
... Christopher Hoffman
05:35 PM CephFS Tasks #65745 (Resolved): RMW fail when on end of block or file
An RMW will fail when at the end boundary of a block or file.
See:...
Christopher Hoffman
05:33 PM rbd Feature #65624: [pybind] expose CLONE_FORMAT and FLATTEN image options
While working on this, https://tracker.ceph.com/issues/65743 and https://tracker.ceph.com/issues/65744 were discovered. Ilya Dryomov
05:27 PM rbd Feature #65624 (Fix Under Review): [pybind] expose CLONE_FORMAT and FLATTEN image options
Ilya Dryomov
05:31 PM rbd Bug #65744 (New): FORMAT and CLONE_FORMAT image options accept bogus values
In particular, for RBD_IMAGE_OPTION_CLONE_FORMAT, 1 selects clone v1 and everything else (i.e. 0, 2, 3, ...) selects ... Ilya Dryomov
05:10 PM rbd Bug #65743 (New): migration of a clone with --flatten doesn't fully detach from the parent
... Ilya Dryomov
05:05 PM rgw Bug #65664: Crash observed in boost::asio module related to stream.async_shutdown()
as discussed, we'll revert this for main/squid until we have a chance to validate the fix. the reverts are tracked in... Casey Bodley
05:03 PM rgw Bug #65742 (Fix Under Review): beast: revert changes to ssl async_shutdown()
Casey Bodley
04:47 PM rgw Bug #65742 (Fix Under Review): beast: revert changes to ssl async_shutdown()
the crash tracked in https://tracker.ceph.com/issues/65664 was introduced by https://github.com/ceph/ceph/pull/55967.... Casey Bodley
04:31 PM rgw Feature #65741: rgw: implement RestrictPublicBuckets from Blocking public access
RFC: https://github.com/ceph/ceph/pull/57206 Seena Fallah
04:22 PM rgw Feature #65741 (New): rgw: implement RestrictPublicBuckets from Blocking public access
Currently, setting RestrictPublicBuckets has no effect on the bucket.
ref. https://docs.aws.amazon.com/AmazonS3/late...
Seena Fallah
04:27 PM rgw Documentation #50084 (Won't Fix): notifications: document behavior in case of multisite
Konstantin Shalygin
03:54 PM rgw Documentation #50084: notifications: document behavior in case of multisite
starting from squid, topics and notifications are replicated between sites. Yuval Lifshitz
04:27 PM rgw Documentation #49649 (Won't Fix): add information on the system objects holding notifications
Konstantin Shalygin
03:52 PM rgw Documentation #49649: add information on the system objects holding notifications
the object format was changed as part of the squid release.
we should probably not document that.
Yuval Lifshitz
03:43 PM CephFS Backport #65740 (New): squid: mds: missing policylock acquisition for quiesce
Backport Bot
03:41 PM Orchestrator Bug #65739: Cephadm adopt doesn't support "--no-cgroups-split" flag
PR
https://github.com/ceph/ceph/pull/57205
waiting for review
Gilad Sid
02:28 PM Orchestrator Bug #65739: Cephadm adopt doesn't support "--no-cgroups-split" flag
working on a PR, it's a very simple fix
Gilad Sid
02:09 PM Orchestrator Bug #65739 (In Progress): Cephadm adopt doesn't support "--no-cgroups-split" flag
Attempting to adopt a legacy daemon to cephadm with '--no-cgroups-split' fails due to
"cephadm: error: unrecognized ...
Gilad Sid
03:36 PM CephFS Bug #65595 (Pending Backport): mds: missing policylock acquisition for quiesce
Patrick Donnelly
03:31 PM rgw Bug #65656 (Fix Under Review): Reduce default thread pool size
Casey Bodley
03:30 PM Infrastructure Bug #65727: ntpq: command not found
I would suggest using "chronyc sources" instead of ntpq; chrony is the newer tool that is used adam kraitman
11:47 AM Infrastructure Bug #65727 (In Progress): ntpq: command not found
adam kraitman
03:17 PM Infrastructure Bug #65734 (Closed): Expected OS to be centos 8 but found ubuntu 22.04
This error occurred around the same time the CentOS image was recaptured adam kraitman
08:41 AM Infrastructure Bug #65734 (Closed): Expected OS to be centos 8 but found ubuntu 22.04
... Matan Breizman
03:02 PM CephFS Bug #62664: ceph-fuse: failed to remount for kernel dentry trimming; quitting!
I created a minimal PR implementing this: https://github.com/ceph/ceph/pull/57170
Is there any jenkins test to run...
Jakob Haufe
03:00 PM Dashboard Bug #47612: ERROR: setUpClass (tasks.mgr.dashboard.test_health.HealthTest)
I was about to open an issue and got to this one with a search. I suspect that this should be handled by the cephfs t... Leonid Usov
02:43 PM Dashboard Bug #47612: ERROR: setUpClass (tasks.mgr.dashboard.test_health.HealthTest)
https://jenkins.ceph.com/job/ceph-api/73295/... Casey Bodley
02:36 PM CephFS Backport #65710 (Fix Under Review): squid: workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
Leonid Usov
01:26 PM CephFS Backport #65710 (In Progress): squid: workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
Leonid Usov
01:25 PM CephFS Backport #65710 (Fix Under Review): squid: workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
Leonid Usov
02:34 PM RADOS Bug #62338 (Resolved): osd: choose_async_recovery_ec may select an acting set < min_size
Konstantin Shalygin
02:33 PM RADOS Backport #62819 (Resolved): reef: osd: choose_async_recovery_ec may select an acting set < min_size
Konstantin Shalygin
02:30 PM RADOS Backport #62819: reef: osd: choose_async_recovery_ec may select an acting set < min_size
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/54550
merged
Yuri Weinstein
02:32 PM Ceph QA QA Run #65655 (QA Closed): wip-yuri2-testing-2024-04-24-0914-squid
Yuri Weinstein
06:39 AM Ceph QA QA Run #65655 (QA Approved): wip-yuri2-testing-2024-04-24-0914-squid
Ronen Friedman
06:33 AM Ceph QA QA Run #65655: wip-yuri2-testing-2024-04-24-0914-squid
@yuriw - approved.
One interesting Scrub-related bug (delayed status reporting), but unrelated. Possibly a test issue...
Ronen Friedman
02:31 PM Ceph QA QA Run #65574 (QA Closed): wip-yuri7-testing-2024-04-18-1351-reef
Yuri Weinstein
08:43 AM Ceph QA QA Run #65574 (QA Approved): wip-yuri7-testing-2024-04-18-1351-reef
7664087, 7664152, 7664219, 7664256, 7664282 - https://tracker.ceph.com/issues/65183 (Overriding an EC pool needs the "... Matan Breizman
02:29 PM RADOS Backport #63559: reef: Heartbeat crash in osd
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/54527
merged
Yuri Weinstein
02:27 PM CephFS Backport #63363: reef: mds: create an admin socket command for raising a signal
Leonid Usov wrote in #note-2:
> https://github.com/ceph/ceph/pull/54357
merged
Yuri Weinstein
02:26 PM RADOS Backport #63289: reef: mon: segfault on rocksdb opening
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/54150
merged
Yuri Weinstein
01:30 PM CephFS Backport #65738 (In Progress): squid: mds: quiesce timeout due to a freezing directory
Patrick Donnelly
01:07 PM CephFS Backport #65738 (In Progress): squid: mds: quiesce timeout due to a freezing directory
https://github.com/ceph/ceph/pull/57203 Backport Bot
01:13 PM rgw Bug #23953 (Rejected): rgw: bucket index delete cleanup
Casey Bodley
01:05 PM RADOS Bug #65737 (Fix Under Review): pg-split-merge.sh -

/a/nmordech-2024-04-30_10:14:02-rados:standalone-main-distro-default-smithi/7680852
/a/nmordech-2024-04-30_10:14:0...
Nitzan Mordechai
01:04 PM cephsqlite Backport #65736 (In Progress): quincy: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
Patrick Donnelly
12:57 PM cephsqlite Backport #65736 (In Progress): quincy: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
https://github.com/ceph/ceph/pull/57199 Backport Bot
12:58 PM CephFS Bug #65603 (Pending Backport): mds: quiesce timeout due to a freezing directory
Patrick Donnelly
12:34 PM RADOS Bug #61832: Restoring #61785: osd-scrub-dump.sh: ERROR: Extra scrubs after test completion...not expected
Note: the test is disabled for now (with https://github.com/ceph/ceph/pull/54482).
No point in updating, until the f...
Ronen Friedman
10:32 AM bluestore Backport #63316 (In Progress): quincy: crash: ZonedAllocator::ZonedAllocator
Konstantin Shalygin
10:32 AM bluestore Backport #63315 (In Progress): reef: crash: ZonedAllocator::ZonedAllocator
Konstantin Shalygin
10:29 AM bluestore Backport #64592 (In Progress): quincy: BlueFS: l_bluefs_log_compactions is counted twice in sync log compaction
Konstantin Shalygin
10:29 AM bluestore Backport #64591 (In Progress): squid: BlueFS: l_bluefs_log_compactions is counted twice in sync log compaction
Konstantin Shalygin
10:27 AM bluestore Backport #64590 (In Progress): reef: BlueFS: l_bluefs_log_compactions is counted twice in sync log compaction
Konstantin Shalygin
10:25 AM bluestore Bug #64511: kv/RocksDBStore: rocksdb_cf_compact_on_deletion has no effect on the default column family
Steven Goodliff wrote in #note-4:
> Hi,
>
> Will this fix get into 18.2.3 ?, thanks
Seems, backport bot is bro...
Konstantin Shalygin
10:09 AM bluestore Bug #64511: kv/RocksDBStore: rocksdb_cf_compact_on_deletion has no effect on the default column family
Hi,
Will this fix get into 18.2.3? Thanks
Steven Goodliff
10:06 AM bluestore Bug #65735 (New): OSDs failed to restart when doing crimson-osd:thrash tests
... Xuehan Xu
08:34 AM rgw Backport #59730: quincy: S3 CompleteMultipartUploadResult has empty ETag element
Wout van Heeswijk wrote in #note-3:
> Is something needed to merge this?
Only a developer with merge permissions
Konstantin Shalygin
08:24 AM rgw Backport #59730: quincy: S3 CompleteMultipartUploadResult has empty ETag element
@konstantin
Is something needed to merge this? main, pacific and reef have been patched already. Is something bloc...
Wout van Heeswijk
08:32 AM RADOS Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
/a/teuthology/yuriw-2024-04-20_01:10:46-rados-wip-yuri7-testing-2024-04-18-1351-reef-distro-default-smithi/7664305 Matan Breizman
08:27 AM Orchestrator Bug #64208: test_cephadm.sh: Container version mismatch causes job to fail.
/a/yuriw-2024-04-20_01:10:46-rados-wip-yuri7-testing-2024-04-18-1351-reef-distro-default-smithi/7664211 Matan Breizman
08:24 AM RADOS Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
/a/yuriw-2024-04-20_01:10:46-rados-wip-yuri7-testing-2024-04-18-1351-reef-distro-default-smithi/7664183 Matan Breizman
08:13 AM CephFS Bug #65261: qa/cephfs: cephadm related failure on fs/upgrade job
/a//teuthology/yuriw-2024-04-20_01:10:46-rados-wip-yuri7-testing-2024-04-18-1351-reef-distro-default-smithi/7664176 Matan Breizman
08:09 AM RADOS Bug #53767: qa/workunits/cls/test_cls_2pc_queue.sh: killing an osd during thrashing causes timeout
/a/yuriw-2024-04-20_01:10:46-rados-wip-yuri7-testing-2024-04-18-1351-reef-distro-default-smithi/7664155 Matan Breizman
08:03 AM RADOS Bug #64725: rados/singleton: application not enabled on pool 'rbd'
/a/https://pulpito.ceph.com/yuriw-2024-04-20_01:10:46-rados-wip-yuri7-testing-2024-04-18-1351-reef-distro-default-smi... Matan Breizman
07:57 AM RADOS Bug #65183: Overriding an EC pool needs the "--yes-i-really-mean-it" flag in addition to "force"
/a/yuriw-2024-04-20_01:10:46-rados-wip-yuri7-testing-2024-04-18-1351-reef-distro-default-smithi/7664087 Matan Breizman
07:09 AM crimson Bug #63647: SnapTrimEvent AddressSanitizer: heap-use-after-free
https://pulpito.ceph.com/matan-2024-04-30_07:11:13-crimson-rados-wip-matanb-crimson-testing-user-modify-distro-crimso... Matan Breizman
07:08 AM crimson Bug #64206: obc->is_loaded_and_valid() assertion
osd.3
https://pulpito.ceph.com/matan-2024-04-30_07:11:13-crimson-rados-wip-matanb-crimson-testing-user-modify-distro...
Matan Breizman
06:44 AM CephFS Bug #65647: Evicted kernel client may get stuck after reconnect
Xiubo Li wrote in #note-6:
> Venky Shankar wrote in #note-5:
> > The tracker description mentions @denied reconne...
Mykola Golub
06:25 AM RADOS Bug #44510 (Fix Under Review): osd/osd-recovery-space.sh TEST_recovery_test_simple failure
Nitzan Mordechai
01:45 AM CephFS Bug #65733 (Fix Under Review): mds: upgrade to MDS enforcing CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK with client having root_squash in any MDS cap causes eviction for all file systems the client has caps for
Patrick Donnelly

04/30/2024

11:45 PM CephFS Bug #65733 (Pending Backport): mds: upgrade to MDS enforcing CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK with client having root_squash in any MDS cap causes eviction for all file systems the client has caps for
Patrick Donnelly
11:43 PM CephFS Bug #56067 (Resolved): Cephfs data loss with root_squash enabled
Resolved via #57154 Patrick Donnelly
11:33 PM Ceph QA QA Run #65592 (QA Closed): wip-yuriw-testing-20240419.185239-main
Yuri Weinstein
11:01 PM Ceph QA QA Run #65592 (QA Approved): wip-yuriw-testing-20240419.185239-main
@yuriw rados approved: https://tracker.ceph.com/projects/rados/wiki/MAIN#httpstrackercephcomissues65592 Laura Flores
11:32 PM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
https://github.com/ceph/ceph/pull/56995 merged Yuri Weinstein
10:59 PM CephFS Bug #64707: suites/fsstress.sh hangs on one client - test times out
/a/lflores-2024-04-29_20:31:53-upgrade-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7679200 Laura Flores
10:55 PM cephsqlite Bug #59335: Found coredumps on smithi related to sqlite3
/a/teuthology-2024-04-28_20:00:15-rados-main-distro-default-smithi/7677031 Laura Flores
10:46 PM RADOS Bug #53544: src/test/osd/RadosModel.h: ceph_abort_msg("racing read got wrong version") in thrash_cache_writeback_proxy_none tests
/a/yuriw-2024-04-21_14:00:04-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7666787 Laura Flores
09:37 PM mgr Bug #65627: Centos 9 stream ceph container iscsi test failure
Logs are in ~dmick/c9.iscsi.archive.tgz on the teuthology node. Laura Flores
09:36 PM rgw Bug #65654: run-bucket-check.sh: failed assert len(json_out) == len(unlinked_keys)
It looks like the test reproduced and caught another scenario where unlinked objects get left behind. Still looking a... Cory Snyder
09:36 PM Ceph Bug #55461 (Fix Under Review): ceph osd crush swap-bucket {old_host} {new_host} where {old_host}={new_host} crashes monitors
Dan van der Ster
09:14 PM CephFS Bug #65704: mds+valgrind: beacon thread blocked for 60+ seconds
UPD: this is not a containerized installation, so the above guess is wrong Leonid Usov
08:07 PM CephFS Bug #65704: mds+valgrind: beacon thread blocked for 60+ seconds
> The alarming problem is that the beacon upkeep thread apparently slept for about 60 seconds! This should only be po... Leonid Usov
07:24 PM CephFS Bug #65704: mds+valgrind: beacon thread blocked for 60+ seconds
Leonid Usov wrote in #note-4:
> Looking at @remote/smithi096/log/ceph-mds.d.log.gz@, I get an impression that the no...
Patrick Donnelly
07:16 PM CephFS Bug #65704: mds+valgrind: beacon thread blocked for 60+ seconds
Looking at @remote/smithi096/log/ceph-mds.d.log.gz@, I get the impression that the node is cut off from the network for a few ... Leonid Usov
05:10 PM CephFS Bug #65704: mds+valgrind: beacon thread blocked for 60+ seconds
... Leonid Usov
03:58 PM CephFS Bug #65704 (New): mds+valgrind: beacon thread blocked for 60+ seconds
This one is really weird and my working theory is that this is related to the quiesce database. Test symptom:
<pre...
Patrick Donnelly
08:58 PM cephsqlite Backport #65731 (In Progress): reef: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
Patrick Donnelly
08:49 PM cephsqlite Backport #65731 (In Progress): reef: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
https://github.com/ceph/ceph/pull/57190 Backport Bot
08:58 PM cephsqlite Backport #65730 (In Progress): squid: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
Patrick Donnelly
08:49 PM cephsqlite Backport #65730 (In Progress): squid: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
https://github.com/ceph/ceph/pull/57189 Backport Bot
08:53 PM Orchestrator Bug #65732 (New): rados/cephadm/osds: job times out during nvme_loop interval
/a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664906... Laura Flores
08:46 PM cephsqlite Bug #65494 (Pending Backport): ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
Patrick Donnelly
08:37 PM Orchestrator Bug #65017: cephadm: log_channel(cephadm) log [ERR] : Failed to connect to smithi090 (10.0.0.9). Permission denied
/a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7665003 Laura Flores
08:34 PM RADOS Bug #65729 (New): thrash_cache_writeback_proxy_none: command failed when setting target_max_objects
/a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664981... Laura Flores
08:26 PM Orchestrator Bug #63784: qa/standalone/mon/mkfs.sh:'mkfs/a' already exists and is not empty: monitor may already exist
/a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664969 Laura Flores
08:23 PM Orchestrator Bug #65728 (New): Alertmanager in an unknown state
/a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664960/remote/smithi... Laura Flores
08:01 PM Infrastructure Bug #65727 (In Progress): ntpq: command not found
/a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664955... Laura Flores
07:57 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
/a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664940
OSD_DOWN
Laura Flores
07:11 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
/a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664903... Laura Flores
06:15 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
/a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664854
/a/yuriw-2024...
Laura Flores
06:13 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
/a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664765
/a/yuriw-2024...
Laura Flores
06:03 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
/a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664686... Laura Flores
07:56 PM RADOS Bug #63198: rados/thrash: AssertionError: wait_for_recovery: failed before timeout expired
/a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664923... Laura Flores
07:47 PM Orchestrator Backport #65417 (Resolved): squid: cephadmin returns "1" on successful host-maintenance enter/exit - should return "0"
Adam King
07:46 PM Orchestrator Bug #65234 (Resolved): upgrade/quincy-x/stress-split: cephadm failed to parse grafana.ini file due to inadequate permission
Adam King
07:44 PM Orchestrator Backport #65381 (Resolved): squid: upgrade/quincy-x/stress-split: cephadm failed to parse grafana.ini file due to inadequate permission
Adam King
07:43 PM Orchestrator Backport #65726 (New): quincy: cephadm: anonymous_access: false is dropped from grafana spec after apply
Backport Bot
07:43 PM Orchestrator Backport #65725 (New): reef: cephadm: anonymous_access: false is dropped from grafana spec after apply
Backport Bot
07:42 PM Orchestrator Backport #65724 (New): squid: cephadm: anonymous_access: false is dropped from grafana spec after apply
Backport Bot
07:42 PM Orchestrator Backport #65723 (New): reef: cephadm: agent tries to json load response payload before checking for errors
Backport Bot
07:42 PM Orchestrator Backport #65722 (New): squid: cephadm: agent tries to json load response payload before checking for errors
Backport Bot
07:41 PM Orchestrator Bug #65553 (Pending Backport): cephadm: agent tries to json load response payload before checking for errors
Adam King
07:36 PM Orchestrator Bug #65511 (Pending Backport): cephadm: anonymous_access: false is dropped from grafana spec after apply
Adam King
07:27 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
Ilya Dryomov wrote in #note-20:
> Hi Nir,
>
> I built a container based on 18.2.3 (an upcoming release). It woul...
Nir Soffer
12:51 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
Ilya Dryomov wrote in #note-20:
> Hi Nir,
>
> I built a container based on 18.2.3 (an upcoming release). It woul...
Nir Soffer
07:16 PM RADOS Bug #65721 (New): [MON] Connection Scores: peers become dead after ~5mins, However quorum seems fine
All ranks report that everyone is alive and well... Kamoltat (Junior) Sirivadhna
07:03 PM RADOS Bug #65695 (Fix Under Review): [MON] ConnectionTracker dumps duplicate keys
Kamoltat (Junior) Sirivadhna
02:26 AM RADOS Bug #65695 (Fix Under Review): [MON] ConnectionTracker dumps duplicate keys
Problem:
Currently, the ConnectionTracker::dump()
will dump a duplicate key which is not
ideal when you want to ...
Kamoltat (Junior) Sirivadhna
06:48 PM rbd Feature #65720 (New): diff-iterate should allow passing the "from snapshot" by snap ID
If e.g. RBD_SNAP_NAMESPACE_TYPE_TRASH snapshots pile up, a lot of space can go missing/unaccounted for from the user'... Ilya Dryomov
06:24 PM Infrastructure Bug #65719 (New): debian-17.2.6 jammy repository does not have a Release file
/a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664896... Laura Flores
06:12 PM RADOS Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
/a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664739 Laura Flores
06:07 PM Orchestrator Bug #65718 (In Progress): cephadm: nvmeof daemon omap_file_lock_retry_sleep_interval default causes daemon to fail to start
... Adam King
05:52 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
@ksirivad I don't know what you want to do
PLMK
Yuri Weinstein
04:07 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
@yuriw
Hmm in the most recent re-run you scheduled https://pulpito.ceph.com/yuriw-2024-04-26_18:18:24-rados-wip-y...
Kamoltat (Junior) Sirivadhna
05:47 PM Orchestrator Bug #65717 (In Progress): cephadm: iscsi and nvme auth keyring are not cleaned up
If you move/remove an iscsi daemon, the keyring for the removed daemon is left behind unless the user cleans up the k... Adam King
05:38 PM CephFS Backport #65711 (In Progress): squid: mds: regular file inode flags are not replicated by the policylock
Patrick Donnelly
04:26 PM CephFS Backport #65711 (In Progress): squid: mds: regular file inode flags are not replicated by the policylock
https://github.com/ceph/ceph/pull/57179 Backport Bot
05:36 PM CephFS Backport #65713 (In Progress): quincy: mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
Patrick Donnelly
05:29 PM CephFS Backport #65713 (New): quincy: mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
Patrick Donnelly
05:28 PM CephFS Backport #65713 (Rejected): quincy: mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
Patrick Donnelly
04:27 PM CephFS Backport #65713 (In Progress): quincy: mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
https://github.com/ceph/ceph/pull/57178 Backport Bot
05:28 PM CephFS Backport #65715 (In Progress): reef: mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
Patrick Donnelly
04:27 PM CephFS Backport #65715 (In Progress): reef: mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
https://github.com/ceph/ceph/pull/57177 Backport Bot
05:20 PM CephFS Backport #65714 (In Progress): squid: mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
Patrick Donnelly
04:27 PM CephFS Backport #65714 (In Progress): squid: mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
https://github.com/ceph/ceph/pull/57176 Backport Bot
05:19 PM CephFS Backport #65712 (In Progress): squid: qa: lockup not long enough to for test_quiesce_authpin_wait
Patrick Donnelly
04:26 PM CephFS Backport #65712 (In Progress): squid: qa: lockup not long enough to for test_quiesce_authpin_wait
https://github.com/ceph/ceph/pull/57175 Backport Bot
05:18 PM CephFS Backport #65709 (In Progress): reef: client: resends request to same MDS it just received a forward from if it does not have an open session with the target
Patrick Donnelly
04:25 PM CephFS Backport #65709 (In Progress): reef: client: resends request to same MDS it just received a forward from if it does not have an open session with the target
https://github.com/ceph/ceph/pull/57174 Backport Bot
05:18 PM CephFS Backport #65708 (In Progress): squid: client: resends request to same MDS it just received a forward from if it does not have an open session with the target
Patrick Donnelly
04:25 PM CephFS Backport #65708 (In Progress): squid: client: resends request to same MDS it just received a forward from if it does not have an open session with the target
https://github.com/ceph/ceph/pull/57173 Backport Bot
05:17 PM CephFS Backport #65707 (In Progress): reef: qa: increase debugging for snap_schedule
Patrick Donnelly
04:25 PM CephFS Backport #65707 (In Progress): reef: qa: increase debugging for snap_schedule
https://github.com/ceph/ceph/pull/57172 Backport Bot
05:17 PM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
Patrick Donnelly wrote in #note-36:
> It sounds more and more to me like there is some kind of request the client co...
Venky Shankar
10:30 AM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
Venky Shankar wrote in #note-37:
> Patrick Donnelly wrote in #note-36:
> > It sounds more and more to me like there...
Venky Shankar
05:29 AM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
Patrick Donnelly wrote in #note-36:
> It sounds more and more to me like there is some kind of request the client co...
Venky Shankar
05:17 PM CephFS Backport #65706 (In Progress): squid: qa: increase debugging for snap_schedule
Patrick Donnelly
04:25 PM CephFS Backport #65706 (In Progress): squid: qa: increase debugging for snap_schedule
https://github.com/ceph/ceph/pull/57171 Backport Bot
05:11 PM CephFS Bug #65716 (In Progress): mds: dir merge can't progress due to fragment nested pins, blocking the quiesce_path and causing a quiesce timeout
... Patrick Donnelly
04:26 PM CephFS Backport #65710 (Fix Under Review): squid: workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
Backport Bot
04:22 PM CephFS Fix #65617 (Pending Backport): qa: increase debugging for snap_schedule
Patrick Donnelly
04:22 PM CephFS Bug #65614 (Pending Backport): client: resends request to same MDS it just received a forward from if it does not have an open session with the target
Patrick Donnelly
04:21 PM CephFS Bug #65606 (Pending Backport): workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
Patrick Donnelly
04:20 PM CephFS Bug #65518 (Pending Backport): mds: regular file inode flags are not replicated by the policylock
Patrick Donnelly
04:20 PM CephFS Bug #65496 (Pending Backport): mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
Patrick Donnelly
04:19 PM CephFS Bug #65508 (Pending Backport): qa: lockup not long enough to for test_quiesce_authpin_wait
Patrick Donnelly
04:13 PM Ceph QA QA Run #65694 (QA Closed): wip-pdonnell-testing-20240429.210911-debug
Patrick Donnelly
04:13 PM Ceph QA QA Run #65694 (QA Approved): wip-pdonnell-testing-20240429.210911-debug
https://tracker.ceph.com/projects/cephfs/wiki/Main#2024-04-30 Patrick Donnelly
04:13 PM CephFS Bug #65705 (Fix Under Review): qa: snaptest-multiple-capsnaps.sh failure
... Patrick Donnelly
04:10 PM Linux kernel client Bug #64471: kernel: upgrades from quincy/v18.2.[01]/reef to main|squid fail with kernel oops
/teuthology/pdonnell-2024-04-30_05:04:19-fs-wip-pdonnell-testing-20240429.210911-debug-distro-default-smithi/7680637/... Patrick Donnelly
03:34 PM CephFS Bug #62664: ceph-fuse: failed to remount for kernel dentry trimming; quitting!
This is related to https://github.com/util-linux/util-linux/issues/2576 and will happen on any system with util-linux... Jakob Haufe
03:12 PM Orchestrator Bug #65703 (New): qa/suites/fs/upgrade: Command failed ... ceph orch upgrade check quay.ceph.io/ceph-ci/ceph:$sha1
... Patrick Donnelly
02:57 PM Infrastructure Bug #65639 (In Progress): smithi139 unable to be reached over ssh
adam kraitman
02:48 PM Infrastructure Bug #65639 (Resolved): smithi139 unable to be reached over ssh
I checked for hardware issues on smithi139 but I didn't find anything. I can leave this ticket on hold and check ag... adam kraitman
11:27 AM Infrastructure Bug #65639 (In Progress): smithi139 unable to be reached over ssh
adam kraitman
02:54 PM rgw Feature #61887 (Fix Under Review): s3: GetBucketLocation should also return placement target
Casey Bodley
02:39 PM Orchestrator Bug #65702 (In Progress): cephadm: ganesha-rados-grace tool sometimes fails the first time it is run, causing a health warning

Failures look like...
Adam King
02:31 PM CephFS Bug #65618: qa: fsstress: cannot execute binary file: Exec format error
Other runs:... Patrick Donnelly
01:02 PM CephFS Bug #65618 (Triaged): qa: fsstress: cannot execute binary file: Exec format error
Venky Shankar
02:28 PM Infrastructure Bug #65682 (Resolved): Getting WARN[0019] Conmon at /usr/bin/conmon invalid: signal: bus error (core dumped) issue on magna006 node
adam kraitman
08:40 AM Infrastructure Bug #65682: Getting WARN[0019] Conmon at /usr/bin/conmon invalid: signal: bus error (core dumped) issue on magna006 node
Fixed; magna006 was reimaged with RHEL 9.3 adam kraitman
02:24 PM CephFS Bug #65701 (Fix Under Review): qa: quiesce cache/ops dump not world readable
Patrick Donnelly
02:22 PM CephFS Bug #65701 (Pending Backport): qa: quiesce cache/ops dump not world readable
/teuthology/pdonnell-2024-04-30_05:04:19-fs-wip-pdonnell-testing-20240429.210911-debug-distro-default-smithi/7680431/... Patrick Donnelly
02:18 PM Ceph QA QA Run #65560: wip-yuri5-testing-2024-04-17-1400
Aishwarya Mathuria wrote in #note-6:
> @yuriw can we please re-run this? There are 172 dead jobs.
rerunning
Yuri Weinstein
02:18 PM Ceph QA QA Run #65560 (QA Needs Approval): wip-yuri5-testing-2024-04-17-1400
Yuri Weinstein
04:23 AM Ceph QA QA Run #65560 (QA Needs Rerun/Rebuilt): wip-yuri5-testing-2024-04-17-1400
@yuriw can we please re-run this? There are 172 dead jobs. Aishwarya Mathuria
02:06 AM Ceph QA QA Run #65560: wip-yuri5-testing-2024-04-17-1400
@lflores sure Aishwarya Mathuria
02:04 PM CephFS Bug #65700 (Fix Under Review): qa: Health detail: HEALTH_WARN Degraded data redundancy: 40/348 objects degraded (11.494%), 9 pgs degraded" in cluster log
Patrick Donnelly
02:02 PM CephFS Bug #65700 (Pending Backport): qa: Health detail: HEALTH_WARN Degraded data redundancy: 40/348 objects degraded (11.494%), 9 pgs degraded" in cluster log
https://pulpito.ceph.com/pdonnell-2024-04-30_05:04:19-fs-wip-pdonnell-testing-20240429.210911-debug-distro-default-sm... Patrick Donnelly
01:08 PM CephFS Bug #65580 (Triaged): mds/client: add dummy client feature to test client eviction
Venky Shankar
11:36 AM CephFS Bug #65580: mds/client: add dummy client feature to test client eviction
BTW we do have this [0] which is run as part of [1] so we can run this after the MDS upgrade (with some minor tweaks ... Dhairya Parmar
11:29 AM CephFS Bug #65580: mds/client: add dummy client feature to test client eviction
So this would be something like adding `CEPHFS_FEATURE_DUMMY` to `include/ceph_features.h`, then post MDS upgrade hav... Dhairya Parmar
01:08 PM ceph-volume Bug #65584 (Fix Under Review): ceph-volume: use os.makedirs to implement mkdir_p
Guillaume Abrioux
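For readers unfamiliar with the idea named in the title of Bug #65584, here is a minimal sketch of a `mkdir -p` style helper built on `os.makedirs`. It is an illustration only and not the ceph-volume code: the function name `mkdir_p`, its `mode` parameter, and its behavior are assumptions made for this example.

```python
import os


def mkdir_p(path: str, mode: int = 0o755) -> None:
    """Hypothetical helper: create `path` and any missing parent directories.

    With exist_ok=True, os.makedirs does not raise if the directory already
    exists. It still raises FileExistsError if `path` exists but is not a
    directory, which is usually the behavior a caller wants.
    """
    os.makedirs(path, mode=mode, exist_ok=True)
```

Calling the helper twice on the same path is harmless, which is the main reason to prefer it over a hand-written check followed by `os.mkdir`.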
01:04 PM CephFS Bug #65572: Command failed (workunit test fs/snaps/untar_snap_rm.sh) on smithi155 with status 1
There aren't even any Ceph-side logs. It seems the cluster died suddenly. Xiubo Li
12:58 PM CephFS Bug #65647 (Triaged): Evicted kernel client may get stuck after reconnect
Venky Shankar
12:44 PM CephFS Bug #65647: Evicted kernel client may get stuck after reconnect
Venky Shankar wrote in #note-5:
> Mykola Golub wrote in #note-4:
> > Xiubo Li wrote in #note-3:
> >
> > > Is tha...
Xiubo Li
12:24 PM CephFS Bug #65647: Evicted kernel client may get stuck after reconnect
Mykola Golub wrote in #note-4:
> Xiubo Li wrote in #note-3:
>
> > Is that possible to enable the mds debug logs, ...
Venky Shankar
12:36 PM Ceph QA QA Run #65699 (QA Testing): wip-pdonnell-testing-20240430.185512-reef-debug
* "PR #57162":https://github.com/ceph/ceph/pull/57162 -- reef: qa: add support/qa for cephfs-shell on CentOS 9 / RHEL9 Patrick Donnelly
12:04 PM CephFS Bug #65423 (Rejected): Monitor crashes down when I try to create a FS. The stacks maybe related to metadata server map decoder during the PAXOS service
Please follow the suggestion in note-3. Venky Shankar
12:02 PM CephFS Bug #65455 (Rejected): read operation hung in Client::get_caps
Please reopen the ticket if this is reproducible on supported Ceph versions. Venky Shankar
12:02 PM CephFS Bug #65616 (Triaged): pybind/mgr/snap_schedule: 1m scheduled snaps not reliably executed (RuntimeError: The following counters failed to be set on mds daemons: {'mds_server.req_rmsnap_latency.avgcount'})
Venky Shankar
11:58 AM sepia Support #65633 (In Progress): Sepia Lab Access Request
adam kraitman
11:57 AM sepia Support #65633: Sepia Lab Access Request
Hey Neha,
You should have access to the Sepia lab now. Please verify you're able to connect to the vpn and ssh neh...
adam kraitman
11:48 AM Ceph QA QA Run #65454: wip-vshankar-testing-20240411.061452
Had to rebuild the (debug) branch since some fixes were merged for known issues. Venky Shankar
11:13 AM Ceph QA QA Run #65454 (QA Needs Rerun/Rebuilt): wip-vshankar-testing-20240411.061452
Venky Shankar
11:41 AM rgw Bug #65664: Crash observed in boost::asio module related to stream.async_shutdown()
ACK,
testing with:...
Mark Kogan
11:37 AM Ceph QA QA Run #65680: wip-mchangir-testing-20240429.064231-main-debug
"Teuthology Jobs":https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-... Milind Changire
09:54 AM RADOS Bug #44510 (In Progress): osd/osd-recovery-space.sh TEST_recovery_test_simple failure
from /a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659542
we can see t...
Nitzan Mordechai
09:20 AM rbd Feature #65624 (In Progress): [pybind] expose CLONE_FORMAT and FLATTEN image options
Limiting the scope of this tracker to options related to cloning. Ilya Dryomov
09:09 AM sepia Support #65685: Sepia Lab Access Request
Hey Srinivasa Bharath Kanta, Are these new/additional or replacement credentials? adam kraitman
08:41 AM sepia Support #65685 (In Progress): Sepia Lab Access Request
adam kraitman
08:32 AM Dashboard Bug #65698 (Pending Backport): mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
Starting with mimic/RBD clone format 2, snapshots don't require protection to be cloned. What happens under the ho... Ernesto Puerta
08:30 AM rgw Bug #48358: rgw: qlen and qactive perf counters leak
Is there any progress on this ticket?
We still have a performance issue when active connections get high.
Andrea Bolzonella
08:16 AM crimson Bug #62162: local_shared_foreign_ptr: Assertion `ptr && *ptr' failed
https://pulpito.ceph.com/matan-2024-04-28_14:58:28-crimson-rados-wip-crimson-coherent-log-and-at_version-distro-crims... Matan Breizman
07:54 AM mgr Bug #47537 (Resolved): Prometheus rbd metrics absent by default
Konstantin Shalygin
07:53 AM Ceph Bug #19242 (Resolved): Ownership of /var/run/ceph not set with sysv-init under Jewel
Konstantin Shalygin
07:48 AM rgw Backport #57658 (In Progress): quincy: fail to set requestPayment in slave zone
Konstantin Shalygin
07:46 AM crimson Bug #65697 (New): fix _do_rollback_to clone_overlaps
Bring https://github.com/ceph/ceph/pull/56696 to Crimson. Matan Breizman
07:44 AM rgw Bug #50261 (Fix Under Review): rgw: system users can't issue role policy related ops without explicit user policy
Konstantin Shalygin
07:42 AM rgw Bug #49313 (Resolved): RGW ops log is not logging bucket listing operations
Konstantin Shalygin
07:41 AM mgr Bug #43897 (Fix Under Review): crash module reports UTC timestamps but doesn't format timestamps accordingly
Konstantin Shalygin
07:40 AM mgr Bug #46703 (Resolved): mgr/prometheus: introduce metric for collection time
Konstantin Shalygin
06:47 AM crimson Bug #65451 (Resolved): tri_mutex::promote_from_read(): Assertion `readers == 1' failed.
Matan Breizman
06:44 AM crimson Bug #65451: tri_mutex::promote_from_read(): Assertion `readers == 1' failed.
Tested *before* 56844 was merged
https://pulpito.ceph.com/matan-2024-04-25_08:01:21-crimson-rados-wip-matanb-crimson...
Matan Breizman
06:11 AM RADOS Bug #65517 (Fix Under Review): rados/thrash-erasure-code-crush-4-nodes: ceph task fails at getting monitors
Nitzan Mordechai
05:14 AM rgw Bug #64999: Slow RGW multisite sync due to "304 Not Modified" responses on primary zone

Hi All,
We are eagerly awaiting the resolution of the mentioned issue.
Any guidance or insight would be greatl...
Mohammad Saif
03:32 AM crimson Bug #65696 (New): osd crashes when recovering PGs that have unfound objects
Crimson OSD doesn't handle unfound objects during recovery/backfill.... Xuehan Xu
03:26 AM Ceph QA QA Run #65688 (QA Needs Approval): wip-yuri4-testing-2024-04-29-0642
@sseshasa can you review this when done pls? Yuri Weinstein
03:01 AM RADOS Bug #65686: ECBackend doesn't pass CEPH_OSD_OP_FLAG_BYPASS_CLEAN_CACHE flag when scrubbing
The
Radoslaw Zarzynski wrote in #note-1:
> A note from bug scrub: Mohit might want to take a look. Pinged him in...
MOHIT AGRAWAL
01:53 AM crimson Bug #65628 (Resolved): unittest-seastore (Timeout)
Yingxin Cheng
01:52 AM crimson Bug #65585 (Resolved): unittest-seastore (Timeout)
Yingxin Cheng
01:14 AM CephFS Bug #65157: cephfs-mirror: set layout.pool_name xattr of destination subvol correctly
I arrived at this issue only via code review.
I haven't attempted reproducing the issue.
Milind Changire
12:35 AM CephFS Feature #61866: MDSMonitor: require --yes-i-really-mean-it when failing an MDS with MDS_HEALTH_TRIM or MDS_HEALTH_CACHE_OVERSIZED health warnings
Rishabh Dave wrote in #note-7:
> Patrick, should we include other health warnings too? I didn't include it in PR bec...
Patrick Donnelly

04/29/2024

09:34 PM Orchestrator Bug #64872: rados/cephadm/smoke: Health check failed: 1 stray daemon(s) not managed by cephadm (CEPHADM_STRAY_DAEMON) in cluster log
/a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664699
More on http...
Laura Flores
09:30 PM RADOS Bug #65235: upgrade/reef-x/stress-split: "OSDMAP_FLAGS: noscrub flag(s) set" warning in cluster log
Sure Laura Brad Hubbard
05:59 PM RADOS Bug #65235: upgrade/reef-x/stress-split: "OSDMAP_FLAGS: noscrub flag(s) set" warning in cluster log
@badone perhaps we can address just the warnings reported in this Tracker, and address additional warnings in a Part 2. Laura Flores
09:14 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
In this one, we are intentionally setting OSDs down, so the warning is expected.
/a/yuriw-2024-04-20_15:32:38-rado...
Laura Flores
09:12 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
/a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664685... Laura Flores
09:12 PM Ceph QA QA Run #65594 (QA Needs Approval): wip-yuriw11-testing-20240501.200505-squid
Yuri Weinstein
09:11 PM Ceph QA QA Run #65594: wip-yuriw11-testing-20240501.200505-squid
Laura Flores wrote in #note-5:
> @yuriw can you rerun this, and also schedule an upgrade suite?
done
Yuri Weinstein
08:57 PM Ceph QA QA Run #65594 (QA Needs Rerun/Rebuilt): wip-yuriw11-testing-20240501.200505-squid
@yuriw can you rerun this, and also schedule an upgrade suite? Laura Flores
09:09 PM RADOS Bug #65183: Overriding an EC pool needs the "--yes-i-really-mean-it" flag in addition to "force"
/a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664685 Laura Flores
09:09 PM Ceph QA QA Run #65694 (QA Closed): wip-pdonnell-testing-20240429.210911-debug
* "PR #57059":https://github.com/ceph/ceph/pull/57059 -- mds: abort fragment/export when quiesced
* "PR #57044":http...
Patrick Donnelly
08:58 PM Ceph QA QA Run #65592: wip-yuriw-testing-20240419.185239-main
Note to self: I also scheduled an upgrade suite. Laura Flores
08:54 PM Ceph QA QA Run #65661 (QA Closed): wip-pdonnell-testing-20240425.015853-debug
Broken by https://github.com/ceph/ceph/pull/57059 Patrick Donnelly
08:53 PM devops Bug #65693 (In Progress): ceph-mgr-dashboard RPM requires python3-werkzeug
Ken Dreyer
08:26 PM devops Bug #65693 (Pending Backport): ceph-mgr-dashboard RPM requires python3-werkzeug
The @ceph-mgr-dashboard@ RPM has a requirement on the @python3-werkzeug@ RPM package, but no code in @rpm -qf ceph-mg... Ken Dreyer
08:41 PM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
It sounds more and more to me like there is some kind of request the client continues to wait on that is blocking the... Patrick Donnelly
08:12 PM Ceph QA QA Run #65641: wip-yuriw8-testing-20240424.000125-main
Laura Flores wrote in #note-4:
> @yuriw this PR caused the build failure: https://github.com/ceph/ceph/pull/55592/
> ...
Yuri Weinstein
07:58 PM Ceph QA QA Run #65641 (QA Needs Rerun/Rebuilt): wip-yuriw8-testing-20240424.000125-main
@yuriw this PR caused the build failure: https://github.com/ceph/ceph/pull/55592/
Can you drop it and rebuild?
Laura Flores
07:54 PM Ceph QA QA Run #65270: wip-yuri6-testing-2024-04-02-1310
I scheduled another rerun since there were too many infra failures:
https://pulpito.ceph.com/lflores-2024-04-29_19:4...
Laura Flores
07:39 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
... James Ringer
06:55 PM Infrastructure Bug #65682: Getting WARN[0019] Conmon at /usr/bin/conmon invalid: signal: bus error (core dumped) issue on magna006 node
magna006 node is not reachable now.
tmathew@magna002:~$ ping magna006
PING magna006.ceph.redhat.com (10.8.128.6) ...
Tintu Mathew
09:13 AM Infrastructure Bug #65682: Getting WARN[0019] Conmon at /usr/bin/conmon invalid: signal: bus error (core dumped) issue on magna006 node
https://ocs3xstage-jenkins-csb-rhgsocs3x.apps.ocp-c1.prod.psi.redhat.com/view/Baremetal/job/qe-reef-rados-baremetal/3... Tintu Mathew
09:13 AM Infrastructure Bug #65682: Getting WARN[0019] Conmon at /usr/bin/conmon invalid: signal: bus error (core dumped) issue on magna006 node
When the pipeline tries to execute the podman run command, it fails with the following error and we are not ... Tintu Mathew
09:07 AM Infrastructure Bug #65682 (Resolved): Getting WARN[0019] Conmon at /usr/bin/conmon invalid: signal: bus error (core dumped) issue on magna006 node
While running podman ps command on magna006 node, we are getting WARN[0019] Conmon at /usr/bin/conmon invalid: signal... Tintu Mathew
06:10 PM RADOS Bug #42519 (Closed): During deployment of the ceph,when the main node starts slower than the other nodes.It may lead to generate a core by assert.
No touch for 4 years means rather low priority. Feel free to reopen if needed. Radoslaw Zarzynski
06:08 PM RADOS Bug #65591: Pool MAX_AVAIL goes UP when an OSD is marked down+in
Bump up. Radoslaw Zarzynski
06:04 PM RADOS Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
QA Review in progress. Radoslaw Zarzynski
06:02 PM RADOS Bug #53000: OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue: pure virtual method called
Still looking into this one. Laura Flores
05:56 PM RADOS Bug #54515: mon/health-mute.sh: TEST_mute: return 1 (HEALTH WARN 3 mgr modules have failed dependencies)
I'll take a look at the latest re-occurrence to see if it needs a new tracker. Laura Flores
05:53 PM RADOS Bug #65670 (Closed): src/test/osd/RadosModel.h inappriopriate erasing items while iterating std::set caused test case failures.
Radoslaw Zarzynski
05:48 PM RADOS Bug #47813 (Closed): osd op age is 4294967296
Radoslaw Zarzynski
05:46 PM RADOS Bug #44631: ceph pg dump error code 124
Hmm, it looks like this bug last occurred in 2021. Is it still reproducing? Radoslaw Zarzynski
05:43 PM RADOS Bug #65686: ECBackend doesn't pass CEPH_OSD_OP_FLAG_BYPASS_CLEAN_CACHE flag when scrubbing
A note from bug scrub: Mohit might want to take a look. Pinged him in a side-channel. Radoslaw Zarzynski
12:07 PM RADOS Bug #65686 (Fix Under Review): ECBackend doesn't pass CEPH_OSD_OP_FLAG_BYPASS_CLEAN_CACHE flag when scrubbing
A while ago https://github.com/ceph/ceph/pull/23629 introduced CEPH_OSD_OP_FLAG_BYPASS_CLEAN_CACHE flag for deep-scru... Igor Fedotov
04:59 PM Ceph Bug #65692 (New): Body of POST requests is dumped into the RGW log file (at level 10) which might contain sensitive info such as user name/password.
Body of POST requests is dumped into the RGW log file (at level 10) which might contain sensitive info such as user n... Igor Gomon
04:13 PM Orchestrator Bug #65691 (Pending Backport): cephadm: set "osd - profile rbd" for nvmeof service
Currently Cephadm does set "profile rbd" for "mon" scope, but not for "osd". This basically results in rbd snapshot f... Ernesto Puerta
03:21 PM CephFS Cleanup #65690 (New): mds: move specialized cleanup for export_dir to MDCache::request_cleanup
https://github.com/ceph/ceph/blob/14f956b95eb2902d7a33b1026c450cb388ada113/src/mds/Migrator.cc#L245
We should also...
Patrick Donnelly
03:20 PM CephFS Cleanup #65689 (New): mds: move specialized cleanup for fragment_dir to MDCache::request_cleanup
https://github.com/ceph/ceph/blob/14f956b95eb2902d7a33b1026c450cb388ada113/src/mds/MDCache.cc#L12101-L12112
We sho...
Patrick Donnelly
03:19 PM nvme-of Backport #65124 (Resolved): squid: cephadm - make changes to ceph-nvmeof.conf template
Adam King
01:43 PM Ceph QA QA Run #65688 (QA Needs Rerun/Rebuilt): wip-yuri4-testing-2024-04-29-0642

--- done. these PRs were included:
https://github.com/ceph/ceph/pull/56477 - squid: qa: Add benign cluster warning...
Yuri Weinstein
11:37 AM CephFS Bug #50260: pacific: qa: "rmdir: failed to remove '/home/ubuntu/cephtest': Directory not empty"
Venky Shankar wrote in #note-3:
> Xiubo Li wrote in #note-2:
> > /a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-tes...
Xiubo Li
09:47 AM CephFS Bug #50260: pacific: qa: "rmdir: failed to remove '/home/ubuntu/cephtest': Directory not empty"
Xiubo Li wrote in #note-2:
> /a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-s...
Venky Shankar
10:39 AM sepia Support #65685 (In Progress): Sepia Lab Access Request
Hi Adam,
Please provide access to the Sepia lab.
*Here are the details:*
root:~# sudo ./sepia/new-client...
Srinivasa Bharath Kanta
10:37 AM Dashboard Bug #61720 (Pending Backport): mgr/dashboard: embedded grafana dashboards are still using the old 'graph' panel type.
Aashish Sharma
10:36 AM Dashboard Bug #61720 (Fix Under Review): mgr/dashboard: embedded grafana dashboards are still using the old 'graph' panel type.
Aashish Sharma
10:06 AM Dashboard Bug #61720 (Pending Backport): mgr/dashboard: embedded grafana dashboards are still using the old 'graph' panel type.
Aashish Sharma
10:06 AM Dashboard Bug #61720 (Fix Under Review): mgr/dashboard: embedded grafana dashboards are still using the old 'graph' panel type.
Aashish Sharma
08:40 AM Dashboard Bug #61720 (Pending Backport): mgr/dashboard: embedded grafana dashboards are still using the old 'graph' panel type.
Aashish Sharma
09:41 AM RADOS Bug #64373 (Fix Under Review): osd: Segmentation fault on OSD shutdown
Igor Fedotov
09:26 AM Orchestrator Bug #65683 (New): cephadmin wrongly assumes that lines in /sys/kernel/security/apparmor/profiles contain exactly one space
The code in line
https://github.com/ceph/ceph/blob/1680e466aab77cdf9ba07394bea664106580b32b/src/cephadm/cephadmlib/h...
Frank Nagel
08:52 AM bluestore Bug #65678: Cannot use BtreeAllocator for blustore or bluefs
NB: before backporting please resolve https://tracker.ceph.com/issues/61949.
Igor Fedotov
08:45 AM bluestore Bug #65678 (Fix Under Review): Cannot use BtreeAllocator for blustore or bluefs
Igor Fedotov
12:44 AM bluestore Bug #65678 (Fix Under Review): Cannot use BtreeAllocator for blustore or bluefs
BtreeAllocator was added in the commit https://github.com/ceph/ceph/pull/41828.
Its performance has advantages in so...
changzhi tan
08:47 AM Dashboard Feature #65681 (New): mgr/dashboard: add support for smb service
Since smb service deployment has been added with cephadm, the dashboard should support smb service management. Pedro González Gómez
07:19 AM CephFS Bug #50821 (Fix Under Review): qa: untar_snap_rm failure during mds thrashing
Xiubo Li
12:52 AM CephFS Bug #50821: qa: untar_snap_rm failure during mds thrashing
Yeah, really this time we hit another case. The local *MDS* was in *up:active* state but not others, so in this case ... Xiubo Li
12:23 AM CephFS Bug #50821: qa: untar_snap_rm failure during mds thrashing
This is the same issue as https://tracker.ceph.com/issues/62036, which was already fixed but has been hit again. It ... Xiubo Li
07:18 AM rgw Bug #59471 (Resolved): Object Ownership Inconsistent
Konstantin Shalygin
07:17 AM rgw Backport #61353 (Resolved): quincy: Object Ownership Inconsistent
Konstantin Shalygin
07:16 AM CephFS Bug #57591 (Resolved): cephfs: qa enables kclient for newop test
Konstantin Shalygin
07:16 AM rgw Backport #59376 (In Progress): quincy: rgw/s3 transfer encoding problems.
Konstantin Shalygin
07:15 AM CephFS Bug #58018 (Resolved): mount.ceph: will fail with old kernels
Konstantin Shalygin
07:15 AM CephFS Backport #58251 (Resolved): quincy: mount.ceph: will fail with old kernels
Konstantin Shalygin
07:14 AM CephFS Bug #16745 (Resolved): mon: prevent allocating snapids allocated for CephFS
Konstantin Shalygin
07:13 AM rgw Backport #59144 (In Progress): quincy: rgw: request QUERY_STRING is duplicated into ops-log uri element
Konstantin Shalygin
07:12 AM rgw Backport #58327 (In Progress): quincy: ListOpenIDConnectProviders XML format error
Konstantin Shalygin
07:11 AM rgw Bug #57784 (Resolved): beast frontend crashes on exception from socket.local_endpoint()
Konstantin Shalygin
07:11 AM rgw Backport #58235 (In Progress): quincy: multisite sync process block after long time running
Konstantin Shalygin
07:08 AM rgw Backport #58237 (Resolved): quincy: beast frontend crashes on exception from socket.local_endpoint()
Konstantin Shalygin
07:08 AM rgw Bug #56572 (Resolved): pubsub test failures
Konstantin Shalygin
07:08 AM rgw Backport #57561 (Resolved): quincy: pubsub test failures
Konstantin Shalygin
06:55 AM rgw Bug #59048 (Resolved): DeleteObjects response does not include DeleteMarker/DeleteMarkerVersionId
Konstantin Shalygin
06:55 AM rgw Backport #59132 (Resolved): quincy: DeleteObjects response does not include DeleteMarker/DeleteMarkerVersionId
Konstantin Shalygin
06:55 AM rgw Bug #57881 (Resolved): LDAP invalid password resource leak fix
Konstantin Shalygin
06:54 AM Ceph Bug #55079 (Resolved): rpm: remove contents of build directory at end of %install section
Konstantin Shalygin
06:49 AM crimson Bug #65585: unittest-seastore (Timeout)
Xuehan Xu wrote in #note-9:
> Yingxin Cheng wrote in #note-8:
> > Seem to me the blocking issue of test-seastore re...
Xuehan Xu
06:46 AM crimson Bug #65585: unittest-seastore (Timeout)
Yingxin Cheng wrote in #note-8:
> Seem to me the blocking issue of test-seastore reveals a deadlock from background ...
Xuehan Xu
06:37 AM crimson Bug #65585: unittest-seastore (Timeout)
Seems to me the blocking issue of test-seastore reveals a deadlock from background cleaning -- the IO transaction didn... Yingxin Cheng
06:43 AM Ceph QA QA Run #65680 (QA Testing): wip-mchangir-testing-20240429.064231-main-debug
* "PR #44359":https://github.com/ceph/ceph/pull/44359 -- mds: un-inline data on scrub Milind Changire
06:27 AM RADOS Backport #63879 (Resolved): quincy: tools/ceph_objectstore_tool: Support get/set/superblock
Konstantin Shalygin
06:27 AM rgw Backport #64426 (Resolved): reef: rgw: rados objects wrongly deleted
Konstantin Shalygin
05:17 AM rgw Backport #64325 (In Progress): reef: multisite: avoid writing multipart parts to the bucket index log
Jane Zhu
04:53 AM rgw Backport #64324 (In Progress): quincy: multisite: avoid writing multipart parts to the bucket index log
Jane Zhu
03:24 AM crimson Bug #65679 (New): osd crashes due to inconsistency between the in-memory cache and on disk data of the snap mapper
Operations in crimson can be interrupted, which is different from classic osds. The implementation of SnapMapper foll... Xuehan Xu

04/28/2024

05:02 AM CephFS Bug #50821: qa: untar_snap_rm failure during mds thrashing
This is the same issue as https://tracker.ceph.com/issues/62036, which was already fixed but has been hit again. It ... Xiubo Li
04:45 AM CephFS Bug #50821: qa: untar_snap_rm failure during mds thrashing
Okay, finally it was because the *mds.b* crashed and this was why it wasn't brought up:... Xiubo Li
04:40 AM CephFS Bug #50821: qa: untar_snap_rm failure during mds thrashing
It seems the *mds.b* daemon wasn't brought up in *300s* and then the watchdog barked and then all the daemons were ki... Xiubo Li
02:56 AM CephFS Bug #50821: qa: untar_snap_rm failure during mds thrashing
Patrick Donnelly wrote in #note-10:
> [...]
>
> From: /teuthology/pdonnell-2024-04-20_23:33:17-fs-wip-pdonnell-te...
Xiubo Li
03:21 AM crimson Bug #65673 (Fix Under Review): the main branch fails gcc-13 compilation
Kefu Chai
01:41 AM CephFS Backport #65363 (Rejected): reef: qa/cephfs: test_idem_unaffected_root_squash fails
The prerequisite patches were not backported to reef: https://tracker.ceph.com/issues/47264. Xiubo Li
01:41 AM CephFS Backport #65362 (Rejected): quincy: qa/cephfs: test_idem_unaffected_root_squash fails
The prerequisite patches were not backported to quincy: https://tracker.ceph.com/issues/47264. Xiubo Li
01:28 AM CephFS Backport #65361 (Fix Under Review): squid: qa/cephfs: test_idem_unaffected_root_squash fails
Xiubo Li
01:15 AM CephFS Backport #65323 (Fix Under Review): squid: src/mds/MDCache.cc: 5131: FAILED ceph_assert(isolated_inodes.empty())
Xiubo Li
01:15 AM CephFS Backport #65321 (Fix Under Review): reef: src/mds/MDCache.cc: 5131: FAILED ceph_assert(isolated_inodes.empty())
Xiubo Li
01:15 AM CephFS Backport #65322 (Fix Under Review): quincy: src/mds/MDCache.cc: 5131: FAILED ceph_assert(isolated_inodes.empty())
Xiubo Li
01:14 AM CephFS Backport #65677 (Fix Under Review): quincy: mds: the name and descriptions of the inotable testing only options need to be fixed
Xiubo Li
01:02 AM CephFS Backport #65677 (Fix Under Review): quincy: mds: the name and descriptions of the inotable testing only options need to be fixed
*mds_kill_skip_replaying_inotable* and *mds_inject_skip_replaying_inotable* are exactly the same, which is incorrect.... Xiubo Li
01:14 AM CephFS Backport #65676 (Fix Under Review): reef: mds: the name and descriptions of the inotable testing only options need to be fixed
Xiubo Li
01:01 AM CephFS Backport #65676 (Fix Under Review): reef: mds: the name and descriptions of the inotable testing only options need to be fixed
*mds_kill_skip_replaying_inotable* and *mds_inject_skip_replaying_inotable* are exactly the same, which is incorrect.... Xiubo Li
01:13 AM CephFS Backport #65675 (Fix Under Review): squid: mds: the name and descriptions of the inotable testing only options need to be fixed
Xiubo Li
01:00 AM CephFS Backport #65675 (Fix Under Review): squid: mds: the name and descriptions of the inotable testing only options need to be fixed
*mds_kill_skip_replaying_inotable* and *mds_inject_skip_replaying_inotable* are exactly the same, which is incorrect.... Xiubo Li

04/27/2024

03:51 PM bluestore Backport #65485 (In Progress): squid: bluestore/bluestore_types: check 'it' valid before using
Konstantin Shalygin
03:50 PM bluestore Backport #65484 (In Progress): reef: bluestore/bluestore_types: check 'it' valid before using
Konstantin Shalygin
03:49 PM bluestore Backport #65483 (Resolved): quincy: bluestore/bluestore_types: check 'it' valid before using
Konstantin Shalygin
01:32 PM bluestore Bug #64511 (Pending Backport): kv/RocksDBStore: rocksdb_cf_compact_on_deletion has no effect on the default column family
Konstantin Shalygin
03:12 AM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
s/good/off/g James Ringer
03:11 AM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
Something still looks good though with the numbers. It's like Ceph isn't balanced? Sure, OSD 3 RAW USE and DATA are t... James Ringer
01:59 AM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
... James Ringer
01:46 AM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
Yeah, I ran @kill -9@ on the OSD process in the pod and it resolved the problem. Thanks! James Ringer

04/26/2024

11:16 PM RADOS Bug #57061 (Pending Backport): Use single cluster log level (mon_cluster_log_level) config to control verbosity of cluster logs while logging to external entities
Prashant D
10:46 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
Generally what you need is to shut down the OSD process in a non-graceful manner and let it rebuild the allocmap during the f... Igor Fedotov
06:45 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
I read https://tracker.ceph.com/issues/63858#note-7, but I'm not sure how to apply the workaround. I have tried delet... James Ringer
04:24 PM bluestore Bug #65659 (Triaged): OSD Resize Increases Used Capacity Not Available Capacity
Igor Fedotov
04:24 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
Hi James,
so I'm pretty sure this is a duplicate of https://tracker.ceph.com/issues/63858
Please see https://trac...
Igor Fedotov
08:19 PM rbd Backport #65547 (Resolved): quincy: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
Ilya Dryomov
12:48 PM rbd Backport #65547: quincy: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/57029
merged
Yuri Weinstein
06:18 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
Kamoltat (Junior) Sirivadhna wrote in #note-13:
> @yuriw Please rerun this; we have too many failed and dead jobs due ...
Yuri Weinstein
05:00 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
@yuriw Please rerun this; we have too many failed and dead jobs due to infrastructure failures. Kamoltat (Junior) Sirivadhna
03:00 PM mgr Cleanup #55835: mgr: mute/hide NOTIFY_TYPES log errors
I made a draft pr for this: https://github.com/ceph/ceph/pull/57106
I didn't feel like verifying the unit tests pa...
John Mulligan
02:58 PM Ceph QA QA Run #65674 (QA Testing): wip-rishabh-testing-20240426.111959
https://github.com/ceph/ceph/pull/56981
https://github.com/ceph/ceph/pull/56846
https://github.com/ceph/ceph/pull/5...
Rishabh Dave
02:31 PM Ceph QA QA Run #65516 (QA Closed): wip-rishabh-testing-20240416.193735
Rishabh Dave
02:31 PM Ceph QA QA Run #65516: wip-rishabh-testing-20240416.193735
Testing was significantly slowed down. First due to new and persistent infra failures on CentOS 9 that caused ~95 dea... Rishabh Dave
12:48 PM Ceph QA QA Run #65638 (QA Closed): wip-yuriw4-testing-20240423.151325-quincy
Yuri Weinstein
07:07 AM Ceph QA QA Run #65638 (QA Approved): wip-yuriw4-testing-20240423.151325-quincy
Ilya Dryomov
12:06 PM Dashboard Bug #61312 (Fix Under Review): The command "ceph config set mgr mgr/dashboard/redirect_resolve_ip_addr True" fails
Zac Dover
11:43 AM rgw Backport #65666 (Fix Under Review): squid: rgw/lc: A few buckets stuck in UNINITIAL state
Soumya Koduri
11:04 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
Patrick Donnelly wrote in #note-23:
> Dhairya Parmar wrote in #note-21:
> > Dhairya Parmar wrote in #note-20:
> > ...
Dhairya Parmar
10:57 AM CephFS Bug #61660 (Pending Backport): mds: the name and descriptions of the inotable testing only options need to be fixed
Rishabh Dave
10:28 AM crimson Bug #65673 (Fix Under Review): the main branch fails gcc-13 compilation
... Xuehan Xu
10:10 AM crimson Bug #65672 (New): Rados write requests user_version set to 0 when pg interval changes lead to duplicated client requests.
Client logs:... Xuehan Xu
09:47 AM Orchestrator Bug #65671: Add node-exporter using ceph orch
Vahideh Alinouri wrote:
> I think there is a functionality issue in below command because cephadm log printed succ...
Vahideh Alinouri
09:30 AM Orchestrator Bug #65671: Add node-exporter using ceph orch
Vahideh Alinouri wrote:
I have tried to add node-exporter to new host in ceph cluster by the command mentioned in do...
Vahideh Alinouri
09:28 AM Orchestrator Bug #65671: Add node-exporter using ceph orch
I have tried to add node-exporter to a new host in the Ceph cluster using the command mentioned in the documentation. Vahideh Alinouri
09:26 AM Orchestrator Bug #65671 (New): Add node-exporter using ceph orch
... Vahideh Alinouri
09:44 AM RADOS Bug #65670: src/test/osd/RadosModel.h inappriopriate erasing items while iterating std::set caused test case failures.
This seems to be wrong.. Xuehan Xu
08:44 AM RADOS Bug #65670: src/test/osd/RadosModel.h inappriopriate erasing items while iterating std::set caused test case failures.
https://github.com/ceph/ceph/pull/57100 Xuehan Xu
08:42 AM RADOS Bug #65670 (Closed): src/test/osd/RadosModel.h inappriopriate erasing items while iterating std::set caused test case failures.
... Xuehan Xu
08:57 AM CephFS Bug #65669 (Fix Under Review): QuiesceDB responds with a misleading error to a quiesce-await of a terminated set.
Leonid Usov
03:27 AM CephFS Bug #65669 (In Progress): QuiesceDB responds with a misleading error to a quiesce-await of a terminated set.
Leonid Usov
01:12 AM CephFS Bug #65669 (Fix Under Review): QuiesceDB responds with a misleading error to a quiesce-await of a terminated set.
This design decision appears counterintuitive after having seen it in the wild.
Here the --await was sent with a d...
Leonid Usov
07:38 AM CephFS Bug #63906: Inconsistent file mode across two clients
Hi Leonid,
Thanks. Good to hear.
Tao Lyu
02:43 AM CephFS Bug #63906: Inconsistent file mode across two clients
Tao, I appreciate your quick and detailed response. I reviewed the client code for setxattr, and the code doesn't use... Leonid Usov
02:21 AM crimson Bug #65610 (Resolved): unittest-object-data-handler crashes testing object_data_handler_test_t.overwrite_then_read_within_transaction
Yingxin Cheng
01:25 AM Ceph QA QA Run #65655 (QA Needs Approval): wip-yuri2-testing-2024-04-24-0914-squid
@rfriedma can you pls review when done? Yuri Weinstein

04/25/2024

11:05 PM rgw Bug #65668 (Fix Under Review): Notification: Persistent queue not deleted when topic is deleted via radosgw-admin
Post this "commit":https://github.com/ceph/ceph/commit/4c50ad69c37110d42f1f68f6e567cdf5ac506a32, the logic to remove ... Krunal Chheda
07:36 PM CephFS Bug #63906: Inconsistent file mode across two clients
Hi Leonid,
Thanks for your question. I address your points of confusion below:
> I'd like to suggest that the issue here i...
Tao Lyu
06:03 PM CephFS Bug #63906: Inconsistent file mode across two clients
Hi, I came here from the PR.
I'd like to suggest that the issue here is not a bug in the MDS. While I can't find a...
Leonid Usov
07:28 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
I reached a point where I could resize another OSD to get the output from @ceph tell osd.N perf dump bluefs@. I follo... James Ringer
04:17 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
I'm actively working on this cluster. I have already replaced osd.0 to move forward with my work. I'll need to perfor... James Ringer
04:08 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
And please be aware of https://tracker.ceph.com/issues/63858 Igor Fedotov
03:59 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
James,
would you please share the output of 'ceph tell osd.N perf dump bluefs' after such an expansion then?
Igor Fedotov
02:04 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
Igor Fedotov wrote in #note-1:
> Hi James!
> I presume you haven't run ceph-bluestore-tool's bluefs-bdev-expand com...
James Ringer
01:56 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
Hi James!
I presume you haven't run ceph-bluestore-tool's bluefs-bdev-expand command against expanded OSD(s), have y...
Igor Fedotov
06:27 PM CephFS Bug #65660: mds: drop client metrics during recovery
Christopher Hoffman wrote in #note-2:
> >there's little reason to record historical metrics from the clients
>
> ...
Patrick Donnelly
06:19 PM CephFS Bug #65660: mds: drop client metrics during recovery
Xiubo Li wrote in #note-1:
> Is this new in the upstream master ? As I remembered we have improved this and the clie...
Patrick Donnelly
04:49 PM CephFS Bug #65660: mds: drop client metrics during recovery
>there's little reason to record historical metrics from the clients
Can you expand on this? Are we losing anythin...
Christopher Hoffman
12:38 AM CephFS Bug #65660: mds: drop client metrics during recovery
Is this new in upstream master? As I remember, we have improved this and the clients will only send the metrics... Xiubo Li
12:34 AM CephFS Bug #65660 (In Progress): mds: drop client metrics during recovery
When the rank is coming up, there's little reason to record historical metrics from the clients. We've also seen floo... Patrick Donnelly
06:22 PM rgw Backport #65667 (New): reef: rgw/lc: A few buckets stuck in UNINITIAL state
Casey Bodley
06:21 PM rgw Backport #65666 (Resolved): squid: rgw/lc: A few buckets stuck in UNINITIAL state
Casey Bodley
06:21 PM rgw Backport #65665 (New): quincy: rgw/lc: A few buckets stuck in UNINITIAL state
Casey Bodley
06:21 PM rgw Bug #65160: rgw/lc: A few buckets stuck in UNINITIAL state
quincy and reef backports also need https://github.com/ceph/ceph/pull/47595 Casey Bodley
02:35 PM rgw Bug #65160 (Pending Backport): rgw/lc: A few buckets stuck in UNINITIAL state
Casey Bodley
03:31 PM rgw Bug #63791: RGW: a subuser with no permission can still list buckets and create buckets
I believe this is also an issue for subusers with read permissions: they can still create buckets (at least on Quincy... Pierre Riteau
03:02 PM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
Dhairya Parmar wrote in #note-21:
> Dhairya Parmar wrote in #note-20:
> > Patrick Donnelly wrote in #note-19:
> > ...
Patrick Donnelly
03:02 PM CephFS Bug #65265 (Fix Under Review): qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
Dhairya Parmar wrote in #note-20:
> Patrick Donnelly wrote in #note-19:
> > Dhairya Parmar wrote in #note-18:
> > ...
Patrick Donnelly
03:00 PM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
Dhairya Parmar wrote in #note-20:
> Patrick Donnelly wrote in #note-19:
> > Dhairya Parmar wrote in #note-18:
> > ...
Dhairya Parmar
02:34 PM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
Patrick Donnelly wrote in #note-19:
> Dhairya Parmar wrote in #note-18:
> > I ran a couple of NFS jobs, no `MGR_DOWN`...
Dhairya Parmar
11:41 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
Dhairya Parmar wrote in #note-18:
> I ran a couple of NFS jobs, no `MGR_DOWN` reported
>
> https://pulpito.ceph.c...
Patrick Donnelly
10:19 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
I ran a couple of NFS jobs, no `MGR_DOWN` reported
https://pulpito.ceph.com/dparmar-2024-04-10_06:37:26-fs:nfs-wip...
Dhairya Parmar
09:21 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
main branch - https://pulpito.ceph.com/rishabh-2024-04-24_07:32:23-fs-rishabh-main-apr17-a654945-testing-default-smit... Rishabh Dave
02:45 PM rgw Bug #65664: Crash observed in boost::asio module related to stream.async_shutdown()
quoting https://www.boost.org/doc/libs/1_82_0/doc/html/boost_asio/reference/ssl__error__stream_errors.html:... Casey Bodley
01:52 PM rgw Bug #65664: Crash observed in boost::asio module related to stream.async_shutdown()
following testing with openssl s_client, s3cmd and Warp, the following ec's occur under normal conditions:... Mark Kogan
01:51 PM rgw Bug #65664 (Fix Under Review): Crash observed in boost::asio module related to stream.async_shutdown()
continuing from downstream BZ#2275284
call stack:...
Mark Kogan
02:45 PM Ceph QA QA Run #65655: wip-yuri2-testing-2024-04-24-0914-squid
jammy failed, retriggered Yuri Weinstein
02:32 PM rgw Bug #65590 (Fix Under Review): rgw_multi.tests.test_topic_notification_sync: PutBucketNotificationConfiguration fails with ConcurrentModification
Casey Bodley
02:21 PM rgw Bug #65626 (Fix Under Review): rgw: false assumption on vault bucket key deletion
Casey Bodley
02:19 PM Ceph QA QA Run #65638 (QA Needs Approval): wip-yuriw4-testing-20240423.151325-quincy
Yuri Weinstein
02:19 PM Ceph QA QA Run #65638: wip-yuriw4-testing-20240423.151325-quincy
Ilya Dryomov wrote in #note-7:
> Hi Yuri,
>
> This needs a rerun for krbd since krbd_rxbounce job died on reboot ...
Yuri Weinstein
11:37 AM Ceph QA QA Run #65638 (QA Needs Rerun/Rebuilt): wip-yuriw4-testing-20240423.151325-quincy
Ilya Dryomov
11:36 AM Ceph QA QA Run #65638: wip-yuriw4-testing-20240423.151325-quincy
Hi Yuri,
This needs a rerun for krbd since krbd_rxbounce job died on reboot for some reason.
Ilya Dryomov
02:19 PM Ceph Bug #65652: vstart.sh can not start
It happens when compiling the (recent) main branch on Ubuntu 22; running nm on libec_jerasure.so shows jerasure_init as a "U" symbol
but r...
Jack Lv
01:50 PM bluestore Fix #65600 (Fix Under Review): bluefs alloc unit should only be shrink
Igor Fedotov
01:14 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
Hi Nir,
I built a container based on 18.2.3 (an upcoming release). It would be great if you could try it: podman ...
Ilya Dryomov
08:35 AM rbd Bug #65487 (Fix Under Review): rbd-mirror daemon in ERROR state, require manual restart
Ilya Dryomov
01:10 PM rgw Bug #65462 (Pending Backport): rgw: eliminate ssl enforcement for sse-s3 encryption
Casey Bodley
01:09 PM rgw Bug #65473 (Pending Backport): rgw: exclude logging of request payer for 403 requests
Casey Bodley
12:06 PM rbd Backport #65587 (In Progress): squid: insufficient randomness for group and group snapshot IDs
Ilya Dryomov
12:04 PM rbd Backport #65586 (In Progress): reef: insufficient randomness for group and group snapshot IDs
Ilya Dryomov
12:03 PM rbd Backport #65588 (In Progress): quincy: insufficient randomness for group and group snapshot IDs
Ilya Dryomov
11:45 AM rbd Bug #65653 (Duplicate): run-rbd-unit-tests-0.sh: TestMigration.StressLive failure
Ilya Dryomov
11:45 AM rbd Bug #65653: run-rbd-unit-tests-0.sh: TestMigration.StressLive failure
This is with RBD_FEATURES=0 and a slightly different mismatch, but still too similar to track separately. Ilya Dryomov
11:31 AM crimson Bug #65663 (New): Enable LibRadosSnapshotsSelfManagedPP.RollbackPP
The test is currently disabled by `SKIP_IF_CRIMSON()`.
LibRadosSnapshotsPP.RollbackPP is supported and so should Lib...
Matan Breizman
10:33 AM RADOS Bug #54744: crash: void MonMap::add(const mon_info_t&): assert(addr_mons.count(a) == 0)
Seems same story here for Pacific 16.2.15
My prev monmap before change:...
Sergey Borodavkin
09:03 AM rgw Bug #62136: "test pushing kafka s3 notification on master" - no events are sent
created a separate tracker: https://tracker.ceph.com/issues/65662 Yuval Lifshitz
04:31 AM rgw Bug #62136: "test pushing kafka s3 notification on master" - no events are sent
could be another issue with the kafka consumer (on top of what was fixed in PR 54637):... Yuval Lifshitz
09:02 AM rgw-testing Bug #65662 (New): kafka: no creation event found for key
test is still failing even after the fix from https://github.com/ceph/ceph/pull/54637.
see: https://tracker.ceph.com...
Yuval Lifshitz
07:42 AM RADOS Bug #62992: Heartbeat crash in reset_timeout and clear_timeout
Reef backport is in QA Matan Breizman
07:34 AM crimson Feature #65478: Support SnapMapper::Scrubber
This issue is hard for me, so I need more time. junxiang mu
06:45 AM CephFS Bug #65647: Evicted kernel client may get stuck after reconnect
Xiubo Li wrote in #note-3:
> Is that possible to enable the mds debug logs, let's see whether there are other logs...
Mykola Golub
01:10 AM CephFS Bug #65647: Evicted kernel client may get stuck after reconnect
Mykola Golub wrote:
> Our customer was observing sporadic "client isn't responding to mclientcaps(revoke)" issue so...
Xiubo Li
01:00 AM CephFS Bug #65647: Evicted kernel client may get stuck after reconnect
Xiubo Li wrote in #note-1:
> I think you have enabled *recover_session* in kclient ?
>
> [...]
>
> More detail...
Xiubo Li
12:48 AM CephFS Bug #65647: Evicted kernel client may get stuck after reconnect
I think you have enabled *recover_session* in kclient?... Xiubo Li
04:07 AM CephFS Bug #65630 (Fix Under Review): mds: rename request was deadlocked between two different MDSs
Xiubo Li
01:45 AM Ceph QA QA Run #65661 (QA Closed): wip-pdonnell-testing-20240425.015853-debug
* "PR #57059":https://github.com/ceph/ceph/pull/57059 -- mds: abort fragment/export when quiesced
* "PR #57044":http...
Patrick Donnelly
01:33 AM CephFS Bug #65603 (Fix Under Review): mds: quiesce timeout due to a freezing directory
Patrick Donnelly

04/24/2024

11:16 PM bluestore Bug #65659 (Triaged): OSD Resize Increases Used Capacity Not Available Capacity
h1. Deviation from expected behavior
After resizing the underlying disk at the hypervisor and OS level *resizing t...
James Ringer
07:49 PM rbd Bug #46875: TestLibRBD.TestPendingAio: test_librbd.cc:4539: Failure or SIGSEGV
from https://jenkins.ceph.com/job/ceph-pull-requests/133893/consoleFull... Casey Bodley
07:41 PM CephFS Bug #65658 (Fix Under Review): mds: MetricAggregator::ms_can_fast_dispatch2 acquires locks
Patrick Donnelly
07:33 PM CephFS Bug #65658 (Fix Under Review): mds: MetricAggregator::ms_can_fast_dispatch2 acquires locks
There was a lot of discussion surrounding this in
https://github.com/ceph/ceph/pull/26004/
but circling back we...
Patrick Donnelly
06:21 PM Orchestrator Bug #65657 (New): doc: lack of clarity for explicit placement analogue in yaml spec
https://docs.ceph.com/en/latest/cephadm/services/#explicit-placements
Specifically, I'm wondering if "host:[ip]=na...
Patrick Donnelly
05:45 PM CephFS Tasks #65615 (Resolved): lchown corrupts symlink entry
The code was using the parent dir ent fscrypt info/key. Using an incorrect key to decrypt will yield incorrect plaintext... Christopher Hoffman
05:05 PM Ceph QA QA Run #65641 (QA Building): wip-yuriw8-testing-20240424.000125-main
build failed
https://jenkins.ceph.com/job/ceph-dev-new-build/ARCH=x86_64,AVAILABLE_ARCH=x86_64,AVAILABLE_DIST=cent...
Yuri Weinstein
02:03 PM Ceph QA QA Run #65641: wip-yuriw8-testing-20240424.000125-main
repushed Yuri Weinstein
12:01 AM Ceph QA QA Run #65641 (QA Needs Approval): wip-yuriw8-testing-20240424.000125-main
--- done. these PRs were included:
https://github.com/ceph/ceph/pull/51171 - osd/scrub: Change scrub cost to average...
Yuri Weinstein
04:58 PM rgw Bug #65656: Reduce default thread pool size
Test env:
---------
3x MON/MGR nodes
Dell R630
2x E5-2683 v3 (28 total cores, 56 threads)
128 GB RAM
8x...
Tim Wilkinson
04:41 PM rgw Bug #65656 (Fix Under Review): Reduce default thread pool size
Our recent RGW thread pool size profiling (RHEL 9.2, Ceph 18.2.0-131) revealed that for both smaller (max 256KB) and ... Tim Wilkinson
04:15 PM Ceph QA QA Run #65655 (QA Closed): wip-yuri2-testing-2024-04-24-0914-squid

--- done. these PRs were included:
https://github.com/ceph/ceph/pull/56777 - squid: osd/scrub: implement reservat...
Yuri Weinstein
03:59 PM Orchestrator Backport #64844 (Resolved): reef: Regression: Permanent KeyError: 'TYPE' : return self.blkid_api['TYPE'] == 'part'
Adam King
03:58 PM Orchestrator Bug #65035 (Duplicate): ERROR: required file missing from config-json: idmap.conf
duplicate of https://tracker.ceph.com/issues/65155 Adam King
03:54 PM Orchestrator Bug #64118 (Resolved): cephadm: RuntimeError: Failed command: apt-get update: E: The repository 'https://download.ceph.com/debian-quincy jammy Release' does not have a Release file.
I think this should be fixed now that we have quincy jammy builds Adam King
03:34 PM Orchestrator Backport #65378 (Resolved): squid: cephadm: client-keyring also overwrites ceph.conf
Adam King
03:15 PM rgw Bug #65654 (New): run-bucket-check.sh: failed assert len(json_out) == len(unlinked_keys)
https://qa-proxy.ceph.com/teuthology/suriarte-2024-04-23_15:04:03-rgw-rgw-update-boost-redis-distro-default-smithi/76... Casey Bodley
03:13 PM nvme-of Backport #65649 (In Progress): squid: Change some default values for OMAP lock parameters in nvmeof conf file
Adam King
01:45 PM nvme-of Backport #65649 (In Progress): squid: Change some default values for OMAP lock parameters in nvmeof conf file
https://github.com/ceph/ceph/pull/56497 Backport Bot
03:05 PM Ceph QA QA Run #65638 (QA Needs Approval): wip-yuriw4-testing-20240423.151325-quincy
@idryomov rgw PRs will be removed, so this is only one rbd PR Yuri Weinstein
02:21 PM Ceph QA QA Run #65638: wip-yuriw4-testing-20240423.151325-quincy
@cbodley Can't schedule rgw
yuriw@teuthology ~ [14:09:41]> teuthology-suite -v --ceph-repo $CEPH_REPO -c $CEPH_B...
Yuri Weinstein
02:56 PM nvme-of Backport #65650 (In Progress): reef: Change some default values for OMAP lock parameters in nvmeof conf file
Adam King
01:46 PM nvme-of Backport #65650 (In Progress): reef: Change some default values for OMAP lock parameters in nvmeof conf file
https://github.com/ceph/ceph/pull/56498 Backport Bot
02:52 PM rbd Bug #65653 (Duplicate): run-rbd-unit-tests-0.sh: TestMigration.StressLive failure
from https://jenkins.ceph.com/job/ceph-pull-requests/133815/consoleFull on a squid pr:... Casey Bodley
02:52 PM Orchestrator Feature #65398: allow images from private repos in teuthology test/ceph orch/cephadm
Sorry the last two weeks have been much busier than usual and this slipped my mind. I discussed this with Adam King a... John Mulligan
12:55 AM Orchestrator Feature #65398: allow images from private repos in teuthology test/ceph orch/cephadm
Any thoughts on this, John? I have to install a cluster from a private repo tomorrow, and it reminded me we'd had th... Dan Mick
02:44 PM Ceph Bug #65652: vstart.sh can not start
https://github.com/ceph/ceph/pull/57077 Jack Lv
02:12 PM Ceph Bug #65652 (New): vstart.sh can not start

2024-04-24T21:26:01.158+0800 7f4ec09ffd40 -1 load dlopen(/home/ecs-assist-user/ceph/build/lib/libec_jerasure.so): /...
Jack Lv
02:40 PM rgw Bug #64841 (Triaged): java_s3tests: testObjectCreateBadExpectMismatch failure
Casey Bodley
02:32 PM rgw Bug #62136: "test pushing kafka s3 notification on master" - no events are sent
this says resolved, but i still see failures like this on main:... Casey Bodley
02:12 PM rgw Bug #65651 (New): s3select: test_true_false_in_expressions s3test failure
from a rgw/sts job based on recent main
https://qa-proxy.ceph.com/teuthology/cbodley-2024-04-24_12:59:55-rgw-wip-cbo...
Casey Bodley
01:40 PM nvme-of Feature #65566 (Pending Backport): Change some default values for OMAP lock parameters in nvmeof conf file
Adam King
01:39 PM rgw Feature #18621 (Resolved): rgw: change default chunk size
Casey Bodley
12:28 PM rgw Bug #65648 (New): TestAMQP.MaxConnections FAILED ceph_assert(!conn->state)
... Casey Bodley
11:56 AM Dashboard Bug #61312: The command "ceph config set mgr mgr/dashboard/redirect_resolve_ip_addr True" fails
Nizamudeen tells me the following through Slack:
BEGIN QUOTED TEXT
this particular configuration is introduced ...
Zac Dover
11:54 AM Ceph Documentation #65631 (Resolved): clarify dual-stack mode
Zac Dover
11:53 AM RADOS Backport #65646 (Fix Under Review): squid: osd/scrub: must disable reservation timeout for reserver-based requests
Ronen Friedman
11:12 AM RADOS Backport #65646 (Resolved): squid: osd/scrub: must disable reservation timeout for reserver-based requests
Backport Bot
11:18 AM CephFS Bug #65647 (Triaged): Evicted kernel client may get stuck after reconnect
Our customer was observing sporadic "client isn't responding to mclientcaps(revoke)" issue so they configured auto e... Mykola Golub
11:04 AM RADOS Bug #65044 (Pending Backport): osd/scrub: must disable reservation timeout for reserver-based requests
Ronen Friedman
10:03 AM rgw Bug #65645 (New): lifecycle notifications are sent from radosgw-admin
when "radosgw-admin lc process" is called, and there are buckets that have bucket notification events set with "Objec... Yuval Lifshitz
09:43 AM CephFS Backport #65644 (Fix Under Review): quincy: qa/cephfs: absence of e03331e causes test_nfs to fail
@tasks.cephfs.test_nfs.TestNFS.test_non_existent_cluster@ failed on here - https://pulpito.ceph.com/vshankar-2024-03-... Rishabh Dave
09:33 AM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
Ilya Dryomov wrote in #note-17:
> However, the "unable to connect to remote cluster" error isn't cleared and you cont...
Ilya Dryomov
09:31 AM rgw Bug #64999: Slow RGW multisite sync due to "304 Not Modified" responses on primary zone
Hi All,
I just wanted to quickly follow up on my previous query about "Slow RGW multisite sync
due to '304 Not Modi...
Mohammad Saif
08:36 AM Dashboard Bug #65643 (New): mgr/dashboard: dashboard landing page cant be seen as readonly
As a read-only user you should be able to view the landing page, but it is not possible. Pedro González Gómez
06:54 AM Dashboard Cleanup #65070 (Resolved): mgr/dashboard: use alertmanager v2 APIs mgr/dashboard: short_description
Nizamudeen A
06:54 AM Dashboard Backport #65255 (Resolved): squid: mgr/dashboard: use alertmanager v2 APIs mgr/dashboard: short_description
Nizamudeen A
06:22 AM crimson Bug #65585: unittest-seastore (Timeout)
If each test's reported execution time is correct, the timeout is caused by getting stuck in one of the tests.
e.g. https://jenkins.ceph...
Rongqi Sun
02:07 AM CephFS Tasks #65613: truncate failing when using path
Greg Farnum wrote in #note-2:
> Hmm, I'm surprised you found missing Server logic here. Shouldn't that have turned u...
Xiubo Li

04/23/2024

11:35 PM Ceph QA QA Run #65126 (QA Closed): wip-yuri8-testing-2024-03-25-1419
Yuri Weinstein
10:47 PM Ceph QA QA Run #65126 (QA Approved): wip-yuri8-testing-2024-03-25-1419
@yuriw rados approved: https://tracker.ceph.com/projects/rados/wiki/MAIN#httpstrackercephcomissues65126 Laura Flores
10:03 PM Ceph QA QA Run #65126 (QA Needs Approval): wip-yuri8-testing-2024-03-25-1419
Laura Flores
11:18 PM bluestore Bug #56262: crash: BlueStore::_txc_create(BlueStore::Collection*, BlueStore::OpSequencer*, std::list<Context*, std::allocator<Context*> >*, boost::intrusive_ptr<TrackedOp>)
There seems to be some race condition at the time of OSD shutdown. The kv db handle was destroyed and one of OSD thre... Prashant D
10:27 PM RADOS Bug #54515: mon/health-mute.sh: TEST_mute: return 1 (HEALTH WARN 3 mgr modules have failed dependencies)
/a/lflores-2024-04-01_18:07:25-rados-wip-yuri8-testing-2024-03-25-1419-distro-default-smithi/7634102 Laura Flores
07:17 PM Ceph QA QA Run #65552 (QA Closed): wip-yuri2-testing-2024-04-17-0823-reef
Yuri Weinstein
04:02 PM Ceph QA QA Run #65552 (QA Approved): wip-yuri2-testing-2024-04-17-0823-reef
@yuriw rados approved! https://tracker.ceph.com/projects/rados/wiki/REEF#httpstrackercephcomissues65552 Laura Flores
06:38 PM Dashboard Bug #62972: ERROR: test_list_enabled_module (tasks.mgr.dashboard.test_mgr_module.MgrModuleTest)
https://jenkins.ceph.com/job/ceph-api/72895/ on main Casey Bodley
06:30 PM RADOS Backport #65376 (In Progress): quincy: crash: void PaxosService::propose_pending(): assert(have_pending)
Patrick Donnelly
06:29 PM RADOS Backport #65377 (In Progress): reef: crash: void PaxosService::propose_pending(): assert(have_pending)
Patrick Donnelly
06:28 PM mgr Backport #65621 (In Progress): quincy: mgr: update cluster state for new maps from the mons before notifying modules
Patrick Donnelly
06:28 PM mgr Backport #65623 (In Progress): reef: mgr: update cluster state for new maps from the mons before notifying modules
Patrick Donnelly
06:27 PM mgr Backport #65622 (In Progress): squid: mgr: update cluster state for new maps from the mons before notifying modules
Patrick Donnelly
06:27 PM CephFS Backport #65620 (In Progress): squid: qa: test_max_items_per_obj open procs not fully cleaned up
Patrick Donnelly
06:26 PM CephFS Backport #65619 (In Progress): squid: mds: quiesce_counter decay rate initialized from wrong config
Patrick Donnelly
06:23 PM CephFS Backport #65273 (In Progress): squid: PG_DEGRADED warnings during cluster creation via cephadm: "Health check failed: Degraded data redundancy: 2/192 objects degraded (1.042%), 1 pg degraded (PG_DEGRADED)"
Patrick Donnelly
06:20 PM Ceph Bug #64095 (Resolved): ceph-exporter is not included in the deb packages
Konstantin Shalygin
06:19 PM Ceph Bug #63637 (Resolved): debian packaging is missing bcrypt dependency for ceph-mgr's .requires file
Konstantin Shalygin
06:18 PM Ceph Backport #63638 (Resolved): reef: debian packaging is missing bcrypt dependency for ceph-mgr's .requires file
Konstantin Shalygin
06:12 PM Ceph Backport #63638: reef: debian packaging is missing bcrypt dependency for ceph-mgr's .requires file
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/54662
merged
Yuri Weinstein
06:18 PM Ceph Backport #65172 (Resolved): reef: ceph-exporter is not included in the deb packages
Konstantin Shalygin
06:13 PM Ceph Backport #65172: reef: ceph-exporter is not included in the deb packages
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56541
merged
Yuri Weinstein
05:52 PM CephFS Bug #65603 (In Progress): mds: quiesce timeout due to a freezing directory
Patrick Donnelly
04:37 PM rgw Backport #65640 (In Progress): squid: [rgw][accounts] bucket quota management at account-level
Casey Bodley
04:35 PM rgw Backport #65640 (Resolved): squid: [rgw][accounts] bucket quota management at account-level
https://github.com/ceph/ceph/pull/57058 Casey Bodley
04:35 PM rgw Feature #65551 (Pending Backport): [rgw][accounts] bucket quota management at account-level
Casey Bodley
04:23 PM bluestore Bug #65482 (Fix Under Review): bluestore/bluestore_types: check 'it' valid before using
Igor Fedotov
04:22 PM rgw Backport #65002 (Resolved): quincy: [CVE-2023-46159] RGW crash upon misconfigured CORS rule
Casey Bodley
04:07 PM Ceph QA QA Run #65574: wip-yuri7-testing-2024-04-18-1351-reef
@matan can you review this run? Thought you'd be a good candidate since two of the PRs are yours. Laura Flores
04:06 PM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
Greg Farnum wrote in #note-33:
> Venky Shankar wrote in #note-30:
> > OK. I'll elaborate. Generally, clients are no...
Patrick Donnelly
04:02 PM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
Venky Shankar wrote in #note-32:
> Dhairya Parmar wrote in #note-28:
> > as mentioned in yesterday's standup - some...
Patrick Donnelly
03:25 PM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
Venky Shankar wrote in #note-30:
> OK. I'll elaborate. Generally, clients are not trustable - someone can hook up a ...
Greg Farnum
04:05 PM Ceph QA QA Run #65560: wip-yuri5-testing-2024-04-17-1400
Hey @amathuri can you review this batch? Laura Flores
04:04 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
@ksirivad back to you! Laura Flores
02:14 PM Ceph QA QA Run #65349 (QA Needs Approval): wip-yuri3-testing-2024-04-05-0825
Yuri Weinstein
03:59 PM RADOS Bug #61832: Restoring #61785: osd-scrub-dump.sh: ERROR: Extra scrubs after test completion...not expected
/a/yuriw-2024-04-22_18:19:58-rados-wip-yuri2-testing-2024-04-17-0823-reef-distro-default-smithi/7668423 Laura Flores
03:55 PM Infrastructure Bug #47690: RuntimeError: Stale jobs detected, aborting.
/a/yuriw-2024-04-22_18:19:58-rados-wip-yuri2-testing-2024-04-17-0823-reef-distro-default-smithi/7668439 Laura Flores
03:53 PM bluestore Bug #56788: crash: void KernelDevice::_aio_thread(): abort
/a/yuriw-2024-04-22_18:19:58-rados-wip-yuri2-testing-2024-04-17-0823-reef-distro-default-smithi/7668449... Laura Flores
03:49 PM CephFS Tasks #65613: truncate failing when using path
Hmm, I'm surprised you found missing Server logic here. Shouldn't that have turned up in kernel fscrypt testing? Xiub... Greg Farnum
03:48 PM RADOS Bug #62992: Heartbeat crash in reset_timeout and clear_timeout
/a/yuriw-2024-04-22_18:19:58-rados-wip-yuri2-testing-2024-04-17-0823-reef-distro-default-smithi/7668452 Laura Flores
03:44 PM Orchestrator Bug #64208: test_cephadm.sh: Container version mismatch causes job to fail.
/a/yuriw-2024-04-22_18:19:58-rados-wip-yuri2-testing-2024-04-17-0823-reef-distro-default-smithi/7668470 Laura Flores
03:43 PM Infrastructure Bug #65639: smithi139 unable to be reached over ssh
Affects centos 9 stream, if that turns out to be relevant. Laura Flores
03:42 PM Infrastructure Bug #65639 (In Progress): smithi139 unable to be reached over ssh
/a/yuriw-2024-04-22_18:19:58-rados-wip-yuri2-testing-2024-04-17-0823-reef-distro-default-smithi/7668463... Laura Flores
03:13 PM Ceph QA QA Run #65638 (QA Closed): wip-yuriw4-testing-20240423.151325-quincy
* "PR #57029":https://github.com/ceph/ceph/pull/57029 -- quincy: qa: fix krbd_msgr_segments and krbd_rxbounce failing... Yuri Weinstein
01:49 PM CephFS Feature #65637 (New): mds: continue sending heartbeats during recovery when MDS journal is large
When the MDS reaches up:rejoin / up:resolve after spending a long time (hours) in up:replay, it often gets in a loop... Patrick Donnelly
01:40 PM rgw Backport #65636 (In Progress): squid: release note for rgw_realm init
Casey Bodley
01:39 PM rgw Backport #65636 (Resolved): squid: release note for rgw_realm init
https://github.com/ceph/ceph/pull/57055 Casey Bodley
01:39 PM rgw Bug #65575 (Pending Backport): release note for rgw_realm init
Casey Bodley
01:34 PM Dashboard Backport #65255 (In Progress): squid: mgr/dashboard: use alertmanager v2 APIs mgr/dashboard: short_description
Nizamudeen A
01:29 PM Infrastructure Bug #63831 (Closed): "make check" fails on docs-related PRs sometimes
Zac Dover
01:19 PM Ceph Feature #63703: If a prefix is available, allow it be used to narrow the bounds of OMAP iterator
Xiang Li wrote in #note-1:
> Is anyone trying out this new feature? Can I give it a try?
I don't think anyone has...
Yixin Jin
02:12 AM Ceph Feature #63703: If a prefix is available, allow it be used to narrow the bounds of OMAP iterator
Is anyone trying out this new feature? Can I give it a try? Thomas Li
01:05 PM crimson Bug #65635 (New): Crimson seastore unit test random failure on AARCH64 (DEADLYSIGNAL by caused by a READ memory access)
[ RUN ] omap_manager_test/omap_manager_test_t.force_leafnode_split_merge_fullandbalanced/0
INFO 2024-04-23 08:...
Rongqi Sun
11:58 AM Ceph Bug #65634 (New): rbd-mirror user does not have enough permissions to obtain (daemon) health status information
We are testing rbd-mirroring. There seems to be a permission error with the rbd-mirror user. Using this user to query... Stefan Kooman
09:45 AM sepia Support #65633 (In Progress): Sepia Lab Access Request
Hi Team,
I am from the Ceph QE team and am requesting Sepia lab access for the first time. Please do the needful.
De...
Neha Gangadhar
09:13 AM crimson Bug #65585: unittest-seastore (Timeout)
https://jenkins.ceph.com/job/ceph-pull-requests-arm64/55512/console... Yingxin Cheng
07:34 AM Messengers Bug #65401: msg: conneciton between mgr and osd is periodically down which leads heavy load to mgr
The periodic connection fault can be found in the log with the following steps:
1. set ms_connection_idle_timeout=60; debu...
Xinying Song
06:41 AM Ceph Documentation #65631 (Fix Under Review): clarify dual-stack mode
Zac Dover
04:42 AM Ceph Documentation #65631 (Resolved): clarify dual-stack mode
Robert Sander asks whether Ceph supports dual-stack mode. Dual-stack mode is when both IPv4 and IPv6 networks are use... Zac Dover
06:21 AM crimson Bug #65632 (New): crimson osd crashes due to daggling pointers of operation blockers
There are time gaps between the destruction of OSDMapBlockers and OSDMapBlockers unreferencing from BlockingEvents. I... Xuehan Xu
05:45 AM Ceph Bug #65629 (Fix Under Review): cephfs_mirror: display 'sync_bytes' in peer status
Jos Collin
03:56 AM Ceph Bug #65629 (In Progress): cephfs_mirror: display 'sync_bytes' in peer status
Jos Collin
03:55 AM Ceph Bug #65629 (Fix Under Review): cephfs_mirror: display 'sync_bytes' in peer status
Display 'sync_bytes' for the 'last_synced_snap' in the 'peer status' command output. This is analogous with the perf ... Jos Collin
04:56 AM RADOS Feature #65583: mon store data should be available depending on the user keyring
> My understanding is the idea is restrict the visibility of configurables' values.
Yes, that's right, but can you...
Parth Arora
04:32 AM Ceph Documentation #65609 (Resolved): Documentation of maximum port number is incorrect
Zac Dover
04:10 AM CephFS Bug #65630 (Fix Under Review): mds: rename request was deadlocked between two different MDSs
This was reported by Nigel; for more detail, please see https://www.mail-archive.com/ceph-users@ceph.io/msg24587.html
In...
Xiubo Li
03:04 AM crimson Bug #65628 (Resolved): unittest-seastore (Timeout)
Timeouts occur with some probability on both ARM and x86 CI.
e.g.
1. https://jenkins.ceph.com/job/cep...
Rongqi Sun

04/22/2024

11:00 PM RADOS Bug #65235: upgrade/reef-x/stress-split: "OSDMAP_FLAGS: noscrub flag(s) set" warning in cluster log
Unfortunately, noscrub and nodeep-scrub are not the only warnings we would need to mask for the thrashosds-health tes... Brad Hubbard
10:51 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
More in this run: https://pulpito.ceph.com/lflores-2024-04-01_18:07:25-rados-wip-yuri8-testing-2024-03-25-1419-distro... Laura Flores
10:46 PM Orchestrator Bug #64374: Error ENOENT: module 'cephadm' reports that it cannot run on the active manager daemon: No module named 'mgr_module' (pass --force to force enablement)
/a/lflores-2024-04-01_18:07:25-rados-wip-yuri8-testing-2024-03-25-1419-distro-default-smithi/7634080 Laura Flores
09:33 PM mgr Bug #65627 (New): Centos 9 stream ceph container iscsi test failure
h3. Missing k8sevents module
While waiting for the mgr to start, we get this traceback message:
teuthology.log
<...
Laura Flores
09:19 PM rgw Bug #65626: rgw: false assumption on vault bucket key deletion
PR: https://github.com/ceph/ceph/pull/57046 Seena Fallah
09:16 PM rgw Bug #65626 (Fix Under Review): rgw: false assumption on vault bucket key deletion
On bucket key deletion when the request to change the property of the key for deletion_allowed to true, it is expecte... Seena Fallah
09:17 PM Ceph QA QA Run #65558 (QA Closed): wip-yuri4-testing-2024-04-19-0708-quincy (old wip-yuriw-testing-20240417.204632 (wip-yuri4-testing))
Yuri Weinstein
09:05 PM Ceph QA QA Run #65558 (QA Approved): wip-yuri4-testing-2024-04-19-0708-quincy (old wip-yuriw-testing-20240417.204632 (wip-yuri4-testing))
Yuri Weinstein
08:31 PM Ceph QA QA Run #65558: wip-yuri4-testing-2024-04-19-0708-quincy (old wip-yuriw-testing-20240417.204632 (wip-yuri4-testing))
approved overall, except for two prs:
> https://github.com/ceph/ceph/pull/54172 - quincy: prevent anonymous topic ...
Casey Bodley
09:14 PM rgw Backport #65409: quincy: Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56818
merged
Yuri Weinstein
09:13 PM rgw Backport #65341: quincy: rgw: update options yaml file so LDAP uri isn't an invalid example
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56722
merged
Yuri Weinstein
09:08 PM rgw Backport #63961: quincy: rgw: lack of headers in 304 response
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/55095
merged
Yuri Weinstein
09:07 PM rgw Backport #63253: quincy: Add bucket versioning info to radosgw-admin bucket stats output
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/54190
merged
Yuri Weinstein
08:37 PM rbd Bug #65487 (In Progress): rbd-mirror daemon in ERROR state, require manual restart
Hi Nir,
Thanks for providing verbose logs. For now, I have all the information I need.
Due to rbd-mirror daemo...
Ilya Dryomov
08:34 PM rgw Backport #65625 (In Progress): quincy: rgw/crypt/barbican: 'Namespace' object has no attribute 'admin_endpoints'
Casey Bodley
08:33 PM rgw Backport #65625 (In Progress): quincy: rgw/crypt/barbican: 'Namespace' object has no attribute 'admin_endpoints'
https://github.com/ceph/ceph/pull/57045 Casey Bodley
08:26 PM rgw Bug #61772 (Pending Backport): rgw/crypt/barbican: 'Namespace' object has no attribute 'admin_endpoints'
Casey Bodley
08:12 PM rbd Feature #65624 (Pending Backport): [pybind] expose CLONE_FORMAT and FLATTEN image options
C/C++ API:... Ilya Dryomov
07:53 PM rgw Backport #64766 (Resolved): reef: SSL session id reuse speedup mechanism of the SSL_CTX_set_session_id_context is not working
Casey Bodley
07:50 PM rgw Bug #62063 (New): notification tests fail on 'radosgw-admin -n client.0 user rm --uid foo.client.0 --purge-data --cluster ceph'
happening on quincy: https://qa-proxy.ceph.com/teuthology/yuriw-2024-04-20_15:31:09-rgw-wip-yuri4-testing-2024-04-19-... Casey Bodley
07:30 PM RADOS Bug #65517: rados/thrash-erasure-code-crush-4-nodes: ceph task fails at getting monitors
Bump up. Radoslaw Zarzynski
07:18 PM RADOS Bug #56393: failed to complete snap trimming before timeout
/a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648606 was on fbfd55d0098... Radoslaw Zarzynski
07:12 PM mgr Backport #65623 (In Progress): reef: mgr: update cluster state for new maps from the mons before notifying modules
https://github.com/ceph/ceph/pull/57065 Backport Bot
07:12 PM mgr Backport #65622 (In Progress): squid: mgr: update cluster state for new maps from the mons before notifying modules
https://github.com/ceph/ceph/pull/57064 Backport Bot
07:12 PM mgr Backport #65621 (In Progress): quincy: mgr: update cluster state for new maps from the mons before notifying modules
https://github.com/ceph/ceph/pull/57066 Backport Bot
07:11 PM CephFS Backport #65620 (In Progress): squid: qa: test_max_items_per_obj open procs not fully cleaned up
https://github.com/ceph/ceph/pull/57063 Backport Bot
07:11 PM CephFS Backport #65619 (In Progress): squid: mds: quiesce_counter decay rate initialized from wrong config
https://github.com/ceph/ceph/pull/57062 Backport Bot
07:07 PM mgr Bug #64799 (Pending Backport): mgr: update cluster state for new maps from the mons before notifying modules
I'll sit on the backports for a while. Patrick Donnelly
07:06 PM CephFS Bug #65022 (Pending Backport): qa: test_max_items_per_obj open procs not fully cleaned up
Patrick Donnelly
07:04 PM CephFS Bug #65342 (Pending Backport): mds: quiesce_counter decay rate initialized from wrong config
Patrick Donnelly
06:52 PM Ceph QA QA Run #65596 (QA Approved): wip-pdonnell-testing-20240420.180737-debug
https://tracker.ceph.com/projects/cephfs/wiki/Main#2024-04-20 Patrick Donnelly
06:47 PM CephFS Bug #50821: qa: untar_snap_rm failure during mds thrashing
... Patrick Donnelly
06:42 PM CephFS Bug #65618 (In Progress): qa: fsstress: cannot execute binary file: Exec format error
... Patrick Donnelly
06:40 PM RADOS Bug #53768 (Closed): timed out waiting for admin_socket to appear after osd.2 restart in thrasher/defaults workload/small-objects
Radoslaw Zarzynski
06:39 PM CephFS Fix #65617 (Fix Under Review): qa: increase debugging for snap_schedule
Patrick Donnelly
06:36 PM CephFS Fix #65617 (Pending Backport): qa: increase debugging for snap_schedule
Patrick Donnelly
06:39 PM rgw Bug #65567: admin_socket_output: signal: Terminated from term radosgw
note from tracker scrub: looks like a duplicate of https://tracker.ceph.com/issues/59380. Radoslaw Zarzynski
06:31 PM rgw Bug #65567 (Duplicate): admin_socket_output: signal: Terminated from term radosgw
Laura Flores
06:33 PM CephFS Bug #65616 (Triaged): pybind/mgr/snap_schedule: 1m scheduled snaps not reliably executed (RuntimeError: The following counters failed to be set on mds daemons: {'mds_server.req_rmsnap_latency.avgcount'})
Check timestamps:... Patrick Donnelly
06:28 PM RADOS Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
Update: Still working to understand why my local reproducer worked with the latest fix but not in teuthology. Laura Flores
06:27 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
https://shaman.ceph.com/builds/ceph/wip-yuri3-testing-2024-04-05-0825/5d349943c59c9485df060d6adb0594f3940ec0eb/ Yuri Weinstein
06:15 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
Laura Flores wrote in #note-6:
> @yuriw can you rebase/rerun? One of the PRs got more changes.
removed https://gi...
Yuri Weinstein
05:47 PM Ceph QA QA Run #65349 (QA Needs Rerun/Rebuilt): wip-yuri3-testing-2024-04-05-0825
@yuriw can you rebase/rerun? One of the PRs got more changes. Laura Flores
06:23 PM RADOS Bug #62839 (Closed): Teuthology failure in LibRadosTwoPoolsPP.HitSetWrite
Cache tiering is deprecated, sorry. Radoslaw Zarzynski
06:20 PM Ceph QA QA Run #65552 (QA Needs Approval): wip-yuri2-testing-2024-04-17-0823-reef
Laura Flores wrote in #note-3:
> @yuriw can you try rerunning this?
rerunning failed
Yuri Weinstein
05:32 PM Ceph QA QA Run #65552 (QA Needs Rerun/Rebuilt): wip-yuri2-testing-2024-04-17-0823-reef
@yuriw can you try rerunning this? Laura Flores
06:19 PM RADOS Bug #65186: OSDs unreachable in upgrade test
In QA. Pinged. Radoslaw Zarzynski
06:17 PM RADOS Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
Still in QA. Radoslaw Zarzynski
06:16 PM RADOS Bug #44510: osd/osd-recovery-space.sh TEST_recovery_test_simple failure
Hi Nitzan, would you mind taking a look? Radoslaw Zarzynski
06:14 PM CephFS Tasks #65615 (Resolved): lchown corrupts symlink entry
lchown corrupts symlink entry:... Christopher Hoffman
06:12 PM RADOS Bug #65449: NeoRadosWatchNotify.WatchNotifyTimeout failed due to nonexistent pool
In review. Radoslaw Zarzynski
06:09 PM RADOS Feature #64519: OSD/MON: No snapshot metadata keys trimming
note from scrub: bump up. Radoslaw Zarzynski
06:07 PM RADOS Feature #65583: mon store data should be available depending on the user keyring
This sounds like a feature request, not a bug.
My understanding is that the idea is to restrict the visibility of configurab...
Radoslaw Zarzynski
05:56 PM CephFS Bug #65614 (Fix Under Review): client: resends request to same MDS it just received a forward from if it does not have an open session with the target
Patrick Donnelly
05:46 PM CephFS Bug #65614 (Pending Backport): client: resends request to same MDS it just received a forward from if it does not have an open session with the target
... Patrick Donnelly
05:53 PM RADOS Documentation #16258: ceph audit logs are not logging to ceph.audit.log if we specify "mon cluster log file" option
If something stays in tracker, without huge attention, for 8+ years, it's probably not a high prio... Radoslaw Zarzynski
08:04 AM RADOS Documentation #16258: ceph audit logs are not logging to ceph.audit.log if we specify "mon cluster log file" option
No idea if this is still applicable. Unassigning from me because it hasn't been touched for almost a decade, and I'll... Joao Eduardo Luis
05:47 PM RADOS Bug #53240: full-object read crc is mismatch, because truncate modify oi.size and forget to clear data_digest
New changes in the PR (a unit test fix). Need to reQA. Radoslaw Zarzynski
05:39 PM RADOS Bug #65371: rados: PeeringState::calc_replicated_acting_stretch populate acting set before checking if < bucket_max
In review. Radoslaw Zarzynski
05:38 PM RADOS Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
Yuri provided an update. Still in QA. Radoslaw Zarzynski
05:32 PM rgw Backport #64496 (Resolved): squid: keystone admin token is not invalidated on http 401 response
Casey Bodley
05:32 PM rgw Backport #65353 (Resolved): squid: rgwlc: Executing radosgw-admin lc process --bucket <bkt-name> without setting lc rule results in Segmentation fault
Casey Bodley
05:32 PM rgw Backport #64552 (Resolved): squid: rgw/multisite: objects named "." or ".." are not replicated
Casey Bodley
04:55 PM CephFS Bug #65603: mds: quiesce timeout due to a freezing directory
Another one: https://pulpito.ceph.com/leonidus-2024-04-22_12:36:42-fs-wip-lusov-quiescer-distro-default-smithi/766829... Leonid Usov
04:41 PM CephFS Bug #65603: mds: quiesce timeout due to a freezing directory
Another case: https://pulpito.ceph.com/leonidus-2024-04-22_12:36:42-fs-wip-lusov-quiescer-distro-default-smithi/76682... Leonid Usov
02:44 PM CephFS Bug #65603: mds: quiesce timeout due to a freezing directory
... Leonid Usov
02:43 PM CephFS Bug #65603: mds: quiesce timeout due to a freezing directory
Another instance of the same at https://pulpito.ceph.com/leonidus-2024-04-22_12:36:42-fs-wip-lusov-quiescer-distro-de... Leonid Usov
04:54 PM CephFS Tasks #64133: Make pjd work on fscrypt
Make pjd tests pass that are failing:... Christopher Hoffman
04:50 PM CephFS Tasks #65613 (Resolved): truncate failing when using path
The fix:... Christopher Hoffman
04:44 PM CephFS Tasks #65613 (Resolved): truncate failing when using path
Reproducer:... Christopher Hoffman
04:19 PM Ceph Bug #65612 (New): qa: logrotate fails when state file is already locked
... Patrick Donnelly
04:16 PM rgw Bug #65160: rgw/lc: A few buckets stuck in UNINITIAL state
Can this be backported to Squid? Jane Zhu
03:23 PM RADOS Bug #49158 (Resolved): doc: ceph-monstore-tools might create wrong monitor store
Zac Dover
03:21 PM Ceph Documentation #57125 (Resolved): Improve wording of /doc/rados/*
Zac Dover
03:21 PM Ceph Documentation #57108 (Resolved): add ".. prompt :: bash $" to /doc/rados
Zac Dover
03:15 PM Ceph Bug #64446 (Resolved): Backport PR#55540 to Squid (and only Squid) when its commits are merged to main
Zac Dover
03:14 PM Ceph Documentation #65161 (Resolved): Update Zabbix Documentation
Zac Dover
03:12 PM Ceph Documentation #65599 (Resolved): "ceph osd crush rename bucket" command missing
Zac Dover
03:06 PM Ceph Bug #65249 (Resolved): peering_graph.generated.dot renders weird
I used these instructions to build an SVG file of the peering graph:
$ git clone https://github.com/ceph/ceph.git
...
Zac Dover
11:14 AM rbd Backport #65550 (In Progress): squid: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
Ilya Dryomov
11:14 AM rbd Backport #65549 (In Progress): reef: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
Ilya Dryomov
11:12 AM rbd Backport #65547 (In Progress): quincy: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
Ilya Dryomov
10:45 AM Ceph Bug #65611 (New): Segmentation fault in upkeep_main
... Gunther Heinrich
10:27 AM CephFS Bug #65606: workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
Another instance of this issue: https://pulpito.ceph.com/leonidus-2024-04-21_11:37:13-fs-wip-lusov-quiescer-distro-de... Leonid Usov
09:42 AM CephFS Bug #65606: workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
fixing the @request_drop_foreign_locks@ method uncovered another crash due to the same reason, this time when droppin... Leonid Usov
10:08 AM crimson Bug #65610 (Resolved): unittest-object-data-handler crashes testing object_data_handler_test_t.overwrite_then_read_within_transaction
... Xuehan Xu
09:28 AM crimson Bug #65491: recover_missing: racing read got wrong version
> *Hypothesis 2:*
> See: 'Version bump'. Version was bumped to 12 and then both requests were requeued (requeueing c...
Matan Breizman
09:23 AM Ceph Documentation #65609 (Resolved): Documentation of maximum port number is incorrect
The highest port number used by OSD or MDS daemons was increased from 7300 to 7568 in https://github.com/ceph/ceph/pu... Pierre Riteau
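To make the effect of the change concrete, a small sketch can check which ports fall inside the daemon range before and after the increase. The lower bound of 6800 is an assumption (it is not stated in the entry above), and the helper name is hypothetical.

```python
# Assumed lower bound; only the 7300 -> 7568 change comes from the entry above.
PORT_MIN = 6800
PORT_MAX_OLD = 7300
PORT_MAX_NEW = 7568


def in_daemon_range(port, port_max=PORT_MAX_NEW):
    """Return True if port lies in the (assumed) OSD/MDS bind range."""
    return PORT_MIN <= port <= port_max


# Ports that a firewall rule written for the old maximum would miss.
newly_covered = [
    p for p in (7300, 7301, 7568, 7569)
    if in_daemon_range(p) and not in_daemon_range(p, PORT_MAX_OLD)
]
```

A firewall rule written against the old documented maximum would leave the ports in `newly_covered` blocked, which is why the documented value matters.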
08:43 AM crimson Bug #64206: obc->is_loaded_and_valid() assertion
https://pulpito.ceph.com/matan-2024-04-21_15:36:23-crimson-rados-wip-matanb-crimson-testing-snap-overlap-distro-crims... Matan Breizman
08:06 AM RADOS Cleanup #10506 (Rejected): mon: get rid of QuorumServices
I hope this might have been addressed at some point. If not, it probably no longer makes sense to mess with the monit... Joao Eduardo Luis
08:03 AM RADOS Bug #42519: During deployment of the ceph,when the main node starts slower than the other nodes.It may lead to generate a core by assert.
No idea if this is still applicable. Unassigning from me because it hasn't been touched for 4 years now, and I'll lik... Joao Eduardo Luis
06:20 AM CephFS Bug #50260: pacific: qa: "rmdir: failed to remove '/home/ubuntu/cephtest': Directory not empty"
/a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648870
The *mnt.0* was...
Xiubo Li
06:13 AM CephFS Bug #64707: suites/fsstress.sh hangs on one client - test times out
Laura Flores wrote in #note-16:
> /a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-defa...
Xiubo Li
06:05 AM Ceph Bug #65608 (New): Mirroring mode of rbd image changes when migrated between pools
When an rbd image is mirrored and migrated between pools (rbd migration) the mirroring mode changes from "snapshot" (... Stefan Kooman
05:37 AM rgw Bug #64999: Slow RGW multisite sync due to "304 Not Modified" responses on primary zone

Hi Shilpa,
We are eagerly waiting for your direction to resolve it.
I appreciate your attention to this matter....
Mohammad Saif
05:12 AM CephFS Bug #65607: mds deadlock between 'lookup' and the 'rename/create, etc' requests
This is possibly caused by the lock order issue described in https://tracker.ceph.com/issues/62123. Xiubo Li
05:09 AM CephFS Bug #65607 (Need More Info): mds deadlock between 'lookup' and the 'rename/create, etc' requests
I have suggested that Erich set *max_mds = 1* when reproducing it, to get rid of the noise. Xiubo Li
04:51 AM CephFS Bug #65607: mds deadlock between 'lookup' and the 'rename/create, etc' requests
As Erich mentioned, he enabled multiple active MDSs, but he only updated the block ops from one MDS. I guess maybe anot... Xiubo Li
04:33 AM CephFS Bug #65607 (Need More Info): mds deadlock between 'lookup' and the 'rename/create, etc' requests
This was reported by Eric; for more detail, please see https://www.mail-archive.com/ceph-users@ceph.io/msg24587.html
The...
Xiubo Li
04:12 AM Dashboard Bug #65571 (Resolved): mgr/dashboard: run-tox-mgr-dashboard-py3 failure in make check
Nizamudeen A
04:12 AM Dashboard Backport #65581 (Resolved): squid: mgr/dashboard: run-tox-mgr-dashboard-py3 failure in make check
Nizamudeen A

04/21/2024

07:27 PM CephFS Bug #65606: workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
The incorrect behavior of the method that stripped the local quiesce lock from the request resulted in the crash when... Leonid Usov
07:17 PM CephFS Bug #65606 (Fix Under Review): workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
Leonid Usov
06:57 PM CephFS Bug #65606: workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
We had a successful quiesce on the mds.0 followed by the said export dir request. The export dir request has failed t... Leonid Usov
06:30 PM CephFS Bug #65606 (Pending Backport): workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))

https://pulpito.ceph.com/leonidus-2024-04-21_11:37:13-fs-wip-lusov-quiescer-distro-default-smithi/7666598/
The f...
Leonid Usov
06:08 PM CephFS Bug #65605 (Duplicate): fsx.sh workload fails with status 2 due to a makefile error
Duplicate of https://tracker.ceph.com/issues/64572 Leonid Usov
06:06 PM CephFS Bug #65605: fsx.sh workload fails with status 2 due to a makefile error
another instance of the same failure https://pulpito.ceph.com/leonidus-2024-04-21_11:37:13-fs-wip-lusov-quiescer-dist... Leonid Usov
06:05 PM CephFS Bug #65605 (Duplicate): fsx.sh workload fails with status 2 due to a makefile error
https://pulpito.ceph.com/leonidus-2024-04-21_11:37:13-fs-wip-lusov-quiescer-distro-default-smithi/7666610/... Leonid Usov
05:46 PM CephFS Bug #65604 (Triaged): dbench.sh workload times out after 3h when run with-quiescer
https://pulpito.ceph.com/leonidus-2024-04-21_11:37:13-fs-wip-lusov-quiescer-distro-default-smithi/7666604/
No quie...
Leonid Usov
05:14 PM CephFS Bug #65603: mds: quiesce timeout due to a freezing directory
https://pulpito.ceph.com/leonidus-2024-04-21_11:37:13-fs-wip-lusov-quiescer-distro-default-smithi/7666602/... Leonid Usov
04:24 PM CephFS Bug #65603: mds: quiesce timeout due to a freezing directory
The directory appears to be fragmenting, as we see from a few messages in the log... Leonid Usov
04:09 PM CephFS Bug #65603 (Pending Backport): mds: quiesce timeout due to a freezing directory
Analyzing one of the ETIMEDOUT errors for a quiesce, looking at
https://pulpito.ceph.com/leonidus-2024-04-21_11:37:13...
Leonid Usov
02:02 PM Ceph QA QA Run #65592: wip-yuriw-testing-20240419.185239-main
https://pulpito.ceph.com/?branch=wip-yuriw-testing-20240419.185239-main Yuri Weinstein
02:01 PM Ceph QA QA Run #65594: wip-yuriw11-testing-20240501.200505-squid
https://pulpito.ceph.com/?branch=wip-yuriw-testing-20240419.202307-squid Yuri Weinstein
01:21 PM crimson Bug #65532 (Fix Under Review): osd crashes due to invalid clone_range ops
Matan Breizman
01:17 PM crimson Bug #64782 (Resolved): test_python.sh TestIoctx.test_locator failes in cases of SeaStore
Matan Breizman
01:12 PM crimson Bug #65531 (In Progress): crimson-osd: dump_historic_slow_ops command not correctly run
Matan Breizman
01:12 PM sepia Support #65535 (In Progress): Sepia Lab Access Request
Hey Kalpesh Pandya,
You should have access to the Sepia lab now. Please verify you're able to connect to the vpn a...
adam kraitman
01:00 PM crimson Support #65602 (New): Support RBD mirror testing
See: qa/suites/rbd/mirror-thrash and qa/suites/rbd/mirror Matan Breizman
12:57 PM crimson Bug #65601 (New): rados_python.yaml enable tests
Currently some of rados_python tests are disabled:... Matan Breizman
12:27 PM bluestore Fix #65600 (Fix Under Review): bluefs alloc unit should only be shrink
Changing the alloc unit is already forbidden for bluestore; moreover, increasing it should be forbidden in bluefs. Oth... Arvin Liang
09:43 AM crimson Bug #65474 (Resolved): mgr crash due to corrupted incremental osdmap sent by crimson-osds
Matan Breizman
09:43 AM crimson Bug #65200 (Resolved): PeeringState::get_peer_info(pg_shard_t) const: Assertion `it != peer_info.end()' failed.
Matan Breizman
09:42 AM crimson Bug #59242 (Resolved): [crimson] Pool compression does not take effect
Matan Breizman
09:25 AM CephFS Backport #65556 (Fix Under Review): squid: mds: avoid recalling Fb when quiescing file
Leonid Usov
09:21 AM CephFS Backport #65556 (In Progress): squid: mds: avoid recalling Fb when quiescing file
Leonid Usov
09:08 AM crimson Bug #63647: SnapTrimEvent AddressSanitizer: heap-use-after-free
https://pulpito.ceph.com/matan-2024-04-21_07:41:30-crimson-rados-wip-matanb-crimson-only-testing-april-17-distro-crim... Matan Breizman
08:12 AM rgw Feature #53662: rgw: radosgw-admin can list and remove bucket notification topics; it must also be able to create them
agree we should close.
* topic creation by an admin will mess up the topic ownership logic
* we can create notifica...
Yuval Lifshitz
07:58 AM Ceph Documentation #65599: "ceph osd crush rename bucket" command missing
Eugen Block, as usual, to the rescue:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/IQUPWQZ5ZIQ...
Zac Dover
07:52 AM Ceph Documentation #65599 (Resolved): "ceph osd crush rename bucket" command missing
https://docs.ceph.com/en/latest/rados/operations/crush-map/
The "ceph osd crush rename bucket" command is not list...
Zac Dover
12:13 AM Ceph Bug #65598 (New): github v18.2.2 tag removed
I have some automation that looks for git tags on github that broke recently because the v18.2.2 tag was removed from... Iggy Jackson

04/20/2024

06:07 PM Ceph QA QA Run #65596 (QA Approved): wip-pdonnell-testing-20240420.180737-debug
* "PR #57010":https://github.com/ceph/ceph/pull/57010 -- mds: add missing policylock to test F_QUIESCE_BLOCK
* "PR #...
Patrick Donnelly
06:07 PM Ceph QA QA Run #65562 (QA Closed): wip-pdonnell-testing-20240418.004638-debug
Patrick Donnelly
03:41 PM Ceph QA QA Run #65594 (QA Needs Approval): wip-yuriw11-testing-20240501.200505-squid
Yuri Weinstein
03:39 PM Ceph QA QA Run #65594: wip-yuriw11-testing-20240501.200505-squid
Note to self: this batch is by mistake labeled "yuri" instead of "yuri11" Yuri Weinstein
03:33 PM Ceph QA QA Run #65592 (QA Needs Approval): wip-yuriw-testing-20240419.185239-main
Yuri Weinstein
03:32 PM Ceph QA QA Run #65558 (QA Needs Approval): wip-yuri4-testing-2024-04-19-0708-quincy (old wip-yuriw-testing-20240417.204632 (wip-yuri4-testing))
Yuri Weinstein
03:23 PM Ceph QA QA Run #65126: wip-yuri8-testing-2024-03-25-1419
@pdvian if you approved this batch pls change the status and assign to me for merge Yuri Weinstein
03:21 PM Ceph QA QA Run #65574 (QA Needs Approval): wip-yuri7-testing-2024-04-18-1351-reef
Yuri Weinstein
01:14 AM Ceph QA QA Run #65574 (QA Approved): wip-yuri7-testing-2024-04-18-1351-reef
Yuri Weinstein
03:07 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
I reproduced the issue again with debug logs.
Tested flow:
- Configure rbd mirroring on both clusters
- Wait for...
Nir Soffer

04/19/2024

11:31 PM Ceph QA QA Run #65126: wip-yuri8-testing-2024-03-25-1419
Failure, unrelated:
1. cephadm: Health detail: HEALTH_WARN 1/3 mons down, quorum a,c in cluster log
2. cephadm: Heal...
Prashant D
11:30 PM CephFS Bug #65595 (Fix Under Review): mds: missing policylock acquisition for quiesce
Patrick Donnelly
11:28 PM CephFS Bug #65595 (Pending Backport): mds: missing policylock acquisition for quiesce
In order to check an inode's F_QUIESCE_BLOCK, the quiesce_inode op must acquire the policylock. Furthermore, to ensur... Patrick Donnelly
11:20 PM RADOS Bug #51729: Upmap verification fails for multi-level crush rule
Update on this bug:
We are pretty close to getting the fix out for this. Thanks all for waiting so long. In additi...
Laura Flores
08:23 PM Ceph QA QA Run #65594 (QA Approved): wip-yuriw11-testing-20240501.200505-squid
* "PR #57006":https://github.com/ceph/ceph/pull/57006 -- squid: osd/PGBackend::be_scan_list: only call stat, getattrs... Yuri Weinstein
07:21 PM RADOS Backport #65593: squid: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
https://github.com/ceph/ceph/pull/57006 Radoslaw Zarzynski
06:55 PM RADOS Backport #65593 (New): squid: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
/a/teuthology-2024-03-22_02:08:13-upgrade-squid-distro-default-smithi/7616025/remote/smithi098/log/b1f19696-e81a-11ee... Radoslaw Zarzynski
06:52 PM Ceph QA QA Run #65592 (QA Closed): wip-yuriw-testing-20240419.185239-main
* "PR #56995":https://github.com/ceph/ceph/pull/56995 -- osd: only call stat/getattrs once per object during deep-scrub Yuri Weinstein
06:51 PM RADOS Bug #65185 (Fix Under Review): OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
Radoslaw Zarzynski
05:57 PM rgw Feature #20094 (Resolved): RFW: make civetweb max request size configurable to allow larger s3 object metadata
Casey Bodley
05:56 PM rgw Feature #19917 (Closed): radosgw access log is lacking useful information
Casey Bodley
05:56 PM rgw Feature #19510 (Resolved): per-object storage class
Casey Bodley
05:55 PM rgw Feature #20398 (Resolved): rgw: Swift TempURL does not support prefix-based scope
Casey Bodley
05:54 PM rgw Feature #20733 (Closed): RGW bucket limits
Casey Bodley
05:53 PM rgw Feature #20650 (Resolved): Support webhook for authentication
Casey Bodley
05:52 PM rgw Feature #20795 (Resolved): rgw: the TempURL implementation should support ISO8601 in temp_url_expires
Casey Bodley
05:52 PM rgw Feature #20883 (Resolved): rgw: responses for HEAD/GET on Swift's container should contain Last-Modified
Casey Bodley
05:51 PM rgw Feature #21334 (Resolved): support log response header “x-amz-request-id ”
Casey Bodley
05:50 PM rgw Feature #21799 (Rejected): multisite: sync parts of multipart uploads
Casey Bodley
05:49 PM rgw Feature #22565 (Resolved): Multiple Data Pool Support for a Bucket
this is supported through storage classes: https://docs.ceph.com/en/latest/radosgw/placement/ Casey Bodley
05:48 PM rgw Feature #24335 (Resolved): Get the user metadata of the user used to sign the request
Casey Bodley
05:46 PM rgw Feature #24493 (Resolved): rgw does not implement list_object_v2 in S3
Casey Bodley
05:43 PM rgw Feature #24507 (Resolved): [rfe] rgw: relaxed region constraint enforcement
Casey Bodley
05:41 PM rgw Feature #39084 (Resolved): ability to control user op mask via admin apis
Casey Bodley
05:40 PM rgw Feature #40241 (Rejected): radosgw: ldap groups
Casey Bodley
05:39 PM rgw Feature #40242 (Rejected): radosgw-admin: export & import buckets
Casey Bodley
05:37 PM rgw Feature #40392 (Rejected): radosgw-admin: create bucket
Casey Bodley
05:35 PM rgw Feature #40714 (Closed): usage log differ from civetweb and beast
Casey Bodley
05:35 PM rgw Feature #41062 (Resolved): Extend SSE-KMS in Rados Gateway to support HashiCorp Vault
Casey Bodley
05:35 PM rgw Feature #41222 (Rejected): multisite: delay sync data to non-master zone
Casey Bodley
05:34 PM rgw Feature #42513 (Resolved): rgw: radosgw-admin command line parsing cleanup and improvements
Casey Bodley
05:34 PM rgw Feature #42627 (Resolved): rgw: bucket granularity sync: bucket dependency index
Casey Bodley
05:33 PM rgw Feature #42626 (Resolved): rgw: bucket granularity sync: core sync changes
Casey Bodley
05:33 PM rgw Feature #42625 (Resolved): rgw: bucket granularity sync: sync policy
Casey Bodley
05:33 PM rgw Feature #42272 (Resolved): rgw set cpu affinity at startup
Casey Bodley
05:33 PM rgw Feature #42493 (Rejected): Simplify Login Radosgw-admin API
ceph provides a shell script in https://github.com/ceph/ceph/blob/main/examples/rgw/rgw_admin_curl.sh that adds sigv2... Casey Bodley
05:31 PM rgw Feature #45444 (Resolved): Add bucket name to bucket stats error logging
Casey Bodley
05:30 PM rgw Feature #45568 (Resolved): Swift Extract Archive Operation
Casey Bodley
05:30 PM rgw Feature #45748 (Closed): recommended max number of buckets....
we don't intend there to be any scaling limit to the number of total buckets in the system. there are limitations on ... Casey Bodley
05:27 PM rgw Feature #46028 (Resolved): RGW User Policy
Casey Bodley
05:25 PM rgw Feature #48402 (Resolved): multisite option to enable keepalive
Casey Bodley
05:25 PM rgw Feature #48513 (Rejected): uses librgw2 to directly access the rados cluster for hadoop
Casey Bodley
05:24 PM rgw Feature #48798 (Resolved): RGW:Multisite: Verify if the synced object is identical to source
Casey Bodley
05:24 PM rgw Feature #49227 (Resolved): rgw: register daemon in service map with more details
Casey Bodley
05:22 PM rgw Feature #50262 (Duplicate): rgw header size limit should configurable
Casey Bodley
05:20 PM rgw Feature #53546 (Resolved): rgw/beast: add max_header_size option with 16k default, up from 4k
Casey Bodley
05:09 PM rgw Feature #55016 (Resolved): radosgw-admin should allow setting user policy
Casey Bodley
05:07 PM rgw Bug #23264 (In Progress): Server side encryption support for s3 COPY operation
Casey Bodley
05:07 PM rgw Feature #55481 (Resolved): The latest version of server encryption does not support "aes256" as kms encryption method
Casey Bodley
05:04 PM rgw Feature #55640 (Rejected): make lua scripting optional
the attached pull request closed a year ago
i personally don't see much benefit to disabling lua at compile time. ...
Casey Bodley
04:56 PM rgw Feature #53662 (Need More Info): rgw: radosgw-admin can list and remove bucket notification topics; it must also be able to create them
trying to scrub some old feature requests. is there still interest in this?
in general, i don't think radosgw-admi...
Casey Bodley
04:36 PM rgw Feature #59593 (Closed): The capability of resetting an empty bucket to the clean-slate state in multi-site environment
Casey Bodley
03:46 PM rgw Feature #63930 (Duplicate): s3: implement GetObjectAttributes
Casey Bodley
03:06 PM rgw Feature #64190 (Resolved): support lifecycle NewerNoncurrentVersions in NoncurrentVersionExpiration
already backported to squid with https://github.com/ceph/ceph/pull/56144 Casey Bodley
02:50 PM RADOS Bug #65591 (New): Pool MAX_AVAIL goes UP when an OSD is marked down+in
Example:
* Cluster with 4 OSD nodes, 10 OSDs each
* 3x replicated pool
* `max_avail` from `ceph df detail --format...
Michael Kidd
02:28 PM Ceph QA QA Run #65270: wip-yuri6-testing-2024-04-02-1310
@lflores rerun => https://pulpito.ceph.com/yuriw-2024-04-19_14:26:57-rados-wip-yuri6-testing-2024-04-02-1310-distro-d... Yuri Weinstein
02:25 PM Ceph QA QA Run #65270 (QA Needs Approval): wip-yuri6-testing-2024-04-02-1310
Yuri Weinstein
02:00 PM Ceph QA QA Run #65574: wip-yuri7-testing-2024-04-18-1351-reef
repushed Yuri Weinstein
01:32 PM rgw Bug #65590 (Pending Backport): rgw_multi.tests.test_topic_notification_sync: PutBucketNotificationConfiguration fails with ConcurrentModification
... Casey Bodley
12:38 PM sepia Support #64967: Sepia Lab Access Request
Hi Adam,
Please update if these new creds have been granted access. Thanks!
Soumya Koduri
12:09 PM Ceph Support #65589 (New): is there any method to restore deleted rbd images
Hi, there
We're running a very old ceph rbd cluster. Today a team deleted a bunch of (about 1.5k images and 12TiB ...
Jianyun Cheng
10:39 AM rbd Backport #65588 (In Progress): quincy: insufficient randomness for group and group snapshot IDs
https://github.com/ceph/ceph/pull/57090 Backport Bot
10:38 AM rbd Backport #65587 (Resolved): squid: insufficient randomness for group and group snapshot IDs
https://github.com/ceph/ceph/pull/57092 Backport Bot
10:38 AM rbd Backport #65586 (In Progress): reef: insufficient randomness for group and group snapshot IDs
https://github.com/ceph/ceph/pull/57091 Backport Bot
10:34 AM rbd Bug #65573 (Pending Backport): insufficient randomness for group and group snapshot IDs
Ilya Dryomov
09:39 AM Ceph Bug #65176: BlueFS: _estimate_log_size_N calculates the log size incorrectly
What is calculated here should be the total bytes occupied by the names of all files. @Igor Fedotov linke wang
09:29 AM crimson Bug #65531: crimson-osd: dump_historic_slow_ops command not correctly run
https://github.com/ceph/ceph/pull/56994 junxiang mu
07:50 AM crimson Bug #65585: unittest-seastore (Timeout)
https://github.com/ceph/ceph/pull/56979... Yingxin Cheng
07:47 AM crimson Bug #65585: unittest-seastore (Timeout)
https://github.com/ceph/ceph/pull/56982... Yingxin Cheng
07:35 AM crimson Bug #65585: unittest-seastore (Timeout)
The pasted log is from https://github.com/ceph/ceph/pull/56998#issuecomment-2065880693 Yingxin Cheng
07:33 AM crimson Bug #65585 (Resolved): unittest-seastore (Timeout)
... Yingxin Cheng
07:18 AM ceph-volume Bug #65584 (Fix Under Review): ceph-volume: use os.makedirs to implement mkdir_p
ceph-volume failed if /var/lib/ceph/osd/ does not exist... Yuanrun Chen
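As an illustration of the approach named in the title, a minimal sketch of a `mkdir_p` helper built on `os.makedirs` might look like the following. The function signature and the temporary path are assumptions for illustration, not the actual ceph-volume code.

```python
import os
import tempfile


def mkdir_p(path, mode=0o755):
    """Create path and any missing parents; do nothing if it already exists.

    Hypothetical helper, not the real ceph-volume implementation.
    """
    # exist_ok=True makes the call idempotent, like `mkdir -p`.
    os.makedirs(path, mode=mode, exist_ok=True)


# Usage: none of the parent directories exist yet, mirroring a missing
# /var/lib/ceph/osd/ directory.
base = tempfile.mkdtemp()
target = os.path.join(base, "var", "lib", "ceph", "osd", "ceph-0")
mkdir_p(target)
mkdir_p(target)  # a second call is a no-op rather than an error
created = os.path.isdir(target)
```

Unlike a plain `os.mkdir`, this creates the intermediate directories as well, so a missing parent no longer causes a failure.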
06:47 AM RADOS Feature #65583 (New): mon store data should be available depending on the user keyring
Data on the mon store should be restricted for a specific ceph user.
Let's say if client.user1 store data `clien...
Parth Arora
06:00 AM Ceph Backport #65582 (New): squid: qa/vstart_runner: increase timeout for sake of "Ceph API tests" CI job
Backport Bot
05:41 AM Ceph Bug #65565 (Pending Backport): qa/vstart_runner: increase timeout for sake of "Ceph API tests" CI job
Nizamudeen A
05:40 AM Dashboard Backport #65581 (In Progress): squid: mgr/dashboard: run-tox-mgr-dashboard-py3 failure in make check
Nizamudeen A
05:29 AM Dashboard Backport #65581 (Resolved): squid: mgr/dashboard: run-tox-mgr-dashboard-py3 failure in make check
https://github.com/ceph/ceph/pull/56999 Backport Bot
05:23 AM Dashboard Bug #65571 (Pending Backport): mgr/dashboard: run-tox-mgr-dashboard-py3 failure in make check
Nizamudeen A
05:17 AM CephFS Bug #65580 (Triaged): mds/client: add dummy client feature to test client eviction
Currently, fs:upgrade:featureful_client:old_client uses octopus client with a newer MDS. The octopus client lacks a p... Venky Shankar
03:49 AM CephFS Fix #65579 (New): mds: use _exit for QA killpoints rather than SIGABRT
Using signals to abruptly kill the MDS has a few issues:
- teuthology logs are polluted with stacktraces
- coredu...
Patrick Donnelly
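The difference between the two ways of stopping a process can be sketched in Python rather than in the MDS's C++. On POSIX systems, a child that calls `os._exit` leaves with a plain exit status, while one that calls `os.abort` is killed by SIGABRT. The exit code 42 and the child snippets are assumptions chosen for illustration.

```python
import signal
import subprocess
import sys


def run_child(snippet):
    """Run a Python snippet in a child process and return its return code."""
    return subprocess.run(
        [sys.executable, "-c", snippet],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    ).returncode


# Immediate exit: no signal is raised and no cleanup handlers run.
exit_rc = run_child("import os; os._exit(42)")

# Abort: the process is terminated by SIGABRT and may leave a core dump.
abort_rc = run_child("import os; os.abort()")
```

On POSIX, `subprocess` reports death by signal as a negative return code, so a test harness can tell the two cases apart.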
03:37 AM cephsqlite Bug #65494 (Fix Under Review): ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
Patrick Donnelly
02:48 AM devops Backport #65578 (In Progress): reef: ccache is always miss in confusa14
Rixin Luo
02:38 AM devops Backport #65578 (In Progress): reef: ccache is always miss in confusa14
https://github.com/ceph/ceph/pull/56993 Backport Bot
02:47 AM devops Backport #65577 (In Progress): squid: ccache is always miss in confusa14
Rixin Luo
02:38 AM devops Backport #65577 (In Progress): squid: ccache is always miss in confusa14
https://github.com/ceph/ceph/pull/56992 Backport Bot
02:47 AM devops Backport #65576 (In Progress): quincy: ccache is always miss in confusa14
Rixin Luo
02:38 AM devops Backport #65576 (In Progress): quincy: ccache is always miss in confusa14
https://github.com/ceph/ceph/pull/56991 Backport Bot
02:30 AM devops Bug #65175 (Pending Backport): ccache is always miss in confusa14
Rixin Luo
12:45 AM Ceph Bug #65249: peering_graph.generated.dot renders weird
size="7,7" in peering_graph_generated.dot causes the peering_graph_generated.svg file to look the (wrong) way that ca... Zac Dover
12:37 AM Ceph Bug #65249: peering_graph.generated.dot renders weird
dot -Tsvg doc/dev/peering_graph.generated.dot > doc/dev/peering_graph.generated.svg
The above command as of today ...
Zac Dover
12:34 AM Linux kernel client Bug #65563: WARNING: CPU: 7 PID: 40807 at mm/page_alloc.c:4545 __alloc_pages+0x1e7/0x270
I have fixed the kernel call trace in kernel space; the patch link is https://patchwork.kernel.org/project/ceph-devel/l... Xiubo Li

04/18/2024

11:05 PM rbd Bug #54292: run-rbd-unit-tests-127.sh times out on Jenkins "make check" runs
sorry to pile on, but it's hard to know which tracker issue is related to which crash. from squid pr https://jenkins.... Casey Bodley
10:52 PM Ceph QA QA Run #65237: wip-ceph_test_rados-partial-reads
@rzarzynski ping? Yuri Weinstein
10:51 PM Ceph QA QA Run #65560 (QA Needs Approval): wip-yuri5-testing-2024-04-17-1400
Yuri Weinstein
10:12 PM rgw Bug #65575 (Pending Backport): release note for rgw_realm init
Casey Bodley
09:25 PM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
... Samuel Just
08:19 PM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
https://github.com/rzarzynski/ceph/commit/1a4d3f01816cedb15106fe2cdb52322029482827 changed ScrubMap::object::attrs to... Samuel Just
09:24 PM rgw Bug #64841: java_s3tests: testObjectCreateBadExpectMismatch failure
i tried running the python reproducer from https://tracker.ceph.com/issues/58286, but it doesn't reproduce the @bad m... Casey Bodley
09:08 PM rgw Bug #64841: java_s3tests: testObjectCreateBadExpectMismatch failure
thanks Ali, that's super helpful. i came across https://tracker.ceph.com/issues/58286 which looks like the exact same... Casey Bodley
08:55 PM rgw Bug #64841: java_s3tests: testObjectCreateBadExpectMismatch failure
Here is a snippet with two of those "bad method" statements from the log I referenced in the last comment.
https:/...
Ali Maredia
08:33 PM rgw Bug #64841: java_s3tests: testObjectCreateBadExpectMismatch failure
After having radosgw under valgrind and running the java s3tests I was able to reproduce the "failed to read header: ... Ali Maredia
08:52 PM Ceph QA QA Run #65574 (QA Closed): wip-yuri7-testing-2024-04-18-1351-reef

--- done. these PRs were included:
https://github.com/ceph/ceph/pull/54150 - reef: ceph_mon: Fix MonitorDBStore us...
Yuri Weinstein
05:21 PM rbd Bug #65573 (Fix Under Review): insufficient randomness for group and group snapshot IDs
Ilya Dryomov
05:12 PM rbd Bug #65573 (Pending Backport): insufficient randomness for group and group snapshot IDs
Nithya noticed that group IDs end up being very similar:... Ilya Dryomov
05:13 PM Dashboard Feature #56429: mgr/dashboard: Remote user authentication (e.g. via apache2)
If SSO should be the primary login method, and the local login is only needed for emergencies (Network/IdP down), the... Jan Graichen
05:07 PM Dashboard Feature #56429: mgr/dashboard: Remote user authentication (e.g. via apache2)
Hello Ernesto,
This interface seems to imply that a username and password are entered on a login page and passed to...
Jan Graichen
05:06 PM rbd Backport #65548 (Duplicate): reef: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
The bot created two reef backport tickets for some reason. Ilya Dryomov
04:18 PM rgw Feature #65551 (Fix Under Review): [rgw][accounts] bucket quota management at account-level
Casey Bodley
04:00 PM RADOS Feature #64519: OSD/MON: No snapshot metadata keys trimming
Eugen Block wrote in #note-10:
> Thanks, Matan! It sounds very promising. I talked to the customer and they are will...
Matan Breizman
03:34 PM Ceph QA QA Run #65510 (QA Closed): wip-yuriw-testing-20240416.150233
Yuri Weinstein
03:34 PM Ceph QA QA Run #65510: wip-yuriw-testing-20240416.150233
@matan @lflores thx a million! Yuri Weinstein
03:27 PM Ceph QA QA Run #65510: wip-yuriw-testing-20240416.150233
Laura Flores wrote in #note-6:
> @matan I forgot to say, can you also include a "Rados approved: <link to your summa...
Matan Breizman
03:18 PM Ceph QA QA Run #65510: wip-yuriw-testing-20240416.150233
@matan I forgot to say, can you also include a "Rados approved: <link to your summary>" message on the PRs now that t... Laura Flores
12:10 PM Ceph QA QA Run #65510 (QA Approved): wip-yuriw-testing-20240416.150233
7659275, 7659345, 7659406, 7659407, 7659470 - https://tracker.ceph.com/issues/61774
7659280 - https://tracker.ceph.c...
Matan Breizman
03:34 PM RADOS Backport #65306: squid: src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps)
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56814
merged
Yuri Weinstein
03:33 PM RADOS Backport #65312: squid: decoding chunk_refs_by_hash_t return wrong values
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56697
merged
Yuri Weinstein
03:33 PM RADOS Backport #65072: squid: rados/thrash: slow reservation response from 1 (115547ms) in cluster log
https://github.com/ceph/ceph/pull/56482 merged Yuri Weinstein
03:31 PM RADOS Backport #65140: squid: osd: modify PG deletion cost for mClock scheduler
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56474
merged
Yuri Weinstein
03:31 PM mgr Backport #65117: squid: rados/upgrade/parallel: [WRN] TELEMETRY_CHANGED: Telemetry requires re-opt-in
Laura Flores wrote:
> https://github.com/ceph/ceph/pull/56457
merged
Yuri Weinstein
03:30 PM RADOS Backport #65097: squid: ceph osd pool rmsnap clone object leak
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56432
merged
Yuri Weinstein
03:03 PM Ceph QA QA Run #65330 (QA Closed): wip-yuri7-testing-2024-04-04-0800
Yuri Weinstein
02:49 PM Ceph QA QA Run #65330 (QA Approved): wip-yuri7-testing-2024-04-04-0800
Yuri Weinstein
01:06 PM Ceph QA QA Run #65330: wip-yuri7-testing-2024-04-04-0800
fs approve. failures are - https://tracker.ceph.com/projects/cephfs/wiki/Squid#2024-04-18
NOTE: A couple of PRs ha...
Venky Shankar
05:38 AM Ceph QA QA Run #65330: wip-yuri7-testing-2024-04-04-0800
Yuri Weinstein wrote in #note-7:
> @vshankar ping!
Apologies - on it now!
Venky Shankar
03:03 PM CephFS Backport #65295: squid: High cephfs MDS latency and CPU load with snapshots and unlink operations
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56671
merged
Yuri Weinstein
03:03 PM CephFS Backport #65106: squid: qa: probabilistically ignore PG_AVAILABILITY/PG_DEGRADED
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56665
merged
Yuri Weinstein
03:02 PM CephFS Backport #65275: squid: mds: some request errors come from errno.h rather than fs_types.h
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56663
merged
Yuri Weinstein
02:21 PM Linux kernel client Bug #65563: WARNING: CPU: 7 PID: 40807 at mm/page_alloc.c:4545 __alloc_pages+0x1e7/0x270
Venky, IMO this should also be the same issue as this one:
https://pulpito.ceph.com/vshankar-2024-03-13_13:59:32-fs...
Xiubo Li
10:19 AM Linux kernel client Bug #65563 (Fix Under Review): WARNING: CPU: 7 PID: 40807 at mm/page_alloc.c:4545 __alloc_pages+0x1e7/0x270
Xiubo Li
07:45 AM Linux kernel client Bug #65563 (In Progress): WARNING: CPU: 7 PID: 40807 at mm/page_alloc.c:4545 __alloc_pages+0x1e7/0x270
The mds sent out the open session reply with *cap_auths [MDSCapAuth( uid=1000 gids=1301readable=1, writeable=1),MDSCa... Xiubo Li
07:26 AM Linux kernel client Bug #65563 (Fix Under Review): WARNING: CPU: 7 PID: 40807 at mm/page_alloc.c:4545 __alloc_pages+0x1e7/0x270
https://pulpito.ceph.com/yuriw-2024-04-05_22:36:11-fs-wip-yuri7-testing-2024-04-04-0800-distro-default-smithi/7642062... Venky Shankar
02:21 PM rgw Bug #64971 (New): Rgw lifecycle skip
Casey Bodley
02:20 PM rgw Bug #64983 (Fix Under Review): multisite: two-zonegroup tests get stuck in redirect loops
Casey Bodley
02:18 PM Ceph QA QA Run #65558: wip-yuri4-testing-2024-04-19-0708-quincy (old wip-yuriw-testing-20240417.204632 (wip-yuri4-testing))
retriggered centos8 Yuri Weinstein
02:17 PM rgw Bug #65216 (In Progress): rgw: only accept valid ipv4 from host header
Casey Bodley
02:17 PM Ceph QA QA Run #65552 (QA Needs Approval): wip-yuri2-testing-2024-04-17-0823-reef
Yuri Weinstein
02:16 PM Ceph QA QA Run #65552: wip-yuri2-testing-2024-04-17-0823-reef
@lflores running with -p 75 Yuri Weinstein
02:16 PM rgw Bug #65369 (Fix Under Review): rgw: allow disabling bucket stats on head bucket
Casey Bodley
02:16 PM rgw Bug #65397 (Fix Under Review): rgw: allow disabling mdsearch APIs
Casey Bodley
02:15 PM rgw Bug #65436 (Need More Info): Getting Object Crashing radosgw services
> After upgrade to 17.2.7, this bug gone
it sounds like this bug is fixed in a later point release, can you please t...
Casey Bodley
02:10 PM rgw Bug #65462 (Fix Under Review): rgw: eliminate ssl enforcement for sse-s3 encryption
Casey Bodley
02:09 PM rgw Bug #65468 (Fix Under Review): rgw: set correct requestId and hostId on s3select error
Casey Bodley
02:03 PM rgw Bug #65337 (Fix Under Review): rgw: Segmentation fault in rgw::notify::Manager during realm reload
Casey Bodley
01:48 PM CephFS Bug #65572 (New): Command failed (workunit test fs/snaps/untar_snap_rm.sh) on smithi155 with status 1
This has started to show up again (with fs/thrash). See: https://pulpito.ceph.com/yuriw-2024-04-05_22:36:11-fs-wip-yu... Venky Shankar
01:28 PM CephFS Backport #65570 (Fix Under Review): squid: Quiesce may fail randomly with EBADF due to the same root submitted to the MDCache multiple times under the same quiesce request
Leonid Usov
12:40 PM CephFS Backport #65570 (Fix Under Review): squid: Quiesce may fail randomly with EBADF due to the same root submitted to the MDCache multiple times under the same quiesce request
Backport Bot
01:22 PM Dashboard Bug #65571 (Resolved): mgr/dashboard: run-tox-mgr-dashboard-py3 failure in make check
... Nizamudeen A
01:19 PM sepia Bug #65475: folio03 install
Thanks @akraitma , i am able to connect and use folio03 Nitzan Mordechai
01:02 PM Dashboard Bug #62972: ERROR: test_list_enabled_module (tasks.mgr.dashboard.test_mgr_module.MgrModuleTest)
https://jenkins.ceph.com/job/ceph-api/72585/ Casey Bodley
01:02 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
Ilya Dryomov wrote in #note-14:
> log_to_file gets set to true by Rook as part of enabling the log collector:
>
>...
Nir Soffer
07:23 AM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
log_to_file gets set to true by Rook as part of enabling the log collector:
https://github.com/rook/rook/blob/a9fd...
Ilya Dryomov
12:58 PM RADOS Bug #65449 (Fix Under Review): NeoRadosWatchNotify.WatchNotifyTimeout failed due to nonexistent pool
Nitzan Mordechai
12:09 PM RADOS Bug #65449: NeoRadosWatchNotify.WatchNotifyTimeout failed due to nonexistent pool
/a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659537 Matan Breizman
12:35 PM CephFS Bug #65545 (Pending Backport): Quiesce may fail randomly with EBADF due to the same root submitted to the MDCache multiple times under the same quiesce request
Leonid Usov
12:34 PM sepia Support #65535: Sepia Lab Access Request
Public ssh key:
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCXMJ5WoP5k7wk5XuRZjjUESEuH38UoIDWmYqb0e7VPEy3c05whYa2ctuLj+/...
Kalpesh Pandya
12:10 PM RADOS Bug #44510: osd/osd-recovery-space.sh TEST_recovery_test_simple failure
/a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659542
Matan Breizman
12:09 PM RADOS Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
/a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659395
/a/yuriw-2024-04-...
Matan Breizman
12:08 PM RADOS Bug #65186: OSDs unreachable in upgrade test
/a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659312
/a/yuriw-2024-04-...
Matan Breizman
12:07 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
/a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659305
Matan Breizman
12:07 PM Infrastructure Bug #65448: Teuthology unable to find the "ceph-radosgw" package
/a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659304
Matan Breizman
12:06 PM RADOS Bug #65183: Overriding an EC pool needs the "--yes-i-really-mean-it" flag in addition to "force"
/a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659300
/a/yuriw-2024-04-...
Matan Breizman
12:06 PM Orchestrator Bug #52109: test_cephadm.sh: Timeout('Port 8443 not free on 127.0.0.1.',)
/a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659292
Matan Breizman
12:05 PM RADOS Bug #62839: Teuthology failure in LibRadosTwoPoolsPP.HitSetWrite
/a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659285
Matan Breizman
12:05 PM RADOS Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
/a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659280
Matan Breizman
12:04 PM RADOS Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
/a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659275/
/a/yuriw-2024-04...
Matan Breizman
11:19 AM CephFS Bug #64659: mds: switch to using xlists instead of elists
@vshankar any thoughts on this? Dhairya Parmar
11:00 AM Dashboard Bug #65569 (New): exporter: allow all zone names pattern for sync counters
Currently the exporter only supports zone names which have `-`s in between for rgw sync metrics. Adapt the regex to also... Avan Thakkar
10:28 AM crimson Bug #65568 (New): osd crashes when trimming snaps involves unrecovered objects
The current crimson implementation doesn't recover objects when trimming snaps. So, if we are trimming a snapshot, an... Xuehan Xu
09:58 AM Ceph Bug #65228 (Fix Under Review): class:device-class config database mask does not work for osd_compact_on_start
Igor Fedotov
09:52 AM rgw Bug #65567 (Duplicate): admin_socket_output: signal: Terminated from term radosgw
... Matan Breizman
09:49 AM Dashboard Bug #65506 (Resolved): rgw roles e2e tests failure
Nizamudeen A
09:49 AM Dashboard Backport #65542 (Resolved): squid: rgw roles e2e tests failure
Nizamudeen A
09:44 AM Ceph Bug #65565: qa/vstart_runner: increase timeout for sake of "Ceph API tests" CI job
The commit has been cherry-picked to a different PR for a faster merge and to avoid circular dependency for CI to be ... Rishabh Dave
09:24 AM Ceph Bug #65565 (Pending Backport): qa/vstart_runner: increase timeout for sake of "Ceph API tests" CI job
Rishabh Dave
09:37 AM nvme-of Feature #65566 (Pending Backport): Change some default values for OMAP lock parameters in nvmeof conf file
We want to change some default values in the OMAP lock parameters in the nvmeof conf file generated by cephadm:
* ...
Gil Bregman
09:35 AM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
Dhairya Parmar wrote in #note-28:
> as mentioned in yesterday's standup - some of the PRs (https://github.com/ceph/c...
Venky Shankar
09:19 AM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
apart from the discussion about MDS verifying clients OSDs set, https://tracker.ceph.com/issues/64563#note-28 also ne... Dhairya Parmar
09:23 AM Ceph Bug #65533 (Resolved): qa/vstart_runner.py: don't let command run after timeout
Rishabh Dave
09:15 AM CephFS Bug #65564 (Fix Under Review): Test failure: test_snap_schedule_subvol_and_group_arguments_08 (tasks.cephfs.test_snap_schedules.TestSnapSchedulesSubvolAndGroupArguments)
/a/yuriw-2024-04-05_22:36:11-fs-wip-yuri7-testing-2024-04-04-0800-distro-default-smithi/7642196... Venky Shankar
08:33 AM CephFS Bug #64977: mds spinlock due to lock contention leading to memory exaustion
We've uploaded a new set of logs with debug_ms 1 at 20d8ba67-8bb0-4cfc-a986-b72ec250728d Abhishek Lekshmanan
07:03 AM CephFS Bug #54404 (Closed): snap-schedule retention not working as expected
Closing tracker due to lack of info.
If no valid retention is found during the pruning phase, then all snapshots are imm...
Milind Changire
01:23 AM Ceph QA QA Run #65270: wip-yuri6-testing-2024-04-02-1310
@lflores rebuilding (note it's very slow :() Yuri Weinstein
12:46 AM Ceph QA QA Run #65562 (QA Closed): wip-pdonnell-testing-20240418.004638-debug
* "PR #56935":https://github.com/ceph/ceph/pull/56935 -- mds: regular file inode flags are not replicated by the poli... Patrick Donnelly

04/17/2024

10:25 PM Ceph QA QA Run #65561 (QA Closed): wip-yuriw-testing-20240417.222151-quincy
duplicate of Yuri Weinstein
10:22 PM Ceph QA QA Run #65561 (QA Closed): wip-yuriw-testing-20240417.222151-quincy
* "PR #56818":https://github.com/ceph/ceph/pull/56818 -- quincy: qa/rgw: barbican uses branch stable/2023.1
* "PR #5...
Yuri Weinstein
10:24 PM Ceph QA QA Run #65558: wip-yuri4-testing-2024-04-19-0708-quincy (old wip-yuriw-testing-20240417.204632 (wip-yuri4-testing))
also dupe is https://tracker.ceph.com/issues/65561 Yuri Weinstein
08:47 PM Ceph QA QA Run #65558 (QA Closed): wip-yuri4-testing-2024-04-19-0708-quincy (old wip-yuriw-testing-20240417.204632 (wip-yuri4-testing))
--- done. these PRs were included:
https://github.com/ceph/ceph/pull/54172 - quincy: prevent anonymous topic operati...
Yuri Weinstein
09:47 PM Ceph QA QA Run #65510: wip-yuriw-testing-20240416.150233
When you're ready to approve it, change the Tracker status to "QA Approved" and reassign to Yuri. If anything needs r... Laura Flores
09:46 PM Ceph QA QA Run #65510: wip-yuriw-testing-20240416.150233
@matan can you review the rados suite? Laura Flores
09:44 PM Ceph QA QA Run #65270 (QA Needs Rerun/Rebuilt): wip-yuri6-testing-2024-04-02-1310
Hey @yuriw, https://github.com/ceph/ceph/pull/53545 caused some regressions. Can you remove it from the batch and reb... Laura Flores
09:41 PM RADOS Bug #65557 (Closed): Admin socket times out after osd restart
This was actually related to a WIP branch that hasn't merged yet. Laura Flores
08:46 PM RADOS Bug #65557 (Closed): Admin socket times out after osd restart
/a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7652505... Laura Flores
09:34 PM RADOS Bug #65559 (Closed): src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps)
Actually seems related to a WIP branch that hadn't been merged yet. Laura Flores
08:55 PM RADOS Bug #65559 (Closed): src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps)
/a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7652491... Laura Flores
09:24 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
Nir Soffer wrote in #note-12:
> Ilya Dryomov wrote in #note-11:
> > Hi Nir,
> >
> > I think the problem is the m...
Nir Soffer
01:08 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
Ilya Dryomov wrote in #note-11:
> Hi Nir,
>
> I think the problem is the method you used to set these config opti...
Nir Soffer
01:04 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
Hi Nir,
I think the problem is the method you used to set these config options. Note that the way it's done in OD...
Ilya Dryomov
08:56 PM Ceph QA QA Run #65560 (QA Closed): wip-yuri5-testing-2024-04-17-1400
--- done. these PRs were included:
https://github.com/ceph/ceph/pull/49438 - os/bluestore: set rocksdb iterator boun...
Yuri Weinstein
08:46 PM RADOS Bug #53768: timed out waiting for admin_socket to appear after osd.2 restart in thrasher/defaults workload/small-objects
Laura Flores wrote in #note-12:
> /a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-defaul...
Laura Flores
01:59 AM RADOS Bug #53768: timed out waiting for admin_socket to appear after osd.2 restart in thrasher/defaults workload/small-objects
@lflores There's little chance that the above crash is related to what Joseph saw here, let's close this one and open... Samuel Just
08:14 PM Ceph QA QA Run #65530 (QA Closed): wip-pdonnell-testing-20240417.021458-debug
Some strange ansible.cephlab failure breaking this run. Patrick Donnelly
02:15 AM Ceph QA QA Run #65530 (QA Closed): wip-pdonnell-testing-20240417.021458-debug
* "PR #56935":https://github.com/ceph/ceph/pull/56935 -- mds: regular file inode flags are not replicated by the poli... Patrick Donnelly
08:08 PM CephFS Backport #65556 (Fix Under Review): squid: mds: avoid recalling Fb when quiescing file
Backport Bot
08:03 PM CephFS Bug #65472 (Pending Backport): mds: avoid recalling Fb when quiescing file
Patrick Donnelly
07:58 PM devops Bug #65555 (New): old pinned mistune in admin/doc-requirements.txt is vulnerable to CVE-2022-34749
@admin/doc-requirements.txt@ pins to an older @mistune@ library version. Security scanners treat this as a vulnerabil... Ken Dreyer
07:32 PM Dashboard Bug #46735: FAIL: test_all (tasks.mgr.dashboard.test_rgw.RgwBucketTest)
from https://jenkins.ceph.com/job/ceph-api/72562/consoleFull... Casey Bodley
07:18 PM Orchestrator Bug #65546: quincy|reef: qa/suites/upgrade/pacific-x: failure to pull image causes dead jobs
https://pulpito.ceph.com/teuthology-2024-04-17_01:16:02-upgrade:quincy-x-reef-distro-default-smithi/ Patrick Donnelly
02:12 PM Orchestrator Bug #65546 (New): quincy|reef: qa/suites/upgrade/pacific-x: failure to pull image causes dead jobs
https://pulpito.ceph.com/teuthology-2024-04-17_01:08:06-upgrade:pacific-x-reef-distro-default-smithi/
Beyond the i...
Patrick Donnelly
07:06 PM mgr Bug #64799: mgr: update cluster state for new maps from the mons before notifying modules
Per let's not hurry up with backporting this change. IMHO it deserves some _baking_ in `main`.:
> let's not hurry...
Radoslaw Zarzynski
07:03 PM RADOS Bug #62588: ceph config set allows WHO to be osd.*, which is misleading
I created a pull request for this: https://github.com/ceph/ceph/pull/56971
A warning message is now generated if use...
Suyash Dongre
06:36 PM Ceph Bug #65509: osd: remove outdated, incorrect truncate asserts in ECTransaction's generate_transactions
Per https://github.com/ceph/ceph/pull/56924#issuecomment-2061948862 a workaround exists:
> (...) we could recover ...
Radoslaw Zarzynski
06:04 PM Dashboard Bug #47612: ERROR: setUpClass (tasks.mgr.dashboard.test_health.HealthTest)
https://jenkins.ceph.com/job/ceph-api/72561/... Casey Bodley
06:00 PM RADOS Bug #53000: OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue: pure virtual method called
from https://jenkins.ceph.com/job/ceph-pull-requests/133465/consoleFull... Casey Bodley
05:34 PM Orchestrator Bug #65554 (In Progress): mgr/nfs: nfs module commands do not accept json-pretty format
... Adam King
04:10 PM Ceph QA QA Run #65045 (QA Closed): wip-yuri5-testing-2024-03-21-0833
Yuri Weinstein
04:08 PM Ceph QA QA Run #65045: wip-yuri5-testing-2024-03-21-0833
Aishwarya Mathuria wrote in #note-20:
> All failures are tracked here: https://tracker.ceph.com/projects/rados/wiki/...
Yuri Weinstein
03:53 PM Ceph QA QA Run #65045: wip-yuri5-testing-2024-03-21-0833
All failures are tracked here: https://tracker.ceph.com/projects/rados/wiki/MAIN#httpstrackercephcomissues65045 Aishwarya Mathuria
02:56 PM Ceph QA QA Run #65045 (QA Needs Approval): wip-yuri5-testing-2024-03-21-0833
Yuri Weinstein
02:54 PM Ceph QA QA Run #65045 (QA Approved): wip-yuri5-testing-2024-03-21-0833
Aishwarya Mathuria wrote in #note-17:
> Rados approved
Great!
@amathuri ps change the status to QA Approve in the f...
Yuri Weinstein
12:43 PM Ceph QA QA Run #65045: wip-yuri5-testing-2024-03-21-0833
Rados approved Aishwarya Mathuria
03:36 PM Orchestrator Bug #65553 (Pending Backport): cephadm: agent tries to json load response payload before checking for errors
If the connection itself fails, the agent will end up hitting another exception... Adam King
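The failure mode described in #65553 can be sketched as follows. This is an illustrative sketch only, not the cephadm agent code: `parse_response`, its `(status, payload)` arguments, and `AgentRequestError` are hypothetical names. The point is to check for an error before attempting `json.loads`, so that a failed request does not trigger a second, unrelated exception.

```python
import json


class AgentRequestError(Exception):
    """Hypothetical error type for a failed agent request."""


def parse_response(status, payload):
    """Return the decoded JSON body, checking for errors before decoding.

    status: HTTP-like status code, or None if the connection itself failed.
    payload: raw response text, possibly empty or non-JSON on failure.
    """
    # Check for failure first; the payload may not be JSON at all in this case
    if status is None or status >= 400:
        raise AgentRequestError(f"request failed (status={status}): {payload!r}")
    return json.loads(payload)


ok = parse_response(200, '{"result": "ok"}')

try:
    parse_response(None, "connection refused")
    failed_cleanly = False
except AgentRequestError:
    failed_cleanly = True
```

With this ordering, a connection failure surfaces as a single, descriptive error instead of a JSON decoding exception that hides the original cause.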
03:33 PM rgw Backport #65353 (In Progress): squid: rgwlc: Executing radosgw-admin lc process --bucket <bkt-name> without setting lc rule results in Segmentation fault
Casey Bodley
03:32 PM Ceph QA QA Run #65552 (QA Closed): wip-yuri2-testing-2024-04-17-0823-reef

--- done. these PRs were included:
https://github.com/ceph/ceph/pull/54662 - reef: debian: add missing bcrypt to c...
Yuri Weinstein
03:30 PM rgw Backport #64496 (In Progress): squid: keystone admin token is not invalidated on http 401 response
Casey Bodley
03:28 PM rgw Backport #64552 (In Progress): squid: rgw/multisite: objects named "." or ".." are not replicated
Casey Bodley
03:26 PM rgw Feature #65551 (Pending Backport): [rgw][accounts] bucket quota management at account-level
Account feature has been introduced by https://github.com/ceph/ceph/pull/54333 and we are planning to migrate our rad... Oguzhan Ozmen
03:19 PM rbd Backport #65550 (In Progress): squid: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
https://github.com/ceph/ceph/pull/57031 Backport Bot
03:18 PM rbd Backport #65549 (In Progress): reef: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
https://github.com/ceph/ceph/pull/57030 Backport Bot
03:07 PM rbd Backport #65548 (Duplicate): reef: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
Backport Bot
03:07 PM rbd Backport #65547 (Resolved): quincy: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
https://github.com/ceph/ceph/pull/57029 Backport Bot
03:02 PM rbd Bug #65481 (Pending Backport): [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
Ilya Dryomov
01:40 PM CephFS Bug #65545 (Fix Under Review): Quiesce may fail randomly with EBADF due to the same root submitted to the MDCache multiple times under the same quiesce request
Leonid Usov
01:34 PM CephFS Bug #65545 (Pending Backport): Quiesce may fail randomly with EBADF due to the same root submitted to the MDCache multiple times under the same quiesce request
Reported by the QE team at https://bugzilla.redhat.com/show_bug.cgi?id=2275459... Leonid Usov
01:37 PM rgw Feature #65050: Add alternative way for providing user name/password for Kafka endpoint authentication
Needs review. Corresponding PR is here:
https://github.com/ceph/ceph/pull/56493
Igor Gomon
01:01 PM CephFS Backport #65325 (In Progress): reef: client: log message when unmount call is received
Patrick Donnelly
01:01 PM CephFS Backport #65326 (In Progress): quincy: client: log message when unmount call is received
Patrick Donnelly
12:52 PM CephFS Backport #65365 (In Progress): reef: qa: run TestSnapshots.test_kill_mdstable for all mount types
Patrick Donnelly
12:52 PM CephFS Backport #65366 (In Progress): squid: qa: run TestSnapshots.test_kill_mdstable for all mount types
Patrick Donnelly
12:51 PM CephFS Backport #65520 (In Progress): reef: qa: cluster [WRN] Health detail: HEALTH_WARN 1 pool(s) do not have an application enabled" in cluster log
Patrick Donnelly
12:50 PM CephFS Backport #65519 (In Progress): squid: qa: cluster [WRN] Health detail: HEALTH_WARN 1 pool(s) do not have an application enabled" in cluster log
Patrick Donnelly
12:38 PM rgw Backport #65543 (In Progress): squid: rgw: increase log level on abort_early
Casey Bodley
12:37 PM rgw Backport #65543 (In Progress): squid: rgw: increase log level on abort_early
https://github.com/ceph/ceph/pull/56949 Casey Bodley
12:37 PM Ceph Backport #65540 (In Progress): reef: Add alerts to ceph monitoring stack for the nvmeof gateways
Adam King
09:39 AM Ceph Backport #65540 (Resolved): reef: Add alerts to ceph monitoring stack for the nvmeof gateways
https://github.com/ceph/ceph/pull/56948 Backport Bot
12:37 PM Orchestrator Bug #63784: qa/standalone/mon/mkfs.sh:'mkfs/a' already exists and is not empty: monitor may already exist
/a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648785/ Aishwarya Mathuria
12:37 PM rgw Backport #65544 (New): reef: rgw: increase log level on abort_early
Casey Bodley
12:35 PM RADOS Bug #50245: TEST_recovery_scrub_2: Not enough recovery started simultaneously
/a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648573/ Aishwarya Mathuria
12:31 PM Ceph Backport #65539 (In Progress): squid: Add alerts to ceph monitoring stack for the nvmeof gateways
Adam King
09:39 AM Ceph Backport #65539 (Resolved): squid: Add alerts to ceph monitoring stack for the nvmeof gateways
https://github.com/ceph/ceph/pull/56947 Backport Bot
12:27 PM rgw Bug #65469 (Pending Backport): rgw: increase log level on abort_early
Casey Bodley
12:11 PM Orchestrator Bug #65035: ERROR: required file missing from config-json: idmap.conf
/a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648585/ Aishwarya Mathuria
12:04 PM RADOS Bug #53544: src/test/osd/RadosModel.h: ceph_abort_msg("racing read got wrong version") in thrash_cache_writeback_proxy_none tests
@lflores FYI seeing this one after a while in one of the main runs - /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-tes... Aishwarya Mathuria
11:48 AM Dashboard Backport #64791 (In Progress): squid: mgr/dashboard: In rgw multisite, during zone creation acess/secret key should not be compulsory provide an edit option to set these keys
Aashish Sharma
11:47 AM RADOS Bug #56393: failed to complete snap trimming before timeout
/a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648606
/a/yuriw-2024-04-...
Aishwarya Mathuria
11:45 AM Dashboard Bug #61786: test_dashboard_e2e.sh: Can't run because no spec files were found; couldn't determine Mocha version
/a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648833 Aishwarya Mathuria
11:43 AM RADOS Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
/a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648693/ Aishwarya Mathuria
11:41 AM Infrastructure Bug #65229: Failed to reconnect to smithiXXX
@akraitma I am seeing this here - /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default... Aishwarya Mathuria
10:48 AM Orchestrator Bug #63502: Regression: Permanent KeyError: 'TYPE' : return self.blkid_api['TYPE'] == 'part'
Vadym Kukharenko wrote in #note-1:
> I got the same problem.
> Firstly tried to upgrade from 17.2.6 to 17.2.7.
> Se...
Vadym Kukharenko
10:14 AM Dashboard Bug #65534 (Fix Under Review): mgr/dashboard: grafana dashboad doesn't exist when anonymous_access is enabled
Afreen Misbah
07:10 AM Dashboard Bug #65534 (Pending Backport): mgr/dashboard: grafana dashboad doesn't exist when anonymous_access is enabled
Overall Performance does not display the graphs
Description of problem:
# cat /var/lib/ceph/tmp/grafana.yaml
s...
Nizamudeen A
10:10 AM Dashboard Cleanup #65207 (Resolved): mgr/dashboard: Move features to advanced section in create image form and expand by default rbd config section
Afreen Misbah
10:10 AM Dashboard Backport #65504 (Resolved): reef: mgr/dashboard: Move features to advanced section in create image form and expand by default rbd config section
Afreen Misbah
10:09 AM Dashboard Backport #65505 (Resolved): squid: mgr/dashboard: Move features to advanced section in create image form and expand by default rbd config section
Afreen Misbah
10:08 AM Dashboard Backport #65542 (In Progress): squid: rgw roles e2e tests failure
Afreen Misbah
10:02 AM Dashboard Backport #65542 (Resolved): squid: rgw roles e2e tests failure
https://github.com/ceph/ceph/pull/56945 Backport Bot
10:07 AM RADOS Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (osd_client_message_cap=256)
Please see my latest update to the PR: https://github.com/ceph/ceph/pull/53477
I can confirm the fix is good and a...
Lee Sanders
09:55 AM Dashboard Bug #65506 (Pending Backport): rgw roles e2e tests failure
Afreen Misbah
09:45 AM Dashboard Bug #65541 (New): Empty (string, list, object) should be blank in dashboard
Empty (string, list, object) should be blank in dashboard
We need to see how we show empty data structures in dash...
Afreen Misbah
09:31 AM Ceph Backport #65538 (New): reef: Add alerts to ceph monitoring stack for the nvmeof gateways
Backport Bot
09:28 AM Ceph Feature #64335 (Pending Backport): Add alerts to ceph monitoring stack for the nvmeof gateways
Avan Thakkar
08:37 AM Orchestrator Bug #64868: cephadm/osds, cephadm/workunits: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) in cluster log
/a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648713/
/a/yuriw-2024-04...
Aishwarya Mathuria
08:29 AM RADOS Bug #64942: rados/verify: valgrind reports "Invalid read of size 8" error.
/a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648773/ Aishwarya Mathuria
08:23 AM Orchestrator Bug #64871: rados/cephadm/workunits: Health check failed: 1 failed cephadm daemon(s) (CEPHADM_FAILED_DAEMON)" in cluster log
/a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648786/
/a/yuriw-2024-04-...
Aishwarya Mathuria
08:16 AM Messengers Documentation #65537 (New): RDMA support
Hi guys,
I needed to set up Ceph over RDMA, but I faced many issues! Because there is not enough info in the docume...
Vahideh Alinouri
08:03 AM CephFS Bug #65536 (Fix Under Review): mds: after the unresponsive client was evicted the blocked slow requests were not successfully cleaned up
Xiubo Li
07:50 AM CephFS Bug #65536 (Fix Under Review): mds: after the unresponsive client was evicted the blocked slow requests were not successfully cleaned up

Firstly a *client.188978:3 lookup #0x10000000000/csi* client request came and then was added to the waiter list:
...
Xiubo Li
07:47 AM Orchestrator Bug #64872: rados/cephadm/smoke: Health check failed: 1 stray daemon(s) not managed by cephadm (CEPHADM_STRAY_DAEMON) in cluster log
/a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648850/ Aishwarya Mathuria
07:33 AM Orchestrator Bug #65017: cephadm: log_channel(cephadm) log [ERR] : Failed to connect to smithi090 (10.0.0.9). Permission denied
/a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648574/ Aishwarya Mathuria
07:28 AM sepia Support #65535 (In Progress): Sepia Lab Access Request
kapandya@macvm Bp4C6fBYeF56whmrSsCAoQ 6567cd0199a79c959e1a34a0793b2db3a9cc16ac2a503b772de4cd9f642cc590 Kalpesh Pandya
07:23 AM RADOS Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
/a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648705
/a/yuriw-2024-04-...
Aishwarya Mathuria
07:16 AM RADOS Bug #65517: rados/thrash-erasure-code-crush-4-nodes: ceph task fails at getting monitors
/a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648565
/a/yuriw-2024-04-...
Aishwarya Mathuria
06:57 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
Venky Shankar wrote in #note-15:
> Dhairya Parmar wrote in #note-14:
> > Venky Shankar wrote in #note-13:
> > > Ve...
Dhairya Parmar
06:47 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
Dhairya Parmar wrote in #note-14:
> Venky Shankar wrote in #note-13:
> > Venky Shankar wrote in #note-11:
> > > Dh...
Venky Shankar
06:14 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
Venky Shankar wrote in #note-13:
> Venky Shankar wrote in #note-11:
> > Dhairya Parmar wrote in #note-10:
> > > I ...
Dhairya Parmar
05:03 AM CephFS Feature #65503 (New): mgr/stats, cephfs-top: provide per volume/sub-volume based performance metrics to monitor / troubleshoot performance issues
Venky Shankar
04:54 AM CephFS Feature #65503 (Rejected): mgr/stats, cephfs-top: provide per volume/sub-volume based performance metrics to monitor / troubleshoot performance issues
Jos Collin
04:40 AM Ceph Bug #65533 (Resolved): qa/vstart_runner.py: don't let command run after timeout
LocalRemote.run() accepts parameter @timeout@ but it is not passed to @subprocess@ and therefore has no effect. Rishabh Dave
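The defect described in the entry above (a @timeout@ argument that is accepted but never forwarded to @subprocess@) can be illustrated with a minimal sketch. This is not the actual vstart_runner.py code or its fix: @run_with_timeout@ is a hypothetical helper, and it only assumes the standard-library behaviour of @Popen.communicate(timeout=...)@.

```python
import subprocess
import sys


def run_with_timeout(args, timeout=None):
    """Hypothetical helper: run args and enforce timeout (in seconds).

    The timeout is forwarded to communicate(); without that step the
    parameter would be accepted but have no effect.
    """
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        out, err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # communicate() does not stop the child on timeout, so kill it
        # explicitly and reap it before re-raising.
        proc.kill()
        proc.communicate()
        raise
    return proc.returncode, out, err


if __name__ == "__main__":
    rc, out, _ = run_with_timeout([sys.executable, "-c", "print('ok')"], timeout=30)
    print(rc, out)
```

Killing the child before re-raising is what keeps the command from continuing to run after the timeout has expired.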
03:16 AM crimson Bug #65532: osd crashes due to invalid clone_range ops
It seems that this is due to incorrect clone_overlap calculations, will go into it. Xuehan Xu
03:15 AM crimson Bug #65532 (Fix Under Review): osd crashes due to invalid clone_range ops
... Xuehan Xu
02:57 AM sepia Support #65359: Sepia Lab Access Request
Hi Adam,
I am able to access sepia lab
amk:openvpn$ ping teuthology.front.sepia.ceph.com
PING teuthology.front...
Amarnath Reddy
02:51 AM crimson Bug #65531: crimson-osd: dump_historic_slow_ops command not correctly run
I don't think it's necessary to put history_cliend_request and history_slow_cliend_request together, so I will separa... junxiang mu
02:34 AM crimson Bug #65531 (In Progress): crimson-osd: dump_historic_slow_ops command not correctly run
Right now, historic ops and historic slow ops are all placed in OperationTypeCode::historic_client_request op_list, use l... junxiang mu
02:15 AM Ceph QA QA Run #65523 (QA Closed): wip-pdonnell-testing-20240416.232211-debug
https://github.com/ceph/ceph/pull/56934#pullrequestreview-2004866085 Patrick Donnelly
01:43 AM rgw Bug #65436: Getting Object Crashing radosgw services
I have the same issue. After some days, I found bug https://tracker.ceph.com/issues/61359
After upgrade to 17.2.7, this ...
hoan nv

04/16/2024

11:26 PM Ceph QA QA Run #65510 (QA Needs Approval): wip-yuriw-testing-20240416.150233
Yuri Weinstein
03:03 PM Ceph QA QA Run #65510 (QA Closed): wip-yuriw-testing-20240416.150233
* "PR #56814":https://github.com/ceph/ceph/pull/56814 -- squid: osd/SnapMapper: fix _lookup_purged_snap
* "PR #56697...
Yuri Weinstein
11:22 PM Ceph QA QA Run #65522 (QA Closed): wip-pdonnell-testing-20240416.232051-debug
Patrick Donnelly
11:21 PM Ceph QA QA Run #65522 (QA Closed): wip-pdonnell-testing-20240416.232051-debug
* "PR #56935":https://github.com/ceph/ceph/pull/56935 -- mds: regular file inode flags are not replicated by the poli... Patrick Donnelly
11:22 PM Ceph QA QA Run #65523 (QA Closed): wip-pdonnell-testing-20240416.232211-debug
* "PR #56935":https://github.com/ceph/ceph/pull/56935 -- mds: regular file inode flags are not replicated by the poli... Patrick Donnelly
11:19 PM RADOS Cleanup #65521 (New): Add expected warnings in cluster log to ignorelists
Relevant Slack conversation:
Hey all, as I brought up in today's RADOS call, there has been a surge of cluster war...
Laura Flores
11:19 PM CephFS Backport #65520 (In Progress): reef: qa: cluster [WRN] Health detail: HEALTH_WARN 1 pool(s) do not have an application enabled" in cluster log
https://github.com/ceph/ceph/pull/56951 Backport Bot
11:18 PM CephFS Backport #65519 (In Progress): squid: qa: cluster [WRN] Health detail: HEALTH_WARN 1 pool(s) do not have an application enabled" in cluster log
https://github.com/ceph/ceph/pull/56950 Backport Bot
11:17 PM CephFS Bug #65271 (Pending Backport): qa: cluster [WRN] Health detail: HEALTH_WARN 1 pool(s) do not have an application enabled" in cluster log
Patrick Donnelly
09:06 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
Ilya Dryomov wrote in #note-9:
> Nir Soffer wrote in #note-8:
> > Yes, the configuration is applied to both cluster...
Nir Soffer
08:22 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
Nir Soffer wrote in #note-8:
> Yes, the configuration is applied to both clusters. If I understand correctly,
> The...
Ilya Dryomov
08:06 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
Ilya Dryomov wrote in #note-7:
> Nir Soffer wrote in #note-6:
> > The other log file (e.g. 62f28287-356f-4f81-87dc-...
Nir Soffer
07:04 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
Nir Soffer wrote in #note-6:
> The other log file (e.g. 62f28287-356f-4f81-87dc-51bb05942553-client.rbd-mirror-peer....
Ilya Dryomov
01:03 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
Nir Soffer wrote in #note-5:
> > https://github.com/red-hat-storage/ocs-operator/blob/4a0325d824a409e84fac21ffbf0a...
Nir Soffer
08:55 PM RADOS Bug #53768: timed out waiting for admin_socket to appear after osd.2 restart in thrasher/defaults workload/small-objects
/a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7652505 Laura Flores
08:55 PM CephFS Bug #65518 (Fix Under Review): mds: regular file inode flags are not replicated by the policylock
Patrick Donnelly
08:53 PM CephFS Bug #65518 (Pending Backport): mds: regular file inode flags are not replicated by the policylock
Currently, the flags are only replicated for directory inodes. Patrick Donnelly
08:40 PM RADOS Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
/a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7652514 Laura Flores
08:37 PM RADOS Bug #65517: rados/thrash-erasure-code-crush-4-nodes: ceph task fails at getting monitors
/a/yuriw-2024-03-24_22:19:24-rados-wip-yuri10-testing-2024-03-24-1159-distro-default-smithi/7620629
/a/yuriw-2024-03...
Laura Flores
08:36 PM RADOS Bug #65517: rados/thrash-erasure-code-crush-4-nodes: ceph task fails at getting monitors
Hey @nmordech can you have a look? Laura Flores
08:35 PM RADOS Bug #65517: rados/thrash-erasure-code-crush-4-nodes: ceph task fails at getting monitors
Looks like the change was made in https://github.com/ceph/ceph/pull/53308, which did initially pass QA testing, but m... Laura Flores
08:31 PM RADOS Bug #65517 (Fix Under Review): rados/thrash-erasure-code-crush-4-nodes: ceph task fails at getting monitors
/a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7652508... Laura Flores
08:16 PM Orchestrator Bug #64872: rados/cephadm/smoke: Health check failed: 1 stray daemon(s) not managed by cephadm (CEPHADM_STRAY_DAEMON) in cluster log
/a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7652511 Laura Flores
08:12 PM RADOS Bug #65422: upgrade/quincy-x/parallel: "1 pg degraded (PG_DEGRADED)" in cluster log
... Laura Flores
08:10 PM RADOS Bug #65235: upgrade/reef-x/stress-split: "OSDMAP_FLAGS: noscrub flag(s) set" warning in cluster log
/a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7652474... Laura Flores
08:08 PM RADOS Bug #62776: rados: cluster [WRN] overall HEALTH_WARN - do not have an application enabled
/a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7652467 Laura Flores
08:05 PM Ceph Bug #65509 (Fix Under Review): osd: remove outdated, incorrect truncate asserts in ECTransaction's generate_transactions
https://github.com/ceph/ceph/pull/56924 Samuel Just
02:47 PM Ceph Bug #65509: osd: remove outdated, incorrect truncate asserts in ECTransaction's generate_transactions
See: https://github.com/ceph/ceph/pull/56924 Mark Nelson
02:03 PM Ceph Bug #65509 (Fix Under Review): osd: remove outdated, incorrect truncate asserts in ECTransaction's generate_transactions
User hit this:... Mark Nelson
08:04 PM CephFS Bug #65496 (Fix Under Review): mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
Patrick Donnelly
02:42 AM CephFS Bug #65496 (Pending Backport): mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
The logic for checking if an inode already had these vxattrs set has the serious defect that it will only execute xlo... Patrick Donnelly
07:59 PM Orchestrator Backport #65415 (Resolved): squid: cephadm: test_cephadm script fails with "ERROR: required file missing from config-json: idmap.conf"
Adam King
07:58 PM Orchestrator Backport #65382 (Resolved): squid: NLM should be enabled in NFS-Ganesha config file for locking functionality to work with v3 protocol
Adam King
07:55 PM Orchestrator Bug #64865 (Resolved): cephadm: Health check failed: 1 osds down (OSD_DOWN) in cluster log
Adam King
07:55 PM Orchestrator Backport #65414 (Resolved): squid: cephadm: Health check failed: 1 osds down (OSD_DOWN) in cluster log
Adam King
07:54 PM Dashboard Bug #64870: Health check failed: 1 osds down (OSD_DOWN)" in cluster log
More in this run:
https://pulpito.ceph.com/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-...
Laura Flores
07:50 PM Dashboard Bug #64870: Health check failed: 1 osds down (OSD_DOWN)" in cluster log
And in a cephadm test: /a/yuriw-2024-04-10_14:17:51-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/765... Laura Flores
07:44 PM Dashboard Bug #64870: Health check failed: 1 osds down (OSD_DOWN)" in cluster log
Also found in an upgrade test:
description: rados/upgrade/parallel/{0-random-distro$/{ubuntu_22.04} 0-start 1-task...
Laura Flores
07:51 PM Orchestrator Bug #52109: test_cephadm.sh: Timeout('Port 8443 not free on 127.0.0.1.',)
/a/yuriw-2024-04-10_14:17:51-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7650646 Laura Flores
07:49 PM Ceph QA QA Run #65516 (QA Closed): wip-rishabh-testing-20240416.193735
https://github.com/ceph/ceph/pull/56846
https://github.com/ceph/ceph/pull/56732
https://github.com/ceph/ceph/pull/5...
Rishabh Dave
07:46 PM Infrastructure Bug #65448: Teuthology unable to find the "ceph-radosgw" package
/a/yuriw-2024-04-10_14:17:51-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7650669 Laura Flores
07:42 PM RADOS Bug #65231: upgrade/quincy-x/parallel: "Reduced data availability: 1 pg peering (PG_AVAILABILITY)" in cluster log
/a/yuriw-2024-04-10_14:17:51-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7650700
Laura Flores
07:30 PM rgw Backport #64510 (Resolved): squid: backport rgw/lc: decorating log events with more details
Casey Bodley
07:29 PM rgw Backport #64949 (Resolved): squid: rgw-multisite: add x-rgw-replicated-at
Casey Bodley
07:29 PM rgw Backport #65292 (Resolved): squid: pubsub: validate Name in CreateTopic api
Casey Bodley
07:28 PM rgw Backport #65297 (Resolved): squid: allow AWS lifecycle event types to configure lifecycle notifications and Replication notifications
Casey Bodley
07:28 PM rgw Backport #65375 (Resolved): squid: lifecycle transition crashes since reloading bucket attrs for notification
Casey Bodley
07:27 PM rgw Feature #65466 (Resolved): rgw user accounts
Casey Bodley
07:27 PM rgw Backport #65402 (Resolved): squid: persistent topic stats test fails
Casey Bodley
07:27 PM rgw Feature #50078 (Resolved): [RFE] multisite: Bucket notification information should be shared between zones.
Casey Bodley
07:27 PM rgw Backport #64818 (Resolved): squid: [RFE] multisite: Bucket notification information should be shared between zones.
Casey Bodley
07:26 PM rgw Backport #65467 (Resolved): squid: rgw user accounts
Casey Bodley
07:25 PM rgw Backport #65411 (Resolved): squid: Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
Casey Bodley
06:14 PM Dashboard Backport #65515 (In Progress): squid: mgr/dashboard: fix duplicate grafana panels when on mgr failover
Adam King
05:58 PM Dashboard Backport #65515 (In Progress): squid: mgr/dashboard: fix duplicate grafana panels when on mgr failover
https://github.com/ceph/ceph/pull/56931 Backport Bot
06:10 PM Dashboard Backport #65513 (In Progress): quincy: mgr/dashboard: fix duplicate grafana panels when on mgr failover
Adam King
05:51 PM Dashboard Backport #65513 (In Progress): quincy: mgr/dashboard: fix duplicate grafana panels when on mgr failover
https://github.com/ceph/ceph/pull/56930 Backport Bot
06:01 PM Dashboard Backport #65512 (In Progress): reef: mgr/dashboard: fix duplicate grafana panels when on mgr failover
Adam King
05:51 PM Dashboard Backport #65512 (In Progress): reef: mgr/dashboard: fix duplicate grafana panels when on mgr failover
https://github.com/ceph/ceph/pull/56929 Backport Bot
05:52 PM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
Dhairya Parmar wrote in #note-27:
> Venky Shankar wrote in #note-26:
> > Dhairya Parmar wrote in #note-25:
> > > V...
Venky Shankar
08:24 AM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
Dhairya Parmar wrote in #note-27:
> Venky Shankar wrote in #note-26:
> > Dhairya Parmar wrote in #note-25:
> > > V...
Dhairya Parmar
08:09 AM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
as mentioned in yesterday's standup - some of the PRs (https://github.com/ceph/ceph/pull/49971, https://github.com/ce... Dhairya Parmar
05:51 PM Dashboard Backport #65514 (New): squid: mgr/dashboard: fix duplicate grafana panels when on mgr failover
Backport Bot
05:49 PM Dashboard Bug #64970 (Pending Backport): mgr/dashboard: fix duplicate grafana panels when on mgr failover
Adam King
05:44 PM Ceph QA QA Run #65420 (QA Closed): wip-yuri2-testing-2024-04-10-1311-squid
this was assigned to me before there were any qa runs attached. i've already tested all of these prs. i think all but... Casey Bodley
05:22 PM RADOS Bug #62588: ceph config set allows WHO to be osd.*, which is misleading
... Suyash Dongre
05:11 PM rgw Backport #65351 (Resolved): squid: rgw: crash in lc while transitioning to cloud
Casey Bodley
04:59 PM Orchestrator Documentation #64596 (Resolved): secure monitoring stack support is not documented
Adam King
04:59 PM Orchestrator Backport #64631 (Resolved): squid: secure monitoring stack support is not documented
Adam King
04:41 PM sepia Support #65359: Sepia Lab Access Request
Hey Amarnath Reddy,
You should have access to the Sepia lab now. Please verify you're able to connect to the vpn a...
adam kraitman
03:52 PM sepia Support #65359: Sepia Lab Access Request
Hi Adam,
These are new Credentials.
Earlier I did not have access to sepia lab.
Regards,
Amarnath
Amarnath Reddy
04:40 PM sepia Bug #65475: folio03 install
Hey @nmordech folio03 is now installed with rhel 9.3, you should have ssh access in about an hour from now adam kraitman
04:19 PM Orchestrator Bug #65511 (Pending Backport): cephadm: anonymous_access: false is dropped from grafana spec after apply
... Adam King
03:17 PM Dashboard Bug #65506: rgw roles e2e tests failure
Same issue happening on squid hence adding backport Afreen Misbah
11:54 AM Dashboard Bug #65506 (Fix Under Review): rgw roles e2e tests failure
Afreen Misbah
11:09 AM Dashboard Bug #65506 (Resolved): rgw roles e2e tests failure
*Rgw roles tests failing with 500 internal server error:*... Afreen Misbah
03:06 PM rgw Backport #65427 (Resolved): squid: Admin Ops socket crashes RGW
Casey Bodley
02:30 PM Ceph QA QA Run #65045: wip-yuri5-testing-2024-03-21-0833
@lflores sorry for the delay! Will wrap it up by tomorrow. Aishwarya Mathuria
02:17 PM crimson Bug #65491: recover_missing: racing read got wrong version
Not a fix yet, but I added a few missing log lines that may help here:
https://github.com/ceph/ceph/pull/56916/commits...
Matan Breizman
10:06 AM crimson Bug #65491: recover_missing: racing read got wrong version
WIP Matan Breizman
01:57 PM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
Yeah, there is a change in @attrs@ processing. Already prepared a commit: https://github.com/rzarzynski/ceph/commit/c... Radoslaw Zarzynski
01:53 PM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests

I am considering the following suspect(s):
PR #54930 modified ScrubMap::object::attrs (where we see a problem) from ...
Ronen Friedman
01:28 PM CephFS Bug #65508 (Fix Under Review): qa: lockup not long enough for test_quiesce_authpin_wait
Patrick Donnelly
01:25 PM CephFS Bug #65508 (Pending Backport): qa: lockup not long enough for test_quiesce_authpin_wait
https://pulpito.ceph.com/leonidus-2024-04-16_05:41:33-fs-wip-lusov-quiesce-xlock-distro-default-smithi/7657916/ Patrick Donnelly
12:08 PM Ceph Bug #65507 (New): diskprediction_local failed with python3.10
1. failed messages:... Liyan Wang
11:56 AM Dashboard Backport #65504 (In Progress): reef: mgr/dashboard: Move features to advanced section in create image form and expand by default rbd config section
Afreen Misbah
10:44 AM Dashboard Backport #65504 (Resolved): reef: mgr/dashboard: Move features to advanced section in create image form and expand by default rbd config section
https://github.com/ceph/ceph/pull/56921 Backport Bot
11:55 AM Dashboard Backport #65505 (In Progress): squid: mgr/dashboard: Move features to advanced section in create image form and expand by default rbd config section
Afreen Misbah
10:44 AM Dashboard Backport #65505 (Resolved): squid: mgr/dashboard: Move features to advanced section in create image form and expand by default rbd config section
https://github.com/ceph/ceph/pull/56920 Backport Bot
11:47 AM RADOS Bug #65449 (In Progress): NeoRadosWatchNotify.WatchNotifyTimeout failed due to nonexistent pool
Nitzan Mordechai
11:05 AM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
Venky Shankar wrote in #note-33:
> OK. So this bug has upgrades written all over it - it seemed obvious given that t...
Venky Shankar
10:56 AM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
OK. So this bug has upgrades written all over it - it seemed obvious given that this is an upgrade task but we were t... Venky Shankar
10:39 AM CephFS Feature #65503 (New): mgr/stats, cephfs-top: provide per volume/sub-volume based performance metrics to monitor / troubleshoot performance issues
Reported by BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2275081
Currently the cephfs-top utility only displays...
Jos Collin
10:37 AM Dashboard Cleanup #65207 (Pending Backport): mgr/dashboard: Move features to advanced section in create image form and expand by default rbd config section
Pedro González Gómez
10:37 AM Dashboard Backport #65502 (New): squid: mgr/dashboard: provide hub Cluster HA for multi-cluster setup
Backport Bot
10:32 AM Dashboard Bug #65499 (Pending Backport): mgr/dashboard: provide hub Cluster HA for multi-cluster setup
Aashish Sharma
06:23 AM Dashboard Bug #65499 (Pending Backport): mgr/dashboard: provide hub Cluster HA for multi-cluster setup
When adding a cluster to the multi-cluster setup, set all the mgr IPs as cross_origin_url in the connected cluster t... Aashish Sharma
10:08 AM Dashboard Backport #65501 (In Progress): squid: mgr/dashboard: snap schedule remove minutely from retention policy dropdown
Ivo Almeida
09:51 AM Dashboard Backport #65501 (In Progress): squid: mgr/dashboard: snap schedule remove minutely from retention policy dropdown
https://github.com/ceph/ceph/pull/56918 Backport Bot
10:07 AM Dashboard Backport #65500 (In Progress): reef: mgr/dashboard: snap schedule remove minutely from retention policy dropdown
Ivo Almeida
09:51 AM Dashboard Backport #65500 (In Progress): reef: mgr/dashboard: snap schedule remove minutely from retention policy dropdown
https://github.com/ceph/ceph/pull/56917 Backport Bot
09:52 AM RADOS Feature #64519: OSD/MON: No snapshot metadata keys trimming
Thanks, Matan! It sounds very promising. I talked to the customer and they are willing to test this cleanup procedure... Eugen Block
09:45 AM Dashboard Bug #65493 (Pending Backport): mgr/dashboard: snap schedule remove minutely from retention policy dropdown
Ivo Almeida
05:58 AM Messengers Bug #65401: msg: connection between mgr and osd is periodically down which leads to heavy load on mgr
Could anyone give a review on this? Thanks very much! Xinying Song
05:51 AM Dashboard Backport #65498 (New): squid: mgr/dashboard: fetch prometheus api host with ip addr
Backport Bot
05:48 AM Dashboard Bug #65302 (Pending Backport): mgr/dashboard: fetch prometheus api host with ip addr
Aashish Sharma
04:45 AM CephFS Bug #65497 (Fix Under Review): qa: enhance labelled perf counters tests in test_admin.py
Jos Collin
04:28 AM CephFS Backport #65347 (In Progress): squid: qa: failed cephfs-shell test_reading_conf
Neeraj Pratap Singh
02:50 AM crimson Bug #64680: transaction_manager_test/tm_random_block_device_test_t.scatter_allocation/0 status failed
This is caused by prefilling rbm devices, which is used to create scatterly allocated devices and is only used in uni... Xuehan Xu
01:05 AM crimson Feature #65478: Support SnapMapper::Scrubber
It will be completed in the next few days junxiang mu

04/15/2024

10:55 PM RADOS Bug #64863 (Resolved): rados/thrash-old-clients: Health detail: HEALTH_WARN 1/3 mons down, quorum a,c in cluster log
https://github.com/ceph/ceph/pull/56619
Radoslaw Zarzynski wrote in #note-3:
> Hmm, I think I saw Laura's PR for ...
Laura Flores
10:30 PM rgw Backport #65339 (Resolved): squid: rgw: update options yaml file so LDAP uri isn't an invalid example
Casey Bodley
10:30 PM rgw Backport #65412 (Resolved): squid: multisite: test_object_sync gets wrong object body: b'<x-rgw' != b'asdasd'
Casey Bodley
10:29 PM rgw Backport #64954 (Resolved): squid: Notification FilterRules for S3key, S3Metadata & S3Tags spit incorrect json output
Casey Bodley
10:24 PM teuthology Bug #64727: suites/dbench.sh: Socket exception: No route to host (113)
/a/yuriw-2024-04-09_01:14:16-smoke-reef-release-distro-default-smithi/7647071
/a/yuriw-2024-04-09_01:14:16-smoke-ree...
Laura Flores
10:23 PM cephsqlite Bug #65494: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
Ilya Dryomov wrote in #note-6:
> Nir Soffer wrote:
> > Restarting the ceph-mgr pod does not help, rbd-mirroring is ...
Nir Soffer
09:36 PM cephsqlite Bug #65494: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
Nir Soffer wrote:
> Restarting the ceph-mgr pod does not help, rbd-mirroring is broken and
> we don't have any work...
Ilya Dryomov
09:03 PM cephsqlite Bug #65494 (In Progress): ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
Patrick Donnelly
09:03 PM cephsqlite Bug #65494: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
... Patrick Donnelly
08:58 PM cephsqlite Bug #65494: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
Tested with:
* image: quay.io/ceph/ceph:v18
* imageID: quay.io/ceph/ceph@sha256:8c1697a0a924bbd625c9f1b33893bbc47b9...
Nir Soffer
07:53 PM cephsqlite Bug #65494: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
Looks like a sqlite issue. Patrick, can you take a look please? Yaarit Hatuka
07:15 PM cephsqlite Bug #65494: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
Thread from rook slack:
https://rook-io.slack.com/archives/CK9CF5H2R/p1711467112958679
Nir Soffer
07:13 PM cephsqlite Bug #65494 (Pending Backport): ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
h1. Description
We have a random error (about 1 in 200 deploys) when after creating a rook
cephcluster and cephbl...
Nir Soffer
10:14 PM RADOS Bug #62776: rados: cluster [WRN] overall HEALTH_WARN - do not have an application enabled
/a/yuriw-2024-04-09_01:16:20-rados-reef-release-distro-default-smithi/7647437 Laura Flores
10:11 PM Dashboard Bug #64377: tasks/e2e: Modular dependency problems
/a/yuriw-2024-04-09_01:16:20-rados-reef-release-distro-default-smithi/7647494 Laura Flores
10:07 PM CephFS Bug #64946: qa: unable to locate package libcephfs1
/a/yuriw-2024-04-09_01:16:20-rados-reef-release-distro-default-smithi/7647487
/a/yuriw-2024-04-09_01:16:20-rados-reef...
Laura Flores
10:01 PM RADOS Bug #58893: test_map_discontinuity: AssertionError: wait_for_clean: failed before timeout expired
/a/yuriw-2024-04-09_01:16:20-rados-reef-release-distro-default-smithi/7647835 Laura Flores
06:25 PM RADOS Bug #58893: test_map_discontinuity: AssertionError: wait_for_clean: failed before timeout expired
Just a supplement to Nitzan's comment:
* this PG was @down@ and
* @ 'blocked_by': [2]@.
This brings the questi...
Radoslaw Zarzynski
09:59 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
Ilya Dryomov wrote in #note-4:
> > This is not ODF environment, this is upstream rook environment.
> >
> > You ca...
Nir Soffer
09:21 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
Nir Soffer wrote in #note-3:
> It can be, but rbd mirror should fail (and restart) if pod networking is broken, no?
...
Ilya Dryomov
08:54 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
Ilya Dryomov wrote in #note-1:
> Hi Nir,
>
> rbd-mirror daemon states that it was unable to connect to the remote...
Nir Soffer
08:44 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
Tested with:
* image: quay.io/ceph/ceph:v18
* imageID: quay.io/ceph/ceph@sha256:06ddc3ef5b66f2dcc6d16e41842d33a3d...
Nir Soffer
08:43 PM rbd Bug #65487 (Need More Info): rbd-mirror daemon in ERROR state, require manual restart
Hi Nir,
rbd-mirror daemon states that it was unable to connect to the remote cluster. Could it be some kind of po...
Ilya Dryomov
01:26 PM rbd Bug #65487 (Pending Backport): rbd-mirror daemon in ERROR state, require manual restart
h1. Description
We experience a random error in rbd-mirror daemon, occurring 1-2 times per 100 deployments.
Whe...
Nir Soffer
09:59 PM RADOS Bug #62992: Heartbeat crash in reset_timeout and clear_timeout
/a/yuriw-2024-04-09_01:16:20-rados-reef-release-distro-default-smithi/7647721
/a/yuriw-2024-04-09_01:16:20-rados-reef...
Laura Flores
09:55 PM Orchestrator Bug #64208: test_cephadm.sh: Container version mismatch causes job to fail.
/a/yuriw-2024-04-09_01:16:20-rados-reef-release-distro-default-smithi/7647904
/a/yuriw-2024-04-09_01:16:20-rados-reef...
Laura Flores
09:51 PM RADOS Bug #65183: Overriding an EC pool needs the "--yes-i-really-mean-it" flag in addition to "force"
/a/yuriw-2024-04-09_01:16:20-rados-reef-release-distro-default-smithi/7647523
/a/yuriw-2024-04-09_01:16:20-rados-reef...
Laura Flores
08:32 PM RADOS Bug #65495: 1 slow request in rgw suite causes test failure
i see that one of the osds on the other node has a similarly large log:... Casey Bodley
08:19 PM RADOS Bug #65495 (New): 1 slow request in rgw suite causes test failure
on an integration branch based on squid, a rgw suite job failed due to 'slow request' errors: https://qa-proxy.ceph.c... Casey Bodley
06:32 PM RADOS Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
Bump up. In QA. Radoslaw Zarzynski
06:31 PM RADOS Bug #65227: noscrub cluster flag prevents deep-scrubs from starting
IIRC Ronen is already working on start orchestration between deep- and shallow-scrubs. Radoslaw Zarzynski
06:28 PM RADOS Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
Bump up. Radoslaw Zarzynski
06:27 PM RADOS Feature #64519: OSD/MON: No snapshot metadata keys trimming
The PR is in QA. Radoslaw Zarzynski
06:16 PM RADOS Bug #65449: NeoRadosWatchNotify.WatchNotifyTimeout failed due to nonexistent pool
Hi Nitzan! Would you mind taking a look? Radoslaw Zarzynski
06:11 PM RADOS Bug #59670 (Need More Info): Ceph status shows PG recovering when norecover flag is set
The fix was merged on 5 Jan 2024, so this could fit. It has been backported only to Reef.
Wes Dillingham, do y...
Radoslaw Zarzynski
06:10 PM Orchestrator Backport #65383 (In Progress): reef: NLM should be enabled in NFS-Ganesha config file for locking functionality to work with v3 protocol
Adam King
05:53 PM RADOS Bug #65422: upgrade/quincy-x/parallel: "1 pg degraded (PG_DEGRADED)" in cluster log
Needs to be whitelisted; will bring this issue and others like it to the next RADOS meeting so we can divide up that ... Laura Flores
05:46 PM RADOS Bug #65422: upgrade/quincy-x/parallel: "1 pg degraded (PG_DEGRADED)" in cluster log
Thanks Venky, it was a mistake that I added it there in the first place. Laura Flores
01:15 PM RADOS Bug #65422: upgrade/quincy-x/parallel: "1 pg degraded (PG_DEGRADED)" in cluster log
Laura, handing this back to you since this isn't really cephfs related. Venky Shankar
05:45 PM RADOS Bug #53472 (Need More Info): Active OSD processes do not see reduced memory target when adding more OSDs
Pacific is EOL. Does it replicate on newer releases? Radoslaw Zarzynski
05:45 PM RADOS Bug #53472: Active OSD processes do not see reduced memory target when adding more OSDs
This tracker is 2 years old. I'm not sure how the situation was back then but, at least now, BlueStore is observing t... Radoslaw Zarzynski
05:35 PM RADOS Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
Waiting for upstream QA. Radoslaw Zarzynski
05:34 PM RADOS Bug #65371: rados: PeeringState::calc_replicated_acting_stretch populate acting set before checking if < bucket_max
Bump up. Radoslaw Zarzynski
05:02 PM rbd Bug #65481 (Fix Under Review): [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
Ilya Dryomov
09:51 AM rbd Bug #65481 (Pending Backport): [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
https://qa-proxy.ceph.com/teuthology/yuriw-2024-04-09_15:14:48-krbd-reef-release-testing-default-smithi/7649268/teuth... Ilya Dryomov
04:15 PM CephFS Documentation #57011: doc: 'profile cephfs-mirror' description is missing
not sure why this got moved out of cephfs, this is our documentation bug Patrick Donnelly
03:58 PM cleanup Tasks #65471 (Fix Under Review): rgw_sal_posix.cc printf compiler warnings
Daniel Gryniewicz
03:48 PM Dashboard Bug #65493 (Pending Backport): mgr/dashboard: snap schedule remove minutely from retention policy dropdown
Remove minutely from retention policy dropdown Ivo Almeida
03:28 PM rgw Bug #65473 (Fix Under Review): rgw: exclude logging of request payer for 403 requests
Casey Bodley
03:27 PM Orchestrator Backport #65417 (In Progress): squid: cephadmin returns "1" on successful host-maintenance enter/exit - should return "0"
Adam King
03:26 PM Orchestrator Backport #65415 (In Progress): squid: cephadm: test_cephadm script fails with "ERROR: required file missing from config-json: idmap.conf"
Adam King
03:23 PM Orchestrator Backport #65382 (In Progress): squid: NLM should be enabled in NFS-Ganesha config file for locking functionality to work with v3 protocol
Adam King
03:20 PM Orchestrator Backport #65381 (In Progress): squid: upgrade/quincy-x/stress-split: cephadm failed to parse grafana.ini file due to inadequate permission
Adam King
03:14 PM Orchestrator Backport #65378 (In Progress): squid: cephadm: client-keyring also overwrites ceph.conf
Adam King
02:59 PM crimson Bug #53661 (Closed): Creation of the cluster failed with the crimson build
Please re-open if still relevant. Matan Breizman
02:55 PM crimson Bug #53047 (Closed): cmake command not found in the standalone cluster to execute cmake -DWITH_SEASTAR=ON .. command
Please re-open if still relevant. Matan Breizman
02:54 PM crimson Bug #52623 (Closed): Cache tries to get an invalid root extent
Please re-open if still relevant. Matan Breizman
02:53 PM crimson Bug #51639 (Closed): crimson/store_nbd: crash after start
Please re-open if still relevant. Matan Breizman
02:53 PM pulpito Feature #65492 (New): support looking at runs by user
like
https://pulpito.ceph.com/?user=teuthology
Patrick Donnelly
02:52 PM crimson Bug #47597 (Closed): got crush when stop one osd and restart it during rados bench
Please re-open if still relevant. Matan Breizman
02:52 PM crimson Bug #47030 (Closed): segault when evicting osdmap from cache
Please re-open if still relevant. Matan Breizman
02:50 PM crimson Bug #57547 (Closed): Hang with seastore at wait_for_active stage
Please re-open if still relevant. Matan Breizman
02:49 PM crimson Bug #57548 (Closed): Hang with alienstore
Please re-open if still relevant. Matan Breizman
02:49 PM crimson Subtask #45535 (Closed): crimson: crimson-osd failure in ceph-container
Please re-open if still relevant. Matan Breizman
02:40 PM rgw Bug #64571: lifecycle transition crashes since reloading bucket attrs for notification
The cause for this issue seems to be due to multiple LC worker threads updating the same `bucket` handle, which is no... Soumya Koduri
02:31 PM CephFS Backport #65489 (In Progress): squid: mds: enhance scrub to fragment/merge dirfrags
Christopher Hoffman
01:28 PM CephFS Backport #65489 (In Progress): squid: mds: enhance scrub to fragment/merge dirfrags
https://github.com/ceph/ceph/pull/56896 Christopher Hoffman
02:30 PM CephFS Backport #65488 (In Progress): reef: mds: enhance scrub to fragment/merge dirfrags
Christopher Hoffman
01:28 PM CephFS Backport #65488 (In Progress): reef: mds: enhance scrub to fragment/merge dirfrags
https://github.com/ceph/ceph/pull/56895 Christopher Hoffman
02:28 PM CephFS Backport #65490 (In Progress): quincy: mds: enhance scrub to fragment/merge dirfrags
Christopher Hoffman
01:28 PM CephFS Backport #65490 (In Progress): quincy: mds: enhance scrub to fragment/merge dirfrags
https://github.com/ceph/ceph/pull/56894 Christopher Hoffman
02:11 PM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
Venky Shankar wrote in #note-11:
> Dhairya Parmar wrote in #note-10:
> > I was confident of the code, I've mentione...
Venky Shankar
01:27 PM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
this doesn't seem related to test cases at all
time when the MGR_DOWN warning was seen:...
Dhairya Parmar
06:13 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
Dhairya Parmar wrote in #note-10:
> I was confident of the code, I've mentioned this in https://tracker.ceph.com/iss...
Venky Shankar
02:09 PM CephFS Bug #65423 (Need More Info): Monitor crashes down when I try to create a FS. The stacks maybe related to metadata server map decoder during the PAXOS service
fuchen ma wrote in #note-1:
> Another information:
> I found that the version of the non-crashed is 18.2.2, and the...
Venky Shankar
02:08 PM CephFS Bug #65455 (Need More Info): read operation hung in Client::get_caps
tod chen wrote in #note-1:
> the ceph version is 15.2.17 and 16.2.14
ceph 15.x is EOL'd and unsupported. Could yo...
Venky Shankar
02:06 PM rgw Bug #65463: rgw/notifications: test data path v2 persistent migration fails
* even though no crash is observed, it seems like a similar issue to: https://tracker.ceph.com/issues/65337. when runn... Yuval Lifshitz
01:49 PM crimson Bug #65491 (In Progress): recover_missing: racing read got wrong version
... Matan Breizman
01:12 PM rgw Bug #65486 (Fix Under Review): valgrind error on kafka shutdown
Yuval Lifshitz
01:11 PM rgw Bug #65486 (Fix Under Review): valgrind error on kafka shutdown
see: https://tracker.ceph.com/issues/65337#note-4
may cause crash on close.
Yuval Lifshitz
01:06 PM CephFS Bug #62123: mds: detect out-of-order locking
This may also have caused *MDS Behind on Trimming...*: https://www.mail-archive.com/ceph-users@ceph.io/msg24587.html. Xiubo Li
12:24 PM bluestore Backport #65485: squid: bluestore/bluestore_types: check 'it' valid before using
please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/56891
ceph-backport.sh versi...
Rongqi Sun
12:10 PM bluestore Backport #65485 (In Progress): squid: bluestore/bluestore_types: check 'it' valid before using
Backport Bot
12:18 PM CephFS Feature #61866: MDSMonitor: require --yes-i-really-mean-it when failing an MDS with MDS_HEALTH_TRIM or MDS_HEALTH_CACHE_OVERSIZED health warnings
Patrick, should we include other health warnings too? I didn't include it in PR because it was mentioned on this tick... Rishabh Dave
12:17 PM bluestore Backport #65484: reef: bluestore/bluestore_types: check 'it' valid before using
please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/56890
ceph-backport.sh versi...
Rongqi Sun
12:09 PM bluestore Backport #65484 (In Progress): reef: bluestore/bluestore_types: check 'it' valid before using
Backport Bot
12:16 PM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
Venky Shankar wrote in #note-26:
> Dhairya Parmar wrote in #note-25:
> > Venky Shankar wrote in #note-24:
> > > Dh...
Dhairya Parmar
10:52 AM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
Dhairya Parmar wrote in #note-25:
> Venky Shankar wrote in #note-24:
> > Dhairya,
> >
> > Anything blocking w.r....
Venky Shankar
10:10 AM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
Venky Shankar wrote in #note-24:
> Dhairya,
>
> Anything blocking w.r.t. the design for this enhancement? The lag...
Dhairya Parmar
10:00 AM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
Dhairya,
Anything blocking w.r.t. the design for this enhancement? The laggy OSD list is obviously something that ...
Venky Shankar
12:14 PM bluestore Backport #65483: quincy: bluestore/bluestore_types: check 'it' valid before using
please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/56889
ceph-backport.sh versi...
Rongqi Sun
12:09 PM bluestore Backport #65483 (Resolved): quincy: bluestore/bluestore_types: check 'it' valid before using
Backport Bot
12:10 PM CephFS Bug #65157 (Can't reproduce): cephfs-mirror: set layout.pool_name xattr of destination subvol correctly
Can't reproduce this:... Jos Collin
12:03 PM CephFS Backport #65316 (In Progress): squid: mds: CInode::item_caps used in two different lists
https://github.com/ceph/ceph/pull/56887 Dhairya Parmar
12:03 PM CephFS Backport #65315 (In Progress): reef: mds: CInode::item_caps used in two different lists
https://github.com/ceph/ceph/pull/56886 Dhairya Parmar
11:50 AM bluestore Bug #65482 (Pending Backport): bluestore/bluestore_types: check 'it' valid before using
Rixin Luo
11:47 AM bluestore Bug #65482 (Fix Under Review): bluestore/bluestore_types: check 'it' valid before using
When the sanitizer is enabled, unittest_bluestore_types fails as follows:
[ RUN ] sb_info_space_efficient_map_t....
Rongqi Sun
10:27 AM crimson Feature #65478: Support SnapMapper::Scrubber
junxiang mu wrote in #note-1:
> I can try to implement this. Can I take this issue? :)
No problem!
I noticed that y...
Matan Breizman
09:30 AM crimson Feature #65478: Support SnapMapper::Scrubber
I can try to implement this. Can I take this issue? :) junxiang mu
08:59 AM crimson Feature #65478 (New): Support SnapMapper::Scrubber
We need to make crimson aware of SnapMapper::Scrubber and the purged snaps flow (track record_purged_snaps() in th... Matan Breizman
10:23 AM CephFS Bug #61009: crash: void interval_set<T, C>::erase(T, T, std::function<bool(T, T)>) [with T = inodeno_t; C = std::map]: assert(p->first <= start)
Explanation of the preallocated machinery which might help in the future:
I played around a bit more with prealloc...
Venky Shankar
10:20 AM CephFS Bug #61009: crash: void interval_set<T, C>::erase(T, T, std::function<bool(T, T)>) [with T = inodeno_t; C = std::map]: assert(p->first <= start)
Please see https://github.com/ceph/ceph/pull/53752#issuecomment-2056469527 for the status of the change.
This issu...
Venky Shankar
09:09 AM ceph-volume Backport #65480 (In Progress): squid: prepare/create/activate refactor
Guillaume Abrioux
09:06 AM ceph-volume Backport #65480 (In Progress): squid: prepare/create/activate refactor
https://github.com/ceph/ceph/pull/56883 Backport Bot
09:07 AM crimson Bug #57739 (Need More Info): crimson: LogMissingRequest and RepRequest operator<< access possibly invalid req
Matan Breizman
09:06 AM crimson Bug #57758 (Need More Info): crimson: disable autoscale for crimson in teuthology
Matan Breizman
09:05 AM crimson Bug #57801 (Resolved): crimson: tag pool types as crimson, disallow snapshot, scrub, ec operations
Matan Breizman
09:05 AM crimson Bug #64975 (Resolved): crimson: Health check failed: 9 scrub errors (OSD_SCRUB_ERRORS)" in cluster log'
Matan Breizman
09:03 AM Dashboard Bug #65479 (Fix Under Review): mgr/dashboard: use grafana server instead of grafana-server in grafana 10.4.0
The grafana-server command is deprecated in grafana v10.4.0. It is advised to use grafana server in place of it. Aashish Sharma
09:02 AM crimson Bug #57990 (Closed): Crimson OSD crashes when trying to bring it up
Yingxin Cheng wrote in #note-1:
> Crimson is not production ready yet, and there will be no backport to Quincy.
>
...
Matan Breizman
09:01 AM crimson Bug #58391 (Need More Info): crimson-osd can't finish "mkfs" under RelWithDebInfo build type
@rainman
Is this still relevant?
Matan Breizman
08:59 AM ceph-volume Cleanup #61827 (Pending Backport): prepare/create/activate refactor
Guillaume Abrioux
08:58 AM ceph-volume Cleanup #61827 (Fix Under Review): prepare/create/activate refactor
Guillaume Abrioux
08:54 AM crimson Bug #61227 (Resolved): [crimson] ceph df stats are twice of actual values
Matan Breizman
08:52 AM ceph-volume Bug #65477 (Fix Under Review): `ceph-volume lvm prepare` does not create LVs anymore when using partitions
Guillaume Abrioux
08:27 AM ceph-volume Bug #65477 (Fix Under Review): `ceph-volume lvm prepare` does not create LVs anymore when using partitions
`ceph-volume lvm prepare` used to create VGs/LVs on partitions. This has changed with commit 1e7223281fa044c9653633e3... Guillaume Abrioux
08:50 AM crimson Bug #61875 (Resolved): crimson crashes during reboot when there are snap objects
Matan Breizman
08:50 AM crimson Bug #62526: during recovery crimson sends OI_ATTR with MAXed soid and kills classical OSDs
@rzarzynski,
Is https://github.com/ceph/ceph/pull/53084 still relevant?
Matan Breizman
08:48 AM crimson Bug #62550 (Resolved): osd crashes when doing peering
Matan Breizman
08:48 AM crimson Bug #63307 (Resolved): crimson: SnapTrimObjSubEvent doesn't actually seem to submit delta_stats
Matan Breizman
08:46 AM crimson Bug #64282 (Resolved): osd crashes due to unexpected pg creation
Matan Breizman
08:45 AM crimson Bug #64535 (Resolved): crimson osd crashes during crimson-rados-experimental teuthology tests
Matan Breizman
08:11 AM crimson Bug #64782 (Fix Under Review): test_python.sh TestIoctx.test_locator failes in cases of SeaStore
Matan Breizman
08:09 AM crimson Bug #65113 (Fix Under Review): crimson: SnapTrimObjSubEvent num_bytes stats calculation
Matan Breizman
08:07 AM crimson Bug #65130: crimson: crimson-rados did not detect reintroduction of https://tracker.ceph.com/issues/61875
Added label: crimson-replicated-recovery to track all the required fixes
https://github.com/ceph/ceph/pulls?q=+is%...
Matan Breizman
08:06 AM crimson Bug #65247 (Need More Info): ObjectContext::drop_recovery_read(): Assertion `recovery_read_marker' failed.
Matan Breizman
08:05 AM Dashboard Backport #65465 (In Progress): squid: mgr/dashboard: fixed snap schedule repeat frequency validation to prevent duplicates
Ivo Almeida
08:05 AM crimson Feature #65288 (Fix Under Review): crimson: OSD support `trim stale osdmaps` socket command
Matan Breizman
08:04 AM Dashboard Backport #65464 (In Progress): reef: mgr/dashboard: fixed snap schedule repeat frequency validation to prevent duplicates
Ivo Almeida
08:03 AM crimson Bug #65399 (Fix Under Review): osd crash due to deferred recovery
Matan Breizman
08:03 AM crimson Bug #65451 (Fix Under Review): tri_mutex::promote_from_read(): Assertion `readers == 1' failed.
Matan Breizman
08:02 AM crimson Bug #65453 (Fix Under Review): osd crashes due to outdated recovery ops
Matan Breizman
08:02 AM crimson Bug #65474 (Fix Under Review): mgr crash due to corrupted incremental osdmap sent by crimson-osds
Matan Breizman
08:01 AM crimson Feature #65476 (In Progress): Support Erasure coded pools
Matan Breizman
07:41 AM crimson Bug #64332 (In Progress): seastar submodule: Enable SEASTAR_GATE_HOLDER_DEBUG
Nitzan Mordechai
05:40 AM Dashboard Backport #65168 (In Progress): quincy: mgr/dashboard: CVE-2023-26159, CVE-2024-28849 follow-redirects package
Nizamudeen A
05:34 AM Dashboard Backport #65170 (In Progress): reef: mgr/dashboard: CVE-2023-26159, CVE-2024-28849 follow-redirects package
Nizamudeen A
04:40 AM mgr Bug #59580: memory leak (RESTful module, maybe others?)
Waiting for https://github.com/ceph/ceph/pull/54984 to be merged and backported. Nitzan Mordechai

04/14/2024

02:43 PM mgr Bug #59580: memory leak (RESTful module, maybe others?)
Hi,
It seems that the ceph-mgr oom issue happened again on 16.2.15. We had ceph-mgr "oom" this morning.
I have ...
A. Saber Shenouda
01:25 PM sepia Bug #65475 (In Progress): folio03 install
adam kraitman
11:39 AM sepia Bug #65475 (In Progress): folio03 install
@akraitma we need a new install on folio03, currently it's RHEL 8.6 and we can't use GCC with higher versions. Nitzan Mordechai
11:01 AM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
h2. Analysis (WIP)
* the following test run is a sure way to create the ‘__header’ failure in ‘main’:
@./teutholo...
Ronen Friedman
01:39 AM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
I created a test branch with some extra logging and managed to reproduce the issue with slightly more info.... Samuel Just
07:17 AM crimson Bug #65474 (Resolved): mgr crash due to corrupted incremental osdmap sent by crimson-osds
... Xuehan Xu

04/13/2024

12:05 AM rgw Bug #65473: rgw: exclude logging of request payer for 403 requests
PR: https://github.com/ceph/ceph/pull/56868 Seena Fallah
12:02 AM rgw Bug #65473 (Pending Backport): rgw: exclude logging of request payer for 403 requests
As per AWS doc (https://docs.aws.amazon.com/AmazonS3/latest/userguide/RequesterPaysBuckets.html#ChargeDetails), reque... Seena Fallah

04/12/2024

10:24 PM CephFS Bug #65472 (Pending Backport): mds: avoid recalling Fb when quiescing file
To avoid extensive flushes by the client. (We don't need to trigger an fsync to quiesce a tree.)
See also: https:/...
Patrick Donnelly
08:48 PM cleanup Tasks #65471 (Fix Under Review): rgw_sal_posix.cc printf compiler warnings
... Casey Bodley
08:40 PM rgw Feature #65470 (New): Beast lacks ssl_short_trust option to reload ssl certificate without restart
Previously civetweb rgw had an option (ssl_short_trust) to automatically reload certs, for instance when they are sho... Brien Dieterle
08:01 PM rgw Bug #65469 (Fix Under Review): rgw: increase log level on abort_early
Casey Bodley
07:59 PM rgw Bug #65469 (Pending Backport): rgw: increase log level on abort_early
The function is typically invoked on client errors like NoSuchBucket. Logging these errors with level 1 may initially... Seena Fallah
07:29 PM rgw Bug #65337: rgw: Segmentation fault in rgw::notify::Manager during realm reload
The crash during the realm reload is due to the connection being destroyed while it's in use,
we call `kafka::shutdown` d...
Krunal Chheda
05:22 PM rgw Bug #65337: rgw: Segmentation fault in rgw::notify::Manager during realm reload
@yuvalif the crash issue with kafka is all about the conn->destroyed being called while publish_internal() might be p... Krunal Chheda
05:00 PM rgw Bug #65337: rgw: Segmentation fault in rgw::notify::Manager during realm reload
In our testing we are seeing the same crash; however, we do not see it during the realm upload or shutdown.
It's just ...
Krunal Chheda
06:56 PM rgw Bug #65468: rgw: set correct requestId and hostId on s3select error
PR: https://github.com/ceph/ceph/pull/56864 Seena Fallah
06:51 PM rgw Bug #65468 (Fix Under Review): rgw: set correct requestId and hostId on s3select error
Previously, these fields remained constant despite the possibility of populating them with appropriate values. Seena Fallah
06:12 PM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
Venky Shankar wrote in #note-31:
> [...]
>
> And patched up the yaml to use the custom quincy build to upgrade to...
Patrick Donnelly
05:11 PM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
... Venky Shankar
10:18 AM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
I have a custom quincy branch (patched with debug in ceph-fuse/fuse_ll). That should give us enough debug to see what... Venky Shankar
06:04 PM rgw Backport #65467 (In Progress): squid: rgw user accounts
Casey Bodley
05:58 PM rgw Backport #65467 (Resolved): squid: rgw user accounts
https://github.com/ceph/ceph/pull/56863 Backport Bot
05:56 PM rgw Feature #65466 (Resolved): rgw user accounts
Casey Bodley
05:18 PM rgw Bug #64381 (Resolved): iam role: CreateDate can go backwards
Casey Bodley
05:17 PM rgw Bug #64475 (Resolved): multisite: forwarded CreateRole request generates different CreateDate
Casey Bodley
03:35 PM rgw Bug #61772 (Closed): rgw/crypt/barbican: 'Namespace' object has no attribute 'admin_endpoints'
Ali Maredia
03:32 PM rgw-testing Bug #17776 (Closed): rgw: test aws4
Ali Maredia
03:29 PM teuthology Bug #59284: Missing `/home/ubuntu/cephtest/archive/coredump` file or directory
from http://qa-proxy.ceph.com/teuthology/cbodley-2024-04-12_12:44:47-rgw-wip-rgw-account-v3-distro-default-smithi/765... Casey Bodley
03:07 PM Ceph QA QA Run #65385 (QA Closed): wip-yuri4-testing-2024-04-08-1432
Yuri Weinstein
03:06 PM Ceph QA QA Run #65420 (QA Needs Approval): wip-yuri2-testing-2024-04-10-1311-squid
@cbodley pls review all tests scheduled Yuri Weinstein
03:05 PM Dashboard Backport #65465 (In Progress): squid: mgr/dashboard: fixed snap schedule repeat frequency validation to prevent duplicates
https://github.com/ceph/ceph/pull/56881 Backport Bot
03:05 PM Dashboard Backport #65464 (In Progress): reef: mgr/dashboard: fixed snap schedule repeat frequency validation to prevent duplicates
https://github.com/ceph/ceph/pull/56880 Backport Bot
03:01 PM Dashboard Bug #64980 (Pending Backport): mgr/dashboard: fixed snap schedule repeat frequency validation to prevent duplicates
Ivo Almeida
02:59 PM Dashboard Backport #65459 (In Progress): reef: mgr/dashboard: fix snap schedule delete retention
Ivo Almeida
11:52 AM Dashboard Backport #65459 (In Progress): reef: mgr/dashboard: fix snap schedule delete retention
https://github.com/ceph/ceph/pull/56862 Backport Bot
02:58 PM Dashboard Backport #65458 (In Progress): squid: mgr/dashboard: fix snap schedule delete retention
Ivo Almeida
11:52 AM Dashboard Backport #65458 (In Progress): squid: mgr/dashboard: fix snap schedule delete retention
https://github.com/ceph/ceph/pull/56861 Backport Bot
02:57 PM rgw Bug #65463 (New): rgw/notifications: test data path v2 persistent migration fails
from https://qa-proxy.ceph.com/teuthology/cbodley-2024-04-12_12:44:47-rgw-wip-rgw-account-v3-distro-default-smithi/76... Casey Bodley
02:41 PM rgw Bug #65462: rgw: eliminate ssl enforcement for sse-s3 encryption
PR: https://github.com/ceph/ceph/pull/56860 Seena Fallah
02:40 PM rgw Bug #65462 (Pending Backport): rgw: eliminate ssl enforcement for sse-s3 encryption
Implement distinct SSL enforcement configurations for SSE-S3, SSE-C, and SSE-KMS encryption methods.
This can be hel...
Seena Fallah
02:23 PM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
I was confident of the code, I've mentioned this in https://tracker.ceph.com/issues/65265#note-6. I then raised a PR ... Dhairya Parmar
01:32 PM Ceph Bug #65228 (In Progress): class:device-class config database mask does not work for osd_compact_on_start
Igor Fedotov
01:30 PM Ceph QA QA Run #65447 (QA Closed): wip-pdonnell-testing-20240411.210829-debug
https://github.com/ceph/ceph/pull/56755#issuecomment-2051518639 Patrick Donnelly
01:28 PM cleanup Tasks #65460 (New): audit rgw_get_request_metadata(), stop storing unneccessary headers as xattrs
@rgw_get_request_metadata()@ adds object/bucket xattrs for most of the headers in @x_meta_map@ (which stores any head... Casey Bodley
12:14 PM CephFS Tasks #64819 (Resolved): data corruption during rmw after lseek
The reproducers above were simplifications of failures/errors from running the ffsb test suite on a fscrypt enabled d... Christopher Hoffman
12:11 PM CephFS Bug #62246: qa/cephfs: test_mount_mon_and_osd_caps_present_mds_caps_absent fails
Venky Shankar wrote in #note-12:
> Rishabh, do we need this for squid too?
Answering this myself - the PR was mer...
Venky Shankar
08:21 AM CephFS Bug #62246: qa/cephfs: test_mount_mon_and_osd_caps_present_mds_caps_absent fails
Rishabh, do we need this for squid too? Venky Shankar
11:44 AM Dashboard Bug #65370 (Pending Backport): mgr/dashboard: fix snap schedule delete retention
Ivo Almeida
10:21 AM Dashboard Bug #65457: mgr/dashboard: ninja fails on `src/pybind/mgr/dashboard/frontend/dist`
Sorry for the Chinese words.
核心已转储 means "core dumped".
taki zhao
09:57 AM Dashboard Bug #65457: mgr/dashboard: ninja fails on `src/pybind/mgr/dashboard/frontend/dist`
Arch is arm. I don’t know if nodejs needs to adjust to the arm architecture. taki zhao
09:41 AM Dashboard Bug #65457 (New): mgr/dashboard: ninja fails on `src/pybind/mgr/dashboard/frontend/dist`
After installing deps, I executed `./do_cmake.sh`, `cd build`, and `ninja`.
It fails on dashboard frontend.
There is...
taki zhao
10:02 AM sepia Support #65238: Sepia Lab Access Request
adam kraitman wrote in #note-2:
> Hey Jiffin Tony Thottan, Are these new/additional or replacement credentials?
M...
Jiffin Tony Thottan
09:24 AM Ceph QA QA Run #65456: wip-rishabh-testing-20240407.092921-reef
Link to wiki - https://tracker.ceph.com/projects/cephfs/wiki/Reef#12-April-2024
Backport PR has been merged - https:...
Rishabh Dave
09:20 AM Ceph QA QA Run #65456 (QA Closed): wip-rishabh-testing-20240407.092921-reef
Rishabh Dave
09:18 AM Ceph QA QA Run #65456 (QA Closed): wip-rishabh-testing-20240407.092921-reef
https://github.com/ceph/ceph/pull/54942
Not adding more PRs to the testing branch since this one already has too m...
Rishabh Dave
09:17 AM Ceph QA QA Run #65329 (QA Closed): wip-rishabh-testing-20240404.111254-quincy
Backport PR has been merged - https://github.com/ceph/ceph/pull/54946#event-12446845381
Link to wiki - https://track...
Rishabh Dave
09:16 AM Ceph QA QA Run #65329: wip-rishabh-testing-20240404.111254-quincy
https://pulpito.ceph.com/rishabh-2024-04-11_13:44:02-fs-wip-rishabh-testing-20240404.111254-quincy-testing-default-sm... Rishabh Dave
09:12 AM CephFS Backport #63834 (Resolved): reef: mon/FSCommands: support swapping file systems by name
Rishabh Dave
09:11 AM CephFS Backport #63407 (Resolved): quincy: cephfs: print better error message when MDS caps perms are not right
Rishabh Dave
08:48 AM rgw Backport #65003 (Resolved): reef: [CVE-2023-46159] RGW crash upon misconfigured CORS rule
Guillaume Abrioux
08:14 AM CephFS Bug #62188: AttributeError: 'RemoteProcess' object has no attribute 'read'
Rishabh Dave wrote in #note-9:
> All the recent failures are from QA runs for Reef, this is because the fix for this...
Venky Shankar
06:56 AM CephFS Bug #65455: read operation hung in Client::get_caps
the ceph version is 15.2.17 and 16.2.14 tod chen
06:55 AM CephFS Bug #65455 (Rejected): read operation hung in Client::get_caps
How to reproduce the scene
1. I used two nfs ganesha+libcephfs as the nfs server (server1, server2), and used the sa...
tod chen
06:40 AM RADOS Bug #59831: crash: void ECBackend::continue_recovery_op(ECBackend::RecoveryOp&, RecoveryMessages*): assert(pop.data.length() == sinfo.aligned_logical_offset_to_chunk_offset( after_progress.data_recovered_to - op.recovery_progress.data_recovered_to))
I had the same problem with version 14.2.21. Is there any progress... dovefi Z
06:02 AM Ceph QA QA Run #65454 (QA Needs Approval): wip-vshankar-testing-20240411.061452
- https://github.com/ceph/ceph/pull/56193
- https://github.com/ceph/ceph/pull/56135
- https://github.com/ceph/ceph/...
Venky Shankar
05:59 AM crimson Bug #65453 (Fix Under Review): osd crashes due to outdated recovery ops
PGs' recovery backends don't discard old recovery ops... Xuehan Xu
05:53 AM CephFS Bug #65246 (Fix Under Review): qa/cephfs: test_multifs_single_path_rootsquash (tasks.cephfs.test_admin.TestFsAuthorize)
Rishabh Dave
05:19 AM Ceph QA QA Run #65324 (QA Closed): wip-vshankar-testing-20240330.170739
All merged. Venky Shankar
05:15 AM Ceph QA QA Run #65324: wip-vshankar-testing-20240330.170739
Moving https://github.com/ceph/ceph/pull/55945 to another test branch since we hit a client crash due to another PR t... Venky Shankar
04:59 AM Ceph QA QA Run #65324: wip-vshankar-testing-20240330.170739
Dropping https://github.com/ceph/ceph/pull/55144 due to https://github.com/ceph/ceph/pull/55144#discussion_r1562013425 Venky Shankar
04:53 AM Ceph QA QA Run #65324: wip-vshankar-testing-20240330.170739
Dropping https://github.com/ceph/ceph/pull/56148 from the list of PRs as the change is buggy - Jos has updated it and... Venky Shankar
05:18 AM CephFS Feature #57481 (Pending Backport): mds: enhance scrub to fragment/merge dirfrags
Venky Shankar
05:09 AM RADOS Bug #59670: Ceph status shows PG recovering when norecover flag is set
We saw this issue again in another setup and it has been fixed here: https://github.com/ceph/ceph/pull/54708.
The p...
Aishwarya Mathuria
03:04 AM Ceph Bug #65452: peer pg_info_t's last_complete in primary pg cannot be updated
!clipboard-202404121104-m5hdh.png!
qing zhao
03:03 AM Ceph Bug #65452: peer pg_info_t's last_complete in primary pg cannot be updated
!clipboard-202404121103-bsweb.png!
qing zhao
03:02 AM Ceph Bug #65452: peer pg_info_t's last_complete in primary pg cannot be updated
!clipboard-202404121102-9u6kt.png!
qing zhao
03:01 AM Ceph Bug #65452: peer pg_info_t's last_complete in primary pg cannot be updated
Case: the primary OSD executes do_osd_ops, the write fails, and it needs to execute record_write_error. In record_write_error functio... qing zhao
02:47 AM Ceph Bug #65452 (New): peer pg_info_t's last_complete in primary pg cannot be updated
!clipboard-202404121047-ovd7r.png!
qing zhao
02:07 AM crimson Bug #65451: tri_mutex::promote_from_read(): Assertion `readers == 1' failed.
Probably can be addressed by https://github.com/ceph/ceph/commit/3a6332fd6676da590b9ede46954b2a6a74308bd7, will split... Yingxin Cheng
01:58 AM crimson Bug #65451 (Resolved): tri_mutex::promote_from_read(): Assertion `readers == 1' failed.
See the assert failure in osd.1 from https://pulpito.ceph.com/yingxin-2024-04-11_01:17:19-crimson-rados-ci-yingxin-cr... Yingxin Cheng

04/11/2024

11:11 PM Ceph QA QA Run #65385 (QA Approved): wip-yuri4-testing-2024-04-08-1432
@yuriw rados approved: https://tracker.ceph.com/projects/rados/wiki/SQUID#httpstrackercephcomissues65385 Laura Flores
11:03 PM RADOS Bug #65450: rados/thrash-old-clients: "PG_BACKFILL: Low space hindering backfill" warning in cluster log
Should be evaluated to see whether this should be added to the ignorelist, or if it points to a larger bug. Laura Flores
11:03 PM RADOS Bug #65450 (New): rados/thrash-old-clients: "PG_BACKFILL: Low space hindering backfill" warning in cluster log
/a/yuriw-2024-04-09_14:58:25-rados-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7649192... Laura Flores
10:59 PM RADOS Bug #65449 (Fix Under Review): NeoRadosWatchNotify.WatchNotifyTimeout failed due to nonexistent pool
/a/yuriw-2024-04-09_14:58:25-rados-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7649011... Laura Flores
10:52 PM Infrastructure Bug #65448: Teuthology unable to find the "ceph-radosgw" package
Please refile this as an "Infrastructure" bug if not explicitly related to RGW. Laura Flores
10:52 PM Infrastructure Bug #65448 (New): Teuthology unable to find the "ceph-radosgw" package
/a/yuriw-2024-04-09_14:58:25-rados-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648970... Laura Flores
10:49 PM Dashboard Bug #64377: tasks/e2e: Modular dependency problems
/a/yuriw-2024-04-09_14:58:25-rados-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7649198 Laura Flores
10:44 PM Orchestrator Bug #52109: test_cephadm.sh: Timeout('Port 8443 not free on 127.0.0.1.',)
/a/yuriw-2024-04-09_14:58:25-rados-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648958 Laura Flores
10:42 PM ceph-volume Bug #56620: Deploy a ceph cluster with cephadm,using ceph-volume lvm create command to create osd can not managed by cephadm
Looks like a case of this:
/a/yuriw-2024-04-09_14:58:25-rados-wip-yuri4-testing-2024-04-08-1432-distro-default-smith...
Laura Flores
09:49 PM Orchestrator Bug #65233: upgrade/cephfs/mds_upgrade_sequence: 'ceph orch ps' command times out
/a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648940 Laura Flores
09:44 PM rgw Backport #65427 (In Progress): squid: Admin Ops socket crashes RGW
Casey Bodley
02:37 PM rgw Backport #65427 (Resolved): squid: Admin Ops socket crashes RGW
https://github.com/ceph/ceph/pull/56840 Backport Bot
09:08 PM Ceph QA QA Run #65447 (QA Closed): wip-pdonnell-testing-20240411.210829-debug
* "PR #56755":https://github.com/ceph/ceph/pull/56755 -- mds/quiesce: xlock the file to let clients keep their buffer... Patrick Donnelly
07:23 PM Ceph QA QA Run #65446 (QA Closed): wip-pdonnell-testing-20240411.192137-squid-debug
Patrick Donnelly
07:22 PM Ceph QA QA Run #65446: wip-pdonnell-testing-20240411.192137-squid-debug
... Patrick Donnelly
07:21 PM Ceph QA QA Run #65446 (QA Closed): wip-pdonnell-testing-20240411.192137-squid-debug
* "PR #56671":https://github.com/ceph/ceph/pull/56671 -- squid: mds: skip sr moves when target is an unlinked dir Patrick Donnelly
06:50 PM CephFS Bug #62188 (Duplicate): AttributeError: 'RemoteProcess' object has no attribute 'read'
All the recent failures are from QA runs for Reef; this is because the fix for this issue (https://tracker.ceph.com/i... Rishabh Dave
06:24 PM CephFS Backport #65441 (New): quincy: qa/cephfs: test_mount_mon_and_osd_caps_present_mds_caps_absent fails
Rishabh Dave
06:24 PM CephFS Backport #65440 (New): reef: qa/cephfs: test_mount_mon_and_osd_caps_present_mds_caps_absent fails
Rishabh Dave
06:22 PM CephFS Bug #62246 (Pending Backport): qa/cephfs: test_mount_mon_and_osd_caps_present_mds_caps_absent fails
Rishabh Dave
05:33 PM CephFS Bug #62246: qa/cephfs: test_mount_mon_and_osd_caps_present_mds_caps_absent fails
*The PR linked here fixes multiple issues. This specific commit
from the PR fixes the issue -
https://github.com/ceph...
Rishabh Dave
05:32 PM CephFS Bug #62246 (Resolved): qa/cephfs: test_mount_mon_and_osd_caps_present_mds_caps_absent fails
Rishabh Dave
06:07 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
sure thing Laura Kamoltat (Junior) Sirivadhna
06:01 PM sepia Support #64967: Sepia Lab Access Request
adam kraitman wrote in #note-6:
> Hey, if you re-run the new-client script, it's unfortunately not idempotent, so if y...
Soumya Koduri
01:36 PM sepia Support #64967: Sepia Lab Access Request
Hey, if you re-run the new-client script, it's unfortunately not idempotent, so if you re-ran it and still have the out... adam kraitman
05:14 PM rgw Bug #65436 (Need More Info): Getting Object Crashing radosgw services
Hello,
We are seeing crashes when users are trying to get a specific file....
Reid Guyett
05:04 PM Ceph QA QA Run #65270 (QA Needs Approval): wip-yuri6-testing-2024-04-02-1310
Yuri Weinstein
05:04 PM Ceph QA QA Run #65270: wip-yuri6-testing-2024-04-02-1310
Laura Flores wrote in #note-23:
> Once this is resolved, I need the results rerun.
Attempting a rerun again
Yuri Weinstein
04:04 PM Ceph QA QA Run #65270 (QA Needs Rerun/Rebuilt): wip-yuri6-testing-2024-04-02-1310
@yuriw this will need to be rerun. I see a lot of failures from "Failed to establish a new connection" that I suspect... Laura Flores
05:03 PM Ceph QA QA Run #65435: wip-pdonnell-testing-20240411.165310
Example tracker for https://github.com/ceph/ceph/pull/56835 Patrick Donnelly
05:02 PM Ceph QA QA Run #65435 (QA Closed): wip-pdonnell-testing-20240411.165310
Patrick Donnelly
04:53 PM Ceph QA QA Run #65435 (QA Closed): wip-pdonnell-testing-20240411.165310
* "PR #56755":https://github.com/ceph/ceph/pull/56755 -- mds/quiesce: xlock the file to let clients keep their buffer... Patrick Donnelly
04:07 PM Ceph QA QA Run #65045: wip-yuri5-testing-2024-03-21-0833
@amathuri can you review this run? It looks like a lot of failures but many seem to be expected warnings. LMK if you ... Laura Flores
03:59 PM Ceph QA QA Run #65045: wip-yuri5-testing-2024-03-21-0833
@yuriw I'll take a look. I see now that it's ready for QA approval Laura Flores
03:49 PM Ceph QA QA Run #65045 (QA Needs Approval): wip-yuri5-testing-2024-03-21-0833
Yuri Weinstein
03:48 PM Ceph QA QA Run #65045: wip-yuri5-testing-2024-03-21-0833
@lflores what about this batch? Yuri Weinstein
02:48 PM rgw Feature #63915 (New): propagate kafka errors to client in case of sync notifications
Yuval Lifshitz
02:37 PM rgw Backport #65426 (New): quincy: Admin Ops socket crashes RGW
Backport Bot
02:36 PM rgw Backport #65425 (New): reef: Admin Ops socket crashes RGW
Backport Bot
02:30 PM rgw Bug #64244 (Pending Backport): Admin Ops socket crashes RGW
Casey Bodley
02:20 PM rgw Cleanup #63962 (New): rgw-file: FLAG_SYMBOLIC_LINK decl aliases other flags
Casey Bodley
02:17 PM rgw Bug #64805 (Fix Under Review): rgw: dynamic resharding will block write op
Casey Bodley
02:11 PM rgw Bug #61710 (Won't Fix): quincy/pacific: PUT requests during reshard of versioned bucket fail with 404 and leave behind dark data
Casey Bodley
02:05 PM rgw Bug #63378 (New): rgw/multisite: Segmentation fault during full sync
Casey Bodley
01:55 PM CephFS Bug #65261: qa/cephfs: cephadm related failure on fs/upgrade job
https://pulpito.ceph.com/rishabh-2024-04-08_08:23:45-fs-wip-rishabh-testing-20240407.092921-reef-testing-default-smit... Rishabh Dave
01:39 PM sepia Support #65359 (In Progress): Sepia Lab Access Request
adam kraitman
01:39 PM sepia Support #65359: Sepia Lab Access Request
Hey Amarnath Reddy, are these new/additional or replacement credentials? adam kraitman
01:37 PM Infrastructure Bug #65229 (In Progress): Failed to reconnect to smithiXXX
Hey @lflores please ping me if you see this failure again adam kraitman
01:13 PM CephFS Backport #62425 (Fix Under Review): reef: nofail option in fstab not supported
Leonid Usov
01:12 PM CephFS Backport #62426 (Fix Under Review): quincy: nofail option in fstab not supported
Leonid Usov
01:12 PM CephFS Backport #63362 (Fix Under Review): quincy: mds: create an admin socket command for raising a signal
Leonid Usov
01:12 PM CephFS Backport #63363 (Fix Under Review): reef: mds: create an admin socket command for raising a signal
Leonid Usov
01:11 PM CephFS Backport #63479 (Fix Under Review): reef: src/mds/MDLog.h: 100: FAILED ceph_assert(!segments.empty())
Leonid Usov
01:11 PM CephFS Backport #63480 (Fix Under Review): quincy: src/mds/MDLog.h: 100: FAILED ceph_assert(!segments.empty())
Leonid Usov
01:11 PM CephFS Backport #63822 (Fix Under Review): reef: cephfs/fuse: renameat2 with flags has wrong semantics
Leonid Usov
01:10 PM CephFS Tasks #63669 (Fix Under Review): qa: add teuthology tests for quiescing a group of subvolumes
Leonid Usov
12:19 PM CephFS Bug #64977 (Fix Under Review): mds spinlock due to lock contention leading to memory exaustion
Xiubo Li
11:25 AM rbd Bug #65421 (Duplicate): upgrade/reef-x/stress-split: TestMigration.StressLive failure
This isn't specific to upgrade/reef-x/stress-split -- no need to track separately. Ilya Dryomov
09:54 AM Ceph QA QA Run #65099: wip-yuri10-testing-2024-03-24-1159
In the new run (https://pulpito.ceph.com/yuriw-2024-04-10_14:20:47-rados-wip-yuri10-testing-2024-03-24-1159-distro-de... Radoslaw Zarzynski
09:07 AM CephFS Bug #65423: Monitor crashes down when I try to create a FS. The stacks maybe related to metadata server map decoder during the PAXOS service
fuchen ma wrote in #note-1:
> Additional information:
> I found that the version of the non-crashed one is 18.2.2, and the...
fuchen ma
09:06 AM CephFS Bug #65423: Monitor crashes down when I try to create a FS. The stacks maybe related to metadata server map decoder during the PAXOS service
Additional information:
I found that the version of the non-crashed one is 18.2.2, and the version of the crashed ones is ...
fuchen ma
08:35 AM CephFS Bug #65423 (Rejected): Monitor crashes down when I try to create a FS. The stacks maybe related to metadata server map decoder during the PAXOS service
I created a Ceph cluster with 5 monitors and 2 metadata servers.
After that, I wanted to create an FS. Thus, I use...
fuchen ma
09:03 AM Orchestrator Documentation #65424 (New): hardware-monitoring/#developers is broken
https://docs.ceph.com/en/latest/hardware-monitoring/#developpers
It just contains a bunch of python-mock doc stri...
Sebastian Wagner
06:11 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
Dhairya mentioned that the tracebacks seen in the mgr logs are logged by the object formatter and not necessarily unhand... Venky Shankar
04:30 AM Orchestrator Backport #65414 (In Progress): squid: cephadm: Health check failed: 1 osds down (OSD_DOWN) in cluster log
Nitzan Mordechai

04/10/2024

10:15 PM RADOS Bug #65422 (New): upgrade/quincy-x/parallel: "1 pg degraded (PG_DEGRADED)" in cluster log
/a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648908... Laura Flores
10:01 PM rbd Bug #65421 (Duplicate): upgrade/reef-x/stress-split: TestMigration.StressLive failure
... Laura Flores
09:11 PM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
/a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648890
/a/yuriw-2024-0...
Laura Flores
07:26 PM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
/a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648938/remote/smithi122... Laura Flores
05:13 PM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
Laura Flores wrote:
> /a/teuthology-2024-03-22_02:08:13-upgrade-squid-distro-default-smithi/7616025/remote/smithi098...
Laura Flores
08:59 PM CephFS Bug #64707: suites/fsstress.sh hangs on one client - test times out
/a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648870 Laura Flores
08:12 PM Ceph QA QA Run #65420 (QA Closed): wip-yuri2-testing-2024-04-10-1311-squid

--- done. these PRs were included:
https://github.com/ceph/ceph/pull/56069 - squid: rgw: replicate v2 topic/notifi...
Yuri Weinstein
08:07 PM Ceph QA QA Run #65330: wip-yuri7-testing-2024-04-04-0800
@vshankar ping! Yuri Weinstein
07:17 PM Ceph QA QA Run #65330: wip-yuri7-testing-2024-04-04-0800
@pdonnell fyi again Yuri Weinstein
06:58 PM Ceph QA QA Run #65330: wip-yuri7-testing-2024-04-04-0800
@pdonnell fyi Yuri Weinstein
07:46 PM CephFS Fix #65408 (Fix Under Review): qa: under valgrind, restart valgrind/mds when MDS exits with 0
So, the mds_valgrind_exit already exists and is turned on. The original problem in #65314 wasn't caused by a failover... Patrick Donnelly
05:18 PM CephFS Fix #65408: qa: under valgrind, restart valgrind/mds when MDS exits with 0
@vshankar test Patrick Donnelly
05:18 PM CephFS Fix #65408: qa: under valgrind, restart valgrind/mds when MDS exits with 0
@pdonnell test Patrick Donnelly
01:33 PM CephFS Fix #65408: qa: under valgrind, restart valgrind/mds when MDS exits with 0
(Trying to see if redmine adds Venky to the "Watchers" list) Patrick Donnelly
01:32 PM CephFS Fix #65408: qa: under valgrind, restart valgrind/mds when MDS exits with 0
test @vshankar Patrick Donnelly
01:31 PM CephFS Fix #65408: qa: under valgrind, restart valgrind/mds when MDS exits with 0
test @vshankar Patrick Donnelly
01:27 PM CephFS Fix #65408: qa: under valgrind, restart valgrind/mds when MDS exits with 0
cc @vshankar Patrick Donnelly
01:27 PM CephFS Fix #65408 (Fix Under Review): qa: under valgrind, restart valgrind/mds when MDS exits with 0
Instead of issuing a re-... Patrick Donnelly
07:31 PM Orchestrator Feature #65398: allow images from private repos in teuthology test/ceph orch/cephadm
it's possible that the intent was to preface the call to pull_image with something that logs into the repo on the rem... Dan Mick
05:29 PM Orchestrator Feature #65398: allow images from private repos in teuthology test/ceph orch/cephadm
Thanks, I see that the pull_image function doesn't honor those settings currently. I have some other somewhat related... John Mulligan
04:56 PM Orchestrator Feature #65398: allow images from private repos in teuthology test/ceph orch/cephadm
the command that was failing was cephadm.py:pull_image, which invokes sudo cephadm --image <name> pull. I'm not 100%... Dan Mick
02:11 PM Orchestrator Feature #65398: allow images from private repos in teuthology test/ceph orch/cephadm
In theory it should work. The code in the task translates the YAML parameters to CLI parameters for bootstrap. Here's... John Mulligan
02:02 AM Orchestrator Feature #65398 (New): allow images from private repos in teuthology test/ceph orch/cephadm
It appears as though the cephadm teuthology task supports private registries (those that require username/password lo... Dan Mick
07:06 PM rgw Backport #65351 (Fix Under Review): squid: rgw: crash in lc while transitioning to cloud
Soumya Koduri
06:31 PM CephFS Tasks #64819: data corruption during rmw after lseek
The RC for this issue is fixed by:... Christopher Hoffman
05:39 PM Orchestrator Backport #65419 (New): quincy: cephadmin returns "1" on successful host-maintenance enter/exit - should return "0"
Backport Bot
05:39 PM Orchestrator Backport #65418 (New): reef: cephadmin returns "1" on successful host-maintenance enter/exit - should return "0"
Backport Bot
05:39 PM Orchestrator Backport #65417 (Resolved): squid: cephadmin returns "1" on successful host-maintenance enter/exit - should return "0"
https://github.com/ceph/ceph/pull/56903 Backport Bot
05:37 PM Orchestrator Bug #65122 (Pending Backport): cephadmin returns "1" on successful host-maintenance enter/exit - should return "0"
Adam King
05:37 PM RADOS Bug #64460: rados/upgrade/parallel: "[WRN] MON_DOWN: 1/3 mons down, quorum a,b" in cluster log
/a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648863 Laura Flores
05:33 PM Orchestrator Bug #64868: cephadm/osds, cephadm/workunits: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) in cluster log
Also during stress/split: yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7... Laura Flores
05:31 PM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
/a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648857 Laura Flores
05:31 PM Orchestrator Backport #65416 (New): reef: cephadm: test_cephadm script fails with "ERROR: required file missing from config-json: idmap.conf"
Backport Bot
05:31 PM Orchestrator Backport #65415 (Resolved): squid: cephadm: test_cephadm script fails with "ERROR: required file missing from config-json: idmap.conf"
https://github.com/ceph/ceph/pull/56902 Backport Bot
05:30 PM Orchestrator Bug #65155 (Pending Backport): cephadm: test_cephadm script fails with "ERROR: required file missing from config-json: idmap.conf"
Adam King
05:29 PM RADOS Bug #65235: upgrade/reef-x/stress-split: "OSDMAP_FLAGS: noscrub flag(s) set" warning in cluster log
There are many instances of this flag getting set in the test run intentionally, so it makes sense to whitelist.
<pr...
Laura Flores
05:27 PM nvme-of Feature #65259 (Resolved): cephadm - make changes to ceph-nvmeof.conf template
Adam King
05:27 PM nvme-of Backport #65296 (Rejected): squid: cephadm - make changes to ceph-nvmeof.conf template
Handling this backport as part of https://github.com/ceph/ceph/pull/56497 that includes other changes to the nvmeof c... Adam King
05:26 PM Orchestrator Bug #65234: upgrade/quincy-x/stress-split: cephadm failed to parse grafana.ini file due to inadequate permission
/a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648854 Laura Flores
05:23 PM Orchestrator Backport #65414 (Resolved): squid: cephadm: Health check failed: 1 osds down (OSD_DOWN) in cluster log
https://github.com/ceph/ceph/pull/56826 Backport Bot
05:17 PM Orchestrator Bug #64865 (Pending Backport): cephadm: Health check failed: 1 osds down (OSD_DOWN) in cluster log
Adam King
05:10 PM CephFS Bug #50719: xattr returning from the dead (sic!)
Those MDS logs would be everything. They are from the moment I built the MDS services until you requested the logs wh... Matthew Hutchinson
04:52 PM sepia Bug #65413 (New): uid mismatch on some machines
... Patrick Donnelly
04:41 PM rgw Backport #65412 (In Progress): squid: multisite: test_object_sync gets wrong object body: b'<x-rgw' != b'asdasd'
Casey Bodley
04:40 PM rgw Backport #65412 (Resolved): squid: multisite: test_object_sync gets wrong object body: b'<x-rgw' != b'asdasd'
https://github.com/ceph/ceph/pull/56822 Backport Bot
04:36 PM rgw Bug #65373 (Pending Backport): multisite: test_object_sync gets wrong object body: b'<x-rgw' != b'asdasd'
Casey Bodley
02:51 PM rgw Bug #63791: RGW: a subuser with no permission can still list buckets and create buckets
Can this commit be backported to quincy and reef? hoan nv
02:25 PM rgw Bug #63791 (Resolved): RGW: a subuser with no permission can still list buckets and create buckets
Casey Bodley
02:40 PM Ceph QA QA Run #65252 (QA Closed): wip-yuri2-testing-2024-04-01-1235-quincy
Yuri Weinstein
05:46 AM Ceph QA QA Run #65252 (QA Approved): wip-yuri2-testing-2024-04-01-1235-quincy
Nitzan Mordechai
05:44 AM Ceph QA QA Run #65252: wip-yuri2-testing-2024-04-01-1235-quincy
@yuriw rados approved Nitzan Mordechai
02:40 PM mgr Backport #65154: quincy: pybind/mgr/devicehealth: "rados.ObjectNotFound: [errno 2] RADOS object not found (Failed to operate read op for oid $dev"
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56480
merged
Yuri Weinstein
02:38 PM CephFS Backport #63823 (Fix Under Review): quincy: cephfs/fuse: renameat2 with flags has wrong semantics
Leonid Usov
02:36 PM bluestore Backport #63914 (Resolved): quincy: Some of ObjectStore/*Deferred* test cases are failing with bluestore_allocator is set to bitmap
Igor Fedotov
02:34 PM Infrastructure Bug #64481: Octo Lab VMWare TestBed Setup Requirement for VMWare Certification Tests
@kramaswamy test comment adam kraitman
02:32 PM Ceph Feature #63801: verified mon backups
Christian Rohmann wrote in #note-2:
> My thoughts would be:
> * Full restore might not always be wanted, so extra...
Daniel Poelzleithner
02:32 PM Dashboard Backport #65026: quincy: mgr/dashboard: Develop a Chinese version for dashboard
Rongqi Sun wrote in #note-2:
> please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/p...
Yuri Weinstein
02:31 PM bluestore Bug #63795: Some of ObjectStore/*Deferred* test cases are failing with bluestore_allocator is set to bitmap
https://github.com/ceph/ceph/pull/55779 merged Yuri Weinstein
02:24 PM Ceph Feature #64436 (Fix Under Review): rgw: add remaining x-amz-replication-status options
Casey Bodley
02:21 PM Ceph QA QA Run #65099 (QA Needs Approval): wip-yuri10-testing-2024-03-24-1159
Yuri Weinstein
02:19 PM Ceph QA QA Run #65270 (QA Needs Approval): wip-yuri6-testing-2024-04-02-1310
Yuri Weinstein
01:38 PM rgw Backport #65411 (In Progress): squid: Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
Casey Bodley
01:36 PM rgw Backport #65411 (Resolved): squid: Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
https://github.com/ceph/ceph/pull/56820 Backport Bot
01:38 PM rgw Backport #65410 (In Progress): reef: Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
Casey Bodley
01:36 PM rgw Backport #65410 (In Progress): reef: Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
https://github.com/ceph/ceph/pull/56819 Backport Bot
01:38 PM rgw Backport #65409 (In Progress): quincy: Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
Casey Bodley
01:36 PM rgw Backport #65409 (In Progress): quincy: Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
https://github.com/ceph/ceph/pull/56818 Backport Bot
01:32 PM rgw Bug #65334 (Pending Backport): Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
Casey Bodley
01:24 PM CephFS Bug #65262 (Triaged): qa/cephfs: kernel_untar_build.sh failed due to build error
Venky Shankar
01:17 PM rgw Backport #65402 (In Progress): squid: persistent topic stats test fails
backport included in https://github.com/ceph/ceph/pull/56069 for https://tracker.ceph.com/issues/64818 Casey Bodley
10:39 AM rgw Backport #65402 (Resolved): squid: persistent topic stats test fails
Backport Bot
12:56 PM CephFS Bug #65350 (Triaged): mgr/snap_schedule: restore yearly spec from uppercase Y to lowercase y
Venky Shankar
12:29 PM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
@NotADirectoryError@ is probably not a valid (built-in) exception in some Python versions. My question is, if this ex... Venky Shankar
08:40 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
Venky Shankar wrote in #note-3:
> Thanks for taking a look, Laura.
>
> Dhairya, please take this one. AFAICT, thi...
Dhairya Parmar
12:23 PM Orchestrator Bug #65407: sequence item 0: expected str instance, dict found
/var/log/user.log:Apr 10 15:03:33 d3p1u01-rc9h7j020-01 ceph-mgr[4176565]: [cephadm ERROR cephadm.serve] Failed to app... Sergei Emelyanov
12:20 PM Orchestrator Bug #65407 (New): sequence item 0: expected str instance, dict found
ceph version 17.2.4 (1353ed37dec8d74973edc3d5d5908c20ad5a7332) quincy (stable)
ceph orch apply -i osd_ssd.yaml
<pre...
Sergei Emelyanov
12:01 PM CephFS Backport #65406 (In Progress): quincy: mds: Reduce log level for messages when mds is stopping
https://github.com/ceph/ceph/pull/57228 Backport Bot
12:01 PM CephFS Backport #65405 (In Progress): reef: mds: Reduce log level for messages when mds is stopping
https://github.com/ceph/ceph/pull/57227 Backport Bot
12:01 PM CephFS Backport #65404 (In Progress): squid: mds: Reduce log level for messages when mds is stopping
https://github.com/ceph/ceph/pull/57224 Backport Bot
11:57 AM CephFS Bug #65260 (Pending Backport): mds: Reduce log level for messages when mds is stopping
Venky Shankar
11:44 AM CephFS Bug #56288: crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool)
Venky Shankar wrote in #note-18:
> So, for some reason this part of the code
>
> [...]
>
> especially derefere...
lei liu
11:34 AM CephFS Bug #56288: crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool)
So, for some reason this part of the code... Venky Shankar
07:58 AM CephFS Bug #56288: crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool)
I haven't been able to reproduce this with the main branch. If possible, please collect a ceph-mds coredump and attac... Venky Shankar
11:28 AM CephFS Bug #65317 (Fix Under Review): cephfs_mirror: update peer status for invalid metadata in remote snapshot
Jos Collin
11:06 AM RADOS Backport #65307 (In Progress): quincy: src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps)
Matan Breizman
11:06 AM RADOS Backport #65306 (In Progress): squid: src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps)
Matan Breizman
11:05 AM RADOS Backport #65305 (In Progress): reef: src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps)
Matan Breizman
10:47 AM CephFS Bug #48680: mds: scrubbing stuck "scrub active (0 inodes in the stack)"
This might be due to enabling of frags as seen in the job description for the job mentioned in comment#4 and probably... Milind Changire
10:39 AM rgw Backport #65403 (New): reef: persistent topic stats test fails
Backport Bot
10:37 AM rgw Bug #65354 (Duplicate): rgw/notifications: topic migration test failures
the issues above are failures due to test issues that were fixed here: https://tracker.ceph.com/issues/63909
sometim...
Yuval Lifshitz
10:32 AM CephFS Bug #65171 (Fix Under Review): Provide metrics support for the Replication Start/End Notifications
Jos Collin
10:31 AM rgw Bug #63909 (Pending Backport): persistent topic stats test fails
Yuval Lifshitz
10:19 AM RADOS Feature #54525: osd/mon: log memory usage during tick
PR: https://github.com/ceph/ceph/pull/56812 junxiang mu
10:15 AM RADOS Bug #58893: test_map_discontinuity: AssertionError: wait_for_clean: failed before timeout expired

in: /a/yuriw-2024-04-02_15:39:50-rados-wip-yuri2-testing-2024-04-01-1235-quincy-distro-default-smithi/7636676
th...
Nitzan Mordechai
05:34 AM RADOS Bug #58893: test_map_discontinuity: AssertionError: wait_for_clean: failed before timeout expired
/a/yuriw-2024-04-02_15:39:50-rados-wip-yuri2-testing-2024-04-01-1235-quincy-distro-default-smithi/7636676 Nitzan Mordechai
09:46 AM rgw Bug #65337: rgw: Segmentation fault in rgw::notify::Manager during realm reload
The valgrind report indicates a crash during shutdown. When we shut down the kafka manager, we destroy all connections,... Yuval Lifshitz
09:44 AM Messengers Bug #65401: msg: conneciton between mgr and osd is periodically down which leads heavy load to mgr
I'm not sure whether this is by design or a mistake, so I pushed a PR for discussion. PR: https://github.com/ceph/ceph/pull/5... Xinying Song
09:26 AM Messengers Bug #65401 (New): msg: conneciton between mgr and osd is periodically down which leads heavy load to mgr
I found that the connections between OSDs and the mgr are periodically marked down due to the ms_connection_idle_timeout config.
This ...
Xinying Song
08:56 AM Dashboard Feature #65268 (Resolved): mgr/dashboard: update NVMe-oF API "listener add" sync
Ernesto Puerta
08:56 AM Dashboard Backport #65390 (Resolved): squid: mgr/dashboard: update NVMe-oF API "listener add" sync
Ernesto Puerta
08:33 AM Ceph Bug #52604 (Closed): osd: mkfs: bluestore_stored > 235GiB from start
The fix was merged Konstantin Shalygin
07:55 AM RADOS Feature #64519: OSD/MON: No snapshot metadata keys trimming
Eugen Block wrote in #note-6:
> I know I'm a bit early asking this, but I helped raise this issue and Mykola picked ...
Matan Breizman
07:45 AM Ceph Bug #65400 (New): ceph-exporter
During the run of the ocs-ci tests (for example "test_fsgroupchangepolicy_when_depoyment_scaled") we receive the foll... Aliaksei Makarau
07:02 AM bluestore Bug #65298: Free space can be leaked in Quincy+ when bdev_async_discard is enabled
PR https://github.com/ceph/ceph/pull/56744 should solve this issue Gabriel BenHanokh
06:40 AM crimson Bug #65399 (Fix Under Review): osd crash due to deferred recovery
Crimson OSD will fail if a recovery op is finished after a recovery/backfill is deferred:... Xuehan Xu
05:40 AM RADOS Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
/a/yuriw-2024-04-02_15:39:50-rados-wip-yuri2-testing-2024-04-01-1235-quincy-distro-default-smithi/7636628 Nitzan Mordechai
05:37 AM RADOS Bug #64725: rados/singleton: application not enabled on pool 'rbd'
/a/yuriw-2024-04-02_15:39:50-rados-wip-yuri2-testing-2024-04-01-1235-quincy-distro-default-smithi/7636638
/a/yuriw-2...
Nitzan Mordechai
05:23 AM Infrastructure Bug #58907: OCI runtime error: runc: runc create failed: unable to start container process
/a/yuriw-2024-04-02_15:39:50-rados-wip-yuri2-testing-2024-04-01-1235-quincy-distro-default-smithi/7636710 Nitzan Mordechai
04:47 AM CephFS Bug #64977: mds spinlock due to lock contention leading to memory exaustion
The *client.379194623:32785 lookup* request was spinning infinitely in MDS:... Xiubo Li
03:05 AM CephFS Bug #62123 (Fix Under Review): mds: detect out-of-order locking
Xiubo Li
01:11 AM rgw Bug #64803 (Fix Under Review): ninja all on fedora 39 fails because arrow_ext requires C++14
Brad Hubbard

04/09/2024

11:50 PM rgw Bug #65397: rgw: allow disabling mdsearch APIs
PR: https://github.com/ceph/ceph/pull/56802 Seena Fallah
11:48 PM rgw Bug #65397 (Fix Under Review): rgw: allow disabling mdsearch APIs
Since this is visible to the bucket owners, it can be presumed to be a functional feature. Providing the ability to d... Seena Fallah
07:43 PM Ceph QA QA Run #65270: wip-yuri6-testing-2024-04-02-1310
https://shaman.ceph.com/builds/ceph/wip-yuri6-testing-2024-04-02-1310/a5074d4516d566e9d8b6aec912f26afd099de101/ Yuri Weinstein
07:33 PM Ceph QA QA Run #65270 (QA Needs Rerun/Rebuilt): wip-yuri6-testing-2024-04-02-1310
Yuri Weinstein
07:29 PM Ceph QA QA Run #65270: wip-yuri6-testing-2024-04-02-1310
Laura Flores wrote in #note-16:
> Hey @yuriw just checking that this run is getting / has gotten rebased and rerun?
...
Yuri Weinstein
06:51 PM Ceph QA QA Run #65270: wip-yuri6-testing-2024-04-02-1310
Hey @yuriw just checking that this run is getting / has gotten rebased and rerun? Laura Flores
07:31 PM rgw Bug #65337: rgw: Segmentation fault in rgw::notify::Manager during realm reload
i managed to reproduce under valgrind. this report of use-after-free looks relevant:... Casey Bodley
06:54 PM Orchestrator Bug #64208 (In Progress): test_cephadm.sh: Container version mismatch causes job to fail.
Adam King
06:36 PM Orchestrator Bug #65396 (New): smb service takes a very long time to delete
Executing `ceph orch rm smb.foo` gets stuck in `<deleting>` phase.
I suspect that there may be an issue removing s...
John Mulligan
05:45 PM Ceph QA QA Run #65099: wip-yuri10-testing-2024-03-24-1159
https://shaman.ceph.com/builds/ceph/wip-yuri10-testing-2024-03-24-1159/bf9c5618cc018927fa77780d25b6deb1f5dc254d/ Yuri Weinstein
05:01 PM Ceph QA QA Run #65099 (QA Needs Rerun/Rebuilt): wip-yuri10-testing-2024-03-24-1159
Yuri Weinstein
05:01 PM Ceph QA QA Run #65099: wip-yuri10-testing-2024-03-24-1159
Seems like it needs rebase as I see a recent commit by @rzarzynski
rebasing
Yuri Weinstein
04:23 PM mgr Feature #64318: mgr/prometheus add support for TLS and client cert authentication
Redouane Kachach Elhichou wrote in #note-5:
> Christian Rohmann wrote:
> > Redouane Kachach Elhichou wrote:
> > ...
Christian Rohmann
04:18 PM RADOS Bug #65227: noscrub cluster flag prevents deep-scrubs from starting
https://github.com/ceph/ceph/blob/main/doc/dev/osd_internals/scrub.rst
https://github.com/ceph/ceph/blob/v17.2.7/src...
junxiang mu
03:31 PM Orchestrator Bug #65367 (Resolved): PermissionError: [Errno 13] Permission denied in the fake filesystem
all 4 PRs are now merged. This should no longer occur in any make check runs started after this point. Adam King
03:16 PM Orchestrator Bug #65395 (Fix Under Review): [node-proxy] the agent shouldn't fail when RedFish returns empty data
Guillaume Abrioux
03:11 PM Orchestrator Bug #65395 (Fix Under Review): [node-proxy] the agent shouldn't fail when RedFish returns empty data
If for some reason RedFish returns empty data, node-proxy fails because it tries to access non-existent keys; it bas... Guillaume Abrioux
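The failure mode described in this entry (indexing into keys that an empty response does not contain) can be illustrated with a minimal sketch. This is not the node-proxy code: the helper name `extract_members` is hypothetical, and the only assumption about the data is the standard Redfish collection layout, where a collection carries a `Members` list whose items have an `@odata.id` key.

```python
from typing import Any, Dict, List


def extract_members(payload: Dict[str, Any]) -> List[str]:
    """Return the '@odata.id' of each collection member.

    Hypothetical helper, not taken from node-proxy. It tolerates an
    empty payload, a missing or null 'Members' key, and members that
    lack '@odata.id', instead of raising KeyError or TypeError.
    """
    # .get() avoids KeyError; 'or []' also covers an explicit null value.
    members = payload.get("Members") or []
    return [
        member["@odata.id"]
        for member in members
        if isinstance(member, dict) and "@odata.id" in member
    ]


if __name__ == "__main__":
    # An empty response yields an empty list rather than an exception.
    print(extract_members({}))
    print(extract_members({"Members": [{"@odata.id": "/redfish/v1/Systems/1"}]}))
```

Returning an empty list keeps the caller's loop logic unchanged when the endpoint has nothing to report; whether the real agent should log, retry, or skip in that case is a design decision the entry does not state.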
03:06 PM Stable releases Tasks #65393: reef v18.2.3
h3. QE VALIDATION (STARTED 4/8/23)
PRs list => https://pad.ceph.com/p/reef_v18.2.3_QE_PRs_LIST
*%{color:blue}Releas...
Yuri Weinstein
03:02 PM Stable releases Tasks #65393 (New): reef v18.2.3
h3. Workflow
* "Preparing the release":http://ceph.com/docs/master/dev/development-workflow/#preparing-a-new-relea...
Yuri Weinstein
03:05 PM Orchestrator Feature #65394 (In Progress): [node-proxy] implement 'endpoints discovering'
RFE to add the logic required for the daemon to explore the API and discover the different endp... Guillaume Abrioux
03:03 PM Orchestrator Bug #65392 (Fix Under Review): [node-proxy] the node-proxy daemon crashes when get_logger() is passed a log level
Guillaume Abrioux
02:48 PM Orchestrator Bug #65392 (Fix Under Review): [node-proxy] the node-proxy daemon crashes when get_logger() is passed a log level
... Guillaume Abrioux
02:59 PM Ceph QA QA Run #65385 (QA Needs Approval): wip-yuri4-testing-2024-04-08-1432
Yuri Weinstein
02:52 PM Ceph Backport #65368 (Resolved): squid: install-deps: enable copr ceph/grpc
Ernesto Puerta
11:28 AM Ceph Backport #65368 (In Progress): squid: install-deps: enable copr ceph/grpc
Ernesto Puerta
02:51 PM Ceph Bug #65184 (Resolved): install-deps: enable copr ceph/grpc
Ernesto Puerta
02:41 PM rgw Bug #65334: Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
the barbican repo changed the name of this branch to @unmaintained/xena@ Casey Bodley
02:41 PM Ceph QA QA Run #65360 (QA Closed): wip-yuri-testing-2024-04-07-0902
Yuri Weinstein
09:53 AM Ceph QA QA Run #65360 (QA Approved): wip-yuri-testing-2024-04-07-0902
Ronen Friedman
09:53 AM Ceph QA QA Run #65360: wip-yuri-testing-2024-04-07-0902
Thanks, @lflores
I should have refreshed the page before going over the failures....
Ronen Friedman
02:40 PM Ceph QA QA Run #65252: wip-yuri2-testing-2024-04-01-1235-quincy
@nmordech see `QA Runs:` above Yuri Weinstein
04:57 AM Ceph QA QA Run #65252: wip-yuri2-testing-2024-04-01-1235-quincy
@lflores sure, waiting for the runs.
@yuriw do we have the link to the suites run?
Nitzan Mordechai
02:34 PM Ceph Backport #65391 (In Progress): squid: osd/scrub: "reservation requested while still reserved" error in cluster log
Ronen Friedman
02:15 PM Ceph Backport #65391 (In Progress): squid: osd/scrub: "reservation requested while still reserved" error in cluster log
Backport Bot
02:10 PM Ceph Bug #64827 (Pending Backport): osd/scrub: "reservation requested while still reserved" error in cluster log
Ronen Friedman
12:56 PM Ceph QA QA Run #65237: wip-ceph_test_rados-partial-reads
Fixed the one above and scheduled a rerun: https://pulpito.ceph.com/rzarzynski-2024-04-09_12:48:11-rados-wip-ceph_tes... Radoslaw Zarzynski
12:39 PM rgw Bug #64308: CORS Preflight Failure After Upgrading to 17.2.7
Will the backports make it into the next release of Quincy/Reef? Reid Guyett
11:54 AM Dashboard Backport #65390 (In Progress): squid: mgr/dashboard: update NVMe-oF API "listener add" sync
Ernesto Puerta
09:36 AM Dashboard Backport #65390 (Resolved): squid: mgr/dashboard: update NVMe-oF API "listener add" sync
https://github.com/ceph/ceph/pull/56783 Backport Bot
09:55 AM CephFS Bug #64977: mds spinlock due to lock contention leading to memory exhaustion
Posted more logs at fed9e44e-a0ec-4692-ae23-6a1047fe9247 Abhishek Lekshmanan
09:29 AM Dashboard Feature #65268 (Pending Backport): mgr/dashboard: update NVMe-oF API "listener add" sync
Ernesto Puerta
08:41 AM Ceph Bug #61598 (Duplicate): gcc-14: FTBFS "error: call to non-'constexpr' function 'virtual unsigned int DoutPrefixProvider::get_subsys() const'"
Tim Serong
08:08 AM Ceph Feature #63801: verified mon backups
*This is really a good idea to have built-in! Thanks for taking this up!*
We have been using a custom backup scrip...
Christian Rohmann
08:02 AM CephFS Bug #65389 (Fix Under Review): The ceph_readdir function in libcephfs returns incorrect d_reclen value
When @struct dirent@ entries are returned by the @ceph_readdir()@ function, the field @d_reclen@ is always 1.
Based on...
Xavi Hernandez
07:19 AM CephFS Bug #65388 (New): The MDS_SLOW_REQUEST warning is flapping even though the slow requests don't go away
I have caught a cluster in an unhealthy state - probably some MDS deadlock that results in requests being blocked (de... Alexander Patrakov
07:14 AM CephFS Bug #65171 (In Progress): Provide metrics support for the Replication Start/End Notifications
Jos Collin
03:14 AM Ceph Support #64378: Slow / Single backfilling on Reef (18.2.1-pve2)
Aha, there's a new feature in Ceph that auto-resets these values:
https://docs.ceph.com/en/quincy/rados/configurat...
Niklas Hambuechen
01:53 AM Ceph Support #64378: Slow / Single backfilling on Reef (18.2.1-pve2)
I observe the same problem on 18.2.1:... Niklas Hambuechen
02:57 AM Linux kernel client Bug #51279: kclient hangs on umount (testing branch)
I have added more debug logs and will dump why the *flushsnap_ack* was dropped directly:... Xiubo Li
01:48 AM Orchestrator Bug #65387 (New): cephadm: Unable to use gather-facts without podman/docker installed
cephadm gather-facts can be used to gather inventory across the hosts to validate hardware prior to deployment. Howev... Paul Cuzner
01:12 AM Ceph QA QA Run #65045: wip-yuri5-testing-2024-03-21-0833
https://shaman.ceph.com/builds/ceph/wip-yuri5-testing-2024-03-21-0833/fbfd55d0098e16f2a4f0d8b71252fe1ef3b65d2a/ Yuri Weinstein
12:22 AM Ceph Bug #65386 (New): rados: create test to validate replica read
RADOS supports the ability to send reads to replicas rather than the primary. The primary use for this feature is to... Samuel Just
 
