Activity
From 04/09/2024 to 05/08/2024
Today
- 09:31 PM rbd Backport #65587 (Resolved): squid: insufficient randomness for group and group snapshot IDs
- 08:49 PM Infrastructure Bug #63531: Error authenticating with smithiXXX.front.sepia.ceph.com: SSHException('No existing session') (No SSH private key found!)
- /a/lflores-2024-05-08_14:59:36-rados-wip-lflores-testing-2-2024-05-07-1606-squid-distro-default-smithi/7697531
- 08:34 PM Ceph Backport #65871 (In Progress): quincy: common/StackStringStream: update pointer to newly allocated memory in overflow()
- 08:32 PM Ceph Backport #65871 (In Progress): quincy: common/StackStringStream: update pointer to newly allocated memory in overflow()
- https://github.com/ceph/ceph/pull/57363
- 08:34 PM Ceph Backport #65870 (In Progress): reef: common/StackStringStream: update pointer to newly allocated memory in overflow()
- 08:31 PM Ceph Backport #65870 (In Progress): reef: common/StackStringStream: update pointer to newly allocated memory in overflow()
- https://github.com/ceph/ceph/pull/57362
- 08:33 PM Ceph Backport #65869 (In Progress): squid: common/StackStringStream: update pointer to newly allocated memory in overflow()
- 08:31 PM Ceph Backport #65869 (In Progress): squid: common/StackStringStream: update pointer to newly allocated memory in overflow()
- https://github.com/ceph/ceph/pull/57361
- 08:32 PM Ceph Bug #65805: common/StackStringStream: update pointer to newly allocated memory in overflow()
- Rongqi Sun wrote in #note-4:
> Seems like backport bot doesn't work?
Fixed it (you were not in the "Ceph Develope...
- 06:07 AM Ceph Bug #65805: common/StackStringStream: update pointer to newly allocated memory in overflow()
- Seems like backport bot doesn't work?
- 02:51 AM Ceph Bug #65805 (Pending Backport): common/StackStringStream: update pointer to newly allocated memory in overflow()
- 08:25 PM Ceph Bug #63557: NVMe-oF gateway prometheus endpoints
- @pcuzner ok to close this now?
- 08:06 PM CephFS Tasks #64165 (In Progress): Fix warnings in read_sync()
- 07:51 PM teuthology Bug #65868: [Dependencies] Ansible Galaxy: 'CustomHTTPSConnection' object has no attribute 'cert_file'. 'CustomHTTPSConnection' object has no attribute 'cert_file'
- PR that fixes this issue: https://github.com/ceph/teuthology/pull/1937
- 07:50 PM teuthology Bug #65868 (New): [Dependencies] Ansible Galaxy: 'CustomHTTPSConnection' object has no attribute 'cert_file'. 'CustomHTTPSConnection' object has no attribute 'cert_file'
- During docker-compose script....
- 07:12 PM CephFS Bug #65618 (In Progress): qa: fsstress: cannot execute binary file: Exec format error
- Looks to be related to the kclient and inline data.
- 06:39 PM Ceph QA QA Run #65867 (QA Testing): wip-pdonnell-testing-20240508.183908-debug
- * "PR #57334":https://github.com/ceph/ceph/pull/57334 -- mds: remove erroneous debug message
* "PR #57329":https://g... - 05:52 PM Ceph QA QA Run #65594 (QA Approved): wip-yuriw11-testing-20240501.200505-squid
- 05:32 PM RADOS Bug #63198: rados/thrash: AssertionError: wait_for_recovery: failed before timeout expired
- /a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-smithi/7686989
- 05:26 PM RADOS Bug #50371: Segmentation fault (core dumped) ceph_test_rados_api_watch_notify_pp
- /a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-smithi/7686880
- 05:18 PM CephFS Tasks #63295 (Resolved): Access semantics
- SetPolicyNonDir test is now done. It passes as it should.
- 04:47 PM rgw Backport #65821 (Resolved): squid: rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)
- 04:18 PM RADOS Bug #63789: LibRadosIoEC test failure
- /a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-smithi/7687027
- 03:16 PM rgw Backport #65244 (Resolved): squid: RGW/s3select : several issues, s3select related, some caused a crash.
- 03:15 PM rgw Backport #65666 (Resolved): squid: rgw/lc: A few buckets stuck in UNINITIAL state
- 03:13 PM Ceph QA QA Run #65859 (QA Testing): wip-lflores-testing-2-2024-05-07-1606-squid
- 03:09 PM RADOS Bug #62934: unittest_osdmap (Subprocess aborted) during OSDMapTest.BUG_42485
- seeing @unittest_osdmap (Subprocess aborted)@ failures on squid too, tagged for backport
- 03:02 PM rgw Bug #65866 (New): reef: cannot build arrow with CMAKE_BUILD_TYPE=Debug
- ...
- 01:09 PM CephFS Bug #63538 (Can't reproduce): mds: src/mds/Locker.cc: 2357: FAILED ceph_assert(!cap->is_new())
- 01:09 PM CephFS Bug #61950 (Can't reproduce): mds/OpenFileTable: match MAX_ITEMS_PER_OBJ does not honor osd_deep_scrub_large_omap_object_key_threshold
- 12:55 PM CephFS Feature #63468 (Pending Backport): mds/purgequeue: add l_pq_executed_ops counter
- 12:53 PM CephFS Feature #61903: pybind/mgr/volumes: add config to turn off subvolume deletion
- Where is the fix that is under review? @rishabh-d-dave
- 12:42 PM CephFS Bug #65865 (New): MDS/MDSMonitor: retval of mds fail cmd when non-existent MDS name is passed is zero
- Passing a non-existent MDS's name to the command @ceph mds fail@ returns zero. It should be returning non-zero value ...
- 12:40 PM CephFS Bug #65864 (New): qa/cephfs: tests in TestMDSFail passes FS name to mds fail command
- 12:31 PM CephFS Bug #60986: crash: void MDCache::rejoin_send_rejoins(): assert(auth >= 0)
- These are the current actions tried:...
- 12:27 PM CephFS Bug #60986: crash: void MDCache::rejoin_send_rejoins(): assert(auth >= 0)
- Can we please bump the severity a few levels?
There is loss of production as the MDS are currently not running. Ev...
- 07:49 AM CephFS Bug #60986: crash: void MDCache::rejoin_send_rejoins(): assert(auth >= 0)
- Robert Sander wrote in #note-6:
> Xiubo Li wrote in #note-4:
> > A new report from the ceph-user mail list: https:/...
- 11:22 AM Dashboard Bug #65863 (New): ceph-mixin - CephPGImbalance alert not honoring osd device class
- CephPGImbalance alert expression...
- 11:08 AM CephFS Bug #65795 (Fix Under Review): cephfs_mirror: daemon status shows KeyError: 'directory_count'
- 10:49 AM rgw Bug #65862 (Fix Under Review): rgw/cloud-transition: crash with notify->publish_commit
- 10:47 AM rgw Bug #65862 (Fix Under Review): rgw/cloud-transition: crash with notify->publish_commit
- Below crash is observed while running cloud-transition tests at scale -
Thread 594 "wp_thrd: 1, 0" received signal...
- 10:04 AM rgw Bug #65861 (New): notifications: report an error when persistent queue deletion failed
- assuming the topic deletion was successful, we cannot send an error if the queue deletion failed.
since any consequ...
- 09:47 AM CephFS Bug #65705 (Fix Under Review): qa: snaptest-multiple-capsnaps.sh failure
- The ceph patch link: https://patchwork.kernel.org/project/ceph-devel/list/?series=851489&archive=both
- 09:25 AM CephFS Bug #64730 (Fix Under Review): fs/misc/multiple_rsync.sh workunit times out
- 09:25 AM Dashboard Bug #65788 (Resolved): mgr/dashboard: add prometheus federation config for mullti-cluster monitoring
- 09:25 AM Dashboard Backport #65790 (Resolved): squid: mgr/dashboard: add prometheus federation config for mullti-cluster monitoring
- 07:56 AM crimson Bug #65753 (Duplicate): [crimson] OSD deployment fails
- 07:53 AM crimson Bug #65857: osd: user_version is inconsistent between object_info and log entries
- I suspect this is related to how we handle acting set changes which trigger `ClientRequest::Orderer::requeue`.
See: h...
- 07:32 AM CephFS Bug #65778: qa: valgrind error: Leak_StillReachable malloc malloc strdup
- @vshankar How do I interpret this ?
I don't see any frame pointing to a ceph file apart from the initial finger poi...
- 07:03 AM crimson Bug #65635: Crimson seastore unit test random failure on AARCH64 (DEADLYSIGNAL by caused by a READ memory access)
- > Not sure if reactor stall related?
I think not, reactor stall is a warning from seastar if a continuation takes ...
- 06:46 AM crimson Bug #65635: Crimson seastore unit test random failure on AARCH64 (DEADLYSIGNAL by caused by a READ memory access)
- Yingxin Cheng wrote in #note-6:
> > Indeed, but all about seastore.
>
> Seems correct, revised the title and cate...
- 06:39 AM crimson Bug #65635: Crimson seastore unit test random failure on AARCH64 (DEADLYSIGNAL by caused by a READ memory access)
- > Indeed, but all about seastore.
Seems correct, revised the title and category.
- 06:03 AM crimson Bug #65635: Crimson seastore unit test random failure on AARCH64 (DEADLYSIGNAL by caused by a READ memory access)
- Yingxin Cheng wrote in #note-3:
> Seems it can fail anytime regardless of the specific crimson test.
>
> Examples...
- 05:13 AM crimson Bug #65635: Crimson seastore unit test random failure on AARCH64 (DEADLYSIGNAL by caused by a READ memory access)
- Seems it can fail anytime regardless of the specific crimson test.
Examples:...
- 02:32 AM crimson Bug #65635: Crimson seastore unit test random failure on AARCH64 (DEADLYSIGNAL by caused by a READ memory access)
- Attach full make check log.
- 06:19 AM CephFS Bug #65841 (Fix Under Review): qa: dead job from `tasks.cephfs.test_admin.TestFSFail.test_with_health_warn_oversize_cache`
- 05:56 AM CephFS Bug #65770: qa: failed to be set on mds daemons: {'mds.imported', 'mds.exported'}
- Jos, start by checking if the workload isn't heavy enough to trigger subtree export/import (which would then update t...
- 03:29 AM teuthology Bug #64828 (In Progress): teuthology-suite: -n option does not sync --ceph/sha1 and --suite-branch/sha1
- 02:35 AM RADOS Bug #59670: Ceph status shows PG recovering when norecover flag is set
- Radoslaw Zarzynski wrote in #note-5:
> The fix has been merged on 5 Jan 2024, so this could fit. It has been bacport...
- 12:38 AM mgr Bug #65860 (New): Upgrade test re-opts into new telemetry collections too late
- The test failed from:
/a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-sm...
- 12:33 AM mgr Backport #65117 (Resolved): squid: rados/upgrade/parallel: [WRN] TELEMETRY_CHANGED: Telemetry requires re-opt-in
05/07/2024
- 10:21 PM Ceph Bug #55859: Radosgw-admin: illegal instruction, running on commodity hardware
- I recently updated my cluster (AMD Opteron 6134) from Pacific to Reef and ran into this, too. I believe the problem w...
- 09:10 PM Ceph QA QA Run #65859 (QA Building): wip-lflores-testing-2-2024-05-07-1606-squid
- 09:09 PM Ceph QA QA Run #65859 (QA Testing): wip-lflores-testing-2-2024-05-07-1606-squid
- * "PR #57303":https://github.com/ceph/ceph/pull/57303 -- squid:osd/scrub: reinstate scrub reservation queuing
- 09:08 PM CephFS Bug #65389 (Fix Under Review): The ceph_readdir function in libcephfs returns incorrect d_reclen value
- 08:48 PM CephFS Bug #65858 (New): ceph.in: make `ceph tell mds.<fsname>:<rank> help` give help output
- Right now it gives an error:...
- 08:12 PM crimson Bug #65857 (New): osd: user_version is inconsistent between object_info and log entries
- Steps to reproduce:
- create vstart cluster...
- 08:07 PM CephFS Backport #65854: quincy: mds: upgrade to MDS enforcing CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK with client having root_squash in any MDS cap causes eviction for all file systems the client has caps for
- https://github.com/ceph/ceph/pull/54469#issuecomment-2099212048
- 07:49 PM CephFS Backport #65854 (New): quincy: mds: upgrade to MDS enforcing CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK with client having root_squash in any MDS cap causes eviction for all file systems the client has caps for
- 08:04 PM Ceph QA QA Run #65856 (QA Testing): wip-pdonnell-testing-20240508.150423-reef
- * "PR #57357":https://github.com/ceph/ceph/pull/57357 -- reef: ceph.spec.in: remove command-with-macro line
* "PR #5... - 08:02 PM CephFS Backport #65855 (In Progress): reef: mds: upgrade to MDS enforcing CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK with client having root_squash in any MDS cap causes eviction for all file systems the client has caps for
- 07:49 PM CephFS Backport #65855 (In Progress): reef: mds: upgrade to MDS enforcing CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK with client having root_squash in any MDS cap causes eviction for all file systems the client has caps for
- https://github.com/ceph/ceph/pull/57343
- 07:59 PM CephFS Backport #65853 (In Progress): squid: mds: upgrade to MDS enforcing CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK with client having root_squash in any MDS cap causes eviction for all file systems the client has caps for
- 07:49 PM CephFS Backport #65853 (In Progress): squid: mds: upgrade to MDS enforcing CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK with client having root_squash in any MDS cap causes eviction for all file systems the client has caps for
- https://github.com/ceph/ceph/pull/57342
- 07:56 PM CephFS Backport #65844 (In Progress): squid: qa: Health detail: HEALTH_WARN Degraded data redundancy: 40/348 objects degraded (11.494%), 9 pgs degraded" in cluster log
- 02:41 PM CephFS Backport #65844 (In Progress): squid: qa: Health detail: HEALTH_WARN Degraded data redundancy: 40/348 objects degraded (11.494%), 9 pgs degraded" in cluster log
- https://github.com/ceph/ceph/pull/57341
- 07:53 PM CephFS Backport #65843 (In Progress): squid: qa: quiesce cache/ops dump not world readable
- 02:40 PM CephFS Backport #65843 (In Progress): squid: qa: quiesce cache/ops dump not world readable
- https://github.com/ceph/ceph/pull/57340
- 07:49 PM CephFS Bug #65733 (Pending Backport): mds: upgrade to MDS enforcing CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK with client having root_squash in any MDS cap causes eviction for all file systems the client has caps for
- 07:47 PM RADOS Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
- /a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-smithi/7686875
- 07:45 PM Ceph Bug #65852: ceph_test_rados command hits ceph_abort when trying to delete op
- Looks similar to https://tracker.ceph.com/issues/48764
- 07:45 PM Ceph Bug #65852 (New): ceph_test_rados command hits ceph_abort when trying to delete op
- /a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-smithi/7686996...
- 05:28 PM CephFS Bug #65851 (New): MDS Squid Metadata Performance Regression
- Found during 21 MDS IO500 runs comparing v18.2.2 to v19.0.0.
| Ceph Version | Measurement | v1...
- 05:08 PM rgw-testing Backport #65850 (New): reef: notifications: test hangs when http notification fails
- 05:08 PM rgw-testing Backport #65849 (New): squid: notifications: test hangs when http notification fails
- 05:06 PM rgw-testing Bug #65848 (Pending Backport): notifications: test hangs when http notification fails
- 05:06 PM rgw-testing Bug #65848 (Pending Backport): notifications: test hangs when http notification fails
- this is a regression caused by the fix of: https://tracker.ceph.com/issues/63909
so, before doing the backport to sq...
- 04:31 PM Orchestrator Bug #65017: cephadm: log_channel(cephadm) log [ERR] : Failed to connect to smithi090 (10.0.0.9). Permission denied
- /a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-smithi/7686876
- 03:44 PM sepia Support #65847 (New): Sepia Lab Access Request
- 1) Do you just need VPN access or will you also be running teuthology jobs?
VPN access only
2) Desired Username:
...
- 03:31 PM CephFS Bug #60986: crash: void MDCache::rejoin_send_rejoins(): assert(auth >= 0)
- While trying to export the journal the following error shows up:...
- 07:56 AM CephFS Bug #60986: crash: void MDCache::rejoin_send_rejoins(): assert(auth >= 0)
- Xiubo Li wrote in #note-4:
> A new report from the ceph-user mail list: https://lists.ceph.io/hyperkitty/list/ceph...
- 12:42 AM CephFS Bug #60986: crash: void MDCache::rejoin_send_rejoins(): assert(auth >= 0)
- Another one https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/NDWFYV5XFDCUW5EBRWXEDQFGVFL5HAIV/:
<pr...
- 12:37 AM CephFS Bug #60986: crash: void MDCache::rejoin_send_rejoins(): assert(auth >= 0)
- A new report from the ceph-user mail list: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/GOAZLA6NQH...
- 03:12 PM rgw Backport #64425 (Resolved): quincy: rgw: rados objects wrongly deleted
- 03:11 PM rgw Backport #64539 (Resolved): quincy: metadata cache races on deletes
- 03:11 PM rgw Backport #64599 (Resolved): quincy: unittest_rgw_dmclock_scheduler fails for arm64
- 03:08 PM rgw Feature #65050 (Fix Under Review): Add alternative way for providing user name/password for Kafka endpoint authentication
- 03:07 PM CephFS Bug #65846 (Fix Under Review): mds: "invalid message type: 501"
- 02:51 PM CephFS Bug #65846 (Fix Under Review): mds: "invalid message type: 501"
- ...
- 02:41 PM CephFS Backport #65845 (New): reef: qa: Health detail: HEALTH_WARN Degraded data redundancy: 40/348 objects degraded (11.494%), 9 pgs degraded" in cluster log
- 02:40 PM crimson Bug #65842: unittest-seastore (Failed) on arm64
- similar arm64 failure from @unittest-staged-fltree@ in https://jenkins.ceph.com/job/ceph-pull-requests-arm64/55826/co...
- 02:36 PM crimson Bug #65842 (New): unittest-seastore (Failed) on arm64
- from https://jenkins.ceph.com/job/ceph-pull-requests-arm64/56090/consoleFull#772176351e840cee4-f4a4-4183-81dd-4285561...
- 02:34 PM CephFS Bug #65701 (Pending Backport): qa: quiesce cache/ops dump not world readable
- 02:34 PM CephFS Bug #65700 (Pending Backport): qa: Health detail: HEALTH_WARN Degraded data redundancy: 40/348 objects degraded (11.494%), 9 pgs degraded" in cluster log
- 02:19 PM rgw Bug #65664 (Fix Under Review): Crash observed in boost::asio module related to stream.async_shutdown()
- 05:57 AM rgw Bug #65664: Crash observed in boost::asio module related to stream.async_shutdown()
- Updating:
Managed to repro the crash repeatedly and verify that the fix PR does resolve the issue.
Details:
Befo...
- 02:15 PM Ceph QA QA Run #65771 (QA Approved): wip-pdonnell-testing-20240503.163550-debug
- https://tracker.ceph.com/projects/cephfs/wiki/Main#2024-05-03wip-pdonnell-testing-20240503163550-debug
- 02:06 PM CephFS Bug #65802 (In Progress): Quiesce and rename aren't properly syncrhonized
- 11:07 AM CephFS Bug #65802: Quiesce and rename aren't properly syncrhonized
- Update: having implemented the above I realized that it's just an optimization. The real issue we had was due to the ...
- 10:18 AM CephFS Bug #65802: Quiesce and rename aren't properly syncrhonized
- With the help of @kotresh we have the picture of the deadlock:
1. the dest auth mds xlocks the linklock on both th...
- 02:00 PM rgw Bug #59488 (Fix Under Review): [RGW][Notification][Kafka]: event name received as "Noncurrent" instead of "NonCurrent"
- 01:29 PM CephFS Bug #65020: qa: Scrub error on inode 0x1000000356c (/volumes/qa/sv_0/2f8f6bb4-3ea9-47a0-bd79-a0f50dc149d5/client.0/tmp/clients/client7/~dmtmp/PARADOX) see mds.b log and `damage ls` output for details" in cluster log
- Milind Changire wrote in #note-6:
> Venky Shankar wrote in #note-5:
> > Isn't this same as: https://tracker.ceph.co...
- 01:24 PM CephFS Bug #65020: qa: Scrub error on inode 0x1000000356c (/volumes/qa/sv_0/2f8f6bb4-3ea9-47a0-bd79-a0f50dc149d5/client.0/tmp/clients/client7/~dmtmp/PARADOX) see mds.b log and `damage ls` output for details" in cluster log
- Patrick Donnelly wrote:
> https://pulpito.ceph.com/pdonnell-2024-03-20_18:16:52-fs-wip-batrick-testing-20240320.1457...
- 01:23 PM CephFS Bug #65020: qa: Scrub error on inode 0x1000000356c (/volumes/qa/sv_0/2f8f6bb4-3ea9-47a0-bd79-a0f50dc149d5/client.0/tmp/clients/client7/~dmtmp/PARADOX) see mds.b log and `damage ls` output for details" in cluster log
- Venky Shankar wrote in #note-5:
> Isn't this same as: https://tracker.ceph.com/issues/48562 ?
"object missing on ... - 01:00 PM CephFS Bug #65020: qa: Scrub error on inode 0x1000000356c (/volumes/qa/sv_0/2f8f6bb4-3ea9-47a0-bd79-a0f50dc149d5/client.0/tmp/clients/client7/~dmtmp/PARADOX) see mds.b log and `damage ls` output for details" in cluster log
- Isn't this same as: https://tracker.ceph.com/issues/48562 ?
- 11:17 AM CephFS Bug #65020: qa: Scrub error on inode 0x1000000356c (/volumes/qa/sv_0/2f8f6bb4-3ea9-47a0-bd79-a0f50dc149d5/client.0/tmp/clients/client7/~dmtmp/PARADOX) see mds.b log and `damage ls` output for details" in cluster log
- Patrick Donnelly wrote in #note-1:
> Maybe also related: https://pulpito.ceph.com/pdonnell-2024-03-20_18:16:52-fs-wi...
- 12:50 PM rgw Bug #65794: Ceph Reef RGW error response fails to be parsed during awscli create-bucket
- Yep, this is behaviour of boto, it parses xml response with...
- 12:16 PM CephFS Bug #65841 (Fix Under Review): qa: dead job from `tasks.cephfs.test_admin.TestFSFail.test_with_health_warn_oversize_cache`
- /teuthology/pdonnell-2024-05-07_01:13:22-fs-wip-pdonnell-testing-20240503.163550-debug-distro-default-smithi/7695097/...
- 12:14 PM Dashboard Backport #65840 (New): reef: mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
- 12:12 PM Dashboard Backport #65839 (New): reef: mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
- 12:12 PM Dashboard Backport #65838 (New): squid: mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
- 12:10 PM CephFS Bug #65837 (Fix Under Review): qa: dead job from waiting to unmount client on deliberately damaged fs
- 12:09 PM CephFS Bug #65837 (Fix Under Review): qa: dead job from waiting to unmount client on deliberately damaged fs
- https://pulpito.ceph.com/pdonnell-2024-05-07_01:13:22-fs-wip-pdonnell-testing-20240503.163550-debug-distro-default-sm...
- 11:59 AM CephFS Bug #65616: pybind/mgr/snap_schedule: 1m scheduled snaps not reliably executed (RuntimeError: The following counters failed to be set on mds daemons: {'mds_server.req_rmsnap_latency.avgcount'})
- Milind Changire wrote in #note-4:
> @pdonnell Do you wan't me to check the continuity in the timestamps in the snap ...
- 11:06 AM CephFS Bug #65616: pybind/mgr/snap_schedule: 1m scheduled snaps not reliably executed (RuntimeError: The following counters failed to be set on mds daemons: {'mds_server.req_rmsnap_latency.avgcount'})
- @pdonnell Do you want me to check the continuity in the timestamps in the snap dir names?
- 11:46 AM mgr Bug #65836 (New): ceph-mgr cephadm's service discovery not starting
- Hello !
I've seen the https://tracker.ceph.com/issues/63388 issue, and I'm currently facing something a bit simila...
- 10:55 AM CephFS Bug #65580: mds/client: add dummy client feature to test client eviction
- Dhairya Parmar wrote in #note-8:
> Venky Shankar wrote in #note-7:
> > Dhairya Parmar wrote in #note-6:
> > > Venk...
- 10:54 AM Linux kernel client Bug #65835 (New): File reads/writes hang during ceph_llseek with misrouted OSD
- Using v17.2.7 with linux kernel version 5.15.0-105-generic, we've been having issues with rsync hanging on writes, an...
- 10:38 AM Ceph Bug #65834 (New): reef: cephadm: ceph-common package installation with cephadm fails due to the activation of OracleLinux EPEL repository
- Installation of ceph-common package on Oracle Linux 9 with cephadm fails due to the activation of the OracleLinux EPE...
- 10:36 AM Ceph Bug #65833 (New): No binaries for reef for el9 / aarch64
- Hi, I've noticed that no binaries appear to have been released for reef for el9: https://download.ceph.com/rpm-reef/e...
- 10:23 AM CephFS Bug #65829: qa: qa/suites/fs/functional/subvol_versions/ multiplies all jobs in fs:function by 2
- @pdonnell I understood the reorg bit up to v1,v2 ... but not the last "test" part. What's the "test" part ?
- 12:17 AM CephFS Bug #65829 (New): qa: qa/suites/fs/functional/subvol_versions/ multiplies all jobs in fs:function by 2
- This change:
https://github.com/ceph/ceph/pull/53999/files#diff-e00804e3b70b5d89f530c963e9dfa38f43587ae6be9d94687d...
- 10:04 AM crimson Bug #65832 (New): crimson osd clone_overlap calculate error
- ...
- 09:56 AM Dashboard Bug #63686 (In Progress): mgr/dashboard: adapt service creation form to support nvmeof creation
- 09:36 AM sepia Support #65831 (New): Sepia Lab access
- 1) Do you just need VPN access or will you also be running teuthology jobs?
I need VPN access and will also be run...
- 09:19 AM rgw Feature #65830 (New): rgw: allow send bucket notification to multiple brokers of kafka cluster
- Currently, rgw allows sending messages to only one Kafka node.
add a parameter to configure the broker list and support send mes...
- 09:04 AM crimson Bug #65752: [crimson] OSD deployment fails
- > Hi Matan,
>
> Thank you for your response!
> Have these changes been introduced recently?
>
> Asking becau...
- 08:59 AM crimson Bug #65752: [crimson] OSD deployment fails
- Matan Breizman wrote in #note-1:
> Hey Harsh,
> Looks like the OSD crashes in "crimson::os::AlienStore::start()".
...
- 08:46 AM crimson Bug #65752 (Need More Info): [crimson] OSD deployment fails
- Hey Harsh,
Looks like the OSD crashes in "crimson::os::AlienStore::start()".
I suspect this is about missing essent...
- 08:41 AM rbd Backport #65814 (In Progress): squid: [pybind] expose CLONE_FORMAT and FLATTEN image options
- 08:39 AM rbd Backport #65816 (In Progress): reef: [pybind] expose CLONE_FORMAT and FLATTEN image options
- 08:39 AM Ceph QA QA Run #65793: wip-rishabh-testing-20240503.134948
- There are more than 100 failures related to cephadm. These failures occurred again in re-run which confirms this issu...
- 08:33 AM rbd Backport #65815 (In Progress): quincy: [pybind] expose CLONE_FORMAT and FLATTEN image options
- 08:31 AM Ceph QA QA Run #65792: wip-rishabh-testing-20240501.193033
- There are plenty of related failures. This PR needs to be fixed and tested again. It'll take a new build for it and therefo...
- 08:31 AM Ceph QA QA Run #65792 (QA Closed): wip-rishabh-testing-20240501.193033
- 08:29 AM crimson Bug #65635: Crimson seastore unit test random failure on AARCH64 (DEADLYSIGNAL by caused by a READ memory access)
https://jenkins.ceph.com/job/ceph-pull-requests-arm64/56052/consoleFull#-14170402636733401c-e9d0-4737-9832-6594c5da0a...
- 08:09 AM rbd Backport #65817 (In Progress): squid: rbd-mirror daemon in ERROR state, require manual restart
- 08:09 AM rbd Backport #65817 (In Progress): squid: rbd-mirror daemon in ERROR state, require manual restart
- 08:08 AM rbd Backport #65819 (In Progress): reef: rbd-mirror daemon in ERROR state, require manual restart
- 08:06 AM rbd Backport #65818 (In Progress): quincy: rbd-mirror daemon in ERROR state, require manual restart
- 07:59 AM CephFS Bug #65388: The MDS_SLOW_REQUEST warning is flapping even though the slow requests don't go away
- Venky, no, not yet. I haven't gotten back to this with the quiesce work that keeps coming. I'll try to continue where...
- 07:18 AM Ceph Bug #63494 (Pending Backport): all: daemonizing may release CephContext:: _fork_watchers_lock when its already unlocked
- 07:16 AM ceph-volume Bug #64260 (Resolved): ceph-volume lvm migrate could assert on AttributeError:'NoneType' object has no attribute 'path'
- 07:16 AM ceph-volume Backport #64356 (Resolved): quincy: ceph-volume lvm migrate could assert on AttributeError:'NoneType' object has no attribute 'path'
- 06:55 AM rgw Bug #65436: Getting Object Crashing radosgw services
- Reid Guyett wrote in #note-8:
> What did you do to fix it at the proxy layer? Strip the parameters from the URL?
...
- 06:25 AM CephFS Bug #64209 (Duplicate): snaptest-multiple-capsnaps.sh fails with "got remote process result: 1"
- This is the same issue as https://tracker.ceph.com/issues/65705.
- 05:47 AM CephFS Bug #65705: qa: snaptest-multiple-capsnaps.sh failure
- Venky Shankar wrote in #note-4:
> Xiubo Li wrote in #note-3:
> > Venky Shankar wrote in #note-2:
> > > Xiubo, this...
- 04:58 AM CephFS Bug #65705: qa: snaptest-multiple-capsnaps.sh failure
- Xiubo Li wrote in #note-3:
> Venky Shankar wrote in #note-2:
> > Xiubo, this is using the distro kernel. Maybe the ...
- 04:01 AM CephFS Bug #65705: qa: snaptest-multiple-capsnaps.sh failure
- Venky Shankar wrote in #note-2:
> Xiubo, this is using the distro kernel. Maybe the relevant kclient fixes haven't ye...
- 03:51 AM CephFS Bug #65705: qa: snaptest-multiple-capsnaps.sh failure
- Xiubo, this is using the distro kernel. Maybe the relevant kclient fixes haven't yet landed in the distro kernel?
- 04:38 AM Dashboard Bug #64321 (Pending Backport): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
- 04:38 AM Dashboard Bug #64321 (Fix Under Review): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
- 04:37 AM RADOS Bug #65768: rados/verify: Health check failed: 1 osds down (OSD_DOWN)" in cluster log
- @rzarzynski I found this during a review of a squid run that included a couple of my PRs.
I wasn't working on this, ...
- 04:26 AM RADOS Bug #65737 (Fix Under Review): pg-split-merge.sh -
- 04:24 AM RADOS Bug #65737: pg-split-merge.sh -
- Radoslaw Zarzynski wrote in #note-1:
> Hi Nitzan! Are you working on this tracker maybe?
yes, I'm researching it.
05/06/2024
- 11:02 PM rgw Bug #65828 (New): radosgw process killed with "Out of memory" while executing query "select * from s3object limit 1" on a 12GB parquet file
- (copied from https://bugzilla.redhat.com/show_bug.cgi?id=2275323)
Description of problem:
radosgw process kill...
- 10:35 PM rgw Bug #65794: Ceph Reef RGW error response fails to be parsed during awscli create-bucket
- Peter Razumovsky wrote:
> It seems this is due to RGW starts returning message tag with empty body since Ceph Reef:...
- 10:26 PM rgw Backport #59614 (Resolved): reef: s3 error response missing Message field
- 09:32 PM RADOS Bug #56770: crash: void OSDShard::register_and_wake_split_child(PG*): assert(p != pg_slots.end())
- /a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-smithi/7686929
- 09:28 PM Orchestrator Bug #65732: rados/cephadm/osds: job times out during nvme_loop interval
- /a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-smithi/7687095
- 08:56 PM RADOS Bug #65749: osd_max_pg_per_osd_hard_ratio 3 is set too low for real life
- Radoslaw Zarzynski wrote in #note-4:
> Is there any trace of autoscaler-induced PG splitting visible during the situ...
- 07:32 PM RADOS Bug #65749: osd_max_pg_per_osd_hard_ratio 3 is set too low for real life
- Is there any trace of autoscaler-induced PG splitting visible during the situation?
- 08:56 PM rgw Bug #59380: rados/singleton-nomsgr: test failing from "Health check failed: 1 full osd(s) (OSD_FULL)" and "Health check failed: 1 filesystem is offline (MDS_ALL_DOWN)"
- /a/yuriw-2024-05-02_23:59:28-rados-wip-yuriw11-testing-20240501.200505-squid-distro-default-smithi/7687043
- 07:36 PM Orchestrator Bug #65827 (New): qa/tasks/cephadm: logrotation is not done every 15 minutes as in the ceph.py task
- This can lead to massive logs which are not compressed until the very end of the test:...
- 07:21 PM Ceph QA QA Run #65349 (QA Closed): wip-yuri3-testing-2024-04-05-0825
- 07:21 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
- @ksirivad also pls add a link into the PRs next time
like in https://github.com/ceph/ceph/pull/56515#issuecomment-...
- 07:11 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
- @ksirivad pls assign to me and/or change the status to "QA Approved" in the future, TIA
- 06:52 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
- RADOS APPROVED https://tracker.ceph.com/projects/rados/wiki/MAIN#httpstrackercephcomissues65349
@yuriw
- 06:59 PM CephFS Tasks #64164 (In Progress): verify st_blocks is correct
- It appears the logic the #warning is addressing matches non-fscrypt directory behavior. This logic has been here since ...
- 06:56 PM RADOS Bug #65737: pg-split-merge.sh -
- Hi Nitzan! Are you working on this tracker maybe?
- 06:47 PM RADOS Bug #65826 (New): test_default_progress_test (tasks.mgr.test_progress.TestProgress) remove_pool assert pool_name in self.pool
- /a/yuriw-2024-05-01_22:15:10-rados-wip-yuri3-testing-2024-04-05-0825-distro-default-smithi/7684757/...
- 06:46 PM RADOS Bug #53544: src/test/osd/RadosModel.h: ceph_abort_msg("racing read got wrong version") in thrash_cache_writeback_proxy_none tests
- Cache tiering is unsupported.
- 06:44 PM RADOS Bug #65765: squid: rados/test.sh: LibRadosWatchNotifyECPP.WatchNotify test of api_watch_notify_pp suite didn't complete.
- Hi Nitzan! Would you mind taking a look?
- 06:42 PM RADOS Bug #65686: ECBackend doesn't pass CEPH_OSD_OP_FLAG_BYPASS_CLEAN_CACHE flag when scrubbing
- Thanks for finding it, Mohit!
- 06:26 PM RADOS Bug #65825 (New): test_python.sh TestIoctx.test_locator failes: Classic
- /a/yuriw-2024-05-01_22:15:10-rados-wip-yuri3-testing-2024-04-05-0825-distro-default-smithi/7684635...
- 06:14 PM Orchestrator Bug #65824 (New): rados/thrash-old-clients: cluster [WRN] Health detail: HEALTH_WARN noscrub flag(s) set" in cluster log
- /a/yuriw-2024-05-01_22:15:10-rados-wip-yuri3-testing-2024-04-05-0825-distro-default-smithi/7684599/
Just need to w...
- 06:05 PM CephFS Bug #65823 (Fix Under Review): qa/tasks/quiescer: dump ops in parallel
- 06:03 PM CephFS Bug #65823 (Fix Under Review): qa/tasks/quiescer: dump ops in parallel
- Since this --flags=locks takes the mds_lock and dumps thousands of ops, this
may take a long time to complete for ea...
- 06:00 PM RADOS Bug #65768: rados/verify: Health check failed: 1 osds down (OSD_DOWN)" in cluster log
- Sridhar, are you working on this?
- 05:59 PM RADOS Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
- IIRC Ronen has a hypothesis this is another incarnation of https://tracker.ceph.com/issues/65185.
The has been merge...
- 05:54 PM RADOS Bug #63198 (In Progress): rados/thrash: AssertionError: wait_for_recovery: failed before timeout expired
- 05:50 PM RADOS Bug #55750: mon: slow request of very long time
- Hi Dan! Do you have the corresponding mon's log by any chance?
- 05:50 PM RADOS Bug #55750: mon: slow request of very long time
- ...
- 05:49 PM CephFS Tasks #65811 (Resolved): Make dbench work on fscrypt
- dbench completes successfully without any errors.
- 01:51 PM CephFS Tasks #65811 (Resolved): Make dbench work on fscrypt
- Ensure dbench will work on top of fuse w/fscrypt.
- 05:34 PM rgw Backport #65822 (In Progress): reef: rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)
- 05:19 PM rgw Backport #65822 (In Progress): reef: rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)
- https://github.com/ceph/ceph/pull/57301
- 05:31 PM CephFS Tasks #65812 (Resolved): pwrite failure on overwrite
- The issue was caused by a fix I did in an earlier commit. The reproducer should do the read in the start block. The bool for st...
- 02:00 PM CephFS Tasks #65812 (Resolved): pwrite failure on overwrite
- Failure on pwrite on overwrite when end of write is past previous end of file.
Steps to reproduce:...
- 05:21 PM rgw Backport #65821 (In Progress): squid: rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)
- 05:19 PM rgw Backport #65821 (Resolved): squid: rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)
- https://github.com/ceph/ceph/pull/57300
- 05:19 PM rgw Bug #65746 (Pending Backport): rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)
- 05:12 PM mgr Tasks #47108 (Resolved): mgr/restful: Document deprecation of the restful module in favor of the Ceph Dashboard REST API in the restful documentation
- Done in the parent task: https://tracker.ceph.com/issues/47066
- 05:11 PM mgr Tasks #47067 (Resolved): mgr/restful: communicate the deprecation of the restful module in favor of the Ceph Dashboard REST API
- 05:10 PM mgr Tasks #47066 (Fix Under Review): mgr/restful: Deprecate the "restful" module in favor of the Ceph Dashboard REST API
- 04:53 PM CephFS Bug #65820 (New): qa/tasks/fwd_scrub: Traceback in teuthology.log for normal exit condition
- ...
- 04:40 PM rgw Bug #65436: Getting Object Crashing radosgw services
- What did you do to fix it at the proxy layer? Strip the parameters from the URL?
- 03:45 PM rbd Backport #65819 (In Progress): reef: rbd-mirror daemon in ERROR state, require manual restart
- https://github.com/ceph/ceph/pull/57306
- 03:45 PM rbd Backport #65818 (In Progress): quincy: rbd-mirror daemon in ERROR state, require manual restart
- https://github.com/ceph/ceph/pull/57305
- 03:45 PM rbd Backport #65817 (In Progress): squid: rbd-mirror daemon in ERROR state, require manual restart
- https://github.com/ceph/ceph/pull/57307
- 03:44 PM rbd Backport #65816 (In Progress): reef: [pybind] expose CLONE_FORMAT and FLATTEN image options
- https://github.com/ceph/ceph/pull/57309
- 03:44 PM rbd Backport #65815 (In Progress): quincy: [pybind] expose CLONE_FORMAT and FLATTEN image options
- https://github.com/ceph/ceph/pull/57308
- 03:44 PM rbd Backport #65814 (In Progress): squid: [pybind] expose CLONE_FORMAT and FLATTEN image options
- https://github.com/ceph/ceph/pull/57310
- 03:44 PM rbd Feature #65624 (Pending Backport): [pybind] expose CLONE_FORMAT and FLATTEN image options
- 03:42 PM rbd Bug #65487 (Pending Backport): rbd-mirror daemon in ERROR state, require manual restart
- 03:33 PM rbd Bug #65813 (New): [test] fsx can call posix_memalign() with size == 0
- While legal, it's specified as implementation-defined:
> If the size of the space requested is 0, the behavior is im...
- 03:09 PM Ceph QA QA Run #65454: wip-vshankar-testing-20240411.061452
- Dropped a couple of offending PRs and one more that got merged by another dev.
- 02:41 PM RADOS Bug #58461 (Closed): osd/scrub: replica-response timeout is handled without locking the PG
- 02:41 PM RADOS Bug #58461: osd/scrub: replica-response timeout is handled without locking the PG
- 49687 was never merged. Instead, the whole timeout implementation was discarded (Squid).
- 02:38 PM RADOS Bug #61457 (Can't reproduce): PgScrubber: shard blocked on an object for too long
- 02:36 PM RADOS Backport #63370 (Resolved): quincy: use-after-move in OSDService::build_incremental_map_msg()
- 02:35 PM RADOS Bug #63310 (Resolved): use-after-move in OSDService::build_incremental_map_msg()
- 02:32 PM RADOS Bug #63509 (Resolved): osd/scrub: some replica states specified incorrectly
- 02:31 PM RADOS Bug #63509: osd/scrub: some replica states specified incorrectly
- 54460 was made obsolete by https://github.com/ceph/ceph/pull/54482.
- 02:29 PM RADOS Bug #64346 (In Progress): TEST_dump_scrub_schedule fails from "key is query_is_future: negation:0 # expected: false # in actual: true"
- A test script error. In progress
- 02:27 PM RADOS Bug #64972 (Resolved): qa: "ceph tell 4.3a deep-scrub" command not found
- 02:25 PM RADOS Backport #65072 (Resolved): squid: rados/thrash: slow reservation response from 1 (115547ms) in cluster log
- 02:24 PM RADOS Backport #65374 (Resolved): squid: qa: "ceph tell 4.3a deep-scrub" command not found
- 02:24 PM Dashboard Bug #64321 (Pending Backport): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
- 02:23 PM Dashboard Bug #64321 (Resolved): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
- 12:08 PM Dashboard Bug #64321 (Pending Backport): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
- 12:08 PM Dashboard Bug #64321 (New): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
- 11:06 AM Dashboard Bug #64321 (Pending Backport): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
- 11:06 AM Dashboard Bug #64321 (Fix Under Review): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
- 08:46 AM Dashboard Bug #64321 (Pending Backport): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
- 08:46 AM Dashboard Bug #64321 (Fix Under Review): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
- 02:23 PM RADOS Backport #65646 (Resolved): squid: osd/scrub: must disable reservation timeout for reserver-based requests
- 02:09 PM Ceph QA QA Run #65688: wip-yuri4-testing-2024-04-29-0642
- @lflores will your PR address these?
- 01:05 PM Ceph QA QA Run #65688 (QA Needs Rerun/Rebuilt): wip-yuri4-testing-2024-04-29-0642
- @yuriw There are a high number of failures (26 nos on Rados) still
related to infra issue as mentioned in #note-3 on...
- 01:51 PM Orchestrator Bug #65810 (New): mgr/cephadm: update PROMETHEUS_API_HOST if prometheus fails over to another node
- We need to update the PROMETHEUS_API_HOST in dashboard config if prometheus fails over to another node. This process ...
- 01:46 PM Ceph Feature #64335 (Resolved): Add alerts to ceph monitoring stack for the nvmeof gateways
- 01:46 PM Ceph Backport #65539 (Resolved): squid: Add alerts to ceph monitoring stack for the nvmeof gateways
- 01:45 PM Ceph Backport #65540 (Resolved): reef: Add alerts to ceph monitoring stack for the nvmeof gateways
- 01:16 PM CephFS Bug #65580: mds/client: add dummy client feature to test client eviction
- Venky Shankar wrote in #note-7:
> Dhairya Parmar wrote in #note-6:
> > Venky Shankar wrote in #note-5:
> > > Dhair...
- 12:48 PM CephFS Bug #65580: mds/client: add dummy client feature to test client eviction
- Dhairya Parmar wrote in #note-6:
> Venky Shankar wrote in #note-5:
> > Dhairya Parmar wrote in #note-4:
> > > @vsh...
- 12:38 PM CephFS Bug #65580: mds/client: add dummy client feature to test client eviction
- Venky Shankar wrote in #note-5:
> Dhairya Parmar wrote in #note-4:
> > @vshankar @patrick can you update this track...
- 09:31 AM CephFS Bug #65580: mds/client: add dummy client feature to test client eviction
- Dhairya Parmar wrote in #note-4:
> @vshankar @patrick can you update this tracker with the discussion you guys had o...
- 01:10 PM CephFS Bug #65783 (Duplicate): qa: cluster [WRN] Health detail: HEALTH_WARN 1 osds down; Degraded data redundancy
- Duplicate of https://tracker.ceph.com/issues/65700
- 01:09 PM CephFS Bug #65803: mds: some asok commands wait with asok thread blocked
- The actual issue is that `flush journal` command is synchronous: it's blocking the admin socket thread in the mds:
...
- 01:09 PM CephFS Bug #65777 (Duplicate): qa: error during scrub thrashing
- 01:06 PM CephFS Bug #65782: qa: test_flag_scrub_mdsdir (tasks.cephfs.test_scrub_checks.TestScrubChecks)
- This does not show up in the main branch run. The test branch was testing the uninline-data[0] feature.
[0]: https://githu...
- 01:06 PM CephFS Bug #65780: qa: valgrind error: Leak_StillReachable calloc calloc _dl_check_map_versions
- Duplicate of 65779
- 01:05 PM CephFS Bug #65780 (Duplicate): qa: valgrind error: Leak_StillReachable calloc calloc _dl_check_map_versions
- 01:02 PM Orchestrator Bug #65809 (New): cephadm: ignore NVMEoF daemon after mons in staggered upgrade
- This PR https://github.com/ceph/ceph/pull/54671 is going to make the NVMEoF daemon dependent on the mons. That means ...
- 12:53 PM CephFS Bug #65801 (Triaged): mgr/snap_schedule: restrict retention spec multiplier set
- 12:50 PM CephFS Bug #65808 (New): Test failure: test_idem_unaffected_root_squash (tasks.cephfs.test_admin.TestFsAuthorizeUpdate)
- ...
- 12:49 PM CephFS Bug #65781: qa: ceph version 17.2.7-917.gd69ee407 was not installed, found 17.2.7-913.g8c431824.el8.
- How is this cephfs related, Milind?
- 12:44 PM CephFS Feature #65637: mds: continue sending heartbeats during recovery when MDS journal is large
- I'm taking this one (since I already own https://tracker.ceph.com/issues/61863)
- 12:18 PM CephFS Bug #65388: The MDS_SLOW_REQUEST warning is flapping even though the slow requests don't go away
- Leonid, any updates on this?
- 12:17 PM CephFS Bug #65604 (Triaged): dbench.sh workload times out after 3h when run with-quiescer
- 12:15 PM CephFS Bug #65604: dbench.sh workload times out after 3h when run with-quiescer
- Venky Shankar wrote in #note-2:
> There already a tracker for this. Will dig it up and link.
I guess the tracker was...
- 10:18 AM CephFS Bug #65807 (New): qa failure: test_adding_multiple_caps (tasks.cephfs.test_admin.TestFsAuthorize)
- From https://pulpito.ceph.com/vshankar-2024-05-01_17:34:00-fs-wip-vshankar-testing-20240430.111407-debug-testing-defa...
- 09:25 AM Dashboard Backport #65758 (Resolved): squid: mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
- 09:24 AM Dashboard Backport #65789 (Resolved): reef: mgr/dashboard: add prometheus federation config for mullti-cluster monitoring
- 08:40 AM crimson Bug #65806 (New): IO hangs when issuing balanced/localized reads to replica crimson osds while the pg is still peering
- The IO request in the following log never ends; this is because it's waiting for the pg to be active, but the current...
- 07:27 AM Ceph QA QA Run #65680: wip-mchangir-testing-20240429.064231-main-debug
- Milind Changire wrote in #note-2:
> 41 Jobs Failed out of 206
> Most of them are known (documented issues).
> Few ...
- 07:25 AM CephFS Bug #65766: qa: perm denied for runing find on cephtest dir
- The error starts right after generic/099 finishes....
- 06:46 AM CephFS Bug #63514: mds: avoid sending inode/stray counters as part of health warning for standby-replay
- Rishabh Dave wrote in #note-6:
> Venky, should we skip backporting this from Quincy? Or is it still valid?
Let's ...
- 05:57 AM CephFS Bug #50719: xattr returning from the dead (sic!)
- Matthew Hutchinson wrote in #note-27:
> HI, Can I get an update on this?
Hi Matthew,
Please try to reproduce it and...
- 05:55 AM Linux kernel client Bug #65563: WARNING: CPU: 7 PID: 40807 at mm/page_alloc.c:4545 __alloc_pages+0x1e7/0x270
- It's a kclient side bug.
- 05:49 AM CephFS Bug #65647: Evicted kernel client may get stuck after reconnect
- Mykola Golub wrote in #note-9:
> Xiubo Li wrote in #note-6:
> > Venky Shankar wrote in #note-5:
>
> > > The trac...
- 04:42 AM Ceph Bug #65805 (Fix Under Review): common/StackStringStream: update pointer to newly allocated memory in overflow()
- 03:14 AM Ceph Bug #65805 (Pending Backport): common/StackStringStream: update pointer to newly allocated memory in overflow()
- When the sanitizer is enabled, unittest_log fails as follows...
- 02:10 AM crimson Bug #65804 (New): CEPH_OSD_OP_CHECKSUM got "invalid argument"
- crimson-osd's handling of CEPH_OSD_OP_CHECKSUM is not idempotent; if a client request got interrupted and requeued af...
- 02:01 AM CephFS Bug #64572: workunits/fsx.sh failure
- This is a follow-up fix for it: https://github.com/ceph/ceph/pull/57275.
- 01:52 AM crimson Bug #62162: local_shared_foreign_ptr: Assertion `ptr && *ptr' failed
- See https://github.com/ceph/ceph/blob/16021434f3f18d548e35cad33faea4e5978ffe4f/src/crimson/mgr/client.cc#L102-L105
...
05/05/2024
- 09:22 PM CephFS Bug #65803 (Fix Under Review): mds: some asok commands wait with asok thread blocked
- 08:23 PM CephFS Bug #65803 (Fix Under Review): mds: some asok commands wait with asok thread blocked
- The teuthology script is often running slow, to the point that it is not able to keep up with timed events, arriving late and finding a ...
- 08:09 PM CephFS Bug #65802 (In Progress): Quiesce and rename aren't properly syncrhonized
- Detected in this run: https://pulpito.ceph.com/pdonnell-2024-05-03_22:48:16-fs-wip-pdonnell-testing-20240503.163550-d...
- 12:48 PM Ceph Bug #65791: Enable Ceph to benefit from a faster CRC32 implementation
- Small correction to the above -- this should be the PCLMUL instruction set, and not SSE4.1.
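For context on the correction above: a carry-less-multiply CRC32 path is gated on the PCLMUL CPUID bit, not SSE4.1. A minimal, hypothetical C++ sketch (not Ceph's or ISA-L's actual dispatch code) that probes the relevant feature bits with the GCC/Clang @__builtin_cpu_supports@ builtin on x86:
<pre>
#include <cstdio>

int main() {
    // GCC/Clang builtin that queries CPUID feature bits at runtime (x86 only).
    // A CRC32 backend based on carry-less multiplication needs PCLMUL, not SSE4.1.
    const bool have_pclmul = __builtin_cpu_supports("pclmul");
    const bool have_sse41  = __builtin_cpu_supports("sse4.1");
    std::printf("pclmul: %s, sse4.1: %s\n",
                have_pclmul ? "yes" : "no",
                have_sse41 ? "yes" : "no");
    // A dispatcher would select the PCLMUL-accelerated CRC32 only when
    // have_pclmul is true, and otherwise fall back to a table-driven path.
    return 0;
}
</pre>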
- 09:28 AM RADOS Bug #50608 (Fix Under Review): ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
- Still relevant:
http://telemetry.front.sepia.ceph.com:4000/d/Nvj6XTaMk/spec-search?orgId=1&var-substr_1=PrimaryLog...
05/04/2024
- 04:48 PM Ceph QA QA Run #65798 (QA Needs Approval): wip-yuriw-testing-20240503.213524- (wip-yuri7-testing)
- 02:26 PM Ceph QA QA Run #65798: wip-yuriw-testing-20240503.213524- (wip-yuri7-testing)
- retriggered centos8 https://jenkins.ceph.com/job/ceph-dev-new/80454/
- 10:57 AM CephFS Bug #65801 (Triaged): mgr/snap_schedule: restrict retention spec multiplier set
- The accepted retention spec multiplier set is a union of [a-z] and [A-Z].
This causes confusion with too many meanin...
- 10:50 AM Ceph QA QA Run #65680: wip-mchangir-testing-20240429.064231-main-debug
- 41 Jobs Failed out of 206
Most of them are known (documented issues).
Few of them are new and have been added to th...
- 07:59 AM rbd Documentation #65800: Improve "rbd flatten" documentation
- https://tracker.ceph.com/issues/40486 - "rbd migration" command documentation tracker
- 07:56 AM rbd Documentation #65800 (In Progress): Improve "rbd flatten" documentation
- build/src/pybind/mgr/dashboard/frontend/dist/en-US/default-src_app_ceph_block_block_module_ts.js: titleTex...
- 01:25 AM rgw Bug #65436: Getting Object Crashing radosgw services
- Reid Guyett wrote in #note-6:
> We are also blocked by https://tracker.ceph.com/issues/64308 in moving to 17.2.7.
...
05/03/2024
- 11:49 PM Ceph QA QA Run #65796 (QA Needs Approval): wip-yuriw-testing-20240503.181540-main (wrong wip name, should be wip-yuri2-testing*)
- 06:15 PM Ceph QA QA Run #65796 (QA Needs Approval): wip-yuriw-testing-20240503.181540-main (wrong wip name, should be wip-yuri2-testing*)
- * "PR #57216":https://github.com/ceph/ceph/pull/57216 -- mgr/balancer: set upmap_max_deviation to 1
* "PR #56937":ht... - 11:48 PM Ceph QA QA Run #65797 (QA Needs Approval): wip-yuriw-testing-20240503.213344-main (wip-yuri5-testing)
- 09:34 PM Ceph QA QA Run #65797 (QA Needs Approval): wip-yuriw-testing-20240503.213344-main (wip-yuri5-testing)
- * "PR #57015":https://github.com/ceph/ceph/pull/57015 -- bluefs: bluefs alloc unit should only be shrink
* "PR #5698... - 11:45 PM Ceph QA QA Run #65798 (QA Testing): wip-yuriw-testing-20240503.213524- (wip-yuri7-testing)
- 11:43 PM Ceph QA QA Run #65798 (QA Needs Approval): wip-yuriw-testing-20240503.213524- (wip-yuri7-testing)
- 09:35 PM Ceph QA QA Run #65798 (QA Needs Approval): wip-yuriw-testing-20240503.213524- (wip-yuri7-testing)
- * "PR #57137":https://github.com/ceph/ceph/pull/57137 -- osd: CEPH_OSD_OP_FLAG_BYPASS_CLEAN_CACHE flag is passed from...
- 11:41 PM Orchestrator Bug #65799 (Fix Under Review): cephadm: [progress WARNING root] complete: ev {UUID} does not exist
- 11:28 PM Orchestrator Bug #65799: cephadm: [progress WARNING root] complete: ev {UUID} does not exist
- some upstream users reported this issue earlier: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/VQT...
- 11:27 PM Orchestrator Bug #65799 (Fix Under Review): cephadm: [progress WARNING root] complete: ev {UUID} does not exist
- The cephadm module, while applying service specs, creates a progress event for the daemons to be added or deleted fro...
- 07:51 PM RADOS Bug #55750: mon: slow request of very long time
- Still getting this on reef 18.2.1
- 06:49 PM rgw Bug #65436: Getting Object Crashing radosgw services
- We are also blocked by https://tracker.ceph.com/issues/64308 in moving to 17.2.7.
- 12:58 PM rgw Bug #65436: Getting Object Crashing radosgw services
- So the solution is to upgrade RGW, delete and recreate the bucket?
Since we do not own or control the data being u...
- 08:48 AM rgw Bug #65436: Getting Object Crashing radosgw services
- Did you try with the error file on the old bucket?
The error file can't be fixed by upgrading Ceph. You need to delete the error file or all ...
- 04:02 PM RADOS Bug #52657 (Pending Backport): MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- 01:39 PM RADOS Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- https://github.com/ceph/ceph/pull/49619 merged
- 03:58 PM CephFS Bug #65795 (Fix Under Review): cephfs_mirror: daemon status shows KeyError: 'directory_count'
- ceph fs snapshot mirror daemon status gives KeyError: 'directory_count' when mirroring is disabled and enabled repeat...
- 03:38 PM Ceph QA QA Run #65641 (QA Needs Approval): wip-yuriw8-testing-20240424.000125-main
- 03:12 PM rgw Backport #63857: quincy: notification: etag is missing in CompleteMultipartUpload event
- please hold off on this backport until https://tracker.ceph.com/issues/65746 is resolved. these should be backported ...
- 03:07 PM rgw Bug #65746 (Fix Under Review): rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)
- 04:05 AM rgw Bug #65746 (Triaged): rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)
- I think this was a regression from https://github.com/ceph/ceph/pull/54569, which moved @meta_obj->delete_object()@ f...
- 03:05 PM rgw Bug #65794 (New): Ceph Reef RGW error response fails to be parsed during awscli create-bucket
- Using Ceph Reef v18.2.2 (with Rook v1.13.5 but it is not important here). Running tests with RGW bucket quota exceed ...
- 02:01 PM Ceph QA QA Run #65793 (QA Testing): wip-rishabh-testing-20240503.134948
- https://github.com/ceph/ceph/pull/55956
- 01:59 PM Ceph QA QA Run #65792 (QA Closed): wip-rishabh-testing-20240501.193033
- https://github.com/ceph/ceph/pull/55144
- 01:58 PM Ceph QA QA Run #65764 (QA Closed): wip-rishabh-testing-20240426.111959
- 07:09 AM Ceph QA QA Run #65764: wip-rishabh-testing-20240426.111959
- Rishabh Dave wrote in #note-2:
> There were lots of new failures along the usual ones. These new failures were cause...
- 01:45 PM bluestore Bug #58274 (Pending Backport): BlueStore::collection_list becomes extremely slow due to unbounded rocksdb iteration
- 01:38 PM bluestore Bug #58274: BlueStore::collection_list becomes extremely slow due to unbounded rocksdb iteration
- https://github.com/ceph/ceph/pull/49438 merged
- 01:43 PM CephFS Bug #65246: qa/cephfs: test_multifs_single_path_rootsquash (tasks.cephfs.test_admin.TestFsAuthorize)
- Venky, no need to backport to Quincy, right? The PR that introduced this bug hasn't been backported at all.
- 01:43 PM CephFS Bug #65246 (Pending Backport): qa/cephfs: test_multifs_single_path_rootsquash (tasks.cephfs.test_admin.TestFsAuthorize)
- 01:42 PM Ceph QA QA Run #65560 (QA Closed): wip-yuri5-testing-2024-04-17-1400
- 07:16 AM Ceph QA QA Run #65560 (QA Approved): wip-yuri5-testing-2024-04-17-1400
- Rados approved: https://tracker.ceph.com/projects/rados/wiki/MAIN#httpstrackercephcomissues65560
- 01:10 PM Ceph Bug #65791 (New): Enable Ceph to benefit from a faster CRC32 implementation
- ISA-L, a component in Ceph (https://github.com/ceph/isa-l/tree/bee5180a1517f8b5e70b02fcd66790c623536c5d) provides mul...
- 11:58 AM Dashboard Backport #65789 (In Progress): reef: mgr/dashboard: add prometheus federation config for mullti-cluster monitoring
- 11:54 AM Dashboard Backport #65789 (Resolved): reef: mgr/dashboard: add prometheus federation config for mullti-cluster monitoring
- https://github.com/ceph/ceph/pull/57255
- 11:56 AM Dashboard Backport #65790 (In Progress): squid: mgr/dashboard: add prometheus federation config for mullti-cluster monitoring
- 11:54 AM Dashboard Backport #65790 (Resolved): squid: mgr/dashboard: add prometheus federation config for mullti-cluster monitoring
- https://github.com/ceph/ceph/pull/57254
- 11:49 AM Dashboard Bug #65788 (Resolved): mgr/dashboard: add prometheus federation config for mullti-cluster monitoring
- Introduce prometheus federation in ceph dashboard. This is done by adding a federate job to the prometheus configurat...
- 11:16 AM Dashboard Backport #65787 (New): squid: mgr/dashboard: fix cluster filter typo in multi-cluster-overview grafana dashboard
- 11:16 AM Dashboard Backport #65786 (New): reef: mgr/dashboard: fix cluster filter typo in multi-cluster-overview grafana dashboard
- 11:07 AM Dashboard Backport #65785 (New): squid: mgr/dashboard: fix cluster filter typo in multi-cluster-overview grafana dashboard
- 11:05 AM Dashboard Bug #65760 (Pending Backport): mgr/dashboard: fix cluster filter typo in multi-cluster-overview grafana dashboard
- 10:45 AM Orchestrator Feature #53562: cephadm doesn't support osd crush_location_hook
- Any update on this feature request? We use custom crush location hooks as well.
- 09:03 AM Orchestrator Feature #65784 (Fix Under Review): bump loki/promtail to 3.0.0
- We currently ship with old versions of loki/promtail.
loki/promtail released 3.0.0 recently. - 08:56 AM Ceph QA QA Run #65454 (QA Needs Approval): wip-vshankar-testing-20240411.061452
- 08:36 AM CephFS Bug #65350 (Pending Backport): mgr/snap_schedule: restore yearly spec from uppercase Y to lowercase y
- 07:58 AM CephFS Bug #63514: mds: avoid sending inode/stray counters as part of health warning for standby-replay
- Venky, should we skip backporting this from Quincy? Or is it still valid?
- 07:57 AM CephFS Bug #63514 (Pending Backport): mds: avoid sending inode/stray counters as part of health warning for standby-replay
- 07:44 AM CephFS Feature #61866 (Pending Backport): MDSMonitor: require --yes-i-really-mean-it when failing an MDS with MDS_HEALTH_TRIM or MDS_HEALTH_CACHE_OVERSIZED health warnings
- 07:42 AM CephFS Bug #65314: valgrind error: Leak_PossiblyLost posix_memalign UnknownInlinedFun ceph::buffer::v15_2_0::list::refill_append_space(unsigned int)
- main:
https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-debug-testi... - 07:40 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- main:
https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-debug-testi... - 07:32 AM Linux kernel client Bug #65563: WARNING: CPU: 7 PID: 40807 at mm/page_alloc.c:4545 __alloc_pages+0x1e7/0x270
- Rishabh Dave wrote in #note-5:
> @Xiubo @Venky, The PR has been merged since QA run was successful. IMO this issue w... - 07:30 AM Linux kernel client Bug #65563: WARNING: CPU: 7 PID: 40807 at mm/page_alloc.c:4545 __alloc_pages+0x1e7/0x270
- @Xiubo @Venky, The PR has been merged since QA run was successful. IMO this issue will need backports too. Please che...
- 07:31 AM CephFS Bug #64927: qa/cephfs: test_cephfs_mirror_blocklist raises "KeyError: 'rados_inst'"
- main:
https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-debug-testi... - 07:25 AM CephFS Bug #65647: Evicted kernel client may get stuck after reconnect
- Mykola, thank you for the update. It's very likely these config settings have exposed a bug in the MDS. I did a quick s...
- 07:21 AM CephFS Bug #65783: qa: cluster [WRN] Health detail: HEALTH_WARN 1 osds down; Degraded data redundancy
- main:
https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-debug-testi... - 07:07 AM CephFS Bug #65783 (Duplicate): qa: cluster [WRN] Health detail: HEALTH_WARN 1 osds down; Degraded data redundancy
- "Teuthology Job":https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-d...
- 07:19 AM teuthology Bug #62937: Command failed on smithi027 with status 3: 'sudo logrotate /etc/logrotate.d/ceph-test.conf'
- main:
https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-debug-testi... - 07:18 AM CephFS Fix #65579: mds: use _exit for QA killpoints rather than SIGABRT
- Neeraj, please take this one.
- 02:28 AM CephFS Fix #65579: mds: use _exit for QA killpoints rather than SIGABRT
- Patrick Donnelly wrote in #note-2:
> Venky Shankar wrote in #note-1:
> > @pdonnell Are you talking about TestShutdo... - 07:11 AM Orchestrator Bug #65017: cephadm: log_channel(cephadm) log [ERR] : Failed to connect to smithi090 (10.0.0.9). Permission denied
- /a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7681133/
- 07:07 AM Orchestrator Bug #64118: cephadm: RuntimeError: Failed command: apt-get update: E: The repository 'https://download.ceph.com/debian-quincy jammy Release' does not have a Release file.
- /a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7681072/...
- 07:01 AM CephFS Bug #65564: Test failure: test_snap_schedule_subvol_and_group_arguments_08 (tasks.cephfs.test_snap_schedules.TestSnapSchedulesSubvolAndGroupArguments)
- main:
http://qa-proxy.ceph.com/teuthology/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-... - 06:56 AM CephFS Bug #65782 (New): qa: test_flag_scrub_mdsdir (tasks.cephfs.test_scrub_checks.TestScrubChecks)
- ...
- 06:50 AM teuthology Bug #61576: teuthology.exceptions.AnsibleFailedError - Failed to manage policy for boolean nagios_run_sudo
- /a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7681077/
- 06:47 AM CephFS Bug #65781 (New): qa: ceph version 17.2.7-917.gd69ee407 was not installed, found 17.2.7-913.g8c431824.el8.
- ...
- 06:44 AM Orchestrator Bug #63784: qa/standalone/mon/mkfs.sh:'mkfs/a' already exists and is not empty: monitor may already exist
- /a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7681112/
- 06:42 AM RADOS Bug #65517: rados/thrash-erasure-code-crush-4-nodes: ceph task fails at getting monitors
- /a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7681014
/a/yuriw-2024-04-... - 06:40 AM CephFS Bug #65780 (Duplicate): qa: valgrind error: Leak_StillReachable calloc calloc _dl_check_map_versions
- ...
- 06:38 AM CephFS Bug #65779 (New): qa: valgrind error: Leak_StillReachable calloc calloc _dl_check_map_versions
- ...
- 06:37 AM rgw Bug #59380: rados/singleton-nomsgr: test failing from "Health check failed: 1 full osd(s) (OSD_FULL)" and "Health check failed: 1 filesystem is offline (MDS_ALL_DOWN)"
- /a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7681095/
/a/yuriw-2024-04-... - 06:30 AM CephFS Bug #65778 (New): qa: valgrind error: Leak_StillReachable malloc malloc strdup
- ...
- 06:25 AM RADOS Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
- /a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7680976/
/a/yuriw-2024-04... - 06:23 AM CephFS Bug #57677: qa: "1 MDSs behind on trimming (MDS_TRIM)"
- main:
https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-debug-testi... - 06:20 AM CephFS Bug #62658: error during scrub thrashing: reached maximum tries (31) after waiting for 900 seconds
- main:
https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-debug-testi... - 06:11 AM CephFS Bug #57676: qa: error during scrub thrashing: rank damage found: {'backtrace'}
- main:
https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-debug-testi... - 06:00 AM CephFS Bug #65777 (Duplicate): qa: error during scrub thrashing
- ...
- 05:40 AM RADOS Bug #63198: rados/thrash: AssertionError: wait_for_recovery: failed before timeout expired
- /a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7680980/
/a/yuriw-2024-04-... - 05:38 AM devops Backport #65776 (New): reef: ceph-mgr-dashboard RPM requires python3-werkzeug
- 05:38 AM devops Backport #65775 (New): squid: ceph-mgr-dashboard RPM requires python3-werkzeug
- 05:31 AM RADOS Bug #65183: Overriding an EC pool needs the "--yes-i-really-mean-it" flag in addition to "force"
- /a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7680989
/a/yuriw-2024-04-... - 05:30 AM devops Bug #65693 (Pending Backport): ceph-mgr-dashboard RPM requires python3-werkzeug
- 05:27 AM RADOS Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- /a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7680957/
/a/yuriw-2024-04... - 05:25 AM Dashboard Bug #65774 (New): mgr/dashboard: Filter alerts based on cluster fsid and do not allow to connect clusters with version less than hub cluster in multi-cluster
1. Since we have a new cluster variable in the prometheus metrics, we need to filter the alerts based on the cluste...- 05:20 AM Orchestrator Bug #64871: rados/cephadm/workunits: Health check failed: 1 failed cephadm daemon(s) (CEPHADM_FAILED_DAEMON)" in cluster log
- /a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7681108/
- 05:16 AM RADOS Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
- /a/yuriw-2024-04-30_14:17:59-rados-wip-yuri5-testing-2024-04-17-1400-distro-default-smithi/7680978
- 04:34 AM bluestore Bug #65735: OSDs failed to restart when doing crimson-osd:thrash tests
- This was caused by mishandling of CEPH_OSD_OP_CREATE in crimson; a new issue has been created: https://tracker.ce...
- 04:30 AM crimson Bug #65773 (New): OSDs failed to restart when doing crimson-rados:thrash tests
- ...
- 03:56 AM rgw Bug #65772 (New): Bucket lifecycle not working while bucket versioning is suspended
- How to reproduce:
1. Create bucket, enable bucket versioning then suspend bucket versioning.
2. Upload file to b... - 01:07 AM Ceph QA QA Run #65771 (QA Approved): wip-pdonnell-testing-20240503.163550-debug
- * "PR #57226":https://github.com/ceph/ceph/pull/57226 -- common: mark assert-only variables as unused
* "PR #57192":...
05/02/2024
- 11:59 PM Ceph QA QA Run #65594 (QA Needs Approval): wip-yuriw11-testing-20240501.200505-squid
- 08:09 PM Ceph QA QA Run #65594: wip-yuriw11-testing-20240501.200505-squid
- https://shaman.ceph.com/builds/ceph/wip-yuriw11-testing-20240501.200505-squid/f273489c6dcd4bc88409993babd09dd99491162c/
- 07:44 PM Ceph QA QA Run #65594: wip-yuriw11-testing-20240501.200505-squid
- rebasing
- 07:39 PM Ceph QA QA Run #65594: wip-yuriw11-testing-20240501.200505-squid
- Laura Flores wrote in #note-12:
> @yuriw I'm sorry to request another rebase- there has been a new commit added to h... - 07:06 PM Ceph QA QA Run #65594: wip-yuriw11-testing-20240501.200505-squid
- @yuriw I'm sorry to request another rebase- there has been a new commit added to https://github.com/ceph/ceph/pull/57...
- 08:50 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- I did 400 deployments (2 runs of 200 deployments on separate vms) with 0 ceph issues.
Using your image pushed to m... - 08:18 PM rgw Backport #65636 (Resolved): squid: release note for rgw_realm init
- 07:12 PM rgw Bug #65668 (Fix Under Review): Notification: Persistent queue not deleted when topic is deleted via radosgw-admin
- 07:09 PM Ceph QA QA Run #65688 (QA Needs Approval): wip-yuri4-testing-2024-04-29-0642
- 07:09 PM Ceph QA QA Run #65688: wip-yuri4-testing-2024-04-29-0642
- @sseshasa rerunning
Please assign it back to me in the future if you need a rerun or else - 03:56 PM Ceph QA QA Run #65688 (QA Needs Rerun/Rebuilt): wip-yuri4-testing-2024-04-29-0642
- @yuriw I updated the Rados analysis here: https://tracker.ceph.com/projects/rados/wiki/SQUID#httpstrackercephcomissue...
- 05:40 PM rgw Bug #65436: Getting Object Crashing radosgw services
- Hello,
I was able to test in 17.2.7 and the rgw service is still crashing with the same error message.... - 05:27 PM rgw Backport #65640 (Resolved): squid: [rgw][accounts] bucket quota management at account-level
- 05:17 PM CephFS Bug #65770 (New): qa: failed to be set on mds daemons: {'mds.imported', 'mds.exported'}
- This issue has been seen in QA runs for a couple of months but it incorrectly got marked as known issue. https://trac...
- 03:47 PM RADOS Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
- Observed on Squid:
/a/yuriw-2024-04-30_03:21:19-rados-wip-yuri4-testing-2024-04-29-0642-distro-default-smithi/7680132 - 03:44 PM Ceph Backport #65391: squid: osd/scrub: "reservation requested while still reserved" error in cluster log
- Just for tracking - lots of failures were reported on the following squid run (22 failures):
https://pulpito.ceph.com/yu... - 03:40 PM RADOS Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
- Observed on Squid:
/a/yuriw-2024-04-30_03:21:19-rados-wip-yuri4-testing-2024-04-29-0642-distro-default-smithi/768019... - 03:40 PM RADOS Bug #65183: Overriding an EC pool needs the "--yes-i-really-mean-it" flag in addition to "force"
- Observed on Squid:
/a/yuriw-2024-04-30_03:21:19-rados-wip-yuri4-testing-2024-04-29-0642-distro-default-smithi/768015... - 03:39 PM RADOS Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- Observed on Squid:
/a/yuriw-2024-04-30_03:21:19-rados-wip-yuri4-testing-2024-04-29-0642-distro-default-smithi/768024... - 03:37 PM rgw Feature #65769: rgw: make incomplete multipart upload part of bucket check efficient
- quincy backport ready to go -- https://github.com/ceph/ceph/pull/57244
- 03:36 PM rgw Feature #65769 (Fix Under Review): rgw: make incomplete multipart upload part of bucket check efficient
- Previously the incomplete multipart portion of bucket check would list all entries in the multipart namespace across ...
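For illustration, a small boto3 sketch (client setup and bucket name are placeholder assumptions) of listing a bucket's in-progress multipart uploads, i.e. the per-bucket namespace the check walks:
<pre><code class="python">
# Illustrative sketch; the endpoint and bucket name are placeholders.
import boto3

s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8080")

paginator = s3.get_paginator("list_multipart_uploads")
for page in paginator.paginate(Bucket="mybucket"):
    for upload in page.get("Uploads", []):
        # Each entry is an incomplete multipart upload (object key + UploadId).
        print(upload["Key"], upload["UploadId"])
</code></pre>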
- 03:35 PM RADOS Bug #65768 (New): rados/verify: Health check failed: 1 osds down (OSD_DOWN)" in cluster log
- This is observed on squid. I couldn't find a tracker on main related to this test.
A more proper analysis on whether... - 03:31 PM bluestore Backport #65358 (Fix Under Review): quincy: BlueFS log runway space exhausted
- 03:31 PM bluestore Backport #65356 (Fix Under Review): reef: BlueFS log runway space exhausted
- 03:30 PM CephFS Bug #65716: mds: dir merge can't progress due to fragment nested pins, blocking the quiesce_path and causing a quiesce timeout
- See also:...
- 03:27 PM CephFS Bug #65716: mds: dir merge can't progress due to fragment nested pins, blocking the quiesce_path and causing a quiesce timeout
- ...
- 08:52 AM CephFS Bug #65716: mds: dir merge can't progress due to fragment nested pins, blocking the quiesce_path and causing a quiesce timeout
- @pdonnell I'm not ready to investigate why the freezing takes so long. Maybe it's one of those cases you mentioned to...
- 08:31 AM CephFS Bug #65716: mds: dir merge can't progress due to fragment nested pins, blocking the quiesce_path and causing a quiesce timeout
- I made some progress.
the directory represented by the inode 0x1000000003 is owned by rank 0, we are the replica (... - 03:29 PM rgw Backport #65767 (In Progress): squid: rgw_multi.tests.test_topic_notification_sync: PutBucketNotificationConfiguration fails with ConcurrentModification
- 02:58 PM rgw Backport #65767 (In Progress): squid: rgw_multi.tests.test_topic_notification_sync: PutBucketNotificationConfiguration fails with ConcurrentModification
- https://github.com/ceph/ceph/pull/57242
- 03:16 PM bluestore Backport #65357 (Resolved): squid: BlueFS log runway space exhausted
- 02:58 PM rgw Bug #65590 (Pending Backport): rgw_multi.tests.test_topic_notification_sync: PutBucketNotificationConfiguration fails with ConcurrentModification
- 02:39 PM Dashboard Feature #50327 (In Progress): mgr/dashboard: add/edit lifecycle policy
- 02:38 PM RADOS Bug #65686 (Fix Under Review): ECBackend doesn't pass CEPH_OSD_OP_FLAG_BYPASS_CLEAN_CACHE flag when scrubbing
- 02:16 PM crimson Bug #62162: local_shared_foreign_ptr: Assertion `ptr && *ptr' failed
- This issue impacts most test runs lately, bumping up.
- 02:16 PM crimson Bug #62162: local_shared_foreign_ptr: Assertion `ptr && *ptr' failed
- Looking at:
> OSDs 2 and 3:
> https://pulpito.ceph.com/matan-2024-05-02_11:41:00-crimson-rados-wip-crimson-only-coh... - 01:26 PM crimson Bug #62162: local_shared_foreign_ptr: Assertion `ptr && *ptr' failed
- OSDs 2 and 3:
https://pulpito.ceph.com/matan-2024-05-02_11:41:00-crimson-rados-wip-crimson-only-coherent-log-and-at_... - 01:52 PM CephFS Bug #65766 (New): qa: perm denied for runing find on cephtest dir
- After the test suite finishes, during teardown/unwinding, running @find /home/ubuntu/cephtest@ unexpectedly pr...
- 01:51 PM CephFS Bug #50719: xattr returning from the dead (sic!)
- Hi, can I get an update on this?
- 01:50 PM CephFS Bug #62664 (Fix Under Review): ceph-fuse: failed to remount for kernel dentry trimming; quitting!
- Jakob Haufe wrote in #note-4:
> I created a minimal PR implementing this: https://github.com/ceph/ceph/pull/57170
>... - 01:17 PM CephFS Fix #65579: mds: use _exit for QA killpoints rather than SIGABRT
- Venky Shankar wrote in #note-1:
> @pdonnell Are you talking about TestShutdownKillpoints() in test_failover? If yes,... - 10:17 AM CephFS Fix #65579: mds: use _exit for QA killpoints rather than SIGABRT
- @pdonnell Are you talking about TestShutdownKillpoints() in test_failover? If yes, you are suggesting changing, e.g.:...
- 01:13 PM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
- Venky Shankar wrote in #note-40:
> It's not the metrics stuff but a bug when the MDS sends back a client_session(ope... - 12:08 PM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
- It's not the metrics stuff but a bug when the MDS sends back a client_session(open) during client reconnect (post mds...
- 12:55 PM RADOS Bug #65765 (New): squid: rados/test.sh: LibRadosWatchNotifyECPP.WatchNotify test of api_watch_notify_pp suite didn't complete.
- The following failure was seen on Squid:
/a/yuriw-2024-04-30_03:21:19-rados-wip-yuri4-testing-2024-04-29-0642-dist... - 12:47 PM Dashboard Bug #47612: ERROR: setUpClass (tasks.mgr.dashboard.test_health.HealthTest)
- Suggest updating your tearDown/setUp procedures to mirror what CephFSTestCase is doing.
- 12:29 PM Orchestrator Backport #65763 (In Progress): reef: cephadm: set "osd - profile rbd" for nvmeof service
- 12:09 PM Orchestrator Backport #65763 (In Progress): reef: cephadm: set "osd - profile rbd" for nvmeof service
- https://github.com/ceph/ceph/pull/57234
- 12:28 PM Orchestrator Backport #65762 (In Progress): squid: cephadm: set "osd - profile rbd" for nvmeof service
- 12:09 PM Orchestrator Backport #65762 (In Progress): squid: cephadm: set "osd - profile rbd" for nvmeof service
- https://github.com/ceph/ceph/pull/57233
- 12:26 PM CephFS Bug #65364: Provide metrics support for the Target Cluster Disconnection status
- Copying from the bz update:
I had a chat about this with Greg. Unfortunately, the messenger layer isn't the most a... - 12:25 PM CephFS Bug #65564 (Fix Under Review): Test failure: test_snap_schedule_subvol_and_group_arguments_08 (tasks.cephfs.test_snap_schedules.TestSnapSchedulesSubvolAndGroupArguments)
- 12:12 PM Ceph QA QA Run #65764: wip-rishabh-testing-20240426.111959
- There were lots of new failures along the usual ones. These new failures were caused by - https://github.com/ceph/cep...
- 12:10 PM Ceph QA QA Run #65764 (QA Closed): wip-rishabh-testing-20240426.111959
- * https://github.com/ceph/ceph/pull/56981
* https://github.com/ceph/ceph/pull/56846
* https://github.com/ceph/ceph/... - 12:04 PM Orchestrator Bug #65691 (Pending Backport): cephadm: set "osd - profile rbd" for nvmeof service
- 12:02 PM CephFS Bug #65761: valgrind error: Leak_StillReachable calloc calloc _dl_check_map_versions
- https://pulpito.ceph.com/rishabh-2024-04-28_11:41:23-fs-wip-rishabh-testing-20240426.111959-testing-default-smithi/76...
- 12:00 PM CephFS Bug #65761 (New): valgrind error: Leak_StillReachable calloc calloc _dl_check_map_versions
- First saw these failures in a QA run for CephFS PRs, when I ran failed jobs from that run against main branch version...
- 12:00 PM Orchestrator Bug #65739 (In Progress): Cephadm adopt doesn't support "--no-cgroups-split" flag
- 11:47 AM Dashboard Bug #65760 (Pending Backport): mgr/dashboard: fix cluster filter typo in multi-cluster-overview grafana dashboard
- Fix extra braces in multi cluster overview grafana json
- 11:00 AM rgw Backport #65244 (In Progress): squid: RGW/s3select : several issues, s3select related, some caused a crash.
- 10:45 AM CephFS Bug #65604: dbench.sh workload times out after 3h when run with-quiescer
- There's already a tracker for this. Will dig it up and link it.
- 10:45 AM rgw Backport #65245 (In Progress): reef: RGW/s3select : several issues, s3select related, some caused a crash.
- 10:19 AM CephFS Bug #65572: Command failed (workunit test fs/snaps/untar_snap_rm.sh) on smithi155 with status 1
- Lowered priority since this seems to be infra noise.
- 10:13 AM RADOS Bug #65227 (Need More Info): noscrub cluster flag prevents deep-scrubs from starting
- I am not able to reproduce the problem. Can you attach debug logs (including the commands used to recreate the sce...
- 10:06 AM Dashboard Bug #65218 (Pending Backport): mgr/dashboard: Grafana ceph-cluster.json doesn't support cluster label
- 10:02 AM CephFS Backport #65406 (In Progress): quincy: mds: Reduce log level for messages when mds is stopping
- 09:54 AM CephFS Backport #65405 (In Progress): reef: mds: Reduce log level for messages when mds is stopping
- 09:49 AM bluestore Fix #58759: BlueFS log runway space exhausted
- I found that the commits have been processed in squid. In that case we can remove it.
- 09:23 AM Dashboard Bug #64321 (Pending Backport): mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
- 07:43 AM CephFS Backport #65404 (In Progress): squid: mds: Reduce log level for messages when mds is stopping
- 07:39 AM CephFS Bug #65660: mds: drop client metrics during recovery
- Venky Shankar wrote in #note-5:
> Patrick Donnelly wrote in #note-4:
> > Christopher Hoffman wrote in #note-2:
> >... - 05:53 AM CephFS Bug #65660: mds: drop client metrics during recovery
- Patrick Donnelly wrote in #note-4:
> Christopher Hoffman wrote in #note-2:
> > >there's little reason to record his... - 07:31 AM CephFS Bug #65580: mds/client: add dummy client feature to test client eviction
- @vshankar @patrick can you update this tracker with the discussion you guys had on the call post standup on tuesday? ...
- 06:38 AM Dashboard Backport #65759 (In Progress): quincy: mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
- 06:28 AM Dashboard Backport #65759 (In Progress): quincy: mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
- https://github.com/ceph/ceph/pull/57221
- 06:34 AM Dashboard Backport #65758 (In Progress): squid: mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
- 06:28 AM Dashboard Backport #65758 (Resolved): squid: mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
- https://github.com/ceph/ceph/pull/57220
- 06:33 AM Dashboard Backport #65756 (In Progress): reef: mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
- 06:17 AM Dashboard Backport #65756 (In Progress): reef: mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
- https://github.com/ceph/ceph/pull/57219
- 06:17 AM Dashboard Backport #65757 (New): squid: mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
- 06:06 AM Dashboard Bug #65698 (Pending Backport): mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
- 05:56 AM Dashboard Backport #65755 (New): reef: mgr/dashboard: grafana dashboard doesn't exist when anonymous_access is enabled
- 05:55 AM Dashboard Backport #65754 (New): squid: mgr/dashboard: grafana dashboard doesn't exist when anonymous_access is enabled
- 05:52 AM Dashboard Bug #64080 (Resolved): mgr/dashboard: In rgw multisite, during zone creation access/secret key should not be compulsory; provide an edit option to set these keys
- 05:52 AM Dashboard Backport #64791 (Resolved): squid: mgr/dashboard: In rgw multisite, during zone creation access/secret key should not be compulsory; provide an edit option to set these keys
- 05:49 AM Dashboard Bug #65534 (Pending Backport): mgr/dashboard: grafana dashboard doesn't exist when anonymous_access is enabled
- 02:37 AM crimson Bug #65753: [crimson] OSD deployment fails
- Duplicate of https://tracker.ceph.com/issues/65752
Do not have permission to delete, please ignore - 02:34 AM crimson Bug #65753 (Duplicate): [crimson] OSD deployment fails
- While deploying OSDs on a Crimson cluster, the following error was observed...
- 02:33 AM crimson Bug #65752 (Need More Info): [crimson] OSD deployment fails
- While deploying OSDs on a Crimson cluster, the following error was observed...
05/01/2024
- 11:39 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
- Partial fix for some of the warnings: https://github.com/ceph/ceph/pull/57218
- 08:01 AM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
- /a/yuriw-2024-04-20_01:10:46-rados-wip-yuri7-testing-2024-04-18-1351-reef-distro-default-smithi/7664127
/a/yuriw-2024... - 10:48 PM Ceph QA QA Run #65641: wip-yuriw8-testing-20240424.000125-main
- rebased and pushed a new branch wip-yuri8-testing-2024-05-01-1547 meanwhile
https://shaman.ceph.com/builds/ceph/wip-... - 10:13 PM Ceph QA QA Run #65641: wip-yuriw8-testing-20240424.000125-main
- still can't schedule
asked for help https://ceph-storage.slack.com/archives/C04SYTAN25P/p1714601019211669 - 10:47 PM RADOS Tasks #65751 (New): Add OS type and version to "ceph tell mon.* sessions" json dump
- Related BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2154808
Context from BZ:... - 10:26 PM teuthology Bug #65750 (New): "RuntimeError: Read beyond file size detected, file is corrupted."
- Lately I have been seeing errors during suite scheduling; see below.
Here is the run and a log snippet:
https://pulpi... - 10:17 PM Ceph QA QA Run #65349 (QA Needs Approval): wip-yuri3-testing-2024-04-05-0825
- @ksirivad pls review when ready
- 08:26 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
- https://shaman.ceph.com/builds/ceph/wip-yuri3-testing-2024-04-05-0825/a53b05d03701e4d0ba0c9aadc7431842129aabf9/
- 03:17 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
- Kamoltat (Junior) Sirivadhna wrote in #note-17:
> @yuriw Rebase and re-run please, I think now infra might be fixed... - 03:16 PM Ceph QA QA Run #65349 (QA Needs Rerun/Rebuilt): wip-yuri3-testing-2024-04-05-0825
- 02:56 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
- @yuriw Rebase and re-run please, I think the infra might be fixed now, but rebasing because it has been a while since the ...
- 10:12 PM RADOS Bug #65749 (Fix Under Review): osd_max_pg_per_osd_hard_ratio 3 is set too low for real life
- 10:00 PM RADOS Bug #65749 (In Progress): osd_max_pg_per_osd_hard_ratio 3 is set too low for real life
- 09:56 PM RADOS Bug #65749 (Fix Under Review): osd_max_pg_per_osd_hard_ratio 3 is set too low for real life
- In the field this issue comes up very often. It is quite disruptive because PGs are stuck in activating state and the...
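As a rough back-of-the-envelope sketch (assuming mon_max_pg_per_osd at its common default of 250, and the documented behaviour that an OSD stops accepting new PGs past mon_max_pg_per_osd * osd_max_pg_per_osd_hard_ratio):
<pre><code class="python">
# Back-of-the-envelope only; both values below are assumed defaults, not read
# from a live cluster.
mon_max_pg_per_osd = 250            # assumed default
osd_max_pg_per_osd_hard_ratio = 3   # the default this tracker proposes to raise

hard_limit = mon_max_pg_per_osd * osd_max_pg_per_osd_hard_ratio
print(f"an OSD refuses to instantiate new PGs beyond ~{hard_limit} PGs")
# Transient overshoot during recovery/backfill can hit this ceiling, which is
# when PGs get stuck in the activating state described above.
</code></pre>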
- 09:44 PM CephFS Bug #65716: mds: dir merge can't progress due to fragment nested pins, blocking the quiesce_path and causing a quiesce timeout
- The problem here doesn't seem to be quiesce-related.
We can see that the path traversal didn't attempt to authpin ... - 12:31 PM CephFS Bug #65716: mds: dir merge can't progress due to fragment nested pins, blocking the quiesce_path and causing a quiesce timeout
- Leonid Usov wrote in #note-1:
> @pdonnell, so is this a deadlock between the operations, or just an unfortunate timi... - 10:34 AM CephFS Bug #65716: mds: dir merge can't progress due to fragment nested pins, blocking the quiesce_path and causing a quiesce timeout
- @pdonnell, so is this a deadlock between the operations, or just an unfortunate timing of the quiesce which would suc...
- 09:38 PM mgr Bug #65748 (Fix Under Review): Change default upmap_max_deviation to 1
- 09:07 PM mgr Bug #65748 (Fix Under Review): Change default upmap_max_deviation to 1
- Field experience shows that the default upmap_max_deviation of 5 is not effective for reaching a well-balanced cluster. This is ...
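For intuition, a tiny sketch with hypothetical per-OSD PG counts (assuming the balancer only acts on OSDs whose PG count deviates from the mean by more than upmap_max_deviation):
<pre><code class="python">
# Hypothetical PG counts; this only illustrates what the two thresholds tolerate.
pg_per_osd = [98, 100, 103, 95, 104]
mean = sum(pg_per_osd) / len(pg_per_osd)  # 100.0

for max_deviation in (5, 1):  # current default vs. proposed default
    outliers = [n for n in pg_per_osd if abs(n - mean) > max_deviation]
    print(f"max_deviation={max_deviation}: {len(outliers)} OSD(s) considered unbalanced")
# With 5, none of these OSDs trigger balancing; with 1, four of them do.
</code></pre>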
- 07:37 PM Ceph QA QA Run #65594 (QA Needs Rerun/Rebuilt): wip-yuriw11-testing-20240501.200505-squid
- @yuriw can you rebase this branch? There are way too many failures due to:...
- 02:58 PM Ceph QA QA Run #65594: wip-yuriw11-testing-20240501.200505-squid
- I am rescheduling the rados suite because everything died in the last run. :/
- 07:11 PM Ceph Feature #65747 (In Progress): common/admin_socket: support saving json output to a file local to the daemon
- 07:08 PM Ceph Feature #65747 (In Progress): common/admin_socket: support saving json output to a file local to the daemon
- The @ceph tell mds.X cache dump@ and @ceph tell mds.X ops@ commands have a useful @--path@ argument that directs the ...
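What operators do today is capture the JSON on the admin host, roughly as in the sketch below (daemon id and output path are placeholders); the feature would instead let the daemon write the dump to a file on its own host:
<pre><code class="python">
# Current workaround sketch: run "ceph tell" from an admin host and save stdout.
# The daemon id (mds.0) and output path are placeholders.
import subprocess

result = subprocess.run(
    ["ceph", "tell", "mds.0", "ops", "--format=json"],
    capture_output=True, text=True, check=True,
)
with open("/tmp/mds-0-ops.json", "w") as f:
    f.write(result.stdout)
</code></pre>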
- 06:26 PM RADOS Backport #63400 (Resolved): reef: pybind: ioctx.get_omap_keys asserts if start_after parameter is non-empty
- 02:28 PM RADOS Backport #63400: reef: pybind: ioctx.get_omap_keys asserts if start_after parameter is non-empty
- Igor Fedotov wrote in #note-2:
> https://github.com/ceph/ceph/pull/54358
merged - 06:04 PM rgw Bug #65746 (Pending Backport): rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)
+ # XXXX re-trying the complete is failing in RGW due to an internal error that appears not caused
+ # check...- 05:50 PM CephFS Tasks #64133 (Resolved): Make pjd work on fscrypt
- pjd completes and passes all tests.
- 05:38 PM CephFS Tasks #65745 (Resolved): RMW fail when on end of block or file
- ...
- 05:35 PM CephFS Tasks #65745 (Resolved): RMW fail when on end of block or file
- An RMW will fail when at the end boundary of a block or file.
See:... - 05:33 PM rbd Feature #65624: [pybind] expose CLONE_FORMAT and FLATTEN image options
- While working on this, https://tracker.ceph.com/issues/65743 and https://tracker.ceph.com/issues/65744 were discovered.
- 05:27 PM rbd Feature #65624 (Fix Under Review): [pybind] expose CLONE_FORMAT and FLATTEN image options
- 05:31 PM rbd Bug #65744 (New): FORMAT and CLONE_FORMAT image options accept bogus values
- In particular, for RBD_IMAGE_OPTION_CLONE_FORMAT, 1 selects clone v1 and everything else (i.e. 0, 2, 3, ...) selects ...
- 05:10 PM rbd Bug #65743 (New): migration of a clone with --flatten doesn't fully detach from the parent
- ...
- 05:05 PM rgw Bug #65664: Crash observed in boost::asio module related to stream.async_shutdown()
- as discussed, we'll revert this for main/squid until we have a chance to validate the fix. the reverts are tracked in...
- 05:03 PM rgw Bug #65742 (Fix Under Review): beast: revert changes to ssl async_shutdown()
- 04:47 PM rgw Bug #65742 (Fix Under Review): beast: revert changes to ssl async_shutdown()
- the crash tracked in https://tracker.ceph.com/issues/65664 was introduced by https://github.com/ceph/ceph/pull/55967....
- 04:31 PM rgw Feature #65741: rgw: implement RestrictPublicBuckets from Blocking public access
- RFC: https://github.com/ceph/ceph/pull/57206
- 04:22 PM rgw Feature #65741 (New): rgw: implement RestrictPublicBuckets from Blocking public access
- Currently, setting RestrictPublicBuckets has no effect on the bucket.
ref. https://docs.aws.amazon.com/AmazonS3/late... - 04:27 PM rgw Documentation #50084 (Won't Fix): notifications: document behavior in case of multisite
- 03:54 PM rgw Documentation #50084: notifications: document behavior in case of multisite
- starting from squid, topics and notifications are replicated between sites.
- 04:27 PM rgw Documentation #49649 (Won't Fix): add information on the system objects holding notifications
- 03:52 PM rgw Documentation #49649: add information on the system objects holding notifications
- the object format was changed as part of the squid release.
we should probably not document that. - 03:43 PM CephFS Backport #65740 (New): squid: mds: missing policylock acquisition for quiesce
- 03:41 PM Orchestrator Bug #65739: Cephadm adopt doesn't support "--no-cgroups-split" flag
- PR
https://github.com/ceph/ceph/pull/57205
waiting for review - 02:28 PM Orchestrator Bug #65739: Cephadm adopt doesn't support "--no-cgroups-split" flag
- Working on a PR; it's a very simple fix.
- 02:09 PM Orchestrator Bug #65739 (In Progress): Cephadm adopt doesn't support "--no-cgroups-split" flag
- Attempting to adopt a legacy daemon to cephadm with '--no-cgroups-split' fails due to
"cephadm: error: unrecognized ... - 03:36 PM CephFS Bug #65595 (Pending Backport): mds: missing policylock acquisition for quiesce
- 03:31 PM rgw Bug #65656 (Fix Under Review): Reduce default thread pool size
- 03:30 PM Infrastructure Bug #65727: ntpq: command not found
- I would suggest using "chronyc sources" instead of ntpq; chrony is the newer tool in use.
- 11:47 AM Infrastructure Bug #65727 (In Progress): ntpq: command not found
- 03:17 PM Infrastructure Bug #65734 (Closed): Expected OS to be centos 8 but found ubuntu 22.04
- This error occurred around the same time the CentOS image was recaptured.
- 08:41 AM Infrastructure Bug #65734 (Closed): Expected OS to be centos 8 but found ubuntu 22.04
- ...
- 03:02 PM CephFS Bug #62664: ceph-fuse: failed to remount for kernel dentry trimming; quitting!
- I created a minimal PR implementing this: https://github.com/ceph/ceph/pull/57170
Is there any jenkins test to run... - 03:00 PM Dashboard Bug #47612: ERROR: setUpClass (tasks.mgr.dashboard.test_health.HealthTest)
- I was about to open an issue and got to this one with a search. I suspect that this should be handled by the cephfs t...
- 02:43 PM Dashboard Bug #47612: ERROR: setUpClass (tasks.mgr.dashboard.test_health.HealthTest)
- https://jenkins.ceph.com/job/ceph-api/73295/...
- 02:36 PM CephFS Backport #65710 (Fix Under Review): squid: workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
- 01:26 PM CephFS Backport #65710 (In Progress): squid: workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
- 01:25 PM CephFS Backport #65710 (Fix Under Review): squid: workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
- 02:34 PM RADOS Bug #62338 (Resolved): osd: choose_async_recovery_ec may select an acting set < min_size
- 02:33 PM RADOS Backport #62819 (Resolved): reef: osd: choose_async_recovery_ec may select an acting set < min_size
- 02:30 PM RADOS Backport #62819: reef: osd: choose_async_recovery_ec may select an acting set < min_size
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/54550
merged - 02:32 PM Ceph QA QA Run #65655 (QA Closed): wip-yuri2-testing-2024-04-24-0914-squid
- 06:39 AM Ceph QA QA Run #65655 (QA Approved): wip-yuri2-testing-2024-04-24-0914-squid
- 06:33 AM Ceph QA QA Run #65655: wip-yuri2-testing-2024-04-24-0914-squid
- @yuriw - approved.
One interesting Scrub-related bug (delayed status reporting), but unrelated. Possibly a test issue... - 02:31 PM Ceph QA QA Run #65574 (QA Closed): wip-yuri7-testing-2024-04-18-1351-reef
- 08:43 AM Ceph QA QA Run #65574 (QA Approved): wip-yuri7-testing-2024-04-18-1351-reef
- 7664087,7664152, 7664219, 7664256, 7664282 - https://tracker.ceph.com/issues/65183 (Overriding an EC pool needs the "...
- 02:29 PM RADOS Backport #63559: reef: Heartbeat crash in osd
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/54527
merged - 02:27 PM CephFS Backport #63363: reef: mds: create an admin socket command for raising a signal
- Leonid Usov wrote in #note-2:
> https://github.com/ceph/ceph/pull/54357
merged - 02:26 PM RADOS Backport #63289: reef: mon: segfault on rocksdb opening
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/54150
merged - 01:30 PM CephFS Backport #65738 (In Progress): squid: mds: quiesce timeout due to a freezing directory
- 01:07 PM CephFS Backport #65738 (In Progress): squid: mds: quiesce timeout due to a freezing directory
- https://github.com/ceph/ceph/pull/57203
- 01:13 PM rgw Bug #23953 (Rejected): rgw: bucket index delete cleanup
- 01:05 PM RADOS Bug #65737 (Fix Under Review): pg-split-merge.sh -
/a/nmordech-2024-04-30_10:14:02-rados:standalone-main-distro-default-smithi/7680852
/a/nmordech-2024-04-30_10:14:0...- 01:04 PM cephsqlite Backport #65736 (In Progress): quincy: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
- 12:57 PM cephsqlite Backport #65736 (In Progress): quincy: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
- https://github.com/ceph/ceph/pull/57199
- 12:58 PM CephFS Bug #65603 (Pending Backport): mds: quiesce timeout due to a freezing directory
- 12:34 PM RADOS Bug #61832: Restoring #61785: osd-scrub-dump.sh: ERROR: Extra scrubs after test completion...not expected
- Note: the test is disabled for now (with https://github.com/ceph/ceph/pull/54482).
No point in updating, until the f... - 10:32 AM bluestore Backport #63316 (In Progress): quincy: crash: ZonedAllocator::ZonedAllocator
- 10:32 AM bluestore Backport #63315 (In Progress): reef: crash: ZonedAllocator::ZonedAllocator
- 10:29 AM bluestore Backport #64592 (In Progress): quincy: BlueFS: l_bluefs_log_compactions is counted twice in sync log compaction
- 10:29 AM bluestore Backport #64591 (In Progress): squid: BlueFS: l_bluefs_log_compactions is counted twice in sync log compaction
- 10:27 AM bluestore Backport #64590 (In Progress): reef: BlueFS: l_bluefs_log_compactions is counted twice in sync log compaction
- 10:25 AM bluestore Bug #64511: kv/RocksDBStore: rocksdb_cf_compact_on_deletion has no effect on the default column family
- Steven Goodliff wrote in #note-4:
> Hi,
>
> Will this fix get into 18.2.3 ?, thanks
Seems, backport bot is bro... - 10:09 AM bluestore Bug #64511: kv/RocksDBStore: rocksdb_cf_compact_on_deletion has no effect on the default column family
- Hi,
Will this fix get into 18.2.3? Thanks. - 08:34 AM rgw Backport #59730: quincy: S3 CompleteMultipartUploadResult has empty ETag element
- ...
- 08:34 AM rgw Backport #59730: quincy: S3 CompleteMultipartUploadResult has empty ETag element
- Wout van Heeswijk wrote in #note-3:
> Is something needed to merge this?
Only developer with merge permissions - 08:24 AM rgw Backport #59730: quincy: S3 CompleteMultipartUploadResult has empty ETag element
- @konstantin
Is something needed to merge this? main, pacific and reef have been patched already. Is something bloc... - 08:32 AM RADOS Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- /a/teuthology/yuriw-2024-04-20_01:10:46-rados-wip-yuri7-testing-2024-04-18-1351-reef-distro-default-smithi/7664305
- 08:27 AM Orchestrator Bug #64208: test_cephadm.sh: Container version mismatch causes job to fail.
- /a/yuriw-2024-04-20_01:10:46-rados-wip-yuri7-testing-2024-04-18-1351-reef-distro-default-smithi/7664211
- 08:24 AM RADOS Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
- /a/yuriw-2024-04-20_01:10:46-rados-wip-yuri7-testing-2024-04-18-1351-reef-distro-default-smithi/7664183
- 08:13 AM CephFS Bug #65261: qa/cephfs: cephadm related failure on fs/upgrade job
- /a//teuthology/yuriw-2024-04-20_01:10:46-rados-wip-yuri7-testing-2024-04-18-1351-reef-distro-default-smithi/7664176
- 08:09 AM RADOS Bug #53767: qa/workunits/cls/test_cls_2pc_queue.sh: killing an osd during thrashing causes timeout
- /a/yuriw-2024-04-20_01:10:46-rados-wip-yuri7-testing-2024-04-18-1351-reef-distro-default-smithi/7664155
- 08:03 AM RADOS Bug #64725: rados/singleton: application not enabled on pool 'rbd'
- /a/https://pulpito.ceph.com/yuriw-2024-04-20_01:10:46-rados-wip-yuri7-testing-2024-04-18-1351-reef-distro-default-smi...
- 07:57 AM RADOS Bug #65183: Overriding an EC pool needs the "--yes-i-really-mean-it" flag in addition to "force"
- /a/yuriw-2024-04-20_01:10:46-rados-wip-yuri7-testing-2024-04-18-1351-reef-distro-default-smithi/7664087
- 07:09 AM crimson Bug #63647: SnapTrimEvent AddressSanitizer: heap-use-after-free
- https://pulpito.ceph.com/matan-2024-04-30_07:11:13-crimson-rados-wip-matanb-crimson-testing-user-modify-distro-crimso...
- 07:08 AM crimson Bug #64206: obc->is_loaded_and_valid() assertion
- osd.3
https://pulpito.ceph.com/matan-2024-04-30_07:11:13-crimson-rados-wip-matanb-crimson-testing-user-modify-distro... - 06:44 AM CephFS Bug #65647: Evicted kernel client may get stuck after reconnect
- Xiubo Li wrote in #note-6:
> Venky Shankar wrote in #note-5:
> > The tracker description mentions @denied reconne... - 06:25 AM RADOS Bug #44510 (Fix Under Review): osd/osd-recovery-space.sh TEST_recovery_test_simple failure
- 01:45 AM CephFS Bug #65733 (Fix Under Review): mds: upgrade to MDS enforcing CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK with client having root_squash in any MDS cap causes eviction for all file systems the client has caps for
04/30/2024
- 11:45 PM CephFS Bug #65733 (Pending Backport): mds: upgrade to MDS enforcing CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK with client having root_squash in any MDS cap causes eviction for all file systems the client has caps for
- 11:43 PM CephFS Bug #56067 (Resolved): Cephfs data loss with root_squash enabled
- Resolved via #57154
- 11:33 PM Ceph QA QA Run #65592 (QA Closed): wip-yuriw-testing-20240419.185239-main
- 11:01 PM Ceph QA QA Run #65592 (QA Approved): wip-yuriw-testing-20240419.185239-main
- @yuriw rados approved: https://tracker.ceph.com/projects/rados/wiki/MAIN#httpstrackercephcomissues65592
- 11:32 PM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
- https://github.com/ceph/ceph/pull/56995 merged
- 10:59 PM CephFS Bug #64707: suites/fsstress.sh hangs on one client - test times out
- /a/lflores-2024-04-29_20:31:53-upgrade-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7679200
- 10:55 PM cephsqlite Bug #59335: Found coredumps on smithi related to sqlite3
- /a/teuthology-2024-04-28_20:00:15-rados-main-distro-default-smithi/7677031
- 10:46 PM RADOS Bug #53544: src/test/osd/RadosModel.h: ceph_abort_msg("racing read got wrong version") in thrash_cache_writeback_proxy_none tests
- /a/yuriw-2024-04-21_14:00:04-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7666787
- 09:37 PM mgr Bug #65627: Centos 9 stream ceph container iscsi test failure
- Logs are in ~dmick/c9.iscsi.archive.tgz on the teuthology node.
- 09:36 PM rgw Bug #65654: run-bucket-check.sh: failed assert len(json_out) == len(unlinked_keys)
- It looks like the test reproduced and caught another scenario where unlinked objects get left behind. Still looking a...
- 09:36 PM Ceph Bug #55461 (Fix Under Review): ceph osd crush swap-bucket {old_host} {new_host} where {old_host}={new_host} crashes monitors
- 09:14 PM CephFS Bug #65704: mds+valgrind: beacon thread blocked for 60+ seconds
- UPD: this is not a containerized installation, so the above guess is wrong
- 08:07 PM CephFS Bug #65704: mds+valgrind: beacon thread blocked for 60+ seconds
- > The alarming problem is that the beacon upkeep thread apparently slept for about 60 seconds! This should only be po...
- 07:24 PM CephFS Bug #65704: mds+valgrind: beacon thread blocked for 60+ seconds
- Leonid Usov wrote in #note-4:
> Looking at @remote/smithi096/log/ceph-mds.d.log.gz@, I get an impression that the no... - 07:16 PM CephFS Bug #65704: mds+valgrind: beacon thread blocked for 60+ seconds
- Looking at @remote/smithi096/log/ceph-mds.d.log.gz@, I get the impression that the node is cut off from the network for a few ...
- 05:10 PM CephFS Bug #65704: mds+valgrind: beacon thread blocked for 60+ seconds
- ...
- 03:58 PM CephFS Bug #65704 (New): mds+valgrind: beacon thread blocked for 60+ seconds
- This one is really weird and my working theory is that this is related to the quiesce database. Test symptom:
<pre... - 08:58 PM cephsqlite Backport #65731 (In Progress): reef: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
- 08:49 PM cephsqlite Backport #65731 (In Progress): reef: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
- https://github.com/ceph/ceph/pull/57190
- 08:58 PM cephsqlite Backport #65730 (In Progress): squid: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
- 08:49 PM cephsqlite Backport #65730 (In Progress): squid: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
- https://github.com/ceph/ceph/pull/57189
- 08:53 PM Orchestrator Bug #65732 (New): rados/cephadm/osds: job times out during nvme_loop interval
- /a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664906...
- 08:46 PM cephsqlite Bug #65494 (Pending Backport): ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
- 08:37 PM Orchestrator Bug #65017: cephadm: log_channel(cephadm) log [ERR] : Failed to connect to smithi090 (10.0.0.9). Permission denied
- /a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7665003
- 08:34 PM RADOS Bug #65729 (New): thrash_cache_writeback_proxy_none: command failed when setting target_max_objects
- /a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664981...
- 08:26 PM Orchestrator Bug #63784: qa/standalone/mon/mkfs.sh:'mkfs/a' already exists and is not empty: monitor may already exist
- /a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664969
- 08:23 PM Orchestrator Bug #65728 (New): Alertmanager in an unknown state
- /a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664960/remote/smithi...
- 08:01 PM Infrastructure Bug #65727 (In Progress): ntpq: command not found
- /a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664955...
- 07:57 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
- /a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664940
OSD_DOWN - 07:11 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
- /a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664903...
- 06:15 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
- /a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664854
/a/yuriw-2024... - 06:13 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
- /a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664765
/a/yuriw-2024... - 06:03 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
- /a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664686...
- 07:56 PM RADOS Bug #63198: rados/thrash: AssertionError: wait_for_recovery: failed before timeout expired
- /a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664923...
- 07:47 PM Orchestrator Backport #65417 (Resolved): squid: cephadmin returns "1" on successful host-maintenance enter/exit - should return "0"
- 07:46 PM Orchestrator Bug #65234 (Resolved): upgrade/quincy-x/stress-split: cephadm failed to parse grafana.ini file due to inadequate permission
- 07:44 PM Orchestrator Backport #65381 (Resolved): squid: upgrade/quincy-x/stress-split: cephadm failed to parse grafana.ini file due to inadequate permission
- 07:43 PM Orchestrator Backport #65726 (New): quincy: cephadm: anonymous_access: false is dropped from grafana spec after apply
- 07:43 PM Orchestrator Backport #65725 (New): reef: cephadm: anonymous_access: false is dropped from grafana spec after apply
- 07:42 PM Orchestrator Backport #65724 (New): squid: cephadm: anonymous_access: false is dropped from grafana spec after apply
- 07:42 PM Orchestrator Backport #65723 (New): reef: cephadm: agent tries to json load response payload before checking for errors
- 07:42 PM Orchestrator Backport #65722 (New): squid: cephadm: agent tries to json load response payload before checking for errors
- 07:41 PM Orchestrator Bug #65553 (Pending Backport): cephadm: agent tries to json load response payload before checking for errors
- 07:36 PM Orchestrator Bug #65511 (Pending Backport): cephadm: anonymous_access: false is dropped from grafana spec after apply
- 07:27 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- Ilya Dryomov wrote in #note-20:
> Hi Nir,
>
> I built a container based on 18.2.3 (an upcoming release). It woul... - 12:51 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- Ilya Dryomov wrote in #note-20:
> Hi Nir,
>
> I built a container based on 18.2.3 (an upcoming release). It woul... - 07:16 PM RADOS Bug #65721 (New): [MON] Connection Scores: peers become dead after ~5mins, However quorum seems fine
- All ranks report that everyone is alive and well...
- 07:03 PM RADOS Bug #65695 (Fix Under Review): [MON] ConnectionTracker dumps duplicate keys
- 02:26 AM RADOS Bug #65695 (Fix Under Review): [MON] ConnectionTracker dumps duplicate keys
- Problem:
Currently, the ConnectionTracker::dump()
will dump a duplicate key which is not
ideal when you want to ... - 06:48 PM rbd Feature #65720 (New): diff-iterate should allow passing the "from snapshot" by snap ID
- If e.g. RBD_SNAP_NAMESPACE_TYPE_TRASH snapshots pile up, a lot of space can go missing/unaccounted for from the user'...
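For reference, a minimal pybind sketch of the current interface (pool, image, and snapshot names are placeholders); @from_snapshot@ can only be a snapshot name today, and trash-namespace snapshots have no user-facing name, hence the request to accept a snap ID:
<pre><code class="python">
# Sketch of the existing diff_iterate interface; all names are placeholders.
import rados
import rbd

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("rbd")               # placeholder pool
try:
    with rbd.Image(ioctx, "myimage") as image:  # placeholder image
        def cb(offset, length, exists):
            # Called once per extent that changed since the "from" snapshot.
            print(offset, length, exists)

        # from_snapshot is a snapshot *name*; a snap-ID variant would let
        # callers diff against RBD_SNAP_NAMESPACE_TYPE_TRASH snapshots too.
        image.diff_iterate(0, image.size(), "snap1", cb)
finally:
    ioctx.close()
    cluster.shutdown()
</code></pre>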
- 06:24 PM Infrastructure Bug #65719 (New): debian-17.2.6 jammy repository does not have a Release file
- /a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664896...
- 06:12 PM RADOS Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
- /a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664739
- 06:07 PM Orchestrator Bug #65718 (In Progress): cephadm: nvmeof daemon omap_file_lock_retry_sleep_interval default causes daemon to fail to start
- ...
- 05:52 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
- @ksirivad I don't know what you want to do
PLMK - 04:07 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
- @yuriw
Hmm in the most recent re-run you scheduled https://pulpito.ceph.com/yuriw-2024-04-26_18:18:24-rados-wip-y... - 05:47 PM Orchestrator Bug #65717 (In Progress): cephadm: iscsi and nvme auth keyring are not cleaned up
- If you move/remove an iscsi daemon, the keyring for the removed daemon is left behind unless the user cleans up the k...
- 05:38 PM CephFS Backport #65711 (In Progress): squid: mds: regular file inode flags are not replicated by the policylock
- 04:26 PM CephFS Backport #65711 (In Progress): squid: mds: regular file inode flags are not replicated by the policylock
- https://github.com/ceph/ceph/pull/57179
- 05:36 PM CephFS Backport #65713 (In Progress): quincy: mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
- 05:29 PM CephFS Backport #65713 (New): quincy: mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
- 05:28 PM CephFS Backport #65713 (Rejected): quincy: mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
- 04:27 PM CephFS Backport #65713 (In Progress): quincy: mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
- https://github.com/ceph/ceph/pull/57178
- 05:28 PM CephFS Backport #65715 (In Progress): reef: mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
- 04:27 PM CephFS Backport #65715 (In Progress): reef: mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
- https://github.com/ceph/ceph/pull/57177
- 05:20 PM CephFS Backport #65714 (In Progress): squid: mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
- 04:27 PM CephFS Backport #65714 (In Progress): squid: mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
- https://github.com/ceph/ceph/pull/57176
- 05:19 PM CephFS Backport #65712 (In Progress): squid: qa: lockup not long enough for test_quiesce_authpin_wait
- 04:26 PM CephFS Backport #65712 (In Progress): squid: qa: lockup not long enough for test_quiesce_authpin_wait
- https://github.com/ceph/ceph/pull/57175
- 05:18 PM CephFS Backport #65709 (In Progress): reef: client: resends request to same MDS it just received a forward from if it does not have an open session with the target
- 04:25 PM CephFS Backport #65709 (In Progress): reef: client: resends request to same MDS it just received a forward from if it does not have an open session with the target
- https://github.com/ceph/ceph/pull/57174
- 05:18 PM CephFS Backport #65708 (In Progress): squid: client: resends request to same MDS it just received a forward from if it does not have an open session with the target
- 04:25 PM CephFS Backport #65708 (In Progress): squid: client: resends request to same MDS it just received a forward from if it does not have an open session with the target
- https://github.com/ceph/ceph/pull/57173
- 05:17 PM CephFS Backport #65707 (In Progress): reef: qa: increase debugging for snap_schedule
- 04:25 PM CephFS Backport #65707 (In Progress): reef: qa: increase debugging for snap_schedule
- https://github.com/ceph/ceph/pull/57172
- 05:17 PM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
- Patrick Donnelly wrote in #note-36:
> It sounds more and more to me like there is some kind of request the client co... - 10:30 AM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
- Venky Shankar wrote in #note-37:
> Patrick Donnelly wrote in #note-36:
> > It sounds more and more to me like there... - 05:29 AM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
- Patrick Donnelly wrote in #note-36:
> It sounds more and more to me like there is some kind of request the client co... - 05:17 PM CephFS Backport #65706 (In Progress): squid: qa: increase debugging for snap_schedule
- 04:25 PM CephFS Backport #65706 (In Progress): squid: qa: increase debugging for snap_schedule
- https://github.com/ceph/ceph/pull/57171
- 05:11 PM CephFS Bug #65716 (In Progress): mds: dir merge can't progress due to fragment nested pins, blocking the quiesce_path and causing a quiesce timeout
- ...
- 04:26 PM CephFS Backport #65710 (Fix Under Review): squid: workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
- 04:22 PM CephFS Fix #65617 (Pending Backport): qa: increase debugging for snap_schedule
- 04:22 PM CephFS Bug #65614 (Pending Backport): client: resends request to same MDS it just received a forward from if it does not have an open session with the target
- 04:21 PM CephFS Bug #65606 (Pending Backport): workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
- 04:20 PM CephFS Bug #65518 (Pending Backport): mds: regular file inode flags are not replicated by the policylock
- 04:20 PM CephFS Bug #65496 (Pending Backport): mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
- 04:19 PM CephFS Bug #65508 (Pending Backport): qa: lockup not long enough for test_quiesce_authpin_wait
- 04:13 PM Ceph QA QA Run #65694 (QA Closed): wip-pdonnell-testing-20240429.210911-debug
- 04:13 PM Ceph QA QA Run #65694 (QA Approved): wip-pdonnell-testing-20240429.210911-debug
- https://tracker.ceph.com/projects/cephfs/wiki/Main#2024-04-30
- 04:13 PM CephFS Bug #65705 (Fix Under Review): qa: snaptest-multiple-capsnaps.sh failure
- ...
- 04:10 PM Linux kernel client Bug #64471: kernel: upgrades from quincy/v18.2.[01]/reef to main|squid fail with kernel oops
- /teuthology/pdonnell-2024-04-30_05:04:19-fs-wip-pdonnell-testing-20240429.210911-debug-distro-default-smithi/7680637/...
- 03:34 PM CephFS Bug #62664: ceph-fuse: failed to remount for kernel dentry trimming; quitting!
- This is related to https://github.com/util-linux/util-linux/issues/2576 and will happen on any system with util-linux...
- 03:12 PM Orchestrator Bug #65703 (New): qa/suites/fs/upgrade: Command failed ... ceph orch upgrade check quay.ceph.io/ceph-ci/ceph:$sha1
- ...
- 02:57 PM Infrastructure Bug #65639 (In Progress): smithi139 unable to be reached over ssh
- 02:48 PM Infrastructure Bug #65639 (Resolved): smithi139 unable to be reached over ssh
- I checked for hardware issues on smithi139 but I didn't find anything. I can leave this ticket on hold and check ag...
- 11:27 AM Infrastructure Bug #65639 (In Progress): smithi139 unable to be reached over ssh
- 02:54 PM rgw Feature #61887 (Fix Under Review): s3: GetBucketLocation should also return placement target
- 02:39 PM Orchestrator Bug #65702 (In Progress): cephadm: ganesha-rados-grace tool sometimes fails the first time it is run, causing a health warning
-
Failures look like... - 02:31 PM CephFS Bug #65618: qa: fsstress: cannot execute binary file: Exec format error
- Other runs:...
- 01:02 PM CephFS Bug #65618 (Triaged): qa: fsstress: cannot execute binary file: Exec format error
- 02:28 PM Infrastructure Bug #65682 (Resolved): Getting WARN[0019] Conmon at /usr/bin/conmon invalid: signal: bus error (core dumped) issue on magna006 node
- 08:40 AM Infrastructure Bug #65682: Getting WARN[0019] Conmon at /usr/bin/conmon invalid: signal: bus error (core dumped) issue on magna006 node
- Fixed, magna006 reimaged with rhel 9.3
- 02:24 PM CephFS Bug #65701 (Fix Under Review): qa: quiesce cache/ops dump not world readable
- 02:22 PM CephFS Bug #65701 (Pending Backport): qa: quiesce cache/ops dump not world readable
- /teuthology/pdonnell-2024-04-30_05:04:19-fs-wip-pdonnell-testing-20240429.210911-debug-distro-default-smithi/7680431/...
- 02:18 PM Ceph QA QA Run #65560: wip-yuri5-testing-2024-04-17-1400
- Aishwarya Mathuria wrote in #note-6:
> @yuriw can we please re-run this? There are 172 dead jobs.
rerunning - 02:18 PM Ceph QA QA Run #65560 (QA Needs Approval): wip-yuri5-testing-2024-04-17-1400
- 04:23 AM Ceph QA QA Run #65560 (QA Needs Rerun/Rebuilt): wip-yuri5-testing-2024-04-17-1400
- @yuriw can we please re-run this? There are 172 dead jobs.
- 02:06 AM Ceph QA QA Run #65560: wip-yuri5-testing-2024-04-17-1400
- @lflores sure
- 02:04 PM CephFS Bug #65700 (Fix Under Review): qa: Health detail: HEALTH_WARN Degraded data redundancy: 40/348 objects degraded (11.494%), 9 pgs degraded" in cluster log
- 02:02 PM CephFS Bug #65700 (Pending Backport): qa: Health detail: HEALTH_WARN Degraded data redundancy: 40/348 objects degraded (11.494%), 9 pgs degraded" in cluster log
- https://pulpito.ceph.com/pdonnell-2024-04-30_05:04:19-fs-wip-pdonnell-testing-20240429.210911-debug-distro-default-sm...
- 01:08 PM CephFS Bug #65580 (Triaged): mds/client: add dummy client feature to test client eviction
- 11:36 AM CephFS Bug #65580: mds/client: add dummy client feature to test client eviction
- BTW we do have this [0] which is run as part of [1] so we can run this after the MDS upgrade (with some minor tweaks ...
- 11:29 AM CephFS Bug #65580: mds/client: add dummy client feature to test client eviction
- So this would be something like adding `CEPHFS_FEATURE_DUMMY` to `include/ceph_features.h`, then post MDS upgrade hav...
- 01:08 PM ceph-volume Bug #65584 (Fix Under Review): ceph-volume: use os.makedirs to implement mkdir_p
- 01:04 PM CephFS Bug #65572: Command failed (workunit test fs/snaps/untar_snap_rm.sh) on smithi155 with status 1
- There aren't even any Ceph-side logs. It seems the cluster died suddenly.
- 12:58 PM CephFS Bug #65647 (Triaged): Evicted kernel client may get stuck after reconnect
- 12:44 PM CephFS Bug #65647: Evicted kernel client may get stuck after reconnect
- Venky Shankar wrote in #note-5:
> Mykola Golub wrote in #note-4:
> > Xiubo Li wrote in #note-3:
> >
> > > Is tha... - 12:24 PM CephFS Bug #65647: Evicted kernel client may get stuck after reconnect
- Mykola Golub wrote in #note-4:
> Xiubo Li wrote in #note-3:
>
> > Is that possible to enable the mds debug logs, ... - 12:36 PM Ceph QA QA Run #65699 (QA Testing): wip-pdonnell-testing-20240430.185512-reef-debug
- * "PR #57162":https://github.com/ceph/ceph/pull/57162 -- reef: qa: add support/qa for cephfs-shell on CentOS 9 / RHEL9
- 12:04 PM CephFS Bug #65423 (Rejected): Monitor crashes down when I try to create a FS. The stacks maybe related to metadata server map decoder during the PAXOS service
- Please follow the suggestion in note-3.
- 12:02 PM CephFS Bug #65455 (Rejected): read operation hung in Client::get_caps
- Please reopen the ticket if this is reproducible on supported Ceph versions.
- 12:02 PM CephFS Bug #65616 (Triaged): pybind/mgr/snap_schedule: 1m scheduled snaps not reliably executed (RuntimeError: The following counters failed to be set on mds daemons: {'mds_server.req_rmsnap_latency.avgcount'})
- 11:58 AM sepia Support #65633 (In Progress): Sepia Lab Access Request
- 11:57 AM sepia Support #65633: Sepia Lab Access Request
- Hey Neha,
You should have access to the Sepia lab now. Please verify you're able to connect to the vpn and ssh neh... - 11:48 AM Ceph QA QA Run #65454: wip-vshankar-testing-20240411.061452
- Had to rebuild the (debug) branch since some fixes were merged for known issues.
- 11:13 AM Ceph QA QA Run #65454 (QA Needs Rerun/Rebuilt): wip-vshankar-testing-20240411.061452
- 11:41 AM rgw Bug #65664: Crash observed in boost::asio module related to stream.async_shutdown()
- ACK,
testing with:... - 11:37 AM Ceph QA QA Run #65680: wip-mchangir-testing-20240429.064231-main-debug
- "Teuthology Jobs":https://pulpito.ceph.com/mchangir-2024-04-30_01:08:25-fs-wip-mchangir-testing-20240429.064231-main-...
- 09:54 AM RADOS Bug #44510 (In Progress): osd/osd-recovery-space.sh TEST_recovery_test_simple failure
- from /a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659542
we can see t... - 09:20 AM rbd Feature #65624 (In Progress): [pybind] expose CLONE_FORMAT and FLATTEN image options
- Limiting the scope of this tracker to options related to cloning.
- 09:09 AM sepia Support #65685: Sepia Lab Access Request
- Hey Srinivasa Bharath Kanta, Are these new/additional or replacement credentials?
- 08:41 AM sepia Support #65685 (In Progress): Sepia Lab Access Request
- 08:32 AM Dashboard Bug #65698 (Pending Backport): mgr/dashboard: RBD snapshots cloned (format v2) and then deleted causes Not Found/404 in the source RBD Image
- Starting with mimic/RBD clone format 2, snapshots don't require protection to be cloned. What it happens under the ho...
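For anyone reading along, a minimal sketch of the clone-v2 flow being described (pool/image names are placeholders; the config override follows the upstream docs):
<pre>
# clone format 2: no "rbd snap protect" step is needed before cloning
rbd snap create rbd_pool/parent@snap1
rbd clone --rbd-default-clone-format 2 rbd_pool/parent@snap1 rbd_pool/child
# removing the snapshot later moves it to the trash until all clones are flattened or deleted
rbd snap rm rbd_pool/parent@snap1
</pre>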
- 08:30 AM rgw Bug #48358: rgw: qlen and qactive perf counters leak
- Is there any progress on this ticket?
We still have a performance issue when active connections get high.
- 08:16 AM crimson Bug #62162: local_shared_foreign_ptr: Assertion `ptr && *ptr' failed
- https://pulpito.ceph.com/matan-2024-04-28_14:58:28-crimson-rados-wip-crimson-coherent-log-and-at_version-distro-crims...
- 07:54 AM mgr Bug #47537 (Resolved): Prometheus rbd metrics absent by default
- 07:53 AM Ceph Bug #19242 (Resolved): Ownership of /var/run/ceph not set with sysv-init under Jewel
- 07:48 AM rgw Backport #57658 (In Progress): quincy: fail to set requestPayment in slave zone
- 07:46 AM crimson Bug #65697 (New): fix _do_rollback_to clone_overlaps
- Bring https://github.com/ceph/ceph/pull/56696 to Crimson.
- 07:44 AM rgw Bug #50261 (Fix Under Review): rgw: system users can't issue role policy related ops without explicit user policy
- 07:42 AM rgw Bug #49313 (Resolved): RGW ops log is not logging bucket listing operations
- 07:41 AM mgr Bug #43897 (Fix Under Review): crash module reports UTC timestamps but doesn't format timestamps accordingly
- 07:40 AM mgr Bug #46703 (Resolved): mgr/prometheus: introduce metric for collection time
- 06:47 AM crimson Bug #65451 (Resolved): tri_mutex::promote_from_read(): Assertion `readers == 1' failed.
- 06:44 AM crimson Bug #65451: tri_mutex::promote_from_read(): Assertion `readers == 1' failed.
- Tested *before* 56844 was merged
https://pulpito.ceph.com/matan-2024-04-25_08:01:21-crimson-rados-wip-matanb-crimson... - 06:11 AM RADOS Bug #65517 (Fix Under Review): rados/thrash-erasure-code-crush-4-nodes: ceph task fails at getting monitors
- 05:14 AM rgw Bug #64999: Slow RGW multisite sync due to "304 Not Modified" responses on primary zone
- Hi All,
We are eagerly awaiting the resolution of the mentioned issue.
Any guidance or insight would be greatl...- 03:32 AM crimson Bug #65696 (New): osd crashes when recovering PGs that have unfound objects
- Crimson OSD doesn't handle unfound objects during recovery/backfill....
- 03:26 AM Ceph QA QA Run #65688 (QA Needs Approval): wip-yuri4-testing-2024-04-29-0642
- @sseshasa can you review this when done pls?
- 03:01 AM RADOS Bug #65686: ECBackend doesn't pass CEPH_OSD_OP_FLAG_BYPASS_CLEAN_CACHE flag when scrubbing
- The
Radoslaw Zarzynski wrote in #note-1:
> A note from bug scrub: Mohit might want to take a look. Pinged him in... - 01:53 AM crimson Bug #65628 (Resolved): unittest-seastore (Timeout)
- 01:52 AM crimson Bug #65585 (Resolved): unittest-seastore (Timeout)
- 01:14 AM CephFS Bug #65157: cephfs-mirror: set layout.pool_name xattr of destination subvol correctly
- I arrived at this issue only via code review.
I haven't attempted reproducing the issue. - 12:35 AM CephFS Feature #61866: MDSMonitor: require --yes-i-really-mean-it when failing an MDS with MDS_HEALTH_TRIM or MDS_HEALTH_CACHE_OVERSIZED health warnings
- Rishabh Dave wrote in #note-7:
> Patrick, should we include other health warnings too? I didn't include it in PR bec...
04/29/2024
- 09:34 PM Orchestrator Bug #64872: rados/cephadm/smoke: Health check failed: 1 stray daemon(s) not managed by cephadm (CEPHADM_STRAY_DAEMON) in cluster log
- /a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664699
More on http... - 09:30 PM RADOS Bug #65235: upgrade/reef-x/stress-split: "OSDMAP_FLAGS: noscrub flag(s) set" warning in cluster log
- Sure Laura
- 05:59 PM RADOS Bug #65235: upgrade/reef-x/stress-split: "OSDMAP_FLAGS: noscrub flag(s) set" warning in cluster log
- @badone perhaps we can address just the warnings reported in this Tracker, and address additional warnings in a Part 2.
- 09:14 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
- In this one, we are intentionally setting OSDs down, so the warning is expected.
/a/yuriw-2024-04-20_15:32:38-rado... - 09:12 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
- /a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664685...
- 09:12 PM Ceph QA QA Run #65594 (QA Needs Approval): wip-yuriw11-testing-20240501.200505-squid
- 09:11 PM Ceph QA QA Run #65594: wip-yuriw11-testing-20240501.200505-squid
- Laura Flores wrote in #note-5:
> @yuriw can you rerun this, and also schedule an upgrade suite?
done - 08:57 PM Ceph QA QA Run #65594 (QA Needs Rerun/Rebuilt): wip-yuriw11-testing-20240501.200505-squid
- @yuriw can you rerun this, and also schedule an upgrade suite?
- 09:09 PM RADOS Bug #65183: Overriding an EC pool needs the "--yes-i-really-mean-it" flag in addition to "force"
- /a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664685
- 09:09 PM Ceph QA QA Run #65694 (QA Closed): wip-pdonnell-testing-20240429.210911-debug
- * "PR #57059":https://github.com/ceph/ceph/pull/57059 -- mds: abort fragment/export when quiesced
* "PR #57044":http... - 08:58 PM Ceph QA QA Run #65592: wip-yuriw-testing-20240419.185239-main
- Note to self: I also scheduled an upgrade suite.
- 08:54 PM Ceph QA QA Run #65661 (QA Closed): wip-pdonnell-testing-20240425.015853-debug
- Broken by https://github.com/ceph/ceph/pull/57059
- 08:53 PM devops Bug #65693 (In Progress): ceph-mgr-dashboard RPM requires python3-werkzeug
- 08:26 PM devops Bug #65693 (Pending Backport): ceph-mgr-dashboard RPM requires python3-werkzeug
- The @ceph-mgr-dashboard@ RPM has a requirement on the @python3-werkzeug@ RPM package, but no code in @rpm -qf ceph-mg...
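A quick way to confirm this locally, assuming the package is installed (a sketch, not part of the ticket):
<pre>
# the RPM metadata declares the dependency ...
rpm -q --requires ceph-mgr-dashboard | grep -i werkzeug
# ... but nothing shipped by the package actually imports it
grep -ril werkzeug $(rpm -ql ceph-mgr-dashboard | grep '\.py$')
</pre>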
- 08:41 PM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
- It sounds more and more to me like there is some kind of request the client continues to wait on that is blocking the...
- 08:12 PM Ceph QA QA Run #65641: wip-yuriw8-testing-20240424.000125-main
- Laura Flores wrote in #note-4:
> @yuriw this PR caused the build failure: https://github.com/ceph/ceph/pull/55592/
> ... - 07:58 PM Ceph QA QA Run #65641 (QA Needs Rerun/Rebuilt): wip-yuriw8-testing-20240424.000125-main
- @yuriw this PR caused the build failure: https://github.com/ceph/ceph/pull/55592/
Can you drop it and rebuild? - 07:54 PM Ceph QA QA Run #65270: wip-yuri6-testing-2024-04-02-1310
- I scheduled another rerun since there were too many infra failures:
https://pulpito.ceph.com/lflores-2024-04-29_19:4... - 07:39 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
- ...
- 06:55 PM Infrastructure Bug #65682: Getting WARN[0019] Conmon at /usr/bin/conmon invalid: signal: bus error (core dumped) issue on magna006 node
- magna006 node is not reachable now.
tmathew@magna002:~$ ping magna006
PING magna006.ceph.redhat.com (10.8.128.6) ... - 09:13 AM Infrastructure Bug #65682: Getting WARN[0019] Conmon at /usr/bin/conmon invalid: signal: bus error (core dumped) issue on magna006 node
- https://ocs3xstage-jenkins-csb-rhgsocs3x.apps.ocp-c1.prod.psi.redhat.com/view/Baremetal/job/qe-reef-rados-baremetal/3...
- 09:13 AM Infrastructure Bug #65682: Getting WARN[0019] Conmon at /usr/bin/conmon invalid: signal: bus error (core dumped) issue on magna006 node
- From the pipeline when we are trying to execute podman run command it is failing with following error and we are not ...
- 09:07 AM Infrastructure Bug #65682 (Resolved): Getting WARN[0019] Conmon at /usr/bin/conmon invalid: signal: bus error (core dumped) issue on magna006 node
- While running podman ps command on magna006 node, we are getting WARN[0019] Conmon at /usr/bin/conmon invalid: signal...
- 06:10 PM RADOS Bug #42519 (Closed): During deployment of the ceph,when the main node starts slower than the other nodes.It may lead to generate a core by assert.
- Untouched for 4 years means rather low priority. Feel free to reopen if needed.
- 06:08 PM RADOS Bug #65591: Pool MAX_AVAIL goes UP when an OSD is marked down+in
- Bump up.
- 06:04 PM RADOS Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- QA Review in progress.
- 06:02 PM RADOS Bug #53000: OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue: pure virtual method called
- Still looking into this one.
- 05:56 PM RADOS Bug #54515: mon/health-mute.sh: TEST_mute: return 1 (HEALTH WARN 3 mgr modules have failed dependencies)
- I'll take a look at the latest re-occurrence to see if it needs a new tracker.
- 05:53 PM RADOS Bug #65670 (Closed): src/test/osd/RadosModel.h inappriopriate erasing items while iterating std::set caused test case failures.
- 05:48 PM RADOS Bug #47813 (Closed): osd op age is 4294967296
- 05:46 PM RADOS Bug #44631: ceph pg dump error code 124
- Hmm, it looks like this bug last occurred in 2021. Is it still reproducing?
- 05:43 PM RADOS Bug #65686: ECBackend doesn't pass CEPH_OSD_OP_FLAG_BYPASS_CLEAN_CACHE flag when scrubbing
- A note from bug scrub: Mohit might want to take a look. Pinged him in a side-channel.
- 12:07 PM RADOS Bug #65686 (Fix Under Review): ECBackend doesn't pass CEPH_OSD_OP_FLAG_BYPASS_CLEAN_CACHE flag when scrubbing
- A while ago https://github.com/ceph/ceph/pull/23629 introduced CEPH_OSD_OP_FLAG_BYPASS_CLEAN_CACHE flag for deep-scru...
- 04:59 PM Ceph Bug #65692 (New): Body of POST requests is dumped into the RGW log file (at level 10) which might contain sensitive info such as user name/password.
- Body of POST requests is dumped into the RGW log file (at level 10) which might contain sensitive info such as user n...
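Until a fix lands, one interim mitigation (my suggestion, not from the ticket) is to keep the RGW debug level below 10 so request bodies are not written to the log:
<pre>
ceph config set client.rgw debug_rgw 5
</pre>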
- 04:13 PM Orchestrator Bug #65691 (Pending Backport): cephadm: set "osd - profile rbd" for nvmeof service
- Currently Cephadm does set "profile rbd" for "mon" scope, but not for "osd". This basically results in rbd snapshot f...
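For context, the cap change being requested amounts to something like the following (the gateway's entity name is hypothetical):
<pre>
ceph auth caps client.nvmeof.mygw mon 'profile rbd' osd 'profile rbd'
</pre>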
- 03:21 PM CephFS Cleanup #65690 (New): mds: move specialized cleanup for export_dir to MDCache::request_cleanup
- https://github.com/ceph/ceph/blob/14f956b95eb2902d7a33b1026c450cb388ada113/src/mds/Migrator.cc#L245
We should also... - 03:20 PM CephFS Cleanup #65689 (New): mds: move specialized cleanup for fragment_dir to MDCache::request_cleanup
- https://github.com/ceph/ceph/blob/14f956b95eb2902d7a33b1026c450cb388ada113/src/mds/MDCache.cc#L12101-L12112
We sho... - 03:19 PM nvme-of Backport #65124 (Resolved): squid: cephadm - make changes to ceph-nvmeof.conf template
- 01:43 PM Ceph QA QA Run #65688 (QA Needs Rerun/Rebuilt): wip-yuri4-testing-2024-04-29-0642
--- done. these PRs were included:
https://github.com/ceph/ceph/pull/56477 - squid: qa: Add benign cluster warning...- 11:37 AM CephFS Bug #50260: pacific: qa: "rmdir: failed to remove '/home/ubuntu/cephtest': Directory not empty"
- Venky Shankar wrote in #note-3:
> Xiubo Li wrote in #note-2:
> > /a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-tes... - 09:47 AM CephFS Bug #50260: pacific: qa: "rmdir: failed to remove '/home/ubuntu/cephtest': Directory not empty"
- Xiubo Li wrote in #note-2:
> /a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-s... - 10:39 AM sepia Support #65685 (In Progress): Sepia Lab Access Request
- Hi Adam,
Please provide the access to sepia lab.
*Here are the details:*
root:~# sudo ./sepia/new-client... - 10:37 AM Dashboard Bug #61720 (Pending Backport): mgr/dashboard: embedded grafana dashboards are still using the old 'graph' panel type.
- 10:36 AM Dashboard Bug #61720 (Fix Under Review): mgr/dashboard: embedded grafana dashboards are still using the old 'graph' panel type.
- 10:06 AM Dashboard Bug #61720 (Pending Backport): mgr/dashboard: embedded grafana dashboards are still using the old 'graph' panel type.
- 10:06 AM Dashboard Bug #61720 (Fix Under Review): mgr/dashboard: embedded grafana dashboards are still using the old 'graph' panel type.
- 08:40 AM Dashboard Bug #61720 (Pending Backport): mgr/dashboard: embedded grafana dashboards are still using the old 'graph' panel type.
- 09:41 AM RADOS Bug #64373 (Fix Under Review): osd: Segmentation fault on OSD shutdown
- 09:26 AM Orchestrator Bug #65683 (New): cephadmin wrongly assumes that lines in /sys/kernel/security/apparmor/profiles contain exactly one space
- The code in line
https://github.com/ceph/ceph/blob/1680e466aab77cdf9ba07394bea664106580b32b/src/cephadm/cephadmlib/h... - 08:52 AM bluestore Bug #65678: Cannot use BtreeAllocator for blustore or bluefs
- NB: before backporting please resolve https://tracker.ceph.com/issues/61949.
- 08:45 AM bluestore Bug #65678 (Fix Under Review): Cannot use BtreeAllocator for blustore or bluefs
- 12:44 AM bluestore Bug #65678 (Fix Under Review): Cannot use BtreeAllocator for blustore or bluefs
- BtreeAllocator was added in the commit https://github.com/ceph/ceph/pull/41828.
Its performance has advantages in so... - 08:47 AM Dashboard Feature #65681 (New): mgr/dashboard: add support for smb service
- Since smb service deployment has been added with cephadm, the dashboard should support smb service management.
- 07:19 AM CephFS Bug #50821 (Fix Under Review): qa: untar_snap_rm failure during mds thrashing
- 12:52 AM CephFS Bug #50821: qa: untar_snap_rm failure during mds thrashing
- Yeah, really this time we hit another case. The local *MDS* was in *up:active* state but not others, so in this case ...
- 12:23 AM CephFS Bug #50821: qa: untar_snap_rm failure during mds thrashing
- This is the same issue with https://tracker.ceph.com/issues/62036, which has already been fixed and it hit again. It ...
- 07:18 AM rgw Bug #59471 (Resolved): Object Ownership Inconsistent
- 07:17 AM rgw Backport #61353 (Resolved): quincy: Object Ownership Inconsistent
- 07:16 AM CephFS Bug #57591 (Resolved): cephfs: qa enables kclient for newop test
- 07:16 AM rgw Backport #59376 (In Progress): quincy: rgw/s3 transfer encoding problems.
- 07:15 AM CephFS Bug #58018 (Resolved): mount.ceph: will fail with old kernels
- 07:15 AM CephFS Backport #58251 (Resolved): quincy: mount.ceph: will fail with old kernels
- 07:14 AM CephFS Bug #16745 (Resolved): mon: prevent allocating snapids allocated for CephFS
- 07:13 AM rgw Backport #59144 (In Progress): quincy: rgw: request QUERY_STRING is duplicated into ops-log uri element
- 07:12 AM rgw Backport #58327 (In Progress): quincy: ListOpenIDConnectProviders XML format error
- 07:11 AM rgw Bug #57784 (Resolved): beast frontend crashes on exception from socket.local_endpoint()
- 07:11 AM rgw Backport #58235 (In Progress): quincy: multisite sync process block after long time running
- 07:08 AM rgw Backport #58237 (Resolved): quincy: beast frontend crashes on exception from socket.local_endpoint()
- 07:08 AM rgw Bug #56572 (Resolved): pubsub test failures
- 07:08 AM rgw Backport #57561 (Resolved): quincy: pubsub test failures
- 06:55 AM rgw Bug #59048 (Resolved): DeleteObjects response does not include DeleteMarker/DeleteMarkerVersionId
- 06:55 AM rgw Backport #59132 (Resolved): quincy: DeleteObjects response does not include DeleteMarker/DeleteMarkerVersionId
- 06:55 AM rgw Bug #57881 (Resolved): LDAP invalid password resource leak fix
- 06:54 AM Ceph Bug #55079 (Resolved): rpm: remove contents of build directory at end of %install section
- 06:49 AM crimson Bug #65585: unittest-seastore (Timeout)
- Xuehan Xu wrote in #note-9:
> Yingxin Cheng wrote in #note-8:
> > Seem to me the blocking issue of test-seastore re... - 06:46 AM crimson Bug #65585: unittest-seastore (Timeout)
- Yingxin Cheng wrote in #note-8:
> Seem to me the blocking issue of test-seastore reveals a deadlock from background ... - 06:37 AM crimson Bug #65585: unittest-seastore (Timeout)
- Seems to me the blocking issue of test-seastore reveals a deadlock from background cleaning -- the IO transaction didn...
- 06:43 AM Ceph QA QA Run #65680 (QA Testing): wip-mchangir-testing-20240429.064231-main-debug
- * "PR #44359":https://github.com/ceph/ceph/pull/44359 -- mds: un-inline data on scrub
- 06:27 AM RADOS Backport #63879 (Resolved): quincy: tools/ceph_objectstore_tool: Support get/set/superblock
- 06:27 AM rgw Backport #64426 (Resolved): reef: rgw: rados objects wrongly deleted
- 05:17 AM rgw Backport #64325 (In Progress): reef: multisite: avoid writing multipart parts to the bucket index log
- 04:53 AM rgw Backport #64324 (In Progress): quincy: multisite: avoid writing multipart parts to the bucket index log
- 03:24 AM crimson Bug #65679 (New): osd crashes due to inconsistency between the in-memory cache and on disk data of the snap mapper
- Operations in crimson can be interrupted, which is different from classic osds. The implementation of SnapMapper foll...
04/28/2024
- 05:02 AM CephFS Bug #50821: qa: untar_snap_rm failure during mds thrashing
- This is the same issue with https://tracker.ceph.com/issues/62036, which has already been fixed and it hit again. It ...
- 04:45 AM CephFS Bug #50821: qa: untar_snap_rm failure during mds thrashing
- Okay, finally it was because the *mds.b* crashed and this was why it wasn't brought up:...
- 04:40 AM CephFS Bug #50821: qa: untar_snap_rm failure during mds thrashing
- It seems the *mds.b* daemon wasn't brought up in *300s* and then the watchdog barked and then all the daemons were ki...
- 02:56 AM CephFS Bug #50821: qa: untar_snap_rm failure during mds thrashing
- Patrick Donnelly wrote in #note-10:
> [...]
>
> From: /teuthology/pdonnell-2024-04-20_23:33:17-fs-wip-pdonnell-te... - 03:21 AM crimson Bug #65673 (Fix Under Review): the main branch fails gcc-13 compilation
- 01:41 AM CephFS Backport #65363 (Rejected): reef: qa/cephfs: test_idem_unaffected_root_squash fails
- The depending patches were not backported to reef: https://tracker.ceph.com/issues/47264.
- 01:41 AM CephFS Backport #65362 (Rejected): quincy: qa/cephfs: test_idem_unaffected_root_squash fails
- The depending patches were not backported to quincy: https://tracker.ceph.com/issues/47264.
- 01:28 AM CephFS Backport #65361 (Fix Under Review): squid: qa/cephfs: test_idem_unaffected_root_squash fails
- 01:15 AM CephFS Backport #65323 (Fix Under Review): squid: src/mds/MDCache.cc: 5131: FAILED ceph_assert(isolated_inodes.empty())
- 01:15 AM CephFS Backport #65321 (Fix Under Review): reef: src/mds/MDCache.cc: 5131: FAILED ceph_assert(isolated_inodes.empty())
- 01:15 AM CephFS Backport #65322 (Fix Under Review): quincy: src/mds/MDCache.cc: 5131: FAILED ceph_assert(isolated_inodes.empty())
- 01:14 AM CephFS Backport #65677 (Fix Under Review): quincy: mds: the name and descriptions of the inotable testing only options need to be fixed
- 01:02 AM CephFS Backport #65677 (Fix Under Review): quincy: mds: the name and descriptions of the inotable testing only options need to be fixed
- *mds_kill_skip_replaying_inotable* and *mds_inject_skip_replaying_inotable* are exactly the same, which is incorrect....
- 01:14 AM CephFS Backport #65676 (Fix Under Review): reef: mds: the name and descriptions of the inotable testing only options need to be fixed
- 01:01 AM CephFS Backport #65676 (Fix Under Review): reef: mds: the name and descriptions of the inotable testing only options need to be fixed
- *mds_kill_skip_replaying_inotable* and *mds_inject_skip_replaying_inotable* are exactly the same, which is incorrect....
- 01:13 AM CephFS Backport #65675 (Fix Under Review): squid: mds: the name and descriptions of the inotable testing only options need to be fixed
- 01:00 AM CephFS Backport #65675 (Fix Under Review): squid: mds: the name and descriptions of the inotable testing only options need to be fixed
- *mds_kill_skip_replaying_inotable* and *mds_inject_skip_replaying_inotable* are exactly the same, which is incorrect....
04/27/2024
- 03:51 PM bluestore Backport #65485 (In Progress): squid: bluestore/bluestore_types: check 'it' valid before using
- 03:50 PM bluestore Backport #65484 (In Progress): reef: bluestore/bluestore_types: check 'it' valid before using
- 03:49 PM bluestore Backport #65483 (Resolved): quincy: bluestore/bluestore_types: check 'it' valid before using
- 01:32 PM bluestore Bug #64511 (Pending Backport): kv/RocksDBStore: rocksdb_cf_compact_on_deletion has no effect on the default column family
- 03:12 AM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
- s/good/off/g
- 03:11 AM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
- Something still looks good though with the numbers. It's like Ceph isn't balanced? Sure, OSD 3 RAW USE and DATA are t...
- 01:59 AM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
- ...
- 01:46 AM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
- Yeah, I @kill -9@'d the OSD process in the pod and that resolved the problem. Thanks!
04/26/2024
- 11:16 PM RADOS Bug #57061 (Pending Backport): Use single cluster log level (mon_cluster_log_level) config to control verbosity of cluster logs while logging to external entities
- 10:46 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
- Generally what you need is to shutdown OSD process in a non-graceful manner. And let it rebuild allocmap during the f...
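In practice the non-graceful shutdown can be as simple as the following, assuming a single ceph-osd process per container (sketch only):
<pre>
# kill the OSD so it cannot shut down cleanly ...
kill -9 $(pidof ceph-osd)
# ... then let systemd / the orchestrator restart it; the allocation map is rebuilt on the next start
</pre>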
- 06:45 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
- I read https://tracker.ceph.com/issues/63858#note-7, but I'm not sure how to apply the workaround. I have tried delet...
- 04:24 PM bluestore Bug #65659 (Triaged): OSD Resize Increases Used Capacity Not Available Capacity
- 04:24 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
- Hi James,
so I'm pretty sure this is a duplicate of https://tracker.ceph.com/issues/63858
Please see https://trac... - 08:19 PM rbd Backport #65547 (Resolved): quincy: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
- 12:48 PM rbd Backport #65547: quincy: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/57029
merged - 06:18 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
- Kamoltat (Junior) Sirivadhna wrote in #note-13:
> @yuriw Please rerun this we have to many failed and dead jobs due ... - 05:00 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
- @yuriw Please rerun this; we have too many failed and dead jobs due to infrastructure failures.
- 03:00 PM mgr Cleanup #55835: mgr: mute/hide NOTIFY_TYPES log errors
- I made a draft pr for this: https://github.com/ceph/ceph/pull/57106
I didn't feel like verifying the unit tests pa... - 02:58 PM Ceph QA QA Run #65674 (QA Testing): wip-rishabh-testing-20240426.111959
- https://github.com/ceph/ceph/pull/56981
https://github.com/ceph/ceph/pull/56846
https://github.com/ceph/ceph/pull/5... - 02:31 PM Ceph QA QA Run #65516 (QA Closed): wip-rishabh-testing-20240416.193735
- 02:31 PM Ceph QA QA Run #65516: wip-rishabh-testing-20240416.193735
- Testing was significantly slowed down. First due to new and persistent infra failures on CentOS 9 that caused ~95 dea...
- 12:48 PM Ceph QA QA Run #65638 (QA Closed): wip-yuriw4-testing-20240423.151325-quincy
- 07:07 AM Ceph QA QA Run #65638 (QA Approved): wip-yuriw4-testing-20240423.151325-quincy
- 12:06 PM Dashboard Bug #61312 (Fix Under Review): The command "ceph config set mgr mgr/dashboard/redirect_resolve_ip_addr True" fails
- 11:43 AM rgw Backport #65666 (Fix Under Review): squid: rgw/lc: A few buckets stuck in UNINITIAL state
- 11:04 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- Patrick Donnelly wrote in #note-23:
> Dhairya Parmar wrote in #note-21:
> > Dhairya Parmar wrote in #note-20:
> > ... - 10:57 AM CephFS Bug #61660 (Pending Backport): mds: the name and descriptions of the inotable testing only options need to be fixed
- 10:28 AM crimson Bug #65673 (Fix Under Review): the main branch fails gcc-13 compilation
- ...
- 10:10 AM crimson Bug #65672 (New): Rados write requests user_version set to 0 when pg interval changes lead to duplicated client requests.
- Client logs:...
- 09:47 AM Orchestrator Bug #65671: Add node-exporter using ceph orch
- Vahideh Alinouri wrote:
> I think there is a functionality issue in below command because cephadm log printed succ... - 09:30 AM Orchestrator Bug #65671: Add node-exporter using ceph orch
- Vahideh Alinouri wrote:
I have tried to add node-exporter to new host in ceph cluster by the command mentioned in do... - 09:28 AM Orchestrator Bug #65671: Add node-exporter using ceph orch
- I have tried to add node-exporter to a new host in the Ceph cluster using the command mentioned in the document.
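For reference, the documented way to deploy node-exporter with the orchestrator looks roughly like this (host name is a placeholder):
<pre>
ceph orch apply node-exporter --placement="newhost"
# or, to run it on every host:
ceph orch apply node-exporter '*'
</pre>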
- 09:26 AM Orchestrator Bug #65671 (New): Add node-exporter using ceph orch
- ...
- 09:44 AM RADOS Bug #65670: src/test/osd/RadosModel.h inappriopriate erasing items while iterating std::set caused test case failures.
- This seems to be wrong..
- 08:44 AM RADOS Bug #65670: src/test/osd/RadosModel.h inappriopriate erasing items while iterating std::set caused test case failures.
- https://github.com/ceph/ceph/pull/57100
- 08:42 AM RADOS Bug #65670 (Closed): src/test/osd/RadosModel.h inappriopriate erasing items while iterating std::set caused test case failures.
- ...
- 08:57 AM CephFS Bug #65669 (Fix Under Review): QuiesceDB responds with a misleading error to a quiesce-await of a terminated set.
- 03:27 AM CephFS Bug #65669 (In Progress): QuiesceDB responds with a misleading error to a quiesce-await of a terminated set.
- 01:12 AM CephFS Bug #65669 (Fix Under Review): QuiesceDB responds with a misleading error to a quiesce-await of a terminated set.
- This design decision appears counterintuitive after having seen it in the wild.
Here the --await was sent with a d... - 07:38 AM CephFS Bug #63906: Inconsistent file mode across two clients
- Hi Leonid,
Thanks. Good to hear. - 02:43 AM CephFS Bug #63906: Inconsistent file mode across two clients
- Tao, I appreciate your quick and detailed response. I reviewed the client code for setxattr, and the code doesn't use...
- 02:21 AM crimson Bug #65610 (Resolved): unittest-object-data-handler crashes testing object_data_handler_test_t.overwrite_then_read_within_transaction
- 01:25 AM Ceph QA QA Run #65655 (QA Needs Approval): wip-yuri2-testing-2024-04-24-0914-squid
- @rfriedma can you pls review when done?
04/25/2024
- 11:05 PM rgw Bug #65668 (Fix Under Review): Notification: Persistent queue not deleted when topic is deleted via radosgw-admin
- Post this "commit":https://github.com/ceph/ceph/commit/4c50ad69c37110d42f1f68f6e567cdf5ac506a32, the logic to remove ...
- 07:36 PM CephFS Bug #63906: Inconsistent file mode across two clients
- Hi Leonid,
Thanks for your question. I'll address your points of confusion below:
> I'd like to suggest that the issue here i... - 06:03 PM CephFS Bug #63906: Inconsistent file mode across two clients
- Hi, I came here from the PR.
I'd like to suggest that the issue here is not a bug in the MDS. While I can't find a... - 07:28 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
- I reached a point where I could resize another OSD to get the output from @ceph tell osd.N perf dump bluefs@. I follo...
- 04:17 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
- I'm actively working on this cluster. I have already replaced osd.0 to move forward with my work. I'll need to perfor...
- 04:08 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
- And please be aware of https://tracker.ceph.com/issues/63858
- 03:59 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
- James,
would you please share the output of @ceph tell osd.N perf dump bluefs@ after such an expansion then?
- 02:04 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
- Igor Fedotov wrote in #note-1:
> Hi James!
> I presume you haven't run ceph-bluestore-tool's bluefs-bdev-expand com... - 01:56 PM bluestore Bug #65659: OSD Resize Increases Used Capacity Not Available Capacity
- Hi James!
I presume you haven't run ceph-bluestore-tool's bluefs-bdev-expand command against expanded OSD(s), have y... - 06:27 PM CephFS Bug #65660: mds: drop client metrics during recovery
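In case it helps, the expand step for a non-containerized OSD typically looks like this (adapt the unit and path for container deployments; osd id 3 is only an example):
<pre>
systemctl stop ceph-osd@3
ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-3
systemctl start ceph-osd@3
</pre>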
- Christopher Hoffman wrote in #note-2:
> >there's little reason to record historical metrics from the clients
>
> ... - 06:19 PM CephFS Bug #65660: mds: drop client metrics during recovery
- Xiubo Li wrote in #note-1:
> Is this new in the upstream master ? As I remembered we have improved this and the clie... - 04:49 PM CephFS Bug #65660: mds: drop client metrics during recovery
- >there's little reason to record historical metrics from the clients
Can you expand on this? Are we losing anythin... - 12:38 AM CephFS Bug #65660: mds: drop client metrics during recovery
- Is this new in the upstream master ? As I remembered we have improved this and the clients will only send the metrics...
- 12:34 AM CephFS Bug #65660 (In Progress): mds: drop client metrics during recovery
- When the rank is coming up, there's little reason to record historical metrics from the clients. We've also seen floo...
- 06:22 PM rgw Backport #65667 (New): reef: rgw/lc: A few buckets stuck in UNINITIAL state
- 06:21 PM rgw Backport #65666 (Resolved): squid: rgw/lc: A few buckets stuck in UNINITIAL state
- 06:21 PM rgw Backport #65665 (New): quincy: rgw/lc: A few buckets stuck in UNINITIAL state
- 06:21 PM rgw Bug #65160: rgw/lc: A few buckets stuck in UNINITIAL state
- quincy and reef backports also need https://github.com/ceph/ceph/pull/47595
- 02:35 PM rgw Bug #65160 (Pending Backport): rgw/lc: A few buckets stuck in UNINITIAL state
- 03:31 PM rgw Bug #63791: RGW: a subuser with no permission can still list buckets and create buckets
- I believe this is also an issue for subusers with read permissions: they can still create buckets (at least on Quincy...
- 03:02 PM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- Dhairya Parmar wrote in #note-21:
> Dhairya Parmar wrote in #note-20:
> > Patrick Donnelly wrote in #note-19:
> > ... - 03:02 PM CephFS Bug #65265 (Fix Under Review): qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- Dhairya Parmar wrote in #note-20:
> Patrick Donnelly wrote in #note-19:
> > Dhairya Parmar wrote in #note-18:
> > ... - 03:00 PM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- Dhairya Parmar wrote in #note-20:
> Patrick Donnelly wrote in #note-19:
> > Dhairya Parmar wrote in #note-18:
> > ... - 02:34 PM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- Patrick Donnelly wrote in #note-19:
> Dhairya Parmar wrote in #note-18:
> > I ran a couple of NFS jobs, no `MGR_DOWN`... - 11:41 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- Dhairya Parmar wrote in #note-18:
> I ran a couple of NFS jobs, no `MGR_DOWN` reported
>
> https://pulpito.ceph.c... - 10:19 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- I ran a couple of NFS jobs, no `MGR_DOWN` reported
https://pulpito.ceph.com/dparmar-2024-04-10_06:37:26-fs:nfs-wip... - 09:21 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- main branch - https://pulpito.ceph.com/rishabh-2024-04-24_07:32:23-fs-rishabh-main-apr17-a654945-testing-default-smit...
- 02:45 PM rgw Bug #65664: Crash observed in boost::asio module related to stream.async_shutdown()
- quoting https://www.boost.org/doc/libs/1_82_0/doc/html/boost_asio/reference/ssl__error__stream_errors.html:...
- 01:52 PM rgw Bug #65664: Crash observed in boost::asio module related to stream.async_shutdown()
- following testing with openssl s_client, s3cmd and Warp, the following ec's occur under normal conditions:...
- 01:51 PM rgw Bug #65664 (Fix Under Review): Crash observed in boost::asio module related to stream.async_shutdown()
- continuing from downstream BZ#2275284
call stack:... - 02:45 PM Ceph QA QA Run #65655: wip-yuri2-testing-2024-04-24-0914-squid
- jammy failed, retriggered
- 02:32 PM rgw Bug #65590 (Fix Under Review): rgw_multi.tests.test_topic_notification_sync: PutBucketNotificationConfiguration fails with ConcurrentModification
- 02:21 PM rgw Bug #65626 (Fix Under Review): rgw: false assumption on vault bucket key deletion
- 02:19 PM Ceph QA QA Run #65638 (QA Needs Approval): wip-yuriw4-testing-20240423.151325-quincy
- 02:19 PM Ceph QA QA Run #65638: wip-yuriw4-testing-20240423.151325-quincy
- Ilya Dryomov wrote in #note-7:
> Hi Yuri,
>
> This needs a rerun for krbd since krbd_rxbounce job died on reboot ... - 11:37 AM Ceph QA QA Run #65638 (QA Needs Rerun/Rebuilt): wip-yuriw4-testing-20240423.151325-quincy
- 11:36 AM Ceph QA QA Run #65638: wip-yuriw4-testing-20240423.151325-quincy
- Hi Yuri,
This needs a rerun for krbd since krbd_rxbounce job died on reboot for some reason. - 02:19 PM Ceph Bug #65652: vstart.sh can not start
- It happens when compiling the main branch (recent) on Ubuntu 22; @nm libec_jerasure.so@ shows jerasure_init as a "U" (undefined) symbol
but r... - 01:50 PM bluestore Fix #65600 (Fix Under Review): bluefs alloc unit should only be shrink
- 01:14 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- Hi Nir,
I built a container based on 18.2.3 (an upcoming release). It would be great if you could try it: podman ... - 08:35 AM rbd Bug #65487 (Fix Under Review): rbd-mirror daemon in ERROR state, require manual restart
- 01:10 PM rgw Bug #65462 (Pending Backport): rgw: eliminate ssl enforcement for sse-s3 encryption
- 01:09 PM rgw Bug #65473 (Pending Backport): rgw: exclude logging of request payer for 403 requests
- 12:06 PM rbd Backport #65587 (In Progress): squid: insufficient randomness for group and group snapshot IDs
- 12:04 PM rbd Backport #65586 (In Progress): reef: insufficient randomness for group and group snapshot IDs
- 12:03 PM rbd Backport #65588 (In Progress): quincy: insufficient randomness for group and group snapshot IDs
- 11:45 AM rbd Bug #65653 (Duplicate): run-rbd-unit-tests-0.sh: TestMigration.StressLive failure
- 11:45 AM rbd Bug #65653: run-rbd-unit-tests-0.sh: TestMigration.StressLive failure
- This is with RBD_FEATURES=0 and a slightly different mismatch, but still too similar to track separately.
- 11:31 AM crimson Bug #65663 (New): Enable LibRadosSnapshotsSelfManagedPP.RollbackPP
- The test is currently disabled by `SKIP_IF_CRIMSON()`.
LibRadosSnapshotsPP.RollbackPP is supported and so should Lib... - 10:33 AM RADOS Bug #54744: crash: void MonMap::add(const mon_info_t&): assert(addr_mons.count(a) == 0)
- Seems like the same story here for Pacific 16.2.15.
My prev monmap before change:... - 09:03 AM rgw Bug #62136: "test pushing kafka s3 notification on master" - no events are sent
- created a separate tracker: https://tracker.ceph.com/issues/65662
- 04:31 AM rgw Bug #62136: "test pushing kafka s3 notification on master" - no events are sent
- could be another issue with the kafka consumer (on top of what was fixed in PR 54637):...
- 09:02 AM rgw-testing Bug #65662 (New): kafka: no creation event found for key
- test is still failing even after the fix from https://github.com/ceph/ceph/pull/54637.
see: https://tracker.ceph.com... - 07:42 AM RADOS Bug #62992: Heartbeat crash in reset_timeout and clear_timeout
- Reef backport is in QA
- 07:34 AM crimson Feature #65478: Support SnapMapper::Scrubber
- This issue is hard for me, so I need more time.
- 06:45 AM CephFS Bug #65647: Evicted kernel client may get stuck after reconnect
- Xiubo Li wrote in #note-3:
> Is that possible to enable the mds debug logs, let's see whether there are other logs... - 01:10 AM CephFS Bug #65647: Evicted kernel client may get stuck after reconnect
- Mykola Golub wrote:
> Our customer were observing sporadic "client isn't responding to mclientcaps(revoke)" issue so... - 01:00 AM CephFS Bug #65647: Evicted kernel client may get stuck after reconnect
- Xiubo Li wrote in #note-1:
> I think you have enabled *recover_session* in kclient ?
>
> [...]
>
> More detail... - 12:48 AM CephFS Bug #65647: Evicted kernel client may get stuck after reconnect
- I think you have enabled *recover_session* in kclient ?...
- 04:07 AM CephFS Bug #65630 (Fix Under Review): mds: rename request was deadlocked between two different MDSs
- 01:45 AM Ceph QA QA Run #65661 (QA Closed): wip-pdonnell-testing-20240425.015853-debug
- * "PR #57059":https://github.com/ceph/ceph/pull/57059 -- mds: abort fragment/export when quiesced
* "PR #57044":http... - 01:33 AM CephFS Bug #65603 (Fix Under Review): mds: quiesce timeout due to a freezing directory
04/24/2024
- 11:16 PM bluestore Bug #65659 (Triaged): OSD Resize Increases Used Capacity Not Available Capacity
- h1. Deviation from expected behavior
After resizing the underlying disk at the hypervisor and OS level *resizing t... - 07:49 PM rbd Bug #46875: TestLibRBD.TestPendingAio: test_librbd.cc:4539: Failure or SIGSEGV
- from https://jenkins.ceph.com/job/ceph-pull-requests/133893/consoleFull...
- 07:41 PM CephFS Bug #65658 (Fix Under Review): mds: MetricAggregator::ms_can_fast_dispatch2 acquires locks
- 07:33 PM CephFS Bug #65658 (Fix Under Review): mds: MetricAggregator::ms_can_fast_dispatch2 acquires locks
- There was a lot of discussion surrounding this in
https://github.com/ceph/ceph/pull/26004/
but circling back we... - 06:21 PM Orchestrator Bug #65657 (New): doc: lack of clarity for explicit placement analogue in yaml spec
- https://docs.ceph.com/en/latest/cephadm/services/#explicit-placements
Specifically, I'm wondering if "host:[ip]=na... - 05:45 PM CephFS Tasks #65615 (Resolved): lchown corrupts symlink entry
- The code was using the parent dir entry's fscrypt info/key. Using an incorrect key to decrypt will yield incorrect plaintext...
- 05:05 PM Ceph QA QA Run #65641 (QA Building): wip-yuriw8-testing-20240424.000125-main
- build failed
https://jenkins.ceph.com/job/ceph-dev-new-build/ARCH=x86_64,AVAILABLE_ARCH=x86_64,AVAILABLE_DIST=cent... - 02:03 PM Ceph QA QA Run #65641: wip-yuriw8-testing-20240424.000125-main
- repushed
- 12:01 AM Ceph QA QA Run #65641 (QA Needs Approval): wip-yuriw8-testing-20240424.000125-main
- --- done. these PRs were included:
https://github.com/ceph/ceph/pull/51171 - osd/scrub: Change scrub cost to average... - 04:58 PM rgw Bug #65656: Reduce default thread pool size
- Test env:
---------
3x MON/MGR nodes
Dell R630
2x E5-2683 v3 (28 total cores, 56 threads)
128 GB RAM
8x... - 04:41 PM rgw Bug #65656 (Fix Under Review): Reduce default thread pool size
- Our recent RGW thread pool size profiling (RHEL 9.2, Ceph 18.2.0-131) revealed that for both smaller (max 256KB) and ...
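For anyone wanting to experiment while the default is being discussed, the pool size can be overridden per RGW instance (the value here is only illustrative, not the proposed default):
<pre>
ceph config set client.rgw rgw_thread_pool_size 256
</pre>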
- 04:15 PM Ceph QA QA Run #65655 (QA Closed): wip-yuri2-testing-2024-04-24-0914-squid
--- done. these PRs were included:
https://github.com/ceph/ceph/pull/56777 - squid: osd/scrub: implement reservat...- 03:59 PM Orchestrator Backport #64844 (Resolved): reef: Regression: Permanent KeyError: 'TYPE' : return self.blkid_api['TYPE'] == 'part'
- 03:58 PM Orchestrator Bug #65035 (Duplicate): ERROR: required file missing from config-json: idmap.conf
- duplicate of https://tracker.ceph.com/issues/65155
- 03:54 PM Orchestrator Bug #64118 (Resolved): cephadm: RuntimeError: Failed command: apt-get update: E: The repository 'https://download.ceph.com/debian-quincy jammy Release' does not have a Release file.
- I think this should be fixed now that we have quincy jammy builds
- 03:34 PM Orchestrator Backport #65378 (Resolved): squid: cephadm: client-keyring also overwrites ceph.conf
- 03:15 PM rgw Bug #65654 (New): run-bucket-check.sh: failed assert len(json_out) == len(unlinked_keys)
- https://qa-proxy.ceph.com/teuthology/suriarte-2024-04-23_15:04:03-rgw-rgw-update-boost-redis-distro-default-smithi/76...
- 03:13 PM nvme-of Backport #65649 (In Progress): squid: Change some default values for OMAP lock parameters in nvmeof conf file
- 01:45 PM nvme-of Backport #65649 (In Progress): squid: Change some default values for OMAP lock parameters in nvmeof conf file
- https://github.com/ceph/ceph/pull/56497
- 03:05 PM Ceph QA QA Run #65638 (QA Needs Approval): wip-yuriw4-testing-20240423.151325-quincy
- @idryomov the rgw PRs will be removed, so only one rbd PR is left
- 02:21 PM Ceph QA QA Run #65638: wip-yuriw4-testing-20240423.151325-quincy
- @cbodley Can't schedule rgw
yuriw@teuthology ~ [14:09:41]> teuthology-suite -v --ceph-repo $CEPH_REPO -c $CEPH_B... - 02:56 PM nvme-of Backport #65650 (In Progress): reef: Change some default values for OMAP lock parameters in nvmeof conf file
- 01:46 PM nvme-of Backport #65650 (In Progress): reef: Change some default values for OMAP lock parameters in nvmeof conf file
- https://github.com/ceph/ceph/pull/56498
- 02:52 PM rbd Bug #65653 (Duplicate): run-rbd-unit-tests-0.sh: TestMigration.StressLive failure
- from https://jenkins.ceph.com/job/ceph-pull-requests/133815/consoleFull on a squid pr:...
- 02:52 PM Orchestrator Feature #65398: allow images from private repos in teuthology test/ceph orch/cephadm
- Sorry the last two weeks have been much busier than usual and this slipped my mind. I discussed this with Adam King a...
- 12:55 AM Orchestrator Feature #65398: allow images from private repos in teuthology test/ceph orch/cephadm
- Any thoughts on this, John? I have to install a cluster from a private repo tomorrow, and it reminded me we'd had th...
- 02:44 PM Ceph Bug #65652: vstart.sh can not start
- https://github.com/ceph/ceph/pull/57077
- 02:12 PM Ceph Bug #65652 (New): vstart.sh can not start
2024-04-24T21:26:01.158+0800 7f4ec09ffd40 -1 load dlopen(/home/ecs-assist-user/ceph/build/lib/libec_jerasure.so): /...- 02:40 PM rgw Bug #64841 (Triaged): java_s3tests: testObjectCreateBadExpectMismatch failure
- 02:32 PM rgw Bug #62136: "test pushing kafka s3 notification on master" - no events are sent
- This says resolved, but I still see failures like this on main:...
- 02:12 PM rgw Bug #65651 (New): s3select: test_true_false_in_expressions s3test failure
- from a rgw/sts job based on recent main
https://qa-proxy.ceph.com/teuthology/cbodley-2024-04-24_12:59:55-rgw-wip-cbo... - 01:40 PM nvme-of Feature #65566 (Pending Backport): Change some default values for OMAP lock parameters in nvmeof conf file
- 01:39 PM rgw Feature #18621 (Resolved): rgw: change default chunk size
- 12:28 PM rgw Bug #65648 (New): TestAMQP.MaxConnections FAILED ceph_assert(!conn->state)
- ...
- 11:56 AM Dashboard Bug #61312: The command "ceph config set mgr mgr/dashboard/redirect_resolve_ip_addr True" fails
- Nizamudeen tells me the following through Slack:
BEGIN QUOTED TEXT
this particular configuration is introduced ... - 11:54 AM Ceph Documentation #65631 (Resolved): clarify dual-stack mode
- 11:53 AM RADOS Backport #65646 (Fix Under Review): squid: osd/scrub: must disable reservation timeout for reserver-based requests
- 11:12 AM RADOS Backport #65646 (Resolved): squid: osd/scrub: must disable reservation timeout for reserver-based requests
- 11:18 AM CephFS Bug #65647 (Triaged): Evicted kernel client may get stuck after reconnect
- Our customer was observing sporadic "client isn't responding to mclientcaps(revoke)" issues, so they configured auto e...
- 11:04 AM RADOS Bug #65044 (Pending Backport): osd/scrub: must disable reservation timeout for reserver-based requests
- 10:03 AM rgw Bug #65645 (New): lifecycle notifications are sent from radosgw-admin
- when "radosgw-admin lc process" is called, and there are buckets that have bucket notification events set with "Objec...
- 09:43 AM CephFS Backport #65644 (Fix Under Review): quincy: qa/cephfs: absence of e03331e causes test_nfs to fail
- @tasks.cephfs.test_nfs.TestNFS.test_non_existent_cluster@ failed on here - https://pulpito.ceph.com/vshankar-2024-03-...
- 09:33 AM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- Ilya Dryomov wrote in #note-17:
> However, the "unable to connect to remote cluster" error isn't cleared and you cont... - 09:31 AM rgw Bug #64999: Slow RGW multisite sync due to "304 Not Modified" responses on primary zone
- Hi All,
I just wanted to quickly follow up on my previous query about "Slow RGW multisite sync
due to '304 Not Modi... - 08:36 AM Dashboard Bug #65643 (New): mgr/dashboard: dashboard landing page cant be seen as readonly
- As a read-only user you should be able to view the landing page, but it is not possible.
- 06:54 AM Dashboard Cleanup #65070 (Resolved): mgr/dashboard: use alertmanager v2 APIs mgr/dashboard: short_description
- 06:54 AM Dashboard Backport #65255 (Resolved): squid: mgr/dashboard: use alertmanager v2 APIs mgr/dashboard: short_description
- 06:22 AM crimson Bug #65585: unittest-seastore (Timeout)
- If each test's execution time was correct, the timeout is caused by getting stuck in one of the tests.
e.g. https://jenkins.ceph... - 02:07 AM CephFS Tasks #65613: truncate failing when using path
- Greg Farnum wrote in #note-2:
> Hmm, I'm surprised you found missing Server logic here. Shouldn't that have turned u...
04/23/2024
- 11:35 PM Ceph QA QA Run #65126 (QA Closed): wip-yuri8-testing-2024-03-25-1419
- 10:47 PM Ceph QA QA Run #65126 (QA Approved): wip-yuri8-testing-2024-03-25-1419
- @yuriw rados approved: https://tracker.ceph.com/projects/rados/wiki/MAIN#httpstrackercephcomissues65126
- 10:03 PM Ceph QA QA Run #65126 (QA Needs Approval): wip-yuri8-testing-2024-03-25-1419
- 11:18 PM bluestore Bug #56262: crash: BlueStore::_txc_create(BlueStore::Collection*, BlueStore::OpSequencer*, std::list<Context*, std::allocator<Context*> >*, boost::intrusive_ptr<TrackedOp>)
- There seems to be some race condition at the time of OSD shutdown. The kv db handle was destroyed and one of OSD thre...
- 10:27 PM RADOS Bug #54515: mon/health-mute.sh: TEST_mute: return 1 (HEALTH WARN 3 mgr modules have failed dependencies)
- /a/lflores-2024-04-01_18:07:25-rados-wip-yuri8-testing-2024-03-25-1419-distro-default-smithi/7634102
- 07:17 PM Ceph QA QA Run #65552 (QA Closed): wip-yuri2-testing-2024-04-17-0823-reef
- 04:02 PM Ceph QA QA Run #65552 (QA Approved): wip-yuri2-testing-2024-04-17-0823-reef
- @yuriw rados approved! https://tracker.ceph.com/projects/rados/wiki/REEF#httpstrackercephcomissues65552
- 06:38 PM Dashboard Bug #62972: ERROR: test_list_enabled_module (tasks.mgr.dashboard.test_mgr_module.MgrModuleTest)
- https://jenkins.ceph.com/job/ceph-api/72895/ on main
- 06:30 PM RADOS Backport #65376 (In Progress): quincy: crash: void PaxosService::propose_pending(): assert(have_pending)
- 06:29 PM RADOS Backport #65377 (In Progress): reef: crash: void PaxosService::propose_pending(): assert(have_pending)
- 06:28 PM mgr Backport #65621 (In Progress): quincy: mgr: update cluster state for new maps from the mons before notifying modules
- 06:28 PM mgr Backport #65623 (In Progress): reef: mgr: update cluster state for new maps from the mons before notifying modules
- 06:27 PM mgr Backport #65622 (In Progress): squid: mgr: update cluster state for new maps from the mons before notifying modules
- 06:27 PM CephFS Backport #65620 (In Progress): squid: qa: test_max_items_per_obj open procs not fully cleaned up
- 06:26 PM CephFS Backport #65619 (In Progress): squid: mds: quiesce_counter decay rate initialized from wrong config
- 06:23 PM CephFS Backport #65273 (In Progress): squid: PG_DEGRADED warnings during cluster creation via cephadm: "Health check failed: Degraded data redundancy: 2/192 objects degraded (1.042%), 1 pg degraded (PG_DEGRADED)"
- 06:20 PM Ceph Bug #64095 (Resolved): ceph-exporter is not included in the deb packages
- 06:19 PM Ceph Bug #63637 (Resolved): debian packaging is missing bcrypt dependency for ceph-mgr's .requires file
- 06:18 PM Ceph Backport #63638 (Resolved): reef: debian packaging is missing bcrypt dependency for ceph-mgr's .requires file
- 06:12 PM Ceph Backport #63638: reef: debian packaging is missing bcrypt dependency for ceph-mgr's .requires file
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/54662
merged - 06:18 PM Ceph Backport #65172 (Resolved): reef: ceph-exporter is not included in the deb packages
- 06:13 PM Ceph Backport #65172: reef: ceph-exporter is not included in the deb packages
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56541
merged - 05:52 PM CephFS Bug #65603 (In Progress): mds: quiesce timeout due to a freezing directory
- 04:37 PM rgw Backport #65640 (In Progress): squid: [rgw][accounts] bucket quota management at account-level
- 04:35 PM rgw Backport #65640 (Resolved): squid: [rgw][accounts] bucket quota management at account-level
- https://github.com/ceph/ceph/pull/57058
- 04:35 PM rgw Feature #65551 (Pending Backport): [rgw][accounts] bucket quota management at account-level
- 04:23 PM bluestore Bug #65482 (Fix Under Review): bluestore/bluestore_types: check 'it' valid before using
- 04:22 PM rgw Backport #65002 (Resolved): quincy: [CVE-2023-46159] RGW crash upon misconfigured CORS rule
- 04:07 PM Ceph QA QA Run #65574: wip-yuri7-testing-2024-04-18-1351-reef
- @matan can you review this run? Thought you'd be a good candidate since two of the PRs are yours.
- 04:06 PM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
- Greg Farnum wrote in #note-33:
> Venky Shankar wrote in #note-30:
> > OK. I'll elaborate. Generally, clients are no... - 04:02 PM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
- Venky Shankar wrote in #note-32:
> Dhairya Parmar wrote in #note-28:
> > as mentioned in yesterday's standup - some... - 03:25 PM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
- Venky Shankar wrote in #note-30:
> OK. I'll elaborate. Generally, clients are not trustable - someone can hook up a ... - 04:05 PM Ceph QA QA Run #65560: wip-yuri5-testing-2024-04-17-1400
- Hey @amathuri can you review this batch?
- 04:04 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
- @ksirivad back to you!
- 02:14 PM Ceph QA QA Run #65349 (QA Needs Approval): wip-yuri3-testing-2024-04-05-0825
- 03:59 PM RADOS Bug #61832: Restoring #61785: osd-scrub-dump.sh: ERROR: Extra scrubs after test completion...not expected
- /a/yuriw-2024-04-22_18:19:58-rados-wip-yuri2-testing-2024-04-17-0823-reef-distro-default-smithi/7668423
- 03:55 PM Infrastructure Bug #47690: RuntimeError: Stale jobs detected, aborting.
- /a/yuriw-2024-04-22_18:19:58-rados-wip-yuri2-testing-2024-04-17-0823-reef-distro-default-smithi/7668439
- 03:53 PM bluestore Bug #56788: crash: void KernelDevice::_aio_thread(): abort
- /a/yuriw-2024-04-22_18:19:58-rados-wip-yuri2-testing-2024-04-17-0823-reef-distro-default-smithi/7668449...
- 03:49 PM CephFS Tasks #65613: truncate failing when using path
- Hmm, I'm surprised you found missing Server logic here. Shouldn't that have turned up in kernel fscrypt testing? Xiub...
- 03:48 PM RADOS Bug #62992: Heartbeat crash in reset_timeout and clear_timeout
- /a/yuriw-2024-04-22_18:19:58-rados-wip-yuri2-testing-2024-04-17-0823-reef-distro-default-smithi/7668452
- 03:44 PM Orchestrator Bug #64208: test_cephadm.sh: Container version mismatch causes job to fail.
- /a/yuriw-2024-04-22_18:19:58-rados-wip-yuri2-testing-2024-04-17-0823-reef-distro-default-smithi/7668470
- 03:43 PM Infrastructure Bug #65639: smithi139 unable to be reached over ssh
- Affects centos 9 stream, if that turns out to be relevant.
- 03:42 PM Infrastructure Bug #65639 (In Progress): smithi139 unable to be reached over ssh
- /a/yuriw-2024-04-22_18:19:58-rados-wip-yuri2-testing-2024-04-17-0823-reef-distro-default-smithi/7668463...
- 03:13 PM Ceph QA QA Run #65638 (QA Closed): wip-yuriw4-testing-20240423.151325-quincy
- * "PR #57029":https://github.com/ceph/ceph/pull/57029 -- quincy: qa: fix krbd_msgr_segments and krbd_rxbounce failing...
- 01:49 PM CephFS Feature #65637 (New): mds: continue sending heartbeats during recovery when MDS journal is large
- When the MDS reaches up:rejoin / up:resolve after spending a long time (hours) in up:replay, it often gets in a loop...
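A hedged operator-side mitigation sketch for the window described in #65637 above, assuming the mons are replacing the laggy MDS because its beacons time out; mds_beacon_grace is the existing knob and 600 is only an illustrative value:
$ ceph config set mon mds_beacon_grace 600
$ ceph config set mds mds_beacon_grace 600
# revert once the MDS has left up:replay/up:rejoin
$ ceph config rm mon mds_beacon_grace
$ ceph config rm mds mds_beacon_grace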
- 01:40 PM rgw Backport #65636 (In Progress): squid: release note for rgw_realm init
- 01:39 PM rgw Backport #65636 (Resolved): squid: release note for rgw_realm init
- https://github.com/ceph/ceph/pull/57055
- 01:39 PM rgw Bug #65575 (Pending Backport): release note for rgw_realm init
- 01:34 PM Dashboard Backport #65255 (In Progress): squid: mgr/dashboard: use alertmanager v2 APIs mgr/dashboard: short_description
- 01:29 PM Infrastructure Bug #63831 (Closed): "make check" fails on docs-related PRs sometimes
- 01:19 PM Ceph Feature #63703: If a prefix is available, allow it to be used to narrow the bounds of OMAP iterator
- Xiang Li wrote in #note-1:
> Is anyone trying out this new feature? Can I give it a try?
I don't think anyone has... - 02:12 AM Ceph Feature #63703: If a prefix is available, allow it to be used to narrow the bounds of OMAP iterator
- Is anyone trying out this new feature? Can I give it a try?
- 01:05 PM crimson Bug #65635 (New): Crimson seastore unit test random failure on AARCH64 (DEADLYSIGNAL caused by a READ memory access)
- [ RUN ] omap_manager_test/omap_manager_test_t.force_leafnode_split_merge_fullandbalanced/0
INFO 2024-04-23 08:... - 11:58 AM Ceph Bug #65634 (New): rbd-mirror user does not have enough permissions to obtain (daemon) health status information
- We are testing rbd-mirroring. There seems to be a permission error with the rbd-mirror user. Using this user to query...
- 09:45 AM sepia Support #65633 (In Progress): Sepia Lab Access Request
- Hi Team,
I am from the Ceph QE team and am requesting Sepia lab access for the first time. Please do the needful.
De... - 09:13 AM crimson Bug #65585: unittest-seastore (Timeout)
- https://jenkins.ceph.com/job/ceph-pull-requests-arm64/55512/console...
- 07:34 AM Messengers Bug #65401: msg: connection between mgr and osd is periodically down, which leads to heavy load on mgr
- the periodic connection fault can be found in the log with the following steps:
1. set ms_connection_idle_timeout=60; debu... - 06:41 AM Ceph Documentation #65631 (Fix Under Review): clarify dual-stack mode
- 04:42 AM Ceph Documentation #65631 (Resolved): clarify dual-stack mode
- Robert Sander asks whether Ceph supports dual-stack mode. Dual-stack mode is when both IPv4 and IPv6 networks are use...
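For reference, a sketch of the two messenger options such documentation would presumably discuss; whether running with both enabled is fully supported is exactly what #65631 sets out to clarify:
$ ceph config set global ms_bind_ipv4 true
$ ceph config set global ms_bind_ipv6 true
# public_network would then list both an IPv4 and an IPv6 subnet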
- 06:21 AM crimson Bug #65632 (New): crimson osd crashes due to dangling pointers of operation blockers
- There are time gaps between the destruction of OSDMapBlockers and OSDMapBlockers unreferencing from BlockingEvents. I...
- 05:45 AM Ceph Bug #65629 (Fix Under Review): cephfs_mirror: display 'sync_bytes' in peer status
- 03:56 AM Ceph Bug #65629 (In Progress): cephfs_mirror: display 'sync_bytes' in peer status
- 03:55 AM Ceph Bug #65629 (Fix Under Review): cephfs_mirror: display 'sync_bytes' in peer status
- Display 'sync_bytes' for the 'last_synced_snap' in the 'peer status' command output. This is analogous to the perf ...
- 04:56 AM RADOS Feature #65583: mon store data should be available depending on the user keyring
> My understanding is the idea is to restrict the visibility of configurables' values.
Yes, that's right, but can you... - 04:32 AM Ceph Documentation #65609 (Resolved): Documentation of maximum port number is incorrect
- 04:10 AM CephFS Bug #65630 (Fix Under Review): mds: rename request was deadlocked between two different MDSs
- This was reported by Nigel; for more detail please see https://www.mail-archive.com/ceph-users@ceph.io/msg24587.html
In... - 03:04 AM crimson Bug #65628 (Resolved): unittest-seastore (Timeout)
- There is a certain probability of a timeout happening on both ARM and X86 CI.
e.g.
1. https://jenkins.ceph.com/job/cep...
04/22/2024
- 11:00 PM RADOS Bug #65235: upgrade/reef-x/stress-split: "OSDMAP_FLAGS: noscrub flag(s) set" warning in cluster log
- Unfortunately, noscrub and nodeep-scrub are not the only warnings we would need to mask for the thrashosds-health tes...
- 10:51 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
- More in this run: https://pulpito.ceph.com/lflores-2024-04-01_18:07:25-rados-wip-yuri8-testing-2024-03-25-1419-distro...
- 10:46 PM Orchestrator Bug #64374: Error ENOENT: module 'cephadm' reports that it cannot run on the active manager daemon: No module named 'mgr_module' (pass --force to force enablement)
- /a/lflores-2024-04-01_18:07:25-rados-wip-yuri8-testing-2024-03-25-1419-distro-default-smithi/7634080
- 09:33 PM mgr Bug #65627 (New): Centos 9 stream ceph container iscsi test failure
- h3. Missing k8sevents module
While waiting for the mgr to start, we get this traceback message:
teuthology.log
<... - 09:19 PM rgw Bug #65626: rgw: false assumption on vault bucket key deletion
- PR: https://github.com/ceph/ceph/pull/57046
- 09:16 PM rgw Bug #65626 (Fix Under Review): rgw: false assumption on vault bucket key deletion
- On bucket key deletion, when the request to change the key's deletion_allowed property to true is made, it is expecte...
- 09:17 PM Ceph QA QA Run #65558 (QA Closed): wip-yuri4-testing-2024-04-19-0708-quincy (old wip-yuriw-testing-20240417.204632 (wip-yuri4-testing))
- 09:05 PM Ceph QA QA Run #65558 (QA Approved): wip-yuri4-testing-2024-04-19-0708-quincy (old wip-yuriw-testing-20240417.204632 (wip-yuri4-testing))
- 08:31 PM Ceph QA QA Run #65558: wip-yuri4-testing-2024-04-19-0708-quincy (old wip-yuriw-testing-20240417.204632 (wip-yuri4-testing))
- approved overall, except for two prs:
> https://github.com/ceph/ceph/pull/54172 - quincy: prevent anonymous topic ... - 09:14 PM rgw Backport #65409: quincy: Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56818
merged - 09:13 PM rgw Backport #65341: quincy: rgw: update options yaml file so LDAP uri isn't an invalid example
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56722
merged - 09:08 PM rgw Backport #63961: quincy: rgw: lack of headers in 304 response
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/55095
merged - 09:07 PM rgw Backport #63253: quincy: Add bucket versioning info to radosgw-admin bucket stats output
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/54190
merged - 08:37 PM rbd Bug #65487 (In Progress): rbd-mirror daemon in ERROR state, require manual restart
- Hi Nir,
Thanks for providing verbose logs. For now, I have all the information I need.
Due to rbd-mirror daemo... - 08:34 PM rgw Backport #65625 (In Progress): quincy: rgw/crypt/barbican: 'Namespace' object has no attribute 'admin_endpoints'
- 08:33 PM rgw Backport #65625 (In Progress): quincy: rgw/crypt/barbican: 'Namespace' object has no attribute 'admin_endpoints'
- https://github.com/ceph/ceph/pull/57045
- 08:26 PM rgw Bug #61772 (Pending Backport): rgw/crypt/barbican: 'Namespace' object has no attribute 'admin_endpoints'
- 08:12 PM rbd Feature #65624 (Pending Backport): [pybind] expose CLONE_FORMAT and FLATTEN image options
- C/C++ API:...
- 07:53 PM rgw Backport #64766 (Resolved): reef: SSL session id reuse speedup mechanism of the SSL_CTX_set_session_id_context is not working
- 07:50 PM rgw Bug #62063 (New): notification tests fail on 'radosgw-admin -n client.0 user rm --uid foo.client.0 --purge-data --cluster ceph'
- happening on quincy: https://qa-proxy.ceph.com/teuthology/yuriw-2024-04-20_15:31:09-rgw-wip-yuri4-testing-2024-04-19-...
- 07:30 PM RADOS Bug #65517: rados/thrash-erasure-code-crush-4-nodes: ceph task fails at getting monitors
- Bump up.
- 07:18 PM RADOS Bug #56393: failed to complete snap trimming before timeout
- /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648606 was on fbfd55d0098...
- 07:12 PM mgr Backport #65623 (In Progress): reef: mgr: update cluster state for new maps from the mons before notifying modules
- https://github.com/ceph/ceph/pull/57065
- 07:12 PM mgr Backport #65622 (In Progress): squid: mgr: update cluster state for new maps from the mons before notifying modules
- https://github.com/ceph/ceph/pull/57064
- 07:12 PM mgr Backport #65621 (In Progress): quincy: mgr: update cluster state for new maps from the mons before notifying modules
- https://github.com/ceph/ceph/pull/57066
- 07:11 PM CephFS Backport #65620 (In Progress): squid: qa: test_max_items_per_obj open procs not fully cleaned up
- https://github.com/ceph/ceph/pull/57063
- 07:11 PM CephFS Backport #65619 (In Progress): squid: mds: quiesce_counter decay rate initialized from wrong config
- https://github.com/ceph/ceph/pull/57062
- 07:07 PM mgr Bug #64799 (Pending Backport): mgr: update cluster state for new maps from the mons before notifying modules
- I'll sit on the backports for a while.
- 07:06 PM CephFS Bug #65022 (Pending Backport): qa: test_max_items_per_obj open procs not fully cleaned up
- 07:04 PM CephFS Bug #65342 (Pending Backport): mds: quiesce_counter decay rate initialized from wrong config
- 06:52 PM Ceph QA QA Run #65596 (QA Approved): wip-pdonnell-testing-20240420.180737-debug
- https://tracker.ceph.com/projects/cephfs/wiki/Main#2024-04-20
- 06:47 PM CephFS Bug #50821: qa: untar_snap_rm failure during mds thrashing
- ...
- 06:42 PM CephFS Bug #65618 (In Progress): qa: fsstress: cannot execute binary file: Exec format error
- ...
- 06:40 PM RADOS Bug #53768 (Closed): timed out waiting for admin_socket to appear after osd.2 restart in thrasher/defaults workload/small-objects
- 06:39 PM CephFS Fix #65617 (Fix Under Review): qa: increase debugging for snap_schedule
- 06:36 PM CephFS Fix #65617 (Pending Backport): qa: increase debugging for snap_schedule
- 06:39 PM rgw Bug #65567: admin_socket_output: signal: Terminated from term radosgw
- note from tracker scrub: looks like a duplicate of https://tracker.ceph.com/issues/59380.
- 06:31 PM rgw Bug #65567 (Duplicate): admin_socket_output: signal: Terminated from term radosgw
- 06:33 PM CephFS Bug #65616 (Triaged): pybind/mgr/snap_schedule: 1m scheduled snaps not reliably executed (RuntimeError: The following counters failed to be set on mds daemons: {'mds_server.req_rmsnap_latency.avgcount'})
- Check timestamps:...
- 06:28 PM RADOS Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
- Update: Still working to understand why my local reproducer worked with the latest fix but not in teuthology.
- 06:27 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
- https://shaman.ceph.com/builds/ceph/wip-yuri3-testing-2024-04-05-0825/5d349943c59c9485df060d6adb0594f3940ec0eb/
- 06:15 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
- Laura Flores wrote in #note-6:
> @yuriw can you rebase/rerun? One of the PRs got more changes.
removed https://gi... - 05:47 PM Ceph QA QA Run #65349 (QA Needs Rerun/Rebuilt): wip-yuri3-testing-2024-04-05-0825
- @yuriw can you rebase/rerun? One of the PRs got more changes.
- 06:23 PM RADOS Bug #62839 (Closed): Teuthology failure in LibRadosTwoPoolsPP.HitSetWrite
- Cache tiering is deprecated, sorry.
- 06:20 PM Ceph QA QA Run #65552 (QA Needs Approval): wip-yuri2-testing-2024-04-17-0823-reef
- Laura Flores wrote in #note-3:
> @yuriw can you try rerunning this?
rerunning failed - 05:32 PM Ceph QA QA Run #65552 (QA Needs Rerun/Rebuilt): wip-yuri2-testing-2024-04-17-0823-reef
- @yuriw can you try rerunning this?
- 06:19 PM RADOS Bug #65186: OSDs unreachable in upgrade test
- In QA. Pinged.
- 06:17 PM RADOS Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- Still in QA.
- 06:16 PM RADOS Bug #44510: osd/osd-recovery-space.sh TEST_recovery_test_simple failure
- Hi Nitzan, would you mind taking a look?
- 06:14 PM CephFS Tasks #65615 (Resolved): lchown corrupts symlink entry
- lchown corrupts symlink entry:...
- 06:12 PM RADOS Bug #65449: NeoRadosWatchNotify.WatchNotifyTimeout failed due to nonexistent pool
- In review.
- 06:09 PM RADOS Feature #64519: OSD/MON: No snapshot metadata keys trimming
- note from scrub: bump up.
- 06:07 PM RADOS Feature #65583: mon store data should be available depending on the user keyring
- This sounds like a feature request, not a bug.
My understanding is the idea is restrict the visibility of configurab... - 05:56 PM CephFS Bug #65614 (Fix Under Review): client: resends request to same MDS it just received a forward from if it does not have an open session with the target
- 05:46 PM CephFS Bug #65614 (Pending Backport): client: resends request to same MDS it just received a forward from if it does not have an open session with the target
- ...
- 05:53 PM RADOS Documentation #16258: ceph audit logs are not logging to ceph.audit.log if we specify "mon cluster log file" option
- If something stays in tracker, without huge attention, for 8+ years, it's probably not a high prio...
- 08:04 AM RADOS Documentation #16258: ceph audit logs are not logging to ceph.audit.log if we specify "mon cluster log file" option
- No idea if this is still applicable. Unassigning from me because it hasn't been touched for almost a decade, and I'll...
- 05:47 PM RADOS Bug #53240: full-object read crc is mismatch, because truncate modify oi.size and forget to clear data_digest
- New changes in the PR (a unit test fix). Need to reQA.
- 05:39 PM RADOS Bug #65371: rados: PeeringState::calc_replicated_acting_stretch populate acting set before checking if < bucket_max
- In review.
- 05:38 PM RADOS Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- Yuri provided an update. Still in QA.
- 05:32 PM rgw Backport #64496 (Resolved): squid: keystone admin token is not invalidated on http 401 response
- 05:32 PM rgw Backport #65353 (Resolved): squid: rgwlc: Executing radosgw-admin lc process --bucket <bkt-name> without setting lc rule results in Segmentation fault
- 05:32 PM rgw Backport #64552 (Resolved): squid: rgw/multisite: objects named "." or ".." are not replicated
- 04:55 PM CephFS Bug #65603: mds: quiesce timeout due to a freezing directory
- Another one: https://pulpito.ceph.com/leonidus-2024-04-22_12:36:42-fs-wip-lusov-quiescer-distro-default-smithi/766829...
- 04:41 PM CephFS Bug #65603: mds: quiesce timeout due to a freezing directory
- Another case: https://pulpito.ceph.com/leonidus-2024-04-22_12:36:42-fs-wip-lusov-quiescer-distro-default-smithi/76682...
- 02:44 PM CephFS Bug #65603: mds: quiesce timeout due to a freezing directory
- ...
- 02:43 PM CephFS Bug #65603: mds: quiesce timeout due to a freezing directory
- Another instance of the same at https://pulpito.ceph.com/leonidus-2024-04-22_12:36:42-fs-wip-lusov-quiescer-distro-de...
- 04:54 PM CephFS Tasks #64133: Make pjd work on fscrypt
- Make pjd tests pass that are failing:...
- 04:50 PM CephFS Tasks #65613 (Resolved): truncate failing when using path
- The fix:...
- 04:44 PM CephFS Tasks #65613 (Resolved): truncate failing when using path
- Reproducer:...
- 04:19 PM Ceph Bug #65612 (New): qa: logrotate fails when state file is already locked
- ...
- 04:16 PM rgw Bug #65160: rgw/lc: A few buckets stuck in UNINITIAL state
- Can this be backported to Squid?
- 03:23 PM RADOS Bug #49158 (Resolved): doc: ceph-monstore-tools might create wrong monitor store
- 03:21 PM Ceph Documentation #57125 (Resolved): Improve wording of /doc/rados/*
- 03:21 PM Ceph Documentation #57108 (Resolved): add ".. prompt :: bash $" to /doc/rados
- 03:15 PM Ceph Bug #64446 (Resolved): Backport PR#55540 to Squid (and only Squid) when its commits are merged to main
- 03:14 PM Ceph Documentation #65161 (Resolved): Update Zabbix Documentation
- 03:12 PM Ceph Documentation #65599 (Resolved): "ceph osd crush rename bucket" command missing
- 03:06 PM Ceph Bug #65249 (Resolved): peering_graph.generated.dot renders weird
- I used these instructions to build an SVG file of the peering graph:
$ git clone https://github.com/ceph/ceph.git
... - 11:14 AM rbd Backport #65550 (In Progress): squid: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
- 11:14 AM rbd Backport #65549 (In Progress): reef: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
- 11:12 AM rbd Backport #65547 (In Progress): quincy: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
- 10:45 AM Ceph Bug #65611 (New): Segmentation fault in upkeep_main
- ...
- 10:27 AM CephFS Bug #65606: workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
- Another instance of this issue: https://pulpito.ceph.com/leonidus-2024-04-21_11:37:13-fs-wip-lusov-quiescer-distro-de...
- 09:42 AM CephFS Bug #65606: workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
- fixing the @request_drop_foreign_locks@ method uncovered another crash due to the same reason, this time when droppin...
- 10:08 AM crimson Bug #65610 (Resolved): unittest-object-data-handler crashes testing object_data_handler_test_t.overwrite_then_read_within_transaction
- ...
- 09:28 AM crimson Bug #65491: recover_missing: racing read got wrong version
- > *Hypothesis 2:*
> See: 'Version bump'. Version was bumped to 12 and then both requests were requeued (requeueing c... - 09:23 AM Ceph Documentation #65609 (Resolved): Documentation of maximum port number is incorrect
- The highest port number used by OSD or MDS daemons was increased from 7300 to 7568 in https://github.com/ceph/ceph/pu...
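An illustrative firewall rule reflecting the corrected upper bound from #65609, assuming firewalld and the default lower bound of 6800 (ms_bind_port_min):
$ firewall-cmd --zone=public --add-port=6800-7568/tcp --permanent
$ firewall-cmd --reload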
- 08:43 AM crimson Bug #64206: obc->is_loaded_and_valid() assertion
- https://pulpito.ceph.com/matan-2024-04-21_15:36:23-crimson-rados-wip-matanb-crimson-testing-snap-overlap-distro-crims...
- 08:06 AM RADOS Cleanup #10506 (Rejected): mon: get rid of QuorumServices
- I hope this might have been addressed at some point. If not, it probably no longer makes sense to mess with the monit...
- 08:03 AM RADOS Bug #42519: During deployment of Ceph, when the main node starts slower than the other nodes, it may lead to generating a core by assert.
- No idea if this is still applicable. Unassigning from me because it hasn't been touched for 4 years now, and I'll lik...
- 06:20 AM CephFS Bug #50260: pacific: qa: "rmdir: failed to remove '/home/ubuntu/cephtest': Directory not empty"
- /a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648870
The *mnt.0* was... - 06:13 AM CephFS Bug #64707: suites/fsstress.sh hangs on one client - test times out
- Laura Flores wrote in #note-16:
> /a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-defa... - 06:05 AM Ceph Bug #65608 (New): Mirroring mode of rbd image changes when migrated between pools
- When an rbd image is mirrored and migrated between pools (rbd migration) the mirroring mode changes from "snapshot" (...
- 05:37 AM rgw Bug #64999: Slow RGW multisite sync due to "304 Not Modified" responses on primary zone
Hi Shilpa,
We are eagerly waiting for your direction to resolve it.
I appreciate your attention to this matter....- 05:12 AM CephFS Bug #65607: mds deadlock between 'lookup' and the 'rename/create, etc' requests
- This is possibly caused by the lock order issue as in https://tracker.ceph.com/issues/62123.
- 05:09 AM CephFS Bug #65607 (Need More Info): mds deadlock between 'lookup' and the 'rename/create, etc' requests
- Have suggested that Erich set *max_mds = 1* (see the sketch below) to reproduce it and get rid of the noise.
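A minimal sketch of that suggestion, with 'cephfs' as a placeholder filesystem name:
$ ceph fs set cephfs max_mds 1
$ ceph fs status cephfs    # confirm a single active MDS before retrying the reproducer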
- 04:51 AM CephFS Bug #65607: mds deadlock between 'lookup' and the 'rename/create, etc' requests
- As Erich mentioned, he enabled multiple active MDSs, but he only updated the block ops from one MDS. I guess maybe anot...
- 04:33 AM CephFS Bug #65607 (Need More Info): mds deadlock between 'lookup' and the 'rename/create, etc' requests
- This was reported by Eric; for more detail please see https://www.mail-archive.com/ceph-users@ceph.io/msg24587.html
The... - 04:12 AM Dashboard Bug #65571 (Resolved): mgr/dashboard: run-tox-mgr-dashboard-py3 failure in make check
- 04:12 AM Dashboard Backport #65581 (Resolved): squid: mgr/dashboard: run-tox-mgr-dashboard-py3 failure in make check
04/21/2024
- 07:27 PM CephFS Bug #65606: workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
- The incorrect behavior of the method that stripped the local quiesce lock from the request resulted in the crash when...
- 07:17 PM CephFS Bug #65606 (Fix Under Review): workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
- 06:57 PM CephFS Bug #65606: workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
- We had a successful quiesce on the mds.0 followed by the said export dir request. The export dir request has failed t...
- 06:30 PM CephFS Bug #65606 (Pending Backport): workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
https://pulpito.ceph.com/leonidus-2024-04-21_11:37:13-fs-wip-lusov-quiescer-distro-default-smithi/7666598/
The f...- 06:08 PM CephFS Bug #65605 (Duplicate): fsx.sh workload fails with status 2 due to a makefile error
- Duplicate of https://tracker.ceph.com/issues/64572
- 06:06 PM CephFS Bug #65605: fsx.sh workload fails with status 2 due to a makefile error
- another instance of the same failure https://pulpito.ceph.com/leonidus-2024-04-21_11:37:13-fs-wip-lusov-quiescer-dist...
- 06:05 PM CephFS Bug #65605 (Duplicate): fsx.sh workload fails with status 2 due to a makefile error
- https://pulpito.ceph.com/leonidus-2024-04-21_11:37:13-fs-wip-lusov-quiescer-distro-default-smithi/7666610/...
- 05:46 PM CephFS Bug #65604 (Triaged): dbench.sh workload times out after 3h when run with-quiescer
- https://pulpito.ceph.com/leonidus-2024-04-21_11:37:13-fs-wip-lusov-quiescer-distro-default-smithi/7666604/
No quie... - 05:14 PM CephFS Bug #65603: mds: quiesce timeout due to a freezing directory
- https://pulpito.ceph.com/leonidus-2024-04-21_11:37:13-fs-wip-lusov-quiescer-distro-default-smithi/7666602/...
- 04:24 PM CephFS Bug #65603: mds: quiesce timeout due to a freezing directory
- The directory appears to be fragmenting, as we see from a few messages in the log...
- 04:09 PM CephFS Bug #65603 (Pending Backport): mds: quiesce timeout due to a freezing directory
- Analyzing one of the ETIMEDOUT errors for a quiesce, looking at
https://pulpito.ceph.com/leonidus-2024-04-21_11:37:13... - 02:02 PM Ceph QA QA Run #65592: wip-yuriw-testing-20240419.185239-main
- https://pulpito.ceph.com/?branch=wip-yuriw-testing-20240419.185239-main
- 02:01 PM Ceph QA QA Run #65594: wip-yuriw11-testing-20240501.200505-squid
- https://pulpito.ceph.com/?branch=wip-yuriw-testing-20240419.202307-squid
- 01:21 PM crimson Bug #65532 (Fix Under Review): osd crashes due to invalid clone_range ops
- 01:17 PM crimson Bug #64782 (Resolved): test_python.sh TestIoctx.test_locator failes in cases of SeaStore
- 01:12 PM crimson Bug #65531 (In Progress): crimson-osd: dump_historic_slow_ops command not correctly run
- 01:12 PM sepia Support #65535 (In Progress): Sepia Lab Access Request
- Hey Kalpesh Pandya,
You should have access to the Sepia lab now. Please verify you're able to connect to the vpn a... - 01:00 PM crimson Support #65602 (New): Support RBD mirror testing
- See: qa/suites/rbd/mirror-thrash and qa/suites/rbd/mirror
- 12:57 PM crimson Bug #65601 (New): rados_python.yaml enable tests
- Currently some of the rados_python tests are disabled:...
- 12:27 PM bluestore Fix #65600 (Fix Under Review): bluefs alloc unit should only be shrunk
- Changing the alloc unit is already forbidden for bluestore; what's more, increasing it should be forbidden in bluefs. Oth...
- 09:43 AM crimson Bug #65474 (Resolved): mgr crash due to corrupted incremental osdmap sent by crimson-osds
- 09:43 AM crimson Bug #65200 (Resolved): PeeringState::get_peer_info(pg_shard_t) const: Assertion `it != peer_info.end()' failed.
- 09:42 AM crimson Bug #59242 (Resolved): [crimson] Pool compression does not take effect
- 09:25 AM CephFS Backport #65556 (Fix Under Review): squid: mds: avoid recalling Fb when quiescing file
- 09:21 AM CephFS Backport #65556 (In Progress): squid: mds: avoid recalling Fb when quiescing file
- 09:08 AM crimson Bug #63647: SnapTrimEvent AddressSanitizer: heap-use-after-free
- https://pulpito.ceph.com/matan-2024-04-21_07:41:30-crimson-rados-wip-matanb-crimson-only-testing-april-17-distro-crim...
- 08:12 AM rgw Feature #53662: rgw: radosgw-admin can list and remove bucket notification topics; it must also be able to create them
- agree we should close.
* topic creation by an admin will mess up the topic ownership logic
* we can create notifica... - 07:58 AM Ceph Documentation #65599: "ceph osd crush rename bucket" command missing
- Eugen Block, as usual, to the rescue:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/IQUPWQZ5ZIQ... - 07:52 AM Ceph Documentation #65599 (Resolved): "ceph osd crush rename bucket" command missing
- https://docs.ceph.com/en/latest/rados/operations/crush-map/
The "ceph osd crush rename bucket" command is not list... - 12:13 AM Ceph Bug #65598 (New): github v18.2.2 tag removed
- I have some automation that looks for git tags on github that broke recently because the v18.2.2 tag was removed from...
04/20/2024
- 06:07 PM Ceph QA QA Run #65596 (QA Approved): wip-pdonnell-testing-20240420.180737-debug
- * "PR #57010":https://github.com/ceph/ceph/pull/57010 -- mds: add missing policylock to test F_QUIESCE_BLOCK
* "PR #... - 06:07 PM Ceph QA QA Run #65562 (QA Closed): wip-pdonnell-testing-20240418.004638-debug
- 03:41 PM Ceph QA QA Run #65594 (QA Needs Approval): wip-yuriw11-testing-20240501.200505-squid
- 03:39 PM Ceph QA QA Run #65594: wip-yuriw11-testing-20240501.200505-squid
- Note to self: this batch is by mistake labeled "yuri" instead of "yuri11"
- 03:33 PM Ceph QA QA Run #65592 (QA Needs Approval): wip-yuriw-testing-20240419.185239-main
- 03:32 PM Ceph QA QA Run #65558 (QA Needs Approval): wip-yuri4-testing-2024-04-19-0708-quincy (old wip-yuriw-testing-20240417.204632 (wip-yuri4-testing))
- 03:23 PM Ceph QA QA Run #65126: wip-yuri8-testing-2024-03-25-1419
- @pdvian if you approved this batch pls change the status and assign to me for merge
- 03:21 PM Ceph QA QA Run #65574 (QA Needs Approval): wip-yuri7-testing-2024-04-18-1351-reef
- 01:14 AM Ceph QA QA Run #65574 (QA Approved): wip-yuri7-testing-2024-04-18-1351-reef
- 03:07 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- I reproduced the issue again with debug logs.
Tested flow:
- Configure rbd mirroring on both clusters
- Wait for...
04/19/2024
- 11:31 PM Ceph QA QA Run #65126: wip-yuri8-testing-2024-03-25-1419
- Failure, unrelated:
1. cephadm: Health detail: HEALTH_WARN 1/3 mons down, quorum a,c in cluster log
2. cephadm: Heal... - 11:30 PM CephFS Bug #65595 (Fix Under Review): mds: missing policylock acquisition for quiesce
- 11:28 PM CephFS Bug #65595 (Pending Backport): mds: missing policylock acquisition for quiesce
- In order to check an inode's F_QUIESCE_BLOCK, the quiesce_inode op must acquire the policylock. Furthermore, to ensur...
- 11:20 PM RADOS Bug #51729: Upmap verification fails for multi-level crush rule
- Update on this bug:
We are pretty close to getting the fix out for this. Thanks all for waiting so long. In additi... - 08:23 PM Ceph QA QA Run #65594 (QA Approved): wip-yuriw11-testing-20240501.200505-squid
- * "PR #57006":https://github.com/ceph/ceph/pull/57006 -- squid: osd/PGBackend::be_scan_list: only call stat, getattrs...
- 07:21 PM RADOS Backport #65593: squid: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
- https://github.com/ceph/ceph/pull/57006
- 06:55 PM RADOS Backport #65593 (New): squid: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
- /a/teuthology-2024-03-22_02:08:13-upgrade-squid-distro-default-smithi/7616025/remote/smithi098/log/b1f19696-e81a-11ee...
- 06:52 PM Ceph QA QA Run #65592 (QA Closed): wip-yuriw-testing-20240419.185239-main
- * "PR #56995":https://github.com/ceph/ceph/pull/56995 -- osd: only call stat/getattrs once per object during deep-scrub
- 06:51 PM RADOS Bug #65185 (Fix Under Review): OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
- 05:57 PM rgw Feature #20094 (Resolved): RFW: make civetweb max request size configurable to allow larger s3 object metadata
- 05:56 PM rgw Feature #19917 (Closed): radosgw access log is lacking useful information
- 05:56 PM rgw Feature #19510 (Resolved): per-object storage class
- 05:55 PM rgw Feature #20398 (Resolved): rgw: Swift TempURL does not support prefix-based scope
- 05:54 PM rgw Feature #20733 (Closed): RGW bucket limits
- 05:53 PM rgw Feature #20650 (Resolved): Support webhook for authentication
- 05:52 PM rgw Feature #20795 (Resolved): rgw: the TempURL implementation should support ISO8601 in temp_url_expires
- 05:52 PM rgw Feature #20883 (Resolved): rgw: responses for HEAD/GET on Swift's container should contain Last-Modified
- 05:51 PM rgw Feature #21334 (Resolved): support log response header “x-amz-request-id ”
- 05:50 PM rgw Feature #21799 (Rejected): multisite: sync parts of multipart uploads
- 05:49 PM rgw Feature #22565 (Resolved): Multiple Data Pool Support for a Bucket
- this is supported through storage classes: https://docs.ceph.com/en/latest/radosgw/placement/
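A sketch of how a second data pool is attached today via a storage class, per the placement docs linked above; the zonegroup, zone, storage-class, and pool names are placeholders:
$ radosgw-admin zonegroup placement add --rgw-zonegroup default --placement-id default-placement --storage-class COLD
$ radosgw-admin zone placement add --rgw-zone default --placement-id default-placement --storage-class COLD --data-pool default.rgw.cold.data
$ radosgw-admin period update --commit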
- 05:48 PM rgw Feature #24335 (Resolved): Get the user metadata of the user used to sign the request
- 05:46 PM rgw Feature #24493 (Resolved): rgw does not implement list_object_v2 in S3
- 05:43 PM rgw Feature #24507 (Resolved): [rfe] rgw: relaxed region constraint enforcement
- 05:41 PM rgw Feature #39084 (Resolved): ability to control user op mask via admin apis
- 05:40 PM rgw Feature #40241 (Rejected): radosgw: ldap groups
- 05:39 PM rgw Feature #40242 (Rejected): radosgw-admin: export & import buckets
- 05:37 PM rgw Feature #40392 (Rejected): radosgw-admin: create bucket
- 05:35 PM rgw Feature #40714 (Closed): usage log differ from civetweb and beast
- 05:35 PM rgw Feature #41062 (Resolved): Extend SSE-KMS in Rados Gateway to support HashiCorp Vault
- 05:35 PM rgw Feature #41222 (Rejected): multisite: delay sync data to non-master zone
- 05:34 PM rgw Feature #42513 (Resolved): rgw: radosgw-admin command line parsing cleanup and improvements
- 05:34 PM rgw Feature #42627 (Resolved): rgw: bucket granularity sync: bucket dependency index
- 05:33 PM rgw Feature #42626 (Resolved): rgw: bucket granularity sync: core sync changes
- 05:33 PM rgw Feature #42625 (Resolved): rgw: bucket granularity sync: sync policy
- 05:33 PM rgw Feature #42272 (Resolved): rgw set cpu affinity at startup
- 05:33 PM rgw Feature #42493 (Rejected): Simplify Login Radosgw-admin API
- ceph provides a shell script in https://github.com/ceph/ceph/blob/main/examples/rgw/rgw_admin_curl.sh that adds sigv2...
- 05:31 PM rgw Feature #45444 (Resolved): Add bucket name to bucket stats error logging
- 05:30 PM rgw Feature #45568 (Resolved): Swift Extract Archive Operation
- 05:30 PM rgw Feature #45748 (Closed): recommended max number of buckets....
- we don't intend there to be any scaling limit to the number of total buckets in the system. there are limitations on ...
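The cap that does exist is per user rather than system-wide; a sketch of inspecting and raising it, with 'testuser' as a placeholder uid:
$ radosgw-admin user info --uid=testuser | grep max_buckets
$ radosgw-admin user modify --uid=testuser --max-buckets=10000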
- 05:27 PM rgw Feature #46028 (Resolved): RGW User Policy
- 05:25 PM rgw Feature #48402 (Resolved): multisite option to enable keepalive
- 05:25 PM rgw Feature #48513 (Rejected): uses librgw2 to directly access the rados cluster for hadoop
- 05:24 PM rgw Feature #48798 (Resolved): RGW:Multisite: Verify if the synced object is identical to source
- 05:24 PM rgw Feature #49227 (Resolved): rgw: register daemon in service map with more details
- 05:22 PM rgw Feature #50262 (Duplicate): rgw header size limit should configurable
- 05:20 PM rgw Feature #53546 (Resolved): rgw/beast: add max_header_size option with 16k default, up from 4k
- 05:09 PM rgw Feature #55016 (Resolved): radosgw-admin should allow setting user policy
- 05:07 PM rgw Bug #23264 (In Progress): Server side encryption support for s3 COPY operation
- 05:07 PM rgw Feature #55481 (Resolved): The latest version of server encryption does not support "aes256" as kms encryption method
- 05:04 PM rgw Feature #55640 (Rejected): make lua scripting optional
- the attached pull request was closed a year ago
i personally don't see much benefit to disabling lua at compile time. ... - 04:56 PM rgw Feature #53662 (Need More Info): rgw: radosgw-admin can list and remove bucket notification topics; it must also be able to create them
- trying to scrub some old feature requests. is there still interest in this?
in general, i don't think radosgw-admi... - 04:36 PM rgw Feature #59593 (Closed): The capability of resetting an empty bucket to the clean-slate state in multi-site environment
- 03:46 PM rgw Feature #63930 (Duplicate): s3: implement GetObjectAttributes
- 03:06 PM rgw Feature #64190 (Resolved): support lifecycle NewerNoncurrentVersions in NoncurrentVersionExpiration
- already backported to squid with https://github.com/ceph/ceph/pull/56144
- 02:50 PM RADOS Bug #65591 (New): Pool MAX_AVAIL goes UP when an OSD is marked down+in
- Example:
* Cluster with 4 OSD nodes, 10 OSDs each
* 3x replicated pool
* `max_avail` from `ceph df detail --format... - 02:28 PM Ceph QA QA Run #65270: wip-yuri6-testing-2024-04-02-1310
- @lflores rerun => https://pulpito.ceph.com/yuriw-2024-04-19_14:26:57-rados-wip-yuri6-testing-2024-04-02-1310-distro-d...
- 02:25 PM Ceph QA QA Run #65270 (QA Needs Approval): wip-yuri6-testing-2024-04-02-1310
- 02:00 PM Ceph QA QA Run #65574: wip-yuri7-testing-2024-04-18-1351-reef
- repushed
- 01:32 PM rgw Bug #65590 (Pending Backport): rgw_multi.tests.test_topic_notification_sync: PutBucketNotificationConfiguration fails with ConcurrentModification
- ...
- 12:38 PM sepia Support #64967: Sepia Lab Access Request
- Hi Adam,
Please update if these new creds have been granted access. Thanks! - 12:09 PM Ceph Support #65589 (New): is there any method to restore deleted rbd images
- Hi there,
We're running a very old ceph rbd cluster. Today a team deleted a bunch of (about 1.5k images and 12TiB ... - 10:39 AM rbd Backport #65588 (In Progress): quincy: insufficient randomness for group and group snapshot IDs
- https://github.com/ceph/ceph/pull/57090
- 10:38 AM rbd Backport #65587 (Resolved): squid: insufficient randomness for group and group snapshot IDs
- https://github.com/ceph/ceph/pull/57092
- 10:38 AM rbd Backport #65586 (In Progress): reef: insufficient randomness for group and group snapshot IDs
- https://github.com/ceph/ceph/pull/57091
- 10:34 AM rbd Bug #65573 (Pending Backport): insufficient randomness for group and group snapshot IDs
- 09:39 AM Ceph Bug #65176: BlueFS: _estimate_log_size_N calculates the log size incorrectly
- What is calculated here should be the total bytes occupied by the names of all files. @Igor Fedotov
- 09:29 AM crimson Bug #65531: crimson-osd: dump_historic_slow_ops command not correctly run
- https://github.com/ceph/ceph/pull/56994
- 07:50 AM crimson Bug #65585: unittest-seastore (Timeout)
- https://github.com/ceph/ceph/pull/56979...
- 07:47 AM crimson Bug #65585: unittest-seastore (Timeout)
- https://github.com/ceph/ceph/pull/56982...
- 07:35 AM crimson Bug #65585: unittest-seastore (Timeout)
- The pasted log is from https://github.com/ceph/ceph/pull/56998#issuecomment-2065880693
- 07:33 AM crimson Bug #65585 (Resolved): unittest-seastore (Timeout)
- ...
- 07:18 AM ceph-volume Bug #65584 (Fix Under Review): ceph-volume: use os.makedirs to implement mkdir_p
- ceph-volume fails if /var/lib/ceph/osd/ does not exist...
- 06:47 AM RADOS Feature #65583 (New): mon store data should be available depending on the user keyring
- For a specific ceph user, data should be restricted on the mon store.
Let's say client.user1 stores data `clien... - 06:00 AM Ceph Backport #65582 (New): squid: qa/vstart_runner: increase timeout for sake of "Ceph API tests" CI job
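A hypothetical illustration of the request in #65583, assuming client.user2 holds mon caps broad enough to run config-key commands; the ask is that the second command be denied anyway:
$ ceph config-key set client.user1/app_secret s3kr1t
$ ceph --name client.user2 config-key get client.user1/app_secret    # feature request: deny unless the caller owns the prefix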
- 05:41 AM Ceph Bug #65565 (Pending Backport): qa/vstart_runner: increase timeout for sake of "Ceph API tests" CI job
- 05:40 AM Dashboard Backport #65581 (In Progress): squid: mgr/dashboard: run-tox-mgr-dashboard-py3 failure in make check
- 05:29 AM Dashboard Backport #65581 (Resolved): squid: mgr/dashboard: run-tox-mgr-dashboard-py3 failure in make check
- https://github.com/ceph/ceph/pull/56999
- 05:23 AM Dashboard Bug #65571 (Pending Backport): mgr/dashboard: run-tox-mgr-dashboard-py3 failure in make check
- 05:17 AM CephFS Bug #65580 (Triaged): mds/client: add dummy client feature to test client eviction
- Currently, fs:upgrade:featureful_client:old_client uses octopus client with a newer MDS. The octopus client lacks a p...
- 03:49 AM CephFS Fix #65579 (New): mds: use _exit for QA killpoints rather than SIGABRT
- Using signals to abruptly kill the MDS has a few issues:
- teuthology logs are polluted with stacktraces
- coredu... - 03:37 AM cephsqlite Bug #65494 (Fix Under Review): ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
- 02:48 AM devops Backport #65578 (In Progress): reef: ccache is always miss in confusa14
- 02:38 AM devops Backport #65578 (In Progress): reef: ccache is always miss in confusa14
- https://github.com/ceph/ceph/pull/56993
- 02:47 AM devops Backport #65577 (In Progress): squid: ccache is always miss in confusa14
- 02:38 AM devops Backport #65577 (In Progress): squid: ccache is always miss in confusa14
- https://github.com/ceph/ceph/pull/56992
- 02:47 AM devops Backport #65576 (In Progress): quincy: ccache is always miss in confusa14
- 02:38 AM devops Backport #65576 (In Progress): quincy: ccache is always miss in confusa14
- https://github.com/ceph/ceph/pull/56991
- 02:30 AM devops Bug #65175 (Pending Backport): ccache is always miss in confusa14
- 12:45 AM Ceph Bug #65249: peering_graph.generated.dot renders weird
- size="7,7" in peering_graph_generated.dot causes the peering_graph_generated.svg file to look the (wrong) way that ca...
- 12:37 AM Ceph Bug #65249: peering_graph.generated.dot renders weird
- dot -Tsvg doc/dev/peering_graph.generated.dot > doc/dev/peering_graph.generated.svg
The above command as of today ... - 12:34 AM Linux kernel client Bug #65563: WARNING: CPU: 7 PID: 40807 at mm/page_alloc.c:4545 __alloc_pages+0x1e7/0x270
- I have fixed the kernel call trace in kernel space; the patch is at https://patchwork.kernel.org/project/ceph-devel/l...
04/18/2024
- 11:05 PM rbd Bug #54292: run-rbd-unit-tests-127.sh times out on Jenkins "make check" runs
- sorry to pile on, but it's hard to know which tracker issue is related to which crash. from squid pr https://jenkins....
- 10:52 PM Ceph QA QA Run #65237: wip-ceph_test_rados-partial-reads
- @rzarzynski ping?
- 10:51 PM Ceph QA QA Run #65560 (QA Needs Approval): wip-yuri5-testing-2024-04-17-1400
- 10:12 PM rgw Bug #65575 (Pending Backport): release note for rgw_realm init
- 09:25 PM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
- ...
- 08:19 PM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
- https://github.com/rzarzynski/ceph/commit/1a4d3f01816cedb15106fe2cdb52322029482827 changed ScrubMap::object::attrs to...
- 09:24 PM rgw Bug #64841: java_s3tests: testObjectCreateBadExpectMismatch failure
- i tried running the python reproducer from https://tracker.ceph.com/issues/58286, but it doesn't reproduce the @bad m...
- 09:08 PM rgw Bug #64841: java_s3tests: testObjectCreateBadExpectMismatch failure
- thanks Ali, that's super helpful. i came across https://tracker.ceph.com/issues/58286 which looks like the exact same...
- 08:55 PM rgw Bug #64841: java_s3tests: testObjectCreateBadExpectMismatch failure
- Here is a snippet with two of those "bad method" statements from the log I referenced in the last comment.
https:/... - 08:33 PM rgw Bug #64841: java_s3tests: testObjectCreateBadExpectMismatch failure
- After having radosgw under valgrind and running the java s3tests I was able to reproduce the "failed to read header: ...
- 08:52 PM Ceph QA QA Run #65574 (QA Closed): wip-yuri7-testing-2024-04-18-1351-reef
--- done. these PRs were included:
https://github.com/ceph/ceph/pull/54150 - reef: ceph_mon: Fix MonitorDBStore us...- 05:21 PM rbd Bug #65573 (Fix Under Review): insufficient randomness for group and group snapshot IDs
- 05:12 PM rbd Bug #65573 (Pending Backport): insufficient randomness for group and group snapshot IDs
- Nithya noticed that group IDs end up being very similar:...
- 05:13 PM Dashboard Feature #56429: mgr/dashboard: Remote user authentication (e.g. via apache2)
- If SSO should be the primary login method, and the local login is only needed for emergencies (Network/IdP down), the...
- 05:07 PM Dashboard Feature #56429: mgr/dashboard: Remote user authentication (e.g. via apache2)
- Hello Ernesto,
This interface seems to imply that a username and password is entered on a login page and passed to... - 05:06 PM rbd Backport #65548 (Duplicate): reef: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
- The bot created two reef backport tickets for some reason.
- 04:18 PM rgw Feature #65551 (Fix Under Review): [rgw][accounts] bucket quota management at account-level
- 04:00 PM RADOS Feature #64519: OSD/MON: No snapshot metadata keys trimming
- Eugen Block wrote in #note-10:
> Thanks, Matan! It sounds very promising. I talked to the customer and they are will... - 03:34 PM Ceph QA QA Run #65510 (QA Closed): wip-yuriw-testing-20240416.150233
- 03:34 PM Ceph QA QA Run #65510: wip-yuriw-testing-20240416.150233
- @matan @lflores thx a million!
- 03:27 PM Ceph QA QA Run #65510: wip-yuriw-testing-20240416.150233
- Laura Flores wrote in #note-6:
> @matan I forgot to say, can you also include a "Rados approved: <link to your summa... - 03:18 PM Ceph QA QA Run #65510: wip-yuriw-testing-20240416.150233
- @matan I forgot to say, can you also include a "Rados approved: <link to your summary>" message on the PRs now that t...
- 12:10 PM Ceph QA QA Run #65510 (QA Approved): wip-yuriw-testing-20240416.150233
- 7659275, 7659345, 7659406, 7659407, 7659470 - https://tracker.ceph.com/issues/61774
7659280 - https://tracker.ceph.c... - 03:34 PM RADOS Backport #65306: squid: src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps)
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56814
merged - 03:33 PM RADOS Backport #65312: squid: decoding chunk_refs_by_hash_t return wrong values
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56697
merged - 03:33 PM RADOS Backport #65072: squid: rados/thrash: slow reservation response from 1 (115547ms) in cluster log
- https://github.com/ceph/ceph/pull/56482 merged
- 03:31 PM RADOS Backport #65140: squid: osd: modify PG deletion cost for mClock scheduler
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56474
merged - 03:31 PM mgr Backport #65117: squid: rados/upgrade/parallel: [WRN] TELEMETRY_CHANGED: Telemetry requires re-opt-in
- Laura Flores wrote:
> https://github.com/ceph/ceph/pull/56457
merged - 03:30 PM RADOS Backport #65097: squid: ceph osd pool rmsnap clone object leak
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56432
merged - 03:03 PM Ceph QA QA Run #65330 (QA Closed): wip-yuri7-testing-2024-04-04-0800
- 02:49 PM Ceph QA QA Run #65330 (QA Approved): wip-yuri7-testing-2024-04-04-0800
- 01:06 PM Ceph QA QA Run #65330: wip-yuri7-testing-2024-04-04-0800
- fs approve. failures are - https://tracker.ceph.com/projects/cephfs/wiki/Squid#2024-04-18
NOTE: A couple of PRs ha... - 05:38 AM Ceph QA QA Run #65330: wip-yuri7-testing-2024-04-04-0800
- Yuri Weinstein wrote in #note-7:
> @vshankar ping!
Apologies - on it now! - 03:03 PM CephFS Backport #65295: squid: High cephfs MDS latency and CPU load with snapshots and unlink operations
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56671
merged - 03:03 PM CephFS Backport #65106: squid: qa: probabilistically ignore PG_AVAILABILITY/PG_DEGRADED
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56665
merged - 03:02 PM CephFS Backport #65275: squid: mds: some request errors come from errno.h rather than fs_types.h
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56663
merged - 02:21 PM Linux kernel client Bug #65563: WARNING: CPU: 7 PID: 40807 at mm/page_alloc.c:4545 __alloc_pages+0x1e7/0x270
- Venky, IMO this also should be the same issue with this:
https://pulpito.ceph.com/vshankar-2024-03-13_13:59:32-fs... - 10:19 AM Linux kernel client Bug #65563 (Fix Under Review): WARNING: CPU: 7 PID: 40807 at mm/page_alloc.c:4545 __alloc_pages+0x1e7/0x270
- 07:45 AM Linux kernel client Bug #65563 (In Progress): WARNING: CPU: 7 PID: 40807 at mm/page_alloc.c:4545 __alloc_pages+0x1e7/0x270
- The mds sent out the open session reply with *cap_auths [MDSCapAuth( uid=1000 gids=1301readable=1, writeable=1),MDSCa...
- 07:26 AM Linux kernel client Bug #65563 (Fix Under Review): WARNING: CPU: 7 PID: 40807 at mm/page_alloc.c:4545 __alloc_pages+0x1e7/0x270
- https://pulpito.ceph.com/yuriw-2024-04-05_22:36:11-fs-wip-yuri7-testing-2024-04-04-0800-distro-default-smithi/7642062...
- 02:21 PM rgw Bug #64971 (New): Rgw lifecycle skip
- 02:20 PM rgw Bug #64983 (Fix Under Review): multisite: two-zonegroup tests get stuck in redirect loops
- 02:18 PM Ceph QA QA Run #65558: wip-yuri4-testing-2024-04-19-0708-quincy (old wip-yuriw-testing-20240417.204632 (wip-yuri4-testing))
- retriggered centos8
- 02:17 PM rgw Bug #65216 (In Progress): rgw: only accept valid ipv4 from host header
- 02:17 PM Ceph QA QA Run #65552 (QA Needs Approval): wip-yuri2-testing-2024-04-17-0823-reef
- 02:16 PM Ceph QA QA Run #65552: wip-yuri2-testing-2024-04-17-0823-reef
- @lflores running with -p 75
- 02:16 PM rgw Bug #65369 (Fix Under Review): rgw: allow disabling bucket stats on head bucket
- 02:16 PM rgw Bug #65397 (Fix Under Review): rgw: allow disabling mdsearch APIs
- 02:15 PM rgw Bug #65436 (Need More Info): Getting Object Crashing radosgw services
- > After upgrade to 17.2.7, this bug gone
it sounds like this bug is fixed in a later point release; can you please t... - 02:10 PM rgw Bug #65462 (Fix Under Review): rgw: eliminate ssl enforcement for sse-s3 encryption
- 02:09 PM rgw Bug #65468 (Fix Under Review): rgw: set correct requestId and hostId on s3select error
- 02:03 PM rgw Bug #65337 (Fix Under Review): rgw: Segmentation fault in rgw::notify::Manager during realm reload
- 01:48 PM CephFS Bug #65572 (New): Command failed (workunit test fs/snaps/untar_snap_rm.sh) on smithi155 with status 1
- This has started to show up again (with fs/thrash). See: https://pulpito.ceph.com/yuriw-2024-04-05_22:36:11-fs-wip-yu...
- 01:28 PM CephFS Backport #65570 (Fix Under Review): squid: Quiesce may fail randomly with EBADF due to the same root submitted to the MDCache multiple times under the same quiesce request
- 12:40 PM CephFS Backport #65570 (Fix Under Review): squid: Quiesce may fail randomly with EBADF due to the same root submitted to the MDCache multiple times under the same quiesce request
- 01:22 PM Dashboard Bug #65571 (Resolved): mgr/dashboard: run-tox-mgr-dashboard-py3 failure in make check
- ...
- 01:19 PM sepia Bug #65475: folio03 install
- Thanks @akraitma , i am able to connect and use folio03
- 01:02 PM Dashboard Bug #62972: ERROR: test_list_enabled_module (tasks.mgr.dashboard.test_mgr_module.MgrModuleTest)
- https://jenkins.ceph.com/job/ceph-api/72585/
- 01:02 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- Ilya Dryomov wrote in #note-14:
> log_to_file gets set to true by Rook as part of enabling the log collector:
>
>... - 07:23 AM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- log_to_file gets set to true by Rook as part of enabling the log collector:
https://github.com/rook/rook/blob/a9fd... - 12:58 PM RADOS Bug #65449 (Fix Under Review): NeoRadosWatchNotify.WatchNotifyTimeout failed due to nonexistent pool
- 12:09 PM RADOS Bug #65449: NeoRadosWatchNotify.WatchNotifyTimeout failed due to nonexistent pool
- /a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659537
- 12:35 PM CephFS Bug #65545 (Pending Backport): Quiesce may fail randomly with EBADF due to the same root submitted to the MDCache multiple times under the same quiesce request
- 12:34 PM sepia Support #65535: Sepia Lab Access Request
- Public ssh key:
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCXMJ5WoP5k7wk5XuRZjjUESEuH38UoIDWmYqb0e7VPEy3c05whYa2ctuLj+/... - 12:10 PM RADOS Bug #44510: osd/osd-recovery-space.sh TEST_recovery_test_simple failure
- /a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659542
- 12:09 PM RADOS Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- /a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659395
/a/yuriw-2024-04-... - 12:08 PM RADOS Bug #65186: OSDs unreachable in upgrade test
- /a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659312
/a/yuriw-2024-04-... - 12:07 PM RADOS Cleanup #65521: Add expected warnings in cluster log to ignorelists
- /a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659305
- 12:07 PM Infrastructure Bug #65448: Teuthology unable to find the "ceph-radosgw" package
- /a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659304
- 12:06 PM RADOS Bug #65183: Overriding an EC pool needs the "--yes-i-really-mean-it" flag in addition to "force"
- /a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659300
/a/yuriw-2024-04-... - 12:06 PM Orchestrator Bug #52109: test_cephadm.sh: Timeout('Port 8443 not free on 127.0.0.1.',)
- /a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659292
- 12:05 PM RADOS Bug #62839: Teuthology failure in LibRadosTwoPoolsPP.HitSetWrite
- /a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659285
- 12:05 PM RADOS Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
- /a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659280
- 12:04 PM RADOS Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
- /a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659275/
/a/yuriw-2024-04... - 11:19 AM CephFS Bug #64659: mds: switch to using xlists instead of elists
- @vshankar any thoughts on this?
- 11:00 AM Dashboard Bug #65569 (New): exporter: allow all zone names pattern for sync counters
- Currently the exporter only supports zone names which have `-`'s in them for rgw sync metrics. Adapt the regex to also...
- 10:28 AM crimson Bug #65568 (New): osd crashes when trimming snaps involves unrecovered objects
- The current crimson implementation doesn't recover objects when trimming snaps. So, if we are trimming a snapshot, an...
- 09:58 AM Ceph Bug #65228 (Fix Under Review): class:device-class config database mask does not work for osd_compact_on_start
- 09:52 AM rgw Bug #65567 (Duplicate): admin_socket_output: signal: Terminated from term radosgw
- ...
- 09:49 AM Dashboard Bug #65506 (Resolved): rgw roles e2e tests failure
- 09:49 AM Dashboard Backport #65542 (Resolved): squid: rgw roles e2e tests failure
- 09:44 AM Ceph Bug #65565: qa/vstart_runner: increase timeout for sake of "Ceph API tests" CI job
- The commit has been cherry-picked to a different PR for a faster merge and to avoid circular dependency for CI to be ...
- 09:24 AM Ceph Bug #65565 (Pending Backport): qa/vstart_runner: increase timeout for sake of "Ceph API tests" CI job
- 09:37 AM nvme-of Feature #65566 (Pending Backport): Change some default values for OMAP lock parameters in nvmeof conf file
- We want to change some default values in the OMAP lock parameters in the nvmeof conf file generated by cephadm:
* ... - 09:35 AM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
- Dhairya Parmar wrote in #note-28:
> as mentioned in yesterday's standup - some of the PRs (https://github.com/ceph/c... - 09:19 AM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
- apart from the discussion about MDS verifying clients OSDs set, https://tracker.ceph.com/issues/64563#note-28 also ne...
- 09:23 AM Ceph Bug #65533 (Resolved): qa/vstart_runner.py: don't let command run after timeout
- 09:15 AM CephFS Bug #65564 (Fix Under Review): Test failure: test_snap_schedule_subvol_and_group_arguments_08 (tasks.cephfs.test_snap_schedules.TestSnapSchedulesSubvolAndGroupArguments)
- /a/yuriw-2024-04-05_22:36:11-fs-wip-yuri7-testing-2024-04-04-0800-distro-default-smithi/7642196...
- 08:33 AM CephFS Bug #64977: mds spinlock due to lock contention leading to memory exaustion
- We've uploaded a new set of logs with debug_ms 1 at 20d8ba67-8bb0-4cfc-a986-b72ec250728d
- 07:03 AM CephFS Bug #54404 (Closed): snap-schedule retention not working as expected
- Closing tracker due to lack of info.
If no valid retention is found during pruning phase, then all snapshots are imm... - 01:23 AM Ceph QA QA Run #65270: wip-yuri6-testing-2024-04-02-1310
- @lflores rebuilding (note it's very slow :()
- 12:46 AM Ceph QA QA Run #65562 (QA Closed): wip-pdonnell-testing-20240418.004638-debug
- * "PR #56935":https://github.com/ceph/ceph/pull/56935 -- mds: regular file inode flags are not replicated by the poli...
04/17/2024
- 10:25 PM Ceph QA QA Run #65561 (QA Closed): wip-yuriw-testing-20240417.222151-quincy
- duplicate of
- 10:22 PM Ceph QA QA Run #65561 (QA Closed): wip-yuriw-testing-20240417.222151-quincy
- * "PR #56818":https://github.com/ceph/ceph/pull/56818 -- quincy: qa/rgw: barbican uses branch stable/2023.1
* "PR #5... - 10:24 PM Ceph QA QA Run #65558: wip-yuri4-testing-2024-04-19-0708-quincy (old wip-yuriw-testing-20240417.204632 (wip-yuri4-testing))
- also dupe is https://tracker.ceph.com/issues/65561
- 08:47 PM Ceph QA QA Run #65558 (QA Closed): wip-yuri4-testing-2024-04-19-0708-quincy (old wip-yuriw-testing-20240417.204632 (wip-yuri4-testing))
- --- done. these PRs were included:
https://github.com/ceph/ceph/pull/54172 - quincy: prevent anonymous topic operati... - 09:47 PM Ceph QA QA Run #65510: wip-yuriw-testing-20240416.150233
- When you're ready to approve it, change the Tracker status to "QA Approved" and reassign to Yuri. If anything needs r...
- 09:46 PM Ceph QA QA Run #65510: wip-yuriw-testing-20240416.150233
- @matan can you review the rados suite?
- 09:44 PM Ceph QA QA Run #65270 (QA Needs Rerun/Rebuilt): wip-yuri6-testing-2024-04-02-1310
- Hey @yuriw, https://github.com/ceph/ceph/pull/53545 caused some regressions. Can you remove it from the batch and reb...
- 09:41 PM RADOS Bug #65557 (Closed): Admin socket times out after osd restart
- This was actually related to a WIP branch that hasn't merged yet.
- 08:46 PM RADOS Bug #65557 (Closed): Admin socket times out after osd restart
- /a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7652505...
- 09:34 PM RADOS Bug #65559 (Closed): src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps)
- Actually seems related to a WIP branch that hadn't been merged yet.
- 08:55 PM RADOS Bug #65559 (Closed): src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps)
- /a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7652491...
- 09:24 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- Nir Soffer wrote in #note-12:
> Ilya Dryomov wrote in #note-11:
> > Hi Nir,
> >
> > I think the problem is the m... - 01:08 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- Ilya Dryomov wrote in #note-11:
> Hi Nir,
>
> I think the problem is the method you used to set these config opti... - 01:04 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- Hi Nir,
I think the problem is the method you used to set these config options. Note that the way it's done in OD... - 08:56 PM Ceph QA QA Run #65560 (QA Closed): wip-yuri5-testing-2024-04-17-1400
- --- done. these PRs were included:
https://github.com/ceph/ceph/pull/49438 - os/bluestore: set rocksdb iterator boun... - 08:46 PM RADOS Bug #53768: timed out waiting for admin_socket to appear after osd.2 restart in thrasher/defaults workload/small-objects
- Laura Flores wrote in #note-12:
> /a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-defaul... - 01:59 AM RADOS Bug #53768: timed out waiting for admin_socket to appear after osd.2 restart in thrasher/defaults workload/small-objects
- @lflores There's little chance that the above crash is related to what Joseph saw here, let's close this one and open...
- 08:14 PM Ceph QA QA Run #65530 (QA Closed): wip-pdonnell-testing-20240417.021458-debug
- Some strange ansible.cephlab failure is breaking this run.
- 02:15 AM Ceph QA QA Run #65530 (QA Closed): wip-pdonnell-testing-20240417.021458-debug
- * "PR #56935":https://github.com/ceph/ceph/pull/56935 -- mds: regular file inode flags are not replicated by the poli...
- 08:08 PM CephFS Backport #65556 (Fix Under Review): squid: mds: avoid recalling Fb when quiescing file
- 08:03 PM CephFS Bug #65472 (Pending Backport): mds: avoid recalling Fb when quiescing file
- 07:58 PM devops Bug #65555 (New): old pinned mistune in admin/doc-requirements.txt is vulnerable to CVE-2022-34749
- @admin/doc-requirements.txt@ pins to an older @mistune@ library version. Security scanners treat this as a vulnerabil...
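A rough standalone sketch (not a Ceph tool) of the kind of check a scanner performs here: flag a @mistune@ pin older than the assumed fixed release. The 2.0.3 floor for CVE-2022-34749 is an assumption and should be confirmed against the advisory before bumping the pin.
<pre>
# Hypothetical helper (not part of Ceph): flag an old mistune pin in a
# requirements file. ASSUMED_FIXED is an assumption for CVE-2022-34749.
import re
import sys

ASSUMED_FIXED = (2, 0, 3)

def version_tuple(text):
    # "2.0.2" -> (2, 0, 2); ignores anything beyond the first three numbers.
    return tuple(int(p) for p in re.findall(r"\d+", text)[:3])

def check(path):
    with open(path) as f:
        for line in f:
            m = re.match(r"\s*mistune\s*==\s*([0-9][\w.]*)", line)
            if m and version_tuple(m.group(1)) < ASSUMED_FIXED:
                print(f"{path}: mistune pinned to {m.group(1)} (older than assumed fix 2.0.3)")
                return 1
    return 0

if __name__ == "__main__":
    sys.exit(check(sys.argv[1] if len(sys.argv) > 1 else "admin/doc-requirements.txt"))
</pre>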
- 07:32 PM Dashboard Bug #46735: FAIL: test_all (tasks.mgr.dashboard.test_rgw.RgwBucketTest)
- from https://jenkins.ceph.com/job/ceph-api/72562/consoleFull...
- 07:18 PM Orchestrator Bug #65546: quincy|reef: qa/suites/upgrade/pacific-x: failure to pull image causes dead jobs
- https://pulpito.ceph.com/teuthology-2024-04-17_01:16:02-upgrade:quincy-x-reef-distro-default-smithi/
- 02:12 PM Orchestrator Bug #65546 (New): quincy|reef: qa/suites/upgrade/pacific-x: failure to pull image causes dead jobs
- https://pulpito.ceph.com/teuthology-2024-04-17_01:08:06-upgrade:pacific-x-reef-distro-default-smithi/
Beyond the i... - 07:06 PM mgr Bug #64799: mgr: update cluster state for new maps from the mons before notifying modules
- Per "let's not hurry up with backporting this change. IMHO it deserves some _baking_ in `main`.":
> let's not hurry... - 07:03 PM RADOS Bug #62588: ceph config set allows WHO to be osd.*, which is misleading
- I created a pull request for this: https://github.com/ceph/ceph/pull/56971
A warning message is now generated if use... - 06:36 PM Ceph Bug #65509: osd: remove outdated, incorrect truncate asserts in ECTransaction's generate_transactions
- Per https://github.com/ceph/ceph/pull/56924#issuecomment-2061948862 a workaround exists:
> (...) we could recover ... - 06:04 PM Dashboard Bug #47612: ERROR: setUpClass (tasks.mgr.dashboard.test_health.HealthTest)
- https://jenkins.ceph.com/job/ceph-api/72561/...
- 06:00 PM RADOS Bug #53000: OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue: pure virtual method called
- from https://jenkins.ceph.com/job/ceph-pull-requests/133465/consoleFull...
- 05:34 PM Orchestrator Bug #65554 (In Progress): mgr/nfs: nfs module commands do not accept json-pretty format
- ...
- 04:10 PM Ceph QA QA Run #65045 (QA Closed): wip-yuri5-testing-2024-03-21-0833
- 04:08 PM Ceph QA QA Run #65045: wip-yuri5-testing-2024-03-21-0833
- Aishwarya Mathuria wrote in #note-20:
> All failures are tracked here: https://tracker.ceph.com/projects/rados/wiki/... - 03:53 PM Ceph QA QA Run #65045: wip-yuri5-testing-2024-03-21-0833
- All failures are tracked here: https://tracker.ceph.com/projects/rados/wiki/MAIN#httpstrackercephcomissues65045
- 02:56 PM Ceph QA QA Run #65045 (QA Needs Approval): wip-yuri5-testing-2024-03-21-0833
- 02:54 PM Ceph QA QA Run #65045 (QA Approved): wip-yuri5-testing-2024-03-21-0833
- Aishwarya Mathuria wrote in #note-17:
> Rados approved
Great!
@amathuri ps change the status to QA Approve in the f... - 12:43 PM Ceph QA QA Run #65045: wip-yuri5-testing-2024-03-21-0833
- Rados approved
- 03:36 PM Orchestrator Bug #65553 (Pending Backport): cephadm: agent tries to json load response payload before checking for errors
- If the connection itself fails, the agent will end up hitting another exception...
- 03:33 PM rgw Backport #65353 (In Progress): squid: rgwlc: Executing radosgw-admin lc process --bucket <bkt-name> without setting lc rule results in Segmentation fault
- 03:32 PM Ceph QA QA Run #65552 (QA Closed): wip-yuri2-testing-2024-04-17-0823-reef
--- done. these PRs were included:
https://github.com/ceph/ceph/pull/54662 - reef: debian: add missing bcrypt to c...- 03:30 PM rgw Backport #64496 (In Progress): squid: keystone admin token is not invalidated on http 401 response
- 03:28 PM rgw Backport #64552 (In Progress): squid: rgw/multisite: objects named "." or ".." are not replicated
- 03:26 PM rgw Feature #65551 (Pending Backport): [rgw][accounts] bucket quota management at account-level
- Account feature has been introduced by https://github.com/ceph/ceph/pull/54333 and we are planning to migrate our rad...
- 03:19 PM rbd Backport #65550 (In Progress): squid: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
- https://github.com/ceph/ceph/pull/57031
- 03:18 PM rbd Backport #65549 (In Progress): reef: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
- https://github.com/ceph/ceph/pull/57030
- 03:07 PM rbd Backport #65548 (Duplicate): reef: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
- 03:07 PM rbd Backport #65547 (Resolved): quincy: [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
- https://github.com/ceph/ceph/pull/57029
- 03:02 PM rbd Bug #65481 (Pending Backport): [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
- 01:40 PM CephFS Bug #65545 (Fix Under Review): Quiesce may fail randomly with EBADF due to the same root submitted to the MDCache multiple times under the same quiesce request
- 01:34 PM CephFS Bug #65545 (Pending Backport): Quiesce may fail randomly with EBADF due to the same root submitted to the MDCache multiple times under the same quiesce request
- Reported by the QE team at https://bugzilla.redhat.com/show_bug.cgi?id=2275459...
- 01:37 PM rgw Feature #65050: Add alternative way for providing user name/password for Kafka endpoint authentication
- Needs review. Corresponding PR is here:
https://github.com/ceph/ceph/pull/56493 - 01:01 PM CephFS Backport #65325 (In Progress): reef: client: log message when unmount call is received
- 01:01 PM CephFS Backport #65326 (In Progress): quincy: client: log message when unmount call is received
- 12:52 PM CephFS Backport #65365 (In Progress): reef: qa: run TestSnapshots.test_kill_mdstable for all mount types
- 12:52 PM CephFS Backport #65366 (In Progress): squid: qa: run TestSnapshots.test_kill_mdstable for all mount types
- 12:51 PM CephFS Backport #65520 (In Progress): reef: qa: cluster [WRN] Health detail: HEALTH_WARN 1 pool(s) do not have an application enabled" in cluster log
- 12:50 PM CephFS Backport #65519 (In Progress): squid: qa: cluster [WRN] Health detail: HEALTH_WARN 1 pool(s) do not have an application enabled" in cluster log
- 12:38 PM rgw Backport #65543 (In Progress): squid: rgw: increase log level on abort_early
- 12:37 PM rgw Backport #65543 (In Progress): squid: rgw: increase log level on abort_early
- https://github.com/ceph/ceph/pull/56949
- 12:37 PM Ceph Backport #65540 (In Progress): reef: Add alerts to ceph monitoring stack for the nvmeof gateways
- 09:39 AM Ceph Backport #65540 (Resolved): reef: Add alerts to ceph monitoring stack for the nvmeof gateways
- https://github.com/ceph/ceph/pull/56948
- 12:37 PM Orchestrator Bug #63784: qa/standalone/mon/mkfs.sh:'mkfs/a' already exists and is not empty: monitor may already exist
- /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648785/
- 12:37 PM rgw Backport #65544 (New): reef: rgw: increase log level on abort_early
- 12:35 PM RADOS Bug #50245: TEST_recovery_scrub_2: Not enough recovery started simultaneously
- /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648573/
- 12:31 PM Ceph Backport #65539 (In Progress): squid: Add alerts to ceph monitoring stack for the nvmeof gateways
- 09:39 AM Ceph Backport #65539 (Resolved): squid: Add alerts to ceph monitoring stack for the nvmeof gateways
- https://github.com/ceph/ceph/pull/56947
- 12:27 PM rgw Bug #65469 (Pending Backport): rgw: increase log level on abort_early
- 12:11 PM Orchestrator Bug #65035: ERROR: required file missing from config-json: idmap.conf
- /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648585/
- 12:04 PM RADOS Bug #53544: src/test/osd/RadosModel.h: ceph_abort_msg("racing read got wrong version") in thrash_cache_writeback_proxy_none tests
- @lflores FYI seeing this one after a while in one of the main runs - /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-tes...
- 11:48 AM Dashboard Backport #64791 (In Progress): squid: mgr/dashboard: In rgw multisite, during zone creation acess/secret key should not be compulsory provide an edit option to set these keys
- 11:47 AM RADOS Bug #56393: failed to complete snap trimming before timeout
- /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648606
/a/yuriw-2024-04-... - 11:45 AM Dashboard Bug #61786: test_dashboard_e2e.sh: Can't run because no spec files were found; couldn't determine Mocha version
- /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648833
- 11:43 AM RADOS Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648693/
- 11:41 AM Infrastructure Bug #65229: Failed to reconnect to smithiXXX
- @akraitma I am seeing this here - /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default...
- 10:48 AM Orchestrator Bug #63502: Regression: Permanent KeyError: 'TYPE' : return self.blkid_api['TYPE'] == 'part'
- Vadym Kukharenko wrote in #note-1:
> I got the same problem.
> Fistly tried to upgrade from 17.2.6 to 17.2.7.
> Se... - 10:14 AM Dashboard Bug #65534 (Fix Under Review): mgr/dashboard: grafana dashboad doesn't exist when anonymous_access is enabled
- 07:10 AM Dashboard Bug #65534 (Pending Backport): mgr/dashboard: grafana dashboad doesn't exist when anonymous_access is enabled
- Overall Performance does not display the graphs
Description of problem:
# cat /var/lib/ceph/tmp/grafana.yaml
s... - 10:10 AM Dashboard Cleanup #65207 (Resolved): mgr/dashboard: Move features to advanced section in create image form and expand by default rbd config section
- 10:10 AM Dashboard Backport #65504 (Resolved): reef: mgr/dashboard: Move features to advanced section in create image form and expand by default rbd config section
- 10:09 AM Dashboard Backport #65505 (Resolved): squid: mgr/dashboard: Move features to advanced section in create image form and expand by default rbd config section
- 10:08 AM Dashboard Backport #65542 (In Progress): squid: rgw roles e2e tests failure
- 10:02 AM Dashboard Backport #65542 (Resolved): squid: rgw roles e2e tests failure
- https://github.com/ceph/ceph/pull/56945
- 10:07 AM RADOS Bug #62512: osd msgr-worker high cpu 300% due to throttle-osd_client_messages get_or_fail_fail (osd_client_message_cap=256)
- Please see my latest update to the PR: https://github.com/ceph/ceph/pull/53477
I can confirm the fix is good and a... - 09:55 AM Dashboard Bug #65506 (Pending Backport): rgw roles e2e tests failure
- 09:45 AM Dashboard Bug #65541 (New): Empty (string, list, object) should be blank in dashboard
- Empty (string, list, object) should be blank in dashboard
We need to see how we show empty data structures in dash... - 09:31 AM Ceph Backport #65538 (New): reef: Add alerts to ceph monitoring stack for the nvmeof gateways
- 09:28 AM Ceph Feature #64335 (Pending Backport): Add alerts to ceph monitoring stack for the nvmeof gateways
- 08:37 AM Orchestrator Bug #64868: cephadm/osds, cephadm/workunits: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) in cluster log
- /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648713/
/a/yuriw-2024-04... - 08:29 AM RADOS Bug #64942: rados/verify: valgrind reports "Invalid read of size 8" error.
- /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648773/
- 08:23 AM Orchestrator Bug #64871: rados/cephadm/workunits: Health check failed: 1 failed cephadm daemon(s) (CEPHADM_FAILED_DAEMON)" in cluster log
- /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648786/
/a/yuriw-2024-04-... - 08:16 AM Messengers Documentation #65537 (New): RDMA support
- Hi guys,
I needed to set up Ceph over RDMA, but I faced many issues because there is not enough info in the docume...
- 07:50 AM CephFS Bug #65536 (Fix Under Review): mds: after the unresponsive client was evicted the blocked slow requests were not successfully cleaned up
First, a *client.188978:3 lookup #0x10000000000/csi* client request came in and was then added to the waiter list:
...- 07:47 AM Orchestrator Bug #64872: rados/cephadm/smoke: Health check failed: 1 stray daemon(s) not managed by cephadm (CEPHADM_STRAY_DAEMON) in cluster log
- /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648850/
- 07:33 AM Orchestrator Bug #65017: cephadm: log_channel(cephadm) log [ERR] : Failed to connect to smithi090 (10.0.0.9). Permission denied
- /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648574/
- 07:28 AM sepia Support #65535 (In Progress): Sepia Lab Access Request
- kapandya@macvm Bp4C6fBYeF56whmrSsCAoQ 6567cd0199a79c959e1a34a0793b2db3a9cc16ac2a503b772de4cd9f642cc590
- 07:23 AM RADOS Bug #61774: centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons
- /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648705
/a/yuriw-2024-04-... - 07:16 AM RADOS Bug #65517: rados/thrash-erasure-code-crush-4-nodes: ceph task fails at getting monitors
- /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648565
/a/yuriw-2024-04-... - 06:57 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- Venky Shankar wrote in #note-15:
> Dhairya Parmar wrote in #note-14:
> > Venky Shankar wrote in #note-13:
> > > Ve... - 06:47 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- Dhairya Parmar wrote in #note-14:
> Venky Shankar wrote in #note-13:
> > Venky Shankar wrote in #note-11:
> > > Dh... - 06:14 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- Venky Shankar wrote in #note-13:
> Venky Shankar wrote in #note-11:
> > Dhairya Parmar wrote in #note-10:
> > > I ... - 05:03 AM CephFS Feature #65503 (New): mgr/stats, cephfs-top: provide per volume/sub-volume based performance metrics to monitor / troubleshoot performance issues
- 04:54 AM CephFS Feature #65503 (Rejected): mgr/stats, cephfs-top: provide per volume/sub-volume based performance metrics to monitor / troubleshoot performance issues
- 04:40 AM Ceph Bug #65533 (Resolved): qa/vstart_runner.py: don't let command run after timeout
- LocalRemote.run() accepts parameter @timeout@ but it is not passed to @subprocess@ and therefore has no effect.
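A minimal standalone Python sketch of the pattern being fixed (not the actual vstart_runner.py code): a @timeout@ argument that is accepted but never forwarded to @subprocess@ has no effect, while forwarding it makes the child get killed once the limit expires.
<pre>
# Standalone illustration (not the actual vstart_runner.py code) of the bug
# pattern: a timeout that is accepted but never forwarded does nothing.
import subprocess

def run_ignoring_timeout(args, timeout=None):
    # `timeout` is accepted but never used, so the command can run forever.
    return subprocess.run(args, capture_output=True, text=True)

def run_enforcing_timeout(args, timeout=None):
    # Forwarding `timeout` makes subprocess.run() kill the child and raise
    # TimeoutExpired once the limit is reached.
    return subprocess.run(args, capture_output=True, text=True, timeout=timeout)

if __name__ == "__main__":
    try:
        run_enforcing_timeout(["sleep", "5"], timeout=1)
    except subprocess.TimeoutExpired:
        print("command stopped after 1s, as expected")
</pre>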
- 03:16 AM crimson Bug #65532: osd crashes due to invalid clone_range ops
- It seems that this is due to incorrect clone_overlap calculations; I will look into it.
- 03:15 AM crimson Bug #65532 (Fix Under Review): osd crashes due to invalid clone_range ops
- ...
- 02:57 AM sepia Support #65359: Sepia Lab Access Request
- Hi Adam,
I am able to access sepia lab
amk:openvpn$ ping teuthology.front.sepia.ceph.com
PING teuthology.front... - 02:51 AM crimson Bug #65531: crimson-osd: dump_historic_slow_ops command not correctly run
- I don't think it's necessary to put historic_client_request and historic_slow_client_request together, so I will separa...
- 02:34 AM crimson Bug #65531 (In Progress): crimson-osd: dump_historic_slow_ops command not correctly run
- Right now, historic ops and historic slow ops are all placed in the OperationTypeCode::historic_client_request op_list; use l...
- 02:15 AM Ceph QA QA Run #65523 (QA Closed): wip-pdonnell-testing-20240416.232211-debug
- https://github.com/ceph/ceph/pull/56934#pullrequestreview-2004866085
- 01:43 AM rgw Bug #65436: Getting Object Crashing radosgw services
- I have the same issue. After some days, I found bug https://tracker.ceph.com/issues/61359
After upgrading to 17.2.7, this ...
04/16/2024
- 11:26 PM Ceph QA QA Run #65510 (QA Needs Approval): wip-yuriw-testing-20240416.150233
- 03:03 PM Ceph QA QA Run #65510 (QA Closed): wip-yuriw-testing-20240416.150233
- * "PR #56814":https://github.com/ceph/ceph/pull/56814 -- squid: osd/SnapMapper: fix _lookup_purged_snap
* "PR #56697... - 11:22 PM Ceph QA QA Run #65522 (QA Closed): wip-pdonnell-testing-20240416.232051-debug
- 11:21 PM Ceph QA QA Run #65522 (QA Closed): wip-pdonnell-testing-20240416.232051-debug
- * "PR #56935":https://github.com/ceph/ceph/pull/56935 -- mds: regular file inode flags are not replicated by the poli...
- 11:22 PM Ceph QA QA Run #65523 (QA Closed): wip-pdonnell-testing-20240416.232211-debug
- * "PR #56935":https://github.com/ceph/ceph/pull/56935 -- mds: regular file inode flags are not replicated by the poli...
- 11:19 PM RADOS Cleanup #65521 (New): Add expected warnings in cluster log to ignorelists
- Relevant Slack conversation:
Hey all, as I brought up in today's RADOS call, there has been a surge of cluster war... - 11:19 PM CephFS Backport #65520 (In Progress): reef: qa: cluster [WRN] Health detail: HEALTH_WARN 1 pool(s) do not have an application enabled" in cluster log
- https://github.com/ceph/ceph/pull/56951
- 11:18 PM CephFS Backport #65519 (In Progress): squid: qa: cluster [WRN] Health detail: HEALTH_WARN 1 pool(s) do not have an application enabled" in cluster log
- https://github.com/ceph/ceph/pull/56950
- 11:17 PM CephFS Bug #65271 (Pending Backport): qa: cluster [WRN] Health detail: HEALTH_WARN 1 pool(s) do not have an application enabled" in cluster log
- 09:06 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- Ilya Dryomov wrote in #note-9:
> Nir Soffer wrote in #note-8:
> > Yes, the configuration is applied to both cluster... - 08:22 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- Nir Soffer wrote in #note-8:
> Yes, the configuration is applied to both clusters. If I understand correctly,
> The... - 08:06 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- Ilya Dryomov wrote in #note-7:
> Nir Soffer wrote in #note-6:
> > The other log file (e.g. 62f28287-356f-4f81-87dc-... - 07:04 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- Nir Soffer wrote in #note-6:
> The other log file (e.g. 62f28287-356f-4f81-87dc-51bb05942553-client.rbd-mirror-peer.... - 01:03 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- Nir Soffer wrote in #note-5:
> > https://github.com/red-hat-storage/ocs-operator/blob/4a0325d824a409e84fac21ffbf0a... - 08:55 PM RADOS Bug #53768: timed out waiting for admin_socket to appear after osd.2 restart in thrasher/defaults workload/small-objects
- /a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7652505
- 08:55 PM CephFS Bug #65518 (Fix Under Review): mds: regular file inode flags are not replicated by the policylock
- 08:53 PM CephFS Bug #65518 (Pending Backport): mds: regular file inode flags are not replicated by the policylock
- Currently, the flags are only replicated for directory inodes.
- 08:40 PM RADOS Bug #64437: qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
- /a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7652514
- 08:37 PM RADOS Bug #65517: rados/thrash-erasure-code-crush-4-nodes: ceph task fails at getting monitors
- /a/yuriw-2024-03-24_22:19:24-rados-wip-yuri10-testing-2024-03-24-1159-distro-default-smithi/7620629
/a/yuriw-2024-03... - 08:36 PM RADOS Bug #65517: rados/thrash-erasure-code-crush-4-nodes: ceph task fails at getting monitors
- Hey @nmordech can you have a look?
- 08:35 PM RADOS Bug #65517: rados/thrash-erasure-code-crush-4-nodes: ceph task fails at getting monitors
- Looks like the change was made in https://github.com/ceph/ceph/pull/53308, which did initially pass QA testing, but m...
- 08:31 PM RADOS Bug #65517 (Fix Under Review): rados/thrash-erasure-code-crush-4-nodes: ceph task fails at getting monitors
- /a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7652508...
- 08:16 PM Orchestrator Bug #64872: rados/cephadm/smoke: Health check failed: 1 stray daemon(s) not managed by cephadm (CEPHADM_STRAY_DAEMON) in cluster log
- /a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7652511
- 08:12 PM RADOS Bug #65422: upgrade/quincy-x/parallel: "1 pg degraded (PG_DEGRADED)" in cluster log
- ...
- 08:10 PM RADOS Bug #65235: upgrade/reef-x/stress-split: "OSDMAP_FLAGS: noscrub flag(s) set" warning in cluster log
- /a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7652474...
- 08:08 PM RADOS Bug #62776: rados: cluster [WRN] overall HEALTH_WARN - do not have an application enabled
- /a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7652467
- 08:05 PM Ceph Bug #65509 (Fix Under Review): osd: remove outdated, incorrect truncate asserts in ECTransaction's generate_transactions
- https://github.com/ceph/ceph/pull/56924
- 02:47 PM Ceph Bug #65509: osd: remove outdated, incorrect truncate asserts in ECTransaction's generate_transactions
- See: https://github.com/ceph/ceph/pull/56924
- 02:03 PM Ceph Bug #65509 (Fix Under Review): osd: remove outdated, incorrect truncate asserts in ECTransaction's generate_transactions
- User hit this:...
- 08:04 PM CephFS Bug #65496 (Fix Under Review): mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
- 02:42 AM CephFS Bug #65496 (Pending Backport): mds: ceph.dir.subvolume and ceph.quiesce.blocked is not properly replicated
- The logic for checking if an inode already had these vxattrs set has the serious defect that it will only execute xlo...
- 07:59 PM Orchestrator Backport #65415 (Resolved): squid: cephadm: test_cephadm script fails with "ERROR: required file missing from config-json: idmap.conf"
- 07:58 PM Orchestrator Backport #65382 (Resolved): squid: NLM should be enabled in NFS-Ganesha config file for locking functionality to work with v3 protocol
- 07:55 PM Orchestrator Bug #64865 (Resolved): cephadm: Health check failed: 1 osds down (OSD_DOWN) in cluster log
- 07:55 PM Orchestrator Backport #65414 (Resolved): squid: cephadm: Health check failed: 1 osds down (OSD_DOWN) in cluster log
- 07:54 PM Dashboard Bug #64870: Health check failed: 1 osds down (OSD_DOWN)" in cluster log
- More in this run:
https://pulpito.ceph.com/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-... - 07:50 PM Dashboard Bug #64870: Health check failed: 1 osds down (OSD_DOWN)" in cluster log
- And in a cephadm test: /a/yuriw-2024-04-10_14:17:51-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/765...
- 07:44 PM Dashboard Bug #64870: Health check failed: 1 osds down (OSD_DOWN)" in cluster log
- Also found in an upgrade test:
description: rados/upgrade/parallel/{0-random-distro$/{ubuntu_22.04} 0-start 1-task... - 07:51 PM Orchestrator Bug #52109: test_cephadm.sh: Timeout('Port 8443 not free on 127.0.0.1.',)
- /a/yuriw-2024-04-10_14:17:51-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7650646
- 07:49 PM Ceph QA QA Run #65516 (QA Closed): wip-rishabh-testing-20240416.193735
- https://github.com/ceph/ceph/pull/56846
https://github.com/ceph/ceph/pull/56732
https://github.com/ceph/ceph/pull/5... - 07:46 PM Infrastructure Bug #65448: Teuthology unable to find the "ceph-radosgw" package
- /a/yuriw-2024-04-10_14:17:51-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7650669
- 07:42 PM RADOS Bug #65231: upgrade/quincy-x/parallel: "Reduced data availability: 1 pg peering (PG_AVAILABILITY)" in cluster log
- /a/yuriw-2024-04-10_14:17:51-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7650700
- 07:30 PM rgw Backport #64510 (Resolved): squid: backport rgw/lc: decorating log events with more details
- 07:29 PM rgw Backport #64949 (Resolved): squid: rgw-multisite: add x-rgw-replicated-at
- 07:29 PM rgw Backport #65292 (Resolved): squid: pubsub: validate Name in CreateTopic api
- 07:28 PM rgw Backport #65297 (Resolved): squid: allow AWS lifecycle event types to configure lifecycle notifications and Replication notifications
- 07:28 PM rgw Backport #65375 (Resolved): squid: lifecycle transition crashes since reloading bucket attrs for notification
- 07:27 PM rgw Feature #65466 (Resolved): rgw user accounts
- 07:27 PM rgw Backport #65402 (Resolved): squid: persistent topic stats test fails
- 07:27 PM rgw Feature #50078 (Resolved): [RFE] multisite: Bucket notification information should be shared between zones.
- 07:27 PM rgw Backport #64818 (Resolved): squid: [RFE] multisite: Bucket notification information should be shared between zones.
- 07:26 PM rgw Backport #65467 (Resolved): squid: rgw user accounts
- 07:25 PM rgw Backport #65411 (Resolved): squid: Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
- 06:14 PM Dashboard Backport #65515 (In Progress): squid: mgr/dashboard: fix duplicate grafana panels when on mgr failover
- 05:58 PM Dashboard Backport #65515 (In Progress): squid: mgr/dashboard: fix duplicate grafana panels when on mgr failover
- https://github.com/ceph/ceph/pull/56931
- 06:10 PM Dashboard Backport #65513 (In Progress): quincy: mgr/dashboard: fix duplicate grafana panels when on mgr failover
- 05:51 PM Dashboard Backport #65513 (In Progress): quincy: mgr/dashboard: fix duplicate grafana panels when on mgr failover
- https://github.com/ceph/ceph/pull/56930
- 06:01 PM Dashboard Backport #65512 (In Progress): reef: mgr/dashboard: fix duplicate grafana panels when on mgr failover
- 05:51 PM Dashboard Backport #65512 (In Progress): reef: mgr/dashboard: fix duplicate grafana panels when on mgr failover
- https://github.com/ceph/ceph/pull/56929
- 05:52 PM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
- Dhairya Parmar wrote in #note-27:
> Venky Shankar wrote in #note-26:
> > Dhairya Parmar wrote in #note-25:
> > > V... - 08:24 AM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
- Dhairya Parmar wrote in #note-27:
> Venky Shankar wrote in #note-26:
> > Dhairya Parmar wrote in #note-25:
> > > V... - 08:09 AM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
- as mentioned in yesterday's standup - some of the PRs (https://github.com/ceph/ceph/pull/49971, https://github.com/ce...
- 05:51 PM Dashboard Backport #65514 (New): squid: mgr/dashboard: fix duplicate grafana panels when on mgr failover
- 05:49 PM Dashboard Bug #64970 (Pending Backport): mgr/dashboard: fix duplicate grafana panels when on mgr failover
- 05:44 PM Ceph QA QA Run #65420 (QA Closed): wip-yuri2-testing-2024-04-10-1311-squid
- This was assigned to me before there were any QA runs attached. I've already tested all of these PRs. I think all but...
- 05:22 PM RADOS Bug #62588: ceph config set allows WHO to be osd.*, which is misleading
- ...
- 05:11 PM rgw Backport #65351 (Resolved): squid: rgw: crash in lc while transitioning to cloud
- 04:59 PM Orchestrator Documentation #64596 (Resolved): secure monitoring stack support is not documented
- 04:59 PM Orchestrator Backport #64631 (Resolved): squid: secure monitoring stack support is not documented
- 04:41 PM sepia Support #65359: Sepia Lab Access Request
- Hey Amarnath Reddy,
You should have access to the Sepia lab now. Please verify you're able to connect to the vpn a... - 03:52 PM sepia Support #65359: Sepia Lab Access Request
- Hi Adam,
These are new Credentials.
Earlier I did not have access to sepia lab.
Regards,
Amarnath - 04:40 PM sepia Bug #65475: folio03 install
- Hey @nmordech, folio03 is now installed with RHEL 9.3; you should have SSH access in about an hour from now.
- 04:19 PM Orchestrator Bug #65511 (Pending Backport): cephadm: anonymous_access: false is dropped from grafana spec after apply
- ...
- 03:17 PM Dashboard Bug #65506: rgw roles e2e tests failure
- The same issue is happening on squid, hence adding a backport.
- 11:54 AM Dashboard Bug #65506 (Fix Under Review): rgw roles e2e tests failure
- 11:09 AM Dashboard Bug #65506 (Resolved): rgw roles e2e tests failure
- *Rgw roles tests failing with 500 internal server error:*...
- 03:06 PM rgw Backport #65427 (Resolved): squid: Admin Ops socket crashes RGW
- 02:30 PM Ceph QA QA Run #65045: wip-yuri5-testing-2024-03-21-0833
- @lflores sorry for the delay! Will wrap it up by tomorrow.
- 02:17 PM crimson Bug #65491: recover_missing: racing read got wrong version
- Not a fix yet, but I added a few missing log lines that may help here:
https://github.com/ceph/ceph/pull/56916/commits... - 10:06 AM crimson Bug #65491: recover_missing: racing read got wrong version
- WIP
- 01:57 PM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
- Yeah, there is a change in @attrs@ processing. Already prepared a commit: https://github.com/rzarzynski/ceph/commit/c...
- 01:53 PM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
I am considering the following suspect(s):
PR #54930 modified ScrubMap::object::attrs (where we see a problem) from ...- 01:28 PM CephFS Bug #65508 (Fix Under Review): qa: lockup not long enough to for test_quiesce_authpin_wait
- 01:25 PM CephFS Bug #65508 (Pending Backport): qa: lockup not long enough to for test_quiesce_authpin_wait
- https://pulpito.ceph.com/leonidus-2024-04-16_05:41:33-fs-wip-lusov-quiesce-xlock-distro-default-smithi/7657916/
- 12:08 PM Ceph Bug #65507 (New): diskprediction_local failed with python3.10
- 1. failed messages:...
- 11:56 AM Dashboard Backport #65504 (In Progress): reef: mgr/dashboard: Move features to advanced section in create image form and expand by default rbd config section
- 10:44 AM Dashboard Backport #65504 (Resolved): reef: mgr/dashboard: Move features to advanced section in create image form and expand by default rbd config section
- https://github.com/ceph/ceph/pull/56921
- 11:55 AM Dashboard Backport #65505 (In Progress): squid: mgr/dashboard: Move features to advanced section in create image form and expand by default rbd config section
- 10:44 AM Dashboard Backport #65505 (Resolved): squid: mgr/dashboard: Move features to advanced section in create image form and expand by default rbd config section
- https://github.com/ceph/ceph/pull/56920
- 11:47 AM RADOS Bug #65449 (In Progress): NeoRadosWatchNotify.WatchNotifyTimeout failed due to nonexistent pool
- 11:05 AM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
- Venky Shankar wrote in #note-33:
> OK. So this bug has upgrades written all over it - it seemed obvious given that t... - 10:56 AM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
- OK. So this bug has upgrades written all over it - it seemed obvious given that this is an upgrade task but we were t...
- 10:39 AM CephFS Feature #65503 (New): mgr/stats, cephfs-top: provide per volume/sub-volume based performance metrics to monitor / troubleshoot performance issues
- Reported by BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2275081
Currently the cephfs-top utility only displays... - 10:37 AM Dashboard Cleanup #65207 (Pending Backport): mgr/dashboard: Move features to advanced section in create image form and expand by default rbd config section
- 10:37 AM Dashboard Backport #65502 (New): squid: mgr/dashboard: provide hub Cluster HA for multi-cluster setup
- 10:32 AM Dashboard Bug #65499 (Pending Backport): mgr/dashboard: provide hub Cluster HA for multi-cluster setup
- 06:23 AM Dashboard Bug #65499 (Pending Backport): mgr/dashboard: provide hub Cluster HA for multi-cluster setup
- When adding a cluster to the multi-cluster setup, set all the mgr IP's as cross_origin_url in the connected cluster t...
- 10:08 AM Dashboard Backport #65501 (In Progress): squid: mgr/dashboard: snap schedule remove minutely from retention policy dropdown
- 09:51 AM Dashboard Backport #65501 (In Progress): squid: mgr/dashboard: snap schedule remove minutely from retention policy dropdown
- https://github.com/ceph/ceph/pull/56918
- 10:07 AM Dashboard Backport #65500 (In Progress): reef: mgr/dashboard: snap schedule remove minutely from retention policy dropdown
- 09:51 AM Dashboard Backport #65500 (In Progress): reef: mgr/dashboard: snap schedule remove minutely from retention policy dropdown
- https://github.com/ceph/ceph/pull/56917
- 09:52 AM RADOS Feature #64519: OSD/MON: No snapshot metadata keys trimming
- Thanks, Matan! It sounds very promising. I talked to the customer and they are willing to test this cleanup procedure...
- 09:45 AM Dashboard Bug #65493 (Pending Backport): mgr/dashboard: snap schedule remove minutely from retention policy dropdown
- 05:58 AM Messengers Bug #65401: msg: conneciton between mgr and osd is periodically down which leads heavy load to mgr
- Could anyone give a review on this? Thanks very much!
- 05:51 AM Dashboard Backport #65498 (New): squid: mgr/dashboard: fetch prometheus api host with ip addr
- 05:48 AM Dashboard Bug #65302 (Pending Backport): mgr/dashboard: fetch prometheus api host with ip addr
- 04:45 AM CephFS Bug #65497 (Fix Under Review): qa: enhance labelled perf counters tests in test_admin.py
- 04:28 AM CephFS Backport #65347 (In Progress): squid: qa: failed cephfs-shell test_reading_conf
- 02:50 AM crimson Bug #64680: transaction_manager_test/tm_random_block_device_test_t.scatter_allocation/0 status failed
- This is caused by prefilling rbm devices, which is used to create devices with scattered allocations and is only used in uni...
- 01:05 AM crimson Feature #65478: Support SnapMapper::Scrubber
- It will be completed in the next few days.
04/15/2024
- 10:55 PM RADOS Bug #64863 (Resolved): rados/thrash-old-clients: Health detail: HEALTH_WARN 1/3 mons down, quorum a,c in cluster log
- https://github.com/ceph/ceph/pull/56619
Radoslaw Zarzynski wrote in #note-3:
> Hmm, I think I saw Laura's PR for ... - 10:30 PM rgw Backport #65339 (Resolved): squid: rgw: update options yaml file so LDAP uri isn't an invalid example
- 10:30 PM rgw Backport #65412 (Resolved): squid: multisite: test_object_sync gets wrong object body: b'<x-rgw' != b'asdasd'
- 10:29 PM rgw Backport #64954 (Resolved): squid: Notification FilterRules for S3key, S3Metadata & S3Tags spit incorrect json output
- 10:24 PM teuthology Bug #64727: suites/dbench.sh: Socket exception: No route to host (113)
- /a/yuriw-2024-04-09_01:14:16-smoke-reef-release-distro-default-smithi/7647071
/a/yuriw-2024-04-09_01:14:16-smoke-ree... - 10:23 PM cephsqlite Bug #65494: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
- Ilya Dryomov wrote in #note-6:
> Nir Soffer wrote:
> > Restarting the ceph-mgr pod does not help, rbd-mirroring is ... - 09:36 PM cephsqlite Bug #65494: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
- Nir Soffer wrote:
> Restarting the ceph-mgr pod does not help, rbd-mirroring is broken and
> we don't have any work... - 09:03 PM cephsqlite Bug #65494 (In Progress): ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
- 09:03 PM cephsqlite Bug #65494: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
- ...
- 08:58 PM cephsqlite Bug #65494: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
- Tested with:
* image: quay.io/ceph/ceph:v18
* imageID: quay.io/ceph/ceph@sha256:8c1697a0a924bbd625c9f1b33893bbc47b9... - 07:53 PM cephsqlite Bug #65494: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
- Looks like a sqlite issue. Patrick, can you take a look please?
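A standalone sqlite3 sketch (not the devicehealth/libcephsqlite code) of the generic behaviour behind this error: re-running a plain CREATE TABLE against a database that already contains the table raises "table Device already exists", while a schema written with IF NOT EXISTS can be re-applied safely.
<pre>
# Standalone sqlite3 sketch (not the devicehealth module code) of the generic
# error: re-running a plain CREATE TABLE against an existing schema fails.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Device (devid TEXT PRIMARY KEY)")

try:
    conn.execute("CREATE TABLE Device (devid TEXT PRIMARY KEY)")
except sqlite3.OperationalError as err:
    print(err)  # "table Device already exists"

# A schema statement that is safe to re-apply uses IF NOT EXISTS instead.
conn.execute("CREATE TABLE IF NOT EXISTS Device (devid TEXT PRIMARY KEY)")
conn.close()
</pre>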
- 07:15 PM cephsqlite Bug #65494: ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
- Thread from rook slack:
https://rook-io.slack.com/archives/CK9CF5H2R/p1711467112958679 - 07:13 PM cephsqlite Bug #65494 (Pending Backport): ceph-mgr critical error: "Module 'devicehealth' has failed: table Device already exists"
- h1. Description
We have a random error (about 1 in 200 deploys) when, after creating a rook
cephcluster and cephbl... - 10:14 PM RADOS Bug #62776: rados: cluster [WRN] overall HEALTH_WARN - do not have an application enabled
- /a/yuriw-2024-04-09_01:16:20-rados-reef-release-distro-default-smithi/7647437
- 10:11 PM Dashboard Bug #64377: tasks/e2e: Modular dependency problems
- /a/yuriw-2024-04-09_01:16:20-rados-reef-release-distro-default-smithi/7647494
- 10:07 PM CephFS Bug #64946: qa: unable to locate package libcephfs1
- /a/yuriw-2024-04-09_01:16:20-rados-reef-release-distro-default-smithi/7647487
/a/yuriw-2024-04-09_01:16:20-rados-reef... - 10:01 PM RADOS Bug #58893: test_map_discontinuity: AssertionError: wait_for_clean: failed before timeout expired
- /a/yuriw-2024-04-09_01:16:20-rados-reef-release-distro-default-smithi/7647835
- 06:25 PM RADOS Bug #58893: test_map_discontinuity: AssertionError: wait_for_clean: failed before timeout expired
- Just a supplement to Nitzan's comment:
* this PG was @down@ and
* @ 'blocked_by': [2]@.
This brings the questi... - 09:59 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- Ilya Dryomov wrote in #note-4:
> > This is not ODF environment, this is upstream rook environment.
> >
> > You ca... - 09:21 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- Nir Soffer wrote in #note-3:
> It can be, but rbd mirror should fail (and restart) if pod networking is broken, no?
... - 08:54 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- Ilya Dryomov wrote in #note-1:
> Hi Nir,
>
> rbd-mirror daemon states that it was unable to connect to the remote... - 08:44 PM rbd Bug #65487: rbd-mirror daemon in ERROR state, require manual restart
- Tested with:
* image: quay.io/ceph/ceph:v18
* imageID: quay.io/ceph/ceph@sha256:06ddc3ef5b66f2dcc6d16e41842d33a3d... - 08:43 PM rbd Bug #65487 (Need More Info): rbd-mirror daemon in ERROR state, require manual restart
- Hi Nir,
rbd-mirror daemon states that it was unable to connect to the remote cluster. Could it be some kind of po... - 01:26 PM rbd Bug #65487 (Pending Backport): rbd-mirror daemon in ERROR state, require manual restart
- h1. Description
We experience a random error in rbd-mirror daemon, occurring 1-2 times per 100 deployments.
Whe... - 09:59 PM RADOS Bug #62992: Heartbeat crash in reset_timeout and clear_timeout
- /a/yuriw-2024-04-09_01:16:20-rados-reef-release-distro-default-smithi/7647721
/a/yuriw-2024-04-09_01:16:20-rados-reef... - 09:55 PM Orchestrator Bug #64208: test_cephadm.sh: Container version mismatch causes job to fail.
- /a/yuriw-2024-04-09_01:16:20-rados-reef-release-distro-default-smithi/7647904
/a/yuriw-2024-04-09_01:16:20-rados-reef... - 09:51 PM RADOS Bug #65183: Overriding an EC pool needs the "--yes-i-really-mean-it" flag in addition to "force"
- /a/yuriw-2024-04-09_01:16:20-rados-reef-release-distro-default-smithi/7647523
/a/yuriw-2024-04-09_01:16:20-rados-reef... - 08:32 PM RADOS Bug #65495: 1 slow request in rgw suite causes test failure
- I see that one of the OSDs on the other node has a similarly large log:...
- 08:19 PM RADOS Bug #65495 (New): 1 slow request in rgw suite causes test failure
- On an integration branch based on squid, an rgw suite job failed due to 'slow request' errors: https://qa-proxy.ceph.c...
- 06:32 PM RADOS Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
- Bump up. In QA.
- 06:31 PM RADOS Bug #65227: noscrub cluster flag prevents deep-scrubs from starting
- IIRC Ronen is already working on orchestrating scrub starts between deep and shallow scrubs.
- 06:28 PM RADOS Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
- Bump up.
- 06:27 PM RADOS Feature #64519: OSD/MON: No snapshot metadata keys trimming
- The PR is in QA.
- 06:16 PM RADOS Bug #65449: NeoRadosWatchNotify.WatchNotifyTimeout failed due to nonexistent pool
- Hi Nitzan! Would you mind taking a look?
- 06:11 PM RADOS Bug #59670 (Need More Info): Ceph status shows PG recovering when norecover flag is set
- The fix was merged on 5 Jan 2024, so this could fit. It has been backported only to Reef.
Wes Dillingham, do y... - 06:10 PM Orchestrator Backport #65383 (In Progress): reef: NLM should be enabled in NFS-Ganesha config file for locking functionality to work with v3 protocol
- 05:53 PM RADOS Bug #65422: upgrade/quincy-x/parallel: "1 pg degraded (PG_DEGRADED)" in cluster log
- Needs to be whitelisted; will bring this issue and others like it to the next RADOS meeting so we can divide up that ...
- 05:46 PM RADOS Bug #65422: upgrade/quincy-x/parallel: "1 pg degraded (PG_DEGRADED)" in cluster log
- Thanks Venky, it was a mistake that I added it there in the first place.
- 01:15 PM RADOS Bug #65422: upgrade/quincy-x/parallel: "1 pg degraded (PG_DEGRADED)" in cluster log
- Laura, handing this back to you since this isn't really cephfs related.
- 05:45 PM RADOS Bug #53472 (Need More Info): Active OSD processes do not see reduced memory target when adding more OSDs
- Pacific is EOL. Does it reproduce on newer releases?
- 05:45 PM RADOS Bug #53472: Active OSD processes do not see reduced memory target when adding more OSDs
- This tracker is 2 years old. I'm not sure how the situation was back then but, at least now, BlueStore is observing t...
- 05:35 PM RADOS Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- Waiting for upstream QA.
- 05:34 PM RADOS Bug #65371: rados: PeeringState::calc_replicated_acting_stretch populate acting set before checking if < bucket_max
- Bump up.
- 05:02 PM rbd Bug #65481 (Fix Under Review): [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
- 09:51 AM rbd Bug #65481 (Pending Backport): [test] krbd_msgr_segments and krbd_rxbounce fail on 8.stream
- https://qa-proxy.ceph.com/teuthology/yuriw-2024-04-09_15:14:48-krbd-reef-release-testing-default-smithi/7649268/teuth...
- 04:15 PM CephFS Documentation #57011: doc: 'profile cephfs-mirror' description is missing
- Not sure why this got moved out of CephFS; this is our documentation bug.
- 03:58 PM cleanup Tasks #65471 (Fix Under Review): rgw_sal_posix.cc printf compiler warnings
- 03:48 PM Dashboard Bug #65493 (Pending Backport): mgr/dashboard: snap schedule remove minutely from retention policy dropdown
- Remove minutely from retention policy dropdown
- 03:28 PM rgw Bug #65473 (Fix Under Review): rgw: exclude logging of request payer for 403 requests
- 03:27 PM Orchestrator Backport #65417 (In Progress): squid: cephadmin returns "1" on successful host-maintenance enter/exit - should return "0"
- 03:26 PM Orchestrator Backport #65415 (In Progress): squid: cephadm: test_cephadm script fails with "ERROR: required file missing from config-json: idmap.conf"
- 03:23 PM Orchestrator Backport #65382 (In Progress): squid: NLM should be enabled in NFS-Ganesha config file for locking functionality to work with v3 protocol
- 03:20 PM Orchestrator Backport #65381 (In Progress): squid: upgrade/quincy-x/stress-split: cephadm failed to parse grafana.ini file due to inadequate permission
- 03:14 PM Orchestrator Backport #65378 (In Progress): squid: cephadm: client-keyring also overwrites ceph.conf
- 02:59 PM crimson Bug #53661 (Closed): Creation of the cluster failed with the crimson build
- Please re-open if still relevant.
- 02:55 PM crimson Bug #53047 (Closed): cmake command not found in the standalone cluster to execute cmake -DWITH_SEASTAR=ON .. command
- Please re-open if still relevant.
- 02:54 PM crimson Bug #52623 (Closed): Cache tries to get an invalid root extent
- Please re-open if still relevant.
- 02:53 PM crimson Bug #51639 (Closed): crimson/store_nbd: crash after start
- Please re-open if still relevant.
- 02:53 PM pulpito Feature #65492 (New): support looking at runs by user
- like
https://pulpito.ceph.com/?user=teuthology - 02:52 PM crimson Bug #47597 (Closed): got crush when stop one osd and restart it during rados bench
- Please re-open if still relevant.
- 02:52 PM crimson Bug #47030 (Closed): segfault when evicting osdmap from cache
- Please re-open if still relevant.
- 02:50 PM crimson Bug #57547 (Closed): Hang with seastore at wait_for_active stage
- Please re-open if still relevant.
- 02:49 PM crimson Bug #57548 (Closed): Hang with alienstore
- Please re-open if still relevant.
- 02:49 PM crimson Subtask #45535 (Closed): crimson: crimson-osd failure in ceph-container
- Please re-open if still relevant.
- 02:40 PM rgw Bug #64571: lifecycle transition crashes since reloading bucket attrs for notification
- The cause of this issue seems to be multiple LC worker threads updating the same `bucket` handle, which is no...
- 02:31 PM CephFS Backport #65489 (In Progress): squid: mds: enhance scrub to fragment/merge dirfrags
- 01:28 PM CephFS Backport #65489 (In Progress): squid: mds: enhance scrub to fragment/merge dirfrags
- https://github.com/ceph/ceph/pull/56896
- 02:30 PM CephFS Backport #65488 (In Progress): reef: mds: enhance scrub to fragment/merge dirfrags
- 01:28 PM CephFS Backport #65488 (In Progress): reef: mds: enhance scrub to fragment/merge dirfrags
- https://github.com/ceph/ceph/pull/56895
- 02:28 PM CephFS Backport #65490 (In Progress): quincy: mds: enhance scrub to fragment/merge dirfrags
- 01:28 PM CephFS Backport #65490 (In Progress): quincy: mds: enhance scrub to fragment/merge dirfrags
- https://github.com/ceph/ceph/pull/56894
- 02:11 PM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- Venky Shankar wrote in #note-11:
> Dhairya Parmar wrote in #note-10:
> > I was confident of the code, I've mentione... - 01:27 PM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- This doesn't seem related to the test cases at all.
time when the MGR_DOWN warning was seen:... - 06:13 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- Dhairya Parmar wrote in #note-10:
> I was confident of the code, I've mentioned this in https://tracker.ceph.com/iss... - 02:09 PM CephFS Bug #65423 (Need More Info): Monitor crashes down when I try to create a FS. The stacks maybe related to metadata server map decoder during the PAXOS service
- fuchen ma wrote in #note-1:
> Another information:
> I found that the version of the non-crashed is 18.2.2, and the... - 02:08 PM CephFS Bug #65455 (Need More Info): read operation hung in Client::get_caps
- tod chen wrote in #note-1:
> the ceph version is 15.2.17 and 16.2.14
ceph 15.x is EOL'd and unsupported. Could yo... - 02:06 PM rgw Bug #65463: rgw/notifications: test data path v2 persistent migration fails
- * even though no crash is observed, it seems like a similar issue to: https://tracker.ceph.com/issues/65337. When runn...
- 01:49 PM crimson Bug #65491 (In Progress): recover_missing: racing read got wrong version
- ...
- 01:12 PM rgw Bug #65486 (Fix Under Review): valgrind error on kafka shutdown
- 01:11 PM rgw Bug #65486 (Fix Under Review): valgrind error on kafka shutdown
- see: https://tracker.ceph.com/issues/65337#note-4
may cause crash on close. - 01:06 PM CephFS Bug #62123: mds: detect out-of-order locking
- This may also cause *MDS Behind on Trimming...*: https://www.mail-archive.com/ceph-users@ceph.io/msg24587.html.
- 12:24 PM bluestore Backport #65485: squid: bluestore/bluestore_types: check 'it' valid before using
- please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/56891
ceph-backport.sh versi... - 12:10 PM bluestore Backport #65485 (In Progress): squid: bluestore/bluestore_types: check 'it' valid before using
- 12:18 PM CephFS Feature #61866: MDSMonitor: require --yes-i-really-mean-it when failing an MDS with MDS_HEALTH_TRIM or MDS_HEALTH_CACHE_OVERSIZED health warnings
- Patrick, should we include other health warnings too? I didn't include it in PR because it was mentioned on this tick...
- 12:17 PM bluestore Backport #65484: reef: bluestore/bluestore_types: check 'it' valid before using
- please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/56890
ceph-backport.sh versi... - 12:09 PM bluestore Backport #65484 (In Progress): reef: bluestore/bluestore_types: check 'it' valid before using
- 12:16 PM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
- Venky Shankar wrote in #note-26:
> Dhairya Parmar wrote in #note-25:
> > Venky Shankar wrote in #note-24:
> > > Dh... - 10:52 AM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
- Dhairya Parmar wrote in #note-25:
> Venky Shankar wrote in #note-24:
> > Dhariya,
> >
> > Anything blocking w.r.... - 10:10 AM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
- Venky Shankar wrote in #note-24:
> Dhariya,
>
> Anything blocking w.r.t. the design for this enhancement? The lag... - 10:00 AM CephFS Bug #64563: mds: enhance laggy clients detections due to laggy OSDs
- Dhariya,
Anything blocking w.r.t. the design for this enhancement? The laggy OSD list is obviously something that ... - 12:14 PM bluestore Backport #65483: quincy: bluestore/bluestore_types: check 'it' valid before using
- please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/56889
ceph-backport.sh versi... - 12:09 PM bluestore Backport #65483 (Resolved): quincy: bluestore/bluestore_types: check 'it' valid before using
- 12:10 PM CephFS Bug #65157 (Can't reproduce): cephfs-mirror: set layout.pool_name xattr of destination subvol correctly
- Can't reproduce this:...
- 12:03 PM CephFS Backport #65316 (In Progress): squid: mds: CInode::item_caps used in two different lists
- https://github.com/ceph/ceph/pull/56887
- 12:03 PM CephFS Backport #65315 (In Progress): reef: mds: CInode::item_caps used in two different lists
- https://github.com/ceph/ceph/pull/56886
- 11:50 AM bluestore Bug #65482 (Pending Backport): bluestore/bluestore_types: check 'it' valid before using
- 11:47 AM bluestore Bug #65482 (Fix Under Review): bluestore/bluestore_types: check 'it' valid before using
- When the sanitizer is enabled, unittest_bluestore_types fails as follows:
[ RUN ] sb_info_space_efficient_map_t.... - 10:27 AM crimson Feature #65478: Support SnapMapper::Scrubber
- junxiang mu wrote in #note-1:
> I can try implement this, can i tack this issue? :)
No problem!
I noticed that y... - 09:30 AM crimson Feature #65478: Support SnapMapper::Scrubber
- I can try to implement this; can I take this issue? :)
- 08:59 AM crimson Feature #65478 (New): Support SnapMapper::Scrubber
- We need to make crimson aware about SnapMapper::Scrubber and the purged snaps flow (track record_purged_snaps() in th...
- 10:23 AM CephFS Bug #61009: crash: void interval_set<T, C>::erase(T, T, std::function<bool(T, T)>) [with T = inodeno_t; C = std::map]: assert(p->first <= start)
- Explanation of the preallocated machinery which might help in the future:
I played around a bit more with prealloc... - 10:20 AM CephFS Bug #61009: crash: void interval_set<T, C>::erase(T, T, std::function<bool(T, T)>) [with T = inodeno_t; C = std::map]: assert(p->first <= start)
- Please see https://github.com/ceph/ceph/pull/53752#issuecomment-2056469527 for the status of the change.
This issu... - 09:09 AM ceph-volume Backport #65480 (In Progress): squid: prepare/create/activate refactor
- 09:06 AM ceph-volume Backport #65480 (In Progress): squid: prepare/create/activate refactor
- https://github.com/ceph/ceph/pull/56883
- 09:07 AM crimson Bug #57739 (Need More Info): crimson: LogMissingRequest and RepRequest operator<< access possibly invalid req
- 09:06 AM crimson Bug #57758 (Need More Info): crimson: disable autoscale for crimson in teuthology
- 09:05 AM crimson Bug #57801 (Resolved): crimson: tag pool types as crimson, disallow snapshot, scrub, ec operations
- 09:05 AM crimson Bug #64975 (Resolved): crimson: Health check failed: 9 scrub errors (OSD_SCRUB_ERRORS)" in cluster log'
- 09:03 AM Dashboard Bug #65479 (Fix Under Review): mgr/dashboard: use grafana server instead of grafana-server in grafana 10.4.0
- The grafana-server command is deprecated in Grafana v10.4.0. It is advised to use grafana server in its place.
- 09:02 AM crimson Bug #57990 (Closed): Crimson OSD crashes when trying to bring it up
- Yingxin Cheng wrote in #note-1:
> Crimson is not production ready yet, and there will be no backport to Quincy.
>
... - 09:01 AM crimson Bug #58391 (Need More Info): crimson-osd can't finish "mkfs" under RelWithDebInfo build type
- @rainman
Is this still relevant? - 08:59 AM ceph-volume Cleanup #61827 (Pending Backport): prepare/create/activate refactor
- 08:58 AM ceph-volume Cleanup #61827 (Fix Under Review): prepare/create/activate refactor
- 08:54 AM crimson Bug #61227 (Resolved): [crimson] ceph df stats are twice of actual values
- 08:52 AM ceph-volume Bug #65477 (Fix Under Review): `ceph-volume lvm prepare` does not create LVs anymore when using partitions
- 08:27 AM ceph-volume Bug #65477 (Fix Under Review): `ceph-volume lvm prepare` does not create LVs anymore when using partitions
- `ceph-volume lvm prepare` used to create VGs/LVs on partitions. This has changed with commit 1e7223281fa044c9653633e3...
- 08:50 AM crimson Bug #61875 (Resolved): crimson crashes during reboot when there are snap objects
- 08:50 AM crimson Bug #62526: during recovery crimson sends OI_ATTR with MAXed soid and kills classical OSDs
- @rzarzynski,
Is https://github.com/ceph/ceph/pull/53084 still relevant? - 08:48 AM crimson Bug #62550 (Resolved): osd crashes when doing peering
- 08:48 AM crimson Bug #63307 (Resolved): crimson: SnapTrimObjSubEvent doesn't actually seem to submit delta_stats
- 08:46 AM crimson Bug #64282 (Resolved): osd crashes due to unexpected pg creation
- 08:45 AM crimson Bug #64535 (Resolved): crimson osd crashes during crimson-rados-experimental teuthology tests
- 08:11 AM crimson Bug #64782 (Fix Under Review): test_python.sh TestIoctx.test_locator failes in cases of SeaStore
- 08:09 AM crimson Bug #65113 (Fix Under Review): crimson: SnapTrimObjSubEvent num_bytes stats calculation
- 08:07 AM crimson Bug #65130: crimson: crimson-rados did not detect reintroduction of https://tracker.ceph.com/issues/61875
- Added label: crimson-replicated-recovery to track all the required fixes
https://github.com/ceph/ceph/pulls?q=+is%... - 08:06 AM crimson Bug #65247 (Need More Info): ObjectContext::drop_recovery_read(): Assertion `recovery_read_marker' failed.
- 08:05 AM Dashboard Backport #65465 (In Progress): squid: mgr/dashboard: fixed snap schedule repeat frequency validation to prevent duplicates
- 08:05 AM crimson Feature #65288 (Fix Under Review): crimson: OSD support `trim stale osdmaps` socket command
- 08:04 AM Dashboard Backport #65464 (In Progress): reef: mgr/dashboard: fixed snap schedule repeat frequency validation to prevent duplicates
- 08:03 AM crimson Bug #65399 (Fix Under Review): osd crash due to deferred recovery
- 08:03 AM crimson Bug #65451 (Fix Under Review): tri_mutex::promote_from_read(): Assertion `readers == 1' failed.
- 08:02 AM crimson Bug #65453 (Fix Under Review): osd crashes due to outdated recovery ops
- 08:02 AM crimson Bug #65474 (Fix Under Review): mgr crash due to corrupted incremental osdmap sent by crimson-osds
- 08:01 AM crimson Feature #65476 (In Progress): Support Erasure coded pools
- 07:41 AM crimson Bug #64332 (In Progress): seastar submodule: Enable SEASTAR_GATE_HOLDER_DEBUG
- 05:40 AM Dashboard Backport #65168 (In Progress): quincy: mgr/dashboard: CVE-2023-26159, CVE-2024-28849 follow-redirects package
- 05:34 AM Dashboard Backport #65170 (In Progress): reef: mgr/dashboard: CVE-2023-26159, CVE-2024-28849 follow-redirects package
- 04:40 AM mgr Bug #59580: memory leak (RESTful module, maybe others?)
- waiting for https://github.com/ceph/ceph/pull/54984 to merge and be backported
04/14/2024
- 02:43 PM mgr Bug #59580: memory leak (RESTful module, maybe others?)
- Hi,
It seems that the ceph-mgr oom issue happened again on 16.2.15. We had ceph-mgr "oom" this morning.
I have ... - 01:25 PM sepia Bug #65475 (In Progress): folio03 install
- 11:39 AM sepia Bug #65475 (In Progress): folio03 install
- @akraitma we need a new install on folio03; currently it's RHEL 8.6 and we can't use newer GCC versions.
- 11:01 AM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
- h2. Analysis (WIP)
* the following test run is a sure way to create the ‘__header’ failure in ‘main’:
@./teutholo... - 01:39 AM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
- I created a test branch with some extra logging and managed to reproduce the issue with slightly more info....
- 07:17 AM crimson Bug #65474 (Resolved): mgr crash due to corrupted incremental osdmap sent by crimson-osds
- ...
04/13/2024
- 12:05 AM rgw Bug #65473: rgw: exclude logging of request payer for 403 requests
- PR: https://github.com/ceph/ceph/pull/56868
- 12:02 AM rgw Bug #65473 (Pending Backport): rgw: exclude logging of request payer for 403 requests
- As per AWS doc (https://docs.aws.amazon.com/AmazonS3/latest/userguide/RequesterPaysBuckets.html#ChargeDetails), reque...
04/12/2024
- 10:24 PM CephFS Bug #65472 (Pending Backport): mds: avoid recalling Fb when quiescing file
- To avoid extensive flushes by the client. (We don't need to trigger an fsync to quiesce a tree.)
See also: https:/... - 08:48 PM cleanup Tasks #65471 (Fix Under Review): rgw_sal_posix.cc printf compiler warnings
- ...
- 08:40 PM rgw Feature #65470 (New): Beast lacks ssl_short_trust option to reload ssl certificate without restart
- Previously civetweb rgw had an option (ssl_short_trust) to automatically reload certs, for instance when they are sho...
- 08:01 PM rgw Bug #65469 (Fix Under Review): rgw: increase log level on abort_early
- 07:59 PM rgw Bug #65469 (Pending Backport): rgw: increase log level on abort_early
- The function is typically invoked on client errors like NoSuchBucket. Logging these errors with level 1 may initially...
- 07:29 PM rgw Bug #65337: rgw: Segmentation fault in rgw::notify::Manager during realm reload
- the crash during the realm reload is due to the connection being destroyed while it's in use,
we call `kafka::shutdown` d... - 05:22 PM rgw Bug #65337: rgw: Segmentation fault in rgw::notify::Manager during realm reload
- @yuvalif the crash issue with kafka is all about the conn->destroyed being called while publish_internal() might be p...
- 05:00 PM rgw Bug #65337: rgw: Segmentation fault in rgw::notify::Manager during realm reload
- In our testing we are seeing the same crash; however, we do not see it during the realm reload or shutdown.
Its just ... - 06:56 PM rgw Bug #65468: rgw: set correct requestId and hostId on s3select error
- PR: https://github.com/ceph/ceph/pull/56864
- 06:51 PM rgw Bug #65468 (Fix Under Review): rgw: set correct requestId and hostId on s3select error
- Previously, these fields remained constant despite the possibility of populating them with appropriate values.
- 06:12 PM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
- Venky Shankar wrote in #note-31:
> [...]
>
> And patched up the yaml to use the custom quincy build to upgrade to... - 05:11 PM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
- ...
- 10:18 AM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
- I have a custom quincy branch (patched with debug in ceph-fuse/fuse_ll). That should give us enough debug to see what...
- 06:04 PM rgw Backport #65467 (In Progress): squid: rgw user accounts
- 05:58 PM rgw Backport #65467 (Resolved): squid: rgw user accounts
- https://github.com/ceph/ceph/pull/56863
- 05:56 PM rgw Feature #65466 (Resolved): rgw user accounts
- 05:18 PM rgw Bug #64381 (Resolved): iam role: CreateDate can go backwards
- 05:17 PM rgw Bug #64475 (Resolved): multisite: forwarded CreateRole request generates different CreateDate
- 03:35 PM rgw Bug #61772 (Closed): rgw/crypt/barbican: 'Namespace' object has no attribute 'admin_endpoints'
- 03:32 PM rgw-testing Bug #17776 (Closed): rgw: test aws4
- 03:29 PM teuthology Bug #59284: Missing `/home/ubuntu/cephtest/archive/coredump` file or directory
- from http://qa-proxy.ceph.com/teuthology/cbodley-2024-04-12_12:44:47-rgw-wip-rgw-account-v3-distro-default-smithi/765...
- 03:07 PM Ceph QA QA Run #65385 (QA Closed): wip-yuri4-testing-2024-04-08-1432
- 03:06 PM Ceph QA QA Run #65420 (QA Needs Approval): wip-yuri2-testing-2024-04-10-1311-squid
- @cbodley pls review all tests scheduled
- 03:05 PM Dashboard Backport #65465 (In Progress): squid: mgr/dashboard: fixed snap schedule repeat frequency validation to prevent duplicates
- https://github.com/ceph/ceph/pull/56881
- 03:05 PM Dashboard Backport #65464 (In Progress): reef: mgr/dashboard: fixed snap schedule repeat frequency validation to prevent duplicates
- https://github.com/ceph/ceph/pull/56880
- 03:01 PM Dashboard Bug #64980 (Pending Backport): mgr/dashboard: fixed snap schedule repeat frequency validation to prevent duplicates
- 02:59 PM Dashboard Backport #65459 (In Progress): reef: mgr/dashboard: fix snap schedule delete retention
- 11:52 AM Dashboard Backport #65459 (In Progress): reef: mgr/dashboard: fix snap schedule delete retention
- https://github.com/ceph/ceph/pull/56862
- 02:58 PM Dashboard Backport #65458 (In Progress): squid: mgr/dashboard: fix snap schedule delete retention
- 11:52 AM Dashboard Backport #65458 (In Progress): squid: mgr/dashboard: fix snap schedule delete retention
- https://github.com/ceph/ceph/pull/56861
- 02:57 PM rgw Bug #65463 (New): rgw/notifications: test data path v2 persistent migration fails
- from https://qa-proxy.ceph.com/teuthology/cbodley-2024-04-12_12:44:47-rgw-wip-rgw-account-v3-distro-default-smithi/76...
- 02:41 PM rgw Bug #65462: rgw: eliminate ssl enforcement for sse-s3 encryption
- PR: https://github.com/ceph/ceph/pull/56860
- 02:40 PM rgw Bug #65462 (Pending Backport): rgw: eliminate ssl enforcement for sse-s3 encryption
- Implement distinct SSL enforcement configurations for SSE-S3, SSE-C, and SSE-KMS encryption methods.
This can be hel... - 02:23 PM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- I was confident of the code, I've mentioned this in https://tracker.ceph.com/issues/65265#note-6. I then raised a PR ...
- 01:32 PM Ceph Bug #65228 (In Progress): class:device-class config database mask does not work for osd_compact_on_start
- 01:30 PM Ceph QA QA Run #65447 (QA Closed): wip-pdonnell-testing-20240411.210829-debug
- https://github.com/ceph/ceph/pull/56755#issuecomment-2051518639
- 01:28 PM cleanup Tasks #65460 (New): audit rgw_get_request_metadata(), stop storing unneccessary headers as xattrs
- @rgw_get_request_metadata()@ adds object/bucket xattrs for most of the headers in @x_meta_map@ (which stores any head...
- 12:14 PM CephFS Tasks #64819 (Resolved): data corruption during rmw after lseek
- The reproducers above were simplifications of failures/errors from running the ffsb test suite on a fscrypt enabled d...
- 12:11 PM CephFS Bug #62246: qa/cephfs: test_mount_mon_and_osd_caps_present_mds_caps_absent fails
- Venky Shankar wrote in #note-12:
> Rishabh, do we need this for squid too?
Answering this myself - the PR was mer... - 08:21 AM CephFS Bug #62246: qa/cephfs: test_mount_mon_and_osd_caps_present_mds_caps_absent fails
- Rishabh, do we need this for squid too?
- 11:44 AM Dashboard Bug #65370 (Pending Backport): mgr/dashboard: fix snap schedule delete retention
- 10:21 AM Dashboard Bug #65457: mgr/dashboard: ninja fails on `src/pybind/mgr/dashboard/frontend/dist`
- Sorry for the Chinese words.
核心已转储 should be core dump. - 09:57 AM Dashboard Bug #65457: mgr/dashboard: ninja fails on `src/pybind/mgr/dashboard/frontend/dist`
- The arch is ARM. I don't know whether nodejs needs adjustments for the ARM architecture.
- 09:41 AM Dashboard Bug #65457 (New): mgr/dashboard: ninja fails on `src/pybind/mgr/dashboard/frontend/dist`
- After I install deps, I execute `./do_cmake.sh`, cd into `build`, and run `ninja`.
It fails on the dashboard frontend.
There is... - 10:02 AM sepia Support #65238: Sepia Lab Access Request
- adam kraitman wrote in #note-2:
> Hey Jiffin Tony Thottan, Are these new/additional or replacement credentials?
M... - 09:24 AM Ceph QA QA Run #65456: wip-rishabh-testing-20240407.092921-reef
- Link to wiki - https://tracker.ceph.com/projects/cephfs/wiki/Reef#12-April-2024
Backport PR has been merged - https:... - 09:20 AM Ceph QA QA Run #65456 (QA Closed): wip-rishabh-testing-20240407.092921-reef
- 09:18 AM Ceph QA QA Run #65456 (QA Closed): wip-rishabh-testing-20240407.092921-reef
- https://github.com/ceph/ceph/pull/54942
Not adding more PRs to the testing branch since this one already has too m... - 09:17 AM Ceph QA QA Run #65329 (QA Closed): wip-rishabh-testing-20240404.111254-quincy
- Backport PR has been merged - https://github.com/ceph/ceph/pull/54946#event-12446845381
Link to wiki - https://track... - 09:16 AM Ceph QA QA Run #65329: wip-rishabh-testing-20240404.111254-quincy
- https://pulpito.ceph.com/rishabh-2024-04-11_13:44:02-fs-wip-rishabh-testing-20240404.111254-quincy-testing-default-sm...
- 09:12 AM CephFS Backport #63834 (Resolved): reef: mon/FSCommands: support swapping file systems by name
- 09:11 AM CephFS Backport #63407 (Resolved): quincy: cephfs: print better error message when MDS caps perms are not right
- 08:48 AM rgw Backport #65003 (Resolved): reef: [CVE-2023-46159] RGW crash upon misconfigured CORS rule
- 08:14 AM CephFS Bug #62188: AttributeError: 'RemoteProcess' object has no attribute 'read'
- Rishabh Dave wrote in #note-9:
> All the recent failures are from QA runs for Reef, this is because the fix for this... - 06:56 AM CephFS Bug #65455: read operation hung in Client::get_caps
- the ceph versions are 15.2.17 and 16.2.14
- 06:55 AM CephFS Bug #65455 (Rejected): read operation hung in Client::get_caps
- How to reproduce the issue:
1. I used two nfs ganesha+libcephfs as the nfs server (server1, server2), and used the sa... - 06:40 AM RADOS Bug #59831: crash: void ECBackend::continue_recovery_op(ECBackend::RecoveryOp&, RecoveryMessages*): assert(pop.data.length() == sinfo.aligned_logical_offset_to_chunk_offset( after_progress.data_recovered_to - op.recovery_progress.data_recovered_to))
- I had the same problem with version 14.2.21, is there any progress...
- 06:02 AM Ceph QA QA Run #65454 (QA Needs Approval): wip-vshankar-testing-20240411.061452
- - https://github.com/ceph/ceph/pull/56193
- https://github.com/ceph/ceph/pull/56135
- https://github.com/ceph/ceph/... - 05:59 AM crimson Bug #65453 (Fix Under Review): osd crashes due to outdated recovery ops
- PGs' recovery backends don't discard old recovery ops...
- 05:53 AM CephFS Bug #65246 (Fix Under Review): qa/cephfs: test_multifs_single_path_rootsquash (tasks.cephfs.test_admin.TestFsAuthorize)
- 05:19 AM Ceph QA QA Run #65324 (QA Closed): wip-vshankar-testing-20240330.170739
- All merged.
- 05:15 AM Ceph QA QA Run #65324: wip-vshankar-testing-20240330.170739
- Moving https://github.com/ceph/ceph/pull/55945 to another test branch since we hit a client crash due to another PR t...
- 04:59 AM Ceph QA QA Run #65324: wip-vshankar-testing-20240330.170739
- Dropping https://github.com/ceph/ceph/pull/55144 due to https://github.com/ceph/ceph/pull/55144#discussion_r1562013425
- 04:53 AM Ceph QA QA Run #65324: wip-vshankar-testing-20240330.170739
- Dropping https://github.com/ceph/ceph/pull/56148 from the list of PRs as the change is buggy - Jos has updated it and...
- 05:18 AM CephFS Feature #57481 (Pending Backport): mds: enhance scrub to fragment/merge dirfrags
- 05:09 AM RADOS Bug #59670: Ceph status shows PG recovering when norecover flag is set
- We saw this issue again in another setup and it has been fixed here: https://github.com/ceph/ceph/pull/54708.
The p... - 03:04 AM Ceph Bug #65452: peer pg_info_t's last_complete in primary pg cannot be updated
- !clipboard-202404121104-m5hdh.png!
- 03:03 AM Ceph Bug #65452: peer pg_info_t's last_complete in primary pg cannot be updated
- !clipboard-202404121103-bsweb.png!
- 03:02 AM Ceph Bug #65452: peer pg_info_t's last_complete in primary pg cannot be updated
- !clipboard-202404121102-9u6kt.png!
- 03:01 AM Ceph Bug #65452: peer pg_info_t's last_complete in primary pg cannot be updated
- Case: the primary osd executes do_osd_ops, the write fails, and it needs to execute record_write_error. In the record_write_error functio...
- 02:47 AM Ceph Bug #65452 (New): peer pg_info_t's last_complete in primary pg cannot be updated
- !clipboard-202404121047-ovd7r.png!
- 02:07 AM crimson Bug #65451: tri_mutex::promote_from_read(): Assertion `readers == 1' failed.
- Probably can be addressed by https://github.com/ceph/ceph/commit/3a6332fd6676da590b9ede46954b2a6a74308bd7, will split...
- 01:58 AM crimson Bug #65451 (Resolved): tri_mutex::promote_from_read(): Assertion `readers == 1' failed.
- See the assert failure in osd.1 from https://pulpito.ceph.com/yingxin-2024-04-11_01:17:19-crimson-rados-ci-yingxin-cr...
04/11/2024
- 11:11 PM Ceph QA QA Run #65385 (QA Approved): wip-yuri4-testing-2024-04-08-1432
- @yuriw rados approved: https://tracker.ceph.com/projects/rados/wiki/SQUID#httpstrackercephcomissues65385
- 11:03 PM RADOS Bug #65450: rados/thrash-old-clients: "PG_BACKFILL: Low space hindering backfill" warning in cluster log
- Should be evaluated to see whether this should be added to the ignorelist, or if it points to a larger bug.
- 11:03 PM RADOS Bug #65450 (New): rados/thrash-old-clients: "PG_BACKFILL: Low space hindering backfill" warning in cluster log
- /a/yuriw-2024-04-09_14:58:25-rados-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7649192...
- 10:59 PM RADOS Bug #65449 (Fix Under Review): NeoRadosWatchNotify.WatchNotifyTimeout failed due to nonexistent pool
- /a/yuriw-2024-04-09_14:58:25-rados-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7649011...
- 10:52 PM Infrastructure Bug #65448: Teuthology unable to find the "ceph-radosgw" package
- Please refile this as an "Infrastructure" bug if not explicitly related to RGW.
- 10:52 PM Infrastructure Bug #65448 (New): Teuthology unable to find the "ceph-radosgw" package
- /a/yuriw-2024-04-09_14:58:25-rados-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648970...
- 10:49 PM Dashboard Bug #64377: tasks/e2e: Modular dependency problems
- /a/yuriw-2024-04-09_14:58:25-rados-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7649198
- 10:44 PM Orchestrator Bug #52109: test_cephadm.sh: Timeout('Port 8443 not free on 127.0.0.1.',)
- /a/yuriw-2024-04-09_14:58:25-rados-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648958
- 10:42 PM ceph-volume Bug #56620: Deploy a ceph cluster with cephadm; osd created using the ceph-volume lvm create command cannot be managed by cephadm
- Looks like a case of this:
/a/yuriw-2024-04-09_14:58:25-rados-wip-yuri4-testing-2024-04-08-1432-distro-default-smith... - 09:49 PM Orchestrator Bug #65233: upgrade/cephfs/mds_upgrade_sequence: 'ceph orch ps' command times out
- /a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648940
- 09:44 PM rgw Backport #65427 (In Progress): squid: Admin Ops socket crashes RGW
- 02:37 PM rgw Backport #65427 (Resolved): squid: Admin Ops socket crashes RGW
- https://github.com/ceph/ceph/pull/56840
- 09:08 PM Ceph QA QA Run #65447 (QA Closed): wip-pdonnell-testing-20240411.210829-debug
- * "PR #56755":https://github.com/ceph/ceph/pull/56755 -- mds/quiesce: xlock the file to let clients keep their buffer...
- 07:23 PM Ceph QA QA Run #65446 (QA Closed): wip-pdonnell-testing-20240411.192137-squid-debug
- 07:22 PM Ceph QA QA Run #65446: wip-pdonnell-testing-20240411.192137-squid-debug
- ...
- 07:21 PM Ceph QA QA Run #65446 (QA Closed): wip-pdonnell-testing-20240411.192137-squid-debug
- * "PR #56671":https://github.com/ceph/ceph/pull/56671 -- squid: mds: skip sr moves when target is an unlinked dir
- 06:50 PM CephFS Bug #62188 (Duplicate): AttributeError: 'RemoteProcess' object has no attribute 'read'
- All the recent failures are from QA runs for Reef, this is because the fix for this issue (https://tracker.ceph.com/i...
- 06:24 PM CephFS Backport #65441 (New): quincy: qa/cephfs: test_mount_mon_and_osd_caps_present_mds_caps_absent fails
- 06:24 PM CephFS Backport #65440 (New): reef: qa/cephfs: test_mount_mon_and_osd_caps_present_mds_caps_absent fails
- 06:22 PM CephFS Bug #62246 (Pending Backport): qa/cephfs: test_mount_mon_and_osd_caps_present_mds_caps_absent fails
- 05:33 PM CephFS Bug #62246: qa/cephfs: test_mount_mon_and_osd_caps_present_mds_caps_absent fails
- *The PR linked here fixes multiple issues. This specific commit
from the PR fixes the issue -
https://github.com/ceph... - 05:32 PM CephFS Bug #62246 (Resolved): qa/cephfs: test_mount_mon_and_osd_caps_present_mds_caps_absent fails
- 06:07 PM Ceph QA QA Run #65349: wip-yuri3-testing-2024-04-05-0825
- sure thing Laura
- 06:01 PM sepia Support #64967: Sepia Lab Access Request
- adam kraitman wrote in #note-6:
> Hey If you re-run the new-client script, It's unfortunately not idempotent so if y... - 01:36 PM sepia Support #64967: Sepia Lab Access Request
- Hey, if you re-run the new-client script: it's unfortunately not idempotent, so if you re-ran it and still have the out...
- 05:14 PM rgw Bug #65436 (Need More Info): Getting Object Crashing radosgw services
- Hello,
We are seeing crashes when users are trying to get a specific file.... - 05:04 PM Ceph QA QA Run #65270 (QA Needs Approval): wip-yuri6-testing-2024-04-02-1310
- 05:04 PM Ceph QA QA Run #65270: wip-yuri6-testing-2024-04-02-1310
- Laura Flores wrote in #note-23:
> Once this is resolved, I need the results rerun.
Attempting a rerun again - 04:04 PM Ceph QA QA Run #65270 (QA Needs Rerun/Rebuilt): wip-yuri6-testing-2024-04-02-1310
- @yuriw this will need to be rerun. I see a lot of failures from "Failed to establish a new connection" that I suspect...
- 05:03 PM Ceph QA QA Run #65435: wip-pdonnell-testing-20240411.165310
- Example tracker for https://github.com/ceph/ceph/pull/56835
- 05:02 PM Ceph QA QA Run #65435 (QA Closed): wip-pdonnell-testing-20240411.165310
- 04:53 PM Ceph QA QA Run #65435 (QA Closed): wip-pdonnell-testing-20240411.165310
- * "PR #56755":https://github.com/ceph/ceph/pull/56755 -- mds/quiesce: xlock the file to let clients keep their buffer...
- 04:07 PM Ceph QA QA Run #65045: wip-yuri5-testing-2024-03-21-0833
- @amathuri can you review this run? It looks like a lot of failures but many seem to be expected warnings. LMK if you ...
- 03:59 PM Ceph QA QA Run #65045: wip-yuri5-testing-2024-03-21-0833
- @yuriw I'll take a look. I see now that it's ready for QA approval
- 03:49 PM Ceph QA QA Run #65045 (QA Needs Approval): wip-yuri5-testing-2024-03-21-0833
- 03:48 PM Ceph QA QA Run #65045: wip-yuri5-testing-2024-03-21-0833
- @lflores what about this batch?
- 02:48 PM rgw Feature #63915 (New): propagate kafka errors to client in case of sync notifications
- 02:37 PM rgw Backport #65426 (New): quincy: Admin Ops socket crashes RGW
- 02:36 PM rgw Backport #65425 (New): reef: Admin Ops socket crashes RGW
- 02:30 PM rgw Bug #64244 (Pending Backport): Admin Ops socket crashes RGW
- 02:20 PM rgw Cleanup #63962 (New): rgw-file: FLAG_SYMBOLIC_LINK decl aliases other flags
- 02:17 PM rgw Bug #64805 (Fix Under Review): rgw: dynamic resharding will block write op
- 02:11 PM rgw Bug #61710 (Won't Fix): quincy/pacific: PUT requests during reshard of versioned bucket fail with 404 and leave behind dark data
- 02:05 PM rgw Bug #63378 (New): rgw/multisite: Segmentation fault during full sync
- 01:55 PM CephFS Bug #65261: qa/cephfs: cephadm related failure on fs/upgrade job
- https://pulpito.ceph.com/rishabh-2024-04-08_08:23:45-fs-wip-rishabh-testing-20240407.092921-reef-testing-default-smit...
- 01:39 PM sepia Support #65359 (In Progress): Sepia Lab Access Request
- 01:39 PM sepia Support #65359: Sepia Lab Access Request
- Hey Amarnath Reddy, are these new/additional or replacement credentials?
- 01:37 PM Infrastructure Bug #65229 (In Progress): Failed to reconnect to smithiXXX
- Hey @lflores please ping me if you see this failure again
- 01:13 PM CephFS Backport #62425 (Fix Under Review): reef: nofail option in fstab not supported
- 01:12 PM CephFS Backport #62426 (Fix Under Review): quincy: nofail option in fstab not supported
- 01:12 PM CephFS Backport #63362 (Fix Under Review): quincy: mds: create an admin socket command for raising a signal
- 01:12 PM CephFS Backport #63363 (Fix Under Review): reef: mds: create an admin socket command for raising a signal
- 01:11 PM CephFS Backport #63479 (Fix Under Review): reef: src/mds/MDLog.h: 100: FAILED ceph_assert(!segments.empty())
- 01:11 PM CephFS Backport #63480 (Fix Under Review): quincy: src/mds/MDLog.h: 100: FAILED ceph_assert(!segments.empty())
- 01:11 PM CephFS Backport #63822 (Fix Under Review): reef: cephfs/fuse: renameat2 with flags has wrong semantics
- 01:10 PM CephFS Tasks #63669 (Fix Under Review): qa: add teuthology tests for quiescing a group of subvolumes
- 12:19 PM CephFS Bug #64977 (Fix Under Review): mds spinlock due to lock contention leading to memory exhaustion
- 11:25 AM rbd Bug #65421 (Duplicate): upgrade/reef-x/stress-split: TestMigration.StressLive failure
- This isn't specific to upgrade/reef-x/stress-split -- no need to track separately.
- 09:54 AM Ceph QA QA Run #65099: wip-yuri10-testing-2024-03-24-1159
- In the new run (https://pulpito.ceph.com/yuriw-2024-04-10_14:20:47-rados-wip-yuri10-testing-2024-03-24-1159-distro-de...
- 09:07 AM CephFS Bug #65423: Monitor crashes when I try to create an FS. The stacks may be related to the metadata server map decoder during the PAXOS service
- fuchen ma wrote in #note-1:
> Another information:
> I found that the version of the non-crashed is 18.2.2, and the... - 09:06 AM CephFS Bug #65423: Monitor crashes when I try to create an FS. The stacks may be related to the metadata server map decoder during the PAXOS service
- Another information:
I found that the version of the non-crashed is 18.2.2, and the version of the crashed ones are ... - 08:35 AM CephFS Bug #65423 (Rejected): Monitor crashes down when I try to create a FS. The stacks maybe related to metadata server map decoder during the PAXOS service
- I have created a ceph cluster with 5 monitors and 2 metadata servers.
After that, I want to create a fs. Thus, I use... - 09:03 AM Orchestrator Documentation #65424 (New): hardware-monitoring/#developers is broken
- https://docs.ceph.com/en/latest/hardware-monitoring/#developpers
It just contains a bunch of python-mock doc stri... - 06:11 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- Dhairya mentioned that the tracebacks seen in the mgr logs are logged by the object formatter and not necessarily unhand...
- 04:30 AM Orchestrator Backport #65414 (In Progress): squid: cephadm: Health check failed: 1 osds down (OSD_DOWN) in cluster log
04/10/2024
- 10:15 PM RADOS Bug #65422 (New): upgrade/quincy-x/parallel: "1 pg degraded (PG_DEGRADED)" in cluster log
- /a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648908...
- 10:01 PM rbd Bug #65421 (Duplicate): upgrade/reef-x/stress-split: TestMigration.StressLive failure
- ...
- 09:11 PM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
- /a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648890
/a/yuriw-2024-0... - 07:26 PM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
- /a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648938/remote/smithi122...
- 05:13 PM RADOS Bug #65185: OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
- Laura Flores wrote:
> /a/teuthology-2024-03-22_02:08:13-upgrade-squid-distro-default-smithi/7616025/remote/smithi098... - 08:59 PM CephFS Bug #64707: suites/fsstress.sh hangs on one client - test times out
- /a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648870
- 08:12 PM Ceph QA QA Run #65420 (QA Closed): wip-yuri2-testing-2024-04-10-1311-squid
--- done. these PRs were included:
https://github.com/ceph/ceph/pull/56069 - squid: rgw: replicate v2 topic/notifi...- 08:07 PM Ceph QA QA Run #65330: wip-yuri7-testing-2024-04-04-0800
- @vshankar ping!
- 07:17 PM Ceph QA QA Run #65330: wip-yuri7-testing-2024-04-04-0800
- @pdonnell fyi again
- 06:58 PM Ceph QA QA Run #65330: wip-yuri7-testing-2024-04-04-0800
- @pdonnell fyi
- 07:46 PM CephFS Fix #65408 (Fix Under Review): qa: under valgrind, restart valgrind/mds when MDS exits with 0
- So, the mds_valgrind_exit already exists and is turned on. The original problem in #65314 wasn't caused by a failover...
- 05:18 PM CephFS Fix #65408: qa: under valgrind, restart valgrind/mds when MDS exits with 0
- @vshankar test
- 05:18 PM CephFS Fix #65408: qa: under valgrind, restart valgrind/mds when MDS exits with 0
- @pdonnell test
- 01:33 PM CephFS Fix #65408: qa: under valgrind, restart valgrind/mds when MDS exits with 0
- (Trying to see if redmine adds Venky to the "Watchers" list)
- 01:32 PM CephFS Fix #65408: qa: under valgrind, restart valgrind/mds when MDS exits with 0
- test @vshankar
- 01:31 PM CephFS Fix #65408: qa: under valgrind, restart valgrind/mds when MDS exits with 0
- test @vshankar
- 01:27 PM CephFS Fix #65408: qa: under valgrind, restart valgrind/mds when MDS exits with 0
- cc @vshankar
- 01:27 PM CephFS Fix #65408 (Fix Under Review): qa: under valgrind, restart valgrind/mds when MDS exits with 0
- Instead of issuing a re-...
- 07:31 PM Orchestrator Feature #65398: allow images from private repos in teuthology test/ceph orch/cephadm
- it's possible that the intent was to preface the call to pull_image with something that logs into the repo on the rem...
- 05:29 PM Orchestrator Feature #65398: allow images from private repos in teuthology test/ceph orch/cephadm
- Thanks, I see that the pull_image function doesn't honor those settings currently. I have some other somewhat related...
- 04:56 PM Orchestrator Feature #65398: allow images from private repos in teuthology test/ceph orch/cephadm
- the command that was failing was cephadm.py:pull_image, which invokes sudo cephadm --image <name> pull. I'm not 100%...
- 02:11 PM Orchestrator Feature #65398: allow images from private repos in teuthology test/ceph orch/cephadm
- In theory it should work. The code in the task translates the yaml parameters to cli parameters for bootstrap. Here's...
- 02:02 AM Orchestrator Feature #65398 (New): allow images from private repos in teuthology test/ceph orch/cephadm
- It appears as though the cephadm teuthology task supports private registries (those that require username/password lo...
- 07:06 PM rgw Backport #65351 (Fix Under Review): squid: rgw: crash in lc while transitioning to cloud
- 06:31 PM CephFS Tasks #64819: data corruption during rmw after lseek
- The RC for this issue is fixed by:...
- 05:39 PM Orchestrator Backport #65419 (New): quincy: cephadmin returns "1" on successful host-maintenance enter/exit - should return "0"
- 05:39 PM Orchestrator Backport #65418 (New): reef: cephadmin returns "1" on successful host-maintenance enter/exit - should return "0"
- 05:39 PM Orchestrator Backport #65417 (Resolved): squid: cephadmin returns "1" on successful host-maintenance enter/exit - should return "0"
- https://github.com/ceph/ceph/pull/56903
- 05:37 PM Orchestrator Bug #65122 (Pending Backport): cephadmin returns "1" on successful host-maintenance enter/exit - should return "0"
- 05:37 PM RADOS Bug #64460: rados/upgrade/parallel: "[WRN] MON_DOWN: 1/3 mons down, quorum a,b" in cluster log
- /a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648863
- 05:33 PM Orchestrator Bug #64868: cephadm/osds, cephadm/workunits: Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED) in cluster log
- Also during stress/split: yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7...
- 05:31 PM CephFS Bug #64502: pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
- /a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648857
- 05:31 PM Orchestrator Backport #65416 (New): reef: cephadm: test_cephadm script fails with "ERROR: required file missing from config-json: idmap.conf"
- 05:31 PM Orchestrator Backport #65415 (Resolved): squid: cephadm: test_cephadm script fails with "ERROR: required file missing from config-json: idmap.conf"
- https://github.com/ceph/ceph/pull/56902
- 05:30 PM Orchestrator Bug #65155 (Pending Backport): cephadm: test_cephadm script fails with "ERROR: required file missing from config-json: idmap.conf"
- 05:29 PM RADOS Bug #65235: upgrade/reef-x/stress-split: "OSDMAP_FLAGS: noscrub flag(s) set" warning in cluster log
- There are many instances of this flag getting set in the test run intentionally, so it makes sense to whitelist.
<pr... - 05:27 PM nvme-of Feature #65259 (Resolved): cephadm - make changes to ceph-nvmeof.conf template
- 05:27 PM nvme-of Backport #65296 (Rejected): squid: cephadm - make changes to ceph-nvmeof.conf template
- Handling this backport as part of https://github.com/ceph/ceph/pull/56497 that includes other changes to the nvmeof c...
- 05:26 PM Orchestrator Bug #65234: upgrade/quincy-x/stress-split: cephadm failed to parse grafana.ini file due to inadequate permission
- /a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648854
- 05:23 PM Orchestrator Backport #65414 (Resolved): squid: cephadm: Health check failed: 1 osds down (OSD_DOWN) in cluster log
- https://github.com/ceph/ceph/pull/56826
- 05:17 PM Orchestrator Bug #64865 (Pending Backport): cephadm: Health check failed: 1 osds down (OSD_DOWN) in cluster log
- 05:10 PM CephFS Bug #50719: xattr returning from the dead (sic!)
- Those MDS logs would be everything; they are from the moment I built the MDS services until you requested the logs wh...
- 04:52 PM sepia Bug #65413 (New): uid mismatch on some machines
- ...
- 04:41 PM rgw Backport #65412 (In Progress): squid: multisite: test_object_sync gets wrong object body: b'<x-rgw' != b'asdasd'
- 04:40 PM rgw Backport #65412 (Resolved): squid: multisite: test_object_sync gets wrong object body: b'<x-rgw' != b'asdasd'
- https://github.com/ceph/ceph/pull/56822
- 04:36 PM rgw Bug #65373 (Pending Backport): multisite: test_object_sync gets wrong object body: b'<x-rgw' != b'asdasd'
- 02:51 PM rgw Bug #63791: RGW: a subuser with no permission can still list buckets and create buckets
- Can this commit be backported to quincy and reef?
- 02:25 PM rgw Bug #63791 (Resolved): RGW: a subuser with no permission can still list buckets and create buckets
- 02:40 PM Ceph QA QA Run #65252 (QA Closed): wip-yuri2-testing-2024-04-01-1235-quincy
- 05:46 AM Ceph QA QA Run #65252 (QA Approved): wip-yuri2-testing-2024-04-01-1235-quincy
- 05:44 AM Ceph QA QA Run #65252: wip-yuri2-testing-2024-04-01-1235-quincy
- @yuriw rados approved
- 02:40 PM mgr Backport #65154: quincy: pybind/mgr/devicehealth: "rados.ObjectNotFound: [errno 2] RADOS object not found (Failed to operate read op for oid $dev"
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/56480
merged - 02:38 PM CephFS Backport #63823 (Fix Under Review): quincy: cephfs/fuse: renameat2 with flags has wrong semantics
- 02:36 PM bluestore Backport #63914 (Resolved): quincy: Some of ObjectStore/*Deferred* test cases are failing with bluestore_allocator is set to bitmap
- 02:34 PM Infrastructure Bug #64481: Octo Lab VMWare TestBed Setup Requirement for VMWare Certification Tests
- @kramaswamy test comment
- 02:32 PM Ceph Feature #63801: verified mon backups
- Christian Rohmann wrote in #note-2:
> My thoughts would be:
> * Full restore might not always be wanted, so extra... - 02:32 PM Dashboard Backport #65026: quincy: mgr/dashboard: Develop a Chinese version for dashboard
- Rongqi Sun wrote in #note-2:
> please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/p... - 02:31 PM bluestore Bug #63795: Some of ObjectStore/*Deferred* test cases are failing with bluestore_allocator is set to bitmap
- https://github.com/ceph/ceph/pull/55779 merged
- 02:24 PM Ceph Feature #64436 (Fix Under Review): rgw: add remaining x-amz-replication-status options
- 02:21 PM Ceph QA QA Run #65099 (QA Needs Approval): wip-yuri10-testing-2024-03-24-1159
- 02:19 PM Ceph QA QA Run #65270 (QA Needs Approval): wip-yuri6-testing-2024-04-02-1310
- 01:38 PM rgw Backport #65411 (In Progress): squid: Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
- 01:36 PM rgw Backport #65411 (Resolved): squid: Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
- https://github.com/ceph/ceph/pull/56820
- 01:38 PM rgw Backport #65410 (In Progress): reef: Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
- 01:36 PM rgw Backport #65410 (In Progress): reef: Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
- https://github.com/ceph/ceph/pull/56819
- 01:38 PM rgw Backport #65409 (In Progress): quincy: Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
- 01:36 PM rgw Backport #65409 (In Progress): quincy: Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
- https://github.com/ceph/ceph/pull/56818
- 01:32 PM rgw Bug #65334 (Pending Backport): Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
- 01:24 PM CephFS Bug #65262 (Triaged): qa/cephfs: kernel_untar_build.sh failed due to build error
- 01:17 PM rgw Backport #65402 (In Progress): squid: persistent topic stats test fails
- backport included in https://github.com/ceph/ceph/pull/56069 for https://tracker.ceph.com/issues/64818
- 10:39 AM rgw Backport #65402 (Resolved): squid: persistent topic stats test fails
- 12:56 PM CephFS Bug #65350 (Triaged): mgr/snap_schedule: restore yearly spec from uppercase Y to lowercase y
- 12:29 PM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- @NotADirectoryError@ is probably not a valid (built-in) exception in some python versions. My question is, if this ex...
- 08:40 AM CephFS Bug #65265: qa: health warning "no active mgr (MGR_DOWN)" occurs before and after test_nfs runs
- Venky Shankar wrote in #note-3:
> Thanks for taking a look, Laura.
>
> Dhariya, please take this one. AFAICT, thi... - 12:23 PM Orchestrator Bug #65407: sequence item 0: expected str instance, dict found
- /var/log/user.log:Apr 10 15:03:33 d3p1u01-rc9h7j020-01 ceph-mgr[4176565]: [cephadm ERROR cephadm.serve] Failed to app...
- 12:20 PM Orchestrator Bug #65407 (New): sequence item 0: expected str instance, dict found
- ceph version 17.2.4 (1353ed37dec8d74973edc3d5d5908c20ad5a7332) quincy (stable)
ceph orch apply -i osd_ssd.yaml
<pre... - 12:01 PM CephFS Backport #65406 (In Progress): quincy: mds: Reduce log level for messages when mds is stopping
- https://github.com/ceph/ceph/pull/57228
- 12:01 PM CephFS Backport #65405 (In Progress): reef: mds: Reduce log level for messages when mds is stopping
- https://github.com/ceph/ceph/pull/57227
- 12:01 PM CephFS Backport #65404 (In Progress): squid: mds: Reduce log level for messages when mds is stopping
- https://github.com/ceph/ceph/pull/57224
- 11:57 AM CephFS Bug #65260 (Pending Backport): mds: Reduce log level for messages when mds is stopping
- 11:44 AM CephFS Bug #56288: crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool)
- Venky Shankar wrote in #note-18:
> So, for some reason this part of the code
>
> [...]
>
> especially derefere... - 11:34 AM CephFS Bug #56288: crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool)
- So, for some reason this part of the code...
- 07:58 AM CephFS Bug #56288: crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool)
- I haven't been able to reproduce this with the main branch. If possible, please collect a ceph-mds coredump and attac...
- 11:28 AM CephFS Bug #65317 (Fix Under Review): cephfs_mirror: update peer status for invalid metadata in remote snapshot
- 11:06 AM RADOS Backport #65307 (In Progress): quincy: src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps)
- 11:06 AM RADOS Backport #65306 (In Progress): squid: src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps)
- 11:05 AM RADOS Backport #65305 (In Progress): reef: src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps)
- 10:47 AM CephFS Bug #48680: mds: scrubbing stuck "scrub active (0 inodes in the stack)"
- This might be due to the enabling of frags, as seen in the job description for the job mentioned in comment #4, and probably...
- 10:39 AM rgw Backport #65403 (New): reef: persistent topic stats test fails
- 10:37 AM rgw Bug #65354 (Duplicate): rgw/notifications: topic migration test failures
- the issues above are failures due to test issues that were fixed here: https://tracker.ceph.com/issues/63909
sometim... - 10:32 AM CephFS Bug #65171 (Fix Under Review): Provide metrics support for the Replication Start/End Notifications
- 10:31 AM rgw Bug #63909 (Pending Backport): persistent topic stats test fails
- 10:19 AM RADOS Feature #54525: osd/mon: log memory usage during tick
- PR: https://github.com/ceph/ceph/pull/56812
- 10:15 AM RADOS Bug #58893: test_map_discontinuity: AssertionError: wait_for_clean: failed before timeout expired
in: /a/yuriw-2024-04-02_15:39:50-rados-wip-yuri2-testing-2024-04-01-1235-quincy-distro-default-smithi/7636676
th...- 05:34 AM RADOS Bug #58893: test_map_discontinuity: AssertionError: wait_for_clean: failed before timeout expired
- /a/yuriw-2024-04-02_15:39:50-rados-wip-yuri2-testing-2024-04-01-1235-quincy-distro-default-smithi/7636676
- 09:46 AM rgw Bug #65337: rgw: Segmentation fault in rgw::notify::Manager during realm reload
- the valgrind report indicates a crash during shutdown. When we shut down the kafka manager, we destroy all connections,...
- 09:44 AM Messengers Bug #65401: msg: connection between mgr and osd is periodically down, which leads to heavy load on the mgr
- I'm not sure whether this is by design or a mistake, so I pushed a PR for discussion. PR: https://github.com/ceph/ceph/pull/5...
- 09:26 AM Messengers Bug #65401 (New): msg: connection between mgr and osd is periodically down, which leads to heavy load on the mgr
- I find that the connection between osd and mgr is periodically marked down (via mark_down) due to the ms_connection_idle_timeout config.
This ... - 08:56 AM Dashboard Feature #65268 (Resolved): mgr/dashboard: update NVMe-oF API "listener add" sync
- 08:56 AM Dashboard Backport #65390 (Resolved): squid: mgr/dashboard: update NVMe-oF API "listener add" sync
- 08:33 AM Ceph Bug #52604 (Closed): osd: mkfs: bluestore_stored > 235GiB from start
- The fix was merged
- 07:55 AM RADOS Feature #64519: OSD/MON: No snapshot metadata keys trimming
- Eugen Block wrote in #note-6:
> I know I'm a bit early asking this, but I helped raise this issue and Mykola picked ... - 07:45 AM Ceph Bug #65400 (New): ceph-exporter
- During the run of the ocs-ci tests (for example "test_fsgroupchangepolicy_when_depoyment_scaled") we receive the foll...
- 07:02 AM bluestore Bug #65298: Free space can be leaked in Quincy+ when bdev_async_discard is enabled
- PR https://github.com/ceph/ceph/pull/56744 should solve this issue
- 06:40 AM crimson Bug #65399 (Fix Under Review): osd crash due to deferred recovery
- Crimson OSD will fail if a recovery op is finished after a recovery/backfill is deferred:...
- 05:40 AM RADOS Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
- /a/yuriw-2024-04-02_15:39:50-rados-wip-yuri2-testing-2024-04-01-1235-quincy-distro-default-smithi/7636628
- 05:37 AM RADOS Bug #64725: rados/singleton: application not enabled on pool 'rbd'
- /a/yuriw-2024-04-02_15:39:50-rados-wip-yuri2-testing-2024-04-01-1235-quincy-distro-default-smithi/7636638
/a/yuriw-2... - 05:23 AM Infrastructure Bug #58907: OCI runtime error: runc: runc create failed: unable to start container process
- /a/yuriw-2024-04-02_15:39:50-rados-wip-yuri2-testing-2024-04-01-1235-quincy-distro-default-smithi/7636710
- 04:47 AM CephFS Bug #64977: mds spinlock due to lock contention leading to memory exhaustion
- The *client.379194623:32785 lookup* request was spinning infinitely in MDS:...
- 03:05 AM CephFS Bug #62123 (Fix Under Review): mds: detect out-of-order locking
- 01:11 AM rgw Bug #64803 (Fix Under Review): ninja all on fedora 39 fails because arrow_ext requires C++14
04/09/2024
- 11:50 PM rgw Bug #65397: rgw: allow disabling mdsearch APIs
- PR: https://github.com/ceph/ceph/pull/56802
- 11:48 PM rgw Bug #65397 (Fix Under Review): rgw: allow disabling mdsearch APIs
- Since this is visible to the bucket owners, it can be presumed to be a functional feature. Providing the ability to d...
- 07:43 PM Ceph QA QA Run #65270: wip-yuri6-testing-2024-04-02-1310
- https://shaman.ceph.com/builds/ceph/wip-yuri6-testing-2024-04-02-1310/a5074d4516d566e9d8b6aec912f26afd099de101/
- 07:33 PM Ceph QA QA Run #65270 (QA Needs Rerun/Rebuilt): wip-yuri6-testing-2024-04-02-1310
- 07:29 PM Ceph QA QA Run #65270: wip-yuri6-testing-2024-04-02-1310
- Laura Flores wrote in #note-16:
> Hey @yuriw just checking that this run is getting / has gotten rebased and rerun?
... - 06:51 PM Ceph QA QA Run #65270: wip-yuri6-testing-2024-04-02-1310
- Hey @yuriw just checking that this run is getting / has gotten rebased and rerun?
- 07:31 PM rgw Bug #65337: rgw: Segmentation fault in rgw::notify::Manager during realm reload
- I managed to reproduce under valgrind. This report of use-after-free looks relevant:...
- 06:54 PM Orchestrator Bug #64208 (In Progress): test_cephadm.sh: Container version mismatch causes job to fail.
- 06:36 PM Orchestrator Bug #65396 (New): smb service takes a very long time to delete
- Executing `ceph orch rm smb.foo` gets stuck in `<deleting>` phase.
I suspect that there may be an issue removing s... - 05:45 PM Ceph QA QA Run #65099: wip-yuri10-testing-2024-03-24-1159
- https://shaman.ceph.com/builds/ceph/wip-yuri10-testing-2024-03-24-1159/bf9c5618cc018927fa77780d25b6deb1f5dc254d/
- 05:01 PM Ceph QA QA Run #65099 (QA Needs Rerun/Rebuilt): wip-yuri10-testing-2024-03-24-1159
- 05:01 PM Ceph QA QA Run #65099: wip-yuri10-testing-2024-03-24-1159
- Seems like it needs rebase as I see a recent commit by @rzarzynski
rebasing - 04:23 PM mgr Feature #64318: mgr/prometheus add support for TLS and client cert authentication
- Redouane Kachach Elhichou wrote in #note-5:
> Christian Rohmann wrote:
> > Redouane Kachach Elhichou wrote:
> > ... - 04:18 PM RADOS Bug #65227: noscrub cluster flag prevents deep-scrubs from starting
- https://github.com/ceph/ceph/blob/main/doc/dev/osd_internals/scrub.rst
https://github.com/ceph/ceph/blob/v17.2.7/src... - 03:31 PM Orchestrator Bug #65367 (Resolved): PermissionError: [Errno 13] Permission denied in the fake filesystem
- all 4 PRs are now merged. This should no longer occur in any make check runs started after this point.
- 03:16 PM Orchestrator Bug #65395 (Fix Under Review): [node-proxy] the agent shouldn't fail when RedFish returns empty data
- 03:11 PM Orchestrator Bug #65395 (Fix Under Review): [node-proxy] the agent shouldn't fail when RedFish returns empty data
- If for some reason RedFish returns empty data, node-proxy fails because it can't access non-existent keys; it bas...
- 03:06 PM Stable releases Tasks #65393: reef v18.2.3
- h3. QE VALIDATION (STARTED 4/8/23)
PRs list => https://pad.ceph.com/p/reef_v18.2.3_QE_PRs_LIST
*%{color:blue}Releas... - 03:02 PM Stable releases Tasks #65393 (New): reef v18.2.3
- h3. Workflow
* "Preparing the release":http://ceph.com/docs/master/dev/development-workflow/#preparing-a-new-relea... - 03:05 PM Orchestrator Feature #65394 (In Progress): [node-proxy] implement 'endpoints discovering'
- RFE to add the required logic to make the daemon explore the API for discovering the different endp...
- 03:03 PM Orchestrator Bug #65392 (Fix Under Review): [node-proxy] the node-proxy daemon crashes when get_logger() is passed a log level
- 02:48 PM Orchestrator Bug #65392 (Fix Under Review): [node-proxy] the node-proxy daemon crashes when get_logger() is passed a log level
- ...
- 02:59 PM Ceph QA QA Run #65385 (QA Needs Approval): wip-yuri4-testing-2024-04-08-1432
- 02:52 PM Ceph Backport #65368 (Resolved): squid: install-deps: enable copr ceph/grpc
- 11:28 AM Ceph Backport #65368 (In Progress): squid: install-deps: enable copr ceph/grpc
- 02:51 PM Ceph Bug #65184 (Resolved): install-deps: enable copr ceph/grpc
- 02:41 PM rgw Bug #65334: Command failed with status 128: 'git clone -b stable/xena https://github.com/openstack/barbican.git /home/ubuntu/cephtest/barbican'
- the barbican repo changed the name of this branch to @unmaintained/xena@
- 02:41 PM Ceph QA QA Run #65360 (QA Closed): wip-yuri-testing-2024-04-07-0902
- 09:53 AM Ceph QA QA Run #65360 (QA Approved): wip-yuri-testing-2024-04-07-0902
- 09:53 AM Ceph QA QA Run #65360: wip-yuri-testing-2024-04-07-0902
- Thanks, @lflores
I should have refreshed the page before going over the failures.... - 02:40 PM Ceph QA QA Run #65252: wip-yuri2-testing-2024-04-01-1235-quincy
- @nmordech see `QA Runs:` above
- 04:57 AM Ceph QA QA Run #65252: wip-yuri2-testing-2024-04-01-1235-quincy
- @lflores sure, waiting for the runs.
@yuriw do we have the link to the suites run? - 02:34 PM Ceph Backport #65391 (In Progress): squid: osd/scrub: "reservation requested while still reserved" error in cluster log
- 02:15 PM Ceph Backport #65391 (In Progress): squid: osd/scrub: "reservation requested while still reserved" error in cluster log
- 02:10 PM Ceph Bug #64827 (Pending Backport): osd/scrub: "reservation requested while still reserved" error in cluster log
- 12:56 PM Ceph QA QA Run #65237: wip-ceph_test_rados-partial-reads
- Fixed the one above and scheduled a rerun: https://pulpito.ceph.com/rzarzynski-2024-04-09_12:48:11-rados-wip-ceph_tes...
- 12:39 PM rgw Bug #64308: CORS Preflight Failure After Upgrading to 17.2.7
- Will the backports make it into the next release of Quincy/Reef?
- 11:54 AM Dashboard Backport #65390 (In Progress): squid: mgr/dashboard: update NVMe-oF API "listener add" sync
- 09:36 AM Dashboard Backport #65390 (Resolved): squid: mgr/dashboard: update NVMe-oF API "listener add" sync
- https://github.com/ceph/ceph/pull/56783
- 09:55 AM CephFS Bug #64977: mds spinlock due to lock contention leading to memory exhaustion
- Posted more logs at fed9e44e-a0ec-4692-ae23-6a1047fe9247
- 09:29 AM Dashboard Feature #65268 (Pending Backport): mgr/dashboard: update NVMe-oF API "listener add" sync
- 08:41 AM Ceph Bug #61598 (Duplicate): gcc-14: FTBFS "error: call to non-'constexpr' function 'virtual unsigned int DoutPrefixProvider::get_subsys() const'"
- 08:08 AM Ceph Feature #63801: verified mon backups
- *This is really a good idea to have built-in! Thanks for taking this up!*
We have been using a custom backup scrip... - 08:02 AM CephFS Bug #65389 (Fix Under Review): The ceph_readdir function in libcephfs returns incorrect d_reclen value
- When @struct dirent@ entries are returned by the @ceph_readdir()@ function, the field @d_reclen@ is always 1.
Based on... - 07:19 AM CephFS Bug #65388 (New): The MDS_SLOW_REQUEST warning is flapping even though the slow requests don't go away
- I have caught a cluster in an unhealthy state - probably some MDS deadlock that results in requests being blocked (de...
- 07:14 AM CephFS Bug #65171 (In Progress): Provide metrics support for the Replication Start/End Notifications
- 03:14 AM Ceph Support #64378: Slow / Single backfilling on Reef (18.2.1-pve2)
- Aha, there's a new feature in Ceph that auto-resets these values:
https://docs.ceph.com/en/quincy/rados/configurat... - 01:53 AM Ceph Support #64378: Slow / Single backfilling on Reef (18.2.1-pve2)
- I observe the same problem on 18.2.1:...
- 02:57 AM Linux kernel client Bug #51279: kclient hangs on umount (testing branch)
- I have added more debug logs and will dump why the *flushsnap_ack* was dropped directly:...
- 01:48 AM Orchestrator Bug #65387 (New): cephadm: Unable to use gather-facts without podman/docker installed
- cephadm gather-facts can be used to gather inventory across the hosts to validate hardware prior to deployment. Howev...
- 01:12 AM Ceph QA QA Run #65045: wip-yuri5-testing-2024-03-21-0833
- https://shaman.ceph.com/builds/ceph/wip-yuri5-testing-2024-03-21-0833/fbfd55d0098e16f2a4f0d8b71252fe1ef3b65d2a/
- 12:22 AM Ceph Bug #65386 (New): rados: create test to validate replica read
- RADOS supports the ability to send reads to replicas rather than the primary. The primary use for this feature is to...
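As a rough illustration of the replica-read capability described above (not the test proposed in #65386), a minimal librados C sketch might issue a balanced read so RADOS may serve it from a replica rather than the primary; the pool name "rbd" and object name "testobj" are placeholder assumptions:
<pre>
#include <stdio.h>
#include <rados/librados.h>

int main(void)
{
    rados_t cluster;
    rados_ioctx_t ioctx;
    rados_read_op_t op;
    char buf[64];
    size_t bytes_read = 0;
    int read_rval = 0;
    int ret;

    /* Connect using the default ceph.conf search path and client.admin. */
    if (rados_create(&cluster, NULL) < 0)
        return 1;
    rados_conf_read_file(cluster, NULL);
    if (rados_connect(cluster) < 0)
        return 1;

    /* "rbd" and "testobj" are placeholders for this sketch. */
    if (rados_ioctx_create(cluster, "rbd", &ioctx) < 0) {
        rados_shutdown(cluster);
        return 1;
    }

    /* Build a read op and pass LIBRADOS_OPERATION_BALANCE_READS to
     * rados_read_op_operate() so the read may be served by a replica
     * instead of always going to the primary OSD. */
    op = rados_create_read_op();
    rados_read_op_read(op, 0, sizeof(buf), buf, &bytes_read, &read_rval);
    ret = rados_read_op_operate(op, ioctx, "testobj",
                                LIBRADOS_OPERATION_BALANCE_READS);
    rados_release_read_op(op);

    printf("operate=%d read_rval=%d bytes_read=%zu\n",
           ret, read_rval, bytes_read);

    rados_ioctx_destroy(ioctx);
    rados_shutdown(cluster);
    return ret < 0 ? 1 : 0;
}
</pre>
A test along these lines would presumably also need to confirm (e.g. from OSD logs or perf counters) that the read was actually served by a non-primary OSD, which is the validation part the tracker issue asks for.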