Revision 85 (Rishabh Dave, 04/12/2024 06:57 AM) → Revision 86/91 (Venky Shankar, 04/17/2024 11:12 AM)
h1. Reef

h2. On-call Schedule

* Feb: Venky
* Mar: Patrick
* Apr: Jos
* May: Xiubo
* Jun: Rishabh
* Jul: Kotresh
* Aug: Milind
* Sep: Leonid
* Oct: Dhairya
* Nov: Chris

h3. 2024-04-17

ADD NEW ENTRY BELOW

https://tracker.ceph.com/issues/65393#note-1
https://pulpito.ceph.com/yuriw-2024-04-09_01:14:24-fs-reef-release-distro-default-smithi/

* "selinux denials with centos9.stream":https://tracker.ceph.com/issues/64616
* "pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main":https://tracker.ceph.com/issues/64502
* "qa: 'Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)'":https://tracker.ceph.com/issues/52624
* "Test failure: test_add_ancestor_and_child_directory (tasks.cephfs.test_mirroring.TestMirroring)":https://tracker.ceph.com/issues/62221
* "QA failure: test_fscrypt_dummy_encryption_with_quick_group":https://tracker.ceph.com/issues/65136
* "error during scrub thrashing: rank damage found: {'backtrace'}":https://tracker.ceph.com/issues/57676
* "[testing] dbench: write failed on handle 10009 (Resource temporarily unavailable)":https://tracker.ceph.com/issues/57656
* "qa: fs/snaps/snaptest-git-ceph.sh failed when reseting to tag 'v0.1'":https://tracker.ceph.com/issues/63265
* "AttributeError: 'RemoteProcess' object has no attribute 'read'":https://tracker.ceph.com/issues/62188

h3. 12 April 2024

https://tracker.ceph.com/issues/65264
https://pulpito.ceph.com/rishabh-2024-04-08_08:23:45-fs-wip-rishabh-testing-20240407.092921-reef-testing-default-smithi/

* https://tracker.ceph.com/issues/64711 Test failure: test_cephfs_mirror_cancel_mirroring_and_readd (tasks.cephfs.test_mirroring.TestMirroring)
* https://tracker.ceph.com/issues/62067 ffsb.sh failure "Resource temporarily unavailable"
* https://tracker.ceph.com/issues/53859 Test failure: test_pool_perm (tasks.cephfs.test_pool_perm.TestPoolPerm)
* https://tracker.ceph.com/issues/61820 valgrind issues with ceph-mon
* https://tracker.ceph.com/issues/64502 pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
* https://tracker.ceph.com/issues/64616 selinux denials with centos9.stream
* https://tracker.ceph.com/issues/57676 error during scrub thrashing: rank damage found: {'backtrace'}
* https://tracker.ceph.com/issues/65136 QA failure: test_fscrypt_dummy_encryption_with_quick_group
* https://tracker.ceph.com/issues/65261 qa/cephfs: cephadm related failure on fs/upgrade job
* https://tracker.ceph.com/issues/63265 qa: fs/snaps/snaptest-git-ceph.sh failed when reseting to tag 'v0.1'

h3. 2024-04-08

https://tracker.ceph.com/issues/65336

* "selinux denials with centos9.stream":https://tracker.ceph.com/issues/64616
* "pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main":https://tracker.ceph.com/issues/64502
* "Test failure: test_fscrypt_dummy_encryption_with_quick_group (tasks.cephfs.test_fscrypt.TestFscrypt)":https://tracker.ceph.com/issues/59684
* "leak in mds.c detected by valgrind during CephFS QA run":https://tracker.ceph.com/issues/63949
* "qa: 'Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)'":https://tracker.ceph.com/issues/52624
* "[testing] dbench: write failed on handle 10009 (Resource temporarily unavailable)":https://tracker.ceph.com/issues/57656
* "Test failure: test_add_ancestor_and_child_directory (tasks.cephfs.test_mirroring.TestMirroring)":https://tracker.ceph.com/issues/62221
* "AttributeError: 'RemoteProcess' object has no attribute 'read'":https://tracker.ceph.com/issues/62188
* "qa: The following counters failed to be set on mds daemons: {'mds.exported', 'mds.imported'}":https://tracker.ceph.com/issues/65372
* "mds: failed to store backtrace and force file system read-only":https://tracker.ceph.com/issues/63259

h3. 2024-04-03

https://tracker.ceph.com/issues/65148
https://pulpito.ceph.com/?branch=wip-vshankar-testing-20240326.105515-reef

* "error during scrub thrashing: rank damage found: {'backtrace'}":https://tracker.ceph.com/issues/57676
* "reef: qa: AttributeError: 'TestSnapSchedulesSubvolAndGroupArguments' object has no attribute 'get_ceph_cmd_stdout'":https://tracker.ceph.com/issues/64937
* "selinux denials with centos9.stream":https://tracker.ceph.com/issues/64616
* "pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main":https://tracker.ceph.com/issues/64502
* "qa: scrub - object missing on disk; some files may be lost":https://tracker.ceph.com/issues/48562
* "Test failure: test_fscrypt_dummy_encryption_with_quick_group (tasks.cephfs.test_fscrypt.TestFscrypt)":https://tracker.ceph.com/issues/59684
* "qa: 'Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)'":https://tracker.ceph.com/issues/52624
* "Test failure: test_cephfs_mirror_cancel_mirroring_and_readd":https://tracker.ceph.com/issues/64711
* "[testing] dbench: write failed on handle 10009 (Resource temporarily unavailable)":https://tracker.ceph.com/issues/57656
* "leak in mds.c detected by valgrind during CephFS QA run":https://tracker.ceph.com/issues/63949
* "Test failure: test_pool_perm (tasks.cephfs.test_pool_perm.TestPoolPerm)":https://tracker.ceph.com/issues/53859

h3. 3 APR 2024

https://pulpito.ceph.com/rishabh-2024-03-29_18:05:24-fs-wip-rishabh-testing-20240327.051042-reef-testing-default-smithi/

* https://tracker.ceph.com/issues/53859 Test failure: test_pool_perm (tasks.cephfs.test_pool_perm.TestPoolPerm)
* https://tracker.ceph.com/issues/65136 QA failure: test_fscrypt_dummy_encryption_with_quick_group
* https://tracker.ceph.com/issues/47292 Test failure: test_df_for_valid_file (tasks.cephfs.test_cephfs_shell.TestDF)
* https://tracker.ceph.com/issues/48562 qa: scrub - object missing on disk; some files may be lost
* https://tracker.ceph.com/issues/57656 [testing] dbench: write failed on handle 10009 (Resource temporarily unavailable)
* https://tracker.ceph.com/issues/64502 pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
* https://tracker.ceph.com/issues/64616 selinux denials with centos9.stream
* https://tracker.ceph.com/issues/57676 error during scrub thrashing: rank damage found: {'backtrace'}
* https://tracker.ceph.com/issues/62067 ffsb.sh failure "Resource temporarily unavailable"
* https://tracker.ceph.com/issues/57206 ceph_test_libcephfs_reclaim crashes during test
* https://tracker.ceph.com/issues/65261 qa/cephfs: cephadm related failure on fs/upgrade job
* https://tracker.ceph.com/issues/65262 qa/cephfs: kernel_untar_build.sh failed due to build error

h3. 2024-03-28

https://tracker.ceph.com/issues/65202
https://pulpito.ceph.com/yuriw-2024-03-29_19:10:12-fs-wip-yuri11-testing-2024-03-28-0753-reef-distro-default-smithi/

* "error during scrub thrashing: rank damage found: {'backtrace'}":https://tracker.ceph.com/issues/57676
* "qa: Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)":https://tracker.ceph.com/issues/52624
* "selinux denials with centos9.stream":https://tracker.ceph.com/issues/64616
* "reef: qa: AttributeError: 'TestSnapSchedulesSubvolAndGroupArguments' object has no attribute 'get_ceph_cmd_stdout'":https://tracker.ceph.com/issues/64937
* "pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main":https://tracker.ceph.com/issues/64502
* "Test failure: test_fscrypt_dummy_encryption_with_quick_group (tasks.cephfs.test_fscrypt.TestFscrypt)":https://tracker.ceph.com/issues/59684
* "qa: scrub - object missing on disk; some files may be lost":https://tracker.ceph.com/issues/48562
* "Test failure: test_add_ancestor_and_child_directory (tasks.cephfs.test_mirroring.TestMirroring)":https://tracker.ceph.com/issues/62221
* "[testing] dbench: write failed on handle 10009 (Resource temporarily unavailable)":https://tracker.ceph.com/issues/57656
* "ceph_test_libcephfs_reclaim crashes during test":https://tracker.ceph.com/issues/57206

h3. 2024-03-26

https://trello.com/c/xEeVJoco/1978-wip-yuri-testing-2024-03-12-1240-reef
https://pulpito.ceph.com/yuriw-2024-03-13_19:31:43-fs-wip-yuri-testing-2024-03-12-1240-reef-distro-default-smithi/

* "[testing] dbench: write failed on handle 10009 (Resource temporarily unavailable)":https://tracker.ceph.com/issues/57656
* "error during scrub thrashing: rank damage found: {'backtrace'}":https://tracker.ceph.com/issues/57676
* "pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main":https://tracker.ceph.com/issues/64502
* "selinux denials with centos9.stream":https://tracker.ceph.com/issues/64616
* "Test failure: test_add_ancestor_and_child_directory (tasks.cephfs.test_mirroring.TestMirroring)":https://tracker.ceph.com/issues/62221
* "Test failure: test_fscrypt_dummy_encryption_with_quick_group (tasks.cephfs.test_fscrypt.TestFscrypt)":https://tracker.ceph.com/issues/59684
* "Test failure: test_multiple_path_r (tasks.cephfs.test_admin.TestFsAuthorize)":https://tracker.ceph.com/issues/64172
* "reef: qa: AttributeError: 'TestSnapSchedulesSubvolAndGroupArguments' object has no attribute 'get_ceph_cmd_stdout'":https://tracker.ceph.com/issues/64937

Test failures caused by PR:

* https://pulpito.ceph.com/yuriw-2024-03-13_19:31:43-fs-wip-yuri-testing-2024-03-12-1240-reef-distro-default-smithi/7598790
* https://pulpito.ceph.com/yuriw-2024-03-13_19:31:43-fs-wip-yuri-testing-2024-03-12-1240-reef-distro-default-smithi/7598726
* https://pulpito.ceph.com/yuriw-2024-03-13_19:31:43-fs-wip-yuri-testing-2024-03-12-1240-reef-distro-default-smithi/7598661

h3. 20th March 2024

https://trello.com/c/bCURjoHF/1981-wip-yuri8-testing-2024-03-15-0740-reef-old-wip-yuri8-testing-2024-03-14-1513-reef
https://pulpito.ceph.com/yuriw-2024-03-15_20:10:26-fs-wip-yuri8-testing-2024-03-15-0740-reef-distro-default-smithi/

* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/64502 pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
* https://tracker.ceph.com/issues/57676 error during scrub thrashing: rank damage found: {'backtrace'}
* https://tracker.ceph.com/issues/64616 selinux denials with centos9.stream
* https://tracker.ceph.com/issues/61820 valgrind issues with ceph-mon
* https://tracker.ceph.com/issues/59684 Test failure: test_fscrypt_dummy_encryption_with_quick_group (tasks.cephfs.test_fscrypt.TestFscrypt)
* https://tracker.ceph.com/issues/64937 reef: qa: AttributeError: 'TestSnapSchedulesSubvolAndGroupArguments' object has no attribute 'get_ceph_cmd_stdout'
* https://tracker.ceph.com/issues/62221 Test failure: test_add_ancestor_and_child_directory (tasks.cephfs.test_mirroring.TestMirroring)
* https://tracker.ceph.com/issues/57656 [testing] dbench: write failed on handle 10009 (Resource temporarily unavailable)
* https://tracker.ceph.com/issues/59343 (see https://tracker.ceph.com/issues/59343#note-25) qa: fs/snaps/snaptest-multiple-capsnaps.sh failed (awaiting kernel backport)
* https://tracker.ceph.com/issues/65001 mds: ceph-mds might silently ignore client_session(request_close, ...) message

h3. 19th March 2024

https://trello.com/c/mF32rpDn/1982-wip-yuri10-testing-2024-03-15-1653-reef
https://pulpito.ceph.com/yuriw-2024-03-16_15:03:17-fs-wip-yuri10-testing-2024-03-15-1653-reef-distro-default-smithi/

* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/64502 pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
* https://tracker.ceph.com/issues/57676 error during scrub thrashing: rank damage found: {'backtrace'}
* https://tracker.ceph.com/issues/59684 Test failure: test_fscrypt_dummy_encryption_with_quick_group (tasks.cephfs.test_fscrypt.TestFscrypt)
* https://tracker.ceph.com/issues/61820 valgrind issues with ceph-mon
* https://tracker.ceph.com/issues/64616 selinux denials with centos9.stream
* https://tracker.ceph.com/issues/64937 reef: qa: AttributeError: 'TestSnapSchedulesSubvolAndGroupArguments' object has no attribute 'get_ceph_cmd_stdout'
* https://tracker.ceph.com/issues/48562 qa: scrub - object missing on disk; some files may be lost
* https://tracker.ceph.com/issues/57656 [testing] dbench: write failed on handle 10009 (Resource temporarily unavailable)
* https://tracker.ceph.com/issues/63265 qa: fs/snaps/snaptest-git-ceph.sh failed when reseting to tag 'v0.1'
* https://tracker.ceph.com/issues/59343 qa: fs/snaps/snaptest-multiple-capsnaps.sh failed (awaiting kernel backport)

h3. 11th March 2024

https://trello.com/c/TLeKDODy/1973-wip-yuri11-testing-2024-03-11-0838-reef-old-wip-yuri11-testing-2024-03-06-0821-reef
https://pulpito.ceph.com/yuriw-2024-03-12_14:59:27-fs-wip-yuri11-testing-2024-03-11-0838-reef-distro-default-smithi/

* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/64502 pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
* https://tracker.ceph.com/issues/57676 error during scrub thrashing: rank damage found: {'backtrace'}
* https://tracker.ceph.com/issues/59684 Test failure: test_fscrypt_dummy_encryption_with_quick_group (tasks.cephfs.test_fscrypt.TestFscrypt)
* https://tracker.ceph.com/issues/61820 valgrind issues with ceph-mon
* https://tracker.ceph.com/issues/64616 selinux denials with centos9.stream
* https://tracker.ceph.com/issues/57655 qa: fs:mixed-clients kernel_untar_build failure
* https://tracker.ceph.com/issues/64937 reef: qa: AttributeError: 'TestSnapSchedulesSubvolAndGroupArguments' object has no attribute 'get_ceph_cmd_stdout'
* https://tracker.ceph.com/issues/62221 Test failure: test_add_ancestor_and_child_directory (tasks.cephfs.test_mirroring.TestMirroring)
* https://tracker.ceph.com/issues/48562 qa: scrub - object missing on disk; some files may be lost

PRs failing QA:

* https://github.com/ceph/ceph/pull/54729#pullrequestreview-1937730202
* https://github.com/ceph/ceph/pull/55692#pullrequestreview-1937932954

h3. 5th March 2024

https://pulpito.ceph.com/?branch=wip-vshankar-testing1-reef-2024-02-28-1133

* https://tracker.ceph.com/issues/57676 error during scrub thrashing: rank damage found: {'backtrace'}
* https://tracker.ceph.com/issues/59684 Test failure: test_fscrypt_dummy_encryption_with_quick_group (tasks.cephfs.test_fscrypt.TestFscrypt)
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/64616 selinux denials with centos9.stream
* https://tracker.ceph.com/issues/57656 [testing] dbench: write failed on handle 10009 (Resource temporarily unavailable)
* https://tracker.ceph.com/issues/51964 qa: test_cephfs_mirror_restart_sync_on_blocklist failure
* https://tracker.ceph.com/issues/59531 "OSD bench result of 228617.361065 IOPS exceeded the threshold limit of 500.000000 IOPS for osd.7. IOPS capacity is unchanged at 315.000000 IOPS. The recommendation is to establish the osd's IOPS capacity using other benchmark tools (e.g. Fio)"
* https://tracker.ceph.com/issues/64747 postgresql pkg install failure
* https://tracker.ceph.com/issues/57206 ceph_test_libcephfs_reclaim crashes during test
* https://tracker.ceph.com/issues/64748 reef: snaptest-git-ceph.sh failure

h3. 05 Feb 2024

* https://tracker.ceph.com/issues/59684 Test failure: test_fscrypt_dummy_encryption_with_quick_group (tasks.cephfs.test_fscrypt.TestFscrypt)
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/59531 "OSD bench result of 228617.361065 IOPS exceeded the threshold limit of 500.000000 IOPS for osd.7. IOPS capacity is unchanged at 315.000000 IOPS. The recommendation is to establish the osd's IOPS capacity using other benchmark tools (e.g. Fio)"
* https://tracker.ceph.com/issues/59534 Command failed (workunit test suites/dbench.sh) on smithi077 with status 1
* https://tracker.ceph.com/issues/61820 valgrind issues with ceph-mon
* https://tracker.ceph.com/issues/63522 No module named ... (PYTHONPATH issues)

h3. 29 Jan 2024

* https://tracker.ceph.com/issues/59684 Test failure: test_fscrypt_dummy_encryption_with_quick_group (tasks.cephfs.test_fscrypt.TestFscrypt)
* https://tracker.ceph.com/issues/61831 qa: test_mirroring_init_failure_with_recovery failure
* https://tracker.ceph.com/issues/64060 qa: AttributeError: 'TestSubvolumeGroups' object has no attribute '_generate_random_group_name'
* https://tracker.ceph.com/issues/59531 "OSD bench result of 228617.361065 IOPS exceeded the threshold limit of 500.000000 IOPS for osd.7. IOPS capacity is unchanged at 315.000000 IOPS. The recommendation is to establish the osd's IOPS capacity using other benchmark tools (e.g. Fio)"
* https://tracker.ceph.com/issues/57676 error during scrub thrashing: rank damage found: {'backtrace'}
* https://tracker.ceph.com/issues/51282 cluster [WRN] Health check failed: Degraded data redundancy: 9/46 objects degraded (19.565%), 4 pgs degraded (PG_DEGRADED)" in cluster log
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/59534 Command failed (workunit test suites/dbench.sh) on smithi077 with status 1

h3. 04 Dec 2023

https://pulpito.ceph.com/?sha1=a6f82997ac57b70a895455dfed6256360a1e4c32

* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/62658 error during scrub thrashing: reached maximum tries (31) after waiting for 900 seconds
* https://tracker.ceph.com/issues/63233 mon|client|mds: valgrind reports possible leaks in the MDS
* https://tracker.ceph.com/issues/57655 qa: fs:mixed-clients kernel_untar_build failure
* https://tracker.ceph.com/issues/63339 reef: mds: warning `clients failing to advance oldest client/flush tid` seen with some workloads (missing backport)
* https://tracker.ceph.com/issues/62221 Test failure: test_add_ancestor_and_child_directory (tasks.cephfs.test_mirroring.TestMirroring)
* https://tracker.ceph.com/issues/57676 error during scrub thrashing: rank damage found: {'backtrace'}
* https://tracker.ceph.com/issues/62482 qa: "cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)" (backport for cephfs-mirror ignorelist missing)
* https://tracker.ceph.com/issues/59684 Test failure: test_fscrypt_dummy_encryption_with_quick_group (tasks.cephfs.test_fscrypt.TestFscrypt)
* https://tracker.ceph.com/issues/62287 reef: ceph_test_libcephfs_reclaim crashes during test
* https://tracker.ceph.com/issues/62081 tasks/fscrypt-common does not finish, timesout

h3. 21 Nov 2023

https://pulpito.ceph.com/?branch=wip-vshankar-testing2-2023-11-15-1128-reef

* https://tracker.ceph.com/issues/62658 error during scrub thrashing: reached maximum tries (31) after waiting for 900 seconds
* https://tracker.ceph.com/issues/51964 qa: test_cephfs_mirror_restart_sync_on_blocklist failure
* https://tracker.ceph.com/issues/63339 Test failure: test_client_blocklisted_oldest_tid (tasks.cephfs.test_client_limits.TestClientLimits)
* https://tracker.ceph.com/issues/53859 Test failure: test_pool_perm (tasks.cephfs.test_pool_perm.TestPoolPerm)
* https://tracker.ceph.com/issues/62580 Test failure: test_snapshot_remove (tasks.cephfs.test_strays.TestStrays)
* https://tracker.ceph.com/issues/54460 Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithi002 with status 1
* https://tracker.ceph.com/issues/57656 dbench: write failed on handle 10010 (Resource temporarily unavailable)
* https://tracker.ceph.com/issues/50223 client.xxxx isn't responding to mclientcaps(revoke)

h3. 07 Nov 2023

https://pulpito.ceph.com/?branch=reef-release (18.2.1)

* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/63233 mon|client|mds: valgrind reports possible leaks in the MDS
* https://tracker.ceph.com/issues/62658 error during scrub thrashing: reached maximum tries (31) after waiting for 900 seconds
* https://tracker.ceph.com/issues/57676 error during scrub thrashing: rank damage found: {'backtrace'}
* https://tracker.ceph.com/issues/62221 Test failure: test_add_ancestor_and_child_directory (tasks.cephfs.test_mirroring.TestMirroring)
* https://tracker.ceph.com/issues/63488 smoke test fails from "NameError: name 'DEBUGFS_META_DIR' is not defined" (missing kclient patches in ubuntu 20.04)
* https://tracker.ceph.com/issues/62081 tasks/fscrypt-common does not finish, timesout
* https://tracker.ceph.com/issues/63339 reef: mds: warning `clients failing to advance oldest client/flush tid` seen with some workloads (missing backport)

h3. 27 Oct 2023

https://pulpito.ceph.com/yuriw-2023-10-25_14:39:02-fs-wip-yuri6-testing-2023-10-23-1148-reef-distro-default-smithi/

* https://tracker.ceph.com/issues/58126 Test failure: test_fscrypt_dummy_encryption_with_quick_group (tasks.cephfs.test_fscrypt.TestFscrypt)
* https://tracker.ceph.com/issues/63339 Test failure: test_client_blocklisted_oldest_tid (tasks.cephfs.test_client_limits.TestClientLimits)
* https://tracker.ceph.com/issues/48873 Command failed on smithi118 with status 32: 'sudo mount -t nfs -o port=2049 172.21.15.118:/ceph /mnt'
* https://tracker.ceph.com/issues/62221 Test failure: test_add_ancestor_and_child_directory (tasks.cephfs.test_mirroring.TestMirroring)
* https://tracker.ceph.com/issues/55805 Command failed (workunit test kernel_untar_build.sh) on smithi167 with status 2
* https://tracker.ceph.com/issues/57676 error during scrub thrashing: rank damage found: {'backtrace'}
* https://tracker.ceph.com/issues/62658 error during scrub thrashing: reached maximum tries (31) after waiting for 900 seconds
* https://tracker.ceph.com/issues/62580 Test failure: test_snapshot_remove (tasks.cephfs.test_strays.TestStrays)
* https://tracker.ceph.com/issues/63340 saw valgrind issues

h3. 18 Oct 2023

https://pulpito.ceph.com/?branch=wip-vshankar-testing-reef-20231018.163145

* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/63233 mon|client|mds: valgrind reports possible leaks in the MDS
* https://tracker.ceph.com/issues/59531 "OSD bench result of 228617.361065 IOPS exceeded the threshold limit of 500.000000 IOPS for osd.7. IOPS capacity is unchanged at 315.000000 IOPS. The recommendation is to establish the osd's IOPS capacity using other benchmark tools (e.g. Fio)"
* https://tracker.ceph.com/issues/63281 src/mds/MDLog.h: 100: FAILED ceph_assert(!segments.empty())
* https://tracker.ceph.com/issues/57655 qa: fs:mixed-clients kernel_untar_build failure
* https://tracker.ceph.com/issues/61831 qa: test_mirroring_init_failure_with_recovery failure
* https://tracker.ceph.com/issues/62810 Failure in snaptest-git-ceph.sh (it's an async unlink/create bug) -- Need to fix again
* https://tracker.ceph.com/issues/57206 ceph_test_libcephfs_reclaim crashes during test
* https://tracker.ceph.com/issues/62081 tasks/fscrypt-common does not finish, timesout
* https://tracker.ceph.com/issues/59343 qa: fs/snaps/snaptest-multiple-capsnaps.sh failed

h3. 11 Oct 2023

https://pulpito.ceph.com/?branch=wip-yuri4-testing-2023-10-11-0735-reef

* https://patchwork.kernel.org/project/ceph-devel/patch/20220523160951.8781-1-lhenriques@suse.de/ hit max job timeout; the kclient incorrectly decodes the session message. The fix has been backported to downstream kernel-4.18.0-500.el8.
* https://tracker.ceph.com/issues/63259 "hit max job timeout", failed to store the backtrace and fs was forced to readonly
* https://tracker.ceph.com/issues/63105 "smithi060 with status 139: 'mkdir -p -- ...qa/workunits/libcephfs/test.sh'"
* https://tracker.ceph.com/issues/61820 "saw valgrind issues"
* https://tracker.ceph.com/issues/62937 "stderr:logrotate does not support parallel execution on the same set of logfiles"
* https://tracker.ceph.com/issues/57676 "error during scrub thrashing: rank damage found: {'backtrace'}"
* https://tracker.ceph.com/issues/63212 Failed to download "ior.tbz2"
* https://tracker.ceph.com/issues/62277 "Error: Unable to find a match: python2"
* https://tracker.ceph.com/issues/61574 qa: build failure for mdtest project
* https://tracker.ceph.com/issues/62658 "error during scrub thrashing: reached maximum tries (31) after waiting for 900 seconds"
* https://tracker.ceph.com/issues/63265 "Command failed (workunit test fs/snaps/snaptest-git-ceph.sh) on smithi062 with status 128: .../fs/snaps/snaptest-git-ceph.sh'"
* https://tracker.ceph.com/issues/62221 "Test failure: test_add_ancestor_and_child_directory (tasks.cephfs.test_mirroring.TestMirroring)"
* https://tracker.ceph.com/issues/43863 "mkdir: cannot create directory '/home/ubuntu/cephtest/archive/audit': File exists"

h3. 09 Oct 2023

https://pulpito.ceph.com/?branch=wip-vshankar-testing-reef-20231009.131610

* https://tracker.ceph.com/issues/63211 error during scrub thrashing: reached maximum tries (31) after waiting for 900 seconds
* https://tracker.ceph.com/issues/63212 Failed to download "ior.tbz2"
* https://tracker.ceph.com/issues/61831 Test failure: test_mirroring_init_failure_with_recovery (tasks.cephfs.test_mirroring.TestMirroring)
* https://tracker.ceph.com/issues/18845 saw valgrind issues
* https://tracker.ceph.com/issues/53859 Test failure: test_pool_perm (tasks.cephfs.test_pool_perm.TestPoolPerm)
* https://tracker.ceph.com/issues/61574 qa: build failure for mdtest project
* https://tracker.ceph.com/issues/57655 Failed to build the kernel source
* https://tracker.ceph.com/issues/62580 Test failure: test_snapshot_remove (tasks.cephfs.test_strays.TestStrays)
* https://tracker.ceph.com/issues/57087 Test failure: test_fragmented_injection (tasks.cephfs.test_data_scan.TestDataScan)

h3. 27 Sep 2023

https://pulpito.ceph.com/?branch=wip-vshankar-testing-reef-20230927.021134

* https://tracker.ceph.com/issues/52624 cluster [WRN] Health check failed: Reduced data availability: 1 pg inactive, 1 pg peering (PG_AVAILABILITY)" in cluster log
* https://tracker.ceph.com/issues/59531 quincy: "OSD bench result of 228617.361065 IOPS exceeded the threshold limit of 500.000000 IOPS for osd.7. IOPS capacity is unchanged at 315.000000 IOPS. The recommendation is to establish the osd's IOPS capacity using other benchmark tools (e.g. Fio)"
* https://tracker.ceph.com/issues/61574 qa: "[Makefile:59: mdtest] Error 1" (undefined reference to `fi_strerror')
* https://tracker.ceph.com/issues/61399 qa: build failure for ior
* https://tracker.ceph.com/issues/62653 qa: unimplemented fcntl command: 1036 with fsstress
* https://tracker.ceph.com/issues/62847 mds: blogbench requests stuck (5mds+scrub+snaps-flush)
* https://tracker.ceph.com/issues/62081 tasks/fscrypt-common does not finish, timesout
* https://tracker.ceph.com/issues/63089 qa: tasks/mirror times out
* https://tracker.ceph.com/issues/58244 Test failure: test_rebuild_inotable (tasks.cephfs.test_data_scan.TestDataScan)

h3. 26 Sep 2023

https://pulpito.ceph.com/?branch=wip-vshankar-testing-reef-20230926.161455

* https://tracker.ceph.com/issues/61400 valgrind+ceph-mon: segmentation fault in rocksdb+tcmalloc
* https://tracker.ceph.com/issues/59531 quincy: "OSD bench result of 228617.361065 IOPS exceeded the threshold limit of 500.000000 IOPS for osd.7. IOPS capacity is unchanged at 315.000000 IOPS. The recommendation is to establish the osd's IOPS capacity using other benchmark tools (e.g. Fio)"
* https://tracker.ceph.com/issues/58126 Test failure: test_fscrypt_dummy_encryption_with_quick_group (tasks.cephfs.test_fscrypt.TestFscrypt)
* https://tracker.ceph.com/issues/62658 error during scrub thrashing: reached maximum tries (31) after waiting for 900 seconds
* https://tracker.ceph.com/issues/62067 ffsb.sh failure "Resource temporarily unavailable"

h3. 2 August 2023

https://pulpito.ceph.com/yuriw-2023-07-28_23:11:59-fs-reef-release-distro-default-smithi/
https://pulpito.ceph.com/yuriw-2023-07-29_14:02:18-fs-reef-release-distro-default-smithi/

* https://tracker.ceph.com/issues/52624 cluster [WRN] Health check failed: Reduced data availability: 1 pg inactive, 1 pg peering (PG_AVAILABILITY)" in cluster log
* https://tracker.ceph.com/issues/54460 Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithi002 with status 1
* https://tracker.ceph.com/issues/62187 iozone: command not found <<< covered the dbench failures
* https://tracker.ceph.com/issues/61400 valgrind+ceph-mon: segmentation fault in rocksdb+tcmalloc
* https://tracker.ceph.com/issues/61574 qa: "[Makefile:59: mdtest] Error 1" (undefined reference to `fi_strerror')
* https://tracker.ceph.com/issues/61399 qa: build failure for ior
* https://tracker.ceph.com/issues/62277 Error: Unable to find a match: python2 with fscrypt tests
* https://tracker.ceph.com/issues/62221 Test failure: test_add_ancestor_and_child_directory (tasks.cephfs.test_mirroring.TestMirroring)
* https://tracker.ceph.com/issues/61182 qa: workloads/cephfs-mirror-ha-workunit - stopping mirror daemon after the test finishes timesout.
* https://tracker.ceph.com/issues/57676 qa: error during scrub thrashing: rank damage found: {'backtrace'}
* https://tracker.ceph.com/issues/57655 qa: fs:mixed-clients kernel_untar_build failure

h3. 31 July 2023

https://pulpito.ceph.com/?branch=wip-yuri-testing-2023-07-25-0833-reef

* https://tracker.ceph.com/issues/61400 valgrind+ceph-mon: segmentation fault in rocksdb+tcmalloc
* https://tracker.ceph.com/issues/61892 Test failure: test_snapshot_remove (tasks.cephfs.test_strays.TestStrays)
* https://tracker.ceph.com/issues/61399 qa: build failure for ior
* https://tracker.ceph.com/issues/62081 tasks/fscrypt-common does not finish, timesout
* https://tracker.ceph.com/issues/62221 Test failure: test_add_ancestor_and_child_directory (tasks.cephfs.test_mirroring.TestMirroring)
* https://tracker.ceph.com/issues/61182 qa: workloads/cephfs-mirror-ha-workunit - stopping mirror daemon after the test finishes timesout.
* https://tracker.ceph.com/issues/57206 ceph_test_libcephfs_reclaim crashes during test
* https://tracker.ceph.com/issues/61574 qa: "[Makefile:59: mdtest] Error 1" (undefined reference to `fi_strerror')
* https://tracker.ceph.com/issues/62187 iozone: command not found

h3. 25 July 2023

https://pulpito.ceph.com/?branch=wip-yuri4-testing-2023-07-06-0738-reef

* https://tracker.ceph.com/issues/51964 qa: test_cephfs_mirror_restart_sync_on_blocklist failure
* https://tracker.ceph.com/issues/61400 valgrind+ceph-mon: segmentation fault in rocksdb+tcmalloc
* https://tracker.ceph.com/issues/58340 mds: fsstress.sh hangs with multimds
* https://tracker.ceph.com/issues/57676 qa: error during scrub thrashing: rank damage found: {'backtrace'}
* https://tracker.ceph.com/issues/52624 cluster [WRN] Health check failed: Reduced data availability: 1 pg inactive, 1 pg peering (PG_AVAILABILITY)" in cluster log
* https://tracker.ceph.com/issues/54460 Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithi002 with status 1
* https://tracker.ceph.com/issues/61892 Test failure: test_snapshot_remove (tasks.cephfs.test_strays.TestStrays)
* https://tracker.ceph.com/issues/58220 Command failed (workunit test fs/quota/quota.sh) on smithi081 with status 1
* https://tracker.ceph.com/issues/61399 qa: build failure for ior
* https://tracker.ceph.com/issues/57655 qa: fs:mixed-clients kernel_untar_build failure
* https://tracker.ceph.com/issues/62081 tasks/fscrypt-common does not finish, timesout

h3. 19 July 2023

NOTE: Results from two runs clubbed into one run entry.

https://pulpito.ceph.com/?branch=wip-yuri8-testing-2023-07-14-0803-reef
https://pulpito.ceph.com/?branch=wip-yuri7-testing-2023-07-14-0803-reef

* https://tracker.ceph.com/issues/51964 qa: test_cephfs_mirror_restart_sync_on_blocklist failure
* https://tracker.ceph.com/issues/58340 mds: fsstress.sh hangs with multimds
* https://tracker.ceph.com/issues/57676 qa: error during scrub thrashing: rank damage found: {'backtrace'}
* https://tracker.ceph.com/issues/54460 Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithi002 with status 1
* https://tracker.ceph.com/issues/61892 Test failure: test_snapshot_remove (tasks.cephfs.test_strays.TestStrays)
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/55332 (fixed - needs backport) Failure in snaptest-git-ceph.sh - see https://tracker.ceph.com/issues/55332#note-41
* https://tracker.ceph.com/issues/61201 qa: test_rebuild_moved_file (tasks/data-scan) fails because mds crashes in pacific
* https://tracker.ceph.com/issues/62076 reef: Test failure: test_grow_shrink (tasks.cephfs.test_failover.TestMultiFilesystems)
* https://tracker.ceph.com/issues/61400 valgrind+ceph-mon: segmentation fault in rocksdb+tcmalloc
* https://tracker.ceph.com/issues/58220 Command failed (workunit test fs/quota/quota.sh) on smithi081 with status 1
* https://tracker.ceph.com/issues/44565 src/mds/SimpleLock.h: 528: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock())
* https://tracker.ceph.com/issues/62081 tasks/fscrypt-common does not finish, timesout

h3. 11 July 2023

https://pulpito.ceph.com/?branch=wip-yuri8-testing-2023-07-06-0909-reef

* https://tracker.ceph.com/issues/57655 qa: fs:mixed-clients kernel_untar_build failure
* https://tracker.ceph.com/issues/52624 cluster [WRN] Health check failed: Reduced data availability: 1 pg inactive, 1 pg peering (PG_AVAILABILITY)" in cluster log
* https://tracker.ceph.com/issues/54460 Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithi002 with status 1
* https://tracker.ceph.com/issues/58340 mds: fsstress.sh hangs with multimds
* https://tracker.ceph.com/issues/61892 Test failure: test_snapshot_remove (tasks.cephfs.test_strays.TestStrays)
* https://tracker.ceph.com/issues/51964 qa: test_cephfs_mirror_restart_sync_on_blocklist failure
* https://tracker.ceph.com/issues/61400 valgrind+ceph-mon: segmentation fault in rocksdb+tcmalloc
* https://tracker.ceph.com/issues/58220 Command failed (workunit test fs/quota/quota.sh) on smithi081 with status 1
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete

h3. 27 Jun 2023

* https://tracker.ceph.com/issues/61182 workloads/cephfs-mirror-ha-workunit: reached maximum tries (50) after waiting for 300 seconds (mirror daemon stop times out)
* https://tracker.ceph.com/issues/61820 During valgrind test: CommandFailedError: Command failed on smithi133 with status 1: 'sudo ceph --cluster ceph osd crush tunables default'
* https://tracker.ceph.com/issues/61399 "*** [Makefile:299: ior] Error 1"
* https://tracker.ceph.com/issues/57655 qa: fs:mixed-clients kernel_untar_build failure
* https://tracker.ceph.com/issues/61574 "[Makefile:59: mdtest] Error 1" (undefined reference to `fi_strerror')
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/52624 cluster [WRN] Health check failed: Reduced data availability: 1 pg inactive, 1 pg peering (PG_AVAILABILITY)" in cluster log
* https://tracker.ceph.com/issues/57206 qa/workunits/libcephfs/test.sh - ceph_test_libcephfs_reclaim core dump - segmentation fault
* https://tracker.ceph.com/issues/59344 qa: workunit test fs/quota/quota.sh failed with "setfattr: .: Invalid argument"

h3. 05 Jun 2023

https://tracker.ceph.com/issues/61515#note-1

Re-run: https://pulpito.ceph.com/yuriw-2023-06-02_15:02:25-fs-reef-release-distro-default-smithi/
Known Failures * https://tracker.ceph.com/issues/54460 Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithi002 with status 1 * https://tracker.ceph.com/issues/59683 fs/fscrypt.sh - Error: Unable to find a match: userspace-rcu-devel libedit-devel device-mapper-devel * https://tracker.ceph.com/issues/57655 qa: fs:mixed-clients kernel_untar_build failure * https://tracker.ceph.com/issues/61399 qa: "[Makefile:299: ior] Error 1" * https://tracker.ceph.com/issues/57676 qa: error during scrub thrashing: rank damage found: {'backtrace'} * https://tracker.ceph.com/issues/61574 [Makefile:59: mdtest] Error 1 (undefined reference to `fi_strerror') * https://tracker.ceph.com/issues/61400 valgrind+ceph-mon: segmentation fault in rocksdb+tcmalloc * https://tracker.ceph.com/issues/48773 qa: scrub does not complete * https://tracker.ceph.com/issues/55332 (fixed - needs backport) Failure in snaptest-git-ceph.sh * https://tracker.ceph.com/issues/61394 mds.a (mds.0) 1 : cluster [WRN] evicting unresponsive client smithi152 (4298), after 303.726 seconds" in cluster log * https://tracker.ceph.com/issues/51279 kclient hangs on umount h4. Packaging/Installation Issues: (smithi 202 issues) * tasks/workunits/snaps Error reimaging machines: Expected smithi202's OS to be rhel 8.6 but found ubuntu 22.04 https://pulpito.ceph.com/yuriw-2023-06-02_15:02:25-fs-reef-release-distro-default-smithi/7294352 * tasks/snapshots Error reimaging machines: 'ssh_keyscan smithi202.front.sepia.ceph.com' reached maximum tries (6) after waiting for 5 seconds https://pulpito.ceph.com/yuriw-2023-06-02_15:02:25-fs-reef-release-distro-default-smithi/7294353 h3. 02 Jun 2023 https://tracker.ceph.com/issues/61515#note-1 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/ h4. 
Known Test Failures * https://tracker.ceph.com/issues/54460 Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithi002 with status 1 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288861 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288876 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288892 * https://tracker.ceph.com/issues/57655 ---------------------------CHECK---------------------------- qa: fs:mixed-clients kernel_untar_build failure https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288880 Revert PR isn't merged to reef. Should we merge it before release ? https://github.com/ceph/ceph/pull/51661 https://github.com/ceph/ceph/pull/51500 * https://tracker.ceph.com/issues/57676 qa: error during scrub thrashing: rank damage found: {'backtrace'} https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288884 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288901 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288941 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288973 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7289012 * https://tracker.ceph.com/issues/52624 cluster [WRN] Health check failed: Reduced data availability: 1 pg inactive, 1 pg peering (PG_AVAILABILITY)" in cluster log https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288896 * https://tracker.ceph.com/issues/59346 qa/workunits/fs/test_python.sh failed with "AssertionError: DiskQuotaExceeded not raised by write" https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288903 * https://tracker.ceph.com/issues/55332 (fixed - needs backport) 
Failure in snaptest-git-ceph.sh https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288905 * https://tracker.ceph.com/issues/59344 qa: workunit test fs/quota/quota.sh failed with "setfattr: .: Invalid argument" https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288914 * https://tracker.ceph.com/issues/61400 valgrind+ceph-mon: segmentation fault in rocksdb+tcmalloc https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288939 * https://tracker.ceph.com/issues/61394 mds.a (mds.0) 1 : cluster [WRN] evicting unresponsive client smithi152 (4298), after 303.726 seconds" in cluster log https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288948 * https://tracker.ceph.com/issues/61182 workloads/cephfs-mirror-ha-workunit: reached maximum tries (50) after waiting for 300 seconds (mirror daemon stop times out) https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288953 * https://tracker.ceph.com/issues/51279 ---------------------------------CHECK------------------------------------------- kclient hangs on umount https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288964 h4. 
Known Make Failures * https://tracker.ceph.com/issues/59683 fs/fscrypt.sh - Error: Unable to find a match: userspace-rcu-devel libedit-devel device-mapper-devel https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288877 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288912 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288995 * https://tracker.ceph.com/issues/61399 qa: "[Makefile:299: ior] Error 1" https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288890 * https://tracker.ceph.com/issues/61574 [Makefile:59: mdtest] Error 1 (undefined reference to `fi_strerror') https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288920 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7289015 h4. Package/Installation Issues * fs/upgrade/mds_upgrade_sequence : Command failed on smithi119 with status 1: 'sudo yum install -y kernel' 2023-05-28T18:58:30.824 INFO:teuthology.orchestra.run.smithi119.stderr:FATAL ERROR: python callback ??? failed, aborting! https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288908 * Command failed on smithi119 with status 100: 'sudo apt-get clean' 2023-05-28T19:06:41.566 INFO:teuthology.orchestra.run.smithi119.stderr:E: Could not get lock /var/cache/apt/archives/lock. 
It is held by process 2127 (apt-get) 2023-05-28T19:06:41.566 INFO:teuthology.orchestra.run.smithi119.stderr:E: Unable to lock directory /var/cache/apt/archives/ https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288928 * Command failed on smithi119 with status 2: 'sudo dpkg -i /tmp/linux-image.deb' 2023-05-28T19:18:49.306 DEBUG:teuthology.orchestra.run.smithi119:> sudo dpkg -i /tmp/linux-image.deb 2023-05-28T19:18:49.315 INFO:teuthology.orchestra.run.smithi119.stderr:dpkg: error: dpkg frontend lock was locked by another process with pid 2205 2023-05-28T19:18:49.315 INFO:teuthology.orchestra.run.smithi119.stderr:Note: removing the lock file is always wrong, and can end up damaging the 2023-05-28T19:18:49.315 INFO:teuthology.orchestra.run.smithi119.stderr:locked area and the entire system. See <https://wiki.debian.org/Teams/Dpkg/FAQ>. 2023-05-28T19:18:49.316 DEBUG:teuthology.orchestra.run:got remote process result: 2 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288942 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288970 * Command failed on smithi119 with status 100: 'sudo apt-get clean' 2023-05-28T19:41:50.171 DEBUG:teuthology.orchestra.run.smithi119:> sudo apt-get clean ... 2023-05-28T19:41:50.398 INFO:teuthology.orchestra.run.smithi119.stderr:E: Could not get lock /var/cache/apt/archives/lock. 
It is held by process 2120 (apt-get) 2023-05-28T19:41:50.399 INFO:teuthology.orchestra.run.smithi119.stderr:E: Unable to lock directory /var/cache/apt/archives/ 2023-05-28T19:41:50.400 DEBUG:teuthology.orchestra.run:got remote process result: 100 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288987 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7289009 * tasks/mds-flush Command failed on smithi119 with status 1: 'sudo yum install -y kernel' 2023-05-28T19:49:47.762 INFO:teuthology.orchestra.run.smithi119.stderr:[Errno 2] No such file or directory: '/var/cache/dnf/baseos-055ffcb2ec25a27f/packages/kernel-4.18.0-492.el8.x86_64.rpm' https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288998 h4. ssh connection to smithi119 was lost FAILED: * Failed to connect to the host via ssh: ssh: connect to host smithi119.front.sepia.ceph.com port 22: No route to host https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7289021 DEAD: * SSH connection to smithi119 was lost: 'sudo grub2-mkconfig -o /boot/grub2/grub.cfg' https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288909 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288996 * SSH connection to smithi119 was lost: 'sudo DEBIAN_FRONTEND=noninteractive apt-get -y install linux-image-generic' https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288926 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288938 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288958 https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7289008 * SSH connection to smithi119 was lost: 'sudo update-grub' 
https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7288982 h4. Others * https://tracker.ceph.com/issues/61575 task/mds_thrash.py - assert manager.is_clean() https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7289030/ * https://tracker.ceph.com/issues/61576 Failed to manage policy for boolean nagios_run_sudo: [Errno 11] Resource temporarily unavailable https://pulpito.ceph.com/yuriw-2023-05-28_14:46:14-fs-reef-release-distro-default-smithi/7289016 h3. 23 May 2023 https://pulpito.ceph.com/yuriw-2023-05-22_14:44:12-fs-wip-yuri3-testing-2023-05-21-0740-reef-distro-default-smithi/ * https://tracker.ceph.com/issues/54460 Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithi002 with status 1 * https://tracker.ceph.com/issues/59683 fs/fscrypt.sh - Error: Unable to find a match: userspace-rcu-devel libedit-devel device-mapper-devel * https://tracker.ceph.com/issues/57655 qa: fs:mixed-clients kernel_untar_build failure * https://tracker.ceph.com/issues/57676 qa: error during scrub thrashing: rank damage found: {'backtrace'} * https://tracker.ceph.com/issues/52624 cluster [WRN] Health check failed: Reduced data availability: 1 pg inactive, 1 pg peering (PG_AVAILABILITY)" in cluster log * https://tracker.ceph.com/issues/56695 (fixed - backport not yet merged in reef) Command failed (workunit test suites/pjd.sh) * https://tracker.ceph.com/issues/58340 mds: fsstress.sh hangs with multimds * https://tracker.ceph.com/issues/55332 (fixed - needs backport) Failure in snaptest-git-ceph.sh * https://tracker.ceph.com/issues/61358 osd.2 (osd.2) 3 : cluster [WRN] 1 slow requests (by type [ 'delayed' : 1 ] most affected pool [ 'cephfs_data' : 1 ])" in cluster log * https://tracker.ceph.com/issues/61182 workloads/cephfs-mirror-ha-workunit: reached maximum tries (50) after waiting for 300 seconds (mirror daemon stop times out) * https://tracker.ceph.com/issues/59297 qa: test_join_fs_unset 
failure * https://tracker.ceph.com/issues/51282 cluster [WRN] Health check failed: Degraded data redundancy: 9/46 objects degraded (19.565%), 4 pgs degraded (PG_DEGRADED)" in cluster log h3. 19 May 2023 https://pulpito.ceph.com/yuriw-2023-05-10_18:53:39-fs-wip-yuri3-testing-2023-05-10-0851-reef-distro-default-smithi/ https://pulpito.ceph.com/yuriw-2023-05-15_15:47:06-fs-wip-yuri3-testing-2023-05-10-0851-reef-distro-default-smithi/ * https://tracker.ceph.com/issues/57655 qa: fs:mixed-clients kernel_untar_build failure * https://tracker.ceph.com/issues/54460 Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithi002 with status 1 * https://tracker.ceph.com/issues/61265 (NEW - not related) tasks.cephfs.fuse_mount:process failed to terminate after unmount * https://tracker.ceph.com/issues/56446 (fixed - reef backport yet to be merged) Test failure: test_client_cache_size (tasks.cephfs.test_client_limits.TestClientLimits) * https://tracker.ceph.com/issues/52624 cluster [WRN] Health check failed: Reduced data availability: 1 pg inactive, 1 pg peering (PG_AVAILABILITY)" in cluster log * https://tracker.ceph.com/issues/55332 (fixed - needs backport) Failure in snaptest-git-ceph.sh * https://tracker.ceph.com/issues/59667 cluster [WRN] client could not reconnect as file system flag refuse_client_session is set" in cluster log * https://tracker.ceph.com/issues/56506 Test failure: test_rebuild_backtraceless (tasks.cephfs.test_data_scan.TestDataScan) * https://tracker.ceph.com/issues/61278 (NEW - teuthology bug) Config file not found: "/home/teuthworker/src/github.com_ceph_ceph-c_ef2746e02c24971e8f0b792fa43f63ba4ae04ec2/qa/tasks/cephadm.conf". 
* https://tracker.ceph.com/issues/51282 cluster [WRN] Health check failed: Degraded data redundancy: 9/46 objects degraded (19.565%), 4 pgs degraded (PG_DEGRADED)" in cluster log
* https://tracker.ceph.com/issues/59683 fs/fscrypt.sh - Error: Unable to find a match: userspace-rcu-devel libedit-devel device-mapper-devel
* https://tracker.ceph.com/issues/48773 workunit/postgres - error during scrub thrashing: reached maximum tries (30) after waiting for 900 seconds
* https://tracker.ceph.com/issues/61279 qa: test_dirfrag_limit (tasks.cephfs.test_strays.TestStrays) failed

h3. 17 May 2023

https://pulpito.ceph.com/yuriw-2023-05-15_15:22:39-fs-wip-yuri6-testing-2023-04-26-1247-reef-distro-default-smithi/

* https://tracker.ceph.com/issues/59683 fs/fscrypt.sh - Error: Unable to find a match: userspace-rcu-devel libedit-devel device-mapper-devel
* https://tracker.ceph.com/issues/54460 Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithi002 with status 1
* https://tracker.ceph.com/issues/61400 valgrind+ceph-mon: segmentation fault in rocksdb+tcmalloc
* https://tracker.ceph.com/issues/56446 (fixed - reef backport yet to be merged) Test failure: test_client_cache_size (tasks.cephfs.test_client_limits.TestClientLimits)
* https://tracker.ceph.com/issues/61182 workloads/cephfs-mirror-ha-workunit: reached maximum tries (50) after waiting for 300 seconds (mirror daemon stop times out)
* https://tracker.ceph.com/issues/51964 Test failure: test_cephfs_mirror_restart_sync_on_blocklist (tasks.cephfs.test_mirroring.TestMirroring)
* https://tracker.ceph.com/issues/55332 (fixed - needs backport) Failure in snaptest-git-ceph.sh
* https://tracker.ceph.com/issues/56695 (fixed - backport not yet merged in reef) Command failed (workunit test suites/pjd.sh)
* https://tracker.ceph.com/issues/58340 mds: fsstress.sh hangs with multimds
* https://tracker.ceph.com/issues/48773 workunit/postgres - error during scrub thrashing: reached maximum tries (30) after waiting for 900 seconds
* https://tracker.ceph.com/issues/52624 cluster [WRN] Health check failed: Reduced data availability: 1 pg inactive, 1 pg peering (PG_AVAILABILITY)" in cluster log
* https://tracker.ceph.com/issues/57656 [testing] dbench: write failed on handle 10009 (Resource temporarily unavailable)
* https://tracker.ceph.com/issues/53302 tasks/fscrypt-iozone - Command failed on smithi079 with status 3: 'sudo logrotate /etc/logrotate.d/ceph-test.conf'

h3. 16 May 2023

https://pulpito.ceph.com/yuriw-2023-05-09_19:37:41-fs-wip-yuri10-testing-2023-05-08-0849-reef-distro-default-smithi/

* https://tracker.ceph.com/issues/58126 [kclient bug] Test failure: test_fscrypt_dummy_encryption_with_quick_group (tasks.cephfs.test_fscrypt.TestFscrypt)
* https://tracker.ceph.com/issues/61182 workloads/cephfs-mirror-ha-workunit: reached maximum tries (50) after waiting for 300 seconds (mirror daemon stop times out)
* https://tracker.ceph.com/issues/47292 Test failure: test_df_for_valid_file (tasks.cephfs.test_cephfs_shell.TestDF)
* https://tracker.ceph.com/issues/51964 Test failure: test_cephfs_mirror_restart_sync_on_blocklist (tasks.cephfs.test_mirroring.TestMirroring)
* https://tracker.ceph.com/issues/59667 cluster [WRN] client could not reconnect as file system flag refuse_client_session is set" in cluster log
* https://tracker.ceph.com/issues/52624 cluster [WRN] Health check failed: Reduced data availability: 1 pg inactive, 1 pg peering (PG_AVAILABILITY)" in cluster log
* https://tracker.ceph.com/issues/61400 valgrind+ceph-mon: segmentation fault in rocksdb+tcmalloc
* https://tracker.ceph.com/issues/48773 workunit/postgres - error during scrub thrashing: reached maximum tries (30) after waiting for 900 seconds
* https://tracker.ceph.com/issues/57206 qa/workunits/libcephfs/test.sh - ceph_test_libcephfs_reclaim core dump - segmentation fault
* https://tracker.ceph.com/issues/38704 (very old one, resolved - ignore in test case) tasks/cephfs_misc_tests - cluster [WRN] Health check failed: 1/3 mons down, quorum a,b (MON_DOWN)" in cluster log
* https://tracker.ceph.com/issues/54460 Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithi099 with status 1
* https://tracker.ceph.com/issues/53302 tasks/fscrypt-iozone - Command failed on smithi079 with status 3: 'sudo logrotate /etc/logrotate.d/ceph-test.conf'
tasks/cfuse_workunit_suites_blogbench traceless/50pc - Command failed on smithi133 with status 3: 'sudo logrotate /etc/logrotate.d/ceph-test.conf'
* https://tracker.ceph.com/issues/58220 Command failed (workunit test fs/quota/quota.sh) on smithixxx with status 1
* https://tracker.ceph.com/issues/58340 mds: fsstress.sh hangs with multimds
* https://tracker.ceph.com/issues/59683 fs/fscrypt.sh - Error: Unable to find a match: userspace-rcu-devel libedit-devel device-mapper-devel