h1. MAIN

h3. placeholder

h3. 10 Oct 2022

http://pulpito.front.sepia.ceph.com/rishabh-2022-09-30_19:45:21-fs-wip-rishabh-testing-30Sep2022-testing-default-smithi/

reruns
* fs-thrash, passed: http://pulpito.front.sepia.ceph.com/rishabh-2022-10-04_13:19:47-fs-wip-rishabh-testing-30Sep2022-testing-default-smithi/
* fs-verify, passed: http://pulpito.front.sepia.ceph.com/rishabh-2022-10-05_12:25:37-fs-wip-rishabh-testing-30Sep2022-testing-default-smithi/
* cephadm failures also passed after many re-runs: http://pulpito.front.sepia.ceph.com/rishabh-2022-10-06_13:50:51-fs-wip-rishabh-testing-30Sep2022-2-testing-default-smithi/
** needed this PR to be merged into the ceph-ci branch

* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/50223 client.xxxx isn't responding to mclientcaps(revoke)
* https://tracker.ceph.com/issues/57299 qa: test_dump_loads fails with JSONDecodeError
* https://tracker.ceph.com/issues/57655 [Exists in main as well] qa: fs:mixed-clients kernel_untar_build failure
* https://tracker.ceph.com/issues/57206 libcephfs/test.sh: ceph_test_libcephfs_reclaim

h3. 2022 Sep 29

http://pulpito.front.sepia.ceph.com/rishabh-2022-09-14_12:48:43-fs-wip-rishabh-testing-2022Sep9-1708-testing-default-smithi/

* https://tracker.ceph.com/issues/55804 Command failed (workunit test suites/pjd.sh)
* https://tracker.ceph.com/issues/36593 Command failed (workunit test fs/quota/quota.sh) on smithixxx with status 1
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/51964 qa: test_cephfs_mirror_restart_sync_on_blocklist failure
* https://tracker.ceph.com/issues/56632 Test failure: test_subvolume_snapshot_clone_quota_exceeded
* https://tracker.ceph.com/issues/50821 qa: untar_snap_rm failure during mds thrashing

h3. 2022 Sep 26

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20220923.171109

* https://tracker.ceph.com/issues/55804 qa failure: pjd link tests failed
* https://tracker.ceph.com/issues/57676 qa: error during scrub thrashing: rank damage found: {'backtrace'}
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/57580 Test failure: test_newops_getvxattr (tasks.cephfs.test_newops.TestNewOps)
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/57299 qa: test_dump_loads fails with JSONDecodeError
* https://tracker.ceph.com/issues/57280 qa: tasks/kernel_cfuse_workunits_untarbuild_blogbench fails - Failed to fetch package version from shaman
* https://tracker.ceph.com/issues/57205 Test failure: test_subvolume_group_ls_filter_internal_directories (tasks.cephfs.test_volumes.TestSubvolumeGroups)
* https://tracker.ceph.com/issues/57656 [testing] dbench: write failed on handle 10009 (Resource temporarily unavailable)
* https://tracker.ceph.com/issues/57677 qa: "1 MDSs behind on trimming (MDS_TRIM)"
* https://tracker.ceph.com/issues/57206 libcephfs/test.sh: ceph_test_libcephfs_reclaim
* https://tracker.ceph.com/issues/57446 qa: test_subvolume_snapshot_info_if_orphan_clone fails
* https://tracker.ceph.com/issues/57655 [Exists in main as well] qa: fs:mixed-clients kernel_untar_build failure
* https://tracker.ceph.com/issues/57682 client: ERROR: test_reconnect_after_blocklisted
h3. 2022 Sep 22

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20220920.234701

* https://tracker.ceph.com/issues/57299 qa: test_dump_loads fails with JSONDecodeError
* https://tracker.ceph.com/issues/57205 Test failure: test_subvolume_group_ls_filter_internal_directories (tasks.cephfs.test_volumes.TestSubvolumeGroups)
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/57580 Test failure: test_newops_getvxattr (tasks.cephfs.test_newops.TestNewOps)
* https://tracker.ceph.com/issues/57280 qa: tasks/kernel_cfuse_workunits_untarbuild_blogbench fails - Failed to fetch package version from shaman
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/56446 Test failure: test_client_cache_size (tasks.cephfs.test_client_limits.TestClientLimits)
* https://tracker.ceph.com/issues/57206 libcephfs/test.sh: ceph_test_libcephfs_reclaim
* https://tracker.ceph.com/issues/51267 CommandFailedError: Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithi096 with status 1:...

NEW:
* https://tracker.ceph.com/issues/57656 [testing] dbench: write failed on handle 10009 (Resource temporarily unavailable)
* https://tracker.ceph.com/issues/57655 [Exists in main as well] qa: fs:mixed-clients kernel_untar_build failure
* https://tracker.ceph.com/issues/57657 mds: scrub locates mismatch between child accounted_rstats and self rstats

Segfault probably caused by: https://github.com/ceph/ceph/pull/47795#issuecomment-1255724799

h3. 2022 Sep 16

https://pulpito.ceph.com/?branch=wip-vshankar-testing1-20220905-132828

* https://tracker.ceph.com/issues/57446 qa: test_subvolume_snapshot_info_if_orphan_clone fails
* https://tracker.ceph.com/issues/57299 qa: test_dump_loads fails with JSONDecodeError
* https://tracker.ceph.com/issues/50223 client.xxxx isn't responding to mclientcaps(revoke)
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/57205 Test failure: test_subvolume_group_ls_filter_internal_directories (tasks.cephfs.test_volumes.TestSubvolumeGroups)
* https://tracker.ceph.com/issues/57280 qa: tasks/kernel_cfuse_workunits_untarbuild_blogbench fails - Failed to fetch package version from shaman
* https://tracker.ceph.com/issues/51282 pybind/mgr/mgr_util: .mgr pool may be created too early causing spurious PG_DEGRADED warnings
* https://tracker.ceph.com/issues/48203 qa: quota failure
* https://tracker.ceph.com/issues/36593 qa: quota failure caused by clients stepping on each other
* https://tracker.ceph.com/issues/57580 Test failure: test_newops_getvxattr (tasks.cephfs.test_newops.TestNewOps)
h3. 2022 Aug 26

http://pulpito.front.sepia.ceph.com/rishabh-2022-08-22_17:49:59-fs-wip-rishabh-testing-2022Aug19-testing-default-smithi/
http://pulpito.front.sepia.ceph.com/rishabh-2022-08-24_11:56:51-fs-wip-rishabh-testing-2022Aug19-testing-default-smithi/

* https://tracker.ceph.com/issues/57206 libcephfs/test.sh: ceph_test_libcephfs_reclaim
* https://tracker.ceph.com/issues/56632 Test failure: test_subvolume_snapshot_clone_quota_exceeded (tasks.cephfs.test_volumes.TestSubvolumeSnapshotClones)
* https://tracker.ceph.com/issues/56446 Test failure: test_client_cache_size (tasks.cephfs.test_client_limits.TestClientLimits)
* https://tracker.ceph.com/issues/51964 qa: test_cephfs_mirror_restart_sync_on_blocklist failure
* https://tracker.ceph.com/issues/53859 qa: Test failure: test_pool_perm (tasks.cephfs.test_pool_perm.TestPoolPerm)
* https://tracker.ceph.com/issues/54460 Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithixxx with status 1
* https://tracker.ceph.com/issues/54462 Command failed (workunit test fs/snaps/snaptest-git-ceph.sh) on smithi055 with status 128
* https://tracker.ceph.com/issues/36593 Command failed (workunit test fs/quota/quota.sh) on smithixxx with status 1
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/55804 Command failed (workunit test suites/pjd.sh)
* https://tracker.ceph.com/issues/50223 client.xxxx isn't responding to mclientcaps(revoke)

h3. 2022 Aug 22

https://pulpito.ceph.com/vshankar-2022-08-12_09:34:24-fs-wip-vshankar-testing1-20220812-072441-testing-default-smithi/
https://pulpito.ceph.com/vshankar-2022-08-18_04:30:42-fs-wip-vshankar-testing1-20220818-082047-testing-default-smithi/ (drop problematic PR and re-run)

* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/56446 Test failure: test_client_cache_size (tasks.cephfs.test_client_limits.TestClientLimits)
* https://tracker.ceph.com/issues/55804 Command failed (workunit test suites/pjd.sh)
* https://tracker.ceph.com/issues/51278 mds: "FAILED ceph_assert(!segments.empty())"
* https://tracker.ceph.com/issues/54460 Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithixxx with status 1
* https://tracker.ceph.com/issues/57205 Test failure: test_subvolume_group_ls_filter_internal_directories (tasks.cephfs.test_volumes.TestSubvolumeGroups)
* https://tracker.ceph.com/issues/57206 ceph_test_libcephfs_reclaim crashes during test
* https://tracker.ceph.com/issues/53859 Test failure: test_pool_perm (tasks.cephfs.test_pool_perm.TestPoolPerm)
* https://tracker.ceph.com/issues/50223 client.xxxx isn't responding to mclientcaps(revoke)
h3. 2022 Aug 12

https://pulpito.ceph.com/vshankar-2022-08-10_04:06:00-fs-wip-vshankar-testing-20220805-190751-testing-default-smithi/
https://pulpito.ceph.com/vshankar-2022-08-11_12:16:58-fs-wip-vshankar-testing-20220811-145809-testing-default-smithi/ (drop problematic PR and re-run)

* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/56446 Test failure: test_client_cache_size (tasks.cephfs.test_client_limits.TestClientLimits)
* https://tracker.ceph.com/issues/51964 qa: test_cephfs_mirror_restart_sync_on_blocklist failure
* https://tracker.ceph.com/issues/55804 Command failed (workunit test suites/pjd.sh)
* https://tracker.ceph.com/issues/50223 client.xxxx isn't responding to mclientcaps(revoke)
* https://tracker.ceph.com/issues/50821 qa: untar_snap_rm failure during mds thrashing
* https://tracker.ceph.com/issues/54460 Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithixxx with status 1

h3. 2022 Aug 04

https://pulpito.ceph.com/?branch=wip-vshankar-testing1-20220804-123835 (only mgr/volumes, mgr/stats)

Unrelated teuthology failure on rhel

h3. 2022 Jul 25

http://pulpito.front.sepia.ceph.com/rishabh-2022-07-22_11:34:20-fs-wip-rishabh-testing-2022Jul22-1400-testing-default-smithi/

1st re-run: http://pulpito.front.sepia.ceph.com/rishabh-2022-07-24_03:51:19-fs-wip-rishabh-testing-2022Jul22-1400-testing-default-smithi
2nd re-run: http://pulpito.front.sepia.ceph.com/rishabh-2022-07-24_08:53:36-fs-wip-rishabh-testing-2022Jul22-1400-testing-default-smithi/
3rd re-run: http://pulpito.front.sepia.ceph.com/rishabh-2022-07-24_08:53:36-fs-wip-rishabh-testing-2022Jul22-1400-testing-default-smithi/
4th (final) re-run: http://pulpito.front.sepia.ceph.com/rishabh-2022-07-28_03:59:01-fs-wip-rishabh-testing-2022Jul28-0143-testing-default-smithi/

* https://tracker.ceph.com/issues/55804 Command failed (workunit test suites/pjd.sh)
* https://tracker.ceph.com/issues/50223 client.xxxx isn't responding to mclientcaps(revoke)
* https://tracker.ceph.com/issues/54460 Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithixxx with status 1
* https://tracker.ceph.com/issues/36593 Command failed (workunit test fs/quota/quota.sh) on smithixxx with status 1
* https://tracker.ceph.com/issues/54462 Command failed (workunit test fs/snaps/snaptest-git-ceph.sh) on smithi055 with status 128

h3. 2022 July 22

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20220721.235756

MDS_HEALTH_DUMMY error in log fixed by followup commit.
transient selinux ping failure

* https://tracker.ceph.com/issues/56694 qa: avoid blocking forever on hung umount
* https://tracker.ceph.com/issues/56695 [RHEL stock] pjd test failures
* https://tracker.ceph.com/issues/56696 admin keyring disappears during qa run
* https://tracker.ceph.com/issues/56697 qa: fs/snaps fails for fuse
* https://tracker.ceph.com/issues/50222 osd: 5.2s0 deep-scrub : stat mismatch
* https://tracker.ceph.com/issues/56698 client: FAILED ceph_assert(_size == 0)
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
h3. 2022 Jul 15

http://pulpito.front.sepia.ceph.com/rishabh-2022-07-08_23:53:34-fs-wip-rishabh-testing-2022Jul08-1820-testing-default-smithi/
re-run: http://pulpito.front.sepia.ceph.com/rishabh-2022-07-15_06:42:04-fs-wip-rishabh-testing-2022Jul08-1820-testing-default-smithi/

* https://tracker.ceph.com/issues/53859 Test failure: test_pool_perm (tasks.cephfs.test_pool_perm.TestPoolPerm)
* https://tracker.ceph.com/issues/55804 Command failed (workunit test suites/pjd.sh)
* https://tracker.ceph.com/issues/50223 client.xxxx isn't responding to mclientcaps(revoke)
* https://tracker.ceph.com/issues/50222 osd: deep-scrub : stat mismatch
* https://tracker.ceph.com/issues/56632 Test failure: test_subvolume_snapshot_clone_quota_exceeded (tasks.cephfs.test_volumes.TestSubvolumeSnapshotClones)
* https://tracker.ceph.com/issues/56634 workunit test fs/snaps/snaptest-intodir.sh
* https://tracker.ceph.com/issues/56644 Test failure: test_rapid_creation (tasks.cephfs.test_fragment.TestFragmentation)

h3. 2022 July 05

http://pulpito.front.sepia.ceph.com/rishabh-2022-07-02_14:14:52-fs-wip-rishabh-testing-20220702-1631-testing-default-smithi/

On 1st re-run some jobs passed - http://pulpito.front.sepia.ceph.com/rishabh-2022-07-03_15:10:28-fs-wip-rishabh-testing-20220702-1631-distro-default-smithi/
On 2nd re-run only a few jobs failed - http://pulpito.front.sepia.ceph.com/rishabh-2022-07-06_05:24:29-fs-wip-rishabh-testing-20220705-2132-distro-default-smithi/

* https://tracker.ceph.com/issues/56446 Test failure: test_client_cache_size (tasks.cephfs.test_client_limits.TestClientLimits)
* https://tracker.ceph.com/issues/55804 Command failed (workunit test suites/pjd.sh) on smithi047 with status 1: 'mkdir -p -- /home/ubuntu/cephtest/
* https://tracker.ceph.com/issues/56445 Command failed on smithi080 with status 123: "find /home/ubuntu/cephtest/archive/syslog -name '*.log' -print0 | sudo xargs -0 --no-run-if-empty -- gzip --"
* https://tracker.ceph.com/issues/51267 Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithi098 with status 1: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1
* https://tracker.ceph.com/issues/50224 Test failure: test_mirroring_init_failure_with_recovery (tasks.cephfs.test_mirroring.TestMirroring)

h3. 2022 July 04

https://pulpito.ceph.com/vshankar-2022-06-29_09:19:00-fs-wip-vshankar-testing-20220627-100931-testing-default-smithi/
(rhel runs were borked due to: https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/JSZQFUKVLDND4W33PXDGCABPHNSPT6SS/, tests ran with --filter-out=rhel)

* https://tracker.ceph.com/issues/56445 Command failed on smithi162 with status 123: "find /home/ubuntu/cephtest/archive/syslog -name '*.log' -print0 | sudo xargs -0 --no-run-if-empty -- gzip --"
* https://tracker.ceph.com/issues/56446 Test failure: test_client_cache_size (tasks.cephfs.test_client_limits.TestClientLimits)
* https://tracker.ceph.com/issues/51964 qa: test_cephfs_mirror_restart_sync_on_blocklist failure
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
h3. 2022 June 20

https://pulpito.ceph.com/vshankar-2022-06-15_04:03:39-fs-wip-vshankar-testing1-20220615-072516-testing-default-smithi/
https://pulpito.ceph.com/vshankar-2022-06-19_08:22:46-fs-wip-vshankar-testing1-20220619-102531-testing-default-smithi/

* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/55804 qa failure: pjd link tests failed
* https://tracker.ceph.com/issues/54108 qa: iogen workunit: "The following counters failed to be set on mds daemons: {'mds.exported', 'mds.imported'}"
* https://tracker.ceph.com/issues/55332 Failure in snaptest-git-ceph.sh (it's an async unlink/create bug)

h3. 2022 June 13

https://pulpito.ceph.com/pdonnell-2022-06-12_05:08:12-fs:workload-wip-pdonnell-testing-20220612.004943-distro-default-smithi/

* https://tracker.ceph.com/issues/56024 cephadm: removes ceph.conf during qa run causing command failure
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/56012 mds: src/mds/MDLog.cc: 283: FAILED ceph_assert(!mds->is_any_replay())

h3. 2022 Jun 13

https://pulpito.ceph.com/vshankar-2022-06-07_00:25:50-fs-wip-vshankar-testing-20220606-223254-testing-default-smithi/
https://pulpito.ceph.com/vshankar-2022-06-10_01:04:46-fs-wip-vshankar-testing-20220609-175550-testing-default-smithi/

* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/51964 qa: test_cephfs_mirror_restart_sync_on_blocklist failure
* https://tracker.ceph.com/issues/53859 qa: Test failure: test_pool_perm (tasks.cephfs.test_pool_perm.TestPoolPerm)
* https://tracker.ceph.com/issues/55804 qa failure: pjd link tests failed
* https://tracker.ceph.com/issues/56003 client: src/include/xlist.h: 81: FAILED ceph_assert(_size == 0)
* https://tracker.ceph.com/issues/56011 fs/thrash: snaptest-snap-rm-cmp.sh fails in md5sum comparison
* https://tracker.ceph.com/issues/56012 mds: src/mds/MDLog.cc: 283: FAILED ceph_assert(!mds->is_any_replay())

h3. 2022 Jun 07

https://pulpito.ceph.com/vshankar-2022-06-06_21:25:41-fs-wip-vshankar-testing1-20220606-230129-testing-default-smithi/
https://pulpito.ceph.com/vshankar-2022-06-07_10:53:31-fs-wip-vshankar-testing1-20220607-104134-testing-default-smithi/ (rerun after dropping a problematic PR)

* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/50224 qa: test_mirroring_init_failure_with_recovery failure
h3. 2022 May 12

https://pulpito.ceph.com/?branch=wip-vshankar-testing-20220509-125847
https://pulpito.ceph.com/vshankar-2022-05-13_17:09:16-fs-wip-vshankar-testing-20220513-120051-testing-default-smithi/ (drop PRs + rerun)

* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/55332 Failure in snaptest-git-ceph.sh
* https://tracker.ceph.com/issues/53859 qa: Test failure: test_pool_perm (tasks.cephfs.test_pool_perm.TestPoolPerm)
* https://tracker.ceph.com/issues/55538 Test failure: test_flush (tasks.cephfs.test_readahead.TestReadahead)
* https://tracker.ceph.com/issues/55258 lots of "heartbeat_check: no reply from X.X.X.X" in OSD logs (crops up again, though very infrequent)

h3. 2022 May 04

https://pulpito.ceph.com/vshankar-2022-05-01_13:18:44-fs-wip-vshankar-testing1-20220428-204527-testing-default-smithi/
https://pulpito.ceph.com/vshankar-2022-05-02_16:58:59-fs-wip-vshankar-testing1-20220502-201957-testing-default-smithi/ (after dropping PRs)

* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/55332 Failure in snaptest-git-ceph.sh
* https://tracker.ceph.com/issues/53859 qa: Test failure: test_pool_perm (tasks.cephfs.test_pool_perm.TestPoolPerm)
* https://tracker.ceph.com/issues/55516 qa: fs suite tests failing with "json.decoder.JSONDecodeError: Extra data: line 2 column 82 (char 82)"
* https://tracker.ceph.com/issues/55537 mds: crash during fs:upgrade test
* https://tracker.ceph.com/issues/55538 Test failure: test_flush (tasks.cephfs.test_readahead.TestReadahead)

h3. 2022 Apr 25

https://pulpito.ceph.com/?branch=wip-vshankar-testing-20220420-113951 (owner vshankar)

* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/55258 lots of "heartbeat_check: no reply from X.X.X.X" in OSD logs
* https://tracker.ceph.com/issues/55377 kclient: mds revoke Fwb caps stuck after the kclient tries writeback once

h3. 2022 Apr 14

https://pulpito.ceph.com/?branch=wip-vshankar-testing1-20220411-144044

* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/52438 qa: ffsb timeout
* https://tracker.ceph.com/issues/55170 mds: crash during rejoin (CDir::fetch_keys)
* https://tracker.ceph.com/issues/55331 pjd failure
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/55332 Failure in snaptest-git-ceph.sh
* https://tracker.ceph.com/issues/55258 lots of "heartbeat_check: no reply from X.X.X.X" in OSD logs
h3. 2022 Apr 11

https://pulpito.ceph.com/?branch=wip-vshankar-testing-55110-20220408-203242

* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/52438 qa: ffsb timeout
* https://tracker.ceph.com/issues/48680 mds: scrubbing stuck "scrub active (0 inodes in the stack)"
* https://tracker.ceph.com/issues/55236 qa: fs/snaps tests fails with "hit max job timeout"
* https://tracker.ceph.com/issues/54108 qa: iogen workunit: "The following counters failed to be set on mds daemons: {'mds.exported', 'mds.imported'}"
* https://tracker.ceph.com/issues/54971 Test failure: test_perf_stats_stale_metrics (tasks.cephfs.test_mds_metrics.TestMDSMetrics)
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/55258 lots of "heartbeat_check: no reply from X.X.X.X" in OSD logs

h3. 2022 Mar 21

https://pulpito.ceph.com/vshankar-2022-03-20_02:16:37-fs-wip-vshankar-testing-20220319-163539-testing-default-smithi/

Run didn't go well, lots of failures - debugging by dropping PRs and running against master branch. Only merging unrelated PRs that pass tests.

h3. 2022 Mar 08

https://pulpito.ceph.com/vshankar-2022-02-28_04:32:15-fs-wip-vshankar-testing-20220226-211550-testing-default-smithi/

rerun with
- (drop) https://github.com/ceph/ceph/pull/44679
- (drop) https://github.com/ceph/ceph/pull/44958

https://pulpito.ceph.com/vshankar-2022-03-06_14:47:51-fs-wip-vshankar-testing-20220304-132102-testing-default-smithi/

* https://tracker.ceph.com/issues/54419 (new) `ceph orch upgrade start` seems to never reach completion
* https://tracker.ceph.com/issues/51964 qa: test_cephfs_mirror_restart_sync_on_blocklist failure
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/52438 qa: ffsb timeout
* https://tracker.ceph.com/issues/50821 qa: untar_snap_rm failure during mds thrashing

h3. 2022 Feb 09

https://pulpito.ceph.com/vshankar-2022-02-05_17:27:49-fs-wip-vshankar-testing-20220201-113815-testing-default-smithi/

rerun with
- (drop) https://github.com/ceph/ceph/pull/37938
- (drop) https://github.com/ceph/ceph/pull/44335
- (drop) https://github.com/ceph/ceph/pull/44491
- (drop) https://github.com/ceph/ceph/pull/44501

https://pulpito.ceph.com/vshankar-2022-02-08_14:27:29-fs-wip-vshankar-testing-20220208-181241-testing-default-smithi/

* https://tracker.ceph.com/issues/51964 qa: test_cephfs_mirror_restart_sync_on_blocklist failure
* https://tracker.ceph.com/issues/54066 test_subvolume_no_upgrade_v1_sanity fails with `AssertionError: 1000 != 0`
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/52438 qa: ffsb timeout
h3. 2022 Feb 01

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20220127.171526

* https://tracker.ceph.com/issues/54107 kclient: hang during umount
* https://tracker.ceph.com/issues/54106 kclient: hang during workunit cleanup
* https://tracker.ceph.com/issues/54108 qa: iogen workunit: "The following counters failed to be set on mds daemons: {'mds.exported', 'mds.imported'}"
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/52438 qa: ffsb timeout

h3. 2022 Jan 13

https://pulpito.ceph.com/vshankar-2022-01-06_13:18:41-fs-wip-vshankar-testing-20220106-145819-testing-default-smithi/

rerun with:
- (add) https://github.com/ceph/ceph/pull/44570
- (drop) https://github.com/ceph/ceph/pull/43184

https://pulpito.ceph.com/vshankar-2022-01-13_04:42:40-fs-wip-vshankar-testing-20220106-145819-testing-default-smithi/

* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/51282 pybind/mgr/mgr_util: .mgr pool may be created too early causing spurious PG_DEGRADED warnings
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/53859 qa: Test failure: test_pool_perm (tasks.cephfs.test_pool_perm.TestPoolPerm)

h3. 2022 Jan 03

https://pulpito.ceph.com/vshankar-2021-12-22_07:37:44-fs-wip-vshankar-testing-20211216-114012-testing-default-smithi/
https://pulpito.ceph.com/vshankar-2022-01-03_12:27:45-fs-wip-vshankar-testing-20220103-142738-testing-default-smithi/ (rerun)

* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/51964 qa: test_cephfs_mirror_restart_sync_on_blocklist failure
* https://tracker.ceph.com/issues/51267 CommandFailedError: Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithi096 with status 1:...
* https://tracker.ceph.com/issues/51282 pybind/mgr/mgr_util: .mgr pool may be created too early causing spurious PG_DEGRADED warnings
* https://tracker.ceph.com/issues/50821 qa: untar_snap_rm failure during mds thrashing
* https://tracker.ceph.com/issues/51278 mds: "FAILED ceph_assert(!segments.empty())"
* https://tracker.ceph.com/issues/52279 cephadm tests fail due to: error adding seccomp filter rule for syscall bdflush: requested action matches default action of filter

h3. 2021 Dec 22

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20211222.014316

* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/52279 cephadm tests fail due to: error adding seccomp filter rule for syscall bdflush: requested action matches default action of filter
* https://tracker.ceph.com/issues/50224 qa: test_mirroring_init_failure_with_recovery failure
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete

h3. 2021 Nov 30

https://pulpito.ceph.com/vshankar-2021-11-24_07:14:27-fs-wip-vshankar-testing-20211124-094330-testing-default-smithi/
https://pulpito.ceph.com/vshankar-2021-11-30_06:23:32-fs-wip-vshankar-testing-20211124-094330-distro-default-smithi/ (rerun w/ QA fixes)

* https://tracker.ceph.com/issues/53436 mds, mon: mds beacon messages get dropped? (mds never reaches up:active state)
* https://tracker.ceph.com/issues/51964 qa: test_cephfs_mirror_restart_sync_on_blocklist failure
* https://tracker.ceph.com/issues/48812 qa: test_scrub_pause_and_resume_with_abort failure
* https://tracker.ceph.com/issues/51076 "wait_for_recovery: failed before timeout expired" during thrashosd test with EC backend.
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/50250 mds: "log [WRN] : Scrub error on inode 0x10000004506 (/client.0/tmp/clients/client3/~dmtmp/COREL) see mds.a log and `damage ls` output for details" ("freshly-calculated rstats don't match existing ones")

h3. 2021 November 9

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20211109.180315

* https://tracker.ceph.com/issues/53214 qa: "dd: error reading '/sys/kernel/debug/ceph/2a934501-6731-4052-a836-f42229a869be.client4874/metrics': Is a directory"
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/51282 pybind/mgr/mgr_util: .mgr pool may be created too early causing spurious PG_DEGRADED warnings
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/53216 qa: "RuntimeError: value of attributes should be either str or None. client_id"
* https://tracker.ceph.com/issues/50250 mds: "log [WRN] : Scrub error on inode 0x10000004506 (/client.0/tmp/clients/client3/~dmtmp/COREL) see mds.a log and `damage ls` output for details" ("freshly-calculated rstats don't match existing ones")

h3. 2021 November 03

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20211103.023355

* https://tracker.ceph.com/issues/51964 qa: test_cephfs_mirror_restart_sync_on_blocklist failure
* https://tracker.ceph.com/issues/51282 pybind/mgr/mgr_util: .mgr pool may be created too early causing spurious PG_DEGRADED warnings
* https://tracker.ceph.com/issues/52436 fs/ceph: "corrupt mdsmap"
* https://tracker.ceph.com/issues/53074 pybind/mgr/cephadm: upgrade sequence does not continue if no MDS are active
* https://tracker.ceph.com/issues/53150 pybind/mgr/cephadm/upgrade: tolerate MDS failures during upgrade straddling v16.2.5
* https://tracker.ceph.com/issues/53155 MDSMonitor: assertion during upgrade to v16.2.5+
h3. 2021 October 26

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20211025.000447

* https://tracker.ceph.com/issues/53074 pybind/mgr/cephadm: upgrade sequence does not continue if no MDS are active
* https://tracker.ceph.com/issues/52997 testing: hanging umount
* https://tracker.ceph.com/issues/50824 qa: snaptest-git-ceph bus error
* https://tracker.ceph.com/issues/52436 fs/ceph: "corrupt mdsmap"
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/53082 ceph-fuse: segmentation fault in Client::handle_mds_map
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/50224 qa: test_mirroring_init_failure_with_recovery failure
* https://tracker.ceph.com/issues/50821 qa: untar_snap_rm failure during mds thrashing
* https://tracker.ceph.com/issues/50250 mds: "log [WRN] : Scrub error on inode 0x10000004506 (/client.0/tmp/clients/client3/~dmtmp/COREL) see mds.a log and `damage ls` output for details" ("freshly-calculated rstats don't match existing ones")

h3. 2021 October 19

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20211019.013028

* https://tracker.ceph.com/issues/52995 qa: test_standby_count_wanted failure
* https://tracker.ceph.com/issues/52948 osd: fails to come up: "teuthology.misc:7 of 8 OSDs are up"
* https://tracker.ceph.com/issues/52996 qa: test_perf_counters via test_openfiletable
* https://tracker.ceph.com/issues/48772 qa: pjd: not ok 9, 44, 80
* https://tracker.ceph.com/issues/52997 testing: hanging umount
* https://tracker.ceph.com/issues/50250 mds: "log [WRN] : Scrub error on inode 0x10000004506 (/client.0/tmp/clients/client3/~dmtmp/COREL) see mds.a log and `damage ls` output for details" ("freshly-calculated rstats don't match existing ones")
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/50821 qa: untar_snap_rm failure during mds thrashing
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete

h3. 2021 October 12

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20211012.192211

Some failures caused by teuthology bug: https://tracker.ceph.com/issues/52944
New test caused failure: https://github.com/ceph/ceph/pull/43297#discussion_r729883167

* https://tracker.ceph.com/issues/51282 pybind/mgr/mgr_util: .mgr pool may be created too early causing spurious PG_DEGRADED warnings
* https://tracker.ceph.com/issues/52948 osd: fails to come up: "teuthology.misc:7 of 8 OSDs are up"
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/50224 qa: test_mirroring_init_failure_with_recovery failure
* https://tracker.ceph.com/issues/52949 RuntimeError: The following counters failed to be set on mds daemons: {'mds.dir_split'}

h3. 2021 October 02

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20211002.163337

Some failures caused by cephadm upgrade test. Fixed in follow-up qa commit.
test_simple failures caused by PR in this set.
A few reruns because of QA infra noise.
* https://tracker.ceph.com/issues/52822 qa: failed pacific install on fs:upgrade
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete

h3. 2021 September 20

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20210917.174826

* https://tracker.ceph.com/issues/52677 qa: test_simple failure
* https://tracker.ceph.com/issues/51279 kclient hangs on umount (testing branch)
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/50250 mds: "log [WRN] : Scrub error on inode 0x10000004506 (/client.0/tmp/clients/client3/~dmtmp/COREL) see mds.a log and `damage ls` output for details" ("freshly-calculated rstats don't match existing ones")
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/52438 qa: ffsb timeout

h3. 2021 September 10

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20210910.181451

* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/50250 mds: "log [WRN] : Scrub error on inode 0x10000004506 (/client.0/tmp/clients/client3/~dmtmp/COREL) see mds.a log and `damage ls` output for details" ("freshly-calculated rstats don't match existing ones")
* https://tracker.ceph.com/issues/52624 qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
* https://tracker.ceph.com/issues/52625 qa: test_kill_mdstable (tasks.cephfs.test_snapshots.TestSnapshots)
* https://tracker.ceph.com/issues/52439 qa: acls does not compile on centos stream
* https://tracker.ceph.com/issues/50821 qa: untar_snap_rm failure during mds thrashing
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/52626 mds: ScrubStack.cc: 831: FAILED ceph_assert(diri)
* https://tracker.ceph.com/issues/51279 kclient hangs on umount (testing branch)

h3. 2021 August 27

Several jobs died because of device failures.

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20210827.024746

* https://tracker.ceph.com/issues/52430 mds: fast async create client mount breaks racy test
* https://tracker.ceph.com/issues/52436 fs/ceph: "corrupt mdsmap"
* https://tracker.ceph.com/issues/52437 mds: InoTable::replay_release_ids abort via test_inotable_sync
* https://tracker.ceph.com/issues/51282 pybind/mgr/mgr_util: .mgr pool may be created too early causing spurious PG_DEGRADED warnings
* https://tracker.ceph.com/issues/52438 qa: ffsb timeout
* https://tracker.ceph.com/issues/52439 qa: acls does not compile on centos stream

h3. 2021 July 30

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20210729.214022

* https://tracker.ceph.com/issues/50250 mds: "log [WRN] : Scrub error on inode 0x10000004506 (/client.0/tmp/clients/client3/~dmtmp/COREL) see mds.a log and `damage ls` output for details" ("freshly-calculated rstats don't match existing ones")
* https://tracker.ceph.com/issues/51282 pybind/mgr/mgr_util: .mgr pool may be created too early causing spurious PG_DEGRADED warnings
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/51975 pybind/mgr/stats: KeyError
h3. 2021 July 28

https://pulpito.ceph.com/pdonnell-2021-07-28_00:39:45-fs-wip-pdonnell-testing-20210727.213757-distro-basic-smithi/

with qa fix: https://pulpito.ceph.com/pdonnell-2021-07-28_16:20:28-fs-wip-pdonnell-testing-20210728.141004-distro-basic-smithi/

* https://tracker.ceph.com/issues/51905 qa: "error reading sessionmap 'mds1_sessionmap'"
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/50250 mds: "log [WRN] : Scrub error on inode 0x10000004506 (/client.0/tmp/clients/client3/~dmtmp/COREL) see mds.a log and `damage ls` output for details" ("freshly-calculated rstats don't match existing ones")
* https://tracker.ceph.com/issues/51267 CommandFailedError: Command failed (workunit test fs/snaps/snaptest-multiple-capsnaps.sh) on smithi096 with status 1:...
* https://tracker.ceph.com/issues/51279 kclient hangs on umount (testing branch)

h3. 2021 July 16

https://pulpito.ceph.com/pdonnell-2021-07-16_05:50:11-fs-wip-pdonnell-testing-20210716.022804-distro-basic-smithi/

* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/48772 qa: pjd: not ok 9, 44, 80
* https://tracker.ceph.com/issues/45434 qa: test_full_fsync (tasks.cephfs.test_full.TestClusterFull) failed
* https://tracker.ceph.com/issues/51279 kclient hangs on umount (testing branch)
* https://tracker.ceph.com/issues/50824 qa: snaptest-git-ceph bus error

h3. 2021 July 04

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20210703.052904

* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/39150 mon: "FAILED ceph_assert(session_map.sessions.empty())" when out of quorum
* https://tracker.ceph.com/issues/45434 qa: test_full_fsync (tasks.cephfs.test_full.TestClusterFull) failed
* https://tracker.ceph.com/issues/51282 pybind/mgr/mgr_util: .mgr pool may be created too early causing spurious PG_DEGRADED warnings
* https://tracker.ceph.com/issues/48771 qa: iogen: workload fails to cause balancing
* https://tracker.ceph.com/issues/51279 kclient hangs on umount (testing branch)
* https://tracker.ceph.com/issues/50250 mds: "log [WRN] : Scrub error on inode 0x10000004506 (/client.0/tmp/clients/client3/~dmtmp/COREL) see mds.a log and `damage ls` output for details" ("freshly-calculated rstats don't match existing ones")

h3. 2021 July 01

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20210701.192056

* https://tracker.ceph.com/issues/51197 qa: [WRN] Scrub error on inode 0x10000001520 (/client.0/tmp/t/linux-5.4/Documentation/driver-api) see mds.f log and `damage ls` output for details
* https://tracker.ceph.com/issues/50866 osd: stat mismatch on objects
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
h3. 2021 June 26

https://pulpito.ceph.com/pdonnell-2021-06-26_00:57:00-fs-wip-pdonnell-testing-20210625.225421-distro-basic-smithi/

* https://tracker.ceph.com/issues/51183 qa: FileNotFoundError: [Errno 2] No such file or directory: '/sys/kernel/debug/ceph/3fab6bea-f243-47a4-a956-8c03a62b61b5.client4721/mds_sessions'
* https://tracker.ceph.com/issues/51410 kclient: fails to finish reconnect during MDS thrashing (testing branch)
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/51282 pybind/mgr/mgr_util: .mgr pool may be created too early causing spurious PG_DEGRADED warnings
* https://tracker.ceph.com/issues/51169 qa: ubuntu 20.04 sys protections prevent multiuser file access in /tmp
* https://tracker.ceph.com/issues/48772 qa: pjd: not ok 9, 44, 80

h3. 2021 June 21

https://pulpito.ceph.com/pdonnell-2021-06-22_00:27:21-fs-wip-pdonnell-testing-20210621.231646-distro-basic-smithi/

One failure caused by PR: https://github.com/ceph/ceph/pull/41935#issuecomment-866472599

* https://tracker.ceph.com/issues/51282 pybind/mgr/mgr_util: .mgr pool may be created too early causing spurious PG_DEGRADED warnings
* https://tracker.ceph.com/issues/51183 qa: FileNotFoundError: [Errno 2] No such file or directory: '/sys/kernel/debug/ceph/3fab6bea-f243-47a4-a956-8c03a62b61b5.client4721/mds_sessions'
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/48771 qa: iogen: workload fails to cause balancing
* https://tracker.ceph.com/issues/51169 qa: ubuntu 20.04 sys protections prevent multiuser file access in /tmp
* https://tracker.ceph.com/issues/50495 libcephfs: shutdown race fails with status 141
* https://tracker.ceph.com/issues/45434 qa: test_full_fsync (tasks.cephfs.test_full.TestClusterFull) failed
* https://tracker.ceph.com/issues/50824 qa: snaptest-git-ceph bus error
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"

h3. 2021 June 16

https://pulpito.ceph.com/pdonnell-2021-06-16_21:26:55-fs-wip-pdonnell-testing-20210616.191804-distro-basic-smithi/

MDS abort class of failures caused by PR: https://github.com/ceph/ceph/pull/41667

* https://tracker.ceph.com/issues/45434 qa: test_full_fsync (tasks.cephfs.test_full.TestClusterFull) failed
* https://tracker.ceph.com/issues/51169 qa: ubuntu 20.04 sys protections prevent multiuser file access in /tmp
* https://tracker.ceph.com/issues/43216 MDSMonitor: removes MDS coming out of quorum election
* https://tracker.ceph.com/issues/51278 mds: "FAILED ceph_assert(!segments.empty())"
* https://tracker.ceph.com/issues/51279 kclient hangs on umount (testing branch)
* https://tracker.ceph.com/issues/51280 mds: "FAILED ceph_assert(r == 0 || r == -2)"
* https://tracker.ceph.com/issues/51183 qa: FileNotFoundError: [Errno 2] No such file or directory: '/sys/kernel/debug/ceph/3fab6bea-f243-47a4-a956-8c03a62b61b5.client4721/mds_sessions'
* https://tracker.ceph.com/issues/51281 qa: snaptest-snap-rm-cmp.sh: "echo 'FAIL: bad match, /tmp/a 4637e766853d1ad16a7b17079e2c6f03 != real c3883760b18d50e8d78819c54d579b00'"
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/51076 "wait_for_recovery: failed before timeout expired" during thrashosd test with EC backend.
* https://tracker.ceph.com/issues/51228 qa: rmdir: failed to remove 'a/.snap/*': No such file or directory
* https://tracker.ceph.com/issues/51282 pybind/mgr/mgr_util: .mgr pool may be created too early causing spurious PG_DEGRADED warnings

h3. 2021 June 14

https://pulpito.ceph.com/pdonnell-2021-06-14_20:53:05-fs-wip-pdonnell-testing-20210614.173325-distro-basic-smithi/

Some Ubuntu 20.04 upgrade fallout. In particular, upgrade tests are failing due to missing packages for 18.04 Pacific.

* https://tracker.ceph.com/issues/51169 qa: ubuntu 20.04 sys protections prevent multiuser file access in /tmp
* https://tracker.ceph.com/issues/51228 qa: rmdir: failed to remove 'a/.snap/*': No such file or directory
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/51183 qa: FileNotFoundError: [Errno 2] No such file or directory: '/sys/kernel/debug/ceph/3fab6bea-f243-47a4-a956-8c03a62b61b5.client4721/mds_sessions'
* https://tracker.ceph.com/issues/45434 qa: test_full_fsync (tasks.cephfs.test_full.TestClusterFull) failed
* https://tracker.ceph.com/issues/51182 pybind/mgr/snap_schedule: Invalid command: Unexpected argument 'fs=cephfs'
* https://tracker.ceph.com/issues/51229 qa: test_multi_snap_schedule list difference failure
* https://tracker.ceph.com/issues/50821 qa: untar_snap_rm failure during mds thrashing

h3. 2021 June 13

https://pulpito.ceph.com/pdonnell-2021-06-12_02:45:35-fs-wip-pdonnell-testing-20210612.002809-distro-basic-smithi/

Some Ubuntu 20.04 upgrade fallout. In particular, upgrade tests are failing due to missing packages for 18.04 Pacific.

* https://tracker.ceph.com/issues/51169 qa: ubuntu 20.04 sys protections prevent multiuser file access in /tmp
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/51182 pybind/mgr/snap_schedule: Invalid command: Unexpected argument 'fs=cephfs'
* https://tracker.ceph.com/issues/51183 qa: FileNotFoundError: [Errno 2] No such file or directory: '/sys/kernel/debug/ceph/3fab6bea-f243-47a4-a956-8c03a62b61b5.client4721/mds_sessions'
* https://tracker.ceph.com/issues/51197 qa: [WRN] Scrub error on inode 0x10000001520 (/client.0/tmp/t/linux-5.4/Documentation/driver-api) see mds.f log and `damage ls` output for details
* https://tracker.ceph.com/issues/45434 qa: test_full_fsync (tasks.cephfs.test_full.TestClusterFull) failed

h3. 2021 June 11

https://pulpito.ceph.com/pdonnell-2021-06-11_18:02:10-fs-wip-pdonnell-testing-20210611.162716-distro-basic-smithi/

Some Ubuntu 20.04 upgrade fallout. In particular, upgrade tests are failing due to missing packages for 18.04 Pacific.
* https://tracker.ceph.com/issues/51169 qa: ubuntu 20.04 sys protections prevent multiuser file access in /tmp
* https://tracker.ceph.com/issues/45434 qa: test_full_fsync (tasks.cephfs.test_full.TestClusterFull) failed
* https://tracker.ceph.com/issues/48771 qa: iogen: workload fails to cause balancing
* https://tracker.ceph.com/issues/43216 MDSMonitor: removes MDS coming out of quorum election
* https://tracker.ceph.com/issues/51182 pybind/mgr/snap_schedule: Invalid command: Unexpected argument 'fs=cephfs'
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/51183 qa: FileNotFoundError: [Errno 2] No such file or directory: '/sys/kernel/debug/ceph/3fab6bea-f243-47a4-a956-8c03a62b61b5.client4721/mds_sessions'
* https://tracker.ceph.com/issues/51184 qa: fs:bugs does not specify distro

h3. 2021 June 03

https://pulpito.ceph.com/pdonnell-2021-06-03_03:40:33-fs-wip-pdonnell-testing-20210603.020013-distro-basic-smithi/

* https://tracker.ceph.com/issues/45434 qa: test_full_fsync (tasks.cephfs.test_full.TestClusterFull) failed
* https://tracker.ceph.com/issues/50016 qa: test_damage: "RuntimeError: 2 mutations had unexpected outcomes"
* https://tracker.ceph.com/issues/50821 qa: untar_snap_rm failure during mds thrashing
* https://tracker.ceph.com/issues/50622 (regression) msg: active_connections regression
* https://tracker.ceph.com/issues/49845#note-2 (regression) qa: failed umount in test_volumes
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/43216 MDSMonitor: removes MDS coming out of quorum election

h3. 2021 May 18

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20210518.214114

Regression in testing kernel caused some failures. Ilya fixed those and rerun looked better. Some odd new noise in the rerun relating to packaging and "No module named 'tasks.ceph'".

* https://tracker.ceph.com/issues/50824 qa: snaptest-git-ceph bus error
* https://tracker.ceph.com/issues/50622 (regression) msg: active_connections regression
* https://tracker.ceph.com/issues/49845#note-2 (regression) qa: failed umount in test_volumes
* https://tracker.ceph.com/issues/48203 (stock kernel update required) qa: quota failure

h3. 2021 May 18

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20210518.025642

* https://tracker.ceph.com/issues/50821 qa: untar_snap_rm failure during mds thrashing
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/45591 mgr: FAILED ceph_assert(daemon != nullptr)
* https://tracker.ceph.com/issues/50866 osd: stat mismatch on objects
* https://tracker.ceph.com/issues/50016 qa: test_damage: "RuntimeError: 2 mutations had unexpected outcomes"
* https://tracker.ceph.com/issues/50867 qa: fs:mirror: reduced data availability
* https://tracker.ceph.com/issues/50622 (regression) msg: active_connections regression
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/50868 qa: "kern.log.gz already exists; not overwritten"
* https://tracker.ceph.com/issues/50870 qa: test_full: "rm: cannot remove 'large_file_a': Permission denied"
h3. 2021 May 11

https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20210511.232042

* one class of failures caused by PR

* https://tracker.ceph.com/issues/48812 qa: test_scrub_pause_and_resume_with_abort failure
* https://tracker.ceph.com/issues/50390 mds: monclient: wait_auth_rotating timed out after 30
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/50821 qa: untar_snap_rm failure during mds thrashing
* https://tracker.ceph.com/issues/50224 qa: test_mirroring_init_failure_with_recovery failure
* https://tracker.ceph.com/issues/50622 (regression) msg: active_connections regression
* https://tracker.ceph.com/issues/50825 qa: snaptest-git-ceph hang during mon thrashing v2
* https://tracker.ceph.com/issues/50823 qa: RuntimeError: timeout waiting for cluster to stabilize

h3. 2021 May 14

https://pulpito.ceph.com/pdonnell-2021-05-14_21:45:42-fs-master-distro-basic-smithi/

* https://tracker.ceph.com/issues/48812 qa: test_scrub_pause_and_resume_with_abort failure
* https://tracker.ceph.com/issues/50821 qa: untar_snap_rm failure during mds thrashing
* https://tracker.ceph.com/issues/50622 (regression) msg: active_connections regression
* https://tracker.ceph.com/issues/50822 qa: testing kernel patch for client metrics causes mds abort
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/50823 qa: RuntimeError: timeout waiting for cluster to stabilize
* https://tracker.ceph.com/issues/50824 qa: snaptest-git-ceph bus error
* https://tracker.ceph.com/issues/50825 qa: snaptest-git-ceph hang during mon thrashing v2
* https://tracker.ceph.com/issues/50826 kceph: stock RHEL kernel hangs on snaptests with mon|osd thrashers

h3. 2021 May 01

https://pulpito.ceph.com/pdonnell-2021-05-01_09:07:09-fs-wip-pdonnell-testing-20210501.040415-distro-basic-smithi/

* https://tracker.ceph.com/issues/45434 qa: test_full_fsync (tasks.cephfs.test_full.TestClusterFull) failed
* https://tracker.ceph.com/issues/50281 qa: untar_snap_rm timeout
* https://tracker.ceph.com/issues/48203 (stock kernel update required) qa: quota failure
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/50390 mds: monclient: wait_auth_rotating timed out after 30
* https://tracker.ceph.com/issues/50250 mds: "log [WRN] : Scrub error on inode 0x10000004506 (/client.0/tmp/clients/client3/~dmtmp/COREL) see mds.a log and `damage ls` output for details"
* https://tracker.ceph.com/issues/50622 (regression) msg: active_connections regression
* https://tracker.ceph.com/issues/45591 mgr: FAILED ceph_assert(daemon != nullptr)
* https://tracker.ceph.com/issues/50221 qa: snaptest-git-ceph failure in git diff
* https://tracker.ceph.com/issues/50016 qa: test_damage: "RuntimeError: 2 mutations had unexpected outcomes"
h3. 2021 Apr 15

https://pulpito.ceph.com/pdonnell-2021-04-15_01:35:57-fs-wip-pdonnell-testing-20210414.230315-distro-basic-smithi/

* https://tracker.ceph.com/issues/50281 qa: untar_snap_rm timeout
* https://tracker.ceph.com/issues/50220 qa: dbench workload timeout
* https://tracker.ceph.com/issues/50246 mds: failure replaying journal (EMetaBlob)
* https://tracker.ceph.com/issues/50250 mds: "log [WRN] : Scrub error on inode 0x10000004506 (/client.0/tmp/clients/client3/~dmtmp/COREL) see mds.a log and `damage ls` output for details"
* https://tracker.ceph.com/issues/50016 qa: test_damage: "RuntimeError: 2 mutations had unexpected outcomes"
* https://tracker.ceph.com/issues/50222 osd: 5.2s0 deep-scrub : stat mismatch
* https://tracker.ceph.com/issues/45434 qa: test_full_fsync (tasks.cephfs.test_full.TestClusterFull) failed
* https://tracker.ceph.com/issues/49845 qa: failed umount in test_volumes
* https://tracker.ceph.com/issues/37808 osd: osdmap cache weak_refs assert during shutdown
* https://tracker.ceph.com/issues/50387 client: fs/snaps failure
* https://tracker.ceph.com/issues/50389 mds: "cluster [ERR] Error recovering journal 0x203: (2) No such file or directory" in cluster log
* https://tracker.ceph.com/issues/50216 qa: "ls: cannot access 'lost+found': No such file or directory"
* https://tracker.ceph.com/issues/50390 mds: monclient: wait_auth_rotating timed out after 30

h3. 2021 Apr 08

https://pulpito.ceph.com/pdonnell-2021-04-08_22:42:24-fs-wip-pdonnell-testing-20210408.192301-distro-basic-smithi/

* https://tracker.ceph.com/issues/45434 qa: test_full_fsync (tasks.cephfs.test_full.TestClusterFull) failed
* https://tracker.ceph.com/issues/50016 qa: test_damage: "RuntimeError: 2 mutations had unexpected outcomes"
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/50279 qa: "Replacing daemon mds.b as rank 0 with standby daemon mds.c"
* https://tracker.ceph.com/issues/50246 mds: failure replaying journal (EMetaBlob)
* https://tracker.ceph.com/issues/48365 qa: ffsb build failure on CentOS 8.2
* https://tracker.ceph.com/issues/50216 qa: "ls: cannot access 'lost+found': No such file or directory"
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/50280 cephadm: RuntimeError: uid/gid not found
* https://tracker.ceph.com/issues/50281 qa: untar_snap_rm timeout

h3. 2021 Apr 08

https://pulpito.ceph.com/pdonnell-2021-04-08_04:31:36-fs-wip-pdonnell-testing-20210408.024225-distro-basic-smithi/
https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20210408.142238 (with logic inversion / QA fix)

* https://tracker.ceph.com/issues/50246 mds: failure replaying journal (EMetaBlob)
* https://tracker.ceph.com/issues/50250 mds: "log [WRN] : Scrub error on inode 0x10000004506 (/client.0/tmp/clients/client3/~dmtmp/COREL) see mds.a log and `damage ls` output for details"
h3. 2021 Apr 07

https://pulpito.ceph.com/pdonnell-2021-04-07_02:12:41-fs-wip-pdonnell-testing-20210406.213012-distro-basic-smithi/

* https://tracker.ceph.com/issues/50215 qa: "log [ERR] : error reading sessionmap 'mds2_sessionmap'"
* https://tracker.ceph.com/issues/49466 qa: "Command failed on gibba030 with status 1: 'set -ex\nsudo dd of=/tmp/tmp.ZEeZBasJer'"
* https://tracker.ceph.com/issues/50216 qa: "ls: cannot access 'lost+found': No such file or directory"
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/49845 qa: failed umount in test_volumes
* https://tracker.ceph.com/issues/50220 qa: dbench workload timeout
* https://tracker.ceph.com/issues/50221 qa: snaptest-git-ceph failure in git diff
* https://tracker.ceph.com/issues/50222 osd: 5.2s0 deep-scrub : stat mismatch
* https://tracker.ceph.com/issues/50223 qa: "client.4737 isn't responding to mclientcaps(revoke)"
* https://tracker.ceph.com/issues/50224 qa: test_mirroring_init_failure_with_recovery failure

h3. 2021 Apr 01

https://pulpito.ceph.com/pdonnell-2021-04-01_00:45:34-fs-wip-pdonnell-testing-20210331.222326-distro-basic-smithi/

* https://tracker.ceph.com/issues/48772 qa: pjd: not ok 9, 44, 80
* https://tracker.ceph.com/issues/50177 osd: "stalled aio... buggy kernel or bad device?"
* https://tracker.ceph.com/issues/48771 qa: iogen: workload fails to cause balancing
* https://tracker.ceph.com/issues/49845 qa: failed umount in test_volumes
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/48805 mds: "cluster [WRN] Scrub error on inode 0x1000000039d (/client.0/tmp/blogbench-1.0/src/blogtest_in) see mds.a log and `damage ls` output for details"
* https://tracker.ceph.com/issues/50178 qa: "TypeError: run() got an unexpected keyword argument 'shell'"
* https://tracker.ceph.com/issues/45434 qa: test_full_fsync (tasks.cephfs.test_full.TestClusterFull) failed

h3. 2021 Mar 24

https://pulpito.ceph.com/pdonnell-2021-03-24_23:26:35-fs-wip-pdonnell-testing-20210324.190252-distro-basic-smithi/

* https://tracker.ceph.com/issues/49500 qa: "Assertion `cb_done' failed."
* https://tracker.ceph.com/issues/50019 qa: mount failure with cephadm "probably no MDS server is up?"
* https://tracker.ceph.com/issues/50020 qa: "RADOS object not found (Failed to operate read op for oid cephfs_mirror)"
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/45434 qa: test_full_fsync (tasks.cephfs.test_full.TestClusterFull) failed
* https://tracker.ceph.com/issues/48805 mds: "cluster [WRN] Scrub error on inode 0x1000000039d (/client.0/tmp/blogbench-1.0/src/blogtest_in) see mds.a log and `damage ls` output for details"
* https://tracker.ceph.com/issues/48772 qa: pjd: not ok 9, 44, 80
* https://tracker.ceph.com/issues/50021 qa: snaptest-git-ceph failure during mon thrashing
* https://tracker.ceph.com/issues/48771 qa: iogen: workload fails to cause balancing
* https://tracker.ceph.com/issues/50016 qa: test_damage: "RuntimeError: 2 mutations had unexpected outcomes"
* https://tracker.ceph.com/issues/49466 qa: "Command failed on gibba030 with status 1: 'set -ex\nsudo dd of=/tmp/tmp.ZEeZBasJer'"
h3. 2021 Mar 18

https://pulpito.ceph.com/pdonnell-2021-03-18_13:46:31-fs-wip-pdonnell-testing-20210318.024145-distro-basic-smithi/

* https://tracker.ceph.com/issues/49466 qa: "Command failed on gibba030 with status 1: 'set -ex\nsudo dd of=/tmp/tmp.ZEeZBasJer'"
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/48805 mds: "cluster [WRN] Scrub error on inode 0x1000000039d (/client.0/tmp/blogbench-1.0/src/blogtest_in) see mds.a log and `damage ls` output for details"
* https://tracker.ceph.com/issues/45434 qa: test_full_fsync (tasks.cephfs.test_full.TestClusterFull) failed
* https://tracker.ceph.com/issues/49845 qa: failed umount in test_volumes
* https://tracker.ceph.com/issues/49605 mgr: drops command on the floor
* https://tracker.ceph.com/issues/48203 (stock kernel update required) qa: quota failure
* https://tracker.ceph.com/issues/49928 client: items pinned in cache preventing unmount x2

h3. 2021 Mar 15

https://pulpito.ceph.com/pdonnell-2021-03-15_22:16:56-fs-wip-pdonnell-testing-20210315.182203-distro-basic-smithi/

* https://tracker.ceph.com/issues/49842 qa: stuck pkg install
* https://tracker.ceph.com/issues/49466 qa: "Command failed on gibba030 with status 1: 'set -ex\nsudo dd of=/tmp/tmp.ZEeZBasJer'"
* https://tracker.ceph.com/issues/49822 test: test_mirroring_command_idempotency (tasks.cephfs.test_admin.TestMirroringCommands) failure
* https://tracker.ceph.com/issues/49240 terminate called after throwing an instance of 'std::bad_alloc'
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/45434 qa: test_full_fsync (tasks.cephfs.test_full.TestClusterFull) failed
* https://tracker.ceph.com/issues/49500 qa: "Assertion `cb_done' failed."
* https://tracker.ceph.com/issues/49843 qa: fs/snaps/snaptest-upchildrealms.sh failure
* https://tracker.ceph.com/issues/49845 qa: failed umount in test_volumes
* https://tracker.ceph.com/issues/48805 mds: "cluster [WRN] Scrub error on inode 0x1000000039d (/client.0/tmp/blogbench-1.0/src/blogtest_in) see mds.a log and `damage ls` output for details"
* https://tracker.ceph.com/issues/49605 mgr: drops command on the floor

and a failure caused by PR: https://github.com/ceph/ceph/pull/39969

h3. 2021 Mar 09

https://pulpito.ceph.com/pdonnell-2021-03-09_03:27:39-fs-wip-pdonnell-testing-20210308.214827-distro-basic-smithi/

* https://tracker.ceph.com/issues/49500 qa: "Assertion `cb_done' failed."
* https://tracker.ceph.com/issues/48805 mds: "cluster [WRN] Scrub error on inode 0x1000000039d (/client.0/tmp/blogbench-1.0/src/blogtest_in) see mds.a log and `damage ls` output for details"
* https://tracker.ceph.com/issues/48773 qa: scrub does not complete
* https://tracker.ceph.com/issues/45434 qa: test_full_fsync (tasks.cephfs.test_full.TestClusterFull) failed
* https://tracker.ceph.com/issues/49240 terminate called after throwing an instance of 'std::bad_alloc'
* https://tracker.ceph.com/issues/49466 qa: "Command failed on gibba030 with status 1: 'set -ex\nsudo dd of=/tmp/tmp.ZEeZBasJer'"
* https://tracker.ceph.com/issues/49684 qa: fs:cephadm mount does not wait for mds to be created
* https://tracker.ceph.com/issues/48771 qa: iogen: workload fails to cause balancing