Bug #22886
closed
kclient: Test failure: test_full_same_file (tasks.cephfs.test_full.TestClusterFull)
Description
The kclient is slow to release caps, or does not release them at all (I think?), and this leads to a test timeout while waiting for the purge.
/ceph/teuthology-archive/pdonnell-2018-01-30_23:38:56-kcephfs-wip-pdonnell-i22627-testing-basic-smithi/2129751/remote/smithi052/log/ceph-mds.a-s.log.gz
The file is not purged until after the client unmounts.
This is with my branch updating the kcephfs suite: https://github.com/ceph/ceph-ci/tree/wip-pdonnell-i22627
Updated by Patrick Donnelly about 6 years ago
These may be related:
Failure: Test failure: test_purge_queue_op_rate (tasks.cephfs.test_strays.TestStrays)
3 jobs: ['2129710', '2129760', '2129610']
suites intersection: ['debug/mds_client.yaml', 'dirfrag/frag_enable.yaml', 'frag_enable.yaml', 'kcephfs/recovery/{clusters/1-mds-4-client.yaml', 'log-config.yaml', 'mounts/kmounts.yaml', 'osd-asserts.yaml', 'overrides/{debug.yaml', 'tasks/strays.yaml', 'whitelist_health.yaml', 'whitelist_health.yaml}', 'whitelist_wrongly_marked_down.yaml}']
suites union: ['debug/mds_client.yaml', 'dirfrag/frag_enable.yaml', 'frag_enable.yaml', 'kcephfs/recovery/{clusters/1-mds-4-client.yaml', 'log-config.yaml', 'mounts/kmounts.yaml', 'objectstore-ec/bluestore-comp.yaml', 'objectstore-ec/bluestore-ec-root.yaml', 'objectstore-ec/filestore-xfs.yaml', 'osd-asserts.yaml', 'overrides/{debug.yaml', 'tasks/strays.yaml', 'whitelist_health.yaml', 'whitelist_health.yaml}', 'whitelist_wrongly_marked_down.yaml}']
Updated by Zheng Yan about 6 years ago
It seems to be caused by delayed writeback of dirty metadata.
Updated by Zheng Yan about 6 years ago
The patch https://github.com/ceph/ceph-client/commit/b9e5d03b6e64972164bff45ae3adb64a23e7568a fixes this issue,
but another test case still fails:
http://qa-proxy.ceph.com/teuthology/zyan-2018-02-02_08:57:53-kcephfs-wip-pdonnell-i22627-testing-basic-mira/2141612/teuthology.log
Updated by Zheng Yan about 6 years ago
The patch https://github.com/ceph/ceph-ci/commit/2fff0eb4c491f04803debec7c0f5de66e3825ee7 seems to make the full tests pass on kclient,
but the test run still failed with this error:
{client.0-kernel-sha1: b9e5d03b6e64972164bff45ae3adb64a23e7568a, client.1-kernel-sha1: b9e5d03b6e64972164bff45ae3adb64a23e7568a, client.2-kernel-sha1: b9e5d03b6e64972164bff45ae3adb64a23e7568a, client.3-kernel-sha1: b9e5d03b6e64972164bff45ae3adb64a23e7568a, description: 'kcephfs/recovery/{clusters/1-mds-4-client.yaml debug/mds_client.yaml dirfrag/frag_enable.yaml mounts/kmounts.yaml objectstore-ec/bluestore-comp.yaml overrides/{debug.yaml frag_enable.yaml log-config.yaml osd-asserts.yaml whitelist_health.yaml whitelist_wrongly_marked_down.yaml} tasks/mds-full.yaml whitelist_health.yaml}', duration: 1410.9979951381683, failure_reason: '"2018-02-02 14:19:25.023368 mon.a mon.0 172.21.4.108:6789/0 230 : cluster [WRN] Health check failed: pauserd,pausewr flag(s) set (OSDMAP_FLAGS)" in cluster log', flavor: basic, mon.a-kernel-sha1: b9e5d03b6e64972164bff45ae3adb64a23e7568a, mon.b-kernel-sha1: b9e5d03b6e64972164bff45ae3adb64a23e7568a, owner: scheduled_zyan@teuthology, success: false}
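For readers unfamiliar with this kind of failure: the failure_reason above is not a test assertion but a cluster log line at warning severity. A minimal sketch of how such a scan might behave follows, assuming regex-based ignore patterns. The function name `first_unexpected`, the severity regex, and the `\(OSDMAP_FLAGS\)` pattern are illustrative assumptions, not the actual teuthology code or the fix applied for this bug.

```python
import re

# Assumption: lines tagged with one of these severities count as failures
# unless an ignore pattern matches them.
SEVERITY = re.compile(r"\[(ERR|WRN|SEC)\]")


def first_unexpected(log_lines, ignore_patterns):
    """Return the first flagged log line not covered by any ignore pattern.

    Hypothetical helper for illustration only. Returns None when every
    flagged line is matched by at least one ignore pattern.
    """
    ignores = [re.compile(p) for p in ignore_patterns]
    for line in log_lines:
        if SEVERITY.search(line) and not any(r.search(line) for r in ignores):
            return line
    return None


# The warning quoted in the failure_reason above.
cluster_log = [
    "2018-02-02 14:19:25.023368 mon.a mon.0 172.21.4.108:6789/0 230 : "
    "cluster [WRN] Health check failed: pauserd,pausewr flag(s) set "
    "(OSDMAP_FLAGS)",
]

# With no ignore patterns the warning is reported as a failure.
print(first_unexpected(cluster_log, []))

# With a (hypothetical) pattern for this health check it is skipped.
print(first_unexpected(cluster_log, [r"\(OSDMAP_FLAGS\)"]))
```

Under this model a job can be marked failed even when every test case passes, which is consistent with the result above.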
Updated by Patrick Donnelly about 6 years ago
- Status changed from New to In Progress
Yes, that error has been happening for the mds-full tests now with and without kclient. I'll look into that today. Thanks Zheng!
Updated by Patrick Donnelly about 6 years ago
- Status changed from In Progress to Pending Backport
- Backport set to luminous
Updated by Patrick Donnelly about 6 years ago
Updated by Nathan Cutler about 6 years ago
- Copied to Backport #22966: luminous: kclient: Test failure: test_full_same_file (tasks.cephfs.test_full.TestClusterFull) added
Updated by Nathan Cutler about 6 years ago
- Status changed from Pending Backport to Resolved