Bug #38652


mds|kclient: MDS_CLIENT_LATE_RELEASE warning caused by inline bug on RHEL 7.5

Added by Patrick Donnelly about 5 years ago. Updated about 5 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport: nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS, kceph
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Failure: "2019-03-07 20:19:55.015557 mon.b (mon.0) 310 : cluster [WRN] Health check failed: 1 clients failing to respond to capability release (MDS_CLIENT_LATE_RELEASE)" in cluster log
23 jobs: ['3679196', '3679259', '3679211', '3679328', '3679383', '3679117', '3679469', '3679375', '3679414', '3679274', '3679156', '3679289', '3679282', '3679297', '3679242', '3679159', '3679070', '3679446', '3679461', '3679125', '3679417', '3679110', '3679312']
suites intersection: ['conf/{client.yaml', 'fuse-default-perm-no.yaml}', 'mds.yaml', 'mon.yaml', 'mount/kclient/{mount.yaml', 'ms-die-on-skipped.yaml}}', 'multimds/basic/{begin.yaml', 'osd.yaml}', 'overrides/{basic/{frag_enable.yaml', 'q_check_counter/check_counter.yaml', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}']
suites union: ['clusters/3-mds.yaml', 'clusters/9-mds.yaml', 'conf/{client.yaml', 'fuse-default-perm-no.yaml}', 'inline/no.yaml', 'inline/yes.yaml', 'k-distro.yaml}', 'mds.yaml', 'mon.yaml', 'mount/kclient/{mount.yaml', 'ms-die-on-skipped.yaml}}', 'multimds/basic/{begin.yaml', 'objectstore-ec/bluestore-bitmap.yaml', 'objectstore-ec/bluestore-comp-ec-root.yaml', 'objectstore-ec/bluestore-comp.yaml', 'objectstore-ec/bluestore-ec-root.yaml', 'objectstore-ec/filestore-xfs.yaml', 'osd.yaml}', 'overrides/{basic/{frag_enable.yaml', 'overrides/{distro/random/{k-testing.yaml', 'overrides/{distro/rhel/{7.5.yaml', 'q_check_counter/check_counter.yaml', 'supported$/{rhel_latest.yaml}}', 'supported$/{ubuntu_16.04.yaml}}', 'tasks/cfuse_workunit_kernel_untar_build.yaml}', 'tasks/cfuse_workunit_misc.yaml}', 'tasks/cfuse_workunit_suites_blogbench.yaml}', 'tasks/cfuse_workunit_suites_dbench.yaml}', 'tasks/cfuse_workunit_suites_ffsb.yaml}', 'tasks/cfuse_workunit_suites_fsstress.yaml}', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}']

Ignore k-testing and inline/yes.yaml above; the scrape tool merged this test failure with the one from: http://pulpito.ceph.com/pdonnell-2019-03-07_15:13:09-multimds-wip-pdonnell-testing-20190307.041917-distro-basic-smithi/3679312/

From: http://pulpito.ceph.com/pdonnell-2019-03-07_15:13:09-multimds-wip-pdonnell-testing-20190307.041917-distro-basic-smithi/

Zheng suggested it might be related to inline data:

2019-03-07 16:00:35.377 7fde44e5e700  7 mds.1.locker issue_caps loner client.4591 allowed=pAsxLsXsxFsxcrwb, xlocker allowed=pAsxLsXsxFsxcrwb, others allowed=pLs on [inode 0x100000001f0 [2,head] /client.0/tmp/blogbench-1.0/src/blogtest_in/blog-2/article-59.xml auth{0=1} v190 dirtyparent s=0 n(v0 rc2019-03-07 16:00:35.311945 1=1+0)/n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4591=0-4194304@1} caps={4591=pAsxLsXsxFsxcrwb/pAsxXsxFxcwb@1},l=4591 | importingcaps=1 caps=1 dirtyrstat=1 dirtyparent=1 replicated=1 dirty=1 0x5599cb630000]
2019-03-07 16:00:35.377 7fde44e5e700 20 mds.1.locker  client.4591 pending pAsxLsXsxFsxcrwb allowed pAsxLsXsxFsxcb wanted pAsxXsxFxcwb
2019-03-07 16:00:35.377 7fde44e5e700  7 mds.1.locker    sending MClientCaps to client.4591 seq 2 new pending pAsxLsXsxFsxcb was pAsxLsXsxFsxcrwb
2019-03-07 16:00:35.377 7fde44e5e700 20 mds.1.cache.ino(0x100000001f0) encode_cap_message pfile 1 pauth 0 plink 0 pxattr 0 ctime 2019-03-07 16:00:35.311945
2019-03-07 16:00:35.377 7fde44e5e700 10 mds.1.7 send_message_client_counted client.4591 seq 117 client_caps(revoke ino 0x100000001f0 151 seq 2 caps=pAsxLsXsxFsxcb dirty=- wanted=pAsxXsxFxcwb follows 0 mseq 1 size 0/4194304 ts 1/18446744073709551615 mtime 2019-03-07 16:00:35.306945) v11

"Inline data related bug. You can see Frw magically disappeared. I suspect session->get_connection() is null." -Zheng


Related issues (2 total: 0 open, 2 closed)

Has duplicate: CephFS - Bug #38636: Inline data compatibility check in Locker::issue_caps is buggy (Duplicate, 03/08/2019)

Copied to: CephFS - Backport #39225: nautilus: mds|kclient: MDS_CLIENT_LATE_RELEASE warning caused by inline bug on RHEL 7.5 (Resolved, Nathan Cutler)
