Bug #38652
closed
mds|kclient: MDS_CLIENT_LATE_RELEASE warning caused by inline bug on RHEL 7.5
% Done:
0%
Source:
Q/A
Tags:
Backport:
nautilus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS, kceph
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Failure: "2019-03-07 20:19:55.015557 mon.b (mon.0) 310 : cluster [WRN] Health check failed: 1 clients failing to respond to capability release (MDS_CLIENT_LATE_RELEASE)" in cluster log
23 jobs: ['3679196', '3679259', '3679211', '3679328', '3679383', '3679117', '3679469', '3679375', '3679414', '3679274', '3679156', '3679289', '3679282', '3679297', '3679242', '3679159', '3679070', '3679446', '3679461', '3679125', '3679417', '3679110', '3679312']
suites intersection: ['conf/{client.yaml', 'fuse-default-perm-no.yaml}', 'mds.yaml', 'mon.yaml', 'mount/kclient/{mount.yaml', 'ms-die-on-skipped.yaml}}', 'multimds/basic/{begin.yaml', 'osd.yaml}', 'overrides/{basic/{frag_enable.yaml', 'q_check_counter/check_counter.yaml', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}']
suites union: ['clusters/3-mds.yaml', 'clusters/9-mds.yaml', 'conf/{client.yaml', 'fuse-default-perm-no.yaml}', 'inline/no.yaml', 'inline/yes.yaml', 'k-distro.yaml}', 'mds.yaml', 'mon.yaml', 'mount/kclient/{mount.yaml', 'ms-die-on-skipped.yaml}}', 'multimds/basic/{begin.yaml', 'objectstore-ec/bluestore-bitmap.yaml', 'objectstore-ec/bluestore-comp-ec-root.yaml', 'objectstore-ec/bluestore-comp.yaml', 'objectstore-ec/bluestore-ec-root.yaml', 'objectstore-ec/filestore-xfs.yaml', 'osd.yaml}', 'overrides/{basic/{frag_enable.yaml', 'overrides/{distro/random/{k-testing.yaml', 'overrides/{distro/rhel/{7.5.yaml', 'q_check_counter/check_counter.yaml', 'supported$/{rhel_latest.yaml}}', 'supported$/{ubuntu_16.04.yaml}}', 'tasks/cfuse_workunit_kernel_untar_build.yaml}', 'tasks/cfuse_workunit_misc.yaml}', 'tasks/cfuse_workunit_suites_blogbench.yaml}', 'tasks/cfuse_workunit_suites_dbench.yaml}', 'tasks/cfuse_workunit_suites_ffsb.yaml}', 'tasks/cfuse_workunit_suites_fsstress.yaml}', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}']
Ignore k-testing and inline/yes.yaml above; the scrape tool merged this test failure with the above: http://pulpito.ceph.com/pdonnell-2019-03-07_15:13:09-multimds-wip-pdonnell-testing-20190307.041917-distro-basic-smithi/3679312/
Zheng suggested it might be related to inline data:
2019-03-07 16:00:35.377 7fde44e5e700 7 mds.1.locker issue_caps loner client.4591 allowed=pAsxLsXsxFsxcrwb, xlocker allowed=pAsxLsXsxFsxcrwb, others allowed=pLs on [inode 0x100000001f0 [2,head] /client.0/tmp/blogbench-1.0/src/blogtest_in/blog-2/article-59.xml auth{0=1} v190 dirtyparent s=0 n(v0 rc2019-03-07 16:00:35.311945 1=1+0)/n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4591=0-4194304@1} caps={4591=pAsxLsXsxFsxcrwb/pAsxXsxFxcwb@1},l=4591 | importingcaps=1 caps=1 dirtyrstat=1 dirtyparent=1 replicated=1 dirty=1 0x5599cb630000]
2019-03-07 16:00:35.377 7fde44e5e700 20 mds.1.locker client.4591 pending pAsxLsXsxFsxcrwb allowed pAsxLsXsxFsxcb wanted pAsxXsxFxcwb
2019-03-07 16:00:35.377 7fde44e5e700 7 mds.1.locker sending MClientCaps to client.4591 seq 2 new pending pAsxLsXsxFsxcb was pAsxLsXsxFsxcrwb
2019-03-07 16:00:35.377 7fde44e5e700 20 mds.1.cache.ino(0x100000001f0) encode_cap_message pfile 1 pauth 0 plink 0 pxattr 0 ctime 2019-03-07 16:00:35.311945
2019-03-07 16:00:35.377 7fde44e5e700 10 mds.1.7 send_message_client_counted client.4591 seq 117 client_caps(revoke ino 0x100000001f0 151 seq 2 caps=pAsxLsXsxFsxcb dirty=- wanted=pAsxXsxFxcwb follows 0 mseq 1 size 0/4194304 ts 1/18446744073709551615 mtime 2019-03-07 16:00:35.306945) v11
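The caps transition in the locker lines above ("new pending pAsxLsXsxFsxcb was pAsxLsXsxFsxcrwb") can be diffed mechanically to see which bits were revoked. The following is a minimal sketch, not part of Ceph: `parse_caps` and `revoked_bits` are hypothetical helper names, and the parsing only assumes that a cap string is a sequence of uppercase group letters, each followed by lowercase bit letters, with an optional lowercase prefix such as the leading `p`.

```python
def parse_caps(cap_string):
    """Split a cap string such as 'pAsxLsXsxFsxcrwb' into {group: set(bits)}.

    Assumption: uppercase letters start a group, lowercase letters are bits
    of the most recent group. Lowercase letters before any group (the
    leading 'p') are stored under the empty-string key.
    """
    groups = {}
    current = ""
    for ch in cap_string:
        if not ch.isalpha():
            continue  # ignore placeholders such as '-'
        if ch.isupper():
            current = ch
            groups.setdefault(current, set())
        else:
            groups.setdefault(current, set()).add(ch)
    return groups


def revoked_bits(old, new):
    """Return {group: bits} present in `old` but missing from `new`."""
    old_groups = parse_caps(old)
    new_groups = parse_caps(new)
    revoked = {}
    for group, bits in old_groups.items():
        lost = bits - new_groups.get(group, set())
        if lost:
            revoked[group] = "".join(sorted(lost))
    return revoked


# Values copied from the "sending MClientCaps" log line.
print(revoked_bits("pAsxLsXsxFsxcrwb", "pAsxLsXsxFsxcb"))  # {'F': 'rw'}
```

With the two strings from the log, only the `r` and `w` bits of the `F` group differ; every other group is unchanged.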
"Inline data related bug. You can see Frw magically disappeared. I suspect session->get_connection() is null." -Zheng
Updated by Patrick Donnelly about 5 years ago
- Has duplicate Bug #38636: Inline data compatibly check in Locker::issue_caps is buggy added
Updated by Patrick Donnelly about 5 years ago
See also Zheng's analysis in the ticket he opened: #38636
Updated by Zheng Yan about 5 years ago
- Status changed from 12 to Fix Under Review
- Pull request ID set to 26811
Updated by Zheng Yan about 5 years ago
New issue that can cause this warning (the file lock moves to the sync state while Fcb is issued):
/ceph/teuthology-archive/pdonnell-2019-03-16_00:19:15-multimds-wip-pdonnell-testing-20190315.213331-distro-basic-smithi/3730992/
2019-03-17 13:21:55.857 1e672700 20 mds.2.migrator did replicate_relax_locks, now [inode 0x2000000039a [2,head] /client.0/tmp/clients/client9/~dmtmp/PARADOX/COURSES.DB auth v103 dirtyparent s=260096 n(v0 rc2019-03-17 13:19:01.074278 b260096 1=1+0) (iauth excl) (ixattr excl) (iversion lock) cr={4564=0-4194304@1} caps={4564=pAsxLsXsxFcb/pAsxXsxFsxcrwb@5},l=4564 | ptrwaiter=2 lock=0 caps=1 dirtyparent=1 replicated=0 dirty=1 waiter=0 authpin=0 0x210b9c90]
2019-03-17 13:21:55.858 1e672700 20 mds.2.migrator encode_export_inode_caps [inode 0x2000000039a [2,head] /client.0/tmp/clients/client9/~dmtmp/PARADOX/COURSES.DB auth v103 dirtyparent s=260096 n(v0 rc2019-03-17 13:19:01.074278 b260096 1=1+0) (iauth excl) (ixattr excl) (iversion lock) cr={4564=0-419
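To check which caps a client holds in an inode dump like the one above, the `caps={...}` entry can be pulled out with a small script. This is a rough sketch under stated assumptions: `client_caps` is a hypothetical helper, the `caps={client=first/.../last@seq}` layout is inferred from the dumps in this ticket, and reading the first field as pending caps and the last as wanted caps is an assumption based on the locker's "pending ... wanted ..." line for client.4591 in the description.

```python
import re

# Matches the braced caps map only, not counters such as "caps=1".
CAPS_RE = re.compile(r"caps=\{([^}]*)\}")


def client_caps(inode_line):
    """Return {client_id: {'pending', 'wanted', 'seq'}} from an inode dump line.

    Assumption: each entry looks like 'client=first/.../last@seq', where the
    first slash-separated field is pending and the last is wanted.
    """
    match = CAPS_RE.search(inode_line)
    if match is None:
        return {}
    result = {}
    for entry in match.group(1).split(","):
        client, _, rest = entry.partition("=")
        caps, _, seq = rest.partition("@")
        fields = caps.split("/")
        result[int(client)] = {
            "pending": fields[0],
            "wanted": fields[-1],
            "seq": int(seq),
        }
    return result


# Fragment copied from the replicate_relax_locks log line.
line = ("(iauth excl) (ixattr excl) (iversion lock) cr={4564=0-4194304@1} "
        "caps={4564=pAsxLsXsxFcb/pAsxXsxFsxcrwb@5},l=4564 | ptrwaiter=2 lock=0 caps=1")
print(client_caps(line))
```

For this fragment the sketch reports client 4564 with `pAsxLsXsxFcb` in the first field, which includes the `Fcb` bits mentioned in the comment, while the dump lists no `(ifile ...)` lock state.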
Updated by Patrick Donnelly about 5 years ago
- Status changed from Fix Under Review to Pending Backport
- Backport changed from nautilus,mimic,luminous to nautilus
Updated by Nathan Cutler about 5 years ago
- Copied to Backport #39225: nautilus: mds|kclient: MDS_CLIENT_LATE_RELEASE warning caused by inline bug on RHEL 7.5 added
Updated by Nathan Cutler about 5 years ago
- Pull request ID changed from 26811 to 26881
Updated by Nathan Cutler about 5 years ago
- Status changed from Pending Backport to Resolved