Bug #23210
closed
ceph-fuse: exported nfs get "stale file handle" when mds migrating
Description
[Test OS]:
1. CentOS 7.2 (Ceph cluster)
2. CentOS 7.2 (client)
[Ceph version]: 10.2.7
[Problem description]
1. Single active MDS.
2. Mount CephFS with ceph-fuse on a node that is part of the Ceph cluster.
3. Export a subdirectory of CephFS over NFS.
4. Mount the NFS export on the client, and use "dd if=/dev/zero of=/mnt/nfs/test bs=1M count=102400" to write a test file.
5. Manually switch the active MDS a few times; dd fails with a "Stale file handle" error and the I/O is interrupted.
I have seen issue 21995: [[http://tracker.ceph.com/issues/21995]]. It seems related to this problem, but I still cannot resolve it.
Updated by lei gu about 6 years ago
I have backported it and run a full test.
The "Stale file handle" problem disappears, but there are still some problems (I am based on v10.2.7):
1. With posix_acl enabled, I get a "Permission denied" error;
2. With posix_acl disabled, I get a "Not a directory" error.
Are there other related issues that I have missed?
Updated by Zheng Yan about 6 years ago
Looks like the kernel tried to get the parent of a non-directory inode. Which kernel version do you use?
Updated by Jos Collin about 6 years ago
- Status changed from New to Need More Info
Updated by lei gu about 6 years ago
My kernel release version is 3.10.0-327.el7.x86_64.
I have tried to find out what causes these errors:
1. "Permission denied" is related to posix_acl and is raised in function 'Inode::check_mode'; it seems the inode mode changes when the MDS migrates;
2. "Not a directory" is raised in function 'Client::_lookup', because 'dir->is_dir()' returns false (it looks like a lookup of the special directory ".").
Updated by Zheng Yan about 6 years ago
I checked the code of the 3.10.0-327.el7 kernel. It calls get_parent() for non-directory inodes. Recent kernels only call get_parent() for directory inodes.
Updated by Zheng Yan about 6 years ago
Recent RHEL kernels do not call get_parent() for non-directory inodes. Please upgrade your kernel.
Updated by lei gu about 6 years ago
OK, I will try it and give feedback later. Could you please tell me which RHEL kernel version changed the get_parent() call, or which kernel version you are using?
Updated by lei gu about 6 years ago
I have tested the patch on kernel 4.15.0 using dd and vdbench. dd is OK, but vdbench still hits the "Stale file handle" problem.
[vdbench config]
messagescan=no
hd=default,vdbench=/root/shawn/vdbench50406,user=root,shell=ssh
hd=hd1,system=192.169.30.210
fsd=default,depth=1,width=1,files=10000,size=512M,shared=no,openflag=o_direct
fsd=fsd1,anchor=/mnt/nfs
fwd=default,rdpct=60,xfersizes=2M,fileio=random, fileselect=sequential,threads=4
fwd=fwd1,fsd=fsd1,host=hd1
rd=rd1,fwd=fwd1,fwdrate=200,format=yes,warmup=5,elapsed=28800,interval=1
[vdbench error info]
Write error using file /mnt/nfs/vdb.1_1.dir/vdb_f0060.file
Error: ESTALE: 'Stale NFS file handle'
lba: 340918272
xfersize: 131072
blocks_done: 2583
bytes_done: 338558976
open_for_read: false
fhandle: 21
[ceph-fuse log]
2018-03-06 23:06:39.477988 7f07e75a2700 3 client.641150 lookup_ino enter(100000000c5) =
2018-03-06 23:06:39.478022 7f07e75a2700 10 client.641150 choose_target_mds resend_mds specified as mds.0
2018-03-06 23:06:39.478027 7f07e75a2700 10 client.641150 send_request rebuilding request 487 for mds.0
2018-03-06 23:06:39.478033 7f07e75a2700 10 client.641150 send_request client_request(unknown.0:487 lookupino #100000000c5 2018-03-06 23:06:39.478018) v3 to mds.0
2018-03-06 23:06:39.478892 7f07f1cdc700 10 client.641150 insert_trace from 2018-03-06 23:06:39.478032 mds.0 is_target=1 is_dentry=0
2018-03-06 23:06:39.478897 7f07f1cdc700 10 client.641150 features 0x7ffffffefdfbfff
2018-03-06 23:06:39.478899 7f07f1cdc700 10 client.641150 update_snap_trace len 48
2018-03-06 23:06:39.478903 7f07f1cdc700 10 client.641150 update_snap_trace snaprealm(1 nref=55 c=0 seq=1 parent=0 my_snaps=[] cached_snapc=1=[]) seq 1 <= 1 and same parent, SKIPPING
2018-03-06 23:06:39.478907 7f07f1cdc700 10 client.641150 hrm is_target=1 is_dentry=0
2018-03-06 23:06:39.478916 7f07f1cdc700 10 client.641150 update_inode_file_bits 100000000c5.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=100777 size=0/0 mtime=0.000000 caps=- objectset[100000000c5 ts 0/0 objects 0 dirty_or_tx 0] 0x555cd0846300) - mtime 2018-03-06 22:59:43.709871
2018-03-06 23:06:39.478924 7f07f1cdc700 10 client.641150 size 0 -> 68
2018-03-06 23:06:39.478928 7f07f1cdc700 10 client.641150 add_update_cap issued - -> pAsLsXsFscr from mds.0 on 100000000c5.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=100777 size=68/0 mtime=2018-03-06 22:59:43.709871 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[100000000c5 ts 0/0 objects 0 dirty_or_tx 0] 0x555cd0846300)
2018-03-06 23:06:39.478967 7f07e75a2700 3 client.641150 lookup_ino exit(100000000c5) = 0
2018-03-06 23:06:39.478974 7f07e75a2700 3 client.641150 ll_lookup 0x555cd0846300 .
2018-03-06 23:06:39.478977 7f07e75a2700 10 client.641150 _getattr mask AsXs issued=1
2018-03-06 23:06:39.478983 7f07e75a2700 3 client.641150 may_lookup 0x555cd0846300 = 0
2018-03-06 23:06:39.478987 7f07e75a2700 10 client.641150 _lookup 100000000c5.head(faked_ino=0 ref=3 ll_ref=1 cap_refs={} open={} mode=100777 size=68/0 mtime=2018-03-06 22:59:43.709871 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[100000000c5 ts 0/0 objects 0 dirty_or_tx 0] 0x555cd0846300) . = -20
2018-03-06 23:06:39.478997 7f07e75a2700 3 client.641150 ll_lookup 0x555cd0846300 . -> -20 (0)
2018-03-06 23:06:39.479008 7f07f1cdc700 10 client.641150 put_inode on 100000000c5.head(faked_ino=0 ref=3 ll_ref=1 cap_refs={} open={} mode=100777 size=68/0 mtime=2018-03-06 22:59:43.709871 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[100000000c5 ts 0/0 objects 0 dirty_or_tx 0] 0x555cd0846300)
2018-03-06 23:06:39.479021 7f07f1cdc700 10 client.641150 put_inode on 100000000c5.head(faked_ino=0 ref=2 ll_ref=1 cap_refs={} open={} mode=100777 size=68/0 mtime=2018-03-06 22:59:43.709871 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[100000000c5 ts 0/0 objects 0 dirty_or_tx 0] 0x555cd0846300)
2018-03-06 23:06:39.479051 7f07e75a2700 3 client.641150 ll_forget 100000000c5 1
2018-03-06 23:06:39.479063 7f07e75a2700 10 client.641150 put_inode on 100000000c5.head(faked_ino=0 ref=1 ll_ref=0 cap_refs={} open={} mode=100777 size=68/0 mtime=2018-03-06 22:59:43.709871 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[100000000c5 ts 0/0 objects 0 dirty_or_tx 0] 0x555cd0846300)
2018-03-06 23:06:39.479073 7f07e75a2700 10 client.641150 remove_cap mds.0 on 100000000c5.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=100777 size=68/0 mtime=2018-03-06 22:59:43.709871 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[100000000c5 ts 0/0 objects 0 dirty_or_tx 0] 0x555cd0846300)
2018-03-06 23:06:39.479085 7f07e75a2700 10 client.641150 put_inode deleting 100000000c5.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=100777 size=68/0 mtime=2018-03-06 22:59:43.709871 caps=- objectset[100000000c5 ts 0/0 objects 0 dirty_or_tx 0] 0x555cd0846300)
I found "ll_lookup 0x555cd0846300 . -> -20 (0)" in the log, but I have no idea what happened.
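For readers decoding the log above, the following is a minimal sketch of how the two key values can be interpreted. It assumes a Linux client (where errno 20 is ENOTDIR) and assumes the log prints the inode mode in octal; the helper names `file_type` and `errno_name` are hypothetical and are not part of Ceph.

```cpp
#include <cerrno>
#include <string>
#include <sys/stat.h>

// Classify a POSIX mode value using the standard S_IS* macros.
// Assumption: the log prints mode in octal (mode=100777), so the
// caller passes an octal literal such as 0100777.
std::string file_type(mode_t mode) {
  if (S_ISDIR(mode)) return "directory";
  if (S_ISREG(mode)) return "regular file";
  if (S_ISLNK(mode)) return "symlink";
  return "other";
}

// Map a negative return code from the log to its errno name.
// Only the error codes mentioned in this thread are covered.
std::string errno_name(int r) {
  switch (-r) {
    case ENOTDIR: return "ENOTDIR";  // "Not a directory"
    case EACCES:  return "EACCES";   // "Permission denied"
    case ESTALE:  return "ESTALE";   // "Stale file handle"
    default:      return "unknown";
  }
}
```

Under those assumptions, `file_type(0100777)` reports a regular file and `errno_name(-20)` reports ENOTDIR, which matches a lookup of "." being issued against a non-directory inode.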
Updated by Zheng Yan about 6 years ago
Could you try the patch below?
diff --git a/src/client/Client.cc b/src/client/Client.cc
index b6bc15dbae..ae7877b2ea 100644
--- a/src/client/Client.cc
+++ b/src/client/Client.cc
@@ -6083,11 +6083,6 @@ int Client::_lookup(Inode *dir, const string& dname, int mask, InodeRef *target,
   int r = 0;
   Dentry *dn = NULL;
 
-  if (!dir->is_dir()) {
-    r = -ENOTDIR;
-    goto done;
-  }
-
   if (dname == "..") {
     if (dir->dentries.empty()) {
       MetaRequest *req = new MetaRequest(CEPH_MDS_OP_LOOKUPPARENT);
@@ -6116,6 +6111,11 @@ int Client::_lookup(Inode *dir, const string& dname, int mask, InodeRef *target,
     goto done;
   }
 
+  if (!dir->is_dir()) {
+    r = -ENOTDIR;
+    goto done;
+  }
+
   if (dname.length() > NAME_MAX) {
     r = -ENAMETOOLONG;
     goto done;
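The effect of moving the is_dir() check can be illustrated with a small standalone model. This is only a sketch of the check ordering, not the Ceph client code: `ToyInode`, `resolve_special`, `lookup_before_patch`, and `lookup_after_patch` are hypothetical names, the parent pointer stands in for the LOOKUPPARENT request, and returning -ENOENT for an unknown parent or child is a simplification.

```cpp
#include <cerrno>
#include <string>

// Hypothetical stand-in for the client's Inode; only what the sketch needs.
struct ToyInode {
  bool is_dir;
  ToyInode* parent;  // nullptr when the parent is unknown
};

// Resolve "." and ".." without requiring dir to be a directory.
// Returns true if name was one of the two special names.
static bool resolve_special(ToyInode* dir, const std::string& name,
                            ToyInode** target, int* r) {
  if (name == "..") {
    *target = dir->parent;  // the real client may ask the MDS here
    *r = dir->parent ? 0 : -ENOENT;
    return true;
  }
  if (name == ".") {
    *target = dir;
    *r = 0;
    return true;
  }
  return false;
}

// Ordering before the patch: the directory test rejects everything first.
int lookup_before_patch(ToyInode* dir, const std::string& name, ToyInode** target) {
  if (!dir->is_dir) return -ENOTDIR;
  int r = 0;
  if (resolve_special(dir, name, target, &r)) return r;
  return -ENOENT;  // child lookup omitted in this sketch
}

// Ordering after the patch: special names are handled before the directory test.
int lookup_after_patch(ToyInode* dir, const std::string& name, ToyInode** target) {
  int r = 0;
  if (resolve_special(dir, name, target, &r)) return r;
  if (!dir->is_dir) return -ENOTDIR;
  return -ENOENT;  // child lookup omitted in this sketch
}
```

In this model, a lookup of "." on a regular file returns -ENOTDIR with the old ordering and succeeds with the new one, while ordinary names on a non-directory are still rejected with -ENOTDIR.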
Updated by lei gu about 6 years ago
I have tested the patch. Error logs like "ll_lookup 0x555cd0846300 . -> -20 (0)" have disappeared, but there is still a "Stale file handle" error.
I have uploaded the ceph-fuse log taken after the MDS migration.
Updated by lei gu about 6 years ago
Maybe this is related to the ceph-fuse ACL handling. When I disable "client_acl_type" and enable "fuse_default_permissions", everything works well.
My ceph.conf is as below:
client_acl_type = posix_acl
fuse_default_permissions = false
Updated by lei gu about 6 years ago
After debugging the kernel (version 4.14), I found that a non-directory inode is passed along the call chain "exportfs_decode_fh()->fuse_fh_to_dentry()->fuse_get_dentry()->fuse_lookup_name()".
I think this problem is just like issue 21995. Are there any suggestions?
Updated by lei gu about 6 years ago
I added some debug output to the call chain above, so I now have both the kernel log and the ceph-fuse log from when the error happened.
The log is fairly large, so I uploaded it to Google Drive: [[https://drive.google.com/open?id=1kdehTiqGaPnl1xH1BlxlLiRpFirz3uCA]]
Updated by Zheng Yan about 6 years ago
But with the above patch, "lookup ." on a non-directory inode should work.
Updated by Zheng Yan about 6 years ago
Try modifying Client::ll_lookup to skip the may_lookup check for "lookup ." and "lookup ..".
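The suggestion can be sketched as follows. This is a hypothetical standalone model, not the actual Client::ll_lookup: `ll_lookup_sketch` and its two callback parameters are invented for illustration, with `may_lookup` standing in for the permission check and `do_lookup` for the real lookup.

```cpp
#include <cerrno>
#include <functional>
#include <string>

// Hypothetical sketch: run the permission check only for ordinary names,
// and go straight to the lookup for "." and "..".
int ll_lookup_sketch(const std::string& name,
                     const std::function<int()>& may_lookup,
                     const std::function<int()>& do_lookup) {
  const bool special = (name == "." || name == "..");
  if (!special) {
    int r = may_lookup();  // e.g. an ACL/mode check that may return -EACCES
    if (r < 0) return r;
  }
  return do_lookup();
}
```

With a permission check that always denies, this sketch still lets "." and ".." through while an ordinary name fails with -EACCES, which is the behaviour the comment above asks for.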
Updated by Jos Collin about 6 years ago
- Status changed from Need More Info to Fix Under Review
Updated by Patrick Donnelly about 6 years ago
- Status changed from Fix Under Review to Resolved
- Backport set to luminous
- Component(FS) Client added
- Component(FS) deleted (ceph-fuse)
Updated by Nathan Cutler about 6 years ago
- Status changed from Resolved to Pending Backport
@Patrick, please confirm that this should be backported to luminous and which master PR.
Updated by Patrick Donnelly about 6 years ago
- Status changed from Pending Backport to Resolved
- Backport deleted (luminous)
Sorry, this should not be backported. ceph-fuse "support" for NFS is only for Mimic.