Bug #23210

closed

ceph-fuse: exported nfs get "stale file handle" when mds migrating

Added by lei gu about 6 years ago. Updated about 6 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: NFS (Linux Kernel)
Target version:
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: fs
Component(FS): Client
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

[Test OS]:
1.CentOS7.2(ceph cluster)
2.CentOS7.2(client)

[Ceph version]: 10.2.7

[Problem description]
1. Single active MDS.
2. CephFS is mounted with ceph-fuse on a node that is part of the Ceph cluster.
3. A subdirectory of CephFS is exported over NFS.
4. The NFS export is mounted on the client, and "dd if=/dev/zero of=/mnt/nfs/test bs=1M count=102400" is used to write a test file.
5. After switching the active MDS manually a few times, dd fails with a "Stale file handle" error and I/O is interrupted.

I saw issue 21995: [[http://tracker.ceph.com/issues/21995]]. It seems related to this problem, but I still cannot resolve it.


Files

ceph-client.admin.log.rar (525 KB) lei gu, 03/08/2018 06:05 AM
Actions #1

Updated by Zheng Yan about 6 years ago

We haven't backported that fix.

Actions #2

Updated by lei gu about 6 years ago

I have backported it and run a full test.

The "Stale file handle" problem disappeared, but there are still some problems (my build is based on v10.2.7):
1. With posix_acl enabled, I get a "Permission denied" error;
2. With posix_acl disabled, I get a "Not a directory" error.

Are there other related issues I have missed?

Actions #3

Updated by Zheng Yan about 6 years ago

Looks like the kernel tried to get the parent of a non-directory inode. Which kernel version do you use?

Actions #4

Updated by Jos Collin about 6 years ago

  • Status changed from New to Need More Info
Actions #5

Updated by lei gu about 6 years ago

My kernel release version is 3.10.0-327.el7.x86_64.

I have tried to find out what causes these errors:
1. "Permission denied" is related to posix_acl and happens in function 'Inode::check_mode'; it seems the inode mode changed while the MDS was migrating;
2. "Not a directory" happens in function 'Client::_lookup', because 'dir->is_dir()' returns false (it looks like a lookup of the special directory ".").

Actions #6

Updated by Zheng Yan about 6 years ago

I checked the code of the 3.10.0-327.el7 kernel. It calls get_parent() for non-directory inodes. Recent kernels only call get_parent() for directory inodes.

Actions #7

Updated by Zheng Yan about 6 years ago

Recent RHEL kernels do not call get_parent() for non-directory inodes. Please upgrade your kernel.

Actions #8

Updated by lei gu about 6 years ago

OK, I will try it and give feedback later. Could you please tell me which RHEL kernel version changed the get_parent() call, or which kernel version you use?

Actions #9

Updated by lei gu about 6 years ago

I have tested the patch on kernel 4.15.0 using dd and vdbench. dd is OK, but vdbench still hits the "Stale file handle" problem.

[vdbench config]

messagescan=no
hd=default,vdbench=/root/shawn/vdbench50406,user=root,shell=ssh
hd=hd1,system=192.169.30.210
fsd=default,depth=1,width=1,files=10000,size=512M,shared=no,openflag=o_direct
fsd=fsd1,anchor=/mnt/nfs
fwd=default,rdpct=60,xfersizes=2M,fileio=random, fileselect=sequential,threads=4
fwd=fwd1,fsd=fsd1,host=hd1
rd=rd1,fwd=fwd1,fwdrate=200,format=yes,warmup=5,elapsed=28800,interval=1

[vdbench error info]

Write error using file /mnt/nfs/vdb.1_1.dir/vdb_f0060.file
Error:         ESTALE: 'Stale NFS file handle'
lba:           340918272
xfersize:      131072
blocks_done:   2583
bytes_done:    338558976
open_for_read: false
fhandle:       21

[ceph-fuse log]

2018-03-06 23:06:39.477988 7f07e75a2700  3 client.641150 lookup_ino enter(100000000c5) =
2018-03-06 23:06:39.478022 7f07e75a2700 10 client.641150 choose_target_mds resend_mds specified as mds.0
2018-03-06 23:06:39.478027 7f07e75a2700 10 client.641150 send_request rebuilding request 487 for mds.0
2018-03-06 23:06:39.478033 7f07e75a2700 10 client.641150 send_request client_request(unknown.0:487 lookupino #100000000c5 2018-03-06 23:06:39.478018) v3 to mds.0
2018-03-06 23:06:39.478892 7f07f1cdc700 10 client.641150 insert_trace from 2018-03-06 23:06:39.478032 mds.0 is_target=1 is_dentry=0
2018-03-06 23:06:39.478897 7f07f1cdc700 10 client.641150  features 0x7ffffffefdfbfff
2018-03-06 23:06:39.478899 7f07f1cdc700 10 client.641150 update_snap_trace len 48
2018-03-06 23:06:39.478903 7f07f1cdc700 10 client.641150 update_snap_trace snaprealm(1 nref=55 c=0 seq=1 parent=0 my_snaps=[] cached_snapc=1=[]) seq 1 <= 1 and same parent, SKIPPING
2018-03-06 23:06:39.478907 7f07f1cdc700 10 client.641150  hrm  is_target=1 is_dentry=0
2018-03-06 23:06:39.478916 7f07f1cdc700 10 client.641150 update_inode_file_bits 100000000c5.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=100777 size=0/0 mtime=0.000000 caps=- objectset[100000000c5 ts 0/0 objects 0 dirty_or_tx 0] 0x555cd0846300) - mtime 2018-03-06 22:59:43.709871
2018-03-06 23:06:39.478924 7f07f1cdc700 10 client.641150 size 0 -> 68
2018-03-06 23:06:39.478928 7f07f1cdc700 10 client.641150 add_update_cap issued - -> pAsLsXsFscr from mds.0 on 100000000c5.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=100777 size=68/0 mtime=2018-03-06 22:59:43.709871 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[100000000c5 ts 0/0 objects 0 dirty_or_tx 0] 0x555cd0846300)
2018-03-06 23:06:39.478967 7f07e75a2700  3 client.641150 lookup_ino exit(100000000c5) = 0
2018-03-06 23:06:39.478974 7f07e75a2700  3 client.641150 ll_lookup 0x555cd0846300 .
2018-03-06 23:06:39.478977 7f07e75a2700 10 client.641150 _getattr mask AsXs issued=1
2018-03-06 23:06:39.478983 7f07e75a2700  3 client.641150 may_lookup 0x555cd0846300 = 0
2018-03-06 23:06:39.478987 7f07e75a2700 10 client.641150 _lookup 100000000c5.head(faked_ino=0 ref=3 ll_ref=1 cap_refs={} open={} mode=100777 size=68/0 mtime=2018-03-06 22:59:43.709871 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[100000000c5 ts 0/0 objects 0 dirty_or_tx 0] 0x555cd0846300) . = -20
2018-03-06 23:06:39.478997 7f07e75a2700  3 client.641150 ll_lookup 0x555cd0846300 . -> -20 (0)
2018-03-06 23:06:39.479008 7f07f1cdc700 10 client.641150 put_inode on 100000000c5.head(faked_ino=0 ref=3 ll_ref=1 cap_refs={} open={} mode=100777 size=68/0 mtime=2018-03-06 22:59:43.709871 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[100000000c5 ts 0/0 objects 0 dirty_or_tx 0] 0x555cd0846300)
2018-03-06 23:06:39.479021 7f07f1cdc700 10 client.641150 put_inode on 100000000c5.head(faked_ino=0 ref=2 ll_ref=1 cap_refs={} open={} mode=100777 size=68/0 mtime=2018-03-06 22:59:43.709871 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[100000000c5 ts 0/0 objects 0 dirty_or_tx 0] 0x555cd0846300)
2018-03-06 23:06:39.479051 7f07e75a2700  3 client.641150 ll_forget 100000000c5 1
2018-03-06 23:06:39.479063 7f07e75a2700 10 client.641150 put_inode on 100000000c5.head(faked_ino=0 ref=1 ll_ref=0 cap_refs={} open={} mode=100777 size=68/0 mtime=2018-03-06 22:59:43.709871 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[100000000c5 ts 0/0 objects 0 dirty_or_tx 0] 0x555cd0846300)
2018-03-06 23:06:39.479073 7f07e75a2700 10 client.641150 remove_cap mds.0 on 100000000c5.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=100777 size=68/0 mtime=2018-03-06 22:59:43.709871 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[100000000c5 ts 0/0 objects 0 dirty_or_tx 0] 0x555cd0846300)
2018-03-06 23:06:39.479085 7f07e75a2700 10 client.641150 put_inode deleting 100000000c5.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=100777 size=68/0 mtime=2018-03-06 22:59:43.709871 caps=- objectset[100000000c5 ts 0/0 objects 0 dirty_or_tx 0] 0x555cd0846300)

I found "ll_lookup 0x555cd0846300 . -> -20 (0)", but I have no idea what happened.

Actions #10

Updated by Patrick Donnelly about 6 years ago

  • Assignee set to Zheng Yan
Actions #11

Updated by Zheng Yan about 6 years ago

Could you try the patch below?

diff --git a/src/client/Client.cc b/src/client/Client.cc
index b6bc15dbae..ae7877b2ea 100644
--- a/src/client/Client.cc
+++ b/src/client/Client.cc
@@ -6083,11 +6083,6 @@ int Client::_lookup(Inode *dir, const string& dname, int mask, InodeRef *target,
   int r = 0;
   Dentry *dn = NULL;

-  if (!dir->is_dir()) {
-    r = -ENOTDIR;
-    goto done;
-  }
-
   if (dname == "..") {
     if (dir->dentries.empty()) {
       MetaRequest *req = new MetaRequest(CEPH_MDS_OP_LOOKUPPARENT);
@@ -6116,6 +6111,11 @@ int Client::_lookup(Inode *dir, const string& dname, int mask, InodeRef *target,
     goto done;
   }

+  if (!dir->is_dir()) {
+    r = -ENOTDIR;
+    goto done;
+  }
+
   if (dname.length() > NAME_MAX) {
     r = -ENAMETOOLONG;
     goto done;
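
The effect of moving the check can be illustrated with a small standalone model. This is only a sketch, not the real Client::_lookup: ToyInode, lookup_old_order and lookup_new_order are hypothetical names, and the model assumes that both "." and ".." are resolved by the special-name branches that sit between the old and new positions of the is_dir() check. On Linux, ENOTDIR is 20, which matches the "-20" in the ceph-fuse log above.

```cpp
#include <cerrno>
#include <string>

// Hypothetical stand-in for the client's Inode; only the property that
// matters for this sketch is modeled.
struct ToyInode {
  bool directory;
  bool is_dir() const { return directory; }
};

// Old ordering: the directory check runs first, so "." and ".." on a
// regular file fail with -ENOTDIR before the name is even examined.
int lookup_old_order(const ToyInode& dir, const std::string& dname) {
  if (!dir.is_dir())
    return -ENOTDIR;
  if (dname == ".." || dname == ".")
    return 0;  // special names, resolved without a child lookup
  return 0;    // stand-in for the ordinary child lookup
}

// Patched ordering: special names are handled first, and the directory
// check only guards ordinary child lookups.
int lookup_new_order(const ToyInode& dir, const std::string& dname) {
  if (dname == ".." || dname == ".")
    return 0;
  if (!dir.is_dir())
    return -ENOTDIR;
  return 0;
}
```

In this model, a lookup of "." on a regular file returns -ENOTDIR with the old ordering and succeeds with the patched one, while a lookup of an ordinary child name under a regular file is still rejected.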

Actions #12

Updated by lei gu about 6 years ago

I have tested the patch. Error logs like "ll_lookup 0x555cd0846300 . -> -20 (0)" disappeared, but the "Stale file handle" error still occurs.

I have uploaded the ceph-fuse log taken after the MDS migration.

Actions #13

Updated by lei gu about 6 years ago

Maybe this is related to the ceph-fuse ACL handling. When I disable "client_acl_type" and enable "fuse_default_permissions", everything works well.

My ceph.conf is as below:

client_acl_type = posix_acl
fuse_default_permissions = false

Actions #14

Updated by lei gu about 6 years ago

After debugging the kernel (version 4.14), I found that a non-directory inode is passed along the call chain "exportfs_decode_fh()->fuse_fh_to_dentry()->fuse_get_dentry()->fuse_lookup_name()".

I think this problem is just like issue 21995. Are there any suggestions?

Actions #15

Updated by lei gu about 6 years ago

I added some debug output to the call chain above, so I now have logs from both the kernel and ceph-fuse from when the error happened.

The log is fairly large, so I uploaded it to Google Drive: [[https://drive.google.com/open?id=1kdehTiqGaPnl1xH1BlxlLiRpFirz3uCA]]

Actions #16

Updated by Zheng Yan about 6 years ago

But with the above patch, "lookup ." on a non-directory inode should work.

Actions #17

Updated by Zheng Yan about 6 years ago

Try modifying Client::ll_lookup to skip the may_lookup check for "lookup ." and "lookup ..".
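
A rough standalone model of that suggestion follows. It is a sketch under stated assumptions, not the actual Client::ll_lookup: ToyInode, toy_may_lookup, toy_lookup and toy_ll_lookup are hypothetical names, the permission check is reduced to a single "search allowed" flag, and the client_checks_permissions parameter is assumed to stand for the case where ceph-fuse performs permission checks itself (as with the configuration in comment #13).

```cpp
#include <cerrno>
#include <string>

// Hypothetical stand-in for an inode: whether it is a directory and
// whether the caller holds search (execute) permission on it.
struct ToyInode {
  bool directory;
  bool search_allowed;
};

// Simplified permission check: a lookup needs search permission on the
// inode it starts from.
int toy_may_lookup(const ToyInode& in) {
  return in.search_allowed ? 0 : -EACCES;
}

// Simplified lookup with the patched ordering: special names first,
// then the directory check for ordinary child names.
int toy_lookup(const ToyInode& in, const std::string& name) {
  if (name == "." || name == "..")
    return 0;
  return in.directory ? 0 : -ENOTDIR;
}

// Suggested behaviour: skip the permission check for "." and "..", so a
// file handle that decodes to a regular file can still be resolved.
int toy_ll_lookup(const ToyInode& parent, const std::string& name,
                  bool client_checks_permissions) {
  const bool special = (name == "." || name == "..");
  if (client_checks_permissions && !special) {
    int r = toy_may_lookup(parent);
    if (r < 0)
      return r;
  }
  return toy_lookup(parent, name);
}
```

With this model, a lookup of "." or ".." on a regular file without search permission succeeds, while a lookup of an ordinary name on the same inode is still refused with -EACCES.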

Actions #18

Updated by Jos Collin about 6 years ago

  • Status changed from Need More Info to Fix Under Review
Actions #19

Updated by Patrick Donnelly about 6 years ago

  • Status changed from Fix Under Review to Resolved
  • Backport set to luminous
  • Component(FS) Client added
  • Component(FS) deleted (ceph-fuse)
Actions #20

Updated by Nathan Cutler about 6 years ago

  • Status changed from Resolved to Pending Backport

@Patrick, please confirm that this should be backported to luminous and which master PR.

Actions #21

Updated by Patrick Donnelly about 6 years ago

  • Status changed from Pending Backport to Resolved
  • Backport deleted (luminous)

Sorry, this should not be backported. ceph-fuse "support" for NFS is only for Mimic.
