Project

General

Profile

Bug #37333

fuse client can't read file due to can't acquire Fr

Added by Ivan Guan 24 days ago. Updated 1 day ago.

Status:
Need Review
Priority:
Normal
Assignee:
Category:
-
Target version:
Start date:
Due date:
% Done:

0%

Estimated time:
2.00 h
Source:
Community (dev)
Tags:
Backport:
mimic,luminous
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Client
Labels (FS):
Pull request ID:

Description

ceph version: jewel:10.2.2

logs:
client.log

...
2018-11-11 07:38:28.031066 7f2093011700  3 client.3984608 ll_getattr 10000d51941.head
2018-11-11 07:38:28.032805 7f2093011700  3 client.3984608 ll_getattr 10000d51941.head = 0
2018-11-11 07:38:28.032926 7f2093812700  3 client.3984608 ll_open 10000d51941.head 32768
2018-11-11 07:38:28.034921 7f2093812700  3 client.3984608 ll_open 10000d51941.head 32768 = 0 (0x7f20c5a22000)
2018-11-11 07:38:28.039225 7f208e303700  3 client.3984608 ll_read 0x7f20c5a22000 10000d51941  0~131072
2018-11-11 07:38:28.039247 7f208e303700  5 client.3984608 waiting for caps need Fr want Fc on 10000d51941.head(faked_ino=0 ref=3 ll_ref=4 cap_refs={} open={1=1} mode=100644 size=352877624/0 mtime=2018-11-11 07:38:15.229476 mds_cap_wanted- caps=pAsLsXs(0=pAsLsXs/3/3) objectset[10000d51941 ts 0/0 objects 0 dirty_or_tx 0] parents=0x7f20c91bd790 0x7f20c8bcea00)
2018-11-11 07:38:28.039360 7f2093011700  3 client.3984608 ll_read 0x7f20c5a22000 10000d51941  131072~131072
2018-11-11 07:38:28.039380 7f2093011700  5 client.3984608 waiting for caps need Fr want Fc on 10000d51941.head(faked_ino=0 ref=3 ll_ref=4 cap_refs={} open={1=1} mode=100644 size=352877624/0 mtime=2018-11-11 07:38:15.229476 mds_cap_wanted- caps=pAsLsXs(0=pAsLsXs/3/3) objectset[10000d51941 ts 0/0 objects 0 dirty_or_tx 0] parents=0x7f20c91bd790 0x7f20c8bcea00)
...

mds log:

2018-11-11 07:38:28.034928 7f512d70c700  1 -- 10.1.2.2:6831/15857 <== client.3984608 10.1.0.3:0/661089634 1281176 ==== client_request(client.3984608:560675 getattr pAsLsXsFs #10000d51941 2018-11-11 07:38:28.035134) v3 ==== 122+0+0 (244172323 0 0) 0x7f53b5737c80 con 0x7f5384807b00
2018-11-11 07:38:28.034965 7f512d70c700  7 mds.0.cache request_start request(client.3984608:560675 cr=0x7f53b5737c80)
2018-11-11 07:38:28.034966 7f512d70c700  7 mds.0.server dispatch_client_request client_request(client.3984608:560675 getattr pAsLsXsFs #10000d51941 2018-11-11 07:38:28.035134) v3
2018-11-11 07:38:28.034971 7f512d70c700 10 mds.0.server rdlock_path_pin_ref request(client.3984608:560675 cr=0x7f53b5737c80) #10000d51941
2018-11-11 07:38:28.034974 7f512d70c700  7 mds.0.cache traverse: opening base ino 10000d51941 snap head
2018-11-11 07:38:28.034976 7f512d70c700 10 mds.0.cache path_traverse finish on snapid head
2018-11-11 07:38:28.034977 7f512d70c700 10 mds.0.server ref is [inode 10000d51941 [2,head] /rendong/project/66gene/1822975/2.mapping/1822976.dup.bam auth v47 dirtyparent s=352877624 n(v0 b352877624 1=1+0) (ifile xsyn) (iversion lock) cr={3983320=0-1073741824@1} caps={3983320=pAsLsXsFcb/pFsxcrb@43,3983323=pAsLsXs/pFscr@7,3984485=pAsLsXs/pFscr@13,3984608=pAsLsXs/pFscr@6,4147782=pAsLsXs/-@5},l=3983320(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x7f521f8ad540 mode: 33188]

2018-11-11 07:38:28.035710 7f512d70c700  7 mds.0.locker rdlock_start  on (ifile xsyn) on [inode 10000d51941 [2,head] /rendong/project/66gene/1822975/2.mapping/1822976.dup.bam auth v47 ap=1+0 dirtyparent s=352877624 n(v0 b352877624 1=1+0) (ifile xsyn) (iversion lock) cr={3983320=0-1073741824@1} caps={3983320=pAsLsXsFcb/pFsxcrb@43,3983323=pAsLsXs/pFscr@7,3984485=pAsLsXs/pFscr@13,3984608=pAsLsXs/pFscr@6,4147782=pAsLsXs/-@5},l=3983320(-1) | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x7f521f8ad540 mode: 33188]
2018-11-11 07:38:28.035723 7f512d70c700 10 mds.0.locker  got rdlock on (ifile xsyn r=1) [inode 10000d51941 [2,head] /rendong/project/66gene/1822975/2.mapping/1822976.dup.bam auth v47 ap=1+0 dirtyparent s=352877624 n(v0 b352877624 1=1+0) (ifile xsyn r=1) (iversion lock) cr={3983320=0-1073741824@1} caps={3983320=pAsLsXsFcb/pFsxcrb@43,3983323=pAsLsXs/pFscr@7,3984485=pAsLsXs/pFscr@13,3984608=pAsLsXs/pFscr@6,4147782=pAsLsXs/-@5},l=3983320(-1) | ptrwaiter=0 request=1 lock=2 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x7f521f8ad540 mode: 33188]
2018-11-11 07:38:28.035936 7f512d70c700 10 mds.0.server reply to stat on client_request(client.3984608:560675 getattr pAsLsXsFs #10000d51941 2018-11-11 07:38:28.035134) v3 ref: [inode 10000d51941 [2,head] /rendong/project/66gene/1822975/2.mapping/1822976.dup.bam auth v47 ap=1+0 dirtyparent s=352877624 n(v0 b352877624 1=1+0) (iauth sync r=1) (ilink sync r=1) (ifile xsyn r=1) (ixattr sync r=1) (iversion lock) cr={3983320=0-1073741824@1} caps={3983320=pAsLsXsFcb/pFsxcrb@43,3983323=pAsLsXs/pFscr@7,3984485=pAsLsXs/pFscr@13,3984608=pAsLsXs/pFscr@6,4147782=pAsLsXs/-@5},l=3983320(-1) | ptrwaiter=0 request=1 lock=5 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x7f521f8ad540 mode: 33188]
2018-11-11 07:38:28.035976 7f512d70c700 10 mds.0.cache.ino(10000d51941) encode_inodestat issuing pAsLsXs seq 7
2018-11-11 07:38:28.035978 7f512d70c700 10 mds.0.cache.ino(10000d51941) encode_inodestat caps pAsLsXs seq 7 mseq 0 xattrv 1 len 4
2018-11-11 07:38:28.035982 7f512d70c700 10 mds.0.cache.ino(10000d51941) including xattrs version 1

2018-11-11 07:38:28.036349 7f512d70c700  7 mds.0.locker rdlock_finish on (ifile xsyn) on [inode 10000d51941 [2,head] /rendong/project/66gene/1822975/2.mapping/1822976.dup.bam auth v47 ap=1+0 dirtyparent s=352877624 n(v0 b352877624 1=1+0) (ifile xsyn) (ixattr sync r=1) (iversion lock) cr={3983320=0-1073741824@1} caps={3983320=pAsLsXsFcb/pFsxcrb@43,3983323=pAsLsXs/pFscr@7,3984485=pAsLsXs/pFscr@13,3984608=pAsLsXs/pFscr@7,4147782=pAsLsXs/-@5},l=3983320(-1) | ptrwaiter=0 request=1 lock=2 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x7f521f8ad540 mode: 33188]
2018-11-11 07:38:28.036363 7f512d70c700  7 mds.0.locker file_eval wanted=sxcrb loner_wanted=sxcrb other_wanted=scr  filelock=(ifile xsyn) on [inode 10000d51941 [2,head] /rendong/project/66gene/1822975/2.mapping/1822976.dup.bam auth v47 ap=1+0 dirtyparent s=352877624 n(v0 b352877624 1=1+0) (ifile xsyn) (ixattr sync r=1) (iversion lock) cr={3983320=0-1073741824@1} caps={3983320=pAsLsXsFcb/pFsxcrb@43,3983323=pAsLsXs/pFscr@7,3984485=pAsLsXs/pFscr@13,3984608=pAsLsXs/pFscr@7,4147782=pAsLsXs/-@5},l=3983320(-1) | ptrwaiter=0 request=1 lock=2 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x7f521f8ad540 mode: 33188]

Description:
One client write the file and other client read it may lead readers can't read file and report
"waiting for caps need Fr want Fc on 10000d51941.head" all the time.

The matter is filelock stuck at LOCK_XSYN all the time and the client can't acquire Fr.In my situation the
file has loner but its wanted_loner_cap is -1 and the wanted is sxcrb not have CEPH_CAP_GWR so it can't
change to LOCK_MIX.

History

#1 Updated by Greg Farnum 18 days ago

  • Project changed from Ceph to fs
  • Status changed from New to Need Review

#2 Updated by Patrick Donnelly 1 day ago

  • Due date deleted (11/20/2018)
  • Assignee set to Ivan Guan
  • Start date deleted (11/20/2018)
  • Source set to Community (dev)
  • Backport set to mimic,luminous
  • Pull request ID set to 25174
  • Component(FS) Client added

#3 Updated by Patrick Donnelly 1 day ago

  • Description updated (diff)

Also available in: Atom PDF