Bug #61584

open

mds: the parent dir's mtime will be overwritten by the update cap request when making dirs

Added by Xiubo Li 12 months ago. Updated 8 months ago.

Status:
Fix Under Review
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The ceph-users mailing list thread: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/WZWGNLT2XEPV2BYBNL6MWEP5GNQ66CVR/

How to reproduce it:

Reproduction steps:
1. Create a directory - mkdir dir1
2. Modify mtime using the touch command - touch dir1
3. Create a file or directory inside of 'dir1' - mkdir dir1/dir2
Expected result:
mtime for dir1 should change to the time the file or directory was created in step 3
Actual result:
there was no change to the mtime for 'dir1'

Note: For more detail, please see the attached logs.

Our questions are:
1. Is this expected behavior for CephFS?
2. If so, can you explain why the directory behavior is inconsistent depending on whether
the mtime for the directory has previously been manually updated?
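
A minimal Python sketch of the steps above, for reference; the mount point /mnt/cephfs and the one-second sleep are illustrative assumptions, not details from the original report:

#!/usr/bin/env python3
# Minimal sketch of the reproduction above. The mount point
# /mnt/cephfs is an illustrative assumption, not from the report.
import os
import time

parent = "/mnt/cephfs/dir1"

os.mkdir(parent)                         # step 1: mkdir dir1
os.utime(parent)                         # step 2: touch dir1
before = os.stat(parent).st_mtime

time.sleep(1)                            # make an mtime change observable
os.mkdir(os.path.join(parent, "dir2"))   # step 3: mkdir dir1/dir2

after = os.stat(parent).st_mtime
# Expected: after > before. On an affected system: after == before,
# because the client's cached mtime from the touch later overwrites
# the mtime that the mkdir set on the parent.
print("mtime bumped:", after > before)
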
Actions #1

Updated by Xiubo Li 12 months ago

Tested with the latest ceph code; I could only reproduce this with kclient, while libcephfs worked well.

Actions #2

Updated by Xiubo Li 12 months ago

I reproduced it by creating dirk4444/dirk5555 and found the root cause:

$ stat dirk4444/
  File: dirk4444/
  Size: 0             Blocks: 1          IO Block: 4096   directory
Device: 2ch/44d    Inode: 1099511627791  Links: 3
Access: (0755/drwxr-xr-x)  Uid: ( 1000/  xiubli)   Gid: ( 1000/  xiubli)
Access: 2023-06-05 10:21:35.051903832 +0800
Modify: 2023-06-05 10:21:35.051903832 +0800
Change: 2023-06-05 10:21:47.732492563 +0800
 Birth: -

When making dir dirk5555, the mkdir request nests a cap release:

13244 2023-06-05T10:21:47.735+0800 7fd70cf6a640  1 -- v1:192.168.0.104:6813/2845063001 <== client.4316 v1:192.168.0.104:0/2423448585 62 ==== client_request(client.4316:7 mkdir #0x1000000000f/dirk5555 2023-06-05T10:21:47.732492+0800 caller_uid=1000, caller_gid=1000{10,1000,}) v6 ==== 216+0+38 (unknown 1824772708 0 298234240) 0x5645c46e8900 con 0x5645c47bd800
13245 2023-06-05T10:21:47.735+0800 7fd70cf6a640  4 mds.0.server handle_client_request client_request(client.4316:7 mkdir #0x1000000000f/dirk5555 2023-06-05T10:21:47.732492+0800 caller_uid=1000, caller_gid=1000{10,1000,}) v6        
13246 2023-06-05T10:21:47.735+0800 7fd70cf6a640 20 mds.0.3 get_session have 0x5645c4651200 client.4316 v1:192.168.0.104:0/2423448585 state open
13247 2023-06-05T10:21:47.735+0800 7fd70cf6a640 15 mds.0.server  oldest_client_tid=7
13248 2023-06-05T10:21:47.735+0800 7fd70cf6a640  7 mds.0.cache request_start request(client.4316:7 nref=2 cr=0x5645c46e8900)
13249 2023-06-05T10:21:47.735+0800 7fd70cf6a640 10 mds.0.locker process_request_cap_release client.4316 pAsLsXsxFsx on [inode 0x1000000000f [...2,head] /dirk4444/ auth v88 DIRTYPARENT f() n(v0 1=0+1) (iauth excl) (inest lock) (ifile excl) (ixattr excl) (iversion lock) caps={4316=pAsxLsXsxFsx/-@3},l=4316 | request=0 dirfrag=1 caps=1 dirtyparent=1 dirty=1 0x5645c46ef080]

Then the cap release is processed first, just before the mkdir is handled; see the process_request_cap_release line in the log above.

The cap release then triggers revocation of the Xx caps for dir dirk4444:

13263 2023-06-05T10:21:47.735+0800 7fd70cf6a640 10 mds.0.cache.ino(0x1000000000f) auth_pin by 0x5645c46ef3c0 on [inode 0x1000000000f [...2,head] /dirk4444/ auth v88 ap=1 DIRTYPARENT f() n(v0 1=0+1) (inest lock) (ifile excl) (ixattr excl->sync) (iversion lock) caps={4316=pAsLsXsxFsx/pAsLsXsFsxcral@3},l=4316 | request=0 dirfrag=1 caps=1 dirtyparent=1 dirty=1 authpin=1 0x5645c46ef080] now 1
13264 2023-06-05T10:21:47.735+0800 7fd70cf6a640 15 mds.0.cache.dir(0x1) adjust_nested_auth_pins 1 on [dir 0x1 / [2,head] auth v=89 cv=1/1 REP dir_auth=0 ap=0+1 state=1610874881|complete f(v0 m2023-06-05T10:21:23.030729+0800 3=0+3) n(v6 rc2023-06-05T10:21:23.030729+0800 5=0+5) hs=3+3,ss=0+0 dirty=6 | child=1 subtree=1 dirty=1 authpin=0 0x5645bf6a6880] by 0x5645c46ef080 count now 0/1
13265 2023-06-05T10:21:47.735+0800 7fd70cf6a640  7 mds.0.locker get_allowed_caps loner client.4316 allowed=pAsLsXsFsxcrwb, xlocker allowed=pAsLsXsFsxcrwb, others allowed=pAsLs on [inode 0x1000000000f [...2,head] /dirk4444/ auth v88 ap=1 DIRTYPARENT f() n(v0 1=0+1) (inest lock) (ifile excl) (ixattr excl->sync) (iversion lock) caps={4316=pAsLsXsxFsx/pAsLsXsFsxcral@3},l=4316 | request=0 dirfrag=1 caps=1 dirtyparent=1 dirty=1 authpin=1 0x5645c46ef080]
13266 2023-06-05T10:21:47.735+0800 7fd70cf6a640 20 mds.0.locker  client.4316 pending pAsLsXsxFsx allowed pAsLsXsFsxwb wanted pAsLsXsFsxcral
13267 2023-06-05T10:21:47.735+0800 7fd70cf6a640  7 mds.0.locker    sending MClientCaps to client.4316 seq 4 new pending pAsLsXsFsx was pAsLsXsxFsx
13268 2023-06-05T10:21:47.735+0800 7fd70cf6a640 20 mds.0.cache.ino(0x1000000000f) encode_cap_message pfile 1 pauth 0 plink 0 pxattr 0 mtime 2023-06-05T10:21:23.030729+0800 ctime 2023-06-05T10:21:23.030729+0800 change_attr 0
13269 2023-06-05T10:21:47.735+0800 7fd70cf6a640 10 mds.0.3 send_message_client_counted client.4316 seq 3 client_caps(revoke ino 0x1000000000f 1 seq 4 caps=pAsLsXsFsx dirty=- wanted=pAsLsXsFsxcral follows 0 size 0/0 ts 1/18446744073709551615 mtime 2023-06-05T10:21:23.030729+0800 ctime 2023-06-05T10:21:23.030729+0800 change_attr 0) v12
13270 2023-06-05T10:21:47.735+0800 7fd70cf6a640  1 -- v1:192.168.0.104:6813/2845063001 --> v1:192.168.0.104:0/2423448585 -- client_caps(revoke ino 0x1000000000f 1 seq 4 caps=pAsLsXsFsx dirty=- wanted=pAsLsXsFsxcral follows 0 size 0/0 ts 1/18446744073709551615 mtime 2023-06-05T10:21:23.030729+0800 ctime 2023-06-05T10:21:23.030729+0800 change_attr 0) v12 -- 0x5645c46e1500 con 0x5645c47bd800
13271 2023-06-05T10:21:47.735+0800 7fd70cf6a640 10 mds.0.locker eval done
13272 2023-06-05T10:21:47.735+0800 7fd70cf6a640  7 mds.0.server dispatch_client_request client_request(client.4316:7 mkdir #0x1000000000f/dirk5555 2023-06-05T10:21:47.732492+0800 caller_uid=1000, caller_gid=1000{10,1000,}) v6      
13273 2023-06-05T10:21:47.735+0800 7fd70cf6a640 10 mds.0.server rdlock_path_xlock_dentry request(client.4316:7 nref=2 cr=0x5645c46e8900) #0x1000000000f/dirk5555
13274 2023-06-05T10:21:47.735+0800 7fd70cf6a640  7 mds.0.cache traverse: opening base ino 0x1000000000f snap head

The mkdir request is then handled immediately, as shown above. The mkdir does not wait for the cap revocation to finish; it sets the correct mtime of 2023-06-05T10:21:47.732492+0800 and replies to the kclient:

13422 2023-06-05T10:21:47.735+0800 7fd70cf6a640 10 mds.0.server early_reply 0 ((0) Success) client_request(client.4316:7 mkdir #0x1000000000f/dirk5555 2023-06-05T10:21:47.732492+0800 caller_uid=1000, caller_gid=1000{10,1000,}) v6
13423 2023-06-05T10:21:47.735+0800 7fd70cf6a640 20 mds.0.server set_trace_dist snapid head
13424 2023-06-05T10:21:47.736+0800 7fd70cf6a640 10 mds.0.server set_trace_dist snaprealm snaprealm(0x1 seq 1 lc 0 cr 0 cps 1 snaps={} last_modified 2023-06-02T16:29:54.169966+0800 change_attr 0 0x5645c4408d80) len=48
13425 2023-06-05T10:21:47.736+0800 7fd70cf6a640 10  mds.0.cache.snaprealm(0x1 seq 1 0x5645c4408d80) get_snaps  (seq 1 cached_seq 1)
13426 2023-06-05T10:21:47.736+0800 7fd70cf6a640 20 mds.0.cache.ino(0x1000000000f)  pfile 1 pauth 1 plink 1 pxattr 1 plocal 1 mtime 2023-06-05T10:21:47.732492+0800 ctime 2023-06-05T10:21:47.732492+0800 change_attr 1 valid=1
13427 2023-06-05T10:21:47.736+0800 7fd70cf6a640 10 mds.0.cache.ino(0x1000000000f) encode_inodestat issuing pAsLsXsFsx seq 5
13428 2023-06-05T10:21:47.736+0800 7fd70cf6a640 10 mds.0.cache.ino(0x1000000000f) encode_inodestat caps pAsLsXsFsx seq 5 mseq 0 xattrv 0
13429 2023-06-05T10:21:47.736+0800 7fd70cf6a640 20 mds.0.server set_trace_dist added diri [inode 0x1000000000f [...2,head] /dirk4444/ auth v88 pv90 ap=2 DIRTYPARENT f() n(v0 1=0+1) (iauth sync r=1) (isnap sync r=1) (inest lock w=1) (ifile excl w=1) (ixattr excl->sync) (iversion lock w=1 last_client=4316) caps={4316=pAsLsXsFsx/pAsLsXsxFsx/pAsLsXsFsxcral@5},l=4316 | request=0 lock=5 dirfrag=1 caps=1 dirtyparent=1 dirty=1 authpin=1 0x5645c46ef080]
13430 2023-06-05T10:21:47.736+0800 7fd70cf6a640 20 mds.0.server set_trace_dist added dir  [dir 0x1000000000f /dirk4444/ [2,head] auth pv=3 v=1 cv=0/0 ap=1+2 state=1610612737|complete f() n() hs=0+1,ss=0+0 | child=1 dirty=1 authpin=1 0x5645c4670d00]
13431 2023-06-05T10:21:47.736+0800 7fd70cf6a640 20 mds.0.locker issue_client_lease no/null lease on [dentry #0x1/dirk4444/dirk5555 [2,head] auth NULL (dn xlockdone x=1) (dversion lock w=1 last_client=4316) pv=2 v=1 ap=2 ino=(nil) state=1073741824 | request=1 lock=2 authpin=1 0x5645c43e9900]
13432 2023-06-05T10:21:47.736+0800 7fd70cf6a640 10  mds.0.cache.snaprealm(0x1 seq 1 0x5645c4408d80) get_snaps  (seq 1 cached_seq 1)
13433 2023-06-05T10:21:47.736+0800 7fd70cf6a640 20 mds.0.cache.ino(0x10000000010)  pfile 1 pauth 1 plink 1 pxattr 1 plocal 0 mtime 2023-06-05T10:21:47.732492+0800 ctime 2023-06-05T10:21:47.732492+0800 change_attr 0 valid=1
13434 2023-06-05T10:21:47.736+0800 7fd70cf6a640 10 mds.0.cache.ino(0x10000000010) encode_inodestat issuing pAsxLsXsxFsx seq 1
13435 2023-06-05T10:21:47.736+0800 7fd70cf6a640 10 mds.0.cache.ino(0x10000000010) encode_inodestat caps pAsxLsXsxFsx seq 1 mseq 0 xattrv 1
13436 2023-06-05T10:21:47.736+0800 7fd70cf6a640 10 mds.0.cache.ino(0x10000000010) including xattrs version 1

The MDS then released the mds_lock and immediately processed the cap update request, which is the ack for the cap revocation from the kclient, but it carried the stale mtime of 2023-06-05T10:21:35.051903+0800:

13453 2023-06-05T10:21:47.736+0800 7fd70cf6a640  1 -- v1:192.168.0.104:6813/2845063001 <== client.4316 v1:192.168.0.104:0/2423448585 63 ==== client_caps(update ino 0x1000000000f 1 seq 4 tid 2 caps=pAsLsXsFsx dirty=Fx wanted=pAsLsXsFsxcral follows 1 size 0/0 mtime 2023-06-05T10:21:35.051903+0800 ctime 2023-06-05T10:21:35.051903+0800 change_attr 1 tws 2 xattrs(v=18446612692466989120 l=0)) v12 ==== 268+0+0 (unknown 682564261 0 0) 0x5645c46e1500 con 0x5645c47bd800
13454 2023-06-05T10:21:47.736+0800 7fd70cf6a640  7 mds.0.locker handle_client_caps  on 0x1000000000f tid 2 follows 1 op update flags 0x0
13455 2023-06-05T10:21:47.736+0800 7fd70cf6a640 20 mds.0.3 get_session have 0x5645c4651200 client.4316 v1:192.168.0.104:0/2423448585 state open
13456 2023-06-05T10:21:47.736+0800 7fd70cf6a640 10 mds.0.locker  head inode [inode 0x1000000000f [...2,head] /dirk4444/ auth v88 pv90 ap=2 DIRTYPARENT f() n(v0 1=0+1) (isnap sync r=1) (inest lock w=1) (ifile excl w=1) (ixattr excl->sync) (iversion lock w=1 last_client=4316) caps={4316=pAsLsXsFsx/pAsLsXsxFsx/pAsLsXsFsxcral@5},l=4316 | request=0 lock=4 dirfrag=1 caps=1 dirtyparent=1 dirty=1 authpin=1 0x5645c46ef080]
13457 2023-06-05T10:21:47.736+0800 7fd70cf6a640 10 mds.0.cache pick_inode_snap follows 1 on [inode 0x1000000000f [...2,head] /dirk4444/ auth v88 pv90 ap=2 DIRTYPARENT f() n(v0 1=0+1) (isnap sync r=1) (inest lock w=1) (ifile excl w=1) (ixattr excl->sync) (iversion lock w=1 last_client=4316) caps={4316=pAsLsXsFsx/pAsLsXsxFsx/pAsLsXsFsxcral@5},l=4316 | request=0 lock=4 dirfrag=1 caps=1 dirtyparent=1 dirty=1 authpin=1 0x5645c46ef080]

The ack had actually arrived earlier, while the MDS was still handling the mkdir request, but it was waiting for the mds_lock to be released. And since the cap update carries dirty=Fx, the MDS allows the mtime to be force-updated instead of only advancing it monotonically.

This is why the mtime did not change.
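
As a rough illustration of the merge rule being described, here is a hypothetical Python sketch (the function and its parameters are made up for illustration, this is not the actual MDS code):

# Hypothetical sketch of the merge rule described above; NOT the
# actual MDS implementation. A cap flush carrying dirty Fx means the
# client held exclusive file caps, so its cached mtime is applied
# verbatim even when it is older than what the MDS already has.
def merge_mtime(mds_mtime, flushed_mtime, dirty_fx):
    if dirty_fx:
        # Force-update path: this is what puts the stale mtime back.
        return flushed_mtime
    # Otherwise mtime only moves forward.
    return max(mds_mtime, flushed_mtime)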

libcephfs has different logic here, so it does not have this issue.

I am still going through the code to find a way to fix it.

Actions #3

Updated by Xiubo Li 12 months ago

Well, even after fixing the issue described in https://tracker.ceph.com/issues/61584#note-2, the problem still exists.

I checked the kclient code again and found that when the Fx caps are issued, the kclient caches the changes triggered by touch. The following mkdir request then won't revoke the parent dir's Fx caps, which means it won't ensure that the client-side dirty metadata is flushed first.

So just after the mkdir request finishes, if the dirty Fx caps are flushed, they will overwrite the metadata that the mkdir request updated.
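
Roughly, the ordering looks like the following sketch (illustrative only, not real client or MDS code; T1 and T2 are hypothetical timestamps with T2 later than T1):

# Illustrative timeline of the race described above (not real client
# or MDS code; T1 and T2 are hypothetical timestamps, T2 later than T1).
timeline = [
    ("client", "touch dir1          -> cache parent mtime=T1 as dirty under Fx"),
    ("client", "mkdir dir1/dir2     -> sent without flushing the dirty Fx state"),
    ("mds",    "handle mkdir        -> set parent mtime=T2, reply to client"),
    ("client", "flush dirty Fx caps -> cap update still carries cached mtime=T1"),
    ("mds",    "apply cap update    -> dirty Fx forces parent mtime back to T1"),
]
for actor, event in timeline:
    print(f"{actor:>6}: {event}")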

Actions #4

Updated by Xiubo Li 12 months ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 51931
Actions #5

Updated by Venky Shankar 8 months ago

Xiubo Li wrote:

I reproduced it by creating dirk4444/dirk5555 and found the root cause:

[...]

When making dir dirk5555, the mkdir request nests a cap release:

[...]

Then the cap release is processed first, just before the mkdir is handled; see the process_request_cap_release line in the log above.

The cap release then triggers revocation of the Xx caps for dir dirk4444:

Sounds fine till here.

[...]

The mkdir request is then handled immediately, as shown above. The mkdir does not wait for the cap revocation to finish; it sets the correct mtime of 2023-06-05T10:21:47.732492+0800 and replies to the kclient:

This behaviour is not kclient specific, yes?

[...]

The MDS then released the mds_lock and immediately processed the cap update request, which is the ack for the cap revocation from the kclient, but it carried the stale mtime of 2023-06-05T10:21:35.051903+0800:

[...]

The ack had actually arrived earlier, while the MDS was still handling the mkdir request, but it was waiting for the mds_lock to be released. And since the cap update carries dirty=Fx, the MDS allows the mtime to be force-updated instead of only advancing it monotonically.

The MDS seems to be doing the right thing - it's only ctime that needs to be monotonically updated. mtime can go backwards, but in this case the "stale" mtime is the cached mtime held by the client (when it has the Fx caps), and that is overwriting the parent mtime updated by the mkdir request.
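
The stat output in the description already shows this distinction: Change (ctime) advanced to the mkdir time while Modify (mtime) was dragged back to the touch time by the client's cap flush. A tiny sketch to observe it, assuming the illustrative path /mnt/cephfs/dir1 from the reproduction:

# Observe the ctime/mtime distinction after the reproduction; the
# path /mnt/cephfs/dir1 is an illustrative assumption. On an affected
# system, ctime reflects the mkdir while mtime still reflects the
# earlier touch.
import os
st = os.stat("/mnt/cephfs/dir1")
print(f"mtime={st.st_mtime}  ctime={st.st_ctime}")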

This is why the mtime did not change.

libcephfs has different logic here, so it does not have this issue.

I am still going through the code to find a way to fix it.

AFAICS, this issue should also be reproducible with libcephfs.

Actions #6

Updated by Xiubo Li 8 months ago

Venky Shankar wrote:

Xiubo Li wrote:

I reproduced it by creating dirk4444/dirk5555 and found the root cause:

[...]

When making dir dirk5555, the mkdir request nests a cap release:

[...]

Then the cap release is processed first, just before the mkdir is handled; see the process_request_cap_release line in the log above.

The cap release then triggers revocation of the Xx caps for dir dirk4444:

Sounds fine till here.

[...]

The mkdir request is then handled immediately, as shown above. The mkdir does not wait for the cap revocation to finish; it sets the correct mtime of 2023-06-05T10:21:47.732492+0800 and replies to the kclient:

This behaviour is not kclient specific, yes?

No, it's not kclient specific; it can be reproduced with both the kclient and libcephfs.

[...]

The MDS then released the mds_lock and immediately processed the cap update request, which is the ack for the cap revocation from the kclient, but it carried the stale mtime of 2023-06-05T10:21:35.051903+0800:

[...]

The ack had actually arrived earlier, while the MDS was still handling the mkdir request, but it was waiting for the mds_lock to be released. And since the cap update carries dirty=Fx, the MDS allows the mtime to be force-updated instead of only advancing it monotonically.

The MDS seems to be doing the right thing - it's only ctime that needs to be monotonically updated. mtime can go backwards, but in this case the "stale" mtime is the cached mtime held by the client (when it has the Fx caps), and that is overwriting the parent mtime updated by the mkdir request.

Yeah, correct.

This is why the mtime did not change.

libcephfs has different logic here, so it does not have this issue.

I am still going through the code to find a way to fix it.

AFAICS, this issue should also be reproducible with libcephfs.

Yeah. It is.

Actions #7

Updated by Venky Shankar 8 months ago

Xiubo Li wrote:

Venky Shankar wrote:

Xiubo Li wrote:

I reproduced it by creating dirk4444/dirk5555 and found the root cause:

[...]

When making dir dirk5555, the mkdir request nests a cap release:

[...]

Then the cap release is processed first, just before the mkdir is handled; see the process_request_cap_release line in the log above.

The cap release then triggers revocation of the Xx caps for dir dirk4444:

Sounds fine till here.

[...]

The mkdir request is then handled immediately, as shown above. The mkdir does not wait for the cap revocation to finish; it sets the correct mtime of 2023-06-05T10:21:47.732492+0800 and replies to the kclient:

This behaviour is not kclient specific, yes?

No, it's not kclient specific; it can be reproduced with both the kclient and libcephfs.

[...]

The MDS then released the mds_lock and immediately processed the cap update request, which is the ack for the cap revocation from the kclient, but it carried the stale mtime of 2023-06-05T10:21:35.051903+0800:

[...]

The ack had actually arrived earlier, while the MDS was still handling the mkdir request, but it was waiting for the mds_lock to be released. And since the cap update carries dirty=Fx, the MDS allows the mtime to be force-updated instead of only advancing it monotonically.

The MDS seems to be doing the right thing - it's only ctime that needs to be monotonically updated. mtime can go backwards, but in this case the "stale" mtime is the cached mtime held by the client (when it has the Fx caps), and that is overwriting the parent mtime updated by the mkdir request.

Yeah, correct.

This is why the mtime did not change.

libcephfs has different logic here, so it does not have this issue.

I am still going through the code to find a way to fix it.

AFAICS, this issue should also be reproducible with libcephfs.

Yeah. It is.

ACK. I got worried when I read https://tracker.ceph.com/issues/61584#note-1 :)
