Bug #61781

mds: couldn't successfully calculate the locker caps

Added by Xiubo Li 8 months ago. Updated 7 months ago.

Status: Pending Backport
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 100%
Source:
Tags: backport_processed
Backport: reef,quincy,pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):


Subtasks

Linux kernel client - Bug #61892: [testing] qa: test_snapshot_remove (tasks.cephfs.test_strays.TestStrays) Duplicate Xiubo Li


Related issues

Copied to CephFS - Backport #62197: quincy: mds: couldn't successfully calculate the locker caps Duplicate
Copied to CephFS - Backport #62198: reef: mds: couldn't successfully calculate the locker caps Duplicate
Copied to CephFS - Backport #62199: pacific: mds: couldn't successfully calculate the locker caps Duplicate

History

#1 Updated by Xiubo Li 8 months ago

After the caps were only partially revoked, the MDS couldn't correctly update the issued caps:

2023-06-15T06:33:23.297+0000 7fde93b16700  1 -- [v2:172.21.15.29:6832/1469037096,v1:172.21.15.29:6833/1469037096] <== client.4941 192.168.0.1:0/1731885576 4979 ==== client_caps(update ino 0x3000000025b 1 seq 18 caps=pAsLsXsFcrb dirty=- wanted=Fcr follows 0 size 5984600/12582912 ts 1/18446744073709551615 mtime 2023-06-15T06:33:20.586624+0000 ctime 2023-06-15T06:33:20.586624+0000 change_attr 136 tws 1) v12 ==== 260+0+0 (crc 0 0 0) 0x556a82154380 con 0x556a81cc5c00
2023-06-15T06:33:23.297+0000 7fde93b16700  7 mds.2.locker handle_client_caps  on 0x3000000025b tid 0 follows 0 op update flags 0x0
2023-06-15T06:33:23.297+0000 7fde93b16700 20 mds.2.12 get_session have 0x556a81c89900 client.4941 192.168.0.1:0/1731885576 state open
2023-06-15T06:33:23.297+0000 7fde93b16700 10 mds.2.locker  head inode [inode 0x3000000025b [...2,head] /client.0/tmp/tmp.o8Xb5OZu0n/p8/db/d23/f28 auth v128 ap=2 snaprealm=0x556a8355cb40 DIRTYPARENT s=5984600 nl=2 n(v0 rc2023-06-15T06:33:20.586624+0000 b5984600 1=1+0) (ifile excl->lock) (iversion lock) cr={4941=0-12582912@1} caps={4941=pAsLsXsFcb/pAsLsXsFxcrb/Fcr@18},l=4941 | ptrwaiter=0 request=1 lock=1 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x556a83664000]
2023-06-15T06:33:23.297+0000 7fde93b16700 10 Capability revocation is not totally finished yet on [inode 0x3000000025b [...2,head] /client.0/tmp/tmp.o8Xb5OZu0n/p8/db/d23/f28 auth v128 ap=2 snaprealm=0x556a8355cb40 DIRTYPARENT s=5984600 nl=2 n(v0 rc2023-06-15T06:33:20.586624+0000 b5984600 1=1+0) (ifile excl->lock) (iversion lock) cr={4941=0-12582912@1} caps={4941=pAsLsXsFcb/pAsLsXsFcrb/Fcr@18},l=4941 | ptrwaiter=0 request=1 lock=1 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x556a83664000], the session  (4941)
2023-06-15T06:33:23.297+0000 7fde93b16700 10 mds.2.locker  follows 0 retains pAsLsXsFcrb dirty - on [inode 0x3000000025b [...2,head] /client.0/tmp/tmp.o8Xb5OZu0n/p8/db/d23/f28 auth v128 ap=2 snaprealm=0x556a8355cb40 DIRTYPARENT s=5984600 nl=2 n(v0 rc2023-06-15T06:33:20.586624+0000 b5984600 1=1+0) (ifile excl->lock) (iversion lock) cr={4941=0-12582912@1} caps={4941=pAsLsXsFcb/Fcr@18},l=4941 | ptrwaiter=0 request=1 lock=1 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x556a83664000]
2023-06-15T06:33:23.297+0000 7fde93b16700 10 mds.2.locker _do_cap_update dirty - issued pAsLsXsFcb wanted Fcr on [inode 0x3000000025b [...2,head] /client.0/tmp/tmp.o8Xb5OZu0n/p8/db/d23/f28 auth v128 ap=2 snaprealm=0x556a8355cb40 DIRTYPARENT s=5984600 nl=2 n(v0 rc2023-06-15T06:33:20.586624+0000 b5984600 1=1+0) (ifile excl->lock) (iversion lock) cr={4941=0-12582912@1} caps={4941=pAsLsXsFcb/Fcr@18},l=4941 | ptrwaiter=0 request=1 lock=1 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x556a83664000]
2023-06-15T06:33:23.297+0000 7fde93b16700 20 mds.2.locker inode is file
2023-06-15T06:33:23.297+0000 7fde93b16700 20 mds.2.locker client has write caps; m->get_max_size=12582912; old_max=12582912
2023-06-15T06:33:23.297+0000 7fde93b16700 10 mds.2.locker eval 3648 [inode 0x3000000025b [...2,head] /client.0/tmp/tmp.o8Xb5OZu0n/p8/db/d23/f28 auth v128 ap=2 snaprealm=0x556a8355cb40 DIRTYPARENT s=5984600 nl=2 n(v0 rc2023-06-15T06:33:20.586624+0000 b5984600 1=1+0) (ifile excl->lock) (iversion lock) cr={4941=0-12582912@1} caps={4941=pAsLsXsFcb/Fcr@18},l=4941 | ptrwaiter=0 request=1 lock=1 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x556a83664000]
2023-06-15T06:33:23.297+0000 7fde93b16700 10 mds.2.locker eval_gather (ifile excl->lock) on [inode 0x3000000025b [...2,head] /client.0/tmp/tmp.o8Xb5OZu0n/p8/db/d23/f28 auth v128 ap=2 snaprealm=0x556a8355cb40 DIRTYPARENT s=5984600 nl=2 n(v0 rc2023-06-15T06:33:20.586624+0000 b5984600 1=1+0) (ifile excl->lock) (iversion lock) cr={4941=0-12582912@1} caps={4941=pAsLsXsFcb/Fcr@18},l=4941 | ptrwaiter=0 request=1 lock=1 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x556a83664000]
2023-06-15T06:33:23.297+0000 7fde93b16700 10 mds.2.locker  next state is lock issued/allows loner cb/cb xlocker /cb other /cb
2023-06-15T06:33:23.297+0000 7fde93b16700  7 mds.2.locker eval_gather finished gather on (ifile excl->lock) on [inode 0x3000000025b [...2,head] /client.0/tmp/tmp.o8Xb5OZu0n/p8/db/d23/f28 auth v128 ap=2 snaprealm=0x556a8355cb40 DIRTYPARENT s=5984600 nl=2 n(v0 rc2023-06-15T06:33:20.586624+0000 b5984600 1=1+0) (ifile excl->lock) (iversion lock) cr={4941=0-12582912@1} caps={4941=pAsLsXsFcb/Fcr@18},l=4941 | ptrwaiter=0 request=1 lock=1 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x556a83664000]

The cap update message tells the MDS that the client is still holding caps=pAsLsXsFcrb. The MDS was trying to revoke the Fxr caps, but the client released only the Fx caps and is still holding the Fr caps. The MDS then simply overrode the inode's issued caps as caps={4941=pAsLsXsFcb/Fcr@18}, which is incorrect.
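The mismatch can be checked with plain set arithmetic on the three cap strings from the log above. This is only a minimal Python sketch for illustration, not the MDS code: parse_caps and all variable names are hypothetical, and the "expected" result simply assumes that the issued set should keep whatever the client reports it still holds.

```python
def parse_caps(text):
    """Split a cap string such as 'pAsLsXsFcrb' into per-bit tokens.

    Hypothetical helper: an uppercase letter starts a group (A, L, X, F),
    the lowercase letters that follow are bits of that group, and a
    leading lowercase letter (the 'p' pin bit) stands on its own.
    """
    caps = set()
    group = None
    for ch in text:
        if ch.isupper():
            group = ch
        elif group is None:
            caps.add(ch)
        else:
            caps.add(group + ch)
    return caps


# Values taken from the log: caps={4941=pAsLsXsFcb/pAsLsXsFxcrb/Fcr@18}
# and the client message caps=pAsLsXsFcrb.
pending = parse_caps("pAsLsXsFcb")          # what the MDS wants the client to keep
issued_before = parse_caps("pAsLsXsFxcrb")  # what was issued before the revoke
client_holds = parse_caps("pAsLsXsFcrb")    # what the client says it still holds

to_revoke = issued_before - pending    # caps the MDS is revoking
released = issued_before - client_holds  # caps the client actually gave back
still_held = client_holds - pending    # revoked caps the client has not released

# Assumed expected bookkeeping: keep the caps the client still holds.
expected_issued = pending | (client_holds & issued_before)
# Behaviour reported in this tracker: issued is overwritten with pending.
reported_issued = set(pending)

print("to revoke: ", sorted(to_revoke))
print("released:  ", sorted(released))
print("still held:", sorted(still_held))
print("lost by overwrite:", sorted(expected_issued - reported_issued))
```

With these inputs the revoke covers Fx and Fr, only Fx has been released, and overwriting the issued set with the pending set drops Fr even though the client still holds it.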

#2 Updated by Xiubo Li 8 months ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 52176

#3 Updated by Venky Shankar 8 months ago

Xiubo - similar failure here: /a/vshankar-2023-07-12_07:14:06-fs-wip-vshankar-testing-20230712.041849-testing-default-smithi/7334887

Could you confirm whether it's the same issue as detailed in this tracker?

#4 Updated by Xiubo Li 8 months ago

Sure, will do it tomorrow. Thanks.

#5 Updated by Venky Shankar 8 months ago

No rush. The PRs under testing were user-space changes, so it's not related to that change. I just wanted to double-check that it's not a new bug we are running into.

#6 Updated by Xiubo Li 8 months ago

This is a known issue: https://tracker.ceph.com/issues/61818.

#7 Updated by Venky Shankar 7 months ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to reef,quincy,pacific

#8 Updated by Backport Bot 7 months ago

  • Copied to Backport #62197: quincy: mds: couldn't successfully calculate the locker caps added

#9 Updated by Backport Bot 7 months ago

  • Copied to Backport #62198: reef: mds: couldn't successfully calculate the locker caps added

#10 Updated by Backport Bot 7 months ago

  • Copied to Backport #62199: pacific: mds: couldn't successfully calculate the locker caps added

#11 Updated by Backport Bot 7 months ago

  • Tags set to backport_processed
