Actions
Bug #45935
closedmds: cap revoking requests didn't success when the client doing reconnection
% Done:
0%
Source:
Development
Tags:
Backport:
octopus,nautilus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS, kceph
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
The kclient node log:
53 2020-06-03T01:04:06.610312+00:00 smithi070 kernel: [ 825.023480] ceph: mds0 reconnect start 54 2020-06-03T01:04:06.658346+00:00 smithi070 kernel: [ 825.069176] ceph: mds0 reconnect success 55 2020-06-03T01:04:08.634315+00:00 smithi070 kernel: [ 827.045544] ceph: mds0 recovery completed
mds.c daemon send the revoke request:
181639 2020-06-03T01:04:08.637+0000 7fbd51f2a700 20 mds.0.locker client.4394 pending pAsLsXsFsc allowed pAsxLsXsxFrw wanted pAsxXsxFxcwb 181640 2020-06-03T01:04:08.637+0000 7fbd51f2a700 7 mds.0.locker sending MClientCaps to client.4394 seq 1 new pending pAsLsXs was pAsLsX sFsc 181641 2020-06-03T01:04:08.637+0000 7fbd51f2a700 20 mds.0.cache.ino(0x10000000392) encode_cap_message pfile 0 pauth 0 plink 0 pxattr 0 ctim e 2020-06-03T01:00:43.722299+0000 181642 2020-06-03T01:04:08.637+0000 7fbd51f2a700 10 mds.0.cache.ino(0x10000000392) including xattrs v 1 181643 2020-06-03T01:04:08.637+0000 7fbd51f2a700 10 mds.0.23 send_message_client_counted client.4394 seq 5 client_caps(revoke ino 0x1000000 0392 65 seq 1 caps=pAsLsXs dirty=- wanted=pAsxXsxFxcwb follows 0 size 104857600/213909504 ts 1/18446744073709551615 mtime 2020-06-03 T01:00:43.722299+0000 xattrs(v=1 l=4)) v11 181644 2020-06-03T01:04:08.637+0000 7fbd51f2a700 1 -- [v2:172.21.15.107:6834/313709009,v1:172.21.15.107:6835/313709009] --> v1:172.21.15.7 0:0/3065332795 -- client_caps(revoke ino 0x10000000392 65 seq 1 caps=pAsLsXs dirty=- wanted=pAsxXsxFxcwb follows 0 size 104857600/21 3909504 ts 1/18446744073709551615 mtime 2020-06-03T01:00:43.722299+0000 xattrs(v=1 l=4)) v11 -- 0x56150dcf3c00 con 0x56150c865800 181645 2020-06-03T01:04:08.637+0000 7fbd51f2a700 10 mds.0.locker eval done 181646 2020-06-03T01:04:08.637+0000 7fbd51f2a700 1 -- [v2:172.21.15.107:6834/313709009,v1:172.21.15.107:6835/313709009] --> [v2:172.21.15. 107:6802/15489,v1:172.21.15.107:6803/15489] -- osd_op(unknown.0.23:31 3.3 3:ddb4e4d7:::200.00000002:head [write 1516350~1094 in=1094 b] snapc 0=[] ondisk+write+known_if_redirected+full_force e26) v8 -- 0x56150d6172c0 con 0x56150c864400
mds.c received the cap update request from kclient, but didn't trigger cap->confirm_receipt() to remove the cap from the revoking list:
189429 2020-06-03T01:04:09.249+0000 7fbd51f2a700 1 -- [v2:172.21.15.107:6834/313709009,v1:172.21.15.107:6835/313709009] <== client.4394 v1 :172.21.15.70:0/3065332795 78 ==== client_caps(update ino 0x10000000392 65 seq 1 caps=pAsLsXs dirty=- wanted=pAsxXsxFxwb follows 0 s ize 104857600/0 mtime 2020-06-03T01:04:03.072493+0000 xattrs(v=17179869185 l=0)) v10 ==== 236+0+0 (unknown 2146284701 0 0) 0x56150d5 17300 con 0x56150c865800 189430 2020-06-03T01:04:09.249+0000 7fbd51f2a700 7 mds.0.locker handle_client_caps on 0x10000000392 tid 0 follows 0 op update flags 0x0 189431 2020-06-03T01:04:09.249+0000 7fbd51f2a700 20 mds.0.23 get_session have 0x56150c8e3a80 client.4394 v1:172.21.15.70:0/3065332795 state open 189432 2020-06-03T01:04:09.249+0000 7fbd51f2a700 10 mds.0.locker head inode [inode 0x10000000392 [2,head] /client.0/tmp/tmp/data/datafile7 1 auth v2721 dirtyparent s=104857600 n(v0 rc2020-06-03T01:04:03.072493+0000 b104857600 1=1+0)/n(v0 rc2020-06-03T01:00:43.722299+0000 b104857600 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4394=0-213909504@1} caps={4394=pAsxLsXsxFsxcrwb/pAsxX sxFxcwb@3},l=4394 | ptrwaiter=0 lock=0 caps=1 dirtyrstat=1 dirtyparent=1 dirty=1 authpin=0 0x56150dc1d000] 189433 2020-06-03T01:04:09.249+0000 7fbd51f2a700 10 mds.0.locker follows 0 retains pAsLsXs dirty - on [inode 0x10000000392 [2,head] /clien t.0/tmp/tmp/data/datafile71 auth v2721 dirtyparent s=104857600 n(v0 rc2020-06-03T01:04:03.072493+0000 b104857600 1=1+0)/n(v0 rc2020- 06-03T01:00:43.722299+0000 b104857600 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4394=0-213909504@1} caps={4 394=pAsxLsXsxFsxcrwb/pAsxXsxFxcwb@3},l=4394 | ptrwaiter=0 lock=0 caps=1 dirtyrstat=1 dirtyparent=1 dirty=1 authpin=0 0x56150dc1d000] 189434 2020-06-03T01:04:09.249+0000 7fbd51f2a700 10 mds.0.locker wanted pAsxXsxFxcwb -> pAsxXsxFxwb 189435 2020-06-03T01:04:09.249+0000 7fbd51f2a700 10 mds.0.locker _do_cap_update dirty - issued pAsxLsXsxFsxcrwb wanted pAsxXsxFxwb on [inod e 0x10000000392 [2,head] /client.0/tmp/tmp/data/datafile71 auth v2721 dirtyparent s=104857600 n(v0 rc2020-06-03T01:04:03.072493+0000 b104857600 1=1+0)/n(v0 rc2020-06-03T01:00:43.722299+0000 b104857600 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4394=0-213909504@1} caps={4394=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@3},l=4394 | ptrwaiter=0 lock=0 caps=1 dirtyrstat=1 dirtyparent=1 dir ty=1 authpin=0 0x56150dc1d000]
Then at last the revoking caps timeout:
278783 2020-06-03T01:05:11.103+0000 7fbd50727700 20 mds.0.locker caps_tick age = 62.463717 client.4394.0x10000000392 278784 2020-06-03T01:05:11.103+0000 7fbd50727700 0 log_channel(cluster) log [WRN] : client.4394 isn't responding to mclientcaps(revoke), i no 0x10000000392 pending pAsxLsXsxFsxcrwb issued pAsxLsXsxFsxcrwb, sent 62.463717 seconds ago 278785 2020-06-03T01:05:11.103+0000 7fbd50727700 20 mds.0.locker caps_tick client.4394 isn't responding to mclientcaps(revoke), ino 0x10000 000392 pending pAsxLsXsxFsxcrwb issued pAsxLsXsxFsxcrwb, sent 62.463717 seconds ago
Actions