Project

General

Profile

Bug #55240

mds: stuck 2 seconds and keeps retrying to find ino from auth MDS

Added by Venky Shankar 8 months ago. Updated about 2 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Correctness/Safety
Target version:
% Done:

100%

Source:
Tags:
Backport:
quincy, pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
multimds, task(medium)
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Seen here: https://pulpito.ceph.com/vshankar-2022-04-07_05:07:33-fs-master-testing-default-smithi/6780578/

Its an upgrade test. cephadm disables standby-replay and reduced max_mds to 1 before upgrade (waits for a single active mds by checking mdsmap). The test above has 2 active MDSs, with each MDS configured with a standby-replay MDS daemon. Disabling standby-replay goes fine - MDSs transition to standby. However, after setting `max_mds = 1', one of the active MDS is stuck in `up:stopping' and never transitions to `down:stopped'. The test fails after hitting "max job timeout"

Problematic MDS log: ./remote/smithi176/log/aa4d8d7c-b63b-11ec-8c36-001a4aab830c/ceph-mds.cephfs.smithi176.ttvthc.log.gz

2022-04-07T06:41:43.092+0000 7fe6f18dc700 10 mds.cephfs.smithi176.ttvthc my gid is 24327
2022-04-07T06:41:43.092+0000 7fe6f18dc700 10 mds.cephfs.smithi176.ttvthc map says I am mds.1.8 state up:stopping
2022-04-07T06:41:43.092+0000 7fe6f18dc700 10 mds.cephfs.smithi176.ttvthc msgr says I am [v2:172.21.15.176:6824/2074966087,v1:172.21.15.176:6825/2074966087]
2022-04-07T06:41:43.092+0000 7fe6f18dc700 10 mds.cephfs.smithi176.ttvthc handle_mds_map: handling map as rank 1
2022-04-07T06:41:43.092+0000 7fe6f18dc700 10 notify_mdsmap: mds.metrics
2022-04-07T06:41:43.092+0000 7fe6f18dc700 10 notify_mdsmap: mds.metrics: rank0 is mds.cephfs.smithi116.vbikdi
2022-04-07T06:41:43.092+0000 7fe6ed0d3700  7 mds.1.8 mds has 1 queued contexts
2022-04-07T06:41:43.092+0000 7fe6ed0d3700 10 mds.1.8  finish 0x55c7560da940

The MDS seems to be waiting for some event to reach completion (maybe exporting a dir?).


Related issues

Copied to Linux kernel client - Bug #55377: kclient: mds revoke Fwb caps stuck after the kclient tries writebcak once Resolved
Copied to CephFS - Backport #55658: quincy: mds: stuck 2 seconds and keeps retrying to find ino from auth MDS Resolved
Copied to CephFS - Backport #55659: pacific: mds: stuck 2 seconds and keeps retrying to find ino from auth MDS Resolved

History

#1 Updated by Venky Shankar 8 months ago

  • Status changed from New to Triaged
  • Assignee set to Xiubo Li

#2 Updated by Xiubo Li 7 months ago

  • Status changed from Triaged to In Progress

#3 Updated by Xiubo Li 7 months ago

The file #0x1/client.0/tmp/fsstress/ltp-full-20091231/testcases/kernel/fs/fsstress/fsstress was created in mds.0 at 2022-04-07T06:34:53.393 by client.14520:

2022-04-07T06:34:53.393+0000 7f73ae424700  1 -- [v2:172.21.15.116:6826/4138533272,v1:172.21.15.116:6827/4138533272] <== client.14520 v1:192.168.0.1:0/3995310513 23837 ==== client_request(client.14520:11993 create #0x10000005033/fsstress 2022-04-07T06:34:53.392872+0000 caller_uid=1000, caller_gid=1258{6,36,1000,1258,}) v4 ==== 176+0+62 (unknown 1577985864 0 1524826435) 0x55b26523e000 con 0x55b2619b9400
...
2022-04-07T06:34:53.393+0000 7f73ae424700 10 mds.0.server prepare_new_inode used_prealloc 0x1000000615a (1037 left)
2022-04-07T06:34:53.393+0000 7f73ae424700 10 mds.0.server  dir mode 040775 new mode 0100664
2022-04-07T06:34:53.393+0000 7f73ae424700 10 mds.0.server prepare_new_inode setting xattrs {security.selinux=buffer::ptr(0~34 0x55b266e834d0 in raw 0x55b266e834d0 len 34 nref 1)}
2022-04-07T06:34:53.393+0000 7f73ae424700 10 mds.0.server prepare_new_inode [inode 0x1000000615a [2,head] #1000000615a auth v1 s=0 n() (iversion lock) 0x55b2658e6680]

But immediately the client.14520 just send a getattr request to mds.1:

2022-04-07T06:34:53.430+0000 7fe6f18dc700  1 -- [v2:172.21.15.176:6824/2074966087,v1:172.21.15.176:6825/2074966087] <== client.14520 v1:192.168.0.1:0/3995310513 343 ==== client_request(client.14520:12001 getattr Fsr #0x1000000615a 2022-04-07T06:34:53.430871+0000 caller_uid=1000, caller_gid=1258{6,36,1000,1258,}) v4 ==== 168+0+0 (unknown 1001837266 0 0) 0x55c755025b80 con 0x55c750114400
2022-04-07T06:34:53.431+0000 7fe6f18dc700  4 mds.1.server handle_client_request client_request(client.14520:12001 getattr Fsr #0x1000000615a 2022-04-07T06:34:53.430871+0000 caller_uid=1000, caller_gid=1258{6,36,1000,1258,}) v4
...
2022-04-07T06:34:53.431+0000 7fe6f18dc700  7 mds.1.cache traverse: opening base ino 0x1000000615a snap head
2022-04-07T06:34:53.431+0000 7fe6f18dc700 10 mds.1.server rdlock_path_pin_ref request(client.14520:12001 nref=2 cr=0x55c755025b80) #0x1000000615a
2022-04-07T06:34:53.431+0000 7fe6f18dc700  7 mds.1.cache traverse: opening base ino 0x1000000615a snap head
2022-04-07T06:34:53.431+0000 7fe6f18dc700 10 mds.1.server FAIL on CEPHFS_ESTALE but attempting recovery
2022-04-07T06:34:53.431+0000 7fe6f18dc700  5 mds.1.cache find_ino_peers 0x1000000615a hint -1
2022-04-07T06:34:53.431+0000 7fe6f18dc700 10 mds.1.cache _do_find_ino_peer 1 0x1000000615a active 0,1 all 0,1 checked
2022-04-07T06:34:53.431+0000 7fe6f18dc700  1 -- [v2:172.21.15.176:6824/2074966087,v1:172.21.15.176:6825/2074966087] send_to--> mds [v2:172.21.15.116:6826/4138533272,v1:172.21.15.116:6827/4138533272] -- findino(1 0x1000000615a) v1 -- ?+0 0x55c754910580
2022-04-07T06:34:53.431+0000 7fe6f18dc700  1 -- [v2:172.21.15.176:6824/2074966087,v1:172.21.15.176:6825/2074966087] --> [v2:172.21.15.116:6826/4138533272,v1:172.21.15.116:6827/4138533272] -- findino(1 0x1000000615a) v1 -- 0x55c754910580 con 0x55c750115000
2022-04-07T06:34:53.431+0000 7fe6f18dc700  1 -- [v2:172.21.15.176:6824/2074966087,v1:172.21.15.176:6825/2074966087] <== mds.0 v2:172.21.15.116:6826/4138533272 3206 ==== findinoreply(1 #0x1000000615a) v1 ==== 21+0+0 (crc 0 0 0) 0x55c74ffa36c0 con 0x55c750115000
2022-04-07T06:34:53.431+0000 7fe6f18dc700 10 mds.1.cache handle_find_ino_reply findinoreply(1 #0x1000000615a) v1
2022-04-07T06:34:53.431+0000 7fe6f18dc700  7 mds.1.cache traverse: opening base ino 0x1000000615a snap head
2022-04-07T06:34:53.431+0000 7fe6f18dc700  0 mds.1.cache handle_find_ino_reply failed with -116 on #0x1000000615a, retrying
...

It failed with -116 on #0x1000000615a, retrying for hundreds times by getting path findinoreply(1 #0x1000000615a), until around 2 seconds later it get the full path findinoreply(1 #0x1/client.0/tmp/fsstress/ltp-full-20091231/testcases/kernel/fs/fsstress/fsstress) from mds.0

2022-04-07T06:34:54.886+0000 7fe6f18dc700  0 mds.1.cache handle_find_ino_reply failed with -116 on #0x1000000615a, retrying
2022-04-07T06:34:54.886+0000 7fe6f18dc700 10 mds.1.cache _do_find_ino_peer 1 0x1000000615a active 0,1 all 0,1 checked
2022-04-07T06:34:54.886+0000 7fe6f18dc700  1 -- [v2:172.21.15.176:6824/2074966087,v1:172.21.15.176:6825/2074966087] send_to--> mds [v2:172.21.15.116:6826/4138533272,v1:172.21.15.116:6827/4138533272] -- findino(1 0x1000000615a) v1 -- ?+0 0x55c7526a7ce0
2022-04-07T06:34:54.886+0000 7fe6f18dc700  1 -- [v2:172.21.15.176:6824/2074966087,v1:172.21.15.176:6825/2074966087] --> [v2:172.21.15.116:6826/4138533272,v1:172.21.15.116:6827/4138533272] -- findino(1 0x1000000615a) v1 -- 0x55c7526a7ce0 con 0x55c750115000
2022-04-07T06:34:55.020+0000 7fe6f48e2700  1 -- [v2:172.21.15.176:6824/2074966087,v1:172.21.15.176:6825/2074966087] <== mds.0 v2:172.21.15.116:6826/4138533272 8896 ==== mdsping v1 ==== 8+0+0 (crc 0 0 0) 0x55c754aeadc0 con 0x55c750115000
2022-04-07T06:34:55.021+0000 7fe6f48e2700 20 set_next_seq: mds.metrics: current sequence number 23, setting next sequence number 24
2022-04-07T06:34:55.027+0000 7fe6f18dc700  1 -- [v2:172.21.15.176:6824/2074966087,v1:172.21.15.176:6825/2074966087] <== mds.0 v2:172.21.15.116:6826/4138533272 8897 ==== findinoreply(1 #0x1/client.0/tmp/fsstress/ltp-full-20091231/testcases/kernel/fs/fsstress/fsstress) v1 ==== 98+0+0 (crc 0 0 0) 0x55c7540b8820 con 0x55c750115000
2022-04-07T06:34:55.027+0000 7fe6f18dc700 10 mds.1.cache handle_find_ino_reply findinoreply(1 #0x1/client.0/tmp/fsstress/ltp-full-20091231/testcases/kernel/fs/fsstress/fsstress) v1

At the same in mds.0, the mdlog was just flushed triggered by MDCache::upkeep_main() every 5 seconds:

2022-04-07T06:34:54.619+0000 7f73acc21700  1 -- [v2:172.21.15.116:6826/4138533272,v1:172.21.15.116:6827/4138533272] --> [v2:172.21.15.176:6808/3144053492,v1:172.21.15.176:6809/3144053492] -- osd_op(unknown.0.5:594 2.1c 2:3939b497:::200.00000014:head [write 0~3329427 in=3329427b] snapc 0=[] ondisk+write+known_if_redirected+full_force e43) v8 -- 0x55b266519400 con 0x55b2619a0800
2022-04-07T06:34:54.620+0000 7f73acc21700 10 mds.0.log trim 43 / 128 segments, 40033 / -1 events, 0 (0) expiring, 0 (0) expired
2022-04-07T06:34:54.620+0000 7f73acc21700 10 mds.0.openfiles commit log_seq 39492
2022-04-07T06:34:54.626+0000 7f73acc21700  1 -- [v2:172.21.15.116:6826/4138533272,v1:172.21.15.116:6827/4138533272] --> [v2:172.21.15.116:6810/4272700933,v1:172.21.15.116:6811/4272700933] -- osd_op(unknown.0.5:595 2.6 2:654134d2:::mds0_openfiles.0:head [omap-set-header in=42b,omap-set-vals in=46678b,omap-rm-keys in=39214b] snapc 0=[] ondisk+write+known_if_redirected+full_force e43) v8 -- 0x55b266519800 con 0x55b2619b2400
...
2022-04-07T06:34:55.026+0000 7f73a8418700 10 MDSIOContextBase::complete: 18C_MDS_openc_finish
2022-04-07T06:34:55.026+0000 7f73a8418700 10 MDSContext::complete: 18C_MDS_openc_finish
...
2022-04-07T06:34:55.026+0000 7f73a8418700  7 mds.0.server reply_client_request 0 ((0) Success) client_request(client.14520:11993 create #0x10000005033/fsstress 2022-04-07T06:34:53.392872+0000 caller_uid=1000, caller_gid=1258{6,36,1000,1258,}) v4
...
2022-04-07T06:34:55.026+0000 7f73ae424700 10 mds.0.cache handle_find_ino findino(1 0x1000000615a) v1
2022-04-07T06:34:55.026+0000 7f73ae424700 10 mds.0.cache  have #0x1/client.0/tmp/fsstress/ltp-full-20091231/testcases/kernel/fs/fsstress/fsstress [inode 0x1000000615a [2,head] /client.0/tmp/fsstress/ltp-full-20091231/testcases/kernel/fs/fsstress/fsstress auth v18 DIRTYPARENT s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={14520=0-4194304@1} caps={14520=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@2},l=14520 | request=0 caps=1 dirtyparent=1 dirty=1 0x55b2658e6680]
2022-04-07T06:34:55.026+0000 7f73ae424700  1 -- [v2:172.21.15.116:6826/4138533272,v1:172.21.15.116:6827/4138533272] send_to--> mds [v2:172.21.15.176:6824/2074966087,v1:172.21.15.176:6825/2074966087] -- findinoreply(1 #0x1/client.0/tmp/fsstress/ltp-full-20091231/testcases/kernel/fs/fsstress/fsstress) v1 -- ?+0 0x55b267b7fba0
2022-04-07T06:34:55.026+0000 7f73ae424700  1 -- [v2:172.21.15.116:6826/4138533272,v1:172.21.15.116:6827/4138533272] --> [v2:172.21.15.176:6824/2074966087,v1:172.21.15.176:6825/2074966087] -- findinoreply(1 #0x1/client.0/tmp/fsstress/ltp-full-20091231/testcases/kernel/fs/fsstress/fsstress) v1 -- 0x55b267b7fba0 con 0x55b2619b7000

We need to trigger to flush the mdlog manually in mds.0 in case find ino and the CInode hasn't a parent.

#4 Updated by Xiubo Li 7 months ago

Another issue in this failure:

In mds.1 after it find the inode for #0x1/client.0/tmp/fsstress/ltp-full-20091231/testcases/kernel/fs/fsstress/fsstress *, it will retry the *getarrt client request:

2022-04-07T06:34:55.052+0000 7fe6f18dc700  7 mds.1.server dispatch_client_request client_request(client.14520:12001 getattr Fsr #0x1000000615a 2022-04-07T06:34:53.430871+0000 caller_uid=1000, caller_gid=1258{6,36,1000,1258,}) v4
2022-04-07T06:34:55.052+0000 7fe6f18dc700 10 mds.1.server rdlock_path_pin_ref request(client.14520:12001 nref=3 cr=0x55c755025b80) #0x1000000615a
2022-04-07T06:34:55.052+0000 7fe6f18dc700 10 mds.1.locker acquire_locks request(client.14520:12001 nref=3 cr=0x55c755025b80)

But it immediately stuck waiting for rdlock, because it's lock state in replica mds.1 is LOCK:

2022-04-07T06:34:55.052+0000 7fe6f18dc700 10 mds.1.locker requesting rdlock from auth on (ifile lock) on [inode 0x1000000615a [2,head] /client.0/tmp/fsstress/ltp-full-20091231/testcases/kernel/fs/fsstress/fsstress rep@0.1 v18 s=0 n(v0 1=1+0) (iauth lock) (ifile lock) (ixattr lock) (iversion lock) cr={14520=0-4194304@1} | request=1 lock=1 0x55c75521cc00]

The source code for the LOCK:

                       // stable     loner  rep state  r     rp   rd   wr   fwr  l    x    caps(any,loner,xlocker,replica)
     [LOCK_LOCK]      = { 0,         false, LOCK_LOCK, AUTH, 0,   REQ, AUTH,0,   0,   0,   CEPH_CAP_GCACHE|CEPH_CAP_GBUFFER,0,0,0 },

The rd is REQ and it needs to do the lock state change from auth mds.0:

2022-04-07T06:34:55.052+0000 7fe6f18dc700  7 mds.1.locker rdlock_start  on (ifile lock) on [inode 0x1000000615a [2,head] /client.0/tmp/fsstress/ltp-full-20091231/testcases/kernel/fs/fsstress/fsstress rep@0.1 v18 s=0 n(v0 1=1+0) (iauth lock) (ifile lock) (ixattr lock) (iversion lock) cr={14520=0-4194304@1} | request=1 lock=1 0x55c75521cc00]
2022-04-07T06:34:55.052+0000 7fe6f18dc700 10 mds.1.locker requesting rdlock from auth on (ifile lock) on [inode 0x1000000615a [2,head] /client.0/tmp/fsstress/ltp-full-20091231/testcases/kernel/fs/fsstress/fsstress rep@0.1 v18 s=0 n(v0 1=1+0) (iauth lock) (ifile lock) (ixattr lock) (iversion lock) cr={14520=0-4194304@1} | request=1 lock=1 0x55c75521cc00]
2022-04-07T06:34:55.052+0000 7fe6f18dc700  1 -- [v2:172.21.15.176:6824/2074966087,v1:172.21.15.176:6825/2074966087] send_to--> mds [v2:172.21.15.116:6826/4138533272,v1:172.21.15.116:6827/4138533272] -- lock(a=reqrdlock ifile 0x1000000615a.head) v1 -- ?+0 0x55c751f85340
2022-04-07T06:34:55.052+0000 7fe6f18dc700  1 -- [v2:172.21.15.176:6824/2074966087,v1:172.21.15.176:6825/2074966087] --> [v2:172.21.15.116:6826/4138533272,v1:172.21.15.116:6827/4138533272] -- lock(a=reqrdlock ifile 0x1000000615a.head) v1 -- 0x55c751f85340 con 0x55c750115000
2022-04-07T06:34:55.052+0000 7fe6f18dc700  7 mds.1.locker rdlock_start waiting on (ifile lock) on [inode 0x1000000615a [2,head] /client.0/tmp/fsstress/ltp-full-20091231/testcases/kernel/fs/fsstress/fsstress rep@0.1 v18 s=0 n(v0 1=1+0) (iauth lock) (ifile lock) (ixattr lock) (iversion lock) cr={14520=0-4194304@1} | request=1 lock=1 0x55c75521cc00]
2022-04-07T06:34:55.052+0000 7fe6f18dc700 10 mds.1.cache.ino(0x1000000615a) add_waiter tag 2000000040000000 0x55c7540bd680 !ambig 1 !frozen 1 !freezing 1
2022-04-07T06:34:55.052+0000 7fe6f18dc700 15 mds.1.cache.ino(0x1000000615a) taking waiter here

And in mds.0, since the mds.0 is the auth MDS and the client.14520 is the loner client, so the client was issued the pAsxLsXsxFsxcrwb caps:

2022-04-07T06:34:55.054+0000 7f73ae424700  1 -- [v2:172.21.15.116:6826/4138533272,v1:172.21.15.116:6827/4138533272] <== mds.1 v2:172.21.15.176:6824/2074966087 17387 ==== lock(a=reqrdlock ifile 0x1000000615a.head) v1 ==== 69+0+0 (unknown 0 0 0) 0x55b263062fc0 con 0x55b2619b7000
2022-04-07T06:34:55.054+0000 7f73ae424700  7 mds.0.locker handle_file_lock a=reqrdlock on (ifile excl) from mds.1 [inode 0x1000000615a [2,head] /client.0/tmp/fsstress/ltp-full-20091231/testcases/kernel/fs/fsstress/fsstress auth{1=1} v18 ap=1 DIRTYPARENT s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={14520=0-4194304@1} caps={14520=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@2},l=14520 | request=0 caps=1 dirtyparent=1 replicated=1 dirty=1 authpin=1 0x55b2658e6680]

And when trying to do the filelock state change (ifile excl->sync), it will need to do revoke Fxwb caps from client.14520:

2022-04-07T06:34:55.054+0000 7f73ae424700  7 mds.0.locker simple_sync on (ifile excl) on [inode 0x1000000615a [2,head] /client.0/tmp/fsstress/ltp-full-20091231/testcases/kernel/fs/fsstress/fsstress auth{1=1} v18 ap=1 DIRTYPARENT s=0 n(v0 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={14520=0-4194304@1} caps={14520=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@2},l=14520 | request=0 caps=1 dirtyparent=1 replicated=1 dirty=1 authpin=1 0x55b2658e6680]
2022-04-07T06:34:55.054+0000 7f73ae424700  7 mds.0.locker issue_caps loner client.14520 allowed=pAsxLsXsxFscr, xlocker allowed=pAsxLsXsxFscr, others allowed=pLs on [inode 0x1000000615a [2,head] /client.0/tmp/fsstress/ltp-full-20091231/testcases/kernel/fs/fsstress/fsstress auth{1=1} v18 ap=1 DIRTYPARENT s=0 n(v0 1=1+0) (iauth excl) (ifile excl->sync) (ixattr excl) (iversion lock) cr={14520=0-4194304@1} caps={14520=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@2},l=14520 | request=0 caps=1 dirtyparent=1 replicated=1 dirty=1 authpin=1 0x55b2658e6680]
2022-04-07T06:34:55.054+0000 7f73ae424700 10 mds.0.5 send_message_client_counted client.14520 seq 25328 client_caps(revoke ino 0x1000000615a 1 seq 3 caps=pAsxLsXsxFscr dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 0/4194304 ts 1/18446744073709551615 mtime 2022-04-07T06:34:53.392872+0000) v11
2022-04-07T06:34:55.054+0000 7f73ae424700  1 -- [v2:172.21.15.116:6826/4138533272,v1:172.21.15.116:6827/4138533272] --> v1:192.168.0.1:0/3995310513 -- client_caps(revoke ino 0x1000000615a 1 seq 3 caps=pAsxLsXsxFscr dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 0/4194304 ts 1/18446744073709551615 mtime 2022-04-07T06:34:53.392872+0000) v11 -- 0x55b26607c900 con 0x55b2619b9400
2022-04-07T06:34:55.054+0000 7f73ae424700 10 mds.0.cache.ino(0x1000000615a) auth_pin by 0x55b2658e6990 on [inode 0x1000000615a [2,head] /client.0/tmp/fsstress/ltp-full-20091231/testcases/kernel/fs/fsstress/fsstress auth{1=1} v18 ap=2 DIRTYPARENT s=0 n(v0 1=1+0) (iauth excl) (ifile excl->sync) (ixattr excl) (iversion lock) cr={14520=0-4194304@1} caps={14520=pAsxLsXsxFscr/pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@3},l=14520 | request=0 caps=1 dirtyparent=1 replicated=1 dirty=1 authpin=1 0x55b2658e6680] now 2

After 35 seconds later the cap update message came, but only the Fx caps is released:

2022-04-07T06:35:30.002+0000 7f73ae424700  1 -- [v2:172.21.15.116:6826/4138533272,v1:172.21.15.116:6827/4138533272] <== client.14520 v1:192.168.0.1:0/3995310513 24236 ==== client_caps(update ino 0x1000000615a 1 seq 3 tid 11103 caps=pAsxLsXsxFscrwb dirty=Fw wanted=pAsxXsxFsxcrwb follows 1 size 51656/0 mtime 2022-04-07T06:34:53.430871+0000 xattrs(v=18446744071603341568 l=0)) v10 ==== 236+0+0 (unknown 1066047011 0 0) 0x55b2643dac00 con 0x55b2619b9400
2022-04-07T06:35:30.002+0000 7f73ae424700  7 mds.0.locker handle_client_caps  on 0x1000000615a tid 11103 follows 1 op update flags 0x0

That means the kclient was still holding the Fwb caps until around 3 hours later:

2022-04-07T09:34:23.874+0000 7f73ae424700  1 -- [v2:172.21.15.116:6826/4138533272,v1:172.21.15.116:6827/4138533272] <== client.14520 v1:192.168.0.1:0/3995310513 40893 ==== client_caps(update ino 0x1000000615a 1 seq 3 caps=pAsLsXsFsc dirty=- wanted=- follows 0 size 51656/0 mtime 2022-04-07T06:34:53.430871+0000 xattrs(v=18446744071603341568 l=0)) v10 ==== 236+0+0 (unknown 2446509421 0 0) 0x55b26413b800 con 0x55b2619b9400

The kernel code seems buggy, when receiving the Fxwb revoke request it will queue a CEPH_I_WORK_WRITEBACK work for the inode, and then try to write back the dirty pages and clear the queue bit of CEPH_I_WORK_WRITEBACK.

But if after dirty pages were wrote back, if the ci->i_wrbuffer_ref is not 0 it won't release the Fwb caps. And after that the kernel won't trigger to do the writeback. That means the mds.0 may wait the kclient to release the Fwb caps for a very long time until someone triggers to do the dirty pages writeback.

#5 Updated by Xiubo Li 7 months ago

  • Copied to Bug #55377: kclient: mds revoke Fwb caps stuck after the kclient tries writebcak once added

#6 Updated by Xiubo Li 7 months ago

I have create a new tracker#55377 to fix the kernel issue in https://tracker.ceph.com/issues/55240#note-4.

And this tracker will be used to fix the issue in https://tracker.ceph.com/issues/55240#note-3 only.

#7 Updated by Xiubo Li 7 months ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 45955

#8 Updated by Xiubo Li 7 months ago

  • Subject changed from mds stuck in up:stopping to mds: stuck 2 seconds and keeps retrying to find ino from auth MDS

#9 Updated by Xiubo Li 7 months ago

NOTE: in this test the inline data is enabled:

2022-04-07T09:34:25.794+0000 7f73ae424700  1 mds.cephfs.smithi116.vbikdi Updating MDS map to version 37 from mon.0
2022-04-07T09:34:25.794+0000 7f73ae424700 10 mds.cephfs.smithi116.vbikdi      my compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}

#10 Updated by Xiubo Li 7 months ago

  • Related to Bug #55421: kclient: kernel BUG at fs/ceph/addr.c:125! when running the ffsb.sh added

#11 Updated by Xiubo Li 7 months ago

  • Related to deleted (Bug #55421: kclient: kernel BUG at fs/ceph/addr.c:125! when running the ffsb.sh)

#12 Updated by Venky Shankar 7 months ago

  • Status changed from Fix Under Review to Pending Backport

#13 Updated by Backport Bot 7 months ago

  • Copied to Backport #55658: quincy: mds: stuck 2 seconds and keeps retrying to find ino from auth MDS added

#14 Updated by Backport Bot 7 months ago

  • Copied to Backport #55659: pacific: mds: stuck 2 seconds and keeps retrying to find ino from auth MDS added

#15 Updated by Backport Bot 4 months ago

  • Tags set to backport_processed

#16 Updated by Konstantin Shalygin about 2 months ago

  • Status changed from Pending Backport to Resolved
  • Tags deleted (backport_processed)

#17 Updated by Konstantin Shalygin about 2 months ago

  • % Done changed from 0 to 100

Also available in: Atom PDF