Bug #50840
mds: CephFS kclient gets stuck when getattr() on a certain file
Status: Closed
% Done: 100%
Description
Copied from the mailing list:
I deployed a 3-node Ceph cluster (v15.2.9), and the CephFS is mounted via kclient (linux-4.14.24) on all 3 nodes. All of the kclients periodically update (read/write) a certain file in order to know whether the CephFS is alive or not. After a kclient gets evicted due to an abnormal reboot, a new kclient mounts the CephFS when the node comes back. However, the newly mounted kclient gets stuck when it tries to getattr on the common file. Under such conditions, all of the other kclients are affected and cannot update the common file either. From the debugfs entries, a request does get stuck:

------
[/sys/kernel/debug/ceph/1bbb7753-85e5-4d33-a860-84419fdcfd7d.client3230166] # cat mdsc
12	mds0	getattr	 #100000003ed
[/sys/kernel/debug/ceph/1bbb7753-85e5-4d33-a860-84419fdcfd7d.client3230166] # cat osdc
REQUESTS 0 homeless 0
LINGER REQUESTS
BACKOFFS
[/sys/kernel/debug/ceph/1bbb7753-85e5-4d33-a860-84419fdcfd7d.client3230166] # ceph -s
  cluster:
    id:     1bbb7753-85e5-4d33-a860-84419fdcfd7d
    health: HEALTH_WARN
            1 MDSs report slow requests
  services:
    mon: 3 daemons, quorum Jerry-ceph-n2,Jerry-x85-n1,Jerry-x85-n3 (age 23h)
    mgr: Jerry-x85-n1(active, since 25h), standbys: Jerry-ceph-n2, Jerry-x85-n3
    mds: cephfs:1 {0=qceph-mds-Jerry-ceph-n2=up:active} 1 up:standby-replay 1 up:standby
    osd: 18 osds: 18 up (since 23h), 18 in (since 23h)
------

The MDS logs (debug_mds = 20) are provided: https://drive.google.com/file/d/1aj101NOTzCsfDdC-neqVTvKpEPOd3M6Q/view?usp=sharing

Some of the logs regarding client.3230166 and ino 0x100000003ed are shown below:

2021-04-27T11:57:03.467+0800 7fccbd3be700 4 mds.0.server handle_client_request client_request(client.3230166:12 getattr pAsLsXsFs #0x100000003ed 2021-04-27T11:57:03.469426+0800 caller_uid=0, caller_gid=0{}) v2
2021-04-27T11:57:03.467+0800 7fccbd3be700 20 mds.0.98 get_session have 0x56130c5ce480 client.3230166 v1:192.168.92.89:0/679429733 state open
2021-04-27T11:57:03.467+0800 7fccbd3be700 15 mds.0.server oldest_client_tid=12
2021-04-27T11:57:03.467+0800 7fccbd3be700 7 mds.0.cache request_start request(client.3230166:12 nref=2 cr=0x56130db96480)
2021-04-27T11:57:03.467+0800 7fccbd3be700 7 mds.0.server dispatch_client_request client_request(client.3230166:12 getattr pAsLsXsFs #0x100000003ed 2021-04-27T11:57:03.469426+0800 caller_uid=0, caller_gid=0{}) v2
2021-04-27T11:57:03.467+0800 7fccbd3be700 10 mds.0.locker acquire_locks request(client.3230166:12 nref=3 cr=0x56130db96480)
2021-04-27T11:57:03.467+0800 7fccbd3be700 10 mds.0.cache.ino(0x100000003ed) auth_pin by 0x56130c584ed0 on [inode 0x100000003ed [2,head] /QTS/VOL_1/.ovirt_data_domain/37065419-e7f3-47ca-97df-8af0c67d30a0/dom_md/ids auth v91583 pv91585 ap=4 recovering s=1048576 n(v0 rc2021-04-27T11:57:02.625542+0800 b1048576 1=1+0) (ifile mix->sync w=1) (iversion lock) cr={3198501=0-4194304@1,3198504=0-4194304@1,3221169=0-4194304@1} caps={3198501=pAsLsXsFr/pAsLsXsFrw/pAsxXsxFsxcrwb@17,3198504=pAsLsXsFr/pAsLsXsFrw/pAsxXsxFsxcrwb@9,3230166=pAsLsXsFr/pAsLsXsFrw/pAsxXsxFsxcrwb@2} | ptrwaiter=1 request=1 lock=2 caps=1 dirty=1 waiter=0 authpin=1 0x56130c584a00] now 4
2021-04-27T11:57:03.467+0800 7fccbd3be700 15 mds.0.cache.dir(0x100000003eb) adjust_nested_auth_pins 1 on [dir 0x100000003eb /QTS/VOL_1/.ovirt_data_domain/37065419-e7f3-47ca-97df-8af0c67d30a0/dom_md/ [2,head] auth pv=91586 v=91584 cv=0/0 ap=1+4 state=1610874881|complete f(v0 m2021-04-23T15:00:04.377198+0800 6=6+0) n(v3 rc2021-04-27T11:57:02.625542+0800 b38005818 6=6+0) hs=6+0,ss=0+0 dirty=4 | child=1 dirty=1 waiter=0 authpin=1 0x56130c586a00] by 0x56130c584a00 count now 1/4
2021-04-27T11:57:03.467+0800 7fccbd3be700 10 mds.0 RecoveryQueue::prioritize not queued [inode 0x100000003ed [2,head] /QTS/VOL_1/.ovirt_data_domain/37065419-e7f3-47ca-97df-8af0c67d30a0/dom_md/ids auth v91583 pv91585 ap=4 recovering s=1048576 n(v0 rc2021-04-27T11:57:02.625542+0800 b1048576 1=1+0) (ifile mix->sync w=1) (iversion lock) cr={3198501=0-4194304@1,3198504=0-4194304@1,3221169=0-4194304@1} caps={3198501=pAsLsXsFr/pAsLsXsFrw/pAsxXsxFsxcrwb@17,3198504=pAsLsXsFr/pAsLsXsFrw/pAsxXsxFsxcrwb@9,3230166=pAsLsXsFr/pAsLsXsFrw/pAsxXsxFsxcrwb@2} | ptrwaiter=1 request=1 lock=2 caps=1 dirty=1 waiter=0 authpin=1 0x56130c584a00]
2021-04-27T11:57:03.467+0800 7fccbd3be700 7 mds.0.locker rdlock_start waiting on (ifile mix->sync w=1) on [inode 0x100000003ed [2,head] /QTS/VOL_1/.ovirt_data_domain/37065419-e7f3-47ca-97df-8af0c67d30a0/dom_md/ids auth v91583 pv91585 ap=4 recovering s=1048576 n(v0 rc2021-04-27T11:57:02.625542+0800 b1048576 1=1+0) (ifile mix->sync w=1) (iversion lock) cr={3198501=0-4194304@1,3198504=0-4194304@1,3221169=0-4194304@1} caps={3198501=pAsLsXsFr/pAsLsXsFrw/pAsxXsxFsxcrwb@17,3198504=pAsLsXsFr/pAsLsXsFrw/pAsxXsxFsxcrwb@9,3230166=pAsLsXsFr/pAsLsXsFrw/pAsxXsxFsxcrwb@2} | ptrwaiter=1 request=1 lock=2 caps=1 dirty=1 waiter=0 authpin=1 0x56130c584a00]
2021-04-27T11:57:03.467+0800 7fccbd3be700 10 mds.0.cache.ino(0x100000003ed) add_waiter tag 2000000040000000 0x56130ea1bbe0 !ambig 1 !frozen 1 !freezing 1
2021-04-27T11:57:03.467+0800 7fccbd3be700 15 mds.0.cache.ino(0x100000003ed) taking waiter here
2021-04-27T11:57:03.468+0800 7fccbd3be700 20 mds.0.locker client.3230166 pending pAsLsXsFr allowed pAsLsXsFrl wanted pAsxXsxFsxcrwb
2021-04-27T11:57:03.468+0800 7fccbd3be700 7 mds.0.locker handle_client_caps on 0x100000003ed tid 0 follows 0 op update flags 0x2
2021-04-27T11:57:03.468+0800 7fccbd3be700 20 mds.0.98 get_session have 0x56130b81f600 client.3198501 v1:192.168.50.108:0/2478094748 state open
2021-04-27T11:57:03.468+0800 7fccbd3be700 10 mds.0.locker head inode [inode 0x100000003ed [2,head] /QTS/VOL_1/.ovirt_data_domain/37065419-e7f3-47ca-97df-8af0c67d30a0/dom_md/ids auth v91583 pv91585 ap=4 recovering s=1048576 n(v0 rc2021-04-27T11:57:02.625542+0800 b1048576 1=1+0) (ifile mix->sync w=1) (iversion lock) cr={3198501=0-4194304@1,3198504=0-4194304@1,3221169=0-4194304@1} caps={3198501=pAsLsXsFr/pAsLsXsFrw/pAsxXsxFsxcrwb@17,3198504=pAsLsXsFr/pAsxXsxFsxcrwb@9,3230166=pAsLsXsFr/pAsxXsxFsxcrwb@2} | ptrwaiter=1 request=1 lock=2 caps=1 dirty=1 waiter=1 authpin=1 0x56130c584a00]
2021-04-27T11:57:03.468+0800 7fccbd3be700 10 mds.0.locker follows 0 retains pAsLsXsFr dirty - on [inode 0x100000003ed [2,head] /QTS/VOL_1/.ovirt_data_domain/37065419-e7f3-47ca-97df-8af0c67d30a0/dom_md/ids auth v91583 pv91585 ap=4 recovering s=1048576 n(v0 rc2021-04-27T11:57:02.625542+0800 b1048576 1=1+0) (ifile mix->sync w=1) (iversion lock) cr={3198501=0-4194304@1,3198504=0-4194304@1,3221169=0-4194304@1} caps={3198501=pAsLsXsFr/pAsxXsxFsxcrwb@17,3198504=pAsLsXsFr/pAsxXsxFsxcrwb@9,3230166=pAsLsXsFr/pAsxXsxFsxcrwb@2} | ptrwaiter=1 request=1 lock=2 caps=1 dirty=1 waiter=1 authpin=1 0x56130c584a00]
2021-04-27T11:57:37.027+0800 7fccbb3ba700 0 log_channel(cluster) log [WRN] : slow request 33.561029 seconds old, received at 2021-04-27T11:57:03.467164+0800: client_request(client.3230166:12 getattr pAsLsXsFs #0x100000003ed 2021-04-27T11:57:03.469426+0800 caller_uid=0, caller_gid=0{}) currently failed to rdlock, waiting
For more detail, see the thread https://www.spinics.net/lists/ceph-devel/msg51139.html.
Updated by Patrick Donnelly almost 3 years ago
- Target version set to v17.0.0
- Backport set to pacific,octopus
Updated by Xiubo Li almost 3 years ago
From the logs, we can see that inode 0x100000003ed had been trying to recover its size since at least 2 minutes earlier (the log file is not complete, so recovery probably started even earlier):
2021-04-27T11:57:01.632+0800 7fccbd3be700 10 mds.0.locker head inode [inode 0x100000003ed [2,head] /QTS/VOL_1/.ovirt_data_domain/37065419-e7f3-47ca-97df-8af0c67d30a0/dom_md/ids auth v91579 ap=1 recovering s=1048576 n(v0 rc2021-04-27T11:56:42.111053+0800 b1048576 1=1+0) (ifile mix) (iversion lock) cr={3198501=0-4194304@1,3198504=0-4194304@1,3221169=0-4194304@1} caps={3198501=pAsLsXsFrw/pAsxXsxFsxcrwb@16,3198504=pAsLsXsFrw/pAsxXsxFsxcrwb@8} | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 waiter=0 authpin=1 0x56130c584a00]
When the kclient tries to get the attributes of this inode, even after the file lock gather has finished, the recovery queue still has not been triggered, so the request keeps waiting on it:
2021-04-27T11:57:03.473+0800 7fccb6bb1700 7 mds.0.locker wrlock_finish on (ifile mix->sync w=1) on [inode 0x100000003ed [2,head] /QTS/VOL_1/.ovirt_data_domain/37065419-e7f3-47ca-97df-8af0c67d30a0/dom_md/ids auth v91585 ap=4 recovering s=1048576 n(v0 rc2021-04-27T11:57:02.625542+0800 b1048576 1=1+0) (ifile mix->sync w=1) (iversion lock) cr={3198501=0-4194304@1,3198504=0-4194304@1,3221169=0-4194304@1,3230166=0-4194304@1} caps={3198501=pAsLsXsFr/pAsxXsxFsxcrwb@17,3198504=pAsLsXsFr/pAsxXsxFsxcrwb@9,3230166=pAsLsXsFr/pAsxXsxFsxcrwb@2} | ptrwaiter=1 request=1 lock=2 caps=1 dirty=1 waiter=1 authpin=1 0x56130c584a00]
2021-04-27T11:57:03.473+0800 7fccb6bb1700 10 mds.0.locker eval_gather (ifile mix->sync) on [inode 0x100000003ed [2,head] /QTS/VOL_1/.ovirt_data_domain/37065419-e7f3-47ca-97df-8af0c67d30a0/dom_md/ids auth v91585 ap=4 recovering s=1048576 n(v0 rc2021-04-27T11:57:02.625542+0800 b1048576 1=1+0) (ifile mix->sync) (iversion lock) cr={3198501=0-4194304@1,3198504=0-4194304@1,3221169=0-4194304@1,3230166=0-4194304@1} caps={3198501=pAsLsXsFr/pAsxXsxFsxcrwb@17,3198504=pAsLsXsFr/pAsxXsxFsxcrwb@9,3230166=pAsLsXsFr/pAsxXsxFsxcrwb@2} | ptrwaiter=1 request=1 lock=1 caps=1 dirty=1 waiter=1 authpin=1 0x56130c584a00]
2021-04-27T11:57:03.473+0800 7fccb6bb1700 10 mds.0.locker next state is sync issued/allows loner /scrl xlocker /scrl other r/scrl
2021-04-27T11:57:03.473+0800 7fccb6bb1700 7 mds.0.locker eval_gather finished gather on (ifile mix->sync) on [inode 0x100000003ed [2,head] /QTS/VOL_1/.ovirt_data_domain/37065419-e7f3-47ca-97df-8af0c67d30a0/dom_md/ids auth v91585 ap=4 recovering s=1048576 n(v0 rc2021-04-27T11:57:02.625542+0800 b1048576 1=1+0) (ifile mix->sync) (iversion lock) cr={3198501=0-4194304@1,3198504=0-4194304@1,3221169=0-4194304@1,3230166=0-4194304@1} caps={3198501=pAsLsXsFr/pAsxXsxFsxcrwb@17,3198504=pAsLsXsFr/pAsxXsxFsxcrwb@9,3230166=pAsLsXsFr/pAsxXsxFsxcrwb@2} | ptrwaiter=1 request=1 lock=1 caps=1 dirty=1 waiter=1 authpin=1 0x56130c584a00]
2021-04-27T11:57:03.473+0800 7fccb6bb1700 7 mds.0.locker eval_gather finished gather, but still recovering
This prevents the finisher callbacks on the waiter list from being run, which blocks the getattr() reply to the kclient, so the kclient gets stuck.
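The blocking behaviour can be illustrated with a small state-machine sketch. This is plain Python and purely illustrative: the `Inode`, `add_waiter`, and `eval_gather` names here are simplified stand-ins for the MDS internals, not actual Ceph code.

```python
class Inode:
    """Minimal stand-in for an MDS inode with a lock waiter list."""

    def __init__(self):
        self.recovering = True   # file size/mtime recovery still pending
        self.gathering = True    # ifile lock in a mix->sync transition
        self.waiters = []        # deferred finisher callbacks (e.g. the getattr reply)

    def add_waiter(self, cb):
        self.waiters.append(cb)

    def eval_gather(self):
        """Called once the lock gather completes.

        Even though the gather has finished, waiters are NOT fired while
        the inode is still recovering -- this models the state the stuck
        getattr was left in ("finished gather, but still recovering").
        """
        self.gathering = False
        if self.recovering:
            return []            # waiters stay queued; no reply is sent
        fired, self.waiters = self.waiters, []
        return [cb() for cb in fired]


replies = []
ino = Inode()
ino.add_waiter(lambda: replies.append("getattr reply"))
ino.eval_gather()
print(replies)   # -> [] : the reply stays deferred until recovery finishes
```

If nothing ever flips `recovering` back to `False` and re-evaluates the lock (which is exactly what happened here, because the recovery queue was never kicked), the waiter list is never drained and the request hangs forever.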
After the inode is enqueued to the recovery queue, something should trigger the queue to start the recovery, either immediately or later; but after commit https://github.com/ceph/ceph/pull/37383/commits/83594aee2c8c8027c64e1c029aaaf4e7b177822e this no longer happens.
After applying Zheng's fix, it works well for me.
Updated by Xiubo Li almost 3 years ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 41431
Updated by Patrick Donnelly almost 3 years ago
- Status changed from Fix Under Review to Pending Backport
- Source set to Community (user)
Updated by Backport Bot almost 3 years ago
- Copied to Backport #51202: octopus: mds: CephFS kclient gets stuck when getattr() on a certain file added
Updated by Backport Bot almost 3 years ago
- Copied to Backport #51203: pacific: mds: CephFS kclient gets stuck when getattr() on a certain file added
Updated by Konstantin Shalygin over 1 year ago
- Status changed from Pending Backport to Resolved
- % Done changed from 0 to 100
- Tags deleted (backport_processed)