Bug #51722

mds: slow performance on parallel rm operations for multiple kclients

Added by Xiubo Li over 2 years ago. Updated over 2 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version:
% Done: 0%
Source: Development
Tags:
Backport: pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: fs
Component(FS): MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

There is another case that can cause the unlinkat call to be delayed for a long time. For example, removing the "removalC.test01921" directory was delayed for almost 2 seconds:

removalC.log:unlinkat(AT_FDCWD, "removalC.test01921", AT_REMOVEDIR) = 0 <1.722029>

From my previous tests, the delay could sometimes be up to 3 seconds or longer.

From the mds.a.log I can see that the "lookup" request finished very quickly, as expected:

2021-07-16 16:46:36.326 1554aac09700 1 -- v1:10.72.47.117:6817/578524851 <== client.4281 v1:10.72.47.117:0/346661127 29425 ==== client_request(client.4281:23536 lookup #0x1/removalC.test01921 2021-07-16 16:46:36.316309 caller_uid=0, caller_gid=0{0,}) v4 ==== 162+0+0 (unknown 4262505302 0 0) 0x5628efca4600 con 0x5628ef411200
...
2021-07-16 16:46:36.326 1554aac09700 10 mds.0.server reply to stat on client_request(client.4281:23536 lookup #0x1/removalC.test01921 2021-07-16 16:46:36.316309 caller_uid=0, caller_gid=0{0,}) v4
2021-07-16 16:46:36.326 1554aac09700 7 mds.0.server reply_client_request 0 ((0) Success) client_request(client.4281:23536 lookup #0x1/removalC.test01921 2021-07-16 16:46:36.316309 caller_uid=0, caller_gid=0{0,}) v4

And the "rmdir" request came in a very short time too:

2021-07-16 16:46:36.350 1554aac09700 1 -- v1:10.72.47.117:6817/578524851 <== client.4281 v1:10.72.47.117:0/346661127 29429 ==== client_request(client.4281:23539 rmdir #0x1/removalC.test01921 2021-07-16 16:46:36.343310 caller_uid=0, caller_gid=0{0,}) v4 ==== 268+0+0 (unknown 203429810 0 0) 0x5628f0fb1b00 con 0x5628ef411200
...
2021-07-16 16:46:36.382 1554aac09700 7 mds.0.server dispatch_client_request client_request(client.4281:23539 rmdir #0x1/removalC.test01921 2021-07-16 16:46:36.343310 caller_uid=0, caller_gid=0{0,}) v4

Then the MDS tried to revoke the "Ls" cap from the kclient, and the "rmdir" request was queued:

2021-07-16 16:46:36.384 1554aac09700 7 mds.0.locker simple_lock on (ilink sync) on [inode 0x10000001693 [...2,head] /removalC.test01921/ auth v11560 ap=2 dirtyparent f() n(v0 1=0+1) (isnap xlock x=1 by 0x5628f1c5d600) (inest lock) (ifile sync r=1) (iversion lock w=1 last_client=4281) caps={4281=pAsLsXsFs/pAsLsXsFsx@2},l=4281 | request=1 lock=3 dirfrag=1 caps=1 dirtyparent=1 dirty=1 authpin=1 0x5628f23b7100]
2021-07-16 16:46:36.384 1554aac09700 7 mds.0.locker issue_caps loner client.4281 allowed=pAsXsFscrl, xlocker allowed=pAsXsFscrl, others allowed=pAsXsFscrl on [inode 0x10000001693 [...2,head] /removalC.test01921/ auth v11560 ap=2 dirtyparent f() n(v0 1=0+1) (ilink sync->lock) (isnap xlock x=1 by 0x5628f1c5d600) (inest lock) (ifile sync r=1) (iversion lock w=1 last_client=4281) caps={4281=pAsLsXsFs/pAsLsXsFsx@2},l=4281 | request=1 lock=3 dirfrag=1 caps=1 dirtyparent=1 dirty=1 authpin=1 0x5628f23b7100]
2021-07-16 16:46:36.384 1554aac09700 20 mds.0.locker client.4281 pending pAsLsXsFs allowed pAsXsFscrl wanted pAsLsXsFsx
2021-07-16 16:46:36.384 1554aac09700 7 mds.0.locker sending MClientCaps to client.4281 seq 3 new pending pAsXsFs was pAsLsXsFs
2021-07-16 16:46:36.384 1554aac09700 20 mds.0.cache.ino(0x10000001693) encode_cap_message pfile 0 pauth 0 plink 0 pxattr 0 ctime 2021-07-16 16:42:52.406815
2021-07-16 16:46:36.384 1554aac09700 10 mds.0.5 send_message_client_counted client.4281 seq 17280 client_caps(revoke ino 0x10000001693 15513 seq 3 caps=pAsXsFs dirty=- wanted=pAsLsXsFsx follows 0 size 0/0 ts 1/18446744073709551615 mtime 2021-07-16 16:42:52.406815) v11
2021-07-16 16:46:36.384 1554aac09700 1 -- v1:10.72.47.117:6817/578524851 --> v1:10.72.47.117:0/346661127 -- client_caps(revoke ino 0x10000001693 15513 seq 3 caps=pAsXsFs dirty=- wanted=pAsLsXsFsx follows 0 size 0/0 ts 1/18446744073709551615 mtime 2021-07-16 16:42:52.406815) v11 -- 0x5628f1828000 con 0x5628ef411200
2021-07-16 16:46:36.384 1554aac09700 10 mds.0.cache.ino(0x10000001693) auth_pin by 0x5628f23b7688 on [inode 0x10000001693 [...2,head] /removalC.test01921/ auth v11560 ap=3 dirtyparent f() n(v0 1=0+1) (ilink sync->lock) (isnap xlock x=1 by 0x5628f1c5d600) (inest lock) (ifile sync r=1) (iversion lock w=1 last_client=4281) caps={4281=pAsXsFs/pAsLsXsFs/pAsLsXsFsx@3},l=4281 | request=1 lock=3 dirfrag=1 caps=1 dirtyparent=1 dirty=1 authpin=1 0x5628f23b7100] now 3
2021-07-16 16:46:36.384 1554aac09700 15 mds.0.cache.dir(0x1) adjust_nested_auth_pins 1 on [dir 0x1 / [2,head] auth pv=23519 v=23271 cv=1/1 REP dir_auth=0 ap=124+625 state=1610874881|complete f(v0 m2021-07-16 16:46:30.954106 365=0+365) n(v0 rc2021-07-16 16:46:30.954106 365=0+365)/n() hs=365+5635,ss=0+0 dirty=6000 | child=1 subtree=1 dirty=1 authpin=1 0x5628ef33ef00] by 0x5628f23b7100 count now 124/625
2021-07-16 16:46:36.384 1554aac09700 10 mds.0.cache.ino(0x10000001693) add_waiter tag 2000000000600000 0x5628ef6159a0 !ambig 1 !frozen 1 !freezing 1
2021-07-16 16:46:36.384 1554aac09700 15 mds.0.cache.ino(0x10000001693) taking waiter here

And the "Ls" cap revoke request also replied from kclient in a very short time:

2021-07-16 16:46:36.385 1554aac09700 1 -- v1:10.72.47.117:6817/578524851 <== client.4281 v1:10.72.47.117:0/346661127 29430 ==== client_caps(update ino 0x10000001693 15513 seq 3 caps=pAsXsFs dirty=- wanted=pAsLsXsFsx follows 0 size 0/0 mtime 2021-07-16 16:42:52.406815 xattrs(v=18446744072645829216 l=0)) v10 ==== 236+0+0 (unknown 3915617700 0 0) 0x5628f1828000 con 0x5628ef411200

But the "handle_client_caps()" was delayed to trigger until 1.7 seconds later:

2021-07-16 16:46:38.042 1554aac09700 7 mds.0.locker handle_client_caps on 0x10000001693 tid 0 follows 0 op update flags 0x0

I added some debug logs and could see that the request was stuck in "MDSDaemon::ms_dispatch2()", waiting for the "mds_lock":

The code:

bool MDSDaemon::ms_dispatch2(const Message::ref &m)
{
+  dout(1) << __func__ << " lxb ---- before mds_lock --> " << m << dendl;
  std::lock_guard l(mds_lock);
+  dout(1) << __func__ << " lxb ---- after mds_lock --> " << m << dendl;
  if (stopping) {
    return false;
  }

The debug logs:

2021-07-16 16:46:36.385 1554aac09700 1 mds.a ms_dispatch2 lxb ---- before mds_lock --> 0x5628f1828000
...
2021-07-16 16:46:38.042 1554aac09700 1 mds.a ms_dispatch2 lxb ---- after mds_lock --> 0x5628f1828000

Checking the logs, during this time the "mds_lock" was continuously and successfully acquired by tens or hundreds of "18C_PurgeStrayLogged" finisher callbacks, which were called as mdlog flushes completed, while the "MDSDaemon::ms_dispatch2()" thread was starved and never got a chance to acquire the "mds_lock".

Mutex implementations (e.g. std::mutex in C++) do not guarantee fairness: they do not guarantee that the lock will be acquired by threads in the order that they called lock().
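
To illustrate the pattern (this is a standalone sketch, not Ceph code, and the thread names and timings are made up for the demonstration): a "finisher"-like thread that re-locks a std::mutex in a tight loop can keep winning the lock even though a "dispatch"-like thread has been waiting on it, because nothing forces the waiter to be served first.

#include <atomic>
#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>

int main() {
  std::mutex lock;                 // stands in for mds_lock
  std::atomic<bool> stop{false};

  // "Finisher" thread: grabs and releases the lock in a tight loop,
  // like the C_PurgeStrayLogged callbacks that keep re-acquiring mds_lock.
  std::thread finisher([&] {
    while (!stop) {
      std::lock_guard<std::mutex> l(lock);
      std::this_thread::sleep_for(std::chrono::microseconds(100)); // short critical section
    }
  });

  std::this_thread::sleep_for(std::chrono::milliseconds(10));

  // "Dispatch" thread (here: main): a single acquisition, like ms_dispatch2().
  auto t0 = std::chrono::steady_clock::now();
  {
    std::lock_guard<std::mutex> l(lock);
  }
  auto t1 = std::chrono::steady_clock::now();
  std::cout << "dispatch waited "
            << std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count()
            << " us\n";

  stop = true;
  finisher.join();
}

Depending on the platform's mutex implementation, the measured wait can span many iterations of the looping thread rather than a single critical section, which matches the ~1.7 second gap between the "before" and "after" debug lines above.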


Related issues

Copied to CephFS - Backport #52441: pacific: mds: slow performance on parallel rm operations for multiple kclients Resolved

History

#1 Updated by Xiubo Li over 2 years ago

  • Status changed from New to Fix Under Review
  • Priority changed from Normal to High
  • Pull request ID set to 42394

#2 Updated by Xiubo Li over 2 years ago

  • Pull request ID changed from 42394 to 42620

#3 Updated by Xiubo Li over 2 years ago

The new PR will switch the mds_lock to a fair mutex.
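
For reference, here is a minimal sketch of a FIFO ("ticket") mutex in C++, which serves threads strictly in the order they called lock(). This only illustrates the idea; the actual fair mutex implementation in the PR may differ.

#include <condition_variable>
#include <cstdint>
#include <mutex>

class fair_mutex {
public:
  void lock() {
    std::unique_lock<std::mutex> l(mtx_);
    const uint64_t my_ticket = next_ticket_++;              // take a place in line
    cv_.wait(l, [&] { return my_ticket == now_serving_; }); // wait for our turn
  }

  void unlock() {
    std::lock_guard<std::mutex> l(mtx_);
    ++now_serving_;                                          // hand over to the next waiter
    cv_.notify_all();
  }

private:
  std::mutex mtx_;
  std::condition_variable cv_;
  uint64_t next_ticket_ = 0;   // next ticket to hand out
  uint64_t now_serving_ = 0;   // ticket currently allowed to hold the lock
};

Because this type satisfies BasicLockable, existing std::lock_guard call sites such as "std::lock_guard l(mds_lock);" keep working after swapping the mutex type.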

#4 Updated by Patrick Donnelly over 2 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Source set to Development
  • Backport set to pacific

#5 Updated by Backport Bot over 2 years ago

  • Copied to Backport #52441: pacific: mds: slow performance on parallel rm operations for multiple kclients added

#6 Updated by Loïc Dachary over 2 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
