Bug #20129

closed

Client syncfs is slow (waits for next MDS tick)

Added by dongdong tao almost 7 years ago. Updated over 6 years ago.

Status: Resolved
Priority: High
Assignee:
Category: Performance/Resource Usage
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

In the function Client::unmount there is the following code:
-------------
while (!mds_requests.empty()) {
  ldout(cct, 10) << "waiting on " << mds_requests.size() << " requests" << dendl;
  mount_cond.Wait(client_lock);
}
-------------
In my situation there is one "rename" request left, so unmount waits for that rename request to finish, and that takes almost 5 seconds. Let me first describe what we are trying to do:
we are testing the performance of CephFS with Hadoop. The Hadoop job calls ceph mount and unmount many times, so too much time is spent in unmount,
because almost every unmount takes 5 seconds waiting for the last write operation (rename, create, mkdir, ...).

I have done a lot of analysis here:
1. The MDS replies twice for a write operation:
-> The first is the early reply, which is the unsafe reply for the client. It is sent when the in-memory state is ready and the MDS is about to submit the journal event. When the client receives the unsafe reply, it goes ahead and starts the next request without waiting for the journal to be flushed.
-> The second is the safe reply for the client. It is sent when the journal has been flushed. My issue is that Client::unmount waits almost 5 seconds for this reply.
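
To make the two replies concrete, here is a minimal toy model of the client-side bookkeeping. It is not Ceph code: `ToyClient`, `ReplyKind`, and all method names are hypothetical and only mirror the behaviour described above (the unsafe reply unblocks the caller, but only the safe reply removes the request from the set that unmount waits on).

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Toy model, NOT the real Ceph client. All names here are hypothetical.
enum class ReplyKind { Unsafe, Safe };

class ToyClient {
public:
    // Register a new write request (rename, create, mkdir, ...).
    void submit(std::uint64_t tid) { pending_[tid] = false; }

    // Handle a reply from the MDS for request `tid`.
    void handle_reply(std::uint64_t tid, ReplyKind kind) {
        auto it = pending_.find(tid);
        if (it == pending_.end()) return;  // not tracked (already finished)
        if (kind == ReplyKind::Unsafe) {
            it->second = true;             // caller may continue; request stays tracked
        } else {
            pending_.erase(it);            // journal flushed; request is complete
        }
    }

    // True once the caller of `tid` may issue its next request.
    // A request that is no longer tracked counts as finished.
    bool caller_unblocked(std::uint64_t tid) const {
        auto it = pending_.find(tid);
        return it == pending_.end() || it->second;
    }

    // Mirrors the `while (!mds_requests.empty())` condition in unmount.
    bool can_unmount() const { return pending_.empty(); }

    std::size_t pending() const { return pending_.size(); }

private:
    std::map<std::uint64_t, bool> pending_;  // tid -> unsafe reply received?
};
```

In this model, after `submit(1)` and `handle_reply(1, ReplyKind::Unsafe)` the caller is unblocked but `can_unmount()` is still false; it only becomes true after `handle_reply(1, ReplyKind::Safe)`, which is the window in which my unmount is stuck.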

Through some debugging, I found that the MDS calls mdlog->flush() to flush the journal.
I also found that if a write operation is the session's last operation, the MDS only flushes the journal for that write operation when it reaches the next
MDS tick, and the tick interval is 5 seconds. That explains why my unmount always takes seconds to finish. I tried setting the MDS tick interval to
2 seconds, and it worked. However, we cannot just set the MDS tick interval to 2 to solve our issue.
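
As a rough sketch of why the tick interval bounds the wait, the delay can be modeled with simple arithmetic. This assumes ticks fire at fixed multiples of the interval and that nothing else flushes the journal in between; the function name `wait_for_tick_flush_ms` is hypothetical and not part of Ceph.

```cpp
#include <cstdint>

// Toy arithmetic, NOT Ceph code. Returns how long a journal entry submitted at
// `submit_ms` waits until the next periodic tick flushes it.
// Assumptions: ticks fire at tick_ms, 2*tick_ms, ...; tick_ms > 0;
// an entry submitted exactly on a tick boundary misses that tick.
std::uint64_t wait_for_tick_flush_ms(std::uint64_t submit_ms, std::uint64_t tick_ms) {
    return tick_ms - (submit_ms % tick_ms);
}
```

Under these assumptions, a last request submitted 100 ms after a tick waits 4900 ms with a 5-second interval and 1900 ms with a 2-second interval. Lowering the interval only shrinks the upper bound of the wait; it does not remove the wait.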

So my question is: how can we solve the issue completely by modifying the source code? Would it be safe to add mdlog->flush() at the end of handle_client_reply, handle_client_mkdir, and so on? Could you please give a suggestion here first? I believe this is a bug.

Regards,
Dongdong
