Bug #19240

multimds on linode: troubling op throughput scaling from 8 to 16 MDS in kernel build test

Added by Patrick Donnelly about 7 years ago. Updated about 5 years ago.

Status:
Closed
Priority:
Normal
Category:
-
Target version:
-
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
multimds
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This is possibly not a bug but I thought I would put it here to solicit comments/assistance on what could be causing this disparity. Two graphs showing the client op throughput are on the mira I'm using to host results:

mira092.front.sepia.ceph.com:/mnt/pdonnell/vault/8x8192 20000C MDS 64x4096 Client.2/mds-throughput.png

mira092.front.sepia.ceph.com:/mnt/pdonnell/vault/16x8192 20000C MDS 64x4096 Client/mds-throughput.png

Other graphs are also in the containing directories.

For the 8 MDS case, we're seeing a maximum of ~30k aggregate client requests per second, versus a maximum of ~14k for 16 MDS. (Note that despite this apparent lack of increased scaling in op throughput, the 16 MDS test still finishes faster, so op throughput isn't telling the whole story on how successful the load distribution is.)

Some ideas for what's happening:

  • Perhaps, due to the load on the 8 MDS cluster, the clients are issuing more requests; for example, caps being revoked and needing to be reissued? (A sketch for checking this against the MDS perf counters follows below.)
  • The 16 MDS case is slower per op due to the increased work of getting locks.
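
A minimal sketch of how the first hypothesis could be checked against the MDS perf counters; the daemon names and the counter names (mds.request, mds.caps) are assumptions and may differ by release:

    #!/usr/bin/env python3
    # Hypothetical sketch: poll each MDS admin socket and print the request and
    # cap counters, to see whether the 8 MDS run serves more client requests
    # because caps are being revoked and re-acquired. The daemon names and the
    # counter names used here are assumptions.
    import json
    import subprocess
    import time

    MDS_IDS = ["a", "b", "c"]  # placeholder MDS daemon names

    def perf_dump(mds_id):
        out = subprocess.check_output(
            ["ceph", "daemon", "mds." + mds_id, "perf", "dump"])
        return json.loads(out)

    while True:
        for mds_id in MDS_IDS:
            mds = perf_dump(mds_id).get("mds", {})
            print("mds.%s request=%s caps=%s"
                  % (mds_id, mds.get("request"), mds.get("caps")))
        time.sleep(10)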

Related issues (1 closed)

Related to CephFS - Feature #19362: mds: add perf counters for each type of MDS operation (Resolved, Patrick Donnelly, 03/23/2017)

Actions #1

Updated by Patrick Donnelly about 7 years ago

  • Assignee set to Patrick Donnelly

I believe I have this one figured out. The client requests graph (mds-request.png) shows that the 8 MDS workflow is responding to ~250% more client requests than the 16 MDS workflow. (This was hard to notice because the y-axis ranges are not the same! :) This increase in work obviously indicates the 8 MDS workflow is doing something the 16 MDS workflow doesn't, so what is it?

It looks like the increased number of requests is due to inodes being removed from the cache, which is introducing churn. This is evident from the explosive growth in expired inodes by the 1-hour mark for 8 MDS (3 million expired inodes); for 16 MDS, 1.4 million inodes expire by the end of the workflow. Also, looking at the number of cached inodes loaded (mds-ino+), the 8 MDS workflow loads ~40% more inodes: ~7.8 million vs. ~5.5 million for 16 MDS.
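
As a rough sketch, the churn could also be sampled directly from the perf counters instead of read off the graphs; the mds_mem.ino+/ino- and mds.inodes_expired counter names are assumptions and may differ by release:

    #!/usr/bin/env python3
    # Hypothetical sketch: sample one MDS and report how many inodes were loaded
    # (ino+), trimmed (ino-), and expired since the previous sample. The section
    # and counter names are assumptions.
    import json
    import subprocess
    import time

    MDS_ID = "a"  # placeholder MDS daemon name

    def sample():
        out = subprocess.check_output(
            ["ceph", "daemon", "mds." + MDS_ID, "perf", "dump"])
        dump = json.loads(out)
        return {
            "ino+": dump.get("mds_mem", {}).get("ino+", 0),
            "ino-": dump.get("mds_mem", {}).get("ino-", 0),
            "expired": dump.get("mds", {}).get("inodes_expired", 0),
        }

    prev = sample()
    while True:
        time.sleep(60)
        cur = sample()
        print("loaded=%d trimmed=%d expired=%d"
              % (cur["ino+"] - prev["ino+"],
                 cur["ino-"] - prev["ino-"],
                 cur["expired"] - prev["expired"]))
        prev = cur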

Actions #2

Updated by Patrick Donnelly about 7 years ago

Note: inodes loaded is visible in mds-ino+.png in both workflow directories.

Actions #3

Updated by Patrick Donnelly about 7 years ago

  • Related to Feature #19362: mds: add perf counters for each type of MDS operation added
Actions #4

Updated by Patrick Donnelly almost 7 years ago

To close this we should confirm the hypothesis with the new op tracking from http://tracker.ceph.com/issues/19362

I'll do a run with Linode to check.
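
Once those per-op counters are available, something along these lines could break the request totals down by op type for each run; the mds_server section and the req_* counter naming are assumptions about how the counters will be exposed:

    #!/usr/bin/env python3
    # Hypothetical sketch: print a per-op request breakdown for one MDS using the
    # per-operation counters proposed in issue #19362. The section name
    # (mds_server) and the req_* counter naming are assumptions.
    import json
    import subprocess

    MDS_ID = "a"  # placeholder MDS daemon name

    out = subprocess.check_output(
        ["ceph", "daemon", "mds." + MDS_ID, "perf", "dump"])
    server = json.loads(out).get("mds_server", {})

    for name, val in sorted(server.items()):
        if name.startswith("req_") and isinstance(val, dict):
            # latency-style counters carry an avgcount (number of ops) and a sum
            print("%-30s count=%s" % (name, val.get("avgcount")))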

Actions #5

Updated by John Spray almost 7 years ago

  • Subject changed from mds: troubling op throughput scaling from 8 to 16 MDS in kernel build test to mds on linode: troubling op throughput scaling from 8 to 16 MDS in kernel build test
Actions #6

Updated by John Spray almost 7 years ago

  • Subject changed from mds on linode: troubling op throughput scaling from 8 to 16 MDS in kernel build test to multimds on linode: troubling op throughput scaling from 8 to 16 MDS in kernel build test
Actions #7

Updated by Patrick Donnelly over 5 years ago

  • Status changed from New to Closed

These results are likely obsolete. Closing.

Actions #8

Updated by Patrick Donnelly about 5 years ago

  • Category deleted (90)
  • Labels (FS) multimds added