Bug #21745

closed

mds: MDBalancer using total (all time) request count in load statistics

Added by John Spray over 6 years ago. Updated almost 6 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version:
% Done: 0%
Source: Community (dev)
Tags: balancer
Backport: luminous
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): multimds
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This was pointed out by Xiaoxi Chen

The get_req_rate() function returns the value of l_mds_request, which is a cumulative (all-time) counter rather than a rate.

This is then used in the load calculation in MDBalancer, resulting in crazy high values like this:

2017-10-09 10:05:09.128325 7fc899748700  0 mds.0.bal   mds.1 mdsload<[0,0 0]/[0,0 0], req 3.11991e+07, hr 0, qlen 0, cpu 0.12> = 3.11991e+07 ~ 15711.7
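
To illustrate the difference, here is a minimal Python sketch (not Ceph code; `rate_from_counter` and its parameters are hypothetical names chosen for this example). It derives a per-second rate from two samples of a cumulative counter. An MDS that has served about 3.1e+07 requests over its lifetime but only 100 in the last 10 seconds should report a rate near 10, not 3.1e+07.

```python
def rate_from_counter(prev_count, prev_time, count, time):
    """Return requests per second between two samples of a cumulative counter.

    All names here are illustrative; this is not the MDS implementation.
    """
    elapsed = time - prev_time
    if elapsed <= 0:
        # No time has passed (or the clock went backwards): no meaningful rate.
        return 0.0
    return (count - prev_count) / elapsed


# The counter itself keeps growing with uptime, regardless of current load.
lifetime_total = 31_199_100

# The rate only looks at what changed during the sampling interval.
recent_rate = rate_from_counter(31_199_000, 0.0, lifetime_total, 10.0)
print(recent_rate)  # 10.0
```

Feeding `lifetime_total` directly into a load formula, as the log line above suggests is happening, makes a long-running but idle MDS look far busier than a freshly started one under real load.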


Related issues 1 (0 open, 1 closed)

Copied to CephFS - Backport #23671: luminous: mds: MDBalancer using total (all time) request count in load statistics (Resolved, Zheng Yan)
Actions #1

Updated by Xiaoxi Chen over 6 years ago

Although it is simple to add last_timestamp and last_reqcount so that we can get an average TPS, TPS may fluctuate a lot, which may result in dirfrags ping-ponging between multiple MDSs.

We probably need a longer (configurable?) averaging window for highly fluctuating values like q_len and req_rate.
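
One way to picture such a longer, configurable average is an exponentially decaying mean with a tunable half-life. The Python sketch below is only an illustration under that assumption; `SmoothedRate` and `half_life_s` are hypothetical names, and this is not necessarily what the fix referenced in this issue does.

```python
class SmoothedRate:
    """Exponentially decaying average of a noisy rate (illustrative only)."""

    def __init__(self, half_life_s):
        # Time after which an old sample's weight has dropped to one half.
        self.half_life_s = half_life_s
        self.value = None

    def update(self, sample, dt):
        """Fold in a new rate sample taken dt seconds after the previous one."""
        if self.value is None:
            # The first sample seeds the average.
            self.value = sample
        else:
            keep = 0.5 ** (dt / self.half_life_s)
            self.value = keep * self.value + (1.0 - keep) * sample
        return self.value


smoothed = SmoothedRate(half_life_s=10.0)
smoothed.update(100.0, dt=1.0)       # seeds the average at 100.0
print(smoothed.update(0.0, dt=10.0)) # one half-life later: 50.0
```

A longer half-life damps short spikes in req_rate or q_len, which should make balancing decisions less likely to flip back and forth, at the cost of reacting more slowly to real load changes.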

Actions #2

Updated by Patrick Donnelly about 6 years ago

  • Subject changed from MDBalancer using total (all time) request count in load statistics to mds: MDBalancer using total (all time) request count in load statistics
  • Category deleted (90)
  • Assignee set to Zheng Yan
  • Target version set to v13.0.0
  • Source set to Community (dev)
  • Tags set to multimds,balancer
  • Severity changed from 3 - minor to 2 - major
  • Affected Versions v12.2.5 added
  • Component(FS) MDS added

https://github.com/ceph/ceph/pull/19220/commits/fb8d07772ffd3b061d2752c6b3375f6cb187be4b

Zheng, please amend the above commit message to state that it fixes this issue.

Actions #4

Updated by Patrick Donnelly about 6 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Tags changed from multimds,balancer to balancer
  • Labels (FS) multimds added
Actions #5

Updated by Nathan Cutler about 6 years ago

  • Copied to Backport #23671: luminous: mds: MDBalancer using total (all time) request count in load statistics added
Actions #6

Updated by Nathan Cutler almost 6 years ago

  • Status changed from Pending Backport to Resolved