Bug #20376: last_epoch_(over|under) in MDBalancer should be updated if mds0 has failed - CephFS - Ceph

Bug #20376

closed

last_epoch_(over|under) in MDBalancer should be updated if mds0 has failed

Added by Jianyu Li almost 7 years ago. Updated almost 7 years ago.

Status:

Resolved

Priority:

Normal

Assignee:

Patrick Donnelly

Severity:

3 - minor

Description

When mds0 fails and starts up again, it resets beat_epoch to zero. In this case, the other MDSes should update their last_epoch_(over|under) state; otherwise, the stale state blocks their balance attempts. The effect is worst when the previous last_epoch_under is very large, e.g. because the MDS cluster has been running for a long time: balancing is then delayed until the new beat_epoch catches up with the previous last_epoch_under:

  // am i over long enough?
  if (last_epoch_under && beat_epoch - last_epoch_under < 2) {
    dout(5) << " i am overloaded, but only for " << (beat_epoch - last_epoch_under) << " epochs" << dendl;
    return;
  }

Here is a snippet from the actual log that exposes this problem:
[ceph@c152 /var/log/ceph]$ grep 'i am overloaded, but only for' ceph-mds.c152.log-20170621
...
2017-06-20 22:08:03.964654 7f220598a700 5 mds.1.bal i am overloaded, but only for -79 epochs
2017-06-20 22:08:13.964962 7f220598a700 5 mds.1.bal i am overloaded, but only for -78 epochs
2017-06-20 22:08:23.965255 7f220598a700 5 mds.1.bal i am overloaded, but only for -77 epochs
...
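
To make the arithmetic behind these negative values concrete, here is a minimal, self-contained sketch. It is not the MDBalancer code or the eventual patch: `EpochState`, `overloaded_too_briefly`, `epochs_until_balancing` and `on_new_epoch` are hypothetical names, the fields are assumed to be signed ints (consistent with the negative values logged above), and the reset-on-backwards-epoch rule is only one possible way to discard the stale state.

```cpp
// Hypothetical, simplified model of the balancer's epoch bookkeeping.
struct EpochState {
    int beat_epoch = 0;        // current heartbeat epoch, driven by mds0
    int last_epoch_under = 0;  // last epoch this MDS was underloaded (0 = unset)
    int last_epoch_over = 0;   // last epoch this MDS was overloaded (0 = unset)
};

// Mirrors the check quoted in the description: true means the balance
// attempt is skipped because the MDS "has not been overloaded long enough".
bool overloaded_too_briefly(const EpochState& s) {
    return s.last_epoch_under && s.beat_epoch - s.last_epoch_under < 2;
}

// Counts how many further epochs must pass before the check stops deferring,
// assuming last_epoch_under is not touched in the meantime.
int epochs_until_balancing(EpochState s) {
    int waited = 0;
    while (overloaded_too_briefly(s)) {
        ++s.beat_epoch;
        ++waited;
    }
    return waited;
}

// Assumed mitigation (not the actual fix): an epoch that moves backwards
// is taken to mean mds0 restarted, so the stale markers are cleared.
void on_new_epoch(EpochState& s, int new_epoch) {
    if (new_epoch < s.beat_epoch) {
        s.last_epoch_under = 0;
        s.last_epoch_over = 0;
    }
    s.beat_epoch = new_epoch;
}
```

With a stale last_epoch_under of 80 and a restarted beat_epoch of 1, the difference is -79 and the check keeps deferring for another 81 epochs. At the 10-second spacing of the log lines above, that is roughly 13.5 minutes without balancing. If the same transition goes through `on_new_epoch`, the markers are cleared and the check no longer defers.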

Updated by Patrick Donnelly almost 7 years ago

Updated by Jianyu Li almost 7 years ago

Updated by Patrick Donnelly almost 7 years ago

Updated by Patrick Donnelly almost 7 years ago