Bug #24440
common/DecayCounter: set last_decay to current time when decoding decay counter
Status:
Resolved
Priority:
Normal
Assignee:
Category:
Correctness/Safety
Target version:
% Done:
0%
Source:
Community (dev)
Tags:
Backport:
mimic,luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Recently we found that the load reported by one MDS can show up as zero on another MDS in a multi-MDS setup. The Ceph version is Luminous.
In the log below, MDS.1 computed its load, which it would then send to MDS.0.
2018-05-13 17:20:31.252804 7ffa471d7700 0 mds.1.bal mds.1 epoch 17 load mdsload<[0,6245.74 12491.5]/[0,33581.5 67163], req 13440, hr 0, qlen 18, cpu 0.04>
When MDS.0 handled the heartbeat and read MDS.1's load from the message, the load had become zero.
2018-05-13 17:20:30.988828 7f65d8b45700 0 mds.0.bal mds.0 epoch 17 load mdsload<[5580.31,18907.2 43394.8]/[33213.8,109221 251656], req 42543, hr 0, qlen 36, cpu 0.75>
2018-05-13 17:20:31.280096 7f65db34a700 0 mds.0.bal mds.0 mdsload<[5580.31,18907.2 43394.8]/[33213.8,109221 251656], req 42543, hr 0, qlen 36, cpu 0.75> = 127950 ~ 43394.8
2018-05-13 17:20:31.280113 7f65db34a700 0 mds.0.bal mds.1 mdsload<[0,0 0]/[0,0 0], req 13440, hr 0, qlen 18, cpu 0.04> = 37045.8 ~ 12564.2
We found that last_decay in the decoded message is 0 (utime_t()), so the elapsed time computed on the receiving side is very large and the original value is decayed to 0.
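The effect can be reproduced with a small standalone sketch. This is not Ceph's DecayCounter code: the struct, the two decode helpers, the exponential half-life formula, and the 5 second half-life used in the usage note are all assumptions made for illustration. It only shows why a last_decay of 0 wipes out the value on the first read, and why stamping last_decay with the receiver's current time (as the title proposes) preserves it.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical, simplified stand-in for a decay counter (NOT Ceph's class).
struct SketchDecayCounter {
    double value = 0.0;       // current (decayed) value
    double last_decay = 0.0;  // seconds since epoch; 0 mimics a default utime_t()
    double half_life;         // seconds (assumed exponential half-life decay)

    explicit SketchDecayCounter(double half_life_s) : half_life(half_life_s) {
        assert(half_life_s > 0.0);
    }

    // Decay the value for the time elapsed since last_decay, then return it.
    double get(double now) {
        const double elapsed = now - last_decay;
        if (elapsed > 0.0) {
            value *= std::exp(-std::log(2.0) * elapsed / half_life);
            last_decay = now;
        }
        return value;
    }
};

// Decode as described in the report: the value arrives, last_decay stays 0.
inline SketchDecayCounter decode_without_timestamp(double wire_value,
                                                   double half_life_s) {
    SketchDecayCounter c(half_life_s);
    c.value = wire_value;
    return c;
}

// Decode as the title proposes: set last_decay to the receiver's current time.
inline SketchDecayCounter decode_with_current_time(double wire_value,
                                                   double half_life_s,
                                                   double now) {
    SketchDecayCounter c = decode_without_timestamp(wire_value, half_life_s);
    c.last_decay = now;
    return c;
}
```

With a wall-clock time of roughly 1.5e9 seconds since the epoch and an assumed half-life of 5 seconds, decode_without_timestamp(6245.74, 5.0).get(now) applies a factor of about exp(-2e8), which underflows to 0, matching the [0,0 0] load that MDS.0 logged for MDS.1. The counter built by decode_with_current_time sees an elapsed time of 0 on the same read and keeps 6245.74.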
Related issues
History
#1 Updated by Zhi Zhang almost 6 years ago
#2 Updated by Patrick Donnelly almost 6 years ago
- Status changed from New to Pending Backport
- Assignee set to Zhi Zhang
- Target version set to v14.0.0
- Backport set to mimic,luminous
#3 Updated by Nathan Cutler almost 6 years ago
- Copied to Backport #24537: mimic: common/DecayCounter: set last_decay to current time when decoding decay counter added
#4 Updated by Nathan Cutler almost 6 years ago
- Copied to Backport #24538: luminous: common/DecayCounter: set last_decay to current time when decoding decay counter added
#5 Updated by Nathan Cutler over 5 years ago
- Status changed from Pending Backport to Resolved