Bug #58008
mds/PurgeQueue: don't consider filer_max_purge_ops when _calculate_ops
Status:
Resolved
Priority:
Normal
Assignee:
Category:
Correctness/Safety
Target version:
% Done:
0%
Source:
Tags:
backport_processed
Backport:
pacific,quincy
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
_calculate_ops relies on a config option that can be modified at runtime, which causes a bug. For example:
- A file has 20 objects and filer_max_purge_ops is set to 10.
- PurgeQueue::_execute_item is called and _calculate_ops returns 10, so ops_in_flight is increased by 10.
- filer_max_purge_ops is changed to 20 at runtime.
- PurgeQueue::_execute_item_complete is called and _calculate_ops now returns 20, so ops_in_flight is decreased by 20.
- Since ops_in_flight is a uint64, the subtraction wraps around below zero. ops_in_flight becomes far greater than max_purge_ops and cannot return to a reasonable value.
filer_max_purge_ops is still honored in _do_purge_range, so it is safe to ignore it here.
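The sequence above can be sketched in isolation. This is a minimal illustration, not the Ceph implementation: PurgeAccounting, calculate_ops_capped, calculate_ops_uncapped and the two replay helpers are hypothetical names, and the real _calculate_ops may account for more than the object count.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the purge queue accounting (not Ceph code).
struct PurgeAccounting {
  uint64_t filer_max_purge_ops = 0;  // config value, adjustable at runtime
  uint64_t ops_in_flight = 0;        // unsigned counter, as in the report

  // Capped variant: the result depends on the current config value,
  // so two calls for the same item can disagree.
  uint64_t calculate_ops_capped(uint64_t num_objects) const {
    return std::min(num_objects, filer_max_purge_ops);
  }

  // Uncapped variant: the same item always yields the same result.
  uint64_t calculate_ops_uncapped(uint64_t num_objects) const {
    return num_objects;
  }
};

// Replays the steps from the description using the capped calculation.
inline uint64_t replay_capped() {
  PurgeAccounting pq;
  pq.filer_max_purge_ops = 10;
  pq.ops_in_flight += pq.calculate_ops_capped(20);  // execute: +10
  pq.filer_max_purge_ops = 20;                      // changed on the fly
  pq.ops_in_flight -= pq.calculate_ops_capped(20);  // complete: -20, wraps
  return pq.ops_in_flight;
}

// Same steps, ignoring the config when counting ops.
inline uint64_t replay_uncapped() {
  PurgeAccounting pq;
  pq.filer_max_purge_ops = 10;
  pq.ops_in_flight += pq.calculate_ops_uncapped(20);  // execute: +20
  pq.filer_max_purge_ops = 20;                        // no effect on the count
  pq.ops_in_flight -= pq.calculate_ops_uncapped(20);  // complete: -20
  return pq.ops_in_flight;
}
```

With the capped calculation the counter ends at 2^64 - 10. With the uncapped one it returns to 0, because the increment and the decrement are computed from the same input.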
Related issues
History
#1 Updated by yixing hao about 1 year ago
When filer_max_purge_ops is increased on a Pacific MDS, pq_executing_ops and pq_executing_ops_high_water of purge_queue become abnormal immediately. I think this also applies to the main branch.
ceph daemon mds.x perf dump | jq .'purge_queue'
{
"pq_executing_ops": 18446744073709552000,
"pq_executing_ops_high_water": 18446744073709552000,
"pq_executing": 0,
"pq_executing_high_water": 512,
"pq_executed": 687769701,
"pq_item_in_journal": 0
}
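The value 18446744073709552000 is not itself a valid uint64. It looks consistent with a wrapped counter just below 2^64 being read as an IEEE 754 double by the JSON tooling, which is an assumption about how this output was produced. A minimal sketch of that rounding, where as_double_text is a hypothetical helper and the input 2^64 - 10 is taken from the example in the description rather than from this dump:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a uint64_t the way a double-based consumer would see it:
// convert to double, then print with 17 significant digits.
inline std::string as_double_text(uint64_t v) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%.17g", static_cast<double>(v));
  return std::string(buf);
}

// Doubles near 2^64 are spaced 2048 apart, so 2^64 - 10 rounds to 2^64.
inline bool rounds_to_two_pow_64(uint64_t v) {
  return static_cast<double>(v) == 18446744073709551616.0;
}
```

Under these assumptions, as_double_text(UINT64_MAX - 9) yields 1.8446744073709552e+19, which matches the significant digits shown in the dump.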
#2 Updated by Venky Shankar about 1 year ago
- Category set to Correctness/Safety
- Status changed from New to Fix Under Review
- Assignee set to yixing hao
- Target version set to v18.0.0
- Backport set to pacific,quincy
#3 Updated by Venky Shankar 12 months ago
- Status changed from Fix Under Review to Pending Backport
#4 Updated by Backport Bot 12 months ago
- Copied to Backport #58253: quincy: mds/PurgeQueue: don't consider filer_max_purge_ops when _calculate_ops added
#5 Updated by Backport Bot 12 months ago
- Copied to Backport #58254: pacific: mds/PurgeQueue: don't consider filer_max_purge_ops when _calculate_ops added
#6 Updated by Backport Bot 12 months ago
- Tags set to backport_processed
#7 Updated by Dhairya Parmar 9 months ago
- Status changed from Pending Backport to Resolved