Bug #45076
closed
rados: Sharded OpWQ drops suicide_grace after waiting for work
% Done:
0%
Source:
Tags:
Backport:
mimic, nautilus, octopus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
The Sharded OpWQ will opportunistically wait for more work when processing an empty queue. While waiting, the default work queue heartbeat grace and suicide_grace values are modified [0]. The `threadpool_default_timeout` grace is left applied and suicide_grace is disabled.
The original work queue defaults should be re-applied once work is found, but they are not. As a result, an operation that hangs after such a wait does not trigger an OSD suicide recovery.
[0] https://github.com/ceph/ceph/blob/38ae96e1c9a4f8ad3095626c71951a122bdc8fe7/src/osd/OSD.cc#L10451
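The sequence described above can be modeled with a minimal, self-contained sketch. This is not the Ceph code: `HeartbeatHandle`, `WorkQueueTimeouts`, `reset_timeout`, and `process_after_wait` are hypothetical names chosen for illustration, and any timeout values passed in are assumptions. It only shows how the worker ends up with suicide_grace disabled unless the work queue defaults are re-applied after the wait.

```cpp
#include <chrono>

using Seconds = std::chrono::seconds;

// Hypothetical stand-in for a worker thread's heartbeat state.
// A suicide_grace of zero means the suicide timeout is disabled.
struct HeartbeatHandle {
  Seconds grace{0};
  Seconds suicide_grace{0};
};

// Hypothetical holder for the work queue's default timeouts.
struct WorkQueueTimeouts {
  Seconds grace;
  Seconds suicide_grace;
};

// Apply a grace / suicide_grace pair to the heartbeat handle.
void reset_timeout(HeartbeatHandle& hb, Seconds grace, Seconds suicide_grace) {
  hb.grace = grace;
  hb.suicide_grace = suicide_grace;
}

// Models one pass of a worker that finds its queue empty, waits, and then
// receives work. Returns the heartbeat state in effect while the newly
// found work is being processed.
HeartbeatHandle process_after_wait(const WorkQueueTimeouts& wq,
                                   Seconds threadpool_default_timeout,
                                   bool reapply_defaults) {
  HeartbeatHandle hb;
  reset_timeout(hb, wq.grace, wq.suicide_grace);  // normal defaults

  // Queue is empty: relax the timeouts while waiting for more work.
  reset_timeout(hb, threadpool_default_timeout, Seconds{0});

  // ... the wait happens here, and work arrives ...

  if (reapply_defaults) {
    // Behavior the report asks for: restore the work queue defaults.
    reset_timeout(hb, wq.grace, wq.suicide_grace);
  }
  return hb;
}
```

With `reapply_defaults` set to `false`, the returned handle keeps the `threadpool_default_timeout` grace and a zero suicide_grace, which is the state reported in this bug. With `true`, both work queue defaults are back in effect before the work is processed.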