Bug #45076

closed

rados: Sharded OpWQ drops suicide_grace after waiting for work

Added by Dan Hill about 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: mimic, nautilus, octopus
Regression: No
Severity: 3 - minor
Reviewed:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The Sharded OpWQ will opportunistically wait for more work when processing an empty queue. While waiting, the default work queue heartbeat grace and suicide_grace values are modified [0]. The `threadpool_default_timeout` grace is left applied and suicide_grace is disabled.

Once work is found, the original work queue defaults should be re-applied, but they are not. As a result, operations that hang afterwards may never trigger an OSD suicide recovery.

[0] https://github.com/ceph/ceph/blob/38ae96e1c9a4f8ad3095626c71951a122bdc8fe7/src/osd/OSD.cc#L10451
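
The behaviour described above can be sketched in a small standalone model. This is not the code from OSD.cc: `HeartbeatHandle`, `WorkQueueTimeouts`, `reset_timeout`, and both `wake_after_wait_*` functions are hypothetical stand-ins, and the timeout values are illustrative assumptions only. It shows how the relaxed timeouts applied while waiting remain in place unless the work queue defaults are explicitly restored.

```cpp
#include <cassert>
#include <chrono>

using std::chrono::seconds;

// Hypothetical stand-in for a heartbeat handle (not Ceph's actual type).
struct HeartbeatHandle {
  seconds grace{0};
  seconds suicide_grace{0};  // 0 models "suicide check disabled"
};

// Hypothetical work queue defaults; the values are illustrative assumptions.
struct WorkQueueTimeouts {
  seconds grace{15};
  seconds suicide_grace{150};
};

// Illustrative value standing in for threadpool_default_timeout.
constexpr seconds kThreadpoolDefaultTimeout{60};

// Apply a grace / suicide_grace pair to the handle.
void reset_timeout(HeartbeatHandle& hb, seconds grace, seconds suicide_grace) {
  assert(grace.count() >= 0 && suicide_grace.count() >= 0);
  hb.grace = grace;
  hb.suicide_grace = suicide_grace;
}

bool suicide_check_enabled(const HeartbeatHandle& hb) {
  return hb.suicide_grace.count() > 0;
}

// Models the reported behaviour: the relaxed timeouts set while waiting
// for work stay applied once work arrives.
HeartbeatHandle wake_after_wait_buggy(const WorkQueueTimeouts& wq) {
  HeartbeatHandle hb;
  reset_timeout(hb, wq.grace, wq.suicide_grace);            // normal processing
  reset_timeout(hb, kThreadpoolDefaultTimeout, seconds{0}); // empty queue: wait
  // Work is found here, but nothing restores the work queue defaults.
  return hb;
}

// Models the expected behaviour: the defaults are re-applied once work is found.
HeartbeatHandle wake_after_wait_fixed(const WorkQueueTimeouts& wq) {
  HeartbeatHandle hb;
  reset_timeout(hb, wq.grace, wq.suicide_grace);            // normal processing
  reset_timeout(hb, kThreadpoolDefaultTimeout, seconds{0}); // empty queue: wait
  reset_timeout(hb, wq.grace, wq.suicide_grace);            // work found: restore
  return hb;
}
```

In this model, a shard thread that went through the buggy path keeps a zero suicide_grace, so a subsequent hung operation would not be caught by the suicide check.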


Related issues 3 (0 open, 3 closed)

Copied to RADOS - Backport #45357: octopus: rados: Sharded OpWQ drops suicide_grace after waiting for work (Resolved, Dan Hill)
Copied to RADOS - Backport #45358: mimic: rados: Sharded OpWQ drops suicide_grace after waiting for work (Rejected, Dan Hill)
Copied to RADOS - Backport #45359: nautilus: rados: Sharded OpWQ drops suicide_grace after waiting for work (Resolved, Dan Hill)