Bug #42332

CephContext::CephContextServiceThread might pause for 5 seconds at shutdown

Added by Jason Dillaman almost 2 years ago. Updated over 1 year ago.

Status: Resolved
Priority: High
Assignee: Jason Dillaman
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: luminous,mimic,nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The entry loop in CephContext::CephContextServiceThread doesn't check for thread exit prior to waiting. This can result in a delay of up to "heartbeat_interval" seconds (default 5 sec) at shutdown. For short-lived CLI processes (e.g. rbd device list), this can intermittently stall the process at exit and slow down ceph-csi operations. In the strace excerpt below, the futex wait times out after 5 seconds immediately before the process exits.

20:25:49.861550 set_robust_list(0x7f5fffbbe9e0, 24) = 0 <0.000059>
20:25:49.861676 gettid()                = 41017 <0.000033>
20:25:49.861753 prctl(PR_SET_NAME, "service\0dump per") = 0 <0.000051>
20:25:49.861857 clock_gettime(CLOCK_REALTIME, {1571171149, 861885650}) = 0 <0.000015>
20:25:49.861921 futex(0x55d144467cdc, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1571171154, 861885650}, ffffffff) = -1 ETIMEDOUT (Connection timed out) <5.000105>
20:25:54.862144 clock_gettime(CLOCK_REALTIME, {1571171154, 862205026}) = 0 <0.000041>
20:25:54.862282 futex(0x55d144467cb0, FUTEX_WAKE_PRIVATE, 1) = 0 <0.000037>
20:25:54.862396 madvise(0x7f5fff3be000, 8351744, MADV_DONTNEED) = 0 <0.000047>
20:25:54.862496 exit(0)                 = ?
20:25:54.862595 +++ exited with 0 +++
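
The following is a minimal sketch of the pattern described above, not the actual Ceph code or the fix that was merged. ServiceThread, request_exit, and shutdown_latency_ms are hypothetical names used for illustration only. It assumes a worker that sleeps on a condition variable for a fixed interval, and shows why the exit flag needs to be checked before the wait: if the exit request (and its notification) arrives before the thread reaches the wait, the notification is lost and the thread would otherwise sleep for the full interval.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a periodic service thread (not the Ceph class).
class ServiceThread {
public:
  explicit ServiceThread(std::chrono::milliseconds interval)
    : interval_(interval) {}

  // Thread entry point: do periodic work until an exit is requested.
  void entry() {
    std::unique_lock<std::mutex> lock(mutex_);
    while (true) {
      // Check the exit flag BEFORE waiting. An exit request issued before
      // this point has already sent its notification, so waiting without
      // this check would block for the whole interval.
      if (exit_requested_) {
        break;
      }
      // Sleep until notified or until the interval elapses. A spurious
      // wakeup only causes the periodic work to run a little early.
      cond_.wait_for(lock, interval_);
      if (exit_requested_) {
        break;
      }
      ++ticks_;  // placeholder for the periodic work
    }
  }

  // Ask the thread to exit and wake it if it is currently waiting.
  void request_exit() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      exit_requested_ = true;
    }
    cond_.notify_all();
  }

private:
  std::mutex mutex_;
  std::condition_variable cond_;
  std::chrono::milliseconds interval_;
  bool exit_requested_ = false;
  int ticks_ = 0;
};

// Measures how long the thread takes to finish (in milliseconds) when the
// exit request is issued before the thread has started waiting.
long long shutdown_latency_ms(std::chrono::milliseconds interval) {
  ServiceThread svc(interval);
  svc.request_exit();  // notification is sent before anyone is waiting
  auto start = std::chrono::steady_clock::now();
  std::thread worker([&svc] { svc.entry(); });
  worker.join();
  auto elapsed = std::chrono::steady_clock::now() - start;
  return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count();
}
```

With the pre-wait check in place, shutdown_latency_ms(std::chrono::milliseconds(5000)) should return well under the 5 second interval. Removing the first exit check from entry() would be expected to reproduce the pause seen in the strace output above, since the only notification has already been sent by the time the thread waits.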

Related issues

Copied to RADOS - Backport #42393: luminous: CephContext::CephContextServiceThread might pause for 5 seconds at shutdown Resolved
Copied to RADOS - Backport #42394: mimic: CephContext::CephContextServiceThread might pause for 5 seconds at shutdown Resolved
Copied to RADOS - Backport #42395: nautilus: CephContext::CephContextServiceThread might pause for 5 seconds at shutdown Resolved

History

#1 Updated by Jason Dillaman almost 2 years ago

  • Status changed from New to In Progress
  • Assignee set to Jason Dillaman

#2 Updated by Kefu Chai almost 2 years ago

  • Status changed from In Progress to Pending Backport

#3 Updated by Kefu Chai almost 2 years ago

  • Pull request ID set to 30947

#4 Updated by Nathan Cutler almost 2 years ago

  • Copied to Backport #42393: luminous: CephContext::CephContextServiceThread might pause for 5 seconds at shutdown added

#5 Updated by Nathan Cutler almost 2 years ago

  • Copied to Backport #42394: mimic: CephContext::CephContextServiceThread might pause for 5 seconds at shutdown added

#6 Updated by Nathan Cutler almost 2 years ago

  • Copied to Backport #42395: nautilus: CephContext::CephContextServiceThread might pause for 5 seconds at shutdown added

#7 Updated by Nathan Cutler almost 2 years ago

@Jason - is this issue serious enough to warrant a luminous backport at this stage?

#8 Updated by Jason Dillaman almost 2 years ago

Nathan Cutler wrote:

@Jason - is this issue serious enough to warrant a luminous backport at this stage?

It's a trivial backport and luminous is still supported, so we should backport it.

#9 Updated by Nathan Cutler over 1 year ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
