
Bug #19986

PR #14886 creates a SafeTimer thread per PG

Added by Mark Nelson 4 months ago. Updated about 2 months ago.

Status: Resolved
Priority: High
Assignee: -
Category: OSD
Target version: -
Start date: 05/19/2017
Due date: -
% Done: 0%
Source: -
Tags: -
Backport: jewel,kraken
Regression: No
Severity: 3 - minor
Reviewed: -
Affected Versions: -
ceph-qa-suite: -
Release: -
Needs Doc: No

Description

This appears to be causing huge numbers of idle safe_timer threads to be created on the OSD. While this is likely not ideal for a variety of reasons, the immediate fallout is that it's making wallclock profiling ridiculously slow and also making it really irritating to gather backtraces from gdb.

https://github.com/ceph/ceph/pull/14886


Related issues

Copied to Ceph - Backport #20172: jewel: PR #14886 creates a SafeTimer thread per PG Resolved
Copied to Ceph - Backport #20173: kraken: PR #14886 creates a SafeTimer thread per PG Resolved

History

#1 Updated by Brad Hubbard 4 months ago

Hi Mark, thanks for the report. The intention was that this change would improve performance by freeing up the work queue to make progress: the sleep is now done asynchronously rather than synchronously, as it was prior to 14886. I'll see what can be done about the issues you are seeing. Do you see the same sort of issue (to a lesser extent) when snap trimming is in progress, since the sleep in that case is implemented in a very similar fashion?

#2 Updated by Greg Farnum 4 months ago

We just don't need a Timer object/thread for every PG, Brad. The snap trim timer does the same thing with sleeps but uses a single Timer per OSD. :)

#3 Updated by Greg Farnum 4 months ago

I guess in particular, the issue is that there is a cost per thread (not a huge one, but it exists). Mark is seeing it pretty aggressively because it dramatically increases the number of threads and he's doing hacks with libunwind on every thread and things, but it also manifests as higher VSS and RSS that we don't want. Should be pretty simple to have a shared Timer object instead of one per PG.
(I'm not sure if there's any particular reason we have more than one Timer to begin with; I guess just because they interact with different threads and locks and we don't want them to contend?)
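To make the shape of Greg's suggestion concrete, here is a minimal standalone sketch of the pattern: one timer thread shared by many consumers (the "PGs"), instead of a thread per consumer. This is plain C++11 with illustrative names, not Ceph's actual SafeTimer API.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// One timer thread shared by all producers. Each scheduled event is a
// (deadline, callback) pair kept in a min-heap ordered by deadline.
class SharedTimer {
  using Clock = std::chrono::steady_clock;
  struct Event {
    Clock::time_point when;
    std::function<void()> cb;
    bool operator>(const Event& o) const { return when > o.when; }
  };

  std::priority_queue<Event, std::vector<Event>, std::greater<Event>> events_;
  std::mutex mtx_;
  std::condition_variable cv_;
  bool stopping_ = false;
  std::thread thread_;  // the single worker thread

  void run() {
    std::unique_lock<std::mutex> l(mtx_);
    while (!stopping_) {
      if (events_.empty()) {
        cv_.wait(l);                            // nothing queued: sleep
      } else if (Clock::now() >= events_.top().when) {
        auto cb = events_.top().cb;             // due: pop and fire
        events_.pop();
        l.unlock();
        cb();                                   // run callback outside the lock
        l.lock();
      } else {
        cv_.wait_until(l, events_.top().when);  // sleep until next deadline
      }
    }
  }

public:
  SharedTimer() : thread_([this] { run(); }) {}
  ~SharedTimer() {
    { std::lock_guard<std::mutex> l(mtx_); stopping_ = true; }
    cv_.notify_one();
    thread_.join();
  }
  void add_event_after(std::chrono::milliseconds delay,
                       std::function<void()> cb) {
    std::lock_guard<std::mutex> l(mtx_);
    events_.push({Clock::now() + delay, std::move(cb)});
    cv_.notify_one();
  }
};
```

With this shape, an OSD hosting hundreds of PGs adds one timer thread total rather than one per PG, which is the VSS/RSS and profiling cost Greg describes above.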

#4 Updated by Brad Hubbard 4 months ago

Yep, it seemed like the simplest implementation, but I guess I should have seen this coming... I'll work out a better way to do this ASAP; as you say, a shared object with some form of synchronisation.

#5 Updated by Brad Hubbard 4 months ago

This was a stupid error and I have a pretty simple fix. Testing it now and sorry for the mess.

Update: There's currently a problem with Shaman, etc. Will test this and post the PR ASAP.

#6 Updated by Brad Hubbard 4 months ago

  • Status changed from New to In Progress

Have https://github.com/ceph/ceph/pull/15217 for now, although this may need modification to account for the PG disappearing while the timer is running. Looking into this further.
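One common way to guard against the lifetime problem mentioned above (a PG destroyed while its event is still queued on a shared timer) is to have the callback hold only a weak reference to its owner and do nothing once the owner is gone. This is a hedged sketch of that idiom; Ceph's actual fix in PR 15217 may differ, and the names here are purely illustrative.

```cpp
#include <functional>
#include <memory>

// Illustrative stand-in for a PG that wants a delayed wakeup.
struct FakePG {
  int woken = 0;
  void on_sleep_done() { ++woken; }
};

// Wrap a member callback so it becomes a no-op once the PG is destroyed:
// the timer holds no owning reference, so a queued event can never fire
// into freed memory.
std::function<void()> guarded_wakeup(const std::shared_ptr<FakePG>& pg) {
  std::weak_ptr<FakePG> weak = pg;
  return [weak] {
    if (auto p = weak.lock())   // still alive: deliver the wakeup
      p->on_sleep_done();
    // otherwise the PG vanished while the event was queued: skip it
  };
}
```

The alternative is to cancel the pending event explicitly in the PG's teardown path, which avoids the dangling event but requires the PG to track its outstanding timer handles.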

#7 Updated by Sage Weil 4 months ago

  • Status changed from In Progress to Resolved

#8 Updated by Brad Hubbard 4 months ago

#9 Updated by Brad Hubbard 4 months ago

  • Status changed from Resolved to Pending Backport

#10 Updated by Nathan Cutler 4 months ago

  • Backport set to jewel,kraken

Must be backported together with #19497, i.e. a total of two master PRs.

#11 Updated by Nathan Cutler 4 months ago

  • Copied to Backport #20172: jewel: PR #14886 creates a SafeTimer thread per PG added

#12 Updated by Nathan Cutler 4 months ago

  • Copied to Backport #20173: kraken: PR #14886 creates a SafeTimer thread per PG added

#13 Updated by Nathan Cutler about 2 months ago

  • Status changed from Pending Backport to Resolved
