Feature #22973: log lines when hitting "pg overdose protection" - RADOS - Ceph

closed

log lines when hitting "pg overdose protection"

Added by Dan Stoner about 6 years ago. Updated about 6 years ago.

Status: Duplicate

Priority: Normal

Source: Community (user)

Component(RADOS): OSD

Description

After upgrading to Luminous we ran into a situation where 10% of our PGs remained unavailable, stuck in the "activating" state.

https://ceph.com/community/new-luminous-pg-overdose-protection/

That blog post says:

"If any individual OSD is ever asked to create more PGs than it should it will simply refuse and ignore the request."

The only non-debug direct evidence was this WARNING in ceph status:

'too many PGs per OSD (221 > max 200)'

(We are aware that we need to fix this situation in our cluster)

Many PGs were stuck in the "activating" state, which is not documented in the PG state table:

http://docs.ceph.com/docs/master/rados/operations/pg-states/

The feature idea is that the OSD should write a message at the standard log level when it refuses to create a PG because it has hit osd_max_pg_per_osd_hard_ratio.
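To make the two thresholds concrete, here is a minimal sketch of the check as I understand it. It assumes the hard limit is the soft limit multiplied by osd_max_pg_per_osd_hard_ratio. The soft limit of 200 comes from the warning quoted above; the ratio of 2.0, the function name, and the exact boundary comparison are illustrative assumptions, not values read from the OSD source or from our cluster.

```python
def classify_pg_load(pgs_on_osd, soft_limit=200, hard_ratio=2.0):
    """Classify one OSD's PG count against the soft and hard limits.

    soft_limit stands in for the "max 200" shown in the health warning.
    hard_ratio stands in for osd_max_pg_per_osd_hard_ratio; 2.0 is an
    assumed value for illustration, so check the cluster's real setting.
    """
    hard_limit = soft_limit * hard_ratio

    # Assumption: at or above the hard limit the OSD refuses new PGs.
    # The exact boundary (>= versus >) is not confirmed here.
    if pgs_on_osd >= hard_limit:
        return "refuse"

    # Above the soft limit only the health warning is raised.
    if pgs_on_osd > soft_limit:
        return "warn"

    return "ok"


if __name__ == "__main__":
    for count in (150, 221, 450):
        print(count, classify_pg_load(count))
```

Under these assumptions, 221 PGs on an OSD only triggers the warning, while an OSD that is asked to hold 400 or more silently refuses. That second case is the one where a log line would have pointed us at the cause.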

We saw lots of "stuck" in all of the management command outputs but not the underlying reason.

I would also ask whether this situation should raise an ERROR rather than a WARNING, since the cluster becomes "partially unavailable".

Related issues 1 (0 open — 1 closed)

Updated by Greg Farnum about 6 years ago

Status changed from New to Duplicate

You're right that it's bad! This will be fixed in the next luminous release after a belated backport finally happened. :)

Updated by Greg Farnum about 6 years ago

Is duplicate of Bug #22440: New pgs per osd hard limit can cause peering issues on existing clusters added
