Bug #38931
osd does not proactively remove leftover PGs
Status: Closed
Description
(Context: cephfs cluster running v12.2.11)
We had an OSD go nearfull this weekend. I reweighted it to move some PGs off, but looking at it today, it is still holding much more data than it should.
The OSD currently has 34 PGs mapped to it:
ID CLASS WEIGHT  REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS
74   hdd 5.45609  1.00000 5.46TiB 3.86TiB 1.60TiB 70.77 1.37  34
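For reference, the reweighting was done with the usual CLI; the 0.90 value below is illustrative, not the exact value used:

$ ceph osd reweight 74 0.90      # temporary override weight to drain PGs off osd.74
$ ceph osd df | awk '$1 == 74'   # re-check usage and the mapped PG count afterwards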
But the OSD itself reports 20 more:
{ "whoami": 74, "state": "active", "oldest_map": 46992, "newest_map": 47738, "num_pgs": 54 }
When I restart the OSD, it reloads those 20 PGs. For example, here is one it loads even though it is mapped to [22,129,14], a set that does not include osd.74. That PG is currently active+clean:
2019-03-25 11:09:26.655955 7fe3fdb23d80 10 osd.74 47719 load_pgs loaded pg[2.6d( v 47685'27177090 (47637'27175587,47685'27177090] lb MIN (bitwise) local-lis/les=47680/47681 n=0 ec=371/371 lis/c 47683/47497 les/c/f 47684/47498/0 47686/47688/43553) [22,129,14] r=-1 lpr=47689 pi=[47497,47688)/1 crt=47685'27177090 lcod 0'0 unknown NOTIFY mbc={}] log((47637'27175587,47685'27177090], crt=47685'27177090)
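The mapping can be cross-checked from the monitor side with standard commands (the expectations noted in the comments are mine):

$ ceph pg map 2.6d     # should show up/acting [22,129,14], i.e. no osd.74
$ ceph pg 2.6d query   # detailed peering state, including past intervals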
I found a way to remove those leftover PGs (without using ceph-objectstore-tool): if a PG re-peers, osd.74 notices that it is not in the up/acting set and starts deleting the PG.
So at the moment I'm restarting those former peers to trim this OSD.
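As a sketch of that workaround (the systemd unit name is assumed; osd.22 is the current primary of pg 2.6d above):

$ systemctl restart ceph-osd@22   # restarting the primary forces the PG to re-peer
# Marking the primary down in the osdmap may also trigger re-peering without a
# full daemon restart, though I have not verified that here:
$ ceph osd down 22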
Is this all expected behaviour?
Shouldn't the OSD start removing leftover PGs at boot time?