Many OSDs on one node are down. After the PGs are remapped, a PG member may fail to be selected.
The environment is a three-node Ceph cluster with a 2+1 redundancy configuration. Because of our own OSD cache feature, unplugging the SSD disk takes down half of the OSDs on that node, so the PGs are remapped.
In fact, this has nothing to do with the OSD cache; that was only the trigger. To put it plainly: a large number of OSDs on this node went down (half of them), and after the PGs were remapped, one PG could not select a complete set of members.
In short, the number of hosts equals the number of redundant copies, meaning each PG must contain one OSD from each host. On that basis, when half of the OSDs on one host are down, this can happen: a PG cannot select a full set of OSDs.
It does not always happen; there is only a certain probability. Because the input parameters to the CRUSH algorithm differ, the results differ. For example, with pool ID 1 the failure may appear, while with pool ID 2 it may not.
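The probabilistic failure described above can be illustrated with a toy model. This is not actual CRUSH code: real CRUSH uses straw2 bucket selection and several retry tunables, but the essential behavior is the same in sketch form, and the hash function, retry count, and cluster layout below are all invented for illustration. Each host must supply one OSD, selection is a deterministic hash of the pool ID, PG number, and attempt number, and there is a bounded number of retries. When half of a host's OSDs are down, every retry can land on a down OSD, and whether that happens for a given PG depends on the hash inputs, which is why one pool ID can show the problem while another does not.

```python
import hashlib

def draw(*args):
    """Deterministic pseudo-random draw, standing in for CRUSH's hash."""
    data = ":".join(map(str, args)).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")

def select_osds(pool_id, pg_id, hosts, down, tries=5):
    """Pick one up OSD per host (replica count == host count).

    Returns None if some host cannot supply an up OSD within `tries`
    attempts, mimicking CRUSH's bounded-retry failure mode. The retry
    limit of 5 is an arbitrary stand-in for CRUSH's tunables."""
    chosen = []
    for host_idx, osds in enumerate(hosts):
        picked = None
        for attempt in range(tries):
            cand = osds[draw(pool_id, pg_id, host_idx, attempt) % len(osds)]
            if cand not in down:
                picked = cand
                break
        if picked is None:
            return None  # incomplete PG: a member could not be selected
        chosen.append(picked)
    return chosen

# Three hosts with four OSDs each; half the OSDs on host 0 are down.
hosts = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
down = {0, 1}

for pool_id in (1, 2):
    failed = sum(select_osds(pool_id, pg, hosts, down) is None
                 for pg in range(128))
    print(f"pool {pool_id}: {failed}/128 PGs incomplete")
```

With 2 of 4 OSDs down on one host and 5 retries, each PG independently fails with probability (1/2)^5, so a small but nonzero fraction of PGs comes out incomplete, and the affected set differs per pool ID.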
#1 Updated by Greg Farnum 3 months ago
- Status changed from New to Closed
Yes, sometimes CRUSH selection fails when you have a very small number of choices compared to the number of required selections. Nuking half the OSDs in a host will make this an essentially impossible scenario to handle.
Even if CRUSH could select a node, this in general won't work: if your cluster is anywhere near full, losing half a node means you don't have the storage space available to rebalance correctly, since every host has to be able to store a full copy of all the data.
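The capacity point can be made concrete with a back-of-the-envelope check. The numbers here are invented for illustration: three equal hosts, and a fill ratio chosen to show the squeeze. Since every host must hold a full copy of the data, a host that loses half its OSDs must fit the entire data set into half its raw capacity, which fails as soon as the cluster is more than half full.

```python
# Illustrative numbers only: three hosts of 10 TB raw capacity each,
# cluster 60% full, one full copy of all data required on every host.
host_capacity_tb = 10.0
fill_ratio = 0.60

data_set_tb = host_capacity_tb * fill_ratio  # each host currently holds 6 TB
degraded_capacity_tb = host_capacity_tb / 2  # host keeps 5 TB after losing half its OSDs

print(f"data to place per host: {data_set_tb:.1f} TB")
print(f"capacity of degraded host: {degraded_capacity_tb:.1f} TB")
print("rebalance possible:", data_set_tb <= degraded_capacity_tb)
```

At any fill ratio above 50%, the degraded host cannot absorb its required copy, so rebalancing is impossible regardless of what CRUSH selects.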