Bug #41255

backfill_toofull seen on cluster where the most full OSD is at 1%

Added by Bryan Stillwell over 4 years ago. Updated over 4 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
David Zafman
Category:
-
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
mimic, nautilus, luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

After upgrading both our test and staging clusters to Nautilus (14.2.2), we've seen both of them report some PGs as backfill_toofull whenever a single OSD is marked back into the cluster.

The test cluster is currently using just 148 GiB of storage across more than 100 4TB drives (it would be only ~4% full even if all of that data sat on a single drive!). The staging cluster is using 14 TiB of storage across more than 400 4TB drives.

Both clusters are a mixture of FileStore and BlueStore OSDs, and I'm wondering if that has anything to do with it. The PGs do eventually finish backfilling, but they should never enter a toofull state on clusters this empty.
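For anyone wanting to confirm the same symptom, a minimal command sketch follows (the OSD id 12 is just a placeholder; substitute the OSD actually being marked in). It checks per-OSD utilization, the configured backfill ratios, and which PGs are flagged backfill_toofull:

    # Overall health, including any backfill_toofull warnings
    ceph health detail

    # Per-OSD utilization; %USE stays in the low single digits on these clusters
    ceph osd df

    # The configured full/backfillfull/nearfull ratios
    ceph osd dump | grep ratio

    # PGs currently in the backfill_toofull state
    ceph pg dump pgs_brief | grep backfill_toofull

    # Mark the OSD back in, which is what triggers the behavior here
    ceph osd in 12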


Related issues: 4 (0 open, 4 closed)

Related to RADOS - Bug #39555: backfill_toofull while OSDs are not full (Unnecessary HEALTH_ERR) [Resolved, David Zafman]

Copied to RADOS - Backport #41582: luminous: backfill_toofull seen on cluster where the most full OSD is at 1% [Rejected, David Zafman]
Copied to RADOS - Backport #41583: nautilus: backfill_toofull seen on cluster where the most full OSD is at 1% [Resolved, Nathan Cutler]
Copied to RADOS - Backport #41584: mimic: backfill_toofull seen on cluster where the most full OSD is at 1% [Resolved, David Zafman]
