Bug #38582 (open)

Pool storage MAX AVAIL reduction seems higher when single OSD reweight is done

Added by Nokia ceph-users about 5 years ago. Updated about 5 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I have a 5-node Ceph 11.2.0 cluster with 335 OSDs. Each OSD is a 4TB HDD. The cluster has one EC 4+1 pool.

Due to high storage utilization in the cluster, I did a "ceph osd reweight" on an OSD that was in a near-full state. I gradually reduced the weight from 1 to 0.93. After that, we noticed that the pool's MAX AVAIL value had dropped by ~40TB.
I understand that reweighting OSDs will affect the available storage values, but I am not sure it is normal to see such a large drop, since the OSD I reweighted is only 4TB. Wouldn't the drop be less than or equal to 4TB, or is that not the correct calculation?
Is there any other information I should add?
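
For intuition, the sketch below models how MAX AVAIL can move by far more than one disk's size. All numbers are hypothetical, and pool_max_avail is a simplification of how Ceph's PGMap bounds a pool by whichever OSD would fill up first, not the exact implementation:

# Simplified model of a pool's MAX AVAIL: the pool is limited by the
# OSD that would fill up first, scaled by its share of the data and by
# the EC usable ratio. Hypothetical numbers throughout.

def pool_max_avail(osds, ec_k=4, ec_m=1):
    """osds: list of (free_bytes, crush_weight) pairs for the rule."""
    total_weight = sum(w for _, w in osds)
    # Each OSD receives roughly weight/total_weight of all new writes,
    # so the first OSD to fill up caps the entire pool.
    raw_limit = min(free / (w / total_weight) for free, w in osds if w > 0)
    # Only k of every k+m EC chunks are usable data.
    return raw_limit * ec_k / (ec_k + ec_m)

TB = 10**12
# 335 equal-weight OSDs of 4TB each; the fullest has only 0.6TB free.
osds = [(1.2 * TB, 1.0)] * 334 + [(0.6 * TB, 1.0)]
print(pool_max_avail(osds) / TB)  # ~161TB, bounded by that single OSD

# If rebalancing shaves just 0.15TB more off the fullest OSD's free
# space, MAX AVAIL drops by ~0.15 * 335 * 0.8 = ~40TB.
osds[-1] = (0.45 * TB, 1.0)
print(pool_max_avail(osds) / TB)  # ~121TB

Under this model, a ~40TB swing caused by a single 4TB OSD is plausible: the fullest OSD's free space is effectively multiplied by the OSD count when projecting the pool-wide figure.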


Files

osd-dump.txt (80.8 KB) - Nokia ceph-users, 03/20/2019 12:05 PM
osd-tree.txt (21.8 KB) - Nokia ceph-users, 03/20/2019 12:05 PM
crush_map_decompressed (16.2 KB) - Nokia ceph-users, 03/20/2019 12:05 PM
Actions #1

Updated by Greg Farnum about 5 years ago

  • Project changed from Ceph to RADOS

That does seem odd. Can you attach your crush map, "ceph osd tree", and "ceph osd dump" to this ticket?

Actions #2

Updated by Nokia ceph-users about 5 years ago

Sorry for the delay. Attaching the required files.
osd.155 is the OSD mentioned in the description, the one which was manually reweighted.

Actions #3

Updated by Nokia ceph-users about 5 years ago

Correction to the description.
It looks like the pool's MAX AVAIL value had dropped after a hard disk failure on that OSD. A couple of days before we reweighted the OSD, the hard disk behind it failed (we create one OSD on top of one HDD). The disk was replaced later, and only then did we reweight. But the drop in pool storage happened when the disk itself failed.
Even then, is a 40TB drop normal when the disk size is actually only 4TB?
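
A back-of-the-envelope check of the failure scenario under the same simplified model (again with hypothetical numbers): when the OSD goes out, its data re-replicates to its peers, and the new fullest OSD's free space only needs to shrink by ~0.15TB for the pool-wide MAX AVAIL to move by ~40TB, because that one OSD's free space is scaled by roughly the whole OSD count.

# Hypothetical numbers: 335 equal-weight OSDs, EC 4+1 (usable ratio 0.8).
# In the simplified model, MAX AVAIL ~= fullest_osd_free * osd_count * 0.8.
osd_count, ec_ratio = 335, 4 / 5

fullest_free_tb = 0.60                         # before the disk failure
print(fullest_free_tb * osd_count * ec_ratio)  # ~161TB

# The failed OSD's data lands on its peers; if the new fullest OSD ends
# up with 0.15TB less free space, that is amplified ~268x (335 * 0.8).
fullest_free_tb = 0.45
print(fullest_free_tb * osd_count * ec_ratio)  # ~121TB, a ~40TB drop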
