Bug #65591

open

Pool MAX_AVAIL goes UP when an OSD is marked down+in

Added by Michael Kidd 13 days ago. Updated 3 days ago.

Status:
New
Priority:
Normal
Assignee:
Category:
Administration/Usability
Target version:
% Done:

0%

Source:
Community (dev)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
rados
Component(RADOS):
pgmap
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Example:
  • Cluster with 4 OSD nodes, 10 OSDs each
  • 3x replicated pool
  • `max_avail` from `ceph df detail --format=json` output with all OSDs `up+in`: 72158076207104
  • `max_avail` with 1 OSD node (10 OSDs) `down+in`: 96042674552832

1. The `raw_used_rate` is passed in to the function `PGMapDigest::dump_object_stat_sum`, and should be equal to:
  • the number of copies for replicated pools
  • or ( K + M ) / K for EC pools.
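
For reference, a minimal sketch of the nominal rate described in step 1. The function name and the 4+2 EC profile are illustrative assumptions, not taken from the Ceph source:

```python
def nominal_raw_used_rate(pool_type: str, size: int = 0, k: int = 0, m: int = 0) -> float:
    """Raw bytes consumed per byte of user data, before any degraded scaling.

    Hypothetical helper for illustration only; it is not a Ceph API.
    """
    if pool_type == "replicated":
        # One full copy of the data per replica.
        return float(size)
    if pool_type == "erasure":
        # K data chunks plus M coding chunks, per K chunks of user data.
        return (k + m) / k
    raise ValueError(f"unknown pool type: {pool_type}")


print(nominal_raw_used_rate("replicated", size=3))  # 3x replicated pool from the example
print(nominal_raw_used_rate("erasure", k=4, m=2))   # assumed 4+2 EC profile
```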

2. At: https://github.com/ceph/ceph/blob/main/src/mon/PGMap.cc#L886

raw_used_rate *= (float)(sum.num_object_copies - sum.num_objects_degraded) / sum.num_object_copies;

  • This applies a scaling factor equal to the fraction of object copies that are not degraded ( relative to the total object copies count ).
  • Using the 'all_down' pgdump for libvirt-pool:
          num_object_copies:    7287540
          num_objects_degraded: 1812426
          Scaling factor applied: 0.7512979688619205
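
The scaling factor above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic using the two counters quoted from the pgdump; the variable names mirror the fields, and the nominal rate of 3 comes from the 3x replicated pool in the example:

```python
# Counters quoted from the 'all_down' pgdump for libvirt-pool.
num_object_copies = 7287540
num_objects_degraded = 1812426

# Same expression as the quoted PGMap.cc line, applied to a nominal rate of 3.
scaling = (num_object_copies - num_objects_degraded) / num_object_copies
scaled_raw_used_rate = 3 * scaling

print(f"scaling factor:       {scaling:.6f}")
print(f"scaled raw_used_rate: {scaled_raw_used_rate:.6f}")
```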
    

3. The 'MAX_AVAIL' value is calculated at: https://github.com/ceph/ceph/blob/main/src/mon/PGMap.cc#L901

   auto avail_res = raw_used_rate ? avail / raw_used_rate : 0;

  • 'avail' is the raw available bytes
  • 'raw_used_rate' is now ~75% of its nominal value, so dividing 'avail' by it yields a larger 'MAX_AVAIL'
   avail: ( min(osd_avail_kbytes) * num_osds ) - ( sum(osd_max_kbytes) * ( 1 - mon_osd_full_ratio ))
          (5597462260 * 40) - ( 250048839680 * ( 1 - 0.95 )) = 211396048416
   raw_used_rate: 3 * 0.7512979688619205 = 2.253893907
   max_avail: 211396048416 / 2.253893907 * 1024 = 96042476953095
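
As a cross-check, the calculation above can be sketched in Python for both states of the cluster. The function and parameter names are hypothetical and only mirror the formula written in this report (inputs in KiB, result in bytes); this is not the actual PGMap.cc code path:

```python
def max_avail_bytes(min_osd_avail_kb, num_osds, sum_osd_kb, full_ratio, raw_used_rate):
    """MAX_AVAIL as derived in this report (hypothetical helper, not a Ceph API)."""
    # Raw available space, minus the share reserved by the full ratio.
    avail_kb = min_osd_avail_kb * num_osds - sum_osd_kb * (1 - full_ratio)
    # Mirrors: raw_used_rate ? avail / raw_used_rate : 0
    if not raw_used_rate:
        return 0.0
    return avail_kb / raw_used_rate * 1024


# Values quoted in the description above.
inputs = dict(min_osd_avail_kb=5597462260, num_osds=40,
              sum_osd_kb=250048839680, full_ratio=0.95)
scaling = (7287540 - 1812426) / 7287540  # non-degraded fraction of object copies

healthy = max_avail_bytes(raw_used_rate=3.0, **inputs)             # all OSDs up+in
degraded = max_avail_bytes(raw_used_rate=3.0 * scaling, **inputs)  # 10 OSDs down+in

print(f"up+in:   {healthy:.0f}")
print(f"down+in: {degraded:.0f}")
print(f"ratio:   {degraded / healthy:.4f}")
```

Both results should land within a small fraction of a percent of the `max_avail` values reported by `ceph df detail`, and their ratio is simply the inverse of the scaling factor, which is why MAX_AVAIL rises by roughly a third.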
Actions #1

Updated by Radoslaw Zarzynski 10 days ago

  • Pull request ID set to 57003
Actions #2

Updated by Radoslaw Zarzynski 3 days ago

Bump up.
