Bug #65591
Pool MAX_AVAIL goes UP when an OSD is marked down+in
Status:
New
Priority:
Normal
Assignee:
Category:
Administration/Usability
Target version:
% Done:
0%
Source:
Community (dev)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
rados
Component(RADOS):
pgmap
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Example:
- Cluster with 4 OSD nodes, 10 OSDs each
- 3x replicated pool
- `max_avail` from `ceph df detail --format=json` output with all OSDs `up+in`: 72158076207104
- `max_avail` with 1 OSD node (10 OSDs) `down+in`: 96042674552832
1. The `raw_used_rate` passed into the function `PGMapDigest::dump_object_stat_sum` should be equal to:
- the number of copies for replicated pools, or
- ( K + M ) / K for EC pools.
2. At: https://github.com/ceph/ceph/blob/main/src/mon/PGMap.cc#L886
raw_used_rate *= (float)(sum.num_object_copies - sum.num_objects_degraded) / sum.num_object_copies;
- This applies a scaling factor equal to the fraction of object copies that are not degraded (relative to the total object copy count).
- Using the 'all_down' pgdump for libvirt-pool:
num_object_copies: 7287540
num_objects_degraded: 1812426
Scaling factor applied: (7287540 - 1812426) / 7287540 = 0.7512979688619205
3. The 'MAX_AVAIL' value is calculated at: https://github.com/ceph/ceph/blob/main/src/mon/PGMap.cc#L901
auto avail_res = raw_used_rate ? avail / raw_used_rate : 0;
- 'avail' is the raw available bytes
- 'raw_used_rate' is now ~75% of what it was, so the computed 'MAX_AVAIL' increases accordingly
avail: ( min(osd_avail_kbytes) * num_osds ) - ( sum(osd_max_kbytes) * ( 1 - mon_osd_full_ratio ))
  = ( 5597462260 * 40 ) - ( 250048839680 * ( 1 - 0.95 ) ) = 211396048416
raw_used_rate: 3 * 0.7512979688619205 = 2.253893907
max_avail: 211396048416 / 2.253893907 * 1024 = 96042476953095
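For reference, here is a minimal standalone sketch that reproduces the arithmetic of the walkthrough above. It is not the actual PGMap.cc code: the helper names `estimate_avail_kb` and `max_avail_bytes` are invented for illustration, and the input numbers are taken from the example in this report.

```cpp
// Standalone sketch reproducing the MAX_AVAIL arithmetic described above.
// Not the real PGMap.cc code; helper names are invented for illustration.
#include <cstdint>
#include <cstdio>

// avail estimate used in the walkthrough:
// ( min(osd_avail_kbytes) * num_osds ) - ( sum(osd_max_kbytes) * ( 1 - mon_osd_full_ratio ))
static double estimate_avail_kb(double min_osd_avail_kb, int num_osds,
                                double sum_osd_max_kb, double full_ratio) {
  return min_osd_avail_kb * num_osds - sum_osd_max_kb * (1.0 - full_ratio);
}

// Mirrors the two PGMap.cc lines quoted above: raw_used_rate is scaled by the
// non-degraded fraction, then avail is divided by the scaled rate.
static double max_avail_bytes(double avail_kb, double raw_used_rate,
                              int64_t num_object_copies,
                              int64_t num_objects_degraded) {
  if (num_object_copies > 0) {
    raw_used_rate *= double(num_object_copies - num_objects_degraded) /
                     double(num_object_copies);
  }
  return raw_used_rate ? avail_kb / raw_used_rate * 1024.0 : 0.0;
}

int main() {
  // 211396048416 KiB, from the avail calculation above
  const double avail = estimate_avail_kb(5597462260.0, 40, 250048839680.0, 0.95);

  // All OSDs up+in: nothing degraded, raw_used_rate stays at 3 (replica count)
  const double up   = max_avail_bytes(avail, 3.0, 7287540, 0);
  // 1 OSD node (10 OSDs) down+in: ~25% of copies degraded, raw_used_rate shrinks
  const double down = max_avail_bytes(avail, 3.0, 7287540, 1812426);

  std::printf("MAX_AVAIL all up+in  : %.0f bytes (~72.2 TB)\n", up);
  std::printf("MAX_AVAIL 1 node down: %.0f bytes (~96.0 TB)\n", down);
  return 0;
}
```

The first figure comes out close to the reported all-up `max_avail` (72158076207104) and the second close to the reported down+in value (96042674552832), which suggests the degraded-copies scaling at PGMap.cc#L886 alone accounts for the increase.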