Bug #58505

open

Incorrect free space calculation for OSD and PG used bytes

Added by Andrey Groshev over 1 year ago. Updated about 1 year ago.

Status:
Need More Info
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I added new OSD nodes to the cluster and am now adding several disks to each. After a short period of rebalancing, the following problems appeared:

HEALTH_WARN 1 backfillfull osd(s); Low space hindering backfill (add storage if this doesn't resolve itself): 114 pgs backfill_toofull; 12 pool(s) backfillfull
[WRN] OSD_BACKFILLFULL: 1 backfillfull osd(s)
    osd.410 is backfill full
[WRN] PG_BACKFILL_FULL: Low space hindering backfill (add storage if this doesn't resolve itself): 114 pgs backfill_toofull
    pg 97.8 is active+remapped+backfill_toofull, acting [257,113,212,166,34,384]
    pg 97.e is active+remapped+backfill_toofull, acting [162,84,289,228,379,50]
    pg 97.17 is active+remapped+backfill_toofull, acting [359,287,211,63,140,56]

....skip...

[WRN] POOL_BACKFILLFULL: 12 pool(s) backfillfull
    pool 'pool1' is backfillfull
    pool 'pool2' is backfillfull
    pool 'pool3' is backfillfull
... skip ...

At the same time, the occupancy of this disk (osd.410) is only 37%:

# ceph osd df |awk '{if(NR==1 || /^410/){print}}'
ID   CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP      META      AVAIL    %USE   VAR   PGS  STATUS
410    hdd  9.09569   1.00000  9.1 TiB  3.4 TiB  3.4 TiB       0 B   9.2 GiB  5.7 TiB  37.37  0.65   37      up

But the total space reported for its PGs is almost 5 TB:

# ceph pg ls-by-osd osd.410|awk '{if($6 ~ /[[:digit:]]+/){s+=$6}}END{printf "%'"'"'d\n", s}'
4,990,453,504,665

On older disks, the difference is even greater:

# ceph osd df |awk '{if(NR==1 || /^  0/){print}}'
ID   CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP      META      AVAIL    %USE   VAR   PGS  STATUS
  0    hdd  9.09569   1.00000  9.1 TiB  5.0 TiB  5.0 TiB   2.5 MiB    13 GiB  4.1 TiB  54.71  0.95   34      up

# ceph pg ls-by-osd osd.0|awk '{if($6 ~ /[[:digit:]]+/){s+=$6}}END{printf "%'"'"'d\n", s}'
21,501,114,045,971

It seems to me that something is calculated incorrectly because we use pools with erasure coding (4+2).
If you divide the sum of PG bytes by 4, you get a more or less correct value.
A 10 TB disk cannot hold 21 TB and be only 50% used.
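
For illustration, here is a small standalone check (not Ceph code) using the osd.0 numbers above. It assumes that with an EC 4+2 pool each OSD in the acting set stores roughly 1/k of a PG's logical bytes (one data or coding chunk), so the per-OSD share is the summed PG bytes divided by k = 4:

#include <cstdint>
#include <cstdio>

int main() {
    // Sum of the BYTES column from `ceph pg ls-by-osd osd.0` (see above).
    const uint64_t pg_bytes_sum = 21501114045971ULL;
    // EC 4+2: each acting OSD holds about 1/k of a PG's logical bytes.
    const unsigned k = 4;
    const double tib = 1024.0 * 1024 * 1024 * 1024;

    std::printf("sum of PG bytes:    %.2f TiB\n", pg_bytes_sum / tib);      // ~19.56 TiB
    std::printf("per-OSD share (/k): %.2f TiB\n", pg_bytes_sum / k / tib);  // ~4.89 TiB
    // `ceph osd df` reports 5.0 TiB of DATA on osd.0, close to the /k value.
    return 0;
}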

#1

Updated by Casey Bodley over 1 year ago

  • Project changed from rgw to RADOS
#2

Updated by Andrey Groshev over 1 year ago

Not quite sure, but it looks like it's calculated here (./src/osd/OSD.cc, around line 1070):

float OSDService::compute_adjusted_ratio(osd_stat_t new_stat, float *pratio,
                                         uint64_t adjust_used)
{
  *pratio =
   ((float)new_stat.statfs.get_used_raw()) / ((float)new_stat.statfs.total);

  if (adjust_used) {
    dout(20) << __func__ << " Before kb_used() " << new_stat.statfs.kb_used()  << dendl;
    if (new_stat.statfs.available > adjust_used)
      new_stat.statfs.available -= adjust_used;
    else
      new_stat.statfs.available = 0;
    dout(20) << __func__ << " After kb_used() " << new_stat.statfs.kb_used() << dendl;
  }

  // Check all pgs and adjust kb_used to include all pending backfill data
  int backfill_adjusted = 0;
  vector<PGRef> pgs;
  osd->_get_pgs(&pgs);
  for (auto p : pgs) {
    backfill_adjusted += p->pg_stat_adjust(&new_stat);
  }
  if (backfill_adjusted) {
    dout(20) << __func__ << " backfill adjusted " << new_stat << dendl;
  }
  return ((float)new_stat.statfs.get_used_raw()) / ((float)new_stat.statfs.total);
}
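
I don't know exactly what pg_stat_adjust() adds for each PG, but if pending-backfill PG bytes were charged at their full logical size rather than the per-shard share (bytes / k) for an EC 4+2 pool, the adjusted ratio would cross the default osd_backfillfull_ratio of 0.90 even on a disk that is only 37% used. A minimal standalone sketch with assumed byte figures (not the Ceph implementation, only the final used_raw / total step mirrored):

#include <cstdio>

// Illustrative sketch only, not Ceph code. All byte figures are assumptions.
int main() {
    const double tib = 1024.0 * 1024 * 1024 * 1024;
    const double total        = 9.1 * tib;  // osd.410 SIZE from `ceph osd df`
    const double used_raw     = 3.4 * tib;  // actual RAW USE (about 37%)
    const double backfill_pgs = 5.0 * tib;  // logical bytes of PGs still backfilling (hypothetical)
    const int    k            = 4;          // EC 4+2 data chunks

    // If each incoming PG is charged at its full logical size:
    const double ratio_full  = (used_raw + backfill_pgs)     / total;  // ~0.92
    // If it is charged at its per-shard share (bytes / k):
    const double ratio_shard = (used_raw + backfill_pgs / k) / total;  // ~0.51

    std::printf("adjusted ratio, full-size accounting: %.2f\n", ratio_full);
    std::printf("adjusted ratio, per-shard accounting: %.2f\n", ratio_shard);
    // Only the first value exceeds the default osd_backfillfull_ratio of 0.90,
    // which would explain a backfillfull warning on an OSD that is 37% used.
    return 0;
}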
#3

Updated by Radoslaw Zarzynski about 1 year ago

  • Status changed from New to Need More Info

Could you please provide a verbose log (debug_osd=20) from one of those affected OSDs?
