Bug #50533

osd: check_full_status: check don't cares about RocksDB size

Added by Konstantin Shalygin about 2 months ago. Updated 23 days ago.

Status:
Resolved
Priority:
High
Assignee:
-
Category:
OSD
Target version:
-
% Done:

40%

Source:
Community (user)
Tags:
Backport:
pacific, octopus, nautilus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When the OSD bdev includes a block.db and the PGs serve millions of objects, RocksDB may grow to 10-20% of the OSD size, when objects match or are smaller than the bluefs min alloc size.
This OSD df:
- total: 913 GiB
- data: 629 GiB
- rocksdb: 131 GiB
- usage_percent: 83.24 (RAW USE)

But check_full_status() seems not to take the META size into account:

2021-04-27 10:58:58.763 7f0de3ca4700 20 osd.651 pg_epoch: 272496 pg[17.5e(unlocked)] pg_stat_adjust reserved_num_bytes 1572754KiB Before kb_used 659180204KiB
2021-04-27 10:58:58.763 7f0de3ca4700 20 osd.651 pg_epoch: 272496 pg[17.5e(unlocked)] pg_stat_adjust After kb_used 660752958KiB
2021-04-27 10:58:58.763 7f0de3ca4700 20 osd.651 272496 compute_adjusted_ratio backfill adjusted osd_stat(store_statfs(0x25df0f757c/0x20dba79000/0xe443e00000, data 0x11e5c35e19/0x9d292a9000, compress 0x0/0x0/0x0
, omap 0x30de, meta 0x20dba75f22), peers [2,11,12,16,18,19,23,28,29,30,31,37,38,39,45,46,47,650,652,660,664,666,674,675,680,684,685,687,688,698,699,701,702,710,712,714,718,722,725,727,731,732,737,741,748,751,75
3,757,758,759,760,763,765,769,772,774,776,779,780] op hist [])
2021-04-27 10:58:58.763 7f0de3ca4700 20 osd.651 272496 check_full_status cur ratio 0.690144, physical ratio 0.688501, new state none

913 GiB (SIZE) * 0.69 (RATIO) = 629.97 GiB; this matches the DATA size, not the RAW USE size.

ID   CLASS WEIGHT     REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL   %USE  VAR  PGS STATUS TYPE NAME
 651  nvme    0.91199  0.90999 913 GiB 760 GiB 629 GiB  12 KiB 131 GiB 153 GiB 83.24 1.34  39     up             osd.651
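The hex fields in the compute_adjusted_ratio log line can be decoded to confirm the mismatch. A minimal sketch — the field order (available / reserved / total, then data / meta) is assumed from the store_statfs output above, not taken from the Ceph source:

```python
GiB = 1 << 30

# Hex values copied from the compute_adjusted_ratio log line above.
total     = 0xe443e00000   # ~913 GiB device size
available = 0x25df0f757c   # ~152 GiB free
data      = 0x9d292a9000   # ~629 GiB of object data
meta      = 0x20dba75f22   # ~131 GiB RocksDB / bluefs metadata

# The ratio the OSD acts on: object data only.
print(f"physical ratio: {data / total:.4f}")                  # ~0.6885, as in the log

# The ratio `ceph osd df` shows: everything actually consumed.
print(f"raw ratio:      {(total - available) / total:.4f}")   # ~0.834, matching %USE 83.24

print(f"META not counted: {meta / GiB:.0f} GiB")              # ~131 GiB

# The pg_stat_adjust lines also add up exactly (values in KiB):
assert 659180204 + 1572754 == 660752958
```

So the ~131 GiB of RocksDB metadata is exactly the gap between the 0.69 ratio used by check_full_status and the 0.83 raw usage.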

OSD perf dump

root@host# ceph daemon osd.651 perf dump | grep numpg
        "numpg": 52,
        "numpg_primary": 12,
        "numpg_replica": 27,
        "numpg_stray": 13,
        "numpg_removing": 1,

This OSD serves roughly 3,500,000 (avg obj/PG) * 52 (numpg) ≈ 182,000,000 objects.
The consequence of this behaviour is that the nearfull and backfillfull cluster checks are not triggered.
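To illustrate why the checks stay silent, here is a simplified full-state classifier; this is not the actual check_full_status() implementation, and the thresholds are only the Ceph defaults (nearfull 0.85, backfillfull 0.90, full 0.95):

```python
# Simplified illustration, NOT the real check_full_status() code.
# Default Ceph threshold ratios assumed:
NEARFULL, BACKFILLFULL, FULL = 0.85, 0.90, 0.95

def full_state(ratio: float) -> str:
    """Classify a usage ratio the way the full checks would."""
    if ratio >= FULL:
        return "full"
    if ratio >= BACKFILLFULL:
        return "backfillfull"
    if ratio >= NEARFULL:
        return "nearfull"
    return "none"

data_gib, meta_gib, total_gib = 629, 131, 913

# Ratio based on object data only (what the log above shows):
print(full_state(data_gib / total_gib))               # -> none (at 0.69)

# Ratio that also counts RocksDB metadata: still "none", but at
# 0.83 the OSD is ~13 points closer to nearfull than it believes.
print(full_state((data_gib + meta_gib) / total_gib))  # -> none (at 0.83)
```

With META ignored, the OSD can keep filling up with metadata well past the point where the nearfull/backfillfull warnings should have fired.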


Related issues

Copied to Ceph - Backport #50601: octopus: osd: check_full_status: check don't cares about RocksDB size Resolved
Copied to Ceph - Backport #50602: pacific: osd: check_full_status: check don't cares about RocksDB size Resolved
Copied to Ceph - Backport #50603: nautilus: osd: check_full_status: check don't cares about RocksDB size Resolved

History

#1 Updated by Konstantin Shalygin about 2 months ago

  • Subject changed from osd: check_full_status: check not take care about RocksDB size to osd: check_full_status: check don't cares about RocksDB size

#2 Updated by Igor Fedotov about 2 months ago

  • Pull request ID set to 41043

#3 Updated by Igor Fedotov about 2 months ago

  • Status changed from New to Fix Under Review
  • Backport set to pacific, octopus, nautilus

#4 Updated by Konstantin Shalygin about 2 months ago

  • % Done changed from 0 to 40

#5 Updated by Kefu Chai about 2 months ago

  • Status changed from Fix Under Review to Pending Backport

#6 Updated by Backport Bot about 2 months ago

  • Copied to Backport #50601: octopus: osd: check_full_status: check don't cares about RocksDB size added

#7 Updated by Backport Bot about 2 months ago

  • Copied to Backport #50602: pacific: osd: check_full_status: check don't cares about RocksDB size added

#8 Updated by Backport Bot about 2 months ago

  • Copied to Backport #50603: nautilus: osd: check_full_status: check don't cares about RocksDB size added

#9 Updated by Igor Fedotov 23 days ago

  • Status changed from Pending Backport to Resolved
