Bug #40203


ceph df shows incorrect usage

Added by Momcilo Medic almost 5 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
ceph-mgr
Target version:
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

TL;DR

"ceph df" shows pool "itaf" as 1.2TiB used, while "rbd du -p itaf" shows total used as 19TiB
What feedback do I need to provide to help with resolving this?

Longer version:

While trying to resize our placement groups, we noticed a sudden, significant drop in both pool usage and capacity.
We use LibreNMS as a graphing tool, so we first suspected the monitoring itself was misbehaving.
However, after further discussion on IRC, we came to the conclusion that Ceph itself has two views of the usage.

The ceph df command shows the same value that the monitoring observes.
The autoscaler now suggests 16 PGs, whereas it previously suggested 256.

We did not perform any actions at that time, as far as we can track.
We are experiencing no issues with the virtualization running on top of Ceph.

This surfaced because our goal was to enable PG suggestions from the autoscaler, but we first needed to resize all pools.
We are now reluctant to take any significant actions (especially as the autoscaler would still throw warnings).

It is hard to guess which information is relevant, so I'll just provide filtered output of a few commands:

# rbd du -p itaf
NAME                    PROVISIONED USED     
...
<TOTAL>                      29 TiB   19 TiB 
#

# ceph df
...
    POOL        ID     STORED      OBJECTS     USED        %USED     MAX AVAIL 
    itaf         1     413 GiB       4.91M     1.2 TiB      0.50        79 TiB 
...
#
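
For whatever it's worth, the ceph df numbers look internally consistent if the pool is 3x replicated (which the RATE of 3.0 in the autoscale-status output below suggests, though I have not confirmed the pool's size setting): 413 GiB STORED times 3 is roughly the 1.2 TiB USED. The mismatch is between those figures and the 19 TiB from rbd du. A quick back-of-the-envelope check, with the replication factor of 3 being my assumption:

# Rough cross-check of the figures above; replication factor of 3 is an assumption.
GIB = 2**30
TIB = 2**40

stored      = 413 * GIB   # STORED from `ceph df`
used        = 1.2 * TIB   # USED from `ceph df`
rbd_du_used = 19 * TIB    # <TOTAL> USED from `rbd du -p itaf`
replication = 3           # assumed pool size (replica count)

print(f"STORED * replication = {stored * replication / TIB:.2f} TiB")  # ~1.21 TiB, matches USED
print(f"rbd du USED / ceph df USED = {rbd_du_used / used:.1f}x")       # ~15.8x discrepancy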

# ceph osd pool autoscale-status
 POOL       SIZE  TARGET SIZE  RATE  RAW CAPACITY   RATIO  TARGET RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE 
...
 itaf      1223G                3.0        332.9T  0.0108                 1.0    2048          16  warn      
#
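
The autoscaler also appears to be working from the same (smaller) usage figure as ceph df: SIZE 1223G times RATE 3.0, divided by RAW CAPACITY 332.9T, gives the reported RATIO of roughly 0.0108, which would explain why the suggested PG count dropped from 256 to 16. A small sketch of that arithmetic (my own sanity check, not the autoscaler's actual code):

# Sanity check of the RATIO column in `ceph osd pool autoscale-status`.
size_gib         = 1223    # SIZE (GiB)
rate             = 3.0     # RATE (replication factor)
raw_capacity_tib = 332.9   # RAW CAPACITY (TiB)

ratio = (size_gib * rate) / (raw_capacity_tib * 1024)
print(f"ratio = {ratio:.4f}")   # ~0.0108, matching the RATIO column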

I am also attaching a graph from LibreNMS showing the exact moment the usage and capacity dropped.
Please let me know what information is relevant and I'll make sure to provide it.

Kind regards,
Momo.


Files


Related issues 3 (1 open, 2 closed)

Related to Dashboard - Bug #45185: mgr/dashboard: fix usage calculation to match "ceph df" way (Resolved, Ernesto Puerta)

Related to Dashboard - Feature #38697: mgr/dashboard: Enhance info shown in Landing Page cards 'PGs per OSD' & 'Raw Capacity' (Closed)

Has duplicate mgr - Bug #41829: ceph df reports incorrect pool usage (New)

Actions #1

Updated by Stephan Müller over 3 years ago

  • Has duplicate Bug #41829: ceph df reports incorrect pool usage added
Actions #2

Updated by Stephan Müller over 3 years ago

  • Related to Bug #42982: Monitoring: alert for "pool full" wrong added
Actions #3

Updated by Stephan Müller over 3 years ago

  • Related to deleted (Bug #42982: Monitoring: alert for "pool full" wrong)
Actions #4

Updated by Stephan Müller over 3 years ago

  • Related to Bug #45185: mgr/dashboard: fix usage calculation to match "ceph df" way added
Actions #5

Updated by Stephan Müller over 3 years ago

  • Related to Feature #38697: mgr/dashboard: Enhance info shown in Landing Page cards 'PGs per OSD' & 'Raw Capacity' added
