Bug #19487

closed

"GLOBAL %RAW USED" of "ceph df" is not consistent with check_full_status

Added by Pan Liu about 7 years ago. Updated over 6 years ago.

Status:
Closed
Priority:
Normal
Assignee:
David Zafman
Category:
Administration/Usability
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

1) Use vstart.sh to create a cluster, with option: osd failsafe full ratio = .46

2) Run "ceph df":
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
439G 217G 199G 45.45

3) rbd create image --size 128

4) ceph -w:
2017-04-04 20:57:40.894659 [ERR] OSD full dropping all updates 51% full
2017-04-04 20:57:44.364939 [ERR] pgmap v31: 24 pgs: 24 active+clean: 2068 bytes data, 199 GB used, 217 GB / 439 GB avail

So "ceph -w" reports the cluster as full (51%), while "ceph df" shows only 45.45% raw used.

Actions #1

Updated by Kefu Chai about 7 years ago

  • Status changed from New to Fix Under Review
Actions #2

Updated by David Zafman about 7 years ago

Let's say I set osd failsafe full ratio = .90. The numbers below are made up to show why
these percentages won't necessarily match anyway.

dzafman$ bin/ceph df
  *** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
    GLOBAL:
    SIZE AVAIL RAW USED %RAW USED
    400G 69G 331G 82.75

[~/ceph/build] (wip-15912-followon)
dzafman$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sdb1 100G 80G 20G 80% /var/lib/ceph-0
/dev/sdc1 100G 80G 20G 80% /var/lib/ceph-1
/dev/sdd1 100G 80G 20G 80% /var/lib/ceph-2
/dev/sde1 100G 91G 9G 91% /var/lib/ceph-3

osd.3 will report failsafe full but it won't match the global used percentage.

The only two values would only match if every OSD had identical usage; if all OSDs were at 91% across the board, then "ceph df" would also show 91%.
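The arithmetic above can be sketched in a few lines. This is a hypothetical illustration, not Ceph code: the dict of OSD sizes and the `failsafe_full_ratio` variable are assumptions mirroring the made-up numbers in this comment. It shows how the cluster-wide %RAW USED is an aggregate, while the failsafe check runs per OSD.

```python
# Made-up OSD usage mirroring the df output above (not real Ceph internals).
osds = {
    "osd.0": {"size_gb": 100, "used_gb": 80},
    "osd.1": {"size_gb": 100, "used_gb": 80},
    "osd.2": {"size_gb": 100, "used_gb": 80},
    "osd.3": {"size_gb": 100, "used_gb": 91},
}

failsafe_full_ratio = 0.90  # per-OSD threshold; each OSD checks itself

# "ceph df" GLOBAL %RAW USED: total used divided by total size
total_size = sum(o["size_gb"] for o in osds.values())
total_used = sum(o["used_gb"] for o in osds.values())
global_pct = 100.0 * total_used / total_size
print(f"GLOBAL %RAW USED: {global_pct:.2f}")  # 82.75

# Failsafe check: only an individual OSD's own ratio matters
full_osds = [name for name, o in osds.items()
             if o["used_gb"] / o["size_gb"] >= failsafe_full_ratio]
print(f"failsafe full: {full_osds}")  # only osd.3, at 91%
```

So osd.3 trips the failsafe at 91% while the global figure sits at 82.75%, which is exactly the mismatch this ticket observed.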

Actions #3

Updated by Pan Liu about 7 years ago

I've updated my comment in https://github.com/ceph/ceph/pull/14318.

Actions #4

Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to RADOS
  • Category set to Administration/Usability
  • Status changed from Fix Under Review to In Progress
  • Assignee set to David Zafman

Based on PR comments we expect this to be fixed up by one of David's disk handling branches. Or did that one already merge?

Actions #5

Updated by David Zafman over 6 years ago

  • Status changed from In Progress to Closed

Reopen this if the issue hasn't been fixed in the latest code, with the understanding that each OSD has its own fullness determination.
