Bug #6751
Pool 'df' statistics go bad after changing PG count
Status: Closed
Description
To reproduce:
- Create a pool with few PGs
- Create some objects in the pool
- Note the 'ceph df' output
- Increase the number of PGs in the pool with ceph osd pool set <name> pg_num <count>
- Wait for PG creation
- Note the 'ceph df' output: the 'USED' and 'OBJECTS' values for the pool will have increased roughly in proportion to the new PG count, even though no new objects were created and actual space usage on the OSDs has barely changed.
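A scripted sketch of the same steps, for convenience (the pool name 'repro' and the fixed sleep are placeholders of mine, not from the report; a real harness would poll 'ceph pg stat' until everything is active+clean):

```python
#!/usr/bin/env python3
# Reproduction sketch: drives the same ceph/rados CLI steps listed above.
import subprocess
import time

def sh(cmd: str) -> None:
    print('+', cmd)
    subprocess.run(cmd, shell=True, check=True)

sh('ceph osd pool create repro 10')
sh('rados bench --no-cleanup -p repro -b 1000000 60 write')
sh('ceph df')                              # baseline USED/OBJECTS for 'repro'
sh('ceph osd pool set repro pg_num 100')
time.sleep(60)                             # crude wait for PG creation
sh('ceph pg stat')
sh('ceph df')                              # USED/OBJECTS now inflated ~10x
```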
Detail:
Create my pool:
ceph osd pool create pbench3 10
root@gravel1:~# ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    2790G     2789G     263M         0
POOLS:
    NAME         ID     USED       %USED     OBJECTS
    data         0      0          0         0
    metadata     1      0          0         0
    rbd          2      0          0         0
    pbench3      6      0          0         0
Put some objects in it:
rados bench --no-cleanup -p pbench3 -b 1000000 60 write
root@gravel1:~# ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    2790G     2784G     6048M        0.21
POOLS:
    NAME         ID     USED       %USED     OBJECTS
    data         0      0          0         0
    metadata     1      0          0         0
    rbd          2      0          0         0
    pbench3      6      2910M      0.10      3054
Increase the number of PGs:
root@gravel2:~# ceph osd pool set pbench3 pg_num 100
set pool 6 pg_num to 100
root@gravel2:~# ceph osd pool set pbench3 pgp_num 100
Error EAGAIN: currently creating pgs, wait
Let things quiet down:
root@gravel2:~# ceph pg stat
v9127: 292 pgs: 57 creating, 193 active+clean, 42 peering; 10906 MB data, 6122 MB used, 2784 GB / 2790 GB avail; 1020 MB/s wr, 1069 op/s
root@gravel2:~# ceph pg stat
v9132: 292 pgs: 281 active+clean, 11 peering; 31912 MB data, 6119 MB used, 2784 GB / 2790 GB avail; 1617 MB/s wr, 1696 op/s
root@gravel2:~# ceph pg stat
v9133: 292 pgs: 292 active+clean; 31912 MB data, 6119 MB used, 2784 GB / 2790 GB avail; 805 MB/s wr, 844 op/s
root@gravel2:~# ceph pg stat
v9133: 292 pgs: 292 active+clean; 31912 MB data, 6119 MB used, 2784 GB / 2790 GB avail
Now the 'ceph df' output for my pool is way wrong:
root@gravel1:~# ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    2790G     2784G     6119M        0.21
POOLS:
    NAME         ID     USED       %USED     OBJECTS
    data         0      0          0         0
    metadata     1      0          0         0
    rbd          2      0          0         0
    pbench3      6      31912M     1.12      33487
root@gravel2:~# ceph osd pool set pbench3 pgp_num 100
set pool 6 pgp_num to 100
Wait for everything to go to active+clean…
root@gravel1:~# ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    2790G     2784G     6188M        0.22
POOLS:
    NAME         ID     USED       %USED     OBJECTS
    data         0      0          0         0
    metadata     1      0          0         0
    rbd          2      0          0         0
    pbench3      6      31912M     1.12      33487
root@gravel2:~# ceph pg stat
v9218: 1192 pgs: 1192 active+clean; 313 GB data, 6112 MB used, 2784 GB / 2790 GB avail
Even more:
ceph osd pool set pbench3 pg_num 1000
root@gravel2:~# ceph pg stat
v9218: 1192 pgs: 1192 active+clean; 313 GB data, 6112 MB used, 2784 GB / 2790 GB avail
root@gravel1:~# ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    2790G     2784G     6112M        0.21
POOLS:
    NAME         ID     USED       %USED     OBJECTS
    data         0      0          0         0
    metadata     1      0          0         0
    rbd          2      0          0         0
    pbench3      6      313G       11.23     336809
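As a back-of-the-envelope check (mine, not from the report), the inflation tracks the pg_num ratio almost exactly:

```python
# Object counts reported by 'ceph df' above.
actual = 3054                     # real count, written at pg_num = 10
after_first_split = 33487         # reported after pg_num 10 -> 100
after_second_split = 336809       # reported after pg_num 100 -> 1000

print(after_first_split / actual)               # ~10.96, vs the 10x pg ratio
print(after_second_split / after_first_split)   # ~10.06, vs the 10x pg ratio
```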
Updated by Greg Farnum over 10 years ago
This is the "USED" value on the pool pbench3 that you're looking at, right?
Updated by John Spray over 10 years ago
Yes: the USED and OBJECTS values for pbench3 both shoot up.
(sorry it's a bit difficult to read, forgot to put the dumps in code blocks)
Updated by Greg Farnum over 10 years ago
Oh, I missed the object counts entirely. Have you waited a while and made sure these wrong values persist? I have a suspicion that the splitting code isn't bothering to generate accurate counts for the new PGs right away, but it could be doing so later on (although probably it's not, now that I think about the most likely failure modes).
Updated by Loïc Dachary over 10 years ago
- Status changed from New to Need More Info
Updated by John Spray over 10 years ago
- Status changed from Need More Info to New
(Weird, thought I had updated this at the time but apparently not)
Yeah, the values persisted for tens of minutes at least.
Updated by Greg Farnum over 10 years ago
I spoke with Sam about this after it came in, and the bug is an obvious result of doing fixed-cost PG splits: the children don't know which of the objects they actually own until a scrub has happened.
Sam's suggested "fix" for this is to give each child PG's stats a proportional share of the parent's objects and to maintain offsets from that for future reporting.
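A minimal sketch of that idea in Python (hypothetical names, not Ceph's actual data structures): on split, each child is credited with an equal share of the parent's object count, and that guessed share is remembered as an offset which a later scrub can reconcile against the real count.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class PGStats:
    num_objects: int        # object count this PG currently reports
    split_offset: int = 0   # estimated share inherited at split time

def split_stats(parent: PGStats, n_children: int) -> List[PGStats]:
    # Divide the parent's count evenly; remember each child's guess so a
    # future scrub can replace the estimate with an authoritative count.
    share, remainder = divmod(parent.num_objects, n_children)
    children = []
    for i in range(n_children):
        guess = share + (1 if i < remainder else 0)
        children.append(PGStats(num_objects=guess, split_offset=guess))
    return children
```

With the numbers above, each of the ten original PGs reported roughly 305 objects; splitting one of them into 10 children would credit each child with about 30, keeping the pool total at 3054 instead of letting it balloon toward ~33000.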
Updated by Greg Farnum about 7 years ago
We do an estimate of how many each child gets now!