Bug #6751
Pool 'df' statistics go bad after changing PG count
Status: Closed
Description
To reproduce:
- Create a pool with few PGs
- Create some objects in the pool
- Note the 'ceph df' output
- Increase the number of PGs in the pool with ceph osd pool set <name> pg_num <count>
- Wait for PG creation
- Note the 'ceph df' output: the 'USED' and 'OBJECTS' values for the pool will have increased roughly in proportion to the new PG count, even though no new objects were created and actual space usage on the OSDs has barely changed.
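A scripted sketch of the same steps, for convenience (the pool name 'repro' and the fixed sleep are placeholders of mine, not from the report; a real harness would poll 'ceph pg stat' until everything is active+clean):

```python
#!/usr/bin/env python3
# Reproduction sketch: drives the same ceph/rados CLI steps listed above.
import subprocess
import time

def sh(cmd: str) -> None:
    print('+', cmd)
    subprocess.run(cmd, shell=True, check=True)

sh('ceph osd pool create repro 10')
sh('rados bench --no-cleanup -p repro -b 1000000 60 write')
sh('ceph df')                              # baseline USED/OBJECTS for 'repro'
sh('ceph osd pool set repro pg_num 100')
time.sleep(60)                             # crude wait for PG creation
sh('ceph pg stat')
sh('ceph df')                              # USED/OBJECTS now inflated ~10x
```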
Detail:
Create my pool:
ceph osd pool create pbench3 10
root@gravel1:~# ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    2790G     2789G     263M         0
POOLS:
    NAME         ID     USED       %USED     OBJECTS
    data         0      0          0         0
    metadata     1      0          0         0
    rbd          2      0          0         0
    pbench3      6      0          0         0
Put some objects in it:
rados bench --no-cleanup -p pbench3 -b 1000000 60 write
root@gravel1:~# ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    2790G     2784G     6048M        0.21
POOLS:
    NAME         ID     USED       %USED     OBJECTS
    data         0      0          0         0
    metadata     1      0          0         0
    rbd          2      0          0         0
    pbench3      6      2910M      0.10      3054
Increase the number of PGs:
root@gravel2:~# ceph osd pool set pbench3 pg_num 100
set pool 6 pg_num to 100
root@gravel2:~# ceph osd pool set pbench3 pgp_num 100
Error EAGAIN: currently creating pgs, wait
Let things quiet down:
root@gravel2:~# ceph pg stat
v9127: 292 pgs: 57 creating, 193 active+clean, 42 peering; 10906 MB data, 6122 MB used, 2784 GB / 2790 GB avail; 1020 MB/s wr, 1069 op/s
root@gravel2:~# ceph pg stat
v9132: 292 pgs: 281 active+clean, 11 peering; 31912 MB data, 6119 MB used, 2784 GB / 2790 GB avail; 1617 MB/s wr, 1696 op/s
root@gravel2:~# ceph pg stat
v9133: 292 pgs: 292 active+clean; 31912 MB data, 6119 MB used, 2784 GB / 2790 GB avail; 805 MB/s wr, 844 op/s
root@gravel2:~# ceph pg stat
v9133: 292 pgs: 292 active+clean; 31912 MB data, 6119 MB used, 2784 GB / 2790 GB avail
Now the 'ceph df' output for my pool is way wrong:
root@gravel1:~# ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    2790G     2784G     6119M        0.21
POOLS:
    NAME         ID     USED       %USED     OBJECTS
    data         0      0          0         0
    metadata     1      0          0         0
    rbd          2      0          0         0
    pbench3      6      31912M     1.12      33487
root@gravel2:~# ceph osd pool set pbench3 pgp_num 100
set pool 6 pgp_num to 100
Wait for everything to go to active+clean…
root@gravel1:~# ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    2790G     2784G     6188M        0.22
POOLS:
    NAME         ID     USED       %USED     OBJECTS
    data         0      0          0         0
    metadata     1      0          0         0
    rbd          2      0          0         0
    pbench3      6      31912M     1.12      33487
root@gravel2:~# ceph pg stat
v9218: 1192 pgs: 1192 active+clean; 313 GB data, 6112 MB used, 2784 GB / 2790 GB avail
Even more:
ceph osd pool set pbench3 pg_num 1000
root@gravel2:~# ceph pg stat
v9218: 1192 pgs: 1192 active+clean; 313 GB data, 6112 MB used, 2784 GB / 2790 GB avail
root@gravel1:~# ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    2790G     2784G     6112M        0.21
POOLS:
    NAME         ID     USED       %USED     OBJECTS
    data         0      0          0         0
    metadata     1      0          0         0
    rbd          2      0          0         0
    pbench3      6      313G       11.23     336809
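As a back-of-the-envelope check (mine, not from the report), the inflation tracks the pg_num ratio almost exactly:

```python
# Object counts reported by 'ceph df' above.
actual = 3054                     # real count, written at pg_num = 10
after_first_split = 33487         # reported after pg_num 10 -> 100
after_second_split = 336809       # reported after pg_num 100 -> 1000

print(after_first_split / actual)               # ~10.96, vs the 10x pg ratio
print(after_second_split / after_first_split)   # ~10.06, vs the 10x pg ratio
```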
Updated by Greg Farnum over 10 years ago
This is the "USED" value on the pool pbench3 that you're looking at, right?
Updated by John Spray over 10 years ago
Yes: the USED and OBJECTS values for pbench3 both shoot up.
(sorry it's a bit difficult to read, forgot to put the dumps in code blocks)
Updated by Greg Farnum over 10 years ago
Oh, I missed the object counts entirely. Have you waited a while and made sure these wrong values persist? I have a suspicion that the splitting code isn't bothering to generate accurate counts for the new PGs right away, but it could be doing so later on (although probably it's not, now that I think about the most likely failure modes).
Updated by Loïc Dachary over 10 years ago
- Status changed from New to Need More Info
Updated by John Spray over 10 years ago
- Status changed from Need More Info to New
(Weird, thought I had updated this at the time but apparently not)
Yeah, the values persisted for tens of minutes at least.
Updated by Greg Farnum over 10 years ago
I spoke with Sam about this after it came in, and the bug is an obvious result of doing fixed-cost PG splits: the children don't know which of the objects they actually own until a scrub has happened.
Sam's suggested "fix" for this is to give each child PG's stats a proportional share of the parent's objects and to maintain offsets from that for future reporting.
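A minimal sketch of that idea in Python (hypothetical names, not Ceph's actual data structures): on split, each child is credited with an equal share of the parent's object count, and that guessed share is remembered as an offset which a later scrub can reconcile against the real count.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class PGStats:
    num_objects: int        # object count this PG currently reports
    split_offset: int = 0   # estimated share inherited at split time

def split_stats(parent: PGStats, n_children: int) -> List[PGStats]:
    # Divide the parent's count evenly; remember each child's guess so a
    # future scrub can replace the estimate with an authoritative count.
    share, remainder = divmod(parent.num_objects, n_children)
    children = []
    for i in range(n_children):
        guess = share + (1 if i < remainder else 0)
        children.append(PGStats(num_objects=guess, split_offset=guess))
    return children
```

With the numbers above, each of the ten original PGs reported roughly 305 objects; splitting one of them into 10 children would credit each child with about 30, keeping the pool total at 3054 instead of letting it balloon toward ~33000.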
Updated by Greg Farnum about 7 years ago
We do an estimate of how many each child gets now!