Bug #8103
Status: Closed
pool has too few PGs warning misleading when using cache pools
Description
When using cache pools on a fresh filesystem, the number of objects in the cache pool can quickly come to greatly exceed the number of objects in the base pool, causing ceph health to warn:
nhm@burnupiX:/tmp/cbt/ceph/log$ ceph health
HEALTH_WARN pool rados-bench-burnupiY-2-cache has too few pgs; pool rados-bench-burnupiY-3-cache has too few pgs
Looking at ceph health detail, we see:
HEALTH_WARN pool rados-bench-burnupiY-2-cache has too few pgs; pool rados-bench-burnupiY-3-cache has too few pgs
pool rados-bench-burnupiY-2-cache objects per pg (14) is more than 14 times cluster average (1)
pool rados-bench-burnupiY-3-cache objects per pg (14) is more than 14 times cluster average (1)
In reality this isn't a problem, and the warning can be worked around by raising the max object skew, but it's pretty misleading and, I would argue, not particularly helpful.
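For reference, the check behind this warning can be sketched roughly as follows. This is a simplified approximation rather than the actual monitor code: the function name and pool layout are invented for illustration, and the default threshold stands in for the `mon_pg_warn_max_object_skew` option (default 10 at the time).

```python
# Rough sketch of the "has too few pgs" health check, NOT the real
# Ceph monitor implementation. A pool is flagged when its objects-per-PG
# ratio exceeds max_object_skew times the cluster-wide average.

def pools_with_too_few_pgs(pools, max_object_skew=10):
    """pools: dict mapping pool name -> (object_count, pg_count)."""
    total_objects = sum(objs for objs, _ in pools.values())
    total_pgs = sum(pgs for _, pgs in pools.values())
    cluster_avg = total_objects / total_pgs if total_pgs else 0.0
    warned = []
    for name, (objs, pgs) in pools.items():
        per_pg = objs / pgs
        # e.g. cache pool at 14 obj/pg vs cluster average of 1 -> warn
        if cluster_avg > 0 and per_pg > max_object_skew * cluster_avg:
            warned.append(name)
    return warned
```

With numbers like those in the report (a cache pool at 14 objects per PG against a cluster average of 1), the skew is 14x and the warning fires even though nothing is actually wrong.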
Updated by Sage Weil about 10 years ago
Given that this is a transient issue for a new, empty cluster, I'm not sure if it is worth making an exception for the warning...
Updated by Mark Nelson about 10 years ago
It seems like there may be other situations where this is misleading too: say you have many mostly-empty pools and one heavily utilized one, or SSD-backed pools with a very high number of small objects versus spinning-disk pools with fewer, larger objects.
Are we finding users in situations where skewed object/pg distributions are causing problems?
Updated by Sage Weil about 10 years ago
Mark Nelson wrote:
It seems like there may be other situations where this is misleading too: say you have many mostly-empty pools and one heavily utilized one, or SSD-backed pools with a very high number of small objects versus spinning-disk pools with fewer, larger objects.
Are we finding users in situations where skewed object/pg distributions are causing problems?
It's just a tool to help users identify when their pg_num values are out of whack. We can bump up the threshold... or, we could exclude any empty pools from the average, perhaps...
Updated by Sage Weil almost 10 years ago
- Status changed from New to Won't Fix
There is no simple way to avoid this false positive while still keeping the warning, and the warning is useful.
Updated by Bart van Bragt almost 8 years ago
I'm also running into this:
$ ceph health detail
HEALTH_WARN pool default.rgw.buckets.data has many more objects per pg than average (too few pgs?)
pool default.rgw.buckets.data objects per pg (446) is more than 12.7429 times cluster average (35)
$ rados df
pool name                          KB  objects  clones  degraded  unfound       rd     rd KB       wr     wr KB
.rgw.root                           2        4       0         0        0       30        24        4         5
default.rgw.buckets.data     56094130    57153       0         0        0   572099  59761952   583966  58240927
default.rgw.buckets.index           0       69       0         0        0  6730064   6828134   177526         0
default.rgw.buckets.non-ec          0        0       0         0        0      312       344      786         0
default.rgw.control                 0        8       0         0        0        0         0        0         0
default.rgw.data.root               2        6       0         0        0        0         0       15         6
default.rgw.gc                      0       32       0         0        0   133602    135155   101032         0
default.rgw.log                     0      127       0         0        0  2960042   2959915  1973580         0
default.rgw.meta                    4        9       0         0        0        0         0       21         9
default.rgw.users.keys              1        1       0         0        0        3         2        1         1
default.rgw.users.swift             1        1       0         0        0        0         0        1         1
default.rgw.users.uid               1        2       0         0        0    11593     11590    11504         3
rbd                                 0        0       0         0        0        0         0        0         0
  total used      250293020    57412
  total avail   11439457452
  total space   11689750472
We are using RadosGW with one pool that currently stores all the objects. All the other pools are purely administrative and relatively empty. In my opinion this warning could fairly easily be fixed by excluding all pools with fewer than, for instance, 1000 objects.
But that's from my naive POV; I'm not entirely sure in what scenario this warning would actually help. I would think it would be more useful if Ceph reported when some performance-related threshold for objects per PG is crossed; comparing against a (local) cluster average feels strange. But I'm hardly an expert. For people starting with Ceph it would be nice if some of these false positives could be avoided.
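The exclusion Bart suggests (and the "exclude empty pools" variant Sage floated earlier) can be sketched in a few lines. This is a hypothetical illustration, not Ceph code; the function name, PG counts, and the 1000-object cutoff value are taken from the comment's example, not from any real implementation.

```python
# Hypothetical variant of the cluster-average calculation that ignores
# nearly-empty pools, so administrative pools with a handful of objects
# no longer drag the average down and trigger false positives.

def cluster_avg_excluding_small(pools, min_objects=1000):
    """pools: dict mapping pool name -> (object_count, pg_count)."""
    sized = [(objs, pgs) for objs, pgs in pools.values()
             if objs >= min_objects]
    total_objs = sum(o for o, _ in sized)
    total_pgs = sum(p for _, p in sized)
    return total_objs / total_pgs if total_pgs else 0.0
```

With a layout like the one above (one RGW data pool holding ~57k objects plus many near-empty service pools), the plain average is tiny and the data pool looks 12x skewed; excluding the small pools, the average is simply the data pool's own ratio and no warning would fire.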
Updated by Ben England over 6 years ago
+1. What really matters is that PGs are assigned to pools that are doing a lot of I/O, so that load is evened out across OSDs. This warning is really trying to indicate that OSD load may be uneven because of the imbalance in PGs, but it isn't accurate unless I/O is actually happening in the pool. So perhaps a more relevant indicator would be a significant imbalance in OSD I/O caused by a particular storage pool. That may now be easier to implement with a ceph-mgr module. Also, the new per-pool reweighting option in Luminous may provide an alternative to ever-higher PG counts.
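The I/O-based indicator Ben describes could be prototyped along these lines. This is purely a sketch of the idea, not a ceph-mgr module or any real API: the function name, the per-OSD op counts, and the 2x imbalance factor are all invented for illustration.

```python
# Hypothetical load-imbalance check: instead of comparing objects per PG,
# flag a problem only when some OSD is carrying far more recent I/O than
# the mean. The 2x factor is an arbitrary example threshold.

def io_imbalanced(osd_ops, factor=2.0):
    """osd_ops: dict mapping OSD id -> recent op count."""
    ops = list(osd_ops.values())
    if not ops:
        return False
    mean = sum(ops) / len(ops)
    return mean > 0 and max(ops) > factor * mean
```

A check like this would stay quiet on an idle or evenly loaded cluster (where the objects-per-PG warning produces its false positives) and only fire when one OSD is genuinely hot relative to its peers.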