Documentation #9867 (Closed): PGs per OSD documentation needs clarification
Description
Documentation in question:
http://ceph.com/docs/master/rados/operations/placement-groups/
http://ceph.com/docs/master/rados/operations/placement-groups/#choosing-the-number-of-placement-groups
- Desired range for total PGs per OSD (including all Pools and Replicas)
- Impact of empty and/or non-active pools' PGs on data distribution and on memory/CPU overhead
- My current understanding is that the total PGs per OSD (including all replicas of all pools) should be in the target range of 100 to 200.
Thus:

  (Pool1_pg_num * Pool1_size) + (Pool2_pg_num * Pool2_size) + ...
  ---------------------------------------------------------------  =~ 100 to 200
                            # of OSDs
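The ratio above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in this report; the pool values and the helper name are made up for the example.

```python
def pgs_per_osd(pools, num_osds):
    """Total PGs per OSD: sum of pg_num * size over all pools, divided by
    the number of OSDs (the formula quoted in this report)."""
    total = sum(pg_num * size for pg_num, size in pools)
    return total / num_osds

# Hypothetical example: two pools on a 12-OSD cluster.
pools = [(512, 3), (256, 2)]       # (pg_num, size) per pool
ratio = pgs_per_osd(pools, 12)     # (512*3 + 256*2) / 12
print(round(ratio, 1))             # 170.7, inside the 100-200 target range
```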
- This is to help ensure that the OSD processes' memory and CPU utilization remain at acceptable levels during recovery operations.
- I've also been advised that 500 to 700 total PGs per OSD is generally still OK, but 1000+ total PGs per OSD is considered bad.
- In the last paragraph of the section:
http://ceph.com/docs/master/rados/operations/placement-groups/#choosing-the-number-of-placement-groups
50k PGs per OSD is used as an example that "would use more resources", but in reality I have seen a cluster with 9k PGs per OSD that was unable to start and reach stable operation. This example gives an artificially high sense of what an acceptable PG-to-OSD ratio is.
- Empty and/or non-active pools should not be considered helpful toward the overall goal of even data distribution. Therefore, clusters with only a few active, data-containing pools alongside a number of non-active and/or empty pools may have poor data distribution and thus 'hot' disks with regard to space utilization.
- Such pools do, however, still cause the CPU and memory overhead noted above.