Feature #42321

Add a new mode to balance pg layout by primary osds

Added by rosin luo 10 months ago. Updated 10 months ago.

Status: Fix Under Review
Priority: Normal
Assignee: -
Category: -
Target version:
% Done: 0%
Source:
Tags:
Backport: luminous,mimic,nautilus
Reviewed:
Affected Versions:
Component(RADOS):
Pull request ID:

Description

An upmap optimizer has been available since the Luminous release. It balances PGs across OSDs and can reach a “perfect” distribution in which every OSD holds an equal number of PGs. However, it does not balance primary PGs.
The upmap-by-primary-osd optimizer balances primary PGs and replica PGs in turn. Its implementation is based on upmap, and it behaves like upmap, except that it balances both the primary PG distribution and the total PG distribution. The optimizer balances the PG distribution within the same failure domain. Because a PG's primary OSD handles its read/write operations, an unbalanced primary distribution results in unbalanced load. The OSDs with more primary PGs become the performance bottleneck, especially for read operations. We used fio to run a 4M read test on rbd pools and measured about 20%-30% higher bandwidth than with upmap.
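To illustrate the difference between a balanced total PG count and a balanced primary PG count, here is a minimal Python sketch. It is not part of the proposed patch and does not use any Ceph API: the input mapping, the function names `pg_counts` and `max_deviation`, and the toy data are all assumptions made for illustration. The first OSD in each list is treated as the PG's primary.

```python
from collections import Counter


def pg_counts(pg_to_osds):
    """Count primary PGs and total PGs per OSD.

    pg_to_osds is a hypothetical mapping from a PG id to its ordered
    OSD list; the first entry is treated as the primary OSD.
    """
    primary = Counter()
    total = Counter()
    for osds in pg_to_osds.values():
        if not osds:
            continue  # skip PGs that are not mapped to any OSD
        primary[osds[0]] += 1
        total.update(osds)
    return primary, total


def max_deviation(counts, osd_ids):
    """Largest absolute distance of any OSD from the mean count."""
    target = sum(counts.values()) / len(osd_ids)
    return max(abs(counts.get(osd, 0) - target) for osd in osd_ids)


# Toy data (made up): every OSD holds all 6 PGs, so total PGs are
# perfectly balanced, yet osd 0 is the primary for 4 of them.
pg_to_osds = {
    "1.0": [0, 1, 2],
    "1.1": [0, 2, 1],
    "1.2": [0, 1, 2],
    "1.3": [1, 2, 0],
    "1.4": [0, 2, 1],
    "1.5": [2, 0, 1],
}
osd_ids = [0, 1, 2]
primary, total = pg_counts(pg_to_osds)
print("total deviation:  ", max_deviation(total, osd_ids))
print("primary deviation:", max_deviation(primary, osd_ids))
```

In this toy layout the total PG counts are already even, so an optimizer that looks only at totals has nothing left to move, while the primary counts remain skewed toward osd 0.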
We have a Ceph cluster with 3 hosts and 4 OSDs per host. We created a pool with 1024 PGs to test PG balancing.
The output of ceph osd tree looks like (see attachment ceph_osd_tree.png):

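Balancing PGs with the upmap optimizer gives the result below (see attachment pg_balance_use_upmap.png):

Balancing PGs with the upmap-by-primary-osd optimizer gives the result in the picture below (see attachment pg_balance_use_upmap_by_primary_osd.png). Primary PGs are not balanced between hosts: host1 has fewer primary PGs, so osd0, osd1, osd2, and osd3 each have fewer primary PGs.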
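For reference, the ideal per-OSD targets for the test cluster described above (3 hosts, 4 OSDs per host, 1024 PGs) can be worked out with a few lines of Python. This is only a back-of-the-envelope sketch: the replica size of 3 is an assumption, since the pool's replication factor is not stated here, and the variable names are illustrative.

```python
hosts = 3
osds_per_host = 4
pg_num = 1024
replica_size = 3  # ASSUMPTION: the pool's replica count is not given in this report

num_osds = hosts * osds_per_host

# Ideal number of primary PGs per OSD and per host.
primary_per_osd = pg_num / num_osds
primary_per_host = pg_num / hosts

# Ideal number of PG replicas (primary + non-primary) per OSD.
total_per_osd = pg_num * replica_size / num_osds

# 1024 is not divisible by 12, so an even split leaves a remainder:
# `extra` OSDs hold one more primary PG than the rest.
base, extra = divmod(pg_num, num_osds)

print(f"primary PGs per OSD:  {primary_per_osd:.2f}")
print(f"primary PGs per host: {primary_per_host:.2f}")
print(f"total PGs per OSD:    {total_per_osd:.2f}")
print(f"even split: {extra} OSDs with {base + 1} primaries, "
      f"{num_osds - extra} OSDs with {base}")
```

Comparing the per-OSD primary counts in the attached results against these targets gives a quick sense of how far each optimizer's layout is from an even primary distribution.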

The usage is just like upmap:
osdmaptool osdmap.file --upmap-by-primary-osd out.txt [--upmap-pool <pool>] [--upmap-max <max-count>] [--upmap-deviation <max-deviation>]

ceph_osd_tree.png (18 KB) rosin luo, 10/15/2019 08:21 AM

pg_balance_use_upmap_by_primary_osd.png (28 KB) rosin luo, 10/15/2019 08:25 AM

pg_balance_use_upmap.png (28.6 KB) rosin luo, 10/15/2019 08:25 AM

History

#1 Updated by Greg Farnum 10 months ago

  • Project changed from Ceph to RADOS
  • Category deleted (OSDMap)
  • Status changed from New to Fix Under Review
