Bug #47062

Updated by Kefu Chai over 3 years ago

The pg_num check is done during pool creation and uses the total number of OSDs * mon_max_pg_per_osd as the upper limit. A pool created on only a subset of the OSDs can therefore place too many PGs on a single OSD. 
 For example, I have 6 OSDs in my cluster, only 3 of which are added to a root, and I am using the default mon_max_pg_per_osd value of 250. 
 <pre> 
 [root@cc-ljrtest-x86-controller-1 ~]# ceph osd tree 
 ID CLASS WEIGHT     TYPE NAME                              STATUS REWEIGHT PRI-AFF  
 -5         16.52069 root pool-1                                                   
  0     hdd    5.50690       osd.0                                  up    1.00000 1.00000  
  1     hdd    5.50690       osd.1                                  up    1.00000 1.00000  
  2     hdd    5.50690       osd.2                                  up    1.00000 1.00000  
 -1         33.04138 root default                                                  
 -3         33.04138       host cc-ljrtest-x86-controller-1                          
  0     hdd    5.50690           osd.0                              up    1.00000 1.00000  
  1     hdd    5.50690           osd.1                              up    1.00000 1.00000  
  2     hdd    5.50690           osd.2                              up    1.00000 1.00000  
  3     hdd    5.50690           osd.3                              up    1.00000 1.00000  
  4     hdd    5.50690           osd.4                              up    1.00000 1.00000  
  5     hdd    5.50690           osd.5                              up    1.00000 1.00000  
 Now there are 208 PGs in the cluster. 
 [root@cc-ljrtest-x86-controller-1 ~]# ceph pg stat 
 208 pgs: 208 active+clean; 27 KiB data, 300 GiB used, 33 TiB / 33 TiB avail 

 So when creating a pool with 256 PGs, because (256 + 208) * 3 < 6 * 250, the pool is created. 
 [root@cc-ljrtest-x86-controller-1 ~]# ceph osd pool create poolt1 256 256 rule-test 
 pool 'poolt1' created 
 Now the PG count on osd.0-2 exceeds the mon_max_pg_per_osd value of 250. 
 [root@cc-ljrtest-x86-controller-1 x86_64]# ceph osd df 
 ID CLASS WEIGHT    REWEIGHT SIZE      RAW USE DATA      OMAP META    AVAIL     %USE VAR    PGS STATUS  
  0     hdd 5.50690    1.00000 5.5 TiB    51 GiB     8 MiB    0 B 1 GiB 5.5 TiB 0.90 1.00 359       up  
  1     hdd 5.50690    1.00000 5.5 TiB    51 GiB     8 MiB    0 B 1 GiB 5.5 TiB 0.90 1.00 348       up  
  2     hdd 5.50690    1.00000 5.5 TiB    51 GiB     8 MiB    0 B 1 GiB 5.5 TiB 0.90 1.00 349       up  
  0     hdd 5.50690    1.00000 5.5 TiB    51 GiB     8 MiB    0 B 1 GiB 5.5 TiB 0.90 1.00 359       up  
  1     hdd 5.50690    1.00000 5.5 TiB    51 GiB     8 MiB    0 B 1 GiB 5.5 TiB 0.90 1.00 348       up  
  2     hdd 5.50690    1.00000 5.5 TiB    51 GiB     8 MiB    0 B 1 GiB 5.5 TiB 0.90 1.00 349       up  
  3     hdd 5.50690    1.00000 5.5 TiB    51 GiB 8.1 MiB    0 B 1 GiB 5.5 TiB 0.90 1.00    96       up  
  4     hdd 5.50690    1.00000 5.5 TiB    51 GiB 8.3 MiB    0 B 1 GiB 5.5 TiB 0.90 1.00 104       up  
  5     hdd 5.50690    1.00000 5.5 TiB    51 GiB     8 MiB    0 B 1 GiB 5.5 TiB 0.90 1.00    96       up  

 </pre>
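
The arithmetic above can be reproduced with a short sketch. It is not Ceph's monitor code: the function names are hypothetical, the replica count of 3 is inferred from the "* 3" in the inequality, and the `<=` comparison is an assumption about how the limit is applied.

```python
# Illustrative sketch only; function names are hypothetical, not Ceph APIs.

def cluster_wide_check_passes(existing_pgs, new_pg_num, pool_size,
                              num_osds, max_pg_per_osd=250):
    """Check as described in this report: projected PG replicas are
    compared against the limit for ALL OSDs in the cluster."""
    projected = (existing_pgs + new_pg_num) * pool_size
    limit = num_osds * max_pg_per_osd
    return projected <= limit  # assumption: limit is inclusive


def new_pool_pgs_per_osd(new_pg_num, pool_size, num_target_osds):
    """Average PG replicas per OSD contributed by the new pool alone,
    counting only the OSDs its CRUSH rule can actually use."""
    return new_pg_num * pool_size / num_target_osds


# Values from the report: 208 existing PGs, new pool of 256 PGs,
# assumed size 3, 6 OSDs in total, 3 OSDs under root pool-1.
print(cluster_wide_check_passes(208, 256, 3, 6))  # (208 + 256) * 3 = 1392 vs 1500
print(new_pool_pgs_per_osd(256, 3, 3))            # 768 replicas spread over 3 OSDs
```

The cluster-wide check passes, yet the new pool alone contributes an average of 256 PG replicas to each of the three OSDs it can use. That is already above 250 before the existing PGs are counted, which is consistent with the 348 to 359 PGs reported on osd.0-2.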
