
Bug #57135

ceph osd pool set <pool> size math error

Added by Brian Woods over 1 year ago. Updated over 1 year ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Note: This is a duplicate of #57105 as I created it in the wrong location and now can't change it.

Context: I created a pool with a block device image on it and intentionally filled a set of OSDs.

This, of course, broke things. I deleted the pool with the image on it and then went back through my deployment script to move on to my next test.

When attempting to set the pool size, Ceph calculates the total PG count incorrectly (roughly 10^19!), blocking the change:

root@backups:# ceph osd pool get BlockDevices-WriteCache size
size: 3
root@backups:# ceph osd pool set BlockDevices-WriteCache size 2
Error ERANGE: pool id 19 pg_num 1 size 2 would mean 18446744073709551615 total pgs, which exceeds max 750 (mon_max_pg_per_osd 250 * num_in_osds 3)
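
For scale: the limit in that message is only 250 * 3 = 750, and with pg_num 1 and size 2 the real total should be 2. The reported figure is exactly 2^64 - 1, which is what a small negative intermediate value looks like when stored in an unsigned 64-bit integer, so my guess is an underflow somewhere in the projected-PG calculation rather than a real count. Not Ceph code, just sanity-checking the numbers:

$ echo $((250 * 3))              # mon_max_pg_per_osd * num_in_osds
750
$ echo $((1 * 2))                # pg_num * size for this pool
2
$ python3 -c 'print(2**64 - 1)'  # the "total pgs" in the error
18446744073709551615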

This is an empty pool with nothing special set; the values below come from my deployment script (the roughly equivalent commands are shown after the list):
CacheMinSize=2
Cache_max_bytes=5368709120        #5GB
Cache_target_max_objects=1024
Cache_target_dirty_ratio=0.2
Cache_target_dirty_high_ratio=0.5
Cache_target_full_ratio=0.6
Cache_min_flush_age=3000
Cache_min_evict_age=3000
CompressionMode=aggressive
CompressionAlgorithm=zstd
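
Assuming those script variables map to the stock pool options (my script's names differ slightly, so this mapping is approximate), the equivalent commands would be roughly:

# Approximate mapping of the script variables above; option names are the
# standard ones, the actual script may differ:
ceph osd pool set BlockDevices-WriteCache min_size 2
ceph osd pool set BlockDevices-WriteCache target_max_bytes 5368709120
ceph osd pool set BlockDevices-WriteCache target_max_objects 1024
ceph osd pool set BlockDevices-WriteCache cache_target_dirty_ratio 0.2
ceph osd pool set BlockDevices-WriteCache cache_target_dirty_high_ratio 0.5
ceph osd pool set BlockDevices-WriteCache cache_target_full_ratio 0.6
ceph osd pool set BlockDevices-WriteCache cache_min_flush_age 3000
ceph osd pool set BlockDevices-WriteCache cache_min_evict_age 3000
ceph osd pool set BlockDevices-WriteCache compression_mode aggressive
ceph osd pool set BlockDevices-WriteCache compression_algorithm zstd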

root@backups:# rados df
POOL_NAME                        USED  OBJECTS  CLONES  COPIES  MISSING_ON_PRIMARY  UNFOUND  DEGRADED  RD_OPS       RD  WR_OPS       WR  USED COMPR  UNDER COMPR
.mgr                          2.6 MiB        2       0       6                   0        0         0    1634  3.3 MiB    1419   25 MiB         0 B          0 B
.rgw.root                     2.5 KiB        6       0      12                   0        0         0     267  267 KiB       6    6 KiB         0 B          0 B
BlockDevices                      0 B        0       0       0                   0        0         0       0      0 B       0      0 B         0 B          0 B
BlockDevices-WriteCache           0 B        0       0       0                   0        0         0       0      0 B       0      0 B         0 B          0 B

root@backups:# ceph osd pool get BlockDevices-WriteCache min_size
min_size: 2
root@backups:# ceph osd pool get BlockDevices-WriteCache pg_num
pg_num: 1
root@backups:# ceph osd pool get BlockDevices-WriteCache pgp_num
pgp_num: 1

Looks like one of the placement groups is "inactive":

    health: HEALTH_WARN
...
            Reduced data availability: 1 pg inactive

And the GUI shows the pool's status as:
1 unknown

History

#1 Updated by Brian Woods over 1 year ago

Interesting note: attempting a repair returns a negative number for the OSD:

#ceph osd pool repair BlockDevices-WriteCache
Error EAGAIN: osd.-1 is not currently connected
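
I assume osd.-1 just means the PG has no acting primary, which would line up with the one inactive/unknown PG above. If anyone wants to poke at it, something like this should show the stuck PG (guessing the PG id is 19.0, since the pool is id 19 with pg_num 1):

ceph pg dump_stuck inactive
ceph pg map 19.0     # pool id 19, pg_num 1, so presumably its only PG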

I need to move on with my testing soon; if no one wants to debug this, I will be purging everything...
