Bug #57135
ceph osd pool set <pool> size math error
Description
Note: This is a duplicate of #57105 as I created it in the wrong location and now can't change it.
Context: I created a pool with a block device and intentionally filled a set of OSDs.
This of course broke things. I deleted the pool with the image on it and then went through my deployment script to move on to my next test.
When attempting to set the pool size, it calculates the total PGs incorrectly (about 1.8 x 10^19!), which blocks the change.
root@backups:# ceph osd pool get BlockDevices-WriteCache size
size: 3
root@backups:# ceph osd pool set BlockDevices-WriteCache size 2
Error ERANGE: pool id 19 pg_num 1 size 2 would mean 18446744073709551615 total pgs, which exceeds max 750 (mon_max_pg_per_osd 250 * num_in_osds 3)
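The reported total is not an arbitrary large number: 18446744073709551615 is exactly 2^64 - 1, which is what -1 looks like when stored in an unsigned 64-bit integer. The sketch below only checks that arithmetic and the limit quoted in the error message. The helper name `as_u64` is made up for illustration, and nothing here is taken from the monitor's actual code, so treat the "-1 wrapped around" reading as an assumption rather than a confirmed root cause.

```python
# Illustrative arithmetic only; not Ceph code.
U64_MASK = (1 << 64) - 1


def as_u64(value: int) -> int:
    """Reinterpret a Python int as an unsigned 64-bit value (hypothetical helper)."""
    return value & U64_MASK


# Numbers copied from the ERANGE message.
reported_total_pgs = 18446744073709551615
mon_max_pg_per_osd = 250
num_in_osds = 3

# The limit quoted in the message is a plain product.
max_pgs = mon_max_pg_per_osd * num_in_osds

# The reported total matches -1 after unsigned 64-bit wraparound.
is_wrapped_minus_one = reported_total_pgs == as_u64(-1)

# For comparison: what this pool alone would contribute (pg_num * size).
pool_pgs_after_change = 1 * 2

print(f"max_pgs = {max_pgs}")
print(f"reported total == 2**64 - 1: {reported_total_pgs == 2**64 - 1}")
print(f"reported total == as_u64(-1): {is_wrapped_minus_one}")
print(f"this pool after change: {pool_pgs_after_change} PG replicas")
```

If that reading is right, the check is rejecting the change because of an underflowed counter and not because the cluster is anywhere near 750 PGs.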
This is an empty pool with nothing special set:
CacheMinSize=2
Cache_max_bytes=5368709120 #5GB
Cache_target_max_objects=1024
Cache_target_dirty_ratio=0.2
Cache_target_dirty_high_ratio=0.5
Cache_target_full_ratio=0.6
Cache_min_flush_age=3000
Cache_min_evict_age=3000
CompressionMode=aggressive
CompressionAlgorithm=zstd
root@backups:# rados df
POOL_NAME                USED     OBJECTS  CLONES  COPIES  MISSING_ON_PRIMARY  UNFOUND  DEGRADED  RD_OPS  RD       WR_OPS  WR      USED COMPR  UNDER COMPR
.mgr                     2.6 MiB  2        0       6       0                   0        0         1634    3.3 MiB  1419    25 MiB  0 B         0 B
.rgw.root                2.5 KiB  6        0       12      0                   0        0         267     267 KiB  6       6 KiB   0 B         0 B
BlockDevices             0 B      0        0       0       0                   0        0         0       0 B      0       0 B     0 B         0 B
BlockDevices-WriteCache  0 B      0        0       0       0                   0        0         0       0 B      0       0 B     0 B         0 B
root@backups:# ceph osd pool get BlockDevices-WriteCache min_size
min_size: 2
root@backups:# ceph osd pool get BlockDevices-WriteCache pg_num
pg_num: 1
root@backups:# ceph osd pool get BlockDevices-WriteCache pgp_num
pgp_num: 1
Looks like one of the placement groups is "inactive":
health: HEALTH_WARN ... Reduced data availability: 1 pg inactive
And the GUI shows the pool's status as:
1 unknown
History
#1 Updated by Brian Woods over 1 year ago
Interesting note: attempting to repair the pool returns a negative OSD number:
#ceph osd pool repair BlockDevices-WriteCache
Error EAGAIN: osd.-1 is not currently connected
I need to move on with my testing soon. If no one wants to debug this soon, I will be purging everything....