[rbd-mirror] simple image map policy doesn't always level-load instances
One observation, though (I think it is not related to this PR but rather to how the policy currently works): if I remove images from the pool that are being replayed by the same instance, there is no shuffling, and you can end up with one completely idle instance. When adding new images, the policy will distribute them evenly across the instances, so the previously idle instance remains underloaded. The same thing appears to happen when stopping an instance: the policy distributes its images evenly among the other instances without taking into account how loaded each instance currently is. So you can end up with an image distribution like this:

```
% for i in 0 1 2 3; do ceph --admin-daemon /tmp/tmp.rbd_mirror/rbd-mirror.cluster1-client.mirror.$i.asok help |grep -c 'status mirror/'; done
7
3
10
2
```

It looks like the only way to reshuffle them evenly is to restart the instances. I suppose this is not what users will expect from the "simple" policy -- I think they will want an even distribution that does not depend on history.
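The skew described above can be illustrated with a minimal sketch (hypothetical Python, not rbd-mirror's actual C++ implementation): redistributing a stopped instance's images evenly across the survivors, versus assigning each image to whichever instance currently replays the fewest. The instance names and image counts are made up for the example.

```python
from collections import Counter

def redistribute_round_robin(load, images):
    """Spread `images` evenly across instances, ignoring current load."""
    load = Counter(load)
    instances = sorted(load)
    for i, _image in enumerate(images):
        load[instances[i % len(instances)]] += 1
    return dict(load)

def redistribute_least_loaded(load, images):
    """Assign each image to the instance currently replaying the fewest."""
    load = Counter(load)
    for _image in images:
        target = min(load, key=lambda inst: (load[inst], inst))
        load[target] += 1
    return dict(load)

# Hypothetical scenario: instance "d" stops while replaying 4 images;
# "a" is already completely idle, "b" and "c" each replay 4.
load = {"a": 0, "b": 4, "c": 4}
orphaned = ["image1", "image2", "image3", "image4"]

print(redistribute_round_robin(load, orphaned))   # → {'a': 2, 'b': 5, 'c': 5}
print(redistribute_least_loaded(load, orphaned))  # → {'a': 4, 'b': 4, 'c': 4}
```

With even (round-robin) redistribution, the idle instance `a` stays underloaded forever; a load-aware assignment levels the instances out, which is presumably what users expect from the "simple" policy.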
#4 Updated by Venky Shankar about 1 year ago
Adding images after a bunch of image removals should pick an instance which is least loaded -- I think the reason this was not observed in Mykola's test setup was due to the fact that the on-disk (and in-memory) image map is not purged when removing images. This can be confirmed by checking the number of image map keys (`image_map_*`) in the `rbd_mirror_leader` object after removing some images.
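A hypothetical model (again Python, not the actual rbd-mirror code) of why unpurged map entries defeat least-loaded selection: if the image map still carries keys for removed images, the instance that lost its images still looks loaded, so new images are steered elsewhere. The image and instance names below are made up.

```python
def least_loaded(image_map, instances):
    """Pick the instance with the fewest images in the (possibly stale) map."""
    load = {inst: 0 for inst in instances}
    for instance in image_map.values():
        load[instance] += 1
    return min(load, key=lambda inst: (load[inst], inst))

# instance-0's four images were removed from the pool, but their map
# entries were never purged; instance-1 still replays two live images.
stale_map = {
    "img1": "instance-0", "img2": "instance-0",
    "img3": "instance-0", "img4": "instance-0",  # removed, keys remain
    "img5": "instance-1", "img6": "instance-1",  # still live
}
live_map = {k: v for k, v in stale_map.items() if k in ("img5", "img6")}

print(least_loaded(stale_map, ["instance-0", "instance-1"]))  # → instance-1 (skewed by stale keys)
print(least_loaded(live_map, ["instance-0", "instance-1"]))   # → instance-0 (actually idle)
```

With the stale entries purged, the least-loaded pick lands on the genuinely idle instance, consistent with the behavior the policy is supposed to have.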