Bug #24161

[rbd-mirror] simple image map policy doesn't always level-load instances

Added by Jason Dillaman over 1 year ago. Updated about 1 year ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
Start date: 05/17/2018
Due date:
% Done: 0%
Source:
Tags:
Backport: mimic
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

One observation, though (I think it is not related to this PR but rather to how the policy currently works). If I remove images from the pool that are all being replayed by the same instance, no reshuffling takes place, and you can end up with one completely idle instance. When new images are added, they are distributed evenly across the instances, so the previously idle instance remains underloaded. The same thing appears to happen when an instance is stopped: the policy tries to distribute its images evenly among the other instances without taking into account how heavily each one is currently loaded.
So you can end up with image distribution like below:

% for i in 0 1 2 3; do ceph --admin-daemon /tmp/tmp.rbd_mirror/rbd-mirror.cluster1-client.mirror.$i.asok help | grep -c 'status mirror/'; done
7
3
10
2
It looks like the only way to redistribute them evenly is to restart the instances. I suppose this is not what users will expect from the "simple" policy -- I think they will want an even distribution that does not depend on history.
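
To make the difference concrete, here is a toy model in Python. It is only a sketch of the two behaviours discussed in this report, not the rbd-mirror policy code; the function names `assign_evenly` and `assign_least_loaded` are hypothetical, and the starting counts are the four numbers from the output above.

```python
def assign_evenly(load, new_images):
    """Hand out new images in equal shares, ignoring current load.

    Models the behaviour described in the report (hypothetical helper).
    """
    load = dict(load)  # work on a copy
    instances = sorted(load)
    for i in range(new_images):
        load[instances[i % len(instances)]] += 1
    return load


def assign_least_loaded(load, new_images):
    """Give each new image to the instance that currently maps the fewest.

    Ties are broken by the lower instance id (an arbitrary choice).
    """
    load = dict(load)  # work on a copy
    for _ in range(new_images):
        target = min(load, key=lambda inst: (load[inst], inst))
        load[target] += 1
    return load


# Per-instance image counts taken from the output in the description.
observed = {0: 7, 1: 3, 2: 10, 3: 2}

print(assign_evenly(observed, 8))        # the skew is preserved
print(assign_least_loaded(observed, 8))  # underloaded instances catch up
```

With eight new images, the even split keeps the gap between the busiest and the idlest instance at eight images, while the least-loaded choice narrows it to four without moving any existing image.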

Related issues

Copied to rbd - Backport #24519: mimic: [rbd-mirror] simple image map policy doesn't always level-load instances Resolved

History

#1 Updated by Jason Dillaman over 1 year ago

  • Backport set to mimic

#2 Updated by Venky Shankar over 1 year ago

  • Assignee set to Venky Shankar

#3 Updated by Venky Shankar over 1 year ago

As of now, the simple policy does not reshuffle mapped images when images are removed -- that is only done when instances are added or removed.
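
For illustration only, a reshuffle that levels the load could look like the Python sketch below. This is an assumption about what "level-loading" means here (no two instances differ by more than one image), not the logic used by the simple policy; `rebalance` is a hypothetical name.

```python
def rebalance(load):
    """Move images one at a time from the busiest to the idlest instance
    until the per-instance counts differ by at most one.

    Returns the levelled counts and the number of images that had to move.
    """
    load = dict(load)  # work on a copy
    moves = 0
    while load:
        busiest = max(load, key=load.get)
        idlest = min(load, key=load.get)
        if load[busiest] - load[idlest] <= 1:
            break
        load[busiest] -= 1
        load[idlest] += 1
        moves += 1
    return load, moves


# Counts reported in the description: 7, 3, 10 and 2 images per instance.
levelled, moves = rebalance({0: 7, 1: 3, 2: 10, 3: 2})
print(levelled, moves)
```

Each move corresponds to an image being released by one instance and acquired by another, so a real policy would likely want to bound how many moves it triggers at once.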

#4 Updated by Venky Shankar about 1 year ago

PR https://github.com/ceph/ceph/pull/22304

Adding images after a bunch of image removals should pick the instance that is least loaded -- I think the reason this was not observed in Mykola's test setup is that the on-disk (and in-memory) image map is not purged when images are removed. This can be confirmed by checking the number of image map keys (`image_map_*`) in the `rbd_mirror_leader` object after removing some images.
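
A minimal Python sketch of that check follows. It assumes the keys are stored as omap keys on `rbd_mirror_leader` and that they can be listed with `rados -p <pool> listomapkeys rbd_mirror_leader`; the pool name is a placeholder, and both helper names are hypothetical.

```python
import subprocess

IMAGE_MAP_PREFIX = "image_map_"


def count_image_map_keys(keys):
    """Count the keys that belong to the image map (prefix `image_map_`)."""
    return sum(1 for key in keys if key.strip().startswith(IMAGE_MAP_PREFIX))


def list_leader_omap_keys(pool):
    """List omap keys of the `rbd_mirror_leader` object in the given pool.

    Assumption: the `rados` CLI is installed and can reach the cluster.
    """
    result = subprocess.run(
        ["rados", "-p", pool, "listomapkeys", "rbd_mirror_leader"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


# Usage (needs a running cluster; "mirror" is a placeholder pool name):
#   print(count_image_map_keys(list_leader_omap_keys("mirror")))
```

If the count stays the same after images are removed, the stale entries are still being counted towards their instance's load.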

#5 Updated by Venky Shankar about 1 year ago

  • Status changed from New to Need Review

#6 Updated by Jason Dillaman about 1 year ago

  • Status changed from Need Review to Pending Backport

#7 Updated by Nathan Cutler about 1 year ago

  • Copied to Backport #24519: mimic: [rbd-mirror] simple image map policy doesn't always level-load instances added

#8 Updated by Nathan Cutler about 1 year ago

  • Status changed from Pending Backport to Resolved
