Feature #2939

chef: Write up how cluster shrinking should work

Added by Anonymous over 11 years ago. Updated almost 6 years ago.

Status:
Rejected
Priority:
Normal
Assignee:
-
Category:
chef
Target version:
-
% Done:

0%


Description

Expanding the cluster is fairly trivial and practically identical to the initial install, but shrinking needs a bit more care.

If I want to remove physical server node1234, which OSDs do I "ceph osd rm"?
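One way to answer the question above is to look the host up in the OSD tree. The following is a minimal sketch, assuming the CRUSH map has a host bucket named after the server and that the input has the general shape of `ceph osd tree --format json` output (a "nodes" list whose entries carry "id", "name", "type" and, for buckets, "children"). The function name `osds_on_host` and the sample data are hypothetical.

```python
def osds_on_host(tree, hostname):
    """Return the sorted OSD ids directly under the host bucket `hostname`."""
    nodes = {n["id"]: n for n in tree["nodes"]}
    host = next(
        (n for n in tree["nodes"]
         if n.get("type") == "host" and n.get("name") == hostname),
        None,
    )
    if host is None:
        return []  # no such host bucket in the tree
    return sorted(
        child for child in host.get("children", [])
        if nodes.get(child, {}).get("type") == "osd"
    )


# Hypothetical sample data, shaped like the assumed JSON tree.
sample_tree = {
    "nodes": [
        {"id": -1, "name": "default", "type": "root", "children": [-2, -3]},
        {"id": -2, "name": "node1234", "type": "host", "children": [1, 0]},
        {"id": -3, "name": "node5678", "type": "host", "children": [2]},
        {"id": 0, "name": "osd.0", "type": "osd"},
        {"id": 1, "name": "osd.1", "type": "osd"},
        {"id": 2, "name": "osd.2", "type": "osd"},
    ]
}

print(osds_on_host(sample_tree, "node1234"))
```

This only works if the admin keeps host buckets in the CRUSH map accurate; OSDs placed under a differently named bucket would be missed.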

Something should probably run on node1234 to bring the relevant disk back to the "prepared" state, so that it doesn't keep trying to talk to the cluster. Perhaps something like ceph-disk-VERB_HERE that stops the daemons and re-prepares the disk? Would that command also talk to ceph-mon to do the "ceph osd rm"?

Note that in practice, admins will probably want to step the CRUSH weight down to 0 gradually first.
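The gradual reweighting could be scripted along these lines. This is a sketch only: it generates `ceph osd crush reweight` command lines in equal steps and does not run them. The helper name `reweight_steps`, the step count, and the two-decimal formatting are assumptions, and a real procedure would wait for the cluster to finish rebalancing between steps.

```python
def reweight_steps(osd_id, start_weight, steps):
    """Build commands that lower an OSD's CRUSH weight to 0 in equal steps."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    commands = []
    for i in range(1, steps + 1):
        # Remaining fraction of the starting weight after step i.
        weight = start_weight * (steps - i) / steps
        commands.append(f"ceph osd crush reweight osd.{osd_id} {weight:.2f}")
    return commands


for cmd in reweight_steps(osd_id=7, start_weight=1.0, steps=4):
    print(cmd)
```

The last generated command always sets the weight to 0, after which the OSD holds no data and can be removed without triggering a further rebalance.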

History

#1 Updated by Anonymous over 11 years ago

  • Description updated (diff)

#2 Updated by Anonymous over 11 years ago

  • Category set to chef

#3 Updated by Anonymous over 11 years ago

  • Story points set to 8

#4 Updated by Anonymous over 11 years ago

Moving content from duplicate #3119:

The DH cookbooks do this by setting a node attribute that maps osd.id -> desired action; one of the actions is destroy.

That runs into the annoyances of using Chef as an RPC mechanism, requires the admin to manage the id->node mapping, etc.

Try to solve this with a core product feature, then make that interoperate well with Chef/ceph-deploy/Juju.

Once a disk is destroyed, OSD hotplugging MUST NOT create a new OSD on that disk automatically. That is, the lifecycle is:

blank --ceph-disk-prepare--> prepared
prepared --ceph-disk-activate--> active
{prepared, active} --ceph-disk-destroy?--> blank
(the last arrow goes to blank, not to prepared)

and the logic that would automatically trigger ceph-disk-prepare for listed block devices (#2554) MUST NOT re-prepare it.
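The lifecycle above can be modelled as a small state machine. This is an illustrative sketch, not ceph-disk code: the `Disk` class, the `destroyed` marker, and the `auto_prepare` hook are all hypothetical. The ticket does not say how a destroyed disk would be told apart from a never-used blank one, so the marker is just one assumed way to satisfy the MUST NOT requirement.

```python
from enum import Enum


class DiskState(Enum):
    BLANK = "blank"
    PREPARED = "prepared"
    ACTIVE = "active"


# (current state, action) -> next state, mirroring the arrows above.
TRANSITIONS = {
    (DiskState.BLANK, "prepare"): DiskState.PREPARED,
    (DiskState.PREPARED, "activate"): DiskState.ACTIVE,
    (DiskState.PREPARED, "destroy"): DiskState.BLANK,
    (DiskState.ACTIVE, "destroy"): DiskState.BLANK,
}


class Disk:
    def __init__(self):
        self.state = DiskState.BLANK
        # Hypothetical marker (an assumption, not an existing feature):
        # remembers that the disk was deliberately destroyed.
        self.destroyed = False

    def apply(self, action):
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {action} a {self.state.value} disk")
        self.state = TRANSITIONS[key]
        # destroy sets the marker; an explicit prepare or activate clears it.
        self.destroyed = action == "destroy"


def auto_prepare(disk):
    """Hotplug hook: prepare blank disks, but never one that was destroyed."""
    if disk.state is DiskState.BLANK and not disk.destroyed:
        disk.apply("prepare")
        return True
    return False


disk = Disk()
auto_prepare(disk)        # blank -> prepared
disk.apply("activate")    # prepared -> active
disk.apply("destroy")     # active -> blank, marked as destroyed
print(auto_prepare(disk), disk.state.value)
```

In this model an admin can still bring a destroyed disk back by calling `apply("prepare")` explicitly; only the automatic path is blocked.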

#5 Updated by Sage Weil almost 6 years ago

  • Status changed from New to Rejected
