
Feature #44929

Updated by Jan Fajerski almost 4 years ago

This is currently only relevant for cephadm. 

We need a way to pass certain flags to ceph-volume that would not be particularly useful to expose via ceph-volume's arg parser:

 * blacklisted devices  

This will be functionality that you can leverage via the orchestrators. We allow blacklisting certain devices on hosts
when using drivegroups.
One reason you may want to do this is that you know a certain disk will eventually fail but you can't physically remove it yet.

 To make this work we need a way to broadcast that information down to ceph-volume. 

 <pre> 
 BLACKLISTED_DEVICES=/dev/sda ceph-volume .. 
 </pre> 
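A minimal sketch of how ceph-volume could consume such a variable; the comma-separated format and the `split_blacklisted` helper are assumptions for illustration, not the actual implementation:

<pre>
import os


def split_blacklisted(devices):
    """Split candidate devices into (usable, blacklisted) lists."""
    # Hypothetical: treat BLACKLISTED_DEVICES as a comma-separated list of paths.
    raw = os.environ.get('BLACKLISTED_DEVICES', '')
    blacklisted = {d.strip() for d in raw.split(',') if d.strip()}
    usable = [d for d in devices if d not in blacklisted]
    rejected = [d for d in devices if d in blacklisted]
    return usable, rejected


# BLACKLISTED_DEVICES=/dev/sdb -> (['/dev/sda'], ['/dev/sdb'])
print(split_blacklisted(['/dev/sda', '/dev/sdb']))
</pre>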


 * drivegroup affinity 

When creating a set of OSDs using the orchestrator (currently only cephadm) you will end up using OSDSpecs, which inherit from DriveGroupSpec.
Since it's possible to apply multiple different drivegroups to one node, we need a way to persistently map OSD daemons to their respective drivegroup.

 The command would look something like this: 

 <pre> 
 DG_AFFINITY='custom_name' ceph-volume lvm batch /dev/sda /dev/sdb 
 </pre> 

Internally this would save the dg_affinity key with the value "custom_name" in the tags of /dev/sda and /dev/sdb (or rather, their respective LV counterparts).
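As a rough illustration of how that persistence could work, the value could be written as an LVM tag via `lvchange --addtag`; the `ceph.dg_affinity` tag name and the `tag_dg_affinity` helper are assumptions, not the confirmed implementation:

<pre>
import os
import subprocess


def tag_dg_affinity(lv_path):
    """Store the drivegroup name from DG_AFFINITY as an LVM tag on the given LV."""
    dg = os.environ.get('DG_AFFINITY')
    if not dg:
        return
    # 'ceph.dg_affinity' as a tag name is an assumption for illustration.
    subprocess.check_call(
        ['lvchange', '--addtag', 'ceph.dg_affinity={}'.format(dg), lv_path]
    )
</pre>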

To retrieve the information we can leverage the existing `inventory` feature. This key will only be exposed when using the `--format $any_non_plain` flag, as this information is almost always useless to the end user.

 <pre> 
ceph-volume lvm inventory /dev/sda --format json-pretty
{
  ... # metadata
  "dg_affinity": "custom_name",
  ... # more metadata
}
 </pre> 
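A small sketch of how the reporting side could gate the key on the output format; the `inventory_record` helper is hypothetical:

<pre>
def inventory_record(metadata, dg_affinity, output_format):
    """Attach dg_affinity only for structured (non-plain) output formats."""
    record = dict(metadata)
    if output_format != 'plain' and dg_affinity is not None:
        record['dg_affinity'] = dg_affinity
    return record


# plain output leaves the record untouched; json/json-pretty include the key
print(inventory_record({'path': '/dev/sda'}, 'custom_name', 'json-pretty'))
</pre>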


 CONSIDERATIONS: 

There is still the question of whether it makes sense to add proper arg parsing/passing for these flags to make this less awkward.

 `ceph-volume lvm batch /dev/sda --dg-affinity $custom_name` 

 This seems more natural, but isn't very likely to be used by an actual person as drivegroups are typically leveraged by orchestration systems. 


It's even more weird and awkward for the BLACKLISTED_DEVICES env var. If a _user_ wants to exclude a certain disk from the command they're running, they can just leave it out of the *actual* command instead of passing it to ceph-volume as an env var.

 However, using the method described here gives us some advantages for orchestration systems. 

If we'd do it like a user and just pass the adjusted command down to ceph-volume, we lose reporting functionality.

Let's assume we have two disks, /dev/sda and /dev/sdb, and we want to blacklist /dev/sdb.

 A user would do the following. 

 <pre> 

 ceph-volume lvm batch /dev/sda  

 </pre> 

 The orchestrator should however do this: 

 <pre> 

 BLACKLISTED_DEVICES=/dev/sdb ceph-volume lvm batch /dev/sda /dev/sdb 

 </pre> 

 The reason for this is: 

If you have forgotten (or never knew) that a certain device was blacklisted, it's hard to debug and find out what's going on.
With the env var method, however, the blacklisting shows up in the inventory output that cephadm also uses to show device status.

 <pre> 
 BLACKLISTED_DEVICES=/dev/sdb ceph-volume inventory --format json-pretty 

 </pre> 

This will now print "blacklisted by user" in the "rejected" field for /dev/sdb.
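A sketch of how the inventory report could derive that rejection reason from the env var; the `rejection_reasons` helper is hypothetical:

<pre>
import os


def rejection_reasons(device_path):
    """Return inventory rejection reasons, including the user blacklist."""
    blacklisted = os.environ.get('BLACKLISTED_DEVICES', '').split(',')
    reasons = []
    if device_path in blacklisted:
        reasons.append('blacklisted by user')
    return reasons


# BLACKLISTED_DEVICES=/dev/sdb -> ['blacklisted by user']
print(rejection_reasons('/dev/sdb'))
</pre>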






