Ceph CLI Experience » History » Version 1
Jessica Mack, 06/22/2015 02:11 AM
h1. Ceph CLI Experience

h3. Summary

Enhance CLI features to make shell-based troubleshooting easier.

h3. Owners

* Kyle Bader (kyle.bader@dreamhost.com)

h3. Interested Parties

* Mike Dawson <mike.dawson@cloudapt.com>
* Joao Eduardo Luis <joao.luis@inktank.com>
* Name (Affiliation)

h3. Current Status

h3. Detailed Description

There are a number of things that could be done to improve the CLI experience for Ceph operators.

*Parallel Distributed Shell Integration*
The parallel distributed shell (pdsh) is a useful tool for managing a Ceph cluster; it would be even more useful if genders and machine information could be derived from Ceph monitor state. For example, it would be great if something like this were supported:
<pre>
$ cephsh osd 1 uptime

< runs uptime on the osd.1 host >

$ cephsh rack irv-n1 status ceph-osd-all

< checks upstart status on all OSDs on all hosts in rack irv-n1 >

$ cephsh pg 5.123 uptime

< runs uptime on hosts with OSDs in placement group 5.123 >
</pre>

This could be implemented as a pdsh plugin, or as a wrapper around pdsh that collects host information from the Ceph monitors and passes the hosts as a comma-separated list along with the command: "pdsh -w <hosts> <command>".
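
The wrapper approach could be sketched roughly as below, in Python: ask the monitors where an entity lives, then build the pdsh invocation. The JSON shape mirrors what "ceph osd find <id>" emits, but treat the exact field names as an assumption here, and names like @host_of_osd@/@pdsh_command@ are hypothetical helpers, not existing tools.

```python
# Sketch of the cephsh wrapper idea: resolve a Ceph entity to hosts, then
# shell out to "pdsh -w <hosts> <command>". Uses canned monitor output
# instead of a live cluster; field names are assumptions.
import json
import shlex


def host_of_osd(osd_find_json):
    """Pull the hostname out of parsed 'ceph osd find <id>' output."""
    return osd_find_json["crush_location"]["host"]


def pdsh_command(hosts, command):
    """Build the pdsh invocation: hosts as a comma-separated -w list."""
    return "pdsh -w {} {}".format(",".join(hosts), shlex.quote(command))


# Example with canned "ceph osd find 1" output:
osd_find = json.loads(
    '{"osd": 1, "ip": "10.0.0.12:6800/1234",'
    ' "crush_location": {"host": "cephstore1234", "rack": "irv-n1"}}'
)
print(pdsh_command([host_of_osd(osd_find)], "uptime"))
# pdsh -w cephstore1234 uptime
```

A real wrapper would run "ceph osd find 1 --format json" (or walk the OSD map for rack/pg targets) instead of using canned JSON, then exec the resulting pdsh command.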

*Ability to set up/down/in/out or noout/nodown/noup/noin based on CRUSH hierarchy*
Another feature that would be incredibly useful would be the ability to set up/down/in/out based on the CRUSH hierarchy.
<pre>
$ sudo ceph osd down rack lax-a1
$ sudo ceph osd out host cephstore1234
$ sudo ceph osd set noout rack lax-a1
</pre>

This is useful when you are performing maintenance operations on an entire node/rack/etc. The noout/noin/nodown/noup part would be nice when you're dealing with a large cluster and you don't want to stop those operations from taking place on the rest of the cluster.
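
One way a command like "ceph osd down rack lax-a1" could be prototyped client-side is to expand the named CRUSH bucket to its OSDs and issue the existing per-OSD commands. The sketch below, in Python, mimics the "nodes" layout of "ceph osd tree --format json" (id/name/type/children); those field names are an assumption, and @osds_under@ is a hypothetical helper.

```python
# Sketch: expand a named CRUSH bucket to the osd ids beneath it, using a
# node list shaped like "ceph osd tree --format json" output (assumed).

def osds_under(nodes, bucket_name):
    """Return the osd ids in the subtree rooted at the named CRUSH bucket."""
    by_id = {n["id"]: n for n in nodes}
    stack = [n for n in nodes if n["name"] == bucket_name]
    osds = []
    while stack:
        node = stack.pop()
        if node["type"] == "osd":
            osds.append(node["id"])
        for child in node.get("children", []):
            stack.append(by_id[child])
    return sorted(osds)


nodes = [
    {"id": -3, "name": "lax-a1", "type": "rack", "children": [-2]},
    {"id": -2, "name": "cephstore1234", "type": "host", "children": [0, 1]},
    {"id": 0, "name": "osd.0", "type": "osd"},
    {"id": 1, "name": "osd.1", "type": "osd"},
]
# One existing per-OSD command per expanded osd:
for osd_id in osds_under(nodes, "lax-a1"):
    print("ceph osd down {}".format(osd_id))
```

Doing the expansion in the monitors instead would keep it atomic with respect to CRUSH map changes, but a client-side wrapper like this needs no new mon commands.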

*Show running Ceph version in "ceph osd tree"*
<pre>
# id	weight	type name	up/down	reweight
-1	879.4	pool default
-4	451.4		row lax-a
-3	117			rack lax-a1
-2	7				host cephstore1234
48	1					osd.0	up	1	0.67.4-1precise
65	1					osd.1	up	1	0.67.4-1precise
86	1					osd.2	up	1	0.67.4-1precise
116	1					osd.3	up	1	0.67.4-1precise
184	1					osd.4	up	1	0.67.4-1precise
711	1					osd.5	up	1	0.67.4-1precise
777	1					osd.6	up	1	0.67.4-1precise
-5	6				host cephstore1235
</pre>

*Add drain action to OSD command*
It would be really nice to add a drain command that slowly lowers the CRUSH weight of an OSD, or a hierarchy of OSDs, until it reaches a weight of 0.
<pre>
$ sudo ceph osd drain osd.1 0.1
$ sudo ceph osd drain cephstore1234
</pre>

The drain command would lower the CRUSH weight of all members under the given subtree by a default decrement, or by a decrement passed as a second argument. The cluster would wait until all backfills are complete before decrementing further, ad infinitum until the weight is 0.
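
The stepping part of that loop could look like the Python sketch below; @drain_steps@ is a hypothetical helper, and in a real implementation each step would call "ceph osd crush reweight" and poll PG state until backfill finishes before taking the next step.

```python
# Sketch of the proposed drain loop's weight schedule: step the CRUSH
# weight down toward 0 by a fixed decrement. Waiting for backfill between
# steps (polling "ceph pg stat") is omitted; only the stepping is shown.

def drain_steps(weight, decrement=0.1):
    """Return the successive CRUSH weights a drain would set, ending at 0."""
    steps = []
    while weight > 0:
        weight = round(max(weight - decrement, 0.0), 4)
        steps.append(weight)
    return steps


print(drain_steps(0.3))        # [0.2, 0.1, 0.0]
print(drain_steps(1.0, 0.25))  # [0.75, 0.5, 0.25, 0.0]
```

Clamping at 0 and rounding keeps the schedule finite even when the decrement does not divide the starting weight evenly.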

h3. Work items

h4. Coding tasks

# Task 1
# #6687
# #6506

h4. Build / release tasks

# Task 1
# Task 2
# Task 3

h4. Documentation tasks

# Task 1
# Task 2
# Task 3

h4. Deprecation tasks

# Task 1
# Task 2
# Task 3