Mon - dispatch messages while waiting for IO to complete » History » Version 1

Jessica Mack, 06/30/2015 11:47 PM

h1. Mon - dispatch messages while waiting for IO to complete

h3. Summary

During the Giant CDS we discussed dispatching monitor messages independently, which would increase concurrency while preserving the serializability of operations. After some testing with write-intensive workloads, we realized that what we really need is to dispatch some monitor messages while the monitor is waiting for IO to complete. This will minimize monitor flapping caused by leveldb not being able to keep up with the write workload, especially when compactions are triggered.

h3. Owners

* Joao Eduardo Luis (Inktank)
* Name (Affiliation)
* Name

h3. Interested Parties

* Name (Affiliation)
* Name (Affiliation)
* Name

h3. Current Status

The previous blueprint from the Giant CDS didn't get much love, so the current state is pretty much the same as it was back then: dispatch queues, work queues and thread pools already exist and are used in other portions of Ceph, and we can reuse them in the monitors. We may not even need any of that if we choose to keep handling a single message at a time while relinquishing the big monitor lock.

h3. Detailed Description
 
We will have to make sure that some messages, especially those related to elections and leases, are still dispatched while performing IO. The majority of the IO will be performed when finishing a paxos transaction and applying the state to leveldb.
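As a rough illustration of the filtering this implies, the sketch below defers everything except election and lease traffic while IO is in flight, then replays the deferred messages in arrival order. All names here (@MsgKind@, @Msg@, @Dispatcher@, @dispatchable_during_io@) are hypothetical and do not correspond to the actual Ceph monitor classes; which message kinds are really safe to handle during IO is an assumption that would need to be checked against the monitor code.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical message kinds; the real monitor message types differ.
enum class MsgKind { Election, LeaseExtend, LeaseAck, PaxosBegin, ClientCommand };

struct Msg {
  MsgKind kind;
  int id;
};

// Assumption: only election and lease traffic may be handled while a
// paxos transaction is being written to the store.
inline bool dispatchable_during_io(MsgKind k) {
  return k == MsgKind::Election || k == MsgKind::LeaseExtend ||
         k == MsgKind::LeaseAck;
}

class Dispatcher {
 public:
  void set_io_in_progress(bool v) { io_in_progress_ = v; }

  // Handles the message now, or defers it until IO has completed.
  // Returns true if the message was handled immediately.
  bool dispatch(const Msg& m) {
    if (io_in_progress_ && !dispatchable_during_io(m.kind)) {
      deferred_.push_back(m);
      return false;
    }
    handled_.push_back(m.id);
    return true;
  }

  // Called once IO has completed: replays deferred messages in arrival
  // order, so their relative ordering is preserved.
  void flush_deferred() {
    assert(!io_in_progress_);
    while (!deferred_.empty()) {
      handled_.push_back(deferred_.front().id);
      deferred_.pop_front();
    }
  }

  const std::vector<int>& handled() const { return handled_; }
  std::size_t deferred_count() const { return deferred_.size(); }

 private:
  bool io_in_progress_ = false;
  std::deque<Msg> deferred_;   // messages waiting for IO to finish
  std::vector<int> handled_;   // ids in the order they were handled
};
```

In this sketch a client command arriving during IO is queued, while a lease extension arriving after it is handled right away; the queued command is only processed once @flush_deferred()@ runs.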
 
Currently, the ideal approach would be to introduce a new Paxos state indicating that we are waiting (or about to wait) for a paxos transaction to complete. We would wait on a condition, relinquishing the mon lock, and let the monitor go back to handling messages. We must ensure that lease extensions are still propagated to all the monitors, even if the current proposal has not been fully committed (this is safe because the old value remains valid until the transaction is fully committed).
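The locking pattern described above can be sketched with standard C++ primitives. This is only an illustration under stated assumptions: @MonSketch@, @PaxosState::Writing@, @commit()@ and @handle_lease_extend()@ are hypothetical names, @std::mutex@ and @std::condition_variable@ stand in for whatever lock and condition types the monitor actually uses, and the @do_io@ callback stands in for the leveldb write.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical states; Writing is the proposed "waiting for IO" state.
enum class PaxosState { Active, Writing };

class MonSketch {
 public:
  // Applies a transaction. `do_io` stands in for the store write and
  // runs on a separate thread without the mon lock held.
  template <typename IoFn>
  void commit(IoFn do_io) {
    std::unique_lock<std::mutex> l(mon_lock_);
    state_ = PaxosState::Writing;
    bool done = false;  // guarded by mon_lock_
    std::thread io([&] {
      do_io();
      std::lock_guard<std::mutex> g(mon_lock_);
      done = true;
      cond_.notify_all();
    });
    // wait() releases mon_lock_ while blocked, so other handlers can
    // take the lock, and reacquires it before returning.
    cond_.wait(l, [&] { return done; });
    state_ = PaxosState::Active;
    l.unlock();
    io.join();
  }

  // Lease extension handler; may run while a commit is in flight.
  // Returns true if it ran during the Writing state.
  bool handle_lease_extend() {
    std::lock_guard<std::mutex> g(mon_lock_);
    ++leases_extended_;
    return state_ == PaxosState::Writing;
  }

  int leases_extended() {
    std::lock_guard<std::mutex> g(mon_lock_);
    return leases_extended_;
  }

  PaxosState state() {
    std::lock_guard<std::mutex> g(mon_lock_);
    return state_;
  }

 private:
  std::mutex mon_lock_;  // stand-in for the big monitor lock
  std::condition_variable cond_;
  PaxosState state_ = PaxosState::Active;
  int leases_extended_ = 0;
};
```

The point of the sketch is that @handle_lease_extend()@ can acquire the lock while @commit()@ is blocked in @wait()@; if @commit()@ held the lock for the whole write, any handler called in that window would block until the IO finished.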
 
We may also have to look into lease timeouts and adjust them properly (or even temporarily disable them) during a transaction commit, as a transaction can sometimes take up to one minute to commit (a write may have to wait for a compaction to finish, for instance), and all our timeouts are considerably shorter than that by default (the longest is, iirc, 10 seconds).

h3. Work items

h4. Coding tasks

# Task 1
# Task 2
# Task 3

h4. Build / release tasks

# Task 1
# Task 2
# Task 3

h4. Documentation tasks

# Task 1
# Task 2
# Task 3

h4. Deprecation tasks

# Task 1
# Task 2
# Task 3