Mon - dispatch messages while waiting for IO to complete » History » Version 1

Jessica Mack, 06/30/2015 11:47 PM

h1. Mon - dispatch messages while waiting for IO to complete

h3. Summary

During the Giant CDS we discussed dispatching monitor messages independently, increasing concurrency while preserving the serializability of operations. After some testing with write-intensive workloads, we realized that what we really need is to dispatch some monitor messages while the monitor is waiting for IO to complete. This should minimize monitor flapping caused by leveldb being unable to keep up with the write workload, especially when compactions are triggered.

h3. Owners

* Joao Eduardo Luis (Inktank)

h3. Current Status

The previous blueprint from the Giant CDS didn't get much love, so the current state is pretty much the same as it was back then: dispatch queues, work queues and thread pools already exist and are used in other portions of Ceph, and we can reuse them in the monitors. We may not even need any of that if we choose to keep handling just a single message at a time while relinquishing the big monitor lock.

h3. Detailed Description
 
We will have to make sure that some messages, especially those related to elections and leases, are still dispatched while IO is being performed. The majority of IO is performed when finishing a Paxos transaction and applying the state to leveldb.
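
As a rough illustration of that requirement, the sketch below models a dispatcher that keeps handling election and lease messages while IO is pending and defers everything else until the IO finishes. It is a toy model in Python, not monitor code: the @Dispatcher@ class, the message kind strings and the @IO_SAFE_KINDS@ set are all assumptions made for the example.

```python
from collections import deque

# Hypothetical message kinds. The blueprint only says "election and leases";
# the exact set of messages that is safe to handle during IO is an assumption.
IO_SAFE_KINDS = {"election", "lease", "lease_ack"}


class Dispatcher:
    """Toy dispatcher: handles IO-safe messages during IO, defers the rest."""

    def __init__(self):
        self.io_pending = False
        self.deferred = deque()   # messages waiting for the IO to finish
        self.handled = []         # messages handled so far, in handling order

    def io_started(self):
        self.io_pending = True

    def io_finished(self):
        # Replay deferred messages in arrival order, so their relative
        # ordering is the same as if they had simply arrived later.
        self.io_pending = False
        while self.deferred:
            self.handled.append(self.deferred.popleft())

    def dispatch(self, kind, payload=None):
        """Return True if handled immediately, False if deferred."""
        if self.io_pending and kind not in IO_SAFE_KINDS:
            self.deferred.append((kind, payload))
            return False
        self.handled.append((kind, payload))
        return True
```

With IO pending, a @lease@ or @election@ message is handled right away, while any other message is queued and replayed once @io_finished()@ is called.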
 
Currently, the ideal approach would be to introduce a new Paxos state indicating that we are waiting (or about to wait) for a Paxos transaction to complete. We would wait on a condition, relinquish the mon lock, and let the monitor go back to handling messages. We must ensure that lease extensions are still propagated to all the monitors, even if the current proposal has not been fully committed. This is safe because, until the transaction is fully committed, the old value is still valid.
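
The wait-on-a-condition idea can be sketched as follows. This is a minimal Python model under stated assumptions, not the monitor's implementation: @MonSketch@, the @"writing"@ state name and the @do_io@ callback are hypothetical, and a plain @threading.Lock@ stands in for the mon lock. The point it shows is that waiting on a condition variable releases the lock, so a lease extension can still be handled and still sees the old committed value.

```python
import threading


class MonSketch:
    """Toy model: commit waits on a condition and releases the lock meanwhile."""

    def __init__(self):
        self.lock = threading.Lock()                 # stands in for the mon lock
        self.commit_done = threading.Condition(self.lock)
        self.state = "active"
        self.committed = "v1"                        # last fully committed value
        self.leases_extended = 0

    def commit(self, new_value, do_io):
        with self.lock:
            self.state = "writing"                   # hypothetical new Paxos state
            io = threading.Thread(target=self._run_io, args=(new_value, do_io))
            io.start()
            # wait_for() releases self.lock while blocked and reacquires it
            # before returning, so other handlers can run in the meantime.
            self.commit_done.wait_for(lambda: self.state == "active")
        io.join()

    def _run_io(self, new_value, do_io):
        do_io()                                      # slow write, lock not held
        with self.lock:
            self.committed = new_value
            self.state = "active"
            self.commit_done.notify_all()

    def handle_lease_extension(self):
        # Safe during a commit: the old value stays valid until the commit ends.
        with self.lock:
            self.leases_extended += 1
            return self.committed
```

In this model the lease extension is handled by a second thread while the committing thread is blocked; whether the real monitor would use a separate thread or return to its dispatch loop is left open here.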
 
We may also have to look into lease timeouts and adjust them (or even temporarily disable them) during a transaction commit: a transaction can sometimes take up to one minute to commit (a write may have to wait for a compaction to finish, for instance), and all our timeouts are considerably shorter than that by default (the longest is, iirc, 10 seconds).
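
One way to picture the "temporarily disable" option is a timer that stops counting while a commit is in flight. The sketch below is only an illustration in Python: @LeaseTimer@ and its methods are hypothetical, the 10-second default simply mirrors the figure recalled above, and whether pausing timeouts this way is acceptable for the monitors is exactly what would need to be looked into.

```python
import time


class LeaseTimer:
    """Toy lease timer that does not count time spent in a commit."""

    def __init__(self, timeout=10.0, clock=time.monotonic):
        self.timeout = timeout          # seconds; 10.0 is an assumed default
        self.clock = clock              # injectable so it can be tested
        self.last_renewal = clock()
        self.paused_at = None           # set while a commit is in flight

    def renew(self):
        self.last_renewal = self.clock()

    def commit_started(self):
        if self.paused_at is None:
            self.paused_at = self.clock()

    def commit_finished(self):
        if self.paused_at is not None:
            # Shift the renewal time forward by the commit duration, so the
            # time spent waiting for the write does not count against the lease.
            self.last_renewal += self.clock() - self.paused_at
            self.paused_at = None

    def expired(self):
        if self.paused_at is not None:
            return False                # never expire in the middle of a commit
        return self.clock() - self.last_renewal > self.timeout
```

For example, with a lease renewed at time 0 and a commit that runs from second 4 to second 64, the timer behaves as if only 4 seconds had elapsed when the commit finishes, and expires only once more than 10 further uncommitted seconds have accumulated.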