Jessica Mack, 06/30/2015 11:47 PM
1 | 1 | Jessica Mack | h1. Mon - dispatch messages while waiting for IO to complete |
---|---|---|---|
2 | 1 | Jessica Mack | |
3 | 1 | Jessica Mack | h3. Summary |
4 | 1 | Jessica Mack | |
5 | 1 | Jessica Mack | During the Giant CDS we discussed dispatching monitor messages independently, increasing concurrency while keeping operations serializable. After some testing with write-intensive workloads, we realized that what we really need is to dispatch some monitor messages while the monitor is waiting for IO to complete. This will reduce monitor flapping caused by leveldb not being able to keep up with the write workload, especially when compactions are triggered. |
6 | 1 | Jessica Mack | |
7 | 1 | Jessica Mack | h3. Owners |
8 | 1 | Jessica Mack | |
9 | 1 | Jessica Mack | * Joao Eduardo Luis (Inktank) |
12 | 1 | Jessica Mack | |
13 | 1 | Jessica Mack | h3. Interested Parties |
14 | 1 | Jessica Mack | |
15 | 1 | Jessica Mack | * Name (Affiliation) |
16 | 1 | Jessica Mack | * Name (Affiliation) |
17 | 1 | Jessica Mack | * Name |
18 | 1 | Jessica Mack | |
19 | 1 | Jessica Mack | h3. Current Status |
20 | 1 | Jessica Mack | |
21 | 1 | Jessica Mack | The previous blueprint from the Giant CDS didn't get much attention, so the current state is pretty much the same as it was back then: dispatch queues, work queues and thread pools already exist and are used in other portions of Ceph, and we can reuse them in the monitors. We may not even need any of that if we choose to keep handling a single message at a time while relinquishing the big monitor lock during IO. |
22 | 1 | Jessica Mack | |
23 | 1 | Jessica Mack | h3. Detailed Description |
24 | 1 | Jessica Mack | |
25 | 1 | Jessica Mack | We will have to make sure that some messages, especially those related to elections and leases, are dispatched while IO is being performed. The majority of IO is performed when finishing a Paxos transaction and applying the state to leveldb. |
26 | 1 | Jessica Mack | |
27 | 1 | Jessica Mack | Currently, the ideal approach would be to introduce a new Paxos state indicating that we are waiting (or about to wait) for a Paxos transaction to complete. We would then wait on a condition, relinquish the mon lock, and let the monitor go back to handling messages. We must ensure that lease extensions are still propagated to all the monitors, even if the current proposal has not been fully committed (this is safe because, until the transaction is fully committed, the old value is still valid). |
28 | 1 | Jessica Mack | |
29 | 1 | Jessica Mack | We may also have to look into lease timeouts and adjust them (or even temporarily disable them) during a transaction commit. A transaction can sometimes take up to one minute to commit (a write may have to wait for a compaction to finish, for instance), and all our default timeouts are considerably shorter than that (the longest is, if I recall correctly, 10 seconds). |
30 | 1 | Jessica Mack | |
31 | 1 | Jessica Mack | h3. Work items |
32 | 1 | Jessica Mack | |
33 | 1 | Jessica Mack | h4. Coding tasks |
34 | 1 | Jessica Mack | |
35 | 1 | Jessica Mack | # Task 1 |
36 | 1 | Jessica Mack | # Task 2 |
37 | 1 | Jessica Mack | # Task 3 |
38 | 1 | Jessica Mack | |
39 | 1 | Jessica Mack | h4. Build / release tasks |
40 | 1 | Jessica Mack | |
41 | 1 | Jessica Mack | # Task 1 |
42 | 1 | Jessica Mack | # Task 2 |
43 | 1 | Jessica Mack | # Task 3 |
44 | 1 | Jessica Mack | |
45 | 1 | Jessica Mack | h4. Documentation tasks |
46 | 1 | Jessica Mack | |
47 | 1 | Jessica Mack | # Task 1 |
48 | 1 | Jessica Mack | # Task 2 |
49 | 1 | Jessica Mack | # Task 3 |
50 | 1 | Jessica Mack | |
51 | 1 | Jessica Mack | h4. Deprecation tasks |
52 | 1 | Jessica Mack | |
53 | 1 | Jessica Mack | # Task 1 |
54 | 1 | Jessica Mack | # Task 2 |
55 | 1 | Jessica Mack | # Task 3 |
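
The approach outlined in the Detailed Description (enter a new Paxos state, wait on a condition, relinquish the mon lock, and keep handling lease messages until the write finishes) could look roughly like the following. This is only a minimal sketch under stated assumptions: it uses standard C++ primitives rather than Ceph's own locking classes, and every name in it (@MonitorSketch@, @PaxosState::Committing@, @handle_lease@, @commit@, @slow_write@) is hypothetical and not taken from the Ceph source.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// All names here are hypothetical; none are taken from the Ceph source.
enum class PaxosState { Active, Committing };

struct MonitorSketch {
  std::mutex mon_lock;                  // stands in for the big monitor lock
  std::condition_variable commit_cond;  // signalled once the store write finishes
  PaxosState state = PaxosState::Active;
  bool commit_done = false;
  int committed_value = 1;       // last fully committed value
  int leases_during_commit = 0;  // lease messages handled while a commit was pending
  int value_seen_by_lease = 0;   // value observed by the most recent lease message

  // Handle a lease message. It needs the monitor lock, so it can only
  // make progress while commit() is not holding that lock.
  void handle_lease() {
    std::lock_guard<std::mutex> guard(mon_lock);
    if (state == PaxosState::Committing) ++leases_during_commit;
    value_seen_by_lease = committed_value;  // old value stays valid until commit ends
  }

  // Commit new_value. slow_write stands in for the store write (assumed to be
  // slow, e.g. stuck behind a compaction) and runs on a helper thread.
  void commit(int new_value, std::function<void()> slow_write) {
    std::unique_lock<std::mutex> lock(mon_lock);
    assert(state == PaxosState::Active);  // still one transaction at a time
    state = PaxosState::Committing;
    commit_done = false;

    std::thread io([this, slow_write] {
      slow_write();  // performed without holding the monitor lock
      {
        std::lock_guard<std::mutex> guard(mon_lock);
        commit_done = true;
      }
      commit_cond.notify_one();
    });

    // wait() releases mon_lock while blocked, so handle_lease() can run,
    // and reacquires it before returning.
    commit_cond.wait(lock, [this] { return commit_done; });
    committed_value = new_value;
    state = PaxosState::Active;
    lock.unlock();
    io.join();
  }
};
```

In this sketch a caller runs @commit@ on one thread and @handle_lease@ on another; any lease handled during the wait sees the previously committed value. Timeout adjustment during the commit is deliberately left out.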