Osd - Transactions » History » Version 1

Jessica Mack, 08/26/2015 01:36 AM

h1. Osd - Transactions

h3. Summary

Multi-object transactions would be nice: a way to apply writes to several objects atomically.

h3. Owners

* Name (Affiliation)
* Name (Affiliation)
* Name

h3. Interested Parties

* Name (Affiliation)
* Name (Affiliation)
* Name

h3. Current Status
 
h3. Detailed Description

A transaction is essentially:
 
struct MultiObjectTransaction {
  map<hobject_t, ObjectWriteOperation> object_ops;  // write operations, keyed by target object
  hobject_t master;                                 // object whose osd/pg acts as txn master
};
 
Each osd/pg has a way to persist in-progress transactions that does not touch the actual objects in question. Only when we know that the txn is persisted, and can therefore always roll forward in the event of peering or failure, do we commit and modify the real objects.
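
The side-persistence idea above can be sketched as follows. This is a minimal illustration, not OSD code: @StagedStore@, @prepare@, @commit@ and @rollback@ are hypothetical names, and plain strings stand in for @hobject_t@ and @ObjectWriteOperation@.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins: real code would use hobject_t and
// ObjectWriteOperation instead of strings.
using ObjectId = std::string;
using TxnId = std::string;
using Writes = std::map<ObjectId, std::string>;

struct StagedStore {
  std::map<ObjectId, std::string> objects;  // the real objects
  std::map<TxnId, Writes> prepared;         // in-progress txns, kept on the side

  // Persist the intended writes on the side; the real objects are untouched.
  void prepare(const TxnId& txn, const Writes& writes) {
    prepared[txn] = writes;
  }

  // Roll forward: apply the staged writes to the real objects, then drop
  // the side record. Returns false if the txn is not currently prepared.
  bool commit(const TxnId& txn) {
    auto it = prepared.find(txn);
    if (it == prepared.end()) return false;
    for (const auto& kv : it->second) objects[kv.first] = kv.second;
    prepared.erase(it);
    return true;
  }

  // Roll back: discard the staged writes; the real objects never changed.
  bool rollback(const TxnId& txn) { return prepared.erase(txn) > 0; }
};
```

Because @prepare@ only writes to the side map, a rollback needs no undo step, while a commit can be replayed from the side record for as long as that record exists.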
 
Open question: deadlock detection or avoidance? rgw doesn’t need either, but other users will.
 
Message flow, where C is the client, M the master, and S a slave, and "(disk)" marks a persisted write:

txns: C -> M -> S (disk) -> M (disk) [-> C … -> S (disk) -> M (disk) ]
now: C -> S (disk) -> C -> M (disk) -> C [ -> S (disk) -> C]
 
model 2
- client sends full txn to master
- master holds txn in memory, sends PREPAREs to slaves
- slaves persist PREPARE on the side, send PREPARE_ACK
- master collects all PREPARE_ACKs, applies the txn, and marks txn COMMITTING
- once persisted, master sends COMMITs
- master replies to client
- slaves get COMMIT and apply, reply with COMMIT_ACK
- master collects COMMIT_ACKs and closes out txn record
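
The master side of the steps above can be sketched as a small state machine. Everything here is an assumption made for illustration: the @MasterTxn@ type, the @PREPARING@ and @CLOSED@ state names (only COMMITTING appears in the notes), and the use of strings such as @"COMMIT:s1"@ to stand for outgoing messages. Persistence is only indicated in comments.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Only COMMITTING is named in the notes; the other states are assumed.
enum class TxnState { PREPARING, COMMITTING, CLOSED };

// Hypothetical master-side record for one txn; slaves are identified by name.
struct MasterTxn {
  std::set<std::string> slaves;        // every slave taking part in the txn
  std::set<std::string> prepare_acks;  // slaves that persisted the PREPARE
  std::set<std::string> commit_acks;   // slaves that applied the COMMIT
  TxnState state = TxnState::PREPARING;

  // Messages to send when the client's txn arrives ("TYPE:destination").
  std::vector<std::string> start() const {
    std::vector<std::string> out;
    for (const auto& s : slaves) out.push_back("PREPARE:" + s);
    return out;
  }

  // Returns the messages triggered by this PREPARE_ACK (empty until the last one).
  std::vector<std::string> on_prepare_ack(const std::string& slave) {
    std::vector<std::string> out;
    if (state != TxnState::PREPARING || !slaves.count(slave)) return out;
    prepare_acks.insert(slave);
    if (prepare_acks == slaves) {
      // All PREPARE_ACKs are in: apply locally and mark COMMITTING.
      // A real master would persist this before sending anything.
      state = TxnState::COMMITTING;
      for (const auto& s : slaves) out.push_back("COMMIT:" + s);
      out.push_back("REPLY:client");
    }
    return out;
  }

  // The txn record is closed out once every slave has sent COMMIT_ACK.
  void on_commit_ack(const std::string& slave) {
    if (state != TxnState::COMMITTING || !slaves.count(slave)) return;
    commit_acks.insert(slave);
    if (commit_acks == slaves) state = TxnState::CLOSED;
  }
};
```

Note that the client reply is issued as soon as the master reaches COMMITTING; it does not wait for the COMMIT_ACKs, matching the order of the steps above.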
 
slaves:
- on pg active:
    - send NOTIFY to txn masters for fate of prepared txns
    - master replies with COMMIT or ROLLBACK, perhaps with delay
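
One way the master might answer a NOTIFY is sketched below. The names @Fate@, @MasterState@ and @resolve_fate@ are hypothetical. Two assumptions are made that the notes do not state: a txn the master has no record of is rolled back (for example, the master restarted while holding it only in memory), and a txn that is still collecting PREPARE_ACKs gets a delayed answer, modelled here as @PENDING@.

```cpp
#include <cassert>
#include <map>
#include <string>

// PENDING models the "perhaps with delay" case: no answer can be given yet.
enum class Fate { COMMIT, ROLLBACK, PENDING };

// Hypothetical master-side txn states that matter for the decision.
enum class MasterState { PREPARING, COMMITTING };

// Decide what to tell a slave that asks about a prepared txn.
Fate resolve_fate(const std::map<std::string, MasterState>& records,
                  const std::string& txn) {
  auto it = records.find(txn);
  if (it == records.end()) return Fate::ROLLBACK;  // assumption: unknown means never committed
  if (it->second == MasterState::COMMITTING) return Fate::COMMIT;  // decision was made
  return Fate::PENDING;  // still preparing: answer later
}
```

A txn whose record was closed out is also unknown to the master, but by then every slave has sent COMMIT_ACK, so under this sketch no slave should still be asking about it.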
 
master:
- resend PREPARE if the slave pg changes
 
Clients should make the osd with the largest write the master, so that the largest write avoids the prepare cost of being written twice (once for the PREPARE, once to the object).
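
A client-side sketch of that choice, assuming the client knows how many bytes it will write to each object. @pick_master@ is a hypothetical helper, and strings stand in for @hobject_t@.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical input: bytes to be written per object in the txn.
// Returns the object with the largest write, or "" if the txn is empty.
// Ties go to the first object in key order.
std::string pick_master(const std::map<std::string, std::size_t>& write_bytes) {
  std::string master;
  std::size_t best = 0;
  bool found = false;
  for (const auto& kv : write_bytes) {
    if (!found || kv.second > best) {
      master = kv.first;
      best = kv.second;
      found = true;
    }
  }
  return master;
}
```

The result would be stored in the @master@ field of the transaction, so the osd holding that object applies its write directly instead of staging it first.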
 
It might make sense to have the primary delay the ROLLBACK message, with the expectation that the client will retry the transaction soon.
 
The transactions are referenced in the pg metadata on both master and slave, so they are pulled into memory on osd restart and the ObjectContext lock state is always in place.
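
A rough sketch of the restart path, under the assumption that the pg metadata maps each in-flight txn id to the objects it touches. @InFlightIndex@ and @rebuild_locks@ are hypothetical names, and the returned map is only a stand-in for real ObjectContext lock state.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical persisted pg metadata: txn id -> objects the txn touches.
using InFlightIndex = std::map<std::string, std::vector<std::string>>;

// Rebuild the in-memory lock table (object -> txn holding it) at startup,
// so every object named by an in-flight txn is locked before new ops run.
std::map<std::string, std::string> rebuild_locks(const InFlightIndex& index) {
  std::map<std::string, std::string> locks;
  for (const auto& entry : index) {
    for (const auto& oid : entry.second) {
      locks[oid] = entry.first;
    }
  }
  return locks;
}
```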
 
Big pieces:
 
- pg metadata gets an index of in-flight txns
- we add somewhere to persist them
- peering needs to exchange the list of in-flight txns and their state
- some simple logic to roll forward/back

h3. Work items

h4. Coding tasks

# Task 1
# Task 2
# Task 3

h4. Build / release tasks

# Task 1
# Task 2
# Task 3

h4. Documentation tasks

# Task 1
# Task 2
# Task 3

h4. Deprecation tasks

# Task 1
# Task 2
# Task 3