Bug #7736
mon: can expose stale state
Status: Resolved
Priority: High
Assignee: -
Category: Monitor
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport: emperor, dumpling
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
teuthology-2014-03-14_19:00:49-rados-dumpling-testing-basic-plana/130941
Here is a pool deletion on the primary (mon.a, the leader), which commits as soon as a majority has accepted and replies with v396:
2014-03-14 19:41:37.801055 7f179a1df700 10 mon.a@0(leader).paxos(paxos updating c 1574..1590) handle_accept paxos(accept lc 1590 fc 0 pn 2500 opn 0) v3
2014-03-14 19:41:37.801060 7f179a1df700 10 mon.a@0(leader).paxos(paxos updating c 1574..1590) now 0,3,5,7,8 have accepted
2014-03-14 19:41:37.801064 7f179a1df700 10 mon.a@0(leader).paxos(paxos updating c 1574..1590) got majority, committing
...
2014-03-14 19:41:37.874952 7f179a1df700 20 mon.a@0(leader).osd e396 _pool_op_reply 0
2014-03-14 19:41:37.874955 7f179a1df700 1 -- 10.214.131.2:6789/0 --> 10.214.131.3:0/1022737 -- pool_op_reply(tid 1 (0) Success v396) v1 -- ?+0 0x2c1b6c0 con 0x2605c60
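But on one of the peons (mon.c, rank 4, which is not among the acceptors listed above), a racing create for the same pool succeeds as a no-op. The peon is still at e395 and answers with v395, after the leader has already replied with v396: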
2014-03-14 19:41:37.888531 7fe41f4b9700 1 -- 10.214.131.2:6791/0 <== client.5287 10.214.131.3:0/1022772 7 ==== pool_op(create pool 0 auid 0 tid 1 name foo v0) v4 ==== 68+0+0 (3773679816 0 0) 0x2046000 con 0x2296000
...
2014-03-14 19:41:37.888617 7fe41f4b9700 20 mon.c@4(peon).osd e395 _pool_op_reply 0
2014-03-14 19:41:37.888620 7fe41f4b9700 1 -- 10.214.131.2:6791/0 --> 10.214.131.3:0/1022772 -- pool_op_reply(tid 1 (0) Success v395) v1 -- ?+0 0x216a000 con 0x2296000
We committed on a bare majority, before all quorum members had acknowledged the proposal or had otherwise agreed not to expose the prior state.
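
The race above can be reproduced in a toy model. This is only a sketch, not Ceph code: `ToyMon` and `leader_delete_pool` are hypothetical names, the nine-monitor cluster (ranks 0 to 8) is an assumption inferred from the ranks in the log, and the peon is modelled as simply answering from its local map.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMon:
    """Hypothetical stand-in for a monitor; not Ceph's actual classes."""
    rank: int
    epoch: int
    pools: set = field(default_factory=set)

    def create_pool(self, name):
        # A create for a pool already present in the local map is a no-op
        # that still reports success, stamped with the local epoch.
        noop = name in self.pools
        self.pools.add(name)
        return {"result": 0, "version": self.epoch, "noop": noop}


def leader_delete_pool(mons, leader_rank, accepted, name):
    """Commit a deletion on the leader as soon as a majority has accepted."""
    if len(accepted) <= len(mons) // 2:
        raise RuntimeError("no majority")
    leader = mons[leader_rank]
    leader.pools.discard(name)
    leader.epoch += 1
    # Monitors outside `accepted` have not seen the new state yet.
    return {"result": 0, "version": leader.epoch}


# Assumed setup: nine monitors, all at epoch 395, all holding pool "foo".
mons = {r: ToyMon(r, 395, {"foo"}) for r in range(9)}

# Ranks 0,3,5,7,8 accepted, as in the leader's log; rank 4 (mon.c) did not.
delete_reply = leader_delete_pool(mons, 0, {0, 3, 5, 7, 8}, "foo")

# The racing create lands on rank 4, which still holds the old map.
create_reply = mons[4].create_pool("foo")

print(delete_reply)  # version 396: the pool is gone on the leader
print(create_reply)  # version 395, noop=True: stale state was exposed
```

In this model the client that issued the create is told it succeeded, yet the pool no longer exists at v396. Any fix has to stop the lagging monitor from answering out of its old map once the leader has committed; how the monitor actually enforces that is not shown here.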