Advertisement
Guest User

Untitled

a guest
Jun 19th, 2019
121
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
text 15.63 KB | None | 0 0
  1. --------------------------------- MODULE RaftMongo ---------------------------------
  2. \* This is the formal specification for the Raft consensus algorithm in MongoDB
  3.  
  4. EXTENDS Naturals, FiniteSets, Sequences, TLC, Animation, Integers
  5.  
  6. \* The set of server IDs
  7. CONSTANTS Server
  8.  
  9. \* Server states.
  10. \* Candidate is not used, but this is fine.
  11. CONSTANTS Follower, Candidate, Leader
  12.  
  13. \* A reserved value.
  14. CONSTANTS Nil
  15.  
  16. ----
  17. \* Global variables
  18.  
  19. \* The server's term number.
  20. VARIABLE globalCurrentTerm
  21.  
  22. ----
  23. \* The following variables are all per server (functions with domain Server).
  24.  
  25. \* The server's state (Follower, Candidate, or Leader).
  26. VARIABLE state
  27.  
  28. \* The commit point learned by each server.
  29. VARIABLE commitPoint
  30.  
  31. \* The current sync source of each server, if it has one.
  32. VARIABLE syncSource
  33.  
  34. electionVars == <<globalCurrentTerm, state>>
  35. serverVars == <<electionVars, commitPoint, syncSource>>
  36.  
  37. \* A Sequence of log entries. The index into this sequence is the index of the
  38. \* log entry. Unfortunately, the Sequence module defines Head(s) as the entry
  39. \* with index 1, so be careful not to use that!
  40. VARIABLE log
  41. logVars == <<log>>
  42.  
  43. \* End of per server variables.
  44. ----
  45.  
  46. \* All variables; used for stuttering (asserting state hasn't changed).
  47. vars == <<serverVars, logVars>>
  48.  
  49. ----
  50. \* Helpers
  51.  
  52. \* The set of all quorums. This just calculates simple majorities, but the only
  53. \* important property is that every quorum overlaps with every other.
  54. Quorum == {i \in SUBSET(Server) : Cardinality(i) * 2 > Cardinality(Server)}
  55.  
  56. \* The term of the last entry in a log, or 0 if the log is empty.
  57. GetTerm(xlog, index) == IF index = 0 THEN 0 ELSE xlog[index].term
  58. LogTerm(i, index) == GetTerm(log[i], index)
  59. LastTerm(xlog) == GetTerm(xlog, Len(xlog))
  60.  
  61. \* Return the minimum value from a set, or undefined if the set is empty.
  62. Min(s) == CHOOSE x \in s : \A y \in s : x <= y
  63. \* Return the maximum value from a set, or undefined if the set is empty.
  64. Max(s) == CHOOSE x \in s : \A y \in s : x >= y
  65.  
  66. ----
  67. \* Define initial values for all variables
  68.  
  69. InitServerVars == /\ globalCurrentTerm = 0
  70. /\ state = [i \in Server |-> Follower]
  71. /\ commitPoint = [i \in Server |-> [term |-> 0, index |-> 0]]
  72. /\ syncSource = [i \in Server |-> Nil]
  73. InitLogVars == /\ log = [i \in Server |-> << >>]
  74. Init == /\ InitServerVars
  75. /\ InitLogVars
  76.  
  77. ----
  78. \* Message handlers
  79. \* i = recipient, j = sender, m = message
  80.  
  81. \* Is log[i] a prefix of log[j].
  82. IsLogPrefix(i, j) ==
  83. /\ Len(log[i]) < Len(log[j])
  84. /\ LastTerm(log[i]) = LogTerm(j, Len(log[i]))
  85.  
  86. AppendOplog(i, j) ==
  87. \* /\ state[i] = Follower \* Disable primary catchup and draining
  88. /\ syncSource[i] = j \* Optionally enable sync source selection rules.
  89. /\ Len(log[i]) < Len(log[j])
  90. /\ LastTerm(log[i]) = LogTerm(j, Len(log[i]))
  91. /\ log' = [log EXCEPT ![i] = Append(log[i], log[j][Len(log[i]) + 1])]
  92. /\ UNCHANGED <<serverVars>>
  93.  
  94. CanRollbackOplog(i, j) ==
  95. /\ Len(log[i]) > 0
  96. /\ \* The log with later term is more up-to-date
  97. LastTerm(log[i]) < LastTerm(log[j])
  98. /\
  99. \/ Len(log[i]) > Len(log[j])
  100. \* There seems no short-cut of OR clauses, so I have to specify the negative case
  101. \/ /\ Len(log[i]) <= Len(log[j])
  102. /\ LastTerm(log[i]) /= LogTerm(j, Len(log[i]))
  103.  
  104. RollbackOplog(i, j) ==
  105. /\ CanRollbackOplog(i, j)
  106. \* Rollback 1 oplog entry
  107. /\ LET new == [index2 \in 1..(Len(log[i]) - 1) |-> log[i][index2]]
  108. IN log' = [log EXCEPT ![i] = new]
  109. \* For any nodes that are currently syncing from you, clear their sync sources.
  110. /\ syncSource' = [s \in Server |-> IF syncSource[s] = i THEN Nil ELSE syncSource[s]]
  111. /\ UNCHANGED <<electionVars, commitPoint>>
  112.  
  113. \* The set of nodes that has log[me][logIndex] in their oplog
  114. Agree(me, logIndex) ==
  115. { node \in Server :
  116. /\ Len(log[node]) >= logIndex
  117. /\ LogTerm(me, logIndex) = LogTerm(node, logIndex) }
  118.  
  119. IsCommitted(me, logIndex) ==
  120. /\ Agree(me, logIndex) \in Quorum
  121. \* If we comment out the following line, a replicated log entry from old primary will voilate the safety.
  122. \* [ P (2), S (), S ()]
  123. \* [ S (2), S (), P (3)]
  124. \* [ S (2), S (2), P (3)] !!! the log from term 2 shouldn't be considered as committed.
  125. /\ LogTerm(me, logIndex) = globalCurrentTerm
  126.  
  127. \* RollbackCommitted and NeverRollbackCommitted are not actions.
  128. \* They are used for verification.
  129. RollbackCommitted(i) ==
  130. \E j \in Server:
  131. /\ CanRollbackOplog(i, j)
  132. /\ IsCommitted(i, Len(log[i]))
  133.  
  134. NeverRollbackCommitted ==
  135. \A i \in Server: ~RollbackCommitted(i)
  136.  
  137. \* ACTION
  138. \* i = the new primary node.
  139. BecomePrimaryByMagic(i) ==
  140. LET notBehind(me, j) ==
  141. \/ LastTerm(log[me]) > LastTerm(log[j])
  142. \/ /\ LastTerm(log[me]) = LastTerm(log[j])
  143. /\ Len(log[me]) >= Len(log[j])
  144. ayeVoters(me) ==
  145. { index \in Server : notBehind(me, index) }
  146. IN /\ ayeVoters(i) \in Quorum
  147. /\ state' = [index \in Server |-> IF index = i THEN Leader ELSE Follower]
  148. /\ globalCurrentTerm' = globalCurrentTerm + 1
  149. /\ syncSource' = [syncSource EXCEPT ![i] = Nil] \* clear sync source.
  150. /\ UNCHANGED <<commitPoint, logVars>>
  151.  
  152. \* ACTION
  153. \* Leader i receives a client request to add v to the log.
  154. ClientWrite(i) ==
  155. /\ state[i] = Leader
  156. /\ LET entry == [term |-> globalCurrentTerm]
  157. newLog == Append(log[i], entry)
  158. IN log' = [log EXCEPT ![i] = newLog]
  159. /\ UNCHANGED <<serverVars>>
  160.  
  161. \* ACTION
  162. AdvanceCommitPoint ==
  163. \E leader \in Server :
  164. /\ state[leader] = Leader
  165. /\ IsCommitted(leader, Len(log[leader]))
  166. /\ commitPoint' = [commitPoint EXCEPT ![leader] = [term |-> LastTerm(log[leader]), index |-> Len(log[leader])]]
  167. /\ UNCHANGED <<electionVars, logVars, syncSource>>
  168.  
  169. \* Return whether Node i can learn the commit point from Node j.
  170. CommitPointLessThan(i, j) ==
  171. \/ commitPoint[i].term < commitPoint[j].term
  172. \/ /\ commitPoint[i].term = commitPoint[j].term
  173. /\ commitPoint[i].index < commitPoint[j].index
  174.  
  175. \* ACTION
  176. \* Node i learns the commit point from j via heartbeat.
  177. LearnCommitPoint(i, j) ==
  178. /\ CommitPointLessThan(i, j)
  179. /\ commitPoint' = [commitPoint EXCEPT ![i] = commitPoint[j]]
  180. /\ UNCHANGED <<electionVars, logVars, syncSource>>
  181.  
  182. \* ACTION
  183. \* Node i learns the commit point from j via heartbeat with term check
  184. LearnCommitPointWithTermCheck(i, j) ==
  185. /\ LastTerm(log[i]) = commitPoint[j].term
  186. /\ LearnCommitPoint(i, j)
  187.  
  188. \* ACTION
  189. LearnCommitPointFromSyncSource(i, j) ==
  190. /\ ENABLED AppendOplog(i, j)
  191. /\ LearnCommitPoint(i, j)
  192.  
  193. \* ACTION
  194. LearnCommitPointFromSyncSourceNeverBeyondLastApplied(i, j) ==
  195. \* From sync source
  196. /\ ENABLED AppendOplog(i, j)
  197. /\ CommitPointLessThan(i, j)
  198. \* Never beyond last applied
  199. /\ LET myCommitPoint ==
  200. \* If they have the same term, commit point can be ahead.
  201. IF commitPoint[j].term <= LastTerm(log[i])
  202. THEN commitPoint[j]
  203. ELSE [term |-> LastTerm(log[i]), index |-> Len(log[i])]
  204. IN commitPoint' = [commitPoint EXCEPT ![i] = myCommitPoint]
  205. /\ UNCHANGED <<electionVars, logVars>>
  206.  
  207. \* ACTION
  208. AppendEntryAndLearnCommitPointFromSyncSource(i, j) ==
  209. \* Append entry
  210. /\ syncSource[i] = j \* Optionally enable sync source selection rules.
  211. /\ Len(log[i]) < Len(log[j])
  212. /\ LastTerm(log[i]) = LogTerm(j, Len(log[i]))
  213. /\ log' = [log EXCEPT ![i] = Append(log[i], log[j][Len(log[i]) + 1])]
  214. \* Learn commit point
  215. /\ CommitPointLessThan(i, j)
  216. /\ commitPoint' = [commitPoint EXCEPT ![i] = commitPoint[j]]
  217. /\ UNCHANGED <<electionVars, syncSource>>
  218.  
  219. \* ACTION
  220. \* Server i chooses server j as its new sync source.
  221. \* i can choose j as a sync source if log[i] is a prefix of log[j] and
  222. \* log[j] is longer than log[i].
  223. ChooseNewSyncSource(i, j) ==
  224. /\ \/ IsLogPrefix(i, j)
  225. \* If logs are equal, allow choosing sync source if it has a newer commit point.
  226. \/ /\ log[i] = log[j]
  227. /\ commitPoint[j].index > commitPoint[i].index
  228. /\ state[i] = Follower \* leaders don't need to sync oplog entries.
  229. /\ syncSource' = [syncSource EXCEPT ![i] = j]
  230. /\ UNCHANGED <<electionVars, logVars, commitPoint>>
  231.  
  232. \* Does a 2 node sync source cycle exist?
  233. SyncSourceCycleTwoNode ==
  234. \E s, t \in Server :
  235. /\ s # t
  236. /\ syncSource[s] = t
  237. /\ syncSource[t] = s
  238.  
  239. \* The set of all sequences with elements in set 's', of maximum length 'n'.
  240. BoundedSeq(s, n) == [1..n -> s]
  241.  
  242. \* The set of all paths in the graph that consists of edges <<s,t>> where s has t as a sync
  243. \* source.
  244. SyncSourcePaths ==
  245. {p \in BoundedSeq(Server, Cardinality(Server)) :
  246. \A i \in 1..(Len(p)-1) : syncSource[p[i]] = p[i+1]}
  247.  
  248. \* Is there a non-trivial path in the sync source graph from node i to node j?
  249. \* This rules out trivial paths i.e. those of length 1.
  250. SyncSourcePath(i, j) ==
  251. \E p \in SyncSourcePaths :
  252. /\ Len(p) > 1
  253. /\ p[1] = i \* the source node.
  254. /\ p[Len(p)]=j \* the target node.
  255.  
  256. \* Does a general (possibly multi-node) sync source cycle exist?
  257. SyncSourceCycle ==
  258. \E s \in Server : SyncSourcePath(s, s)
  259.  
  260. ----
  261. AppendOplogAction ==
  262. \E i,j \in Server : AppendOplog(i, j)
  263.  
  264. RollbackOplogAction ==
  265. \E i,j \in Server : RollbackOplog(i, j)
  266.  
  267. BecomePrimaryByMagicAction ==
  268. \E i \in Server : BecomePrimaryByMagic(i)
  269.  
  270. ClientWriteAction ==
  271. \E i \in Server : ClientWrite(i)
  272.  
  273. LearnCommitPointAction ==
  274. \E i, j \in Server : LearnCommitPoint(i, j)
  275.  
  276. LearnCommitPointWithTermCheckAction ==
  277. \E i, j \in Server : LearnCommitPointWithTermCheck(i, j)
  278.  
  279. LearnCommitPointFromSyncSourceAction ==
  280. \E i, j \in Server : LearnCommitPointFromSyncSource(i, j)
  281.  
  282. LearnCommitPointFromSyncSourceNeverBeyondLastAppliedAction ==
  283. \E i, j \in Server : LearnCommitPointFromSyncSourceNeverBeyondLastApplied(i, j)
  284.  
  285. AppendEntryAndLearnCommitPointFromSyncSourceAction ==
  286. \E i, j \in Server : AppendEntryAndLearnCommitPointFromSyncSource(i, j)
  287.  
  288. ChooseNewSyncSourceAction ==
  289. \E i, j \in Server : ChooseNewSyncSource(i, j)
  290.  
  291. ----
  292. \* Properties to check
  293.  
  294. RollbackBeforeCommitPoint(i) ==
  295. /\ \E j \in Server:
  296. /\ CanRollbackOplog(i, j)
  297. /\ \/ LastTerm(log[i]) < commitPoint[i].term
  298. \/ /\ LastTerm(log[i]) = commitPoint[i].term
  299. /\ Len(log[i]) <= commitPoint[i].index
  300. \* todo: clean up
  301.  
  302. NeverRollbackBeforeCommitPoint == \A i \in Server: ~RollbackBeforeCommitPoint(i)
  303.  
  304. \* Liveness check
  305.  
  306. \* This isn't accurate for any infinite behavior specified by Spec, but it's fine
  307. \* for any finite behavior with the liveness we can check with the model checker.
  308. \* This is to check at any time, if two nodes' commit points are not the same, they
  309. \* will be the same eventually.
  310. \* This is checked after all possible rollback is done.
  311. CommitPointEventuallyPropagates ==
  312. /\ \A i, j \in Server:
  313. [](commitPoint[i] # commitPoint[j] ~>
  314. <>(~ENABLED RollbackOplogAction => commitPoint[i] = commitPoint[j]))
  315.  
  316. ----
  317. \* Defines how the variables may transition.
  318. Next ==
  319. \* --- Replication protocol
  320. \/ AppendOplogAction
  321. \/ RollbackOplogAction
  322. \/ BecomePrimaryByMagicAction
  323. \/ ClientWriteAction
  324. \/ ChooseNewSyncSourceAction
  325. \*
  326. \* --- Commit point learning protocol
  327. \/ AdvanceCommitPoint
  328. \* \/ LearnCommitPointAction
  329. \/ LearnCommitPointFromSyncSourceAction
  330. \* \/ AppendEntryAndLearnCommitPointFromSyncSourceAction
  331. \* \/ LearnCommitPointWithTermCheckAction
  332. \* \/ LearnCommitPointFromSyncSourceNeverBeyondLastAppliedAction
  333.  
  334. Liveness ==
  335. /\ SF_vars(AppendOplogAction)
  336. /\ SF_vars(RollbackOplogAction)
  337. \* A new primary should eventually write one entry.
  338. /\ WF_vars(\E i \in Server : LastTerm(log[i]) # globalCurrentTerm /\ ClientWrite(i))
  339. \* /\ WF_vars(ClientWriteAction)
  340. \*
  341. \* --- Commit point learning protocol
  342. /\ WF_vars(AdvanceCommitPoint)
  343. \* /\ WF_vars(LearnCommitPointAction)
  344. /\ SF_vars(LearnCommitPointFromSyncSourceAction)
  345. \* /\ SF_vars(AppendEntryAndLearnCommitPointFromSyncSourceAction)
  346. \* /\ SF_vars(LearnCommitPointWithTermCheckAction)
  347. \* /\ SF_vars(LearnCommitPointFromSyncSourceNeverBeyondLastAppliedAction)
  348.  
  349. \* The specification must start with the initial state and transition according
  350. \* to Next.
  351. Spec == Init /\ [][Next]_vars /\ Liveness
  352.  
  353.  
  354. \*
  355. \*
  356. \* TRACE ANIMATION.
  357. \*
  358. \*
  359.  
  360. Pick(S) == CHOOSE x \in S : TRUE
  361.  
  362. RECURSIVE SetToSeq(_)
  363. SetToSeq(S) == IF S = {} THEN <<>>
  364. ELSE LET v == Pick(S) IN <<v>> \o SetToSeq(S \ {v})
  365.  
  366. Injective(f) == \A x, y \in DOMAIN f : f[x] = f[y] => x = y
  367.  
  368.  
  369. ServerId == CHOOSE f \in [Server -> 1..Cardinality(Server)] : Injective(f)
  370.  
  371. \* This animation assumes 4 servers.
  372. Base == [x |-> 50, y |-> 50]
  373. ServerPositions ==
  374. 1 :> [x |-> Base.x, y |-> Base.y] @@
  375. 2 :> [x |-> Base.x + 200, y |-> Base.y + 0] @@
  376. 3 :> [x |-> Base.x + 200, y |-> Base.y + 150] @@
  377. 4 :> [x |-> Base.x + 0, y |-> Base.y + 150]
  378.  
  379.  
  380. \* Colors to represent the state of a server.
  381. StateColor ==
  382. Leader :> "green" @@
  383. Candidate :> "yellow" @@
  384. Follower :> "darkgray"
  385.  
  386. \* Generates a graphic element representing server 'sid'.
  387. ServerElem(sid) ==
  388. LET pos == ServerPositions[ServerId[sid]]
  389. \* circleAttrs == ("stroke" :> "black" @@ "stroke-width" :> "1" @@ "fill" :> "blue")
  390. circleAttrs == [fill |-> StateColor[state[sid]]]
  391. circle == Circle(pos.x, pos.y, 12, circleAttrs) IN
  392. Group(<<circle>>, <<>>)
  393.  
  394.  
  395. ServerElems == {Group(<<ServerElem(s)>>, <<>>) : s \in Server}
  396.  
  397. \* Linearly interpolate a position halfway between two given points. 't', the interpolation factor, is measured
  398. \* on a scale from 0 to 100.
  399. Interp(p1, p2, t) ==
  400. [ x |-> (t * p1.x + (100-t) * p2.x) \div 100 ,
  401. y |-> (t * p1.y + (100-t) * p2.y) \div 100]
  402.  
  403.  
  404. \* The position of a log slot for a server and log index.
  405. LogEntryDims == [w |-> 22, h |-> 22]
  406. LogPos(s, i) == [x |-> ServerPositions[ServerId[s]].x + 40,
  407. y |-> ServerPositions[ServerId[s]].y + 20 - (i * (LogEntryDims.w + 3))]
  408.  
  409. \* Generate a single log entry element at index 'i' for server 'sid'.
  410. LogEntryElem(sid, i) ==
  411. LET pos == LogPos(sid, i)
  412. entry == Rect(pos.x, pos.y, LogEntryDims.w, LogEntryDims.h, [fill |-> "lightblue"])
  413. term == Text((pos.x + LogEntryDims.w \div 2 - 6), (pos.y + LogEntryDims.h \div 2 + 8), ToString(log[sid][i].term), <<>>) IN
  414. Group(<<entry, term>>, <<>>)
  415.  
  416. LogElem(sid) ==
  417. LET entryElems == {LogEntryElem(sid, i) : i \in DOMAIN log[sid]} IN
  418. Group(SetToSeq(entryElems), <<>>)
  419.  
  420. LogElems == {LogElem(s) : s \in Server}
  421.  
  422. \* Log Slot Elements
  423. MaxLogSlots == 2
  424. LogSlot(sid, i) == Rect(LogPos(sid, i).x, LogPos(sid, i).y, LogEntryDims.w + 2, LogEntryDims.h, [stroke |-> (IF commitPoint[sid].index = i THEN "red" ELSE "black"), fill |-> "lightgray"])
  425. LogSlotElem(sid) ==
  426. LET slotElems == {LogSlot(sid, i) : i \in 1..MaxLogSlots} IN
  427. Group(SetToSeq(slotElems), <<>>)
  428. LogSlotElems == {LogSlotElem(s) : s \in Server}
  429.  
  430.  
  431. SyncSource(s) ==
  432. LET src == ServerPositions[ServerId[s]]
  433. target == ServerPositions[ServerId[syncSource[s]]]
  434. line == Line(src.x, src.y, target.x, target.y, [stroke |-> "black"])
  435. headP == Interp(src, target, 5)
  436. head == Circle(headP.x, headP.y, 3, [fill |-> "orange"]) IN
  437. Group(<<line, head>>, <<>>)
  438.  
  439. SyncSourceElems == {SyncSource(i) : i \in {s \in Server : syncSource[s] # Nil}}
  440.  
  441. \*AllElems == SetToSeq(LogElems) \o SetToSeq(ServerElems)
  442. AllElems == SetToSeq(ServerElems)
  443. \o SetToSeq(SyncSourceElems)
  444. \o SetToSeq(LogSlotElems)
  445. \o SetToSeq(LogElems)
  446.  
  447.  
  448. View == Group(AllElems, <<>>)
  449.  
  450.  
  451. ===============================================================================
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement