  1. Free Ekanayaka <free@ekanayaka.io> (Sat. 08:46) (sent)
  2. Subject: dqlite - SQLite replication and failover library
  3. To: sqlite-users@mailinglists.sqlite.org
  4. Date: Sat, 19 Aug 2017 08:46:59 +0000
  5.  
  6. Hi,
  7.  
  8. first of all let me thank Richard Hipp and the rest of the SQLite team
  9. and community for such a great piece of software. I've recently been
  10. working with SQLite's code base and found it an absolute pleasure to
  11. read (and have learned from it too).
  12.  
  13. In this mail I'd like to:
  14.  
  15. 1) Present dqlite, a library replicating your application's SQLite
  16. database across N nodes and safely surviving any minority of them
  17. dying or disconnecting (only for Go applications for now, see below).
  18.  
  19. 2) Submit a patch to SQLite that introduces a minimal replication API,
  20. and get feedback about its possible inclusion upstream.
  21.  
  22. = dqlite =
  23.  
  24. It's a Go package that uses the Raft algorithm to replicate SQLite WAL
  25. frames across a cluster of nodes. This roughly means that you can open a
  26. SQLite connection using the "database/sql" standard lib API and have
  27. anything you write transactionally replicated. No external process needed.
  28.  
  29. Ideally this library should have been written in C or Rust, to support
  30. binding to any language. However, due to the use case and timeline of
  31. the first project that will use it (LXD [0]), and due to the lack of
  32. mature Raft implementations in C/Rust, Go was chosen instead. It should
  33. hopefully at least serve as reference to anyone needing a C/Rust
  34. version.
  35.  
  36. The work has been funded by Canonical, the company behind Ubuntu. Please
  37. see dqlite's home page [1] for more details.
  38.  
  39. = SQLite replication API patch =
  40.  
  41. This is the SQLite patch that dqlite depends on. It essentially adds a
  42. few key hooks in the pager and write-ahead log to let external libraries
  43. like dqlite implement WAL-based database replication.
  44.  
  45. As you'll quickly see, it's by no means ready for upstream inclusion; in
  46. particular it lacks unit tests and more comprehensive documentation
  47. comments (note however that virtually every code path introduced by the
  48. patch is already exercised indirectly by dqlite's own unit tests).
  49.  
  50. If the SQLite team thinks there is room for upstream inclusion, I'll be
  51. more than glad to do the necessary work to make the patch adhere to
  52. SQLite's standards and go through a review process.
  53.  
  54. The patch currently has 703 additions and 22 deletions, and is published
  55. on GitHub [2].
  56.  
  57. Cheers,
  58.  
  59. Free
  60.  
  61. [0] https://linuxcontainers.org/
  62. [1] https://github.com/CanonicalLtd/dqlite
  63. [2] https://github.com/CanonicalLtd/sqlite/commit/2a9aa8b056f37ae05f38835182a2856ffc95aee4
  64. Wout Mertens <wout.mertens@gmail.com> (Sun. 17:50) (lists lists/sqlite-users replied)
  65. Subject: Re: [sqlite] dqlite - SQLite replication and failover library
  66. To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
  67. Date: Sun, 20 Aug 2017 17:50:24 +0000
  68.  
  69. Very interesting!
  70.  
  71. So how does it behave during conflict situations? Raft selects a winning
  72. WAL write and any others in flight are aborted?
  73.  
  74. And when not enough nodes are available, writes are hung until consensus?
  75.  
  76. I won't be able to use it due to Go but it's great to know that this is on
  77. the horizon of possibilities… Very nice!
  78.  
  145. _______________________________________________
  146. sqlite-users mailing list
  147. sqlite-users@mailinglists.sqlite.org
  148. http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
  149. Free Ekanayaka <free@ekanayaka.io> (Sun. 21:44) (replied sent)
  150. Subject: Re: [sqlite] dqlite - SQLite replication and failover library
  151. To: Wout Mertens <wout.mertens@gmail.com>, SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
  152. Date: Sun, 20 Aug 2017 21:44:44 +0000
  153.  
  154. Wout Mertens <wout.mertens@gmail.com> writes:
  155.  
  156. > Very interesting!
  157. > So how does it behave during conflict situations? Raft selects a winning
  158. > WAL write and any others in flight are aborted?
  159.  
  160. Ah yeah this is probably something that was not clear from the docs or
  161. from my presentation.
  162.  
  163. There can't be a conflict situation. Raft's model is that only the
  164. leader can append new log entries, which translated to dqlite means that
  165. only the leader can write new WAL frames. So this means that any attempt
  166. to perform a write transaction on a non-leader node will fail with a
  167. SQLITE_NOT_LEADER error (and in this case clients are supposed to retry
  168. against whoever is the new leader).
  169.  
  170. I'm going to add this to the FAQ.
  171.  
  172. > And when not enough nodes are available, writes are hung until
  173. > consensus?
  174.  
  175. Yes, but there's a (configurable) timeout. It's not possible to *not*
  176. have a timeout (although you can set it really, really high of course :)
  177.  
  178. > I won't be able to use it due to Go but it's great to know that this is on
  179. > the horizon of possibilities… Very nice!
  180.  
  181. Yeah I think Go is somewhat limiting, but hopefully once Raft libraries
  182. mature in C/Rust, dqlite can act as reference/prototype.
  183. Free Ekanayaka <free@ekanayaka.io> (Sun. 22:07) (sent)
  184. Subject: Re: [sqlite] dqlite - SQLite replication and failover library
  185. To: Wout Mertens <wout.mertens@gmail.com>, SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
  186. Date: Sun, 20 Aug 2017 22:07:27 +0000
  187.  
  188. Free Ekanayaka <free@ekanayaka.io> writes:
  189.  
  190. >> And when not enough nodes are available, writes are hung until
  191. >> consensus?
  192. >
  193. > Yes, but there's a (configurable) timeout.
  194.  
  195. BTW, this is a consequence of Raft sitting in the CP spectrum of the CAP
  196. theorem: in case of a network partition it chooses consistency and
  197. sacrifices availability.
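
The majority rule behind this can be worked through with a few lines of Go. This is only an arithmetic sketch of Raft's quorum requirement; `Quorum` and `CanCommit` are illustrative names, not dqlite functions.

```go
package main

import "fmt"

// Quorum returns the smallest majority of an n-node cluster.
func Quorum(n int) int { return n/2 + 1 }

// CanCommit reports whether a leader that can reach `reachable` nodes
// (counting itself) out of n is able to commit a write.
func CanCommit(n, reachable int) bool { return reachable >= Quorum(n) }

func main() {
	for _, n := range []int{3, 4, 5} {
		fmt.Printf("nodes=%d quorum=%d tolerated failures=%d\n",
			n, Quorum(n), n-Quorum(n))
	}
}
```

For example, with 5 nodes the quorum is 3, so the side of a partition that holds only 2 nodes cannot commit writes and has to wait (up to the timeout) or fail.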
  198. Wout Mertens <wout.mertens@gmail.com> (Sun. 22:11) (replied)
  199. Subject: Re: [sqlite] dqlite - SQLite replication and failover library
  200. To: Free Ekanayaka <free@ekanayaka.io>, SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
  201. Date: Sun, 20 Aug 2017 22:11:34 +0000
  202.  
  205. Oh I see, of course. So I assume the client library automatically sends
  206. write commands to the current leader?
  207.  
  208. I wonder if there is value in setting a preferred leader, but probably
  209. that's messing too much with the Raft protocol.
  210.  
  245. Free Ekanayaka <free@ekanayaka.io> (Sun. 23:28) (sent)
  246. Subject: Re: [sqlite] dqlite - SQLite replication and failover library
  247. To: Wout Mertens <wout.mertens@gmail.com>, SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
  248. Date: Sun, 20 Aug 2017 23:28:35 +0000
  249.  
  250. Wout Mertens <wout.mertens@gmail.com> writes:
  251.  
  252. > Oh I see, of course. So I assume the client library automatically sends
  253. > write commands to the current leader?
  254.  
  255. No, that's up to the application for now, the library just returns you an
  256. error if you attempt a write on a non-leader node.
  257.  
  258. > I wonder if there is value in setting a preferred leader, but probably
  259. > that's messing too much with the Raft protocol.
  260.  
  261. I'm not entirely sure I understand, but if you mean "if possible, I
  262. generally would like the leader to be this node, please", no that's
  263. currently not supported. I don't see a reason why it couldn't be added,
  264. but it seems a kind of exotic requirement in today's "cattle vs pets" way
  265. of thinking about nodes.