Apr 22nd, 2018
VoltDB's development was based on actual research. The team focused on how computing has changed
in the 21st century and how that change can be leveraged to build a better system. How much faster can we go?

Memory was the first key insight, because it is getting cheaper while operational stores are
growing. Although memory is about 100 times faster than SSDs and 10,000 times faster than spinning
disks, in-memory databases weren't correspondingly faster. Regular RDBMSs rebuilt as in-memory
databases disappointed, with performance gains of less than 10 times.

What was holding those systems back? Research showed that traditional databases spent
less than 10% of their time doing actual work. Most of the time was spent in two places:

-Page buffer management and concurrency management.

Page Buffer Management
-The page buffer system assigns database records to fixed-size pages, organizes their placement,
and manages which pages are loaded into memory and which are on disk. Pages are tracked as dirty
or clean. This adds nothing to an in-memory system.

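To make the bookkeeping concrete, here is a minimal sketch of a page buffer. It is a toy model, not how any particular database implements it: the class name, the `PAGE_SIZE` value, the LRU eviction policy, and the dict standing in for disk are all assumptions made for illustration.

```python
from collections import OrderedDict

PAGE_SIZE = 4  # records per page; a hypothetical value for illustration


class BufferPool:
    """Toy page buffer: fixed-size pages, LRU eviction, dirty tracking."""

    def __init__(self, capacity):
        self.capacity = capacity      # max pages held in memory
        self.disk = {}                # page_id -> records (stand-in for disk)
        self.cache = OrderedDict()    # page_id -> records, kept in LRU order
        self.dirty = set()            # page_ids modified since last write-back
        self.disk_reads = 0           # page fetches that missed the cache
        self.disk_writes = 0          # dirty pages written back on eviction

    def _load(self, page_id):
        if page_id in self.cache:
            self.cache.move_to_end(page_id)   # mark as most recently used
            return self.cache[page_id]
        if len(self.cache) >= self.capacity:
            self._evict()
        self.disk_reads += 1
        page = list(self.disk.get(page_id, [None] * PAGE_SIZE))
        self.cache[page_id] = page
        return page

    def _evict(self):
        victim, page = self.cache.popitem(last=False)  # least recently used
        if victim in self.dirty:                       # dirty pages go back to disk
            self.disk[victim] = page
            self.disk_writes += 1
            self.dirty.discard(victim)

    def write(self, record_id, value):
        page_id, slot = divmod(record_id, PAGE_SIZE)
        self._load(page_id)[slot] = value
        self.dirty.add(page_id)

    def read(self, record_id):
        page_id, slot = divmod(record_id, PAGE_SIZE)
        return self._load(page_id)[slot]
```

Every read and write pays for the cache lookup, the LRU update, and the dirty tracking, even when the page is already in memory. That is the overhead an in-memory design can drop.
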
Concurrency Management
-A database must solve two concurrency problems:
---The logical problem: multiple user transactions operating concurrently must not conflict
and must read consistent data. This is achieved with high-level locks on tables and rows.
---The internal problem: the database's own data structures must be thread safe, which adds to the cost.

Horizontal scaling
-Beyond memory, the second major shift is the move to horizontal scaling. Many small machines
can be more effective and efficient than one large machine. This is where VoltDB is coming from.

VoltDB barely reads from disk. Most of the traditional disk workload is removed, and disk I/O
is almost 100% append-only stream writes. Even spinning disks can sustain high write throughput
when used this way. The system is never blocked on disk synchronization.

This is achieved with two mechanisms: background snapshots and logical logging. Background snapshots
are transactional: data is serialized to disk at a single logical point in time, cluster-wide,
and the snapshots don't block ongoing operational work.

Logical logging protects data that mutates between snapshots. A logical log of write operations
is streamed to disk. If the cluster fails, the most recent snapshot is loaded into memory and the
logical log is replayed. Because the log records the requested operation rather than its result,
it has an advantage over binary logs: disk I/O can begin even before the operational work has started.

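The snapshot-plus-replay recovery described above can be sketched as follows. This is a simplified, synchronous model under stated assumptions: the `Store` class, the `deposit` operation, and the in-memory lists standing in for disk are hypothetical, and a real snapshot would run in the background rather than inline.

```python
import copy
import json


def deposit(state, account, amount):
    """Hypothetical write operation used only for this sketch."""
    state[account] = state.get(account, 0) + amount


OPERATIONS = {"deposit": deposit}


class Store:
    def __init__(self):
        self.state = {}      # live in-memory data
        self.snapshot = {}   # stand-in for the last snapshot on disk
        self.log = []        # stand-in for the append-only log on disk

    def execute(self, op_name, *args):
        # Log the logical operation first (name and arguments, not changed bytes),
        # then apply it to the in-memory state.
        self.log.append(json.dumps([op_name, list(args)]))
        OPERATIONS[op_name](self.state, *args)

    def take_snapshot(self):
        # Capture the state at one logical point; older log entries are no longer needed.
        self.snapshot = copy.deepcopy(self.state)
        self.log.clear()

    def recover(self):
        # Load the most recent snapshot, then replay the logged operations in order.
        self.state = copy.deepcopy(self.snapshot)
        for entry in self.log:
            op_name, args = json.loads(entry)
            OPERATIONS[op_name](self.state, *args)
```

Note that `execute` appends to the log before the operation runs. That ordering is only possible because the log entry is the request itself, not the bytes the operation ends up changing.
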
One thread
-To go more than 10 times faster, the concurrency costs need to be eliminated. All data operations
in VoltDB are single threaded: each operation runs to completion before the next one starts.
Only simple data structures with no thread safety are used. Such a system is also much simpler to test and modify.

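As an illustration of the run-to-completion idea, here is a minimal sketch of a serial executor. The `SerialExecutor` class and the `increment` operation are hypothetical names, not part of VoltDB.

```python
from collections import deque


class SerialExecutor:
    """Runs queued operations one at a time on plain, unlocked data structures."""

    def __init__(self):
        self.data = {}          # ordinary dict: no locks, no thread-safe wrapper
        self.queue = deque()    # pending operations in arrival order

    def submit(self, operation, *args):
        self.queue.append((operation, args))

    def run(self):
        results = []
        while self.queue:
            operation, args = self.queue.popleft()
            # Each operation runs to completion before the next one starts.
            results.append(operation(self.data, *args))
        return results


def increment(data, key):
    """Read-modify-write that would need a lock if operations could interleave."""
    data[key] = data.get(key, 0) + 1
    return data[key]
```

Because nothing interleaves, the read-modify-write in `increment` is safe without any locking.
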
This choice is possible only with a memory-centric design, because shared-memory multithreading
is often what hides the latency of reading from and writing to disk. Run single threaded against
disk, that latency would leave the CPU idle most of the time.

No waiting on users
-VoltDB operations are full ACID transactions. For the single-threaded work to run continuously,
it is necessary to eliminate waiting on the user mid-transaction. There is no external transaction control;
stored procedures are used instead.

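The idea of a transaction arriving as a single request can be sketched like this. It is a generic model, not VoltDB's stored procedure API: `transfer`, `call_procedure`, and the copy-based rollback are assumptions chosen to keep the example short.

```python
import copy


class InsufficientFunds(Exception):
    pass


def transfer(accounts, source, target, amount):
    """Hypothetical procedure: all logic for the transaction lives server-side."""
    if accounts.get(source, 0) < amount:
        raise InsufficientFunds(source)
    accounts[source] -= amount
    accounts[target] = accounts.get(target, 0) + amount
    return accounts[source]


PROCEDURES = {"transfer": transfer}


def call_procedure(accounts, name, *params):
    """One request carries the procedure name and every parameter it needs."""
    before = copy.deepcopy(accounts)       # crude undo copy, for the sketch only
    try:
        return PROCEDURES[name](accounts, *params)
    except Exception:
        accounts.clear()
        accounts.update(before)            # roll back: all or nothing
        raise
```

The client never holds a transaction open between calls, so the pipeline has no point at which it could stall waiting for the user's next statement.
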
Concurrency through scheduling and not shared memory
-The concurrency problem was solved by doing one thing at a time in a single-threaded pipeline.
VoltDB can still scale to multiple machines. In the abstraction, each core is treated as a standalone
machine and gets its own single-threaded pipeline. The next step is to keep the pipelines full. This is
done by partitioning the data, which Erik will describe in the following slides. For example, a finance
app can be partitioned by customer region, so that each region's data is routed easily to the right pipeline.
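
Routing by partition key can be sketched as follows. The `PartitionedRouter` class and the CRC32-modulo hash are assumptions for illustration; the notes do not say how VoltDB maps a key to a partition.

```python
import zlib
from collections import deque


class PartitionedRouter:
    def __init__(self, partition_count):
        # One queue per partition; each stands in for one single-threaded pipeline.
        self.pipelines = [deque() for _ in range(partition_count)]

    def partition_for(self, key):
        # Stable hash, so the same key always maps to the same pipeline.
        return zlib.crc32(key.encode("utf-8")) % len(self.pipelines)

    def route(self, key, request):
        index = self.partition_for(key)
        self.pipelines[index].append((key, request))
        return index
```

All requests for one region land in the same queue and are processed in order by that pipeline, while other regions proceed independently on other cores.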