Chubby
* lock service
* file system (for small files)
  * limited metadata
* only one chunkmaster in a given GFS cluster (though the chunk data itself is replicated)

Example: GFS Chunkmaster (details)
* open a path: lock service / the particular cluster you want to be a master within / the chunkmaster lock you want to acquire
* try to get the lock: if you get it, you know you are the chunkmaster; make the path point to you so others know who the chunkmaster is now (see the sketch after this list)
* Chubby stores a little bit of data on each path (e.g. the port number of the chunkmaster)
  * it would also hold the address of the chunkmaster
* Chubby uses advisory locks: it is up to the developer to acquire and release locks correctly; Chubby does NOT enforce this
  * why? the locks granted by Chubby are used by third parties that Chubby has no control over; Chubby just responds to requests
  * Chubby is oblivious to what a client decides to do with a lock once it has it
  * the Chubby service does not sit inside the service in question

Above Example Theory Explained
* one way Chubby is used is GFS chunkmaster selection
* when GFS comes online, there is a cluster of chunkservers
  * all chunkservers are capable of becoming the master, but we need to choose exactly one
* to choose the master, all chunkservers try to grab the "become master" lock from Chubby
* whichever one actually gets the lock becomes the master
* this makes things easy for developers: Chubby does all of the distributed coordination (via Paxos internally) to determine who gets the lock
  * developers simply see whether they got the lock or not

Why This Interface?
* why not a library that implements Paxos?
  * Paxos is slow, and developers don't know how to use Paxos (they think they know how to use locks, though)
  * you also need to export the result of the coordination outside the system
    * e.g. clients need to discover the GFS master

Chubby Design
* a Chubby cell runs 5 servers, so it can tolerate only 2 failures (2F + 1 rule)
* each of these 5 servers runs a replica of the Chubby service
* slightly confusing: Paxos is used to implement a fault-tolerant log, and on top of that each replica runs Chubby, which decides which client gets a lock (the master of the Chubby cell makes that decision and stays coordinated with the replicas through the Paxos log)

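A tiny worked instance of the 2F + 1 rule mentioned above (the helper function is illustrative, not part of Chubby):

# With N replicas, a majority quorum tolerates F = (N - 1) // 2 failures, i.e. N = 2F + 1.
def failures_tolerated(n_replicas: int) -> int:
    return (n_replicas - 1) // 2

assert failures_tolerated(5) == 2   # a 5-server Chubby cell survives 2 failures
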
Read in Paxos-based RSM
* for every key, every replica stores (value, version)
* a read returns the latest version found among a majority of replicas (see the sketch after this list)
  * why do different replicas have different versions?
    * gap between accept and learn
    * a read that arrives between accept and learn can make an old version look like the latest
* how to ensure linearizability then?
  * a read must see the effect of all accepted writes
  * Paxos replicated log: every replica executes the command in a slot only after executing the commands in all prior slots
  * so we need to get the read itself accepted into one of the slots in the replicated log! (then every read returns the same thing, because everything before the read's slot is guaranteed to have been executed)
* problem: poor performance at scale
  * this is why Chubby is used for coarse-grained locks (locks held for days rather than seconds)

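A minimal sketch of the majority read described above, assuming a hypothetical replica.get(key) call that returns a (value, version) pair; as the notes say, this alone is not linearizable, because a write can be accepted but not yet learned everywhere when the read arrives:

# Hypothetical replica stubs; this shows the "latest version from a majority" read,
# not the slot-based linearizable read.
def majority_read(replicas, key):
    quorum = len(replicas) // 2 + 1
    best_value, best_version = None, -1
    for replica in replicas[:quorum]:            # contact any majority
        value, version = replica.get(key)        # hypothetical call returning (value, version)
        if version > best_version:
            best_value, best_version = value, version
    return best_value, best_version
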
Read/Writes in Chubby
* one of the 5 replicas is chosen as the master
* the Paxos the master runs with the other replicas is leader-based Paxos, which makes it easy to skip the prepare phase and go straight to the accept phase (see the sketch below)
* how to handle master failure?
  * another replica must first propose itself as the master
  * the new master must first "catch up" (it may not have been aware of the operations in some of the slots)
* the master is the performance bottleneck

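A minimal sketch (not Chubby's actual code) of why a stable leader can skip the prepare phase: having won prepare once for its ballot, it sends only accept messages for later log slots; replica.accept() here is a hypothetical stub:

class Leader:
    """Illustrative leader-based (Multi-)Paxos proposer."""
    def __init__(self, replicas, ballot):
        self.replicas = replicas        # hypothetical replica stubs
        self.ballot = ballot            # ballot already won in a prior prepare phase
        self.next_slot = 0

    def replicate(self, command) -> bool:
        slot, self.next_slot = self.next_slot, self.next_slot + 1
        # Phase 2 only: accept(ballot, slot, command) goes straight to the replicas.
        acks = sum(1 for r in self.replicas if r.accept(self.ballot, slot, command))
        return acks >= len(self.replicas) // 2 + 1   # chosen once a majority accepts
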
Scaling Chubby with Caches
* clients cache the data they read
  * subsequent reads don't have to go through Chubby that way
* but reading from a local cache violates linearizability
  * how to fix this?
    * the master invalidates cached copies upon an update (sketched below)
    * this is tricky: what if someone reads while the invalidations are in flight?
      * the master has to tell those readers not to cache
* the master must keep track of what each client may have cached
* what if the master fails?
  * a new master is selected -> if you know all the nodes in the system, broadcast to all of them telling them to invalidate everything

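A minimal sketch (hypothetical, not Chubby's real protocol) of the invalidate-on-update idea: the master remembers which clients cache each path and invalidates every cached copy before the write takes effect; client.invalidate() is an assumed RPC stub:

from collections import defaultdict

class CachingMaster:
    """Illustrative master that keeps client caches consistent by invalidation."""
    def __init__(self):
        self.data = {}
        self.cachers = defaultdict(set)     # path -> clients that may cache it

    def read(self, client, path):
        self.cachers[path].add(client)      # remember who may cache this path
        return self.data.get(path)

    def write(self, path, value):
        # Invalidate every cached copy before the new value becomes visible.
        for client in self.cachers.pop(path, set()):
            client.invalidate(path)         # hypothetical client RPC
        self.data[path] = value
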
Scaling Chubby with Proxies
* a proxy can serve requests using its cache
* proxies essentially just cache state
* proxies can't handle writes; the client would have to talk to the Chubby cluster directly

Handling Client Failures
* what if a client acquires a lock and then fails?
  * the master exchanges heartbeat messages with clients that hold locks
  * the lock is revoked upon client failure
* problem?
  * a network partition or delay is not a client failure
  * Chubby associates lock acquisitions with sequence numbers (see the sketch after this list)
    * servers can then distinguish operations coming from previous lock holders
    * whenever a client that comes back up contacts a replica, that replica needs to check with the lock service to confirm the lock is still valid
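
A minimal sketch of the sequence-number check, assuming a hypothetical lock_service.is_current() call and a request that carries the sequence number it was granted with:

class ResourceServer:
    """Illustrative server guarding a resource with lock sequence numbers."""
    def __init__(self, lock_service):
        self.lock_service = lock_service
        self.highest_seen = 0

    def handle(self, request):
        seq = request.lock_sequence_number            # granted along with the lock
        if seq < self.highest_seen:
            raise PermissionError("request from a previous lock holder")
        if not self.lock_service.is_current(request.lock_path, seq):  # hypothetical check
            raise PermissionError("lock is no longer held")
        self.highest_seen = seq
        # ... perform the requested operation ...
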
Zookeeper
* open-source coordination service
* addresses the need to poll in Chubby
  * e.g. if you cannot acquire a lock, you need to keep retrying
* goal in ZooKeeper: wait-free coordination

Watch Mechanism
* clients can register to "watch" a file
  * ZooKeeper notifies the client when the file is updated
* example (see the sketch after this list)
  * try to acquire a lock by creating a file
  * if the file already exists, watch it for updates
  * upon a watch notification, try to re-acquire the lock
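
A sketch of that naive create-then-watch recipe using the kazoo ZooKeeper client; the connection string and the "/my-lock" znode path are arbitrary examples:

import threading
from kazoo.client import KazooClient
from kazoo.exceptions import NodeExistsError

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

def acquire_naive_lock(path="/my-lock"):
    while True:
        try:
            # Ephemeral znode: it disappears automatically if this client dies.
            zk.create(path, b"", ephemeral=True)
            return                                    # we hold the lock
        except NodeExistsError:
            released = threading.Event()
            # The watch fires when the current holder's znode changes or is deleted.
            if zk.exists(path, watch=lambda event: released.set()):
                released.wait()                       # wait for the notification
            # then loop and try to create the znode again
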
Problem?
* what if a bunch of clients are all watching, waiting to acquire the lock?
  * this is the herd effect: they all wake up, but only one of them is actually going to succeed
* ZooKeeper's recipe keeps a hierarchy/ordering of waiters instead (sketched below)
  * only the client with the next sequence number in line is notified that the lock is available to try for
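
A sketch of that ordered recipe with kazoo: each client creates an ephemeral, sequential znode under a parent such as "/locks/demo" (an arbitrary example path) and watches only its immediate predecessor, so a release wakes exactly one waiter (kazoo also ships a ready-made version as zk.Lock):

import threading
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

def acquire_ordered_lock(parent="/locks/demo"):
    zk.ensure_path(parent)
    # Creates e.g. "/locks/demo/lock-0000000007"; ephemeral, so it vanishes on crash.
    me = zk.create(parent + "/lock-", b"", ephemeral=True, sequence=True)
    my_name = me.rsplit("/", 1)[-1]
    while True:
        waiters = sorted(zk.get_children(parent))
        if waiters[0] == my_name:
            return me                                 # lowest number -> we hold the lock
        predecessor = waiters[waiters.index(my_name) - 1]
        gone = threading.Event()
        # Watch only the znode just ahead of us, not the lock node itself.
        if zk.exists(parent + "/" + predecessor, watch=lambda event: gone.set()):
            gone.wait()
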
Difference between invalidation (Chubby) and watch (ZooKeeper)?
* Invalidation
  * only the client library receives the notification, and all it does is update (drop) the cache
* Watch
  * the application itself receives the notification
  * only the application knows what it needs to do in response