- Chubby
- * lock service
- * file system (for small files)
- * limited metadata
- only one chunkmaster in a given GFS (but there can be replicas of the data)
- Example: GFS Chunkmaster (details)
- * open a path: lockservice / the particular cluster you want to be master within / the chunkmaster task you want the lock for
- * try to get the lock: if you succeed, you are the chunkmaster; make the path contents point to you so others know who the chunkmaster is now (sketched below)
- * Chubby stores a little bit of data on each path (e.g. port number of chunkmaster, etc.)
- * would also have the address of the chunkmaster
- * uses advisory locks: up to the developer to acquire and release locks correctly, Chubby does NOT handle this
- * why? the locks granted by Chubby are used by third parties that Chubby has no control over; Chubby just responds to requests
- * Chubby is oblivious to what some client decides to do with a lock once it has it
- * the Chubby service is not sitting within the service in question
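A minimal sketch of this election flow, using a toy in-process LockService as a stand-in for a real Chubby cell (the real service is a separate replicated system reached over RPC; all names and paths here are illustrative):

    # Toy stand-in for Chubby: advisory locks plus small file contents.
    class LockService:
        def __init__(self):
            self.holders = {}    # path -> current lock holder (None if free)
            self.contents = {}   # path -> small file body (e.g. master address)

        def try_acquire(self, path, who):
            # Advisory lock: granted only if nobody holds it; it is up to
            # callers to respect the lock, the service does not enforce it.
            if self.holders.get(path) is None:
                self.holders[path] = who
                return True
            return False

        def set_contents(self, path, data):
            self.contents[path] = data

        def get_contents(self, path):
            return self.contents.get(path)

    def elect(service, my_addr):
        path = "/ls/gfs-cell/chunkmaster"   # illustrative lock-file path
        if service.try_acquire(path, my_addr):
            # We won: record our address in the file so clients can
            # discover the chunkmaster by reading the same path.
            service.set_contents(path, my_addr)
            return "master"
        # Lost the race: the file body names the current master.
        return "replica, master is " + service.get_contents(path)

    svc = LockService()
    print(elect(svc, "host-a:7777"))   # -> master
    print(elect(svc, "host-b:7777"))   # -> replica, master is host-a:7777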
- Above Example Theory Explained
- * one way Chubby is used is in the GFS chunkmaster selection
- * when GFS goes online, there is a cluster of chunkservers
- * all chunkservers are capable of becoming master, but we need to choose exactly one
- * to choose this master, all chunkservers try to grab the lock for becoming master from Chubby
- * whichever one actually gets the lock becomes the master
- * this method makes it easy for developers, as Chubby does all of the distributed coordination through Paxos internally to determine who should get the lock
- * developers simply see whether they got the lock or not
- Why This Interface?
- * why not a library that implements Paxos?
- * Paxos is slow, and developers don't know how to use Paxos (they think they know how to use locks, though)
- * need to export result of coordination outside of system
- * clients need to discover GFS master
- Chubby Design
- * a Chubby cell is 5 servers and can tolerate only 2 failures (2F + 1 rule: with F = 2, 2F + 1 = 5)
- * each of these 5 servers runs a copy/replica of the Chubby service
- * slightly confusing: Paxos is used at two levels. The replicas use Paxos to implement a fault-tolerant log, and each replica runs Chubby on top of that log; the master in the Chubby cell is the one that decides which client gets a lock, and it stays coordinated with the replicas through Paxos
- Read in Paxos-based RSM
- * for every key, every replica stores (value, version)
- * return latest version out of majority of replicas
- * why do different replicas have different versions?
- * gap between accept and learn
- * read happened between accept and learn so old version seemed like latest
- * how to ensure linearizability then?
- * a read must see the effect of all accepted writes
- * Paxos replicated log: every replica executes command in a slot only after executing commands in all prior slots
- * we need to get the read accepted into one of the slots in the replicated log! (then every read returns the same thing, because everything before the read's slot is guaranteed to have been executed; see the sketch below)
- * problem: poor performance at scale
- * this is why Chubby is used for coarse-grained locks (locks held for days rather than seconds)
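A toy sketch of the read-through-the-log idea above: each replica executes slots strictly in order, and a read occupies a slot of its own, so it is guaranteed to see every write chosen before it (consensus itself is elided; the slot numbers stand in for Paxos's output):

    # Toy replicated-log replica: commands (writes and reads) occupy
    # numbered slots; slot i may only execute after all slots < i.
    class Replica:
        def __init__(self):
            self.store = {}      # key -> (value, version)
            self.log = {}        # slot -> command, as chosen by consensus
            self.next_slot = 0   # first unexecuted slot

        def learn(self, slot, command):
            self.log[slot] = command
            # Execute commands strictly in slot order; stall on gaps.
            while self.next_slot in self.log:
                op, key, *rest = self.log[self.next_slot]
                if op == "write":
                    _, version = self.store.get(key, (None, 0))
                    self.store[key] = (rest[0], version + 1)
                elif op == "read":
                    # Every earlier slot has executed, so this read
                    # reflects all writes ordered before it: linearizable.
                    print("slot %d: read %s = %s"
                          % (self.next_slot, key, self.store.get(key)))
                self.next_slot += 1

    r = Replica()
    r.learn(0, ("write", "x", "a"))
    r.learn(2, ("read", "x"))        # gap at slot 1: the read must wait
    r.learn(1, ("write", "x", "b"))  # fills the gap; the read now sees "b"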
- Read/Writes in Chubby
- * one of the 5 replicas chosen as the master
- * the Paxos run between the master and the other replicas is leader-based Paxos, which makes it easy to skip the prepare phase and go straight to the accept phase (sketched below)
- * how to handle master failure?
- * another replica must first propose itself as the master
- * new master must first "catch up" (may not have been aware of certain operations in some of the slots)
- * master is the performance bottleneck
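A sketch of why a stable master gets to skip the prepare phase: one successful prepare establishes its ballot with a majority, and after that each command needs only the accept round, until some replica sees a higher ballot (message shapes and names here are illustrative; real Multi-Paxos has much more bookkeeping):

    # Toy Multi-Paxos leader: phase 1 (prepare) once, then phase 2
    # (accept) per command while the ballot stays highest.
    class Acceptor:
        def __init__(self):
            self.promised = -1
            self.accepted = {}   # slot -> (ballot, command)

        def prepare(self, ballot):
            if ballot > self.promised:
                self.promised = ballot
                return self.accepted   # promise, with prior accepted values
            return None                # reject: promised a higher ballot

        def accept(self, ballot, slot, command):
            if ballot >= self.promised:
                self.promised = ballot
                self.accepted[slot] = (ballot, command)
                return True
            return False

    class Leader:
        def __init__(self, ballot, acceptors):
            self.ballot, self.acceptors = ballot, acceptors
            self.prepared = False
            self.next_slot = 0

        def submit(self, command):
            majority = len(self.acceptors) // 2 + 1
            if not self.prepared:
                # Phase 1, once: get a majority to promise our ballot.
                promises = [a.prepare(self.ballot) for a in self.acceptors]
                if sum(p is not None for p in promises) < majority:
                    raise RuntimeError("lost leadership: higher ballot seen")
                self.prepared = True
            slot, self.next_slot = self.next_slot, self.next_slot + 1
            # Phase 2 only, per command.
            acks = [a.accept(self.ballot, slot, command) for a in self.acceptors]
            return sum(acks) >= majority   # chosen by a majority?

    cell = [Acceptor() for _ in range(5)]
    leader = Leader(ballot=1, acceptors=cell)
    print(leader.submit("grant lock to client-1"))   # prepare + accept
    print(leader.submit("revoke lock"))              # accept only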
- Scaling Chubby with Caches
- * clients cache data they read
- * subsequent reads don't have to go through Chubby that way
- * reading from local cache violates linearizability
- * how to fix this?
- * master invalidates cached copies upon update
- * this is tricky: what if someone reads while all the invalidations are still happening?
- * master will have to tell all these reads not to cache
- * master must store knowledge of client cache
- * what if master fails?
- * a new master is selected -> if it knows all nodes in the system, it can broadcast to all of them saying "invalidate everything"
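A toy sketch of this invalidation scheme: the master remembers which clients cache each path, invalidates those copies before a write completes, and refuses to let new reads cache a path while its invalidations are in flight (all names are illustrative):

    # Toy master-side cache invalidation.
    class Client:
        def __init__(self):
            self.cache = {}

        def invalidate(self, path):
            self.cache.pop(path, None)

    class Master:
        def __init__(self):
            self.data = {}
            self.cachers = {}          # path -> set of clients caching it
            self.invalidating = set()  # paths with invalidations in flight

        def read(self, client, path):
            value = self.data.get(path)
            if path in self.invalidating:
                # Update in progress: serve the read but forbid caching,
                # so the client cannot hold a soon-to-be-stale copy.
                return value
            self.cachers.setdefault(path, set()).add(client)
            client.cache[path] = value   # client may cache this copy
            return value

        def write(self, path, value):
            # Invalidate every cached copy before the write completes.
            self.invalidating.add(path)
            for c in self.cachers.pop(path, set()):
                c.invalidate(path)       # in reality: an RPC, wait for the ack
            self.data[path] = value
            self.invalidating.discard(path)

    m, c1 = Master(), Client()
    m.write("/k", 1)
    m.read(c1, "/k")    # c1 now caches /k locally
    m.write("/k", 2)    # master invalidates c1's copy first
    print(c1.cache)     # {} : the stale copy is gone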
- Scaling Chubby with Proxies
- * proxy can serve requests using its cache
- * they just cache state essentially
- * proxies can't handle writes, client would have to talk to Chubby cluster directly
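A small sketch of that split: the proxy absorbs reads with its cache, while writes go from the client straight to the cell, which keeps the proxy's cache coherent by invalidation (to the master, a proxy looks like just another caching client; all names illustrative):

    # Toy Chubby proxy: serves reads from a cache, never handles writes.
    class Cell:                       # stand-in for the real Chubby cell
        def __init__(self):
            self.data = {}
            self.proxies = []         # proxies caching our data

        def read(self, path):
            return self.data.get(path)

        def write(self, path, value):
            # Writes come straight to the cell; invalidate proxy copies
            # first, exactly as with ordinary client caches.
            for p in self.proxies:
                p.cache.pop(path, None)
            self.data[path] = value

    class Proxy:
        def __init__(self, cell):
            self.cell = cell
            self.cache = {}
            cell.proxies.append(self)

        def read(self, path):
            if path not in self.cache:      # miss: fetch once from the cell
                self.cache[path] = self.cell.read(path)
            return self.cache[path]         # hit: the master never sees it

    cell = Cell()
    proxy = Proxy(cell)
    cell.write("/k", 1)        # clients write to the cell directly
    print(proxy.read("/k"))    # 1, fetched from the cell and cached
    print(proxy.read("/k"))    # 1, served from the proxy's cache
    cell.write("/k", 2)        # the write invalidates the proxy's copy
    print(proxy.read("/k"))    # 2, refetched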
- Handling Client Failures
- * what if a client acquires a lock and fails?
- * master will exchange heartbeat messages with clients that have locks
- * lock revoked upon client failure
- * problem?
- * network partition / delay is not a client failure
- * Chubby associates lock acquisitions with sequence numbers
- * can distinguish operations from previous lock holders
- * whenever a client that comes back up tries to contact a server, the server needs to check with the lock service that the client's lock (sequence number) is still valid
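A sketch of the sequence-number idea: each successful acquisition gets a strictly increasing number, and downstream servers reject requests carrying an older number than they have already seen, so a partitioned ex-holder cannot do damage (names illustrative):

    # Toy lock sequencer: acquisitions return increasing sequence
    # numbers; servers remember the highest seen and reject stale ones.
    class LockService:
        def __init__(self):
            self.seq = 0

        def acquire(self, lock):
            # (Real Chubby also checks the lock is free and revokes it
            # on missed heartbeats; both elided here.)
            self.seq += 1
            return self.seq   # the sequence number handed to the holder

    class FileServer:
        def __init__(self):
            self.highest_seen = 0

        def request(self, op, seq):
            if seq < self.highest_seen:
                return "rejected: stale holder (seq %d)" % seq
            self.highest_seen = seq
            return "ok: %s (seq %d)" % (op, seq)

    locks, fs = LockService(), FileServer()
    old = locks.acquire("chunkmaster")   # client A acquires, gets seq 1
    # A stalls behind a network partition; its lock is revoked and
    # client B acquires the same lock:
    new = locks.acquire("chunkmaster")   # B gets seq 2
    print(fs.request("write chunk", new))   # ok
    print(fs.request("write chunk", old))   # rejected: A's seq is stale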
- Zookeeper
- * open source coordination service
- * addresses the need to poll Chubby
- * e.g. in Chubby, if you cannot acquire a lock, you need to keep retrying
- * goal in zookeeper: wait-free coordination
- Watch Mechanism
- * clients can register to "watch" a file
- * Zookeeper notifies the client when the file is updated
- * example
- * try to acquire a lock by creating a file
- * if the file already exists, watch for updates
- * upon watch notification, try to re-acquire the lock
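A sketch of that create-or-watch loop using the kazoo ZooKeeper client (assuming an ensemble at 127.0.0.1:2181; the znode path is illustrative):

    import threading
    from kazoo.client import KazooClient
    from kazoo.exceptions import NodeExistsError

    zk = KazooClient(hosts="127.0.0.1:2181")
    zk.start()
    zk.ensure_path("/locks")

    def acquire(path="/locks/demo"):
        while True:
            try:
                # Ephemeral: ZooKeeper deletes the znode if we crash,
                # so the lock releases itself on client failure.
                zk.create(path, b"me", ephemeral=True)
                return                      # file created: lock is ours
            except NodeExistsError:
                freed = threading.Event()
                # Watch the existing znode; ZooKeeper fires the watch
                # when it changes (notably when the holder deletes it).
                if zk.exists(path, watch=lambda event: freed.set()):
                    freed.wait()            # wait for the notification...
                # ...then loop and try to re-acquire. Every waiter wakes
                # at once here, which is the herd effect discussed next.

    acquire()
    print("lock held")
    zk.delete("/locks/demo")                # release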
- Problem?
- * what if a bunch of clients are all watching, waiting to acquire a lock?
- * this is the herd effect: they all wake up on release, but only one of them is actually going to succeed
- * zookeeper keeps a hierarchy
- * it will only notify the client with the next highest sequence number that the lock is available to try for
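A sketch of that herd-free ordering with ephemeral sequential znodes (kazoo again; paths illustrative). Each contender creates a numbered node, the lowest number holds the lock, and everyone else watches only the node just ahead of it, so a release wakes exactly one waiter:

    import threading
    from kazoo.client import KazooClient

    zk = KazooClient(hosts="127.0.0.1:2181")
    zk.start()
    zk.ensure_path("/locks")

    def acquire():
        # Returns e.g. "/locks/lock-0000000007"; the numeric suffix
        # assigned by ZooKeeper orders the contenders.
        me = zk.create("/locks/lock-", ephemeral=True, sequence=True)
        my_name = me.split("/")[-1]
        while True:
            waiting = sorted(zk.get_children("/locks"))
            if my_name == waiting[0]:
                return me                  # lowest number: lock is ours
            # Watch only the node immediately ahead of us in line.
            ahead = waiting[waiting.index(my_name) - 1]
            gone = threading.Event()
            if zk.exists("/locks/" + ahead, watch=lambda e: gone.set()):
                gone.wait()                # exactly one waiter is notified
            # Predecessor vanished: loop and re-check our position.

    lock = acquire()
    print("holding", lock)
    zk.delete(lock)                        # release wakes only the next in line

(kazoo also ships a ready-made lock recipe, zk.Lock, implementing this pattern.)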
- Difference between invalidation (Chubby) and watch (ZooKeeper)?
- * Invalidation
- * only library receives a notification to update the cache
- * Watch
- * application receives notification
- * only application knows what it needs to do