---- ALGORITHMS ----

>> C2: Describe how your cache replacement algorithm chooses a cache
>> block to evict.

Cached sectors are evicted by the "evict_sector" function. Our cache
replacement algorithm always evicts the least-recently-used (LRU) sector
from the buffer cache, skipping sectors with pending disk IOs, which must
never be evicted.

We cannot evict sectors with pending disk IOs (i.e. sectors with state
SECTOR_PENDING) because these sectors are either 1) in the process of being
read INTO the cache, or 2) in the process of being EVICTED from the cache.

Our eviction algorithm is made much simpler by the fact that we always keep
cached sectors in least-recently-used (LRU) order, with the most-recently-
used sector at the back of the list and the least-recently-used sector at
the front. Any time we touch (i.e. read or write) a cached sector, we
remove it and re-insert it at the back of the list, thereby maintaining LRU
ordering at all times.

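The list manipulation above can be sketched in user-space C. The struct and
function names below are illustrative stand-ins for Pintos's list, not our
actual kernel code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the cache's LRU list: a doubly linked list
   where the front is least-recently-used and the back is
   most-recently-used. */
struct sector {
    int sector_num;
    struct sector *prev, *next;
};

struct lru_list {
    struct sector *front, *back;
};

/* Unlink s from the list. */
static void lru_remove(struct lru_list *l, struct sector *s) {
    if (s->prev) s->prev->next = s->next; else l->front = s->next;
    if (s->next) s->next->prev = s->prev; else l->back = s->prev;
    s->prev = s->next = NULL;
}

/* Append s at the back (most-recently-used position). */
static void lru_push_back(struct lru_list *l, struct sector *s) {
    s->prev = l->back;
    s->next = NULL;
    if (l->back) l->back->next = s; else l->front = s;
    l->back = s;
}

/* On every read or write of a cached sector, move it to the back,
   keeping the list in LRU order at all times. */
static void lru_touch(struct lru_list *l, struct sector *s) {
    lru_remove(l, s);
    lru_push_back(l, s);
}
```
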
During eviction, we search for an evictable sector starting at the front of
the sector list and select the first sector with state != SECTOR_PENDING.
If the sector is dirty (i.e. it has been modified but not written back to
disk), we write it back to disk before returning it.

If every sector in the cache has a pending disk IO (i.e. state ==
SECTOR_PENDING), then we cannot evict a sector until at least one of them
finishes and changes its state. In this case, the evict_sector function
waits on the disk_io_finished condition, which is broadcast whenever a disk
IO finishes. The process then wakes up and attempts to find an evictable
sector once more.

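A single pass of this search can be modeled in plain C. The array-backed
cache and the find_evictable helper are illustrative; the real code scans
the LRU list and, on failure, waits on disk_io_finished and retries:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative states; only SECTOR_PENDING matters to eviction. */
enum sector_state { SECTOR_USED_CLEAN, SECTOR_USED_DIRTY, SECTOR_PENDING };

struct sector {
    int sector_num;
    enum sector_state state;
};

/* Scan from index 0 (the LRU end) and return the first sector that is
   not pending a disk IO, or NULL if every sector is pending.  In the
   real code a NULL result makes evict_sector wait on the
   disk_io_finished condition and then search again. */
static struct sector *find_evictable(struct sector *cache, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (cache[i].state != SECTOR_PENDING)
            return &cache[i];
    return NULL;  /* caller waits on disk_io_finished, then retries */
}
```
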
>> C3: Describe your implementation of write-behind.

When data is written to disk sectors via cache_block_write, we read the
sector into the cache (if it is not already there) and modify the cached
sector in memory. Changes are not immediately written to disk. We mark
these modified sectors as "dirty" by setting their state to
SECTOR_USED_DIRTY and keeping them in the cache. When a dirty sector is
evicted, it is written back to disk.

Additionally, when the buffer cache is initialized we create a new thread to
asynchronously flush dirty sectors to disk on a regular interval. This
thread sits in a while loop with a 30-second sleep between iterations. On
each iteration, the thread acquires the buffer_cache lock, increments the
time counter, and writes all dirty sectors in the buffer cache back to the
disk.

During flushing, we want to take a snapshot of the current buffer_cache
state and write it all to disk. A simple solution is to acquire the
buffer_cache lock, write all dirty sectors to disk one at a time, and then
release the lock. This ensures that other sectors cannot be dirtied or
evicted during the flush. However, it is sub-optimal because it requires
holding the buffer_cache lock for a long period of time, during which no
other buffer cache operations can proceed.

Therefore, we devised a better solution. We created a "time" counter which
starts at 0 and is incremented on each iteration of the
"write_behind_thread" loop. Any time a sector is dirtied, we set the
sector's "update_time" to the current time. We then allow flushing to
proceed asynchronously -- that is, we write dirtied sectors to disk one at
a time, taking care to release the buffer_cache lock before any blocking
disk operations (so that other buffer_cache operations are free to
proceed). After each disk operation finishes, we restart our search through
the buffer_cache, looking only for sectors with an update_time less than
the current time. This way we only flush sectors which have been dirty in
memory for more than 30 seconds.

When there are no more sectors with an update_time less than the current
time, we sleep for 30 seconds and loop again.

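The selection rule can be sketched as follows. The struct layout and the
next_to_flush helper are hypothetical stand-ins for our actual scan:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model of the flush scan: each iteration of the write_behind_thread
   loop increments `time`, and only dirty sectors whose update_time is
   less than the current time are flushed.  A sector dirtied during
   the current interval (update_time == now) is left until the next
   pass. */
struct sector {
    int sector_num;
    bool dirty;
    int update_time;    /* value of the time counter when last dirtied */
};

/* Return the next sector that should be flushed, or NULL when the
   pass is complete.  The real thread re-runs this search after every
   blocking disk write, since the cache may change while the
   buffer_cache lock is released. */
static struct sector *next_to_flush(struct sector *cache, size_t n, int now) {
    for (size_t i = 0; i < n; i++)
        if (cache[i].dirty && cache[i].update_time < now)
            return &cache[i];
    return NULL;
}
```
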
Lastly, when Pintos is halted, we also write all dirty sectors back to disk
in cache_flush_all. Since Pintos is shutting down anyway, we do not take any
special care to release the buffer_cache lock during blocking writes.


>> C4: Describe your implementation of read-ahead.

To make our implementation of read-ahead easier, we wrote a "monitor
queue", which is a thread-safe queue useful for producer-consumer
applications. It has "monitor_queue_put", "monitor_queue_get", and
"monitor_queue_size" functions (see the interface in
lib/kernel/monitor_queue.h). One thread can put items into the queue while
another thread takes items out of it.

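A user-space sketch of such a monitor queue, using a pthread mutex and
condition variable in place of Pintos's lock and condition. The node layout
and the int payload are simplifications; only the function names mirror the
interface:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct mq_node {
    int value;
    struct mq_node *next;
};

struct monitor_queue {
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
    struct mq_node *head, *tail;
    int size;
};

void monitor_queue_init(struct monitor_queue *q) {
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
    q->head = q->tail = NULL;
    q->size = 0;
}

/* Put never waits for space: the queue is unbounded. */
void monitor_queue_put(struct monitor_queue *q, int value) {
    struct mq_node *n = malloc(sizeof *n);
    n->value = value;
    n->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail) q->tail->next = n; else q->head = n;
    q->tail = n;
    q->size++;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
}

/* Get waits on the condition until an item is available. */
int monitor_queue_get(struct monitor_queue *q) {
    pthread_mutex_lock(&q->lock);
    while (q->head == NULL)
        pthread_cond_wait(&q->nonempty, &q->lock);
    struct mq_node *n = q->head;
    q->head = n->next;
    if (q->head == NULL) q->tail = NULL;
    q->size--;
    pthread_mutex_unlock(&q->lock);
    int value = n->value;
    free(n);
    return value;
}

int monitor_queue_size(struct monitor_queue *q) {
    pthread_mutex_lock(&q->lock);
    int size = q->size;
    pthread_mutex_unlock(&q->lock);
    return size;
}
```
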
The "put" function always returns in a timely fashion since there is no
maximum number of entries that the queue can hold (i.e. there is no waiting
on a space_available condition variable like in the example from lecture).
On the other hand, "get" may wait on a condition variable until there are
new items in the queue, since get must always return an item.

Since our queue implementation is thread-safe, we are free to call "put" and
"get" without worrying about locking, which is handled by the implementation.

To implement read-ahead in our file reading code, we automatically fetch the
next sector of a file into the cache whenever inode_read_at is invoked,
since the next sector is likely to be read soon. We call cache_read_ahead on
the next disk sector, which adds the sector to a "read-ahead queue" (backed
by our monitor queue implementation).

When the buffer cache is initialized, we create a new thread to process
"read-ahead jobs" from the read-ahead queue. This thread sits in a while
loop, continually getting new sector_nums from the queue and optimistically
caching them: each time it gets a sector_num, it adds that sector to the
buffer cache by calling cache_sector(sector_num).

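The producer/consumer flow can be modeled in miniature. Here a plain array
stands in for the monitor queue and "caching" just sets a flag, so every
name below is illustrative rather than our real code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_SECTORS 16

static bool cached[NUM_SECTORS];     /* which sectors are in the cache */
static int queue[NUM_SECTORS];       /* stand-in for the monitor queue */
static size_t queue_len;

static void cache_sector(int sector_num) { cached[sector_num] = true; }

/* Producer side: enqueue the next sector for optimistic caching. */
static void cache_read_ahead(int sector_num) {
    queue[queue_len++] = sector_num;
}

/* Model of inode_read_at: read the sector we actually need, then
   queue the following sector, which is likely to be read soon. */
static void inode_read_at_sector(int sector_num) {
    cache_sector(sector_num);
    if (sector_num + 1 < NUM_SECTORS)
        cache_read_ahead(sector_num + 1);
}

/* Consumer side: one drain of the read-ahead queue by the
   read-ahead thread. */
static void read_ahead_thread_step(void) {
    for (size_t i = 0; i < queue_len; i++)
        cache_sector(queue[i]);
    queue_len = 0;
}
```
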
Note that the read-ahead thread does not busy-wait: the "monitor_queue_get"
function waits on a condition variable when there are no read_ahead_jobs to
consume from the queue. Meanwhile, the file reading code can read the
sector it immediately requires without also waiting for the next sector to
be read.


---- SYNCHRONIZATION ----

>> C5: When one process is actively reading or writing data in a
>> buffer cache block, how are other processes prevented from evicting
>> that block?

Before a thread can evict a cached sector in our system, the thread must
first acquire the buffer_cache lock.

When one process is actively reading or writing data in a cached sector,
it first acquires the buffer_cache lock to prevent other processes from
evicting that sector. Once the read or write is finished, the buffer_cache
lock is released.

Note: The buffer_cache lock is NEVER held during disk IO operations. The
lock is only held during reads and writes to sectors that are cached in
memory. That is, the lock is only held while data is copied into and out of
the cached sector using memcpy, which is fast.

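This rule can be illustrated with a user-space sketch, with a pthread mutex
standing in for the buffer_cache lock; the structures and names are
illustrative:

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define SECTOR_SIZE 512

static pthread_mutex_t buffer_cache_lock = PTHREAD_MUTEX_INITIALIZER;
static char cached_sector[SECTOR_SIZE];   /* one in-memory cached sector */

/* Copy out of the cache under the lock.  An evictor must take the
   same lock, so it cannot evict this sector mid-copy; because the
   critical section is only a memcpy, the lock is held very briefly
   and never across disk IO. */
static void cache_block_read(void *dst, size_t ofs, size_t len) {
    pthread_mutex_lock(&buffer_cache_lock);
    memcpy(dst, cached_sector + ofs, len);  /* fast: memory-to-memory */
    pthread_mutex_unlock(&buffer_cache_lock);
}
```
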
>> C6: During the eviction of a block from the cache, how are other
>> processes prevented from attempting to access the block?

Before a thread can evict a cached sector in our system, the thread must
first acquire the buffer_cache lock. As discussed in C2, to evict a sector
we search for a sector with state != SECTOR_PENDING. We always exclude
sectors in state SECTOR_PENDING from our eviction algorithm since these
sectors have pending disk IOs and must not be evicted.

If the sector we want to evict is not dirty, then we can immediately return
it and release the buffer_cache lock. No other process can access the
sector in the meantime because the current process holds the buffer_cache
lock.

However, if the sector we wish to evict is dirty, then we change the
sector's state to SECTOR_PENDING to indicate that the sector is waiting on
a disk IO, release the lock, and write the sector to disk (a blocking
operation). During this write, other processes are free to acquire the
buffer_cache lock and perform reads/writes on other cached sectors; all
code in our system is careful not to use or modify sectors with state
SECTOR_PENDING. Finally, when the write operation for the evicted sector
completes, the process re-acquires the buffer_cache lock and is free to
reuse the evicted sector (changing its state to something other than
SECTOR_PENDING before actually using it).


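The dirty-eviction sequence can be sketched as follows. The lock_held flag
is a stand-in for the real buffer_cache lock, and disk_write models the
blocking IO; all names are illustrative:

```c
#include <assert.h>
#include <stdbool.h>

enum sector_state { SECTOR_USED_CLEAN, SECTOR_USED_DIRTY, SECTOR_PENDING };

struct sector { enum sector_state state; };

static bool lock_held;
static void lock_acquire(void) { lock_held = true; }
static void lock_release(void) { lock_held = false; }

/* Model of the blocking write: check the lock is NOT held during IO. */
static void disk_write(struct sector *s) { assert(!lock_held); (void) s; }

/* Eviction of a dirty sector, per C6: mark it SECTOR_PENDING so other
   threads leave it alone, release the lock across the blocking write,
   then reacquire the lock before handing the sector out. */
static void evict_dirty(struct sector *s) {
    lock_acquire();
    s->state = SECTOR_PENDING;    /* other threads skip pending sectors */
    lock_release();
    disk_write(s);                /* blocking; lock not held */
    lock_acquire();
    s->state = SECTOR_USED_CLEAN; /* caller repurposes the sector */
    lock_release();
}
```
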
---- RATIONALE ----

>> C7: Describe a file workload likely to benefit from buffer caching,
>> and workloads likely to benefit from read-ahead and write-behind.

Repeatedly reading or writing a small set of disk sectors (e.g. during a
directory lookup, where each directory entry is read sequentially) will be
much faster with buffer caching. The speedup is even greater when the
sectors are physically far apart on disk, because seeks are very costly.

Repeatedly writing a small set of disk sectors (e.g. writing to a file one
byte at a time) will be much faster with write-behind. This is because the
sectors can be repeatedly updated in memory (perhaps thousands of times)
and written to disk only once.

Reading or writing a file sequentially (e.g. opening or saving a document
in a text editor) will be much faster with read-ahead. This is because we
optimistically and asynchronously cache the next sector of a file whenever
a sector is read. Because file access exhibits strong spatial locality,
when one file sector is read, the following sector is likely to be read in
the near future.