Backup Buddy Design Document
============================

James Stanley 2011

Requirements:
-------------
 - The backup server must not be able to read any filenames or data
 - Bandwidth must not be wasted
 - Data corruption must be detectable

Server:
-------
Files will be stored on the server in a content-addressable filesystem. The
server will maintain a system of files named by the SHA1 hash of their
contents.
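
A minimal sketch of such a store, assuming a single flat directory and
hex-encoded hashes as filenames (neither is specified above; store_blob and
load_blob are hypothetical names used only for illustration):

```python
import hashlib
import os

def store_blob(root, data):
    """Write data under root, named by the SHA1 hex digest of its contents."""
    name = hashlib.sha1(data).hexdigest()
    path = os.path.join(root, name)
    if not os.path.exists(path):
        # Identical contents always map to the same name, so an existing
        # file never needs to be rewritten.
        with open(path, "wb") as f:
            f.write(data)
    return name

def load_blob(root, name):
    """Read back the contents stored under the given hash name."""
    with open(os.path.join(root, name), "rb") as f:
        return f.read()
```

Because the name is derived from the contents, storing the same data twice
costs nothing extra on disk.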

When the client wishes to perform a new backup, it connects to the server and
presents a list of SHA1 hashes. The server responds with the subset of this
list containing only the hashes that it does not already have on disk or whose
files have become corrupted (the server hashes each file's contents and checks
that the result matches the filename). The client then sends the data for
these files.
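
The server-side check could look roughly like the sketch below. It assumes
hashes travel as 40-character lowercase hex strings and that files live in one
flat directory; hashes_to_request is a hypothetical name, and the wire
protocol itself is left out.

```python
import hashlib
import os
import re

def hashes_to_request(root, client_hashes):
    """Return the hashes whose files are missing or corrupted on the server."""
    needed = []
    for name in client_hashes:
        # Reject anything that is not a SHA1 hex digest, so a client cannot
        # smuggle a path such as "../x" into the store (assumed policy).
        if not re.fullmatch(r"[0-9a-f]{40}", name):
            raise ValueError("malformed hash: %r" % name)
        try:
            with open(os.path.join(root, name), "rb") as f:
                intact = hashlib.sha1(f.read()).hexdigest() == name
        except FileNotFoundError:
            intact = False
        if not intact:
            needed.append(name)
    return needed
```

Only the returned hashes need their data uploaded, which is what keeps
repeated backups cheap in bandwidth.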

After the server has stored the new files, it deletes any files that were not
in the list of hashes sent by the client.
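
The cleanup step might be sketched as follows, again assuming a flat store
directory that contains nothing but hash-named files (prune is a hypothetical
name):

```python
import os

def prune(root, client_hashes):
    """Delete every stored file the client did not list; return their names."""
    keep = set(client_hashes)
    removed = []
    for name in os.listdir(root):
        if name not in keep:
            os.remove(os.path.join(root, name))
            removed.append(name)
    return sorted(removed)
```

Note that under this scheme the server only ever holds the most recent
backup: anything absent from the latest list is gone after the prune.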

The server system presented above allows the client to store arbitrary data on
the server, and allows the server to detect when data has become corrupted.

Client:
-------
The client will start with an empty directory somewhere in the filesystem.
Starting at the backup root, it will create symmetrically-encrypted versions of
each file and directory. Any file larger than 4K (make this limit
configurable?) is split into separate blocks, and all blocks are exactly 4K.

To allow the client to know the list of files when the backup is restored, an
index file is created. This file contains name-hash pairs (mapping a file name
to a SHA1 hash). Each directory is represented by an index file.

Special files like char/block devices and symlinks can have special types of
block.

Encoding:
---------
The file is split into blocks of size 4K minus sizeof(file_block_header).
Starting with the last block, each block is encrypted and then hashed. The
encrypted block is stored under the temporary directory on the local filesystem
with the name being the SHA1 hash of the contents. Now we move on to the
previous block of the file and encrypt it (making sure to set the "next"
sha1-hash correctly), storing it under the temporary directory, etc. Working
backwards means the hash of the next block is always known by the time it has
to be written into a block's header.
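
The back-to-front pass can be sketched as below. Several details are
assumptions, not part of this design: the header is laid out as a big-endian
uint32 followed by a 20-byte (160-bit) SHA1, total_length is taken to mean the
number of data bytes in the block, short data is zero-padded to keep every
block at 4K, twenty zero bytes mark the end of the chain, and the cipher is
passed in as a length-preserving encrypt function. A dict stands in for the
temporary directory.

```python
import hashlib
import struct

BLOCK_SIZE = 4096
HEADER = struct.Struct(">I20s")          # assumed layout: length, next hash
PAYLOAD_SIZE = BLOCK_SIZE - HEADER.size  # 4K minus sizeof(file_block_header)
END_OF_CHAIN = bytes(20)                 # assumed "no next block" marker

def encode_file(data, encrypt, store):
    """Encode data as a chain of blocks; return the first block's hash (hex)."""
    chunks = [data[i:i + PAYLOAD_SIZE]
              for i in range(0, len(data), PAYLOAD_SIZE)] or [b""]
    next_hash = END_OF_CHAIN
    # Walk from the last chunk to the first so each header can name the
    # already-computed hash of the block that follows it.
    for chunk in reversed(chunks):
        plain = HEADER.pack(len(chunk), next_hash) + chunk.ljust(PAYLOAD_SIZE, b"\0")
        cipher = encrypt(plain)
        next_hash = hashlib.sha1(cipher).digest()
        store[next_hash.hex()] = cipher
    return next_hash.hex()
```

For example, encode_file(contents, my_cipher, {}) returns the hash that an
index entry for the file would point at, where my_cipher is whatever
symmetric cipher the client ends up using.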

File block format:
------------------
[uint32 total_length][uchar160 sha1-hash of next block][data]
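
As an illustration of this layout, the sketch below packs and unpacks a
plaintext file block with Python's struct module. The byte order (big-endian),
the reading of uchar160 as a 160-bit (20-byte) hash, and the reading of
total_length as the number of data bytes are all assumptions; the function
names are hypothetical.

```python
import struct

FILE_HEADER = struct.Struct(">I20s")  # uint32 total_length, 20-byte SHA1

def pack_file_block(data, next_hash):
    """Build [total_length][next hash][data] for one block."""
    return FILE_HEADER.pack(len(data), next_hash) + data

def unpack_file_block(block):
    """Return (data, next_hash), ignoring any padding after the data."""
    length, next_hash = FILE_HEADER.unpack_from(block)
    start = FILE_HEADER.size
    return block[start:start + length], next_hash
```

Storing the data length in the header is what lets a reader discard padding
when the final block of a file is shorter than the fixed block size.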

Index block format:
-------------------
[uint32 num_entries][uchar160 sha1-hash of next block][DIRS]

DIR format:
-----------
[uint16 name_length][ucharN name][uchar160 sha1-hash of first block]
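
The index and DIR layouts can be exercised together with a sketch like the
one below. It assumes big-endian integers, 20-byte hashes, UTF-8 names, and
that num_entries counts the DIR entries in this block only; none of these are
fixed by the formats above, and the function names are hypothetical.

```python
import struct

INDEX_HEADER = struct.Struct(">I20s")  # uint32 num_entries, 20-byte next hash
NAME_LEN = struct.Struct(">H")         # uint16 name_length
HASH_LEN = 20

def pack_index(entries, next_hash):
    """entries is a list of (name, first_block_hash) pairs."""
    out = [INDEX_HEADER.pack(len(entries), next_hash)]
    for name, first_hash in entries:
        raw = name.encode("utf-8")
        out.append(NAME_LEN.pack(len(raw)) + raw + first_hash)
    return b"".join(out)

def unpack_index(block):
    """Return (entries, next_hash) parsed from a plaintext index block."""
    count, next_hash = INDEX_HEADER.unpack_from(block)
    offset = INDEX_HEADER.size
    entries = []
    for _ in range(count):
        (n,) = NAME_LEN.unpack_from(block, offset)
        offset += NAME_LEN.size
        name = block[offset:offset + n].decode("utf-8")
        offset += n
        entries.append((name, block[offset:offset + HASH_LEN]))
        offset += HASH_LEN
    return entries, next_hash
```

Since names live only inside index blocks, and index blocks are encrypted
like any other block, the server never sees a filename.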