a guest
Aug 25th, 2021
  1. System design interview
  4. Problem
  5. Definition.
  6. Who is the customer?
  7. Pain points.
  8. Use cases
  9. Scenarios that will not be covered
  10. Functional requirements
  11. Entities and verbs.
  12. High-level contract (API)
  13. Make several iterations if possible
  14. Non-functional requirements
  15. Performance
  16. P99 latency for read/write queries?
  17. Write-to-read data delay?
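A P99 latency target is easier to discuss with a concrete definition in hand. The following is a minimal sketch using the nearest-rank method; the function name `percentile` and the sample latencies are assumptions made for illustration, not measurements.

```python
import math


def percentile(samples, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest rank: the smallest 1-based rank covering at least p% of the samples.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: 98 fast requests and 2 slow ones.
latencies_ms = [10] * 98 + [500, 900]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
mean = sum(latencies_ms) / len(latencies_ms)
```

With these made-up numbers the median stays at 10 ms while P99 is 500 ms, which is why tail percentiles, rather than averages, are usually the ones written into requirements.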
  18. Scalability
  19. Usage patterns, e.g. reads vs writes.
  20. How many users?
  21. How many read queries per second?
  22. How much data is queried per request?
  23. How many video views are processed per second?
  24. Can there be spikes in traffic?
  25. Cost
  26. Minimize cost of {development, time-to-market, maintenance}
  27. Availability vs Consistency
  28. Durability
  29. ESTIMATIONS [5 min]
  30. Throughput (QPS for read and write queries)
  31. Latency expected from the system (for read and write queries)
  32. Read/Write ratio
  33. Traffic estimates
  34. Write (QPS, Volume of data)
  35. Read (QPS, Volume of data)
  36. Storage estimates
  37. Memory estimates
  38. If we are using a cache, what kind of data do we want to store in it?
  39. How much RAM and how many machines do we need to achieve this?
  40. Amount of data you want to store in disk/ssd
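The estimation items above boil down to a few multiplications. Below is a rough back-of-envelope sketch; every input (user count, writes per user, read/write ratio, object size, retention, cached share, RAM per machine) is an assumed example value, to be replaced with numbers agreed on with the interviewer.

```python
import math

SECONDS_PER_DAY = 86_400

# All inputs below are illustrative assumptions.
daily_active_users = 10_000_000
writes_per_user_per_day = 2
reads_per_write = 100            # read/write ratio of 100:1
avg_object_bytes = 1_000
retention_years = 5
cached_percent_of_daily_reads = 20
ram_per_cache_machine_bytes = 64 * 10**9

# Traffic estimates.
writes_per_day = daily_active_users * writes_per_user_per_day
reads_per_day = writes_per_day * reads_per_write
write_qps = writes_per_day / SECONDS_PER_DAY
read_qps = reads_per_day / SECONDS_PER_DAY

# Storage estimate: everything written during the retention period.
storage_bytes = writes_per_day * avg_object_bytes * 365 * retention_years

# Memory estimate: cache the data behind a share of one day's reads.
cache_bytes = reads_per_day * avg_object_bytes * cached_percent_of_daily_reads // 100
cache_machines = math.ceil(cache_bytes / ram_per_cache_machine_bytes)

print(f"write QPS ~{write_qps:.0f}, read QPS ~{read_qps:.0f}")
print(f"storage ~{storage_bytes / 10**12:.1f} TB over {retention_years} years")
print(f"cache ~{cache_bytes / 10**9:.0f} GB on {cache_machines} machines")
```

Rounding aggressively is fine here; the point is the order of magnitude, which decides whether a single machine, a replicated pair, or a sharded fleet is needed.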
  41. HIGH LEVEL DESIGN [5-10 min]
  42. APIs for Read/Write scenarios for crucial components
  43. Database schema
  44. Basic algorithm
  45. High level design for Read/Write scenario
  46. DEEP DIVE [15-20 min]
  47. Scaling the algorithm
  48. Scaling individual components
  49. Availability, Consistency and Scale story for each component
  50. Consistency and availability patterns
  51. Think about the following components, how they would fit in and how it would help
  52. DNS
  53. CDN [Push vs Pull]
  54. Load Balancers [Active-Passive, Active-Active, Layer 4, Layer 7]
  55. Reverse Proxy
  56. Application layer scaling [Microservices, Service Discovery]
  57. DB [RDBMS, NoSQL]
  58. RDBMS
  59. Master-slave, Master-master, Federation, Sharding, Denormalization, SQL Tuning, Indexing
  60. NoSQL (in general - Denormalized data + no-joins)
  61. Key-Value, Wide-Column, Graph, Document
  62. Fast-lookups:
  63. RAM [Bounded size] => Redis, Memcached
  64. Availability [Unbounded size] => Cassandra, Riak, Voldemort
  65. Consistency [Unbounded size] => HBase, MongoDB, Couchbase, DynamoDB
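Sharding, listed above among the RDBMS scaling techniques, can be illustrated with the simplest possible routing rule: hash the key and take it modulo the shard count. This is only a sketch; the function name `shard_for`, the key format, and the shard count are assumptions.

```python
import hashlib
from collections import Counter


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (hypothetical helper)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # Use a cryptographic digest instead of built-in hash(), which is
    # randomized per process for strings and therefore not stable.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Route some made-up user ids to 4 shards and count keys per shard.
distribution = Counter(shard_for(f"user:{i}", 4) for i in range(10_000))
```

A caveat worth raising in the interview: with plain modulo routing, changing the number of shards remaps most keys, which is the problem that consistent hashing is designed to reduce.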
  66. Caches
  67. Client caching, CDN caching, Webserver caching, Database caching, Application caching, Cache @Query level, Cache @Object level
  68. Eviction policies:
  69. LRU, LFU, FIFO
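Of the eviction policies above, LRU is the one most often asked about. A minimal sketch built on Python's `collections.OrderedDict` follows; the class name `LRUCache` and its `get`/`put` interface are assumptions chosen for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Bounded cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        # A read counts as a use, so move the entry to the "newest" end.
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            # Evict from the "oldest" end.
            self._items.popitem(last=False)

    def __contains__(self, key):
        return key in self._items


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" is now the most recently used entry
cache.put("c", 3)    # over capacity: evicts "b", the least recently used
```

Both operations run in O(1) time, which is the property interviewers usually want to hear justified.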
  70. Caching patterns:
  71. Cache aside
  72. Write through
  73. Write behind
  74. Refresh ahead
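The first two caching patterns above differ mainly in what happens on a write. The sketch below models both with plain dictionaries standing in for the cache and the database; the class `CachedStore` and its method names are hypothetical, and a real system would add TTLs, error handling, and concurrency control.

```python
class CachedStore:
    """Toy store contrasting cache-aside and write-through writes."""

    def __init__(self, database: dict):
        self.database = database   # stand-in for the system of record
        self.cache = {}            # stand-in for Redis/Memcached
        self.db_reads = 0          # counts cache misses that hit the database

    def read(self, key):
        # Cache-aside read: check the cache, fall back to the database on a
        # miss, then populate the cache for later reads.
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write_cache_aside(self, key, value):
        # Cache-aside write: update the database and invalidate the cached
        # copy; the next read repopulates it.
        self.database[key] = value
        self.cache.pop(key, None)

    def write_through(self, key, value):
        # Write-through: update the database and the cache in the same
        # operation, so reads after the write are cache hits.
        self.database[key] = value
        self.cache[key] = value


store = CachedStore({"user:1": "Ada"})
first = store.read("user:1")    # miss: loaded from the database
second = store.read("user:1")   # hit: served from the cache
```

Write behind would instead acknowledge the write after updating the cache and flush to the database asynchronously, trading durability for write latency.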
  75. Asynchronism
  76. Message queues
  77. Task queues
  78. Back pressure - resistance or force opposing the desired flow of data through software ("pipes"); buffering vs. dropping
  79. Communication
  80. TCP
  81. UDP
  83. Binary protocols - Protocol Buffers, Thrift, Apache Avro
  84. Security
  85. Encryption: during transfer/at rest
  86. Government compliance (EU/China/US)
  87. Authentication/authorization
  88. Firewalls
  89. Payment data storage/handling/compliance
  90. High level threat modeling (obvious ones)
  91. Telemetry/monitoring/logs aggregation/Dashboards
  92. Host level metrics: CPU, Memory, Threads, Disk I/O, Garbage Collection runs
  93. Fleet - avg. time to first byte, surge queue on LB, VIP spillover, database pressure, cache tier
  94. Alarms/setting up thresholds/canaries
  95. Actions feed/Key business metrics: daily active users, retention, revenue, etc. - Business Intelligence
  96. Back pressure strategies: control the producer (the consumer decides whether it slows down or speeds up)
  97. Buffer (accumulate incoming data spikes temporarily)
  98. Drop (sample a percentage of the incoming data)
  99. Technically there’s a fourth option, ignoring the back pressure, which is not a bad idea if the back pressure isn’t causing critical issues. Introducing more complexity comes at a cost too.
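The buffer, drop, and control-the-producer options can be sketched with a bounded `queue.Queue` from the standard library. The helper names `offer_or_drop` and `offer_or_wait`, the buffer size, and the timeout are assumptions for illustration.

```python
import queue


def offer_or_drop(buffer: queue.Queue, item) -> bool:
    """Buffer the item if there is room; otherwise drop it and report False."""
    try:
        buffer.put_nowait(item)
        return True
    except queue.Full:
        return False


def offer_or_wait(buffer: queue.Queue, item, timeout_s: float) -> bool:
    """Control the producer: block until the consumer frees a slot or the
    timeout expires. Returns False if the item could not be enqueued."""
    try:
        buffer.put(item, timeout=timeout_s)
        return True
    except queue.Full:
        return False


# A bounded buffer absorbs a spike of 3 items; the rest of the spike is dropped.
buffer = queue.Queue(maxsize=3)
accepted = [offer_or_drop(buffer, i) for i in range(5)]

# The producer is slowed down: it waits briefly, then gives up while the
# buffer is still full.
slowed = offer_or_wait(buffer, 99, timeout_s=0.01)

# Once the consumer takes an item, the producer can proceed again.
consumed = buffer.get_nowait()
resumed = offer_or_wait(buffer, 99, timeout_s=0.01)
```

The size of the bounded buffer is the key design decision: an unbounded buffer only hides the overload until memory runs out.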
  100. Costs/optimizations. When using cloud services, it’s important to keep a lid on your costs.
  101. Autoscaling/adding "elasticity" (discuss traffic patterns: regional/seasonal), failovers
  102. SSD vs. HDD
  103. Commodity hardware vs. specialized ("optimized" for Memory, Disk I/O, CPU, GPU)
  104. Open-source vs Paid vs Built in-house
  105. Experimentation capability: sooner or later every large-scale product needs it
  106. Testing capability/testing tools and hooks/Gremlins/Hogs/Gameday-Outages exercise - introducing chaos into the system
  107. Deployments/rollbacks/canaries/soak-times/etc.
  108. Pluggable instrumentation
  109. JUSTIFY [5 min]
  110. Throughput of each layer
  111. Latency caused between each layer
  112. Overall latency justification
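The justification step can be reduced to two numbers: the sum of per-layer latencies and the throughput of the slowest layer. The sketch below assumes a request passes through each layer once, in sequence; the layer names, latencies, capacities, and targets are all made-up example values.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    latency_ms: float   # assumed latency added by this layer
    max_qps: int        # assumed throughput this layer can sustain


# Hypothetical request path; every number here is an assumption.
layers = [
    Layer("load balancer", 1, 50_000),
    Layer("application server", 20, 30_000),
    Layer("cache", 2, 100_000),
    Layer("database", 15, 25_000),
]


def end_to_end_latency_ms(path):
    # Layers are traversed one after another, so latencies add up.
    return sum(layer.latency_ms for layer in path)


def bottleneck(path):
    # Sustained throughput is capped by the slowest layer.
    return min(path, key=lambda layer: layer.max_qps)


total_latency = end_to_end_latency_ms(layers)
slowest = bottleneck(layers)

# Compare against the targets fixed in the requirements phase (assumed here).
target_latency_ms = 100
target_qps = 23_000
meets_latency = total_latency <= target_latency_ms
meets_throughput = slowest.max_qps >= target_qps
```

If either check fails, the failing layer is the natural place to revisit the deep-dive choices, for example by adding replicas, a cache, or more shards.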