- Design Twitter
- 200 million daily active users (DAU), 100 million new tweets every day.
- How many favorites per day?
- 200M users * 5 favorites => 1B favorites
- How many total tweet-views will our system generate?
- 200M DAU * ((2 + 5) * 20 tweets) => 28B/day
- Storage Estimates
- 100M tweets * (280 + 30) bytes per tweet => ~30GB/day
- Bandwidth Estimates: with total ingress of 24TB per day, this translates to about 290MB/sec.
- Writes (new tweets): 100M / 86400s => ~1150 tweets per second
- Reads: 28B / 86400s => ~325K tweets per second
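The estimates above can be re-derived with a few lines of arithmetic. This is a minimal sketch: the variable names are made up for illustration, the `(2 + 5) * 20` term is read as 7 page loads per user per day at 20 tweets per page (an assumption about what the notes mean), and the 24TB/day ingress figure is taken as given.

```python
SECONDS_PER_DAY = 86_400

dau = 200_000_000                 # daily active users
new_tweets_per_day = 100_000_000  # new tweets per day

# Favorites: 5 per user per day.
favorites_per_day = dau * 5

# Tweet views: the notes' (2 + 5) * 20 term, read here as
# 7 page loads per user per day with 20 tweets each (assumption).
tweet_views_per_day = dau * (2 + 5) * 20

# Text storage: (280 + 30) bytes per tweet, as in the notes.
storage_bytes_per_day = new_tweets_per_day * (280 + 30)

# Request rates, averaged over the day (no peak factor applied).
write_qps = new_tweets_per_day / SECONDS_PER_DAY
read_qps = tweet_views_per_day / SECONDS_PER_DAY

# Ingress: 24TB/day taken as given, using binary units (1TB = 1024 * 1024 MB).
ingress_mb_per_sec = 24 * 1024 * 1024 / SECONDS_PER_DAY

print(f"favorites/day:   {favorites_per_day:,}")
print(f"tweet views/day: {tweet_views_per_day:,}")
print(f"storage/day:     {storage_bytes_per_day / 1e9:.0f} GB")
print(f"write QPS:       {write_qps:,.0f}")
print(f"read QPS:        {read_qps:,.0f}")
print(f"ingress:         {ingress_mb_per_sec:.0f} MB/sec")
```

The exact results are about 1,157 writes/sec, about 324K reads/sec, 31GB/day of text and about 291MB/sec of ingress, which the notes round to 1150, 325K, 30GB and 290MB.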
- Database Schema.
- table: tweet, user, user_follow, favorite(tweet_id, user_id)
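One possible shape for these four tables, sketched with Python's built-in `sqlite3` so it can be run as-is. Only the table names and `favorite(tweet_id, user_id)` come from the notes; every other column (`name`, `content`, `created_at`, `follower_id`, `followee_id`) is an assumption for illustration.

```python
import sqlite3

# Table names follow the notes; columns other than
# favorite(tweet_id, user_id) are hypothetical.
SCHEMA = """
CREATE TABLE user (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE tweet (
    tweet_id   INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES user(user_id),
    content    TEXT NOT NULL,
    created_at INTEGER NOT NULL  -- epoch seconds
);
CREATE TABLE user_follow (
    follower_id INTEGER NOT NULL REFERENCES user(user_id),
    followee_id INTEGER NOT NULL REFERENCES user(user_id),
    PRIMARY KEY (follower_id, followee_id)
);
CREATE TABLE favorite (
    tweet_id INTEGER NOT NULL REFERENCES tweet(tweet_id),
    user_id  INTEGER NOT NULL REFERENCES user(user_id),
    PRIMARY KEY (tweet_id, user_id)  -- one favorite per user per tweet
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Tiny sample data set: bob follows alice and favorites her tweet.
conn.execute("INSERT INTO user VALUES (1, 'alice'), (2, 'bob')")
conn.execute("INSERT INTO tweet VALUES (10, 1, 'hello', 1700000000)")
conn.execute("INSERT INTO user_follow VALUES (2, 1)")
conn.execute("INSERT INTO favorite VALUES (10, 2)")

favorite_count = conn.execute(
    "SELECT COUNT(*) FROM favorite WHERE tweet_id = ?", (10,)
).fetchone()[0]
print(favorite_count)
```

The composite primary key on `favorite` is a design choice, not something the notes specify: it makes a repeated favorite by the same user fail instead of double-counting.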
- Which database will you choose, SQL or NoSQL?
- Criteria: query performance, ease of maintenance, reliability, etc.
- 7. Data Sharding
- How do we shard our data across multiple machines so that we can read and write it efficiently? Candidate shard keys:
- user_id
- tweet_id
- tweet_create_date
- ..
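To compare the candidate keys, here is a rough sketch of hash-based shard selection plus a tweet ID that embeds its creation time, so that sharding by `tweet_id` still allows time-ordered reads. `NUM_SHARDS`, `SEQ_BITS`, `shard_for`, `make_tweet_id` and `created_at` are all hypothetical names and values, not something the notes prescribe.

```python
import hashlib

NUM_SHARDS = 8  # hypothetical cluster size
SEQ_BITS = 17   # hypothetical: up to 131072 IDs per second


def shard_for(key: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user_id or tweet_id to a shard with a stable hash."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def make_tweet_id(epoch_seconds: int, sequence: int) -> int:
    """Pack creation time into the high bits and a per-second sequence into the low bits."""
    if not 0 <= sequence < (1 << SEQ_BITS):
        raise ValueError("sequence out of range")
    return (epoch_seconds << SEQ_BITS) | sequence


def created_at(tweet_id: int) -> int:
    """Recover the creation time (epoch seconds) from a tweet ID."""
    return tweet_id >> SEQ_BITS


tid = make_tweet_id(1_700_000_000, 5)
print(shard_for(tid), created_at(tid))
```

Roughly, sharding by `user_id` keeps one user's tweets together but lets very active users create hot shards; sharding by `tweet_id` spreads writes evenly but a timeline read has to fan out to every shard; sharding by `tweet_create_date` sends all new writes to the newest shard. With a time-prefixed ID, sorting by `tweet_id` is also sorting by creation time.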
- 8. Cache
- Which cache replacement policy would best fit our needs?
- How can we have a more intelligent cache?
- What if we cache the latest data?
- hash table + doubly linked list: owner_id + tweet_id(s)
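One reading of the "hash table + doubly linked list" idea is a table keyed by `owner_id` whose value is that owner's most recent `tweet_id`s, newest first. The sketch below follows that reading and adds LRU eviction across owners as an assumed answer to the replacement-policy question. `RecentTweetCache` and its parameters are hypothetical, and the standard-library `OrderedDict` and `deque` stand in for hand-written linked lists.

```python
from collections import OrderedDict, deque
from itertools import islice


class RecentTweetCache:
    """owner_id -> newest-first tweet_ids, with LRU eviction of owners."""

    def __init__(self, max_owners, max_tweets_per_owner):
        self._max_owners = max_owners
        self._max_tweets = max_tweets_per_owner
        self._owners = OrderedDict()  # least recently used owner first

    def add_tweet(self, owner_id, tweet_id):
        tweets = self._owners.get(owner_id)
        if tweets is None:
            # A bounded deque drops the oldest tweet once it is full.
            tweets = deque(maxlen=self._max_tweets)
            self._owners[owner_id] = tweets
        tweets.appendleft(tweet_id)          # newest at the front
        self._owners.move_to_end(owner_id)   # mark owner as recently used
        if len(self._owners) > self._max_owners:
            self._owners.popitem(last=False)  # evict least recently used owner

    def get_recent(self, owner_id, limit=20):
        tweets = self._owners.get(owner_id)
        if tweets is None:
            return None  # cache miss: caller falls back to the database
        self._owners.move_to_end(owner_id)
        return list(islice(tweets, limit))


cache = RecentTweetCache(max_owners=2, max_tweets_per_owner=3)
for tweet_id in (1, 2, 3, 4):
    cache.add_tweet(1, tweet_id)
print(cache.get_recent(1))
```

Caching only the latest tweets per owner fits a workload where most reads are for recent data; the per-owner limit of 3 here is just small enough to show the oldest tweet being dropped.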
- 9. Timeline Generation
- See Facebook newsfeed.
- 12. Monitoring
- What metrics/counters can be collected to get an understanding of the performance of our service?
- 13. Extended Requirements
- How do we serve feeds?
- Trending Topics?
- Who to follow? How to give suggestions?