Tuesday, August 13, 2019

Case study: Design Twitter search


Introduction


It is almost 5:00 PM. I have to push myself to go out and play tennis for an hour or two, so that I can come back to my home office and continue studying system design. My next study topic is Design Twitter search.


Case study 

I would like to use 100 sentences to summarize what I have learned from the last 14 minutes of reading.

1. Requirements and goals of the system

Assume that Twitter has 1.5 billion total users, with 800 million daily active users.
On average, Twitter gets 400 million tweets every day.
The average size of a tweet is 300 bytes.
There will be 500M searches every day.
A search query will consist of multiple words combined with AND/OR.

2. Capacity estimation and constraints

Storage capacity:
  new tweets per day * average tweet size
  400M * 300 bytes => 120 GB/day

Total storage per second:
  120 GB / 24 hours / 3600 seconds ~= 1.39 MB/second
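The arithmetic above can be double-checked with a few lines of Python. This is only a sketch of the estimate, using the figures from these notes; the constant names are my own, and decimal units (1 GB = 10^9 bytes) are an assumption.

```python
# Back-of-the-envelope storage estimate using the figures from the notes.
TWEETS_PER_DAY = 400_000_000   # 400M new tweets per day
BYTES_PER_TWEET = 300          # average tweet size in bytes
SECONDS_PER_DAY = 24 * 3600

bytes_per_day = TWEETS_PER_DAY * BYTES_PER_TWEET
bytes_per_second = bytes_per_day / SECONDS_PER_DAY

# Assumption: decimal units, 1 GB = 10**9 bytes and 1 MB = 10**6 bytes.
gb_per_day = bytes_per_day / 10**9
mb_per_second = bytes_per_second / 10**6

print(f"{gb_per_day:.0f} GB/day")
print(f"{mb_per_second:.2f} MB/second")
```

The exact per-second value is 1.3888... MB, which is 1.39 when rounded to two decimals.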

3. System APIs

SOAP or REST APIs - I need to learn how to define the API

Parameters
Return:
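Since I have not filled in the parameters and return value yet, here is one possible shape for a search API, written as plain Python types. Everything in it is hypothetical: the names SearchRequest, SearchResponse, TweetResult, and every field are my own assumptions for illustration, not Twitter's real API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SearchRequest:
    # All fields are hypothetical, chosen only to illustrate an API definition.
    api_key: str                      # identifies the caller, e.g. for throttling
    query: str                        # search words combined with AND/OR
    max_results: int = 20             # page size
    sort: str = "latest"              # assumed sort mode
    page_token: Optional[str] = None  # position in the result set


@dataclass
class TweetResult:
    tweet_id: int
    user_id: int
    text: str
    created_at: str                   # assumed ISO 8601 timestamp


@dataclass
class SearchResponse:
    results: List[TweetResult] = field(default_factory=list)
    next_page_token: Optional[str] = None


def search(request: SearchRequest) -> SearchResponse:
    """Placeholder handler: validates the input and returns an empty page."""
    if not request.query.strip():
        raise ValueError("query must not be empty")
    if request.max_results <= 0:
        raise ValueError("max_results must be positive")
    return SearchResponse()


response = search(SearchRequest(api_key="demo-key", query="system AND design"))
```

The point of the sketch is only the split into parameters (what the caller sends) and return value (a page of tweets plus a token for the next page).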

4. High level design

Clients
Application server
Index server
Storage server
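To see how these four pieces could fit together, here is a toy, single-process sketch. It assumes the index server keeps an inverted index (word to set of tweet IDs) and the storage server keeps the tweet text; the class and method names are my own, and a real system would shard both across many machines.

```python
from collections import defaultdict


class IndexServer:
    """Toy in-memory inverted index: word -> set of tweet IDs."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, tweet_id, text):
        for word in text.lower().split():
            self._postings[word].add(tweet_id)

    def search(self, words, operator="AND"):
        sets = [self._postings.get(w.lower(), set()) for w in words]
        if not sets:
            return set()
        if operator == "AND":
            return set.intersection(*sets)   # tweets containing every word
        if operator == "OR":
            return set.union(*sets)          # tweets containing any word
        raise ValueError("operator must be 'AND' or 'OR'")


class StorageServer:
    """Toy tweet store: tweet ID -> tweet text."""

    def __init__(self):
        self._tweets = {}

    def put(self, tweet_id, text):
        self._tweets[tweet_id] = text

    def get(self, tweet_id):
        return self._tweets[tweet_id]


class ApplicationServer:
    """Receives client requests and talks to the index and storage servers."""

    def __init__(self, index, storage):
        self.index = index
        self.storage = storage

    def post_tweet(self, tweet_id, text):
        self.storage.put(tweet_id, text)
        self.index.add(tweet_id, text)

    def search(self, words, operator="AND"):
        tweet_ids = sorted(self.index.search(words, operator))
        return [self.storage.get(tweet_id) for tweet_id in tweet_ids]


# The client side: post a few tweets, then search.
app = ApplicationServer(IndexServer(), StorageServer())
app.post_tweet(1, "system design study")
app.post_tweet(2, "tennis after study")
app.post_tweet(3, "design twitter search")

and_hits = app.search(["design", "study"], "AND")
or_hits = app.search(["tennis", "search"], "OR")
```

Here and_hits contains only tweet 1, and or_hits contains tweets 2 and 3.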

Lesson learned from high level design (8/13/2019 5:24PM)

I would like to write a statement here. The high level design is simple, and an index server is included. But later on, I should estimate how many index servers the architecture needs. The estimate is around:

Total memory of index: 21 TB. Assuming a high-end server has 144 GB of memory, we would need 152 such servers to hold our index. (Dividing 21 TB by 144 GB gives about 146, so the 152 figure implies an index size closer to 21.9 TB before rounding.)
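The server count is a division rounded up, which a short sketch can check. The helper name servers_needed is my own, and decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes) are an assumption.

```python
import math

TB = 10**12  # assumption: decimal terabyte
GB = 10**9   # assumption: decimal gigabyte


def servers_needed(index_bytes, memory_per_server_bytes):
    """Number of servers required to hold the whole index in memory."""
    return math.ceil(index_bytes / memory_per_server_bytes)


# Using the rounded 21 TB figure from the notes.
servers_for_21_tb = servers_needed(21 * TB, 144 * GB)

# Working backwards: how large an index do 152 servers of 144 GB hold?
implied_index_tb = 152 * 144 * GB / TB

print(servers_for_21_tb)
print(f"{implied_index_tb:.3f} TB")
```

With exactly 21 TB the result is 146 servers, while 152 servers of 144 GB hold 21.888 TB in total.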


