Sunday, December 15, 2019

Case study: Instagram system design


Introduction


This is a well-written article on how to design Instagram. It takes up 11 pages in Grokking System Design. I have just finished reading those pages, and I would like to take some notes on each one.

Case study


I do believe that the best way to learn is to read the content. I should learn how to read better, and also take notes when I come across new concepts or good analysis with clear explanations.

The article designs a simple version of Instagram, where a user can share photos and can also follow other users. The 'News Feed' for each user will consist of top photos from all the people the user follows.

Requirements/ Goals


Functional requirements
1. Users should be able to upload/download/view photos.
2. Users can perform searches based on photo/video titles.
3. Users can follow other users.
4. The system should be able to generate and display a user's News Feed consisting of top photos from all the people the user follows.

Non-functional requirements

High level system design


The design needs block storage servers to store photos, plus some database servers to store metadata about the photos.

Database schema


We need to store data about users, their uploaded photos, and the people they follow.
Tables:
Photo
User
UserFollow
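To make the three tables concrete for myself, here is a minimal sketch in Python. My notes only list the table names, so every field name below (user_id, photo_path, created_at, and so on) is my own assumption and not the article's schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class User:
    user_id: int          # primary key (assumed field name)
    name: str


@dataclass
class Photo:
    photo_id: int         # primary key (assumed field name)
    user_id: int          # owner of the photo
    photo_path: str       # where the image file lives in storage
    created_at: datetime


@dataclass(frozen=True)
class UserFollow:
    follower_id: int      # the user who follows
    followee_id: int      # the user being followed


# Small usage example with made-up rows.
alice = User(1, "alice")
bob = User(2, "bob")
photo = Photo(100, bob.user_id, "photos/100.jpg", datetime(2019, 12, 15))
follow = UserFollow(follower_id=alice.user_id, followee_id=bob.user_id)
```

The Photo row points back to its owner through user_id, and each UserFollow row records one "follows" relationship.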

An RDBMS like MySQL can hold this schema, but relational databases run into problems when we need to scale them.

The photos themselves can be stored in distributed file storage like HDFS or S3.

The metadata can be stored in a key-value store.

The relationships between users and photos (who owns which photo) and the list of people a user follows can be stored in a wide-column datastore like Cassandra.

UserPhoto table - the key is UserID, and the value is a list of PhotoIDs.
UserFollow table -
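A minimal sketch of how these two tables could look as key-value data, using plain Python dicts in place of a real store. The UserPhoto layout follows my notes. The UserFollow layout (key is UserID, value is a list of followed UserIDs) is my assumption, and the function names are hypothetical.

```python
from collections import defaultdict

# Plain dicts stand in for the key-value / wide-column store.
user_photo = defaultdict(list)    # UserID -> list of PhotoIDs the user owns
user_follow = defaultdict(list)   # UserID -> list of followed UserIDs (assumed layout)


def add_photo(user_id, photo_id):
    """Record that user_id owns photo_id."""
    user_photo[user_id].append(photo_id)


def follow(follower_id, followee_id):
    """Record that follower_id follows followee_id (no duplicates)."""
    if followee_id not in user_follow[follower_id]:
        user_follow[follower_id].append(followee_id)


def photos_from_followed(user_id):
    """Collect the PhotoIDs owned by everyone the user follows."""
    result = []
    for followee in user_follow.get(user_id, []):
        result.extend(user_photo.get(followee, []))
    return result


add_photo(2, 100)
add_photo(2, 101)
add_photo(3, 200)
follow(1, 2)
follow(1, 3)
print(photos_from_followed(1))  # [100, 101, 200]
```

Looking up one key per followed user is the basic step behind generating a News Feed. Ranking the "top" photos is left out here.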

Cassandra, like key-value stores in general, always maintains a certain number of replicas to offer reliability. Also, deletes are not applied instantly; data is retained for a certain number of days (to support undeleting) before it is removed permanently.

I will google Cassandra and have a short review later on.

Component design


A web server has a limit on concurrent connections, for example 500 connections. Uploads are slow, so they can occupy all available connections and leave none for reads.

To handle this bottleneck we can split reads and writes into separate services. We use dedicated servers for reads and different servers for writes to ensure that uploads don't hog the system.
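The read/write split can be sketched as a tiny router. This is only my illustration: the server names are hypothetical, and treating GET/HEAD as reads and everything else as writes is my assumption, not something the article specifies.

```python
import itertools

# Hypothetical server pools; real deployments would discover these dynamically.
READ_SERVERS = ["read-1", "read-2", "read-3"]
WRITE_SERVERS = ["write-1", "write-2"]

# Round-robin iterators over each pool.
_read_cycle = itertools.cycle(READ_SERVERS)
_write_cycle = itertools.cycle(WRITE_SERVERS)


def route(method):
    """Pick a server for an HTTP method: reads and writes use separate pools."""
    if method.upper() in ("GET", "HEAD"):
        return next(_read_cycle)
    return next(_write_cycle)
```

With this split, a burst of slow uploads can only exhaust connections on the write pool, so photo views keep being served by the read pool. The two pools can also be scaled independently.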

Reliability and redundancy 


Store multiple copies of each file, and run multiple replicas of each service in the system.

Redundancy removes single points of failure in the system.

Data sharding 

If one DB shard is 4 TB, we need 178 shards for 712 TB.
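The shard count is just the total data size divided by the size of one shard, rounded up. A quick check of the numbers above (the variable names are mine):

```python
import math

TOTAL_TB = 712   # total data to store, from the notes
SHARD_TB = 4     # capacity of one DB shard, from the notes

# Round up so that a partial shard still counts as a whole shard.
num_shards = math.ceil(TOTAL_TB / SHARD_TB)
print(num_shards)  # 178
```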

Append the shard number to each PhotoID so that any photo can be uniquely identified in the system.
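A minimal sketch of appending the shard number to a PhotoID. I assume here that the shard is chosen as UserID modulo the number of shards, that each shard hands out its own local sequence number, and that three decimal digits are reserved for the shard number. These choices and the function names are my assumptions for illustration.

```python
NUM_SHARDS = 178        # from the estimate in the notes
SHARD_FACTOR = 1000     # assumption: reserve 3 decimal digits for the shard number


def shard_for_user(user_id):
    """Pick a shard from the UserID (assumed scheme: simple modulo)."""
    return user_id % NUM_SHARDS


def make_photo_id(local_id, shard):
    """Append the shard number to a shard-local sequence number."""
    return local_id * SHARD_FACTOR + shard


def shard_of_photo(photo_id):
    """Recover the shard number from a full PhotoID."""
    return photo_id % SHARD_FACTOR


shard = shard_for_user(1234)          # 1234 % 178 = 166
photo_id = make_photo_id(57, shard)   # 57 * 1000 + 166 = 57166
print(shard, photo_id, shard_of_photo(photo_id))  # 166 57166 166
```

Because the shard number is part of the ID, two shards can reuse the same local sequence number without a collision, and the shard holding a photo can be found from the PhotoID alone.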
