Introduction
It was my mock interview as a system design interviewer. I have to push myself hard to perform better as an interviewer. I had chance to interview an engineer with strong coding skills, since I am not sure how good he is in terms of system design, I also asked a coding question to double check. I like to write a short case study on this mock interview.
Case study
I will add more detail later.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Design twitter, 1000 users, | |
- users : tweet, follow people, and have followers | |
users: | |
userId(twitterId) : Name, location, email address | |
tweets table: | |
tweetId -> userId, content of tweet(text), time | |
Likedtweets | |
tweetId -> userId, time | |
Retweets | |
tweetId -> retrweeUserId, time | |
FollowTable | |
user1->user2 | |
user2->user3 | |
user3->user1 | |
POST CreateUser(email, name, password) OK (200) | |
UpdateUser() | |
Login(email, password) -> return auth token OK (200) | |
POST WriteTweet() -> OK(200) | |
DeleteTweet(tweetId) | |
Retweet(tweetId) | |
LikeTweet(tweetId) | |
GET GetTweets(int GetNext) JSON {tweets} | |
Post FollowUser | |
TwitterService | |
CAP theorom | |
push message onto serach service queue with tweetID | |
- analysis of words and hastags that are in tweet | |
- go ahead and perform some sort of ML model on tweet | |
- store searchable word -> tweetId | |
- intwerviewing.io -> 50 tweets | |
- system desing - > 30 tweets | |
- rolling average of the last 60 minutes of trending stuff | |
- we can start fresh at the start of every hour and count things that are trending | |
Search Service, reads the tweet, parses out hashtags, words and word/hashtag -> tweetId | |
trending -> count | |
celebrity, topic, hashtag | |
- availability over consistency | |
show case your design: | |
celebrity with milions of followers, tweet, notification to millions - bottleneck, network, cache, push/pull, search, ... | |
- user A has 20M followers | |
we're doing push? then we end up with 20M requests sent down | |
- push down to users who phave been active in last 24 hours | |
- periodically prepopulate feed for users | |
Database scaling | |
- shard based on userID consistent hashing | |
- cluser of databases (read replicas) (master slave replication, where we're opting mostly for availaiblity) - async replication | |
cache | |
- search service (trending) | |
- tweet itself | |
- tweetID does exist in chace? yes, get from cache, no, hit database | |
- cache users and their tweets | |
database, storage - build index, | |
- transaction | |
deposit | |
withdraw | |
- consistency | |
storage design - cloud storage | |
- balance binary search tree - B-tree | |
- transaction - log - | |
log-structure merge tree - | |
CA P | |
- | |
- large data stream, integer, | |
1, 2, 3, 4, 5, 6, -> six, 3, 4, | |
3 + 4/ 2 | |
1, 2,3,4,5 , odd number, median - | |
assume that you store all integer into a few data structure, order can be lost, it is ok. | |
Add(int, ) -> do not require -> | |
findMedian() -> O(1) time complexity | |
minheap, maxheap, keep size of two heap max diff < 1 | |
determine how to get median, | |
LSM-tree -> video, Microsoft, merge, Apple employee -> | |
Log -> append -> merge <- backward -> | |
client-> load balanace-> reverse proxy -> twitter, socail profile servcice -> cache -> storage -> database | |
long polling -> ... | |
booking in United statemet | |
diagram | |
- hand | |
https://gist.github.com/jianminchen/ff21b0a1cad67216376999512dd1b696 | |
booking | |
design twitter -> | |
URL - |
Actionable Items
The interviewee is Microsoft SDE II. I should learn from his performance. Good analysis and very good communication skills.
No comments:
Post a Comment