Wednesday, July 31, 2019

System Design distributed web crawler to crawl Billions of web pages | web crawler system design

Here is the link.

Politeness/ crawl rate
DNS query
Distributed crawling

How should I prepare for the Facebook onsite interview?

July 31, 2019


It is hard to prepare for a Facebook onsite interview. I did not put in much effort beforehand. I believe that because I got an Amazon onsite in 2019 by passing the online code assessment and a phone screen in June, I won some trust from the Facebook interviewer. I solved the first easy-level algorithm, then talked through an idea for solving the second algorithm with optimal time complexity, and kept working hard on the hints given through the example. In July 2019 I passed the first Facebook phone screen of my life.

How should I prepare for the onsite?

I would like to write down the hard parts, based on how I will be evaluated on system design.

Here are the areas I will be evaluated on.
1. Problem exploration
2. Design approach
3. Data management
4. Tradeoffs
5. Deep dive
6. Quantitative analysis

You will be given a type of system to design from scratch. This interview helps us better evaluate you on your architecture experience and your knowledge when given an unfamiliar domain or service to design. Part of your discussion should include doing a deep dive into particular components, but focus on deriving an end-to-end design before narrowing your scope. Keep in mind, that there is no 100% “right” answer in this interview – it’s about the logic and reasoning you provide as evidence to support your design.

Note: The expectation is not for your answer to align with Facebook’s infrastructure; but your design must be scalable in a large environment like Facebook.

1. Problem Exploration
a. Ask questions! Ex: What is the underlying motivation of building “X”
b. Gather requirements and clarify any uncertainties: Target users; # of users; amount of data to be handled; requests per second; etc.
c. Attempt to define the problem and how “X” will be built (this will give you a roadmap to follow for the remainder of the interview)

2. Design Approach
a. High level discussion in regard to covering the main components (*providing reasons/logic behind why they are relevant)
b. What are the different options to get to a solution
c. HOW this product/system will impact other areas of the business

3. Data Management
There’s a high likelihood the question you’re going to be given will have some facet of data management. Some areas to mention during the discussion:
a. How will data be stored
b. (If any) Define data entities
c. Which technology is best to use; or, suggest which data storage solution is best and why

4. Tradeoffs
a. After discussing the pros/cons of each option, choose 1-2 methods to design a solution.
b. Explain why you're making this choice and how it is right for the domain/service
c. Make your tradeoff clear

5. Deep Dive
Do a deep dive on some facet of the problem you feel comfortable with. This is where you can highlight your expertise in any one area. You will be assessed on your depth in “X” area. Discuss any additional concerns; how to prevent them; and your approach. (It's OK to go back and forth with your tradeoff if later you realize there is a more efficient method)

6. Quantitative Analysis
This can be done towards the end or in between the discussions. Try to think quantitatively about how your design will work in reality - make some approximate calculations
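
As an illustration of the approximate calculations mentioned above, here is a minimal sketch in Python. Every input (daily active users, requests per user, payload size, write fraction, peak factor) is a hypothetical assumption picked for round arithmetic, not a figure from any real system.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def estimate_load(daily_active_users, requests_per_user_per_day,
                  bytes_per_write, write_fraction, peak_factor=2.0):
    """Back-of-envelope estimate of request rate and daily storage growth.

    All parameters are assumptions supplied by the caller.
    """
    total_requests = daily_active_users * requests_per_user_per_day
    average_qps = total_requests / SECONDS_PER_DAY
    writes_per_day = total_requests * write_fraction
    return {
        "average_qps": average_qps,
        "peak_qps": average_qps * peak_factor,
        "storage_per_day_gb": writes_per_day * bytes_per_write / 1e9,
    }


# Hypothetical inputs: 100M DAU, 10 requests per user per day,
# 10% of requests are writes of about 1 KB each.
result = estimate_load(100_000_000, 10, 1_000, 0.1)
print(result)
```

With these assumed inputs the average rate comes out at roughly 11,600 requests per second and about 100 GB of new data per day, which is the kind of rough number that helps decide whether one machine or a partitioned store is needed.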

The feeling of United States onsite interview in June 2018

July 31, 2019


It is my personal finance research. I always try to look into the problems I have and learn how to attract money. I would like to write down the feelings I experienced when I went to the Seattle onsite interview on June 6, 2018.

32 credits for United States Social Security

July 31, 2019


It is my personal finance research. Based on my 2009 statement, I have at least 32 credits for Social Security. I do not have US permanent residency; I lived in Florida from 1996 to 2010. If I work another 2 years and earn 4 credits a year, I will qualify for United States Social Security and medical benefits.

Tuesday, July 30, 2019

Case study: USA social security statement

July 30, 2019


It is my personal finance research. I went to play tennis, and stopped for a few minutes to thank a lady who picked up tennis balls for me while I hit against the tennis wall. She told me that she worked in the US for over 18 years and is now retired in Canada as a Canadian citizen; at 62 years old she is collecting US Social Security income. Since I am preparing for an onsite interview in August with Facebook, one of the top software companies, I would like to do a case study of my own US Social Security as well.

Case study

I just learned in the last few days that I will qualify to collect US Social Security if I can earn 40 points. I stayed in the USA from 1996 to 2019; how many points have I earned?

Let me walk through my statement and see which areas I should research.

Here are the Social Security statements in my personal finance repository.

2009 statements

2010 statements

One simple drill to help system design

July 29, 2019


It is hard for me to learn system design well in two weeks, so I would like to focus on the basics first. One idea is to go over all the index items in the book Designing Data-Intensive Applications.

Keywords to study

I like to go over keywords if I have 10 to 20 minutes.



asynchronous
Not waiting for something to complete (e.g., sending data over the network to another node), and not making any assumptions about how long it is going to take. See “Synchronous Versus Asynchronous Replication” on page 153, “Synchronous Versus Asynchronous Networks” on page 284, and “System Model and Reality” on page 306.

atomic
1. In the context of concurrent operations: describing an operation that appears to take effect at a single point in time, so another concurrent process can never encounter the operation in a “half-finished” state. See also isolation.
2. In the context of transactions: grouping together a set of writes that must either all be committed or all be rolled back, even if faults occur. See “Atomicity” on page 223 and “Atomic Commit and Two-Phase Commit (2PC)” on page 354.

backpressure
Forcing the sender of some data to slow down because the recipient cannot keep up with it. Also known as flow control. See “Messaging Systems” on page 441.

batch process
A computation that takes some fixed (and usually large) set of data as input and produces some other data as output, without modifying the input. See Chapter 10.

bounded
Having some known upper limit or size. Used for example in the context of network delay (see “Timeouts and Unbounded Delays” on page 281) and datasets (see the introduction to Chapter 11).

Byzantine fault
A node that behaves incorrectly in some arbitrary way, for example by sending contradictory or malicious messages to other nodes. See “Byzantine Faults” on page 304.

cache
A component that remembers recently used data in order to speed up future reads of the same data. It is generally not complete: thus, if some data is missing from the cache, it has to be fetched from some underlying, slower data storage
Monday, July 29, 2019

Remove Invalid Parentheses

Here is the link of

An expression is given that can contain open and close parentheses and, optionally, some other characters; no other operators appear in the string. We need to remove the minimum number of parentheses to make the input string valid. If more than one valid output is possible by removing the same number of parentheses, print all such outputs.


Input  : str = "()())()"
Output : ()()() (())()
There are two possible solutions
"()()()" and "(())()"

Input  : str = (v)())()
Output : (v)()()  (v())()

I interviewed two people; both of them used BFS to find the minimum number of parentheses to remove. After the two mock interviews, I also wrote a solution based on a code study on

I am still thinking about the solution with better time complexity.
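
The BFS idea mentioned above can be sketched as follows. This is a minimal illustration, not the solution from my practice link: each BFS level holds every string reachable by removing one more parenthesis, and the search stops at the first level that contains a valid string. The function names are my own.

```python
def is_valid(s):
    """Return True if the parentheses in s are balanced."""
    balance = 0
    for ch in s:
        if ch == "(":
            balance += 1
        elif ch == ")":
            balance -= 1
            if balance < 0:  # a ')' appeared before its matching '('
                return False
    return balance == 0


def remove_invalid_parentheses(s):
    """Return all valid strings obtained by removing the fewest parentheses."""
    level = {s}  # a set removes duplicate candidates within a level
    while level:
        valid = sorted(c for c in level if is_valid(c))
        if valid:
            # First level with a valid string means minimum removals.
            return valid
        next_level = set()
        for candidate in level:
            for i, ch in enumerate(candidate):
                if ch in "()":  # only parentheses are removed, never letters
                    next_level.add(candidate[:i] + candidate[i + 1:])
        level = next_level
    return [""]


print(remove_invalid_parentheses("()())()"))
print(remove_invalid_parentheses("(v)())()"))
```

The worst case is still exponential in the number of parentheses, since every subset of removals may be explored, which is why a better time complexity is worth thinking about.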

Here is my practice link.

10 hard things to learn system design

July 29, 2019

I would like to write a short blog about 10 hard things in learning system design. I understand that it is important for me to learn system design. I have spent more than 5 years working on algorithms and data structures, and I should start learning system design now since it will open up opportunities for me as a software programmer.

10 hard things to learn related to system design

I will think about and then write down one by one.

Follow up 

Nov. 15, 2019

I always like to support myself. So I like to put together 10 hard things in less than 10 minutes. Here we go.

1. A very good engineer can still fail system design, even with many years of experience. Learning system design is different from doing a good job at your full-time work.
2. System design is a broad area to work on, prepare and talk about in an onsite interview. To perform better, one should work hard on general communication skills; on relating the problem to basic concepts such as distributed systems, cache, storage and load balancers; and sometimes also on hard-level algorithms in problem solving and on quickly writing code for an idea.
3. System design is the time and place to show your passion as a software engineer. You want to show your determination to solve the problem together, and your potential to earn a big bonus at real work. Facebook or Amazon is my desired company; if I am asked to solve a problem, I should show good analysis and that I am easy to work with.

CRICINFO system design | CRICBUZZ System design

July 29, 2019

Here is the link.

Live Score

Sunday, July 28, 2019

Whatsapp System design or software architecture

Here is the link.

1. User base
2. Last seen
3. Media (video?)
4. Encrypt
5. Telephony (audio call, video call)

Book reading challenge: Data models and query languages

July 28, 2019


It is challenging to think about being a senior engineer at one of the top four software companies. How is it possible? I have been so busy working on Leetcode algorithms that I now have only two weeks to read a few books to prepare for system design and product design. I need to find the right book and read the basics first. Today I will give myself an assignment: read the book chapter called Data models and query languages.

A task list

I would like to design a few questions for the book chapter to help myself and others test how good the reading is. Do I understand the basics, and do I know which concepts are the most basic ones to master?

Book reading time

Now it is 8:58 AM. I will start to read the book chapter now.

Page 46

MapReduce is neither a declarative query language nor a fully imperative query API,
but somewhere in between: the logic of the query is expressed with snippets of code,
which are called repeatedly by the processing framework. It is based on the map (also
known as collect) and reduce (also known as fold or inject) functions that exist
in many functional programming languages.
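
To make the map and reduce functions mentioned in the excerpt concrete, here is a small single-process word count in Python using the built-in `map` and `functools.reduce`. It only illustrates the functional style; it is not a distributed MapReduce job, and the sample documents are made up.

```python
from functools import reduce

# Made-up input documents for illustration.
documents = ["the quick brown fox", "the lazy dog", "the fox"]


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word, 1) for word in document.split()]


def reduce_phase(counts, pair):
    """Fold one (word, n) pair into the running totals."""
    word, n = pair
    counts[word] = counts.get(word, 0) + n
    return counts


# map: called once per document; the results are flattened into one list.
mapped = [pair for pairs in map(map_phase, documents) for pair in pairs]

# reduce (fold): combine all pairs into a single dictionary of counts.
word_counts = reduce(reduce_phase, mapped, {})
print(word_counts)
```

In a real framework the mapper and reducer are the only snippets the programmer writes; the framework decides where and how often to call them.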

How do I choose a book to read for system design?

July 28, 2019


It is a personal story I would like to share. I have known Desmond Zhou, who works at Amazon, for a long time. In June 2018 I asked him for advice and private coaching for my Amazon onsite interview, but he did not give any important advice. What is next?

A book to read

I attended the Vancouver 9 cat tech event, where Desmond Zhou gave a talk about his career and advice. I showed up at the meetup and asked for advice.

In more detail: I planned to attend the event, and I told the organizer that I would bring some fruit, snacks and fruit juice from Costco.

I found out that I had a terrible allergy for the second year in a row. My eyes were so itchy that, for the first time, I was crying and screaming inside the car outside the Costco. I decided to get into the Costco as soon as possible, purchased antihistamine pills and took 2 pills right away. I was worried about how I could drive since my eyes were so itchy.

After the presentation, I asked Desmond Zhou for advice. He said that Designing Data-Intensive Applications is a very good book to read. That was around March 2019.

Another recommendation

Later I met a young graduate, Bhavya, who worked at Visa in Singapore, as an interviewer, and I connected with her. She also recommended the book on LinkedIn three months ago. This month she joined Google in Singapore.

First chapter

I spent over two hours reading the first chapter last week. I like the writing, and I also tell myself to calm down and understand some basics first.


My wechat conversation log

Saturday, July 27, 2019

43. Multiply Strings

678. Valid Parenthesis String

918. Maximum Sum Circular Subarray

44. Wildcard Matching

It is a hard-level algorithm. I wrote a sharing post and here is the link.

463. Island Perimeter

413. Arithmetic Slices

393. UTF-8 Validation

109. Convert Sorted List to Binary Search Tree

Facebook design interview questions

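1. Design Facebook News Feed
2. Design a wallpaper app that downloads pictures from your Facebook account
3. Design search infra to return typeahead (mainly focused on the components after tokenization and query understanding, machine learning for ranking is not considered here)
4. Design infra and storage for posts with same keywords
5. Design something like a web page translator. The content of a page needs to be translated into the local language of each region.
follow up - how to implement this for dynamic content
6. Input suggestions.
7. Backend of a ticket booking website.

8. Design WhatsApp
9. First round, system design: Twitter search. Requirements: 100M DAU, same read and write QPS. Answered following the approach given by Grokking and Jiuzhang
10. Wiki crawler
11. System design: Instagram
12. Design an in-memory cache
13. design friend list online and offline feature
14. System design: design search autocomplete, typeahead
15. NLP design: suppose a machine translation system has just gone live; how do you evaluate translation quality?
16. Google Photos app
1. query photos for a user
2. upload/sync photos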

1027. Longest Arithmetic Sequence

Leetcode 224

Leetcode 46

Facebook Product Design Interview: Part 1

Here is the link.

NETFLIX System design | software architecture for netflix

Here is the link.

1. OC
2. Backend
3. Client

AWS, Open Connect

Original, five more edge servers - videos are saved in those servers.

Study notes

The design uses Hystrix as a circuit breaker. Istio does a similar job.

Talking about microservices: fewer dependencies for critical services; stateless.

Cache in memory and SSD
EVCache (Netflix's own cache, based on memcached):
I prefer to use a KV cache (Redis) as a general cache layer; it supports clustering.

MySQL for anything transaction-based: billing, user info
MySQL master-master synchronization with several read-only slaves, vertically partitioned for the local data center
MySQL handles heavy reads well.

Cassandra: NoSQL store, heavy read/write
user history
separate data into recent and old (zip & archive)
One Cassandra cluster keeps only recent data; another cluster keeps old compressed data.

Event log processing: Kafka & Chukwa
Chukwa collects logs generated by microservices and saves them to HDFS.
(An alternative solution is to use fluentd and route log events directly to ElasticSearch)

Kafka is a message queue system. With Chukwa as producer, events are sent to Kafka, which can filter, use different queues, copy, replay, and so on. Each queue can have multiple consumers to process received messages (saving to different destinations: Spark, ElasticSearch). It is cluster-based and easy to scale out.

ElasticSearch: index and search service, clustered, easy to scale out
Dashboard: Kibana
Search events; events need to be correlated

Spark: data mining / machine learning based on the events:
sorting / row select / ranking / recommendation

Machine learning: get data from Kafka, train, build a model, generate rankings / recommendations

OCA: Open Connect Appliance
Hardware cache for Netflix content, deployed at ISPs.
Use a hash ring to determine which OC server in the cluster holds the content.
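
As a study aid, here is a minimal consistent-hashing sketch of the hash ring idea in the note above. It is my own illustration, not Netflix's implementation: the server names, the number of virtual nodes and the choice of SHA-256 are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, servers, virtual_nodes=100):
        ring = []
        for server in servers:
            # Several points per server spread the load more evenly.
            for i in range(virtual_nodes):
                ring.append((self._hash(f"{server}#{i}"), server))
        ring.sort()
        self._points = [point for point, _ in ring]
        self._servers = [server for _, server in ring]

    @staticmethod
    def _hash(key):
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16)

    def lookup(self, content_id):
        """Return the server owning the first ring point clockwise of the key."""
        index = bisect.bisect_right(self._points, self._hash(content_id))
        return self._servers[index % len(self._servers)]  # wrap around the ring


# Hypothetical server names.
ring = HashRing(["oca-1", "oca-2", "oca-3"])
print(ring.lookup("video-42"))
```

The point of the ring is that removing one server only moves the content that lived on that server; everything else keeps its location, so the caches stay warm.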

Redis system design | Distributed cache System design

Here is the link.

It is better for me to slow down and learn some basics first. I like to spend 30 minutes to watch the video, and take some notes for my first watching.

Caching best practice

Cache miss

Features/ Estimation
1) TeraByte
2) 50k to 1 M QPS
3) close to 1 ms latency
4) LRU (Eviction)
5) 100% Availability
6) Scalable

1. Write through
2. Write around
3. Write back
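
The three write policies listed above can be sketched with two dictionaries, one standing in for the cache and one for the database. This is a simplified illustration under my own assumptions (no eviction, no concurrency); the class and method names are hypothetical.

```python
class WritePolicyCache:
    """Toy cache in front of a dict that stands in for the slower database."""

    def __init__(self, db):
        self.db = db
        self.data = {}
        self.dirty = set()  # keys written to the cache but not yet to the db

    def write_through(self, key, value):
        # Write to the cache and the database before acknowledging.
        self.data[key] = value
        self.db[key] = value

    def write_around(self, key, value):
        # Write only to the database; drop any stale cached copy.
        self.db[key] = value
        self.data.pop(key, None)

    def write_back(self, key, value):
        # Write only to the cache now; the database is updated on flush().
        self.data[key] = value
        self.dirty.add(key)

    def flush(self):
        for key in self.dirty:
            self.db[key] = self.data[key]
        self.dirty.clear()

    def read(self, key):
        if key not in self.data:  # cache miss: load from the database
            self.data[key] = self.db[key]
        return self.data[key]


db = {}
cache = WritePolicyCache(db)
cache.write_through("a", 1)
cache.write_around("b", 2)
cache.write_back("c", 3)
print(db)  # "c" is missing until flush() runs
```

Write back is the fastest for the writer but can lose data if the cache dies before the flush, which is the tradeoff to state in an interview.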

Cache access pattern

-> C <--> DB


Hashmap, doubly linked list -> data structure for LRU
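
Here is a minimal sketch of an LRU cache built from a hash map plus a doubly linked list, as noted above. The hash map gives O(1) lookup of a node, and the list keeps nodes ordered from most to least recently used. The class names and the choice of sentinel nodes are my own.

```python
class _Node:
    def __init__(self, key=None, value=None):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None


class LRUCache:
    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.map = {}          # key -> node
        self.head = _Node()    # sentinel next to the most recently used node
        self.tail = _Node()    # sentinel next to the least recently used node
        self.head.next = self.tail
        self.tail.prev = self.head

    def _unlink(self, node):
        node.prev.next = node.next
        node.next.prev = node.prev

    def _push_front(self, node):
        node.next = self.head.next
        node.prev = self.head
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        node = self.map.get(key)
        if node is None:
            return None  # cache miss
        self._unlink(node)
        self._push_front(node)  # mark as most recently used
        return node.value

    def put(self, key, value):
        node = self.map.get(key)
        if node is not None:
            node.value = value
            self._unlink(node)
            self._push_front(node)
            return
        if len(self.map) >= self.capacity:
            lru = self.tail.prev  # evict the least recently used entry
            self._unlink(lru)
            del self.map[lru.key]
        node = _Node(key, value)
        self.map[key] = node
        self._push_front(node)
```

Both `get` and `put` run in constant time because no list traversal is ever needed.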

LRU - frequency of visit -> Another video to watch

Fault tolerant

1. Regular interval snapshot
2. Log Reconstruction
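
The two fault-tolerance ideas listed above can be combined: take a snapshot at regular intervals, and replay the log of writes made since the last snapshot after a crash. Below is a minimal in-memory sketch under my own assumptions; a string and a list stand in for the snapshot file and the append-only log file, and the class name is hypothetical.

```python
import json


class DurableKV:
    """Toy key-value store recoverable from a snapshot plus a write log."""

    def __init__(self):
        self.data = {}
        self.snapshot = "{}"  # stand-in for a snapshot file on disk
        self.log = []         # stand-in for an append-only log file

    def set(self, key, value):
        # Log the write first, then apply it in memory.
        self.log.append(json.dumps({"key": key, "value": value}))
        self.data[key] = value

    def take_snapshot(self):
        # Persist the full state; older log entries are no longer needed.
        self.snapshot = json.dumps(self.data)
        self.log.clear()

    def recover(self):
        # Start from the last snapshot, then replay later writes in order.
        data = json.loads(self.snapshot)
        for line in self.log:
            entry = json.loads(line)
            data[entry["key"]] = entry["value"]
        self.data = data
        return data


kv = DurableKV()
kv.set("a", 1)
kv.take_snapshot()
kv.set("b", 2)
kv.set("a", 3)
kv.data = {}  # simulate losing the in-memory state in a crash
print(kv.recover())
```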


Leon G. Cooperman - wiki page

Here is the link.

Leon Cooperman: A big market move would be 'knocking on the door of euphoria'

Here is the link.

Tuesday, July 23, 2019

BOOKMYSHOW System Design, FANDANGO System Design | Software architecture for online ticket booking

Here is the link.

  1. Highly concurrent
  2. Responsive UI
  3. Multiple cities
  4. Payments
  5. Movie suggestions
  6. Comments & Rating
  7. Movie 
  8. Send ticket by SMS

Monday, July 22, 2019

Anett Kontaveit | My Story

Here is the link.

My Story | Sofia Kenin

Here is the link.

Elise Mertens | My Story

Here is the link.

It is hard for me to follow the top 20 WTA players, the seeded players. I have to spend time learning about one player tonight. I will remember what I learn from her experience.

Am I good enough for Facebook this summer?

July 22, 2019


It is my personal finance research. I would like to figure out what is the most fun I can have this summer. I do not take things for granted. I had my Facebook phone screen last Wednesday. I have learned that an onsite interview is a great opportunity for me and that I should take it seriously. Am I good enough to pass the phone screen?

Facebook phone screen

I am very experienced as an interviewer on one tree algorithm, lowest common ancestor. But I have to learn how to behave and perform at my best in a phone screen as an interviewee. This year I am learning to play a good supporting role as an interviewee. I know that the interviewer has to do his job and make an assessment.

My weekend is so exciting

July 22, 2019


It is my personal finance research. I am single and live in the city of Vancouver, and I do not have a lot to work on in weekends and evenings. I love to spend time on mock interviews at 10:00 PM every weekday, and I also like to figure out how to make my life in Canada much more rewarding.

Another Amazon onsite in two weeks

It is time for me to push myself to learn system design. I really have to learn the basics first. Today I feel so excited to learn how Uber is designed. I was amazed at how a person can give such great presentations on those system design topics.

I like to say that Amazon makes my summer so exciting.

CoCo Vandeweghe | My Story

Here is the link.

Ask Google: jianmin chen what is value of my blogging

July 22, 2019


It is my personal finance research. I like to search Google,  jianmin chen what is value of my blogging.

Search results

Here is the snapshot of searching results:

Follow up

Nov. 7, 2019


I would like to write about the monetized value of the julia coding blog. After blogging nonstop for five years from 2015, in 2019 I learned that I had to look into the stocks I chose to purchase back in 2000, and the 401(k) and IRA accounts I had. So I went back to investing in equity funds.

As of Nov. 7, 2019, I have set up two portfolios, Key Largo and Victoria, and I also set up my 401(k) Par Tech portfolio. I set up balanced portfolios, and I believe that it is important to stay in the market, not time the market. After six months, the bear market has not arrived in North America. So far the profit is $1,200 for the Key Largo portfolio and $1,000 for the Victoria portfolio.

I choose to stay in the market, so I do not sell the shares for profit.

Sunday, July 21, 2019

UBER System design | OLA system design | uber architecture | amazon interview question

Here is the link.

It is surprising how many things there are to learn to prepare for system design. I really like the learning. I should have worked on system design earlier.

Right now, I have to push myself to learn from videos first.


Here is the link.

Saga Pattern | How to implement business transactions using Microservices – Part I

Here is the article; I would like to read it for 10 minutes first.

Do you know Distributed transactions?

Here is the link.

Case study: First two months back in stock market in 2019

July 21, 2019


It is my personal finance research. I just completed my first 60 days in the stock market after a 10-year break from 2009 to 2019. I would like to write down a few lessons I have learned and how I should push myself to learn more about personal finance.

Case study

I set up a portfolio in my Par 401(k), another in my US IRA, and another in my Canadian TFSA account.

The first month was tough. I went through different emotions during the experience. I learned to follow my research, and just do it.

Portfolio Key Largo IRA account

Profit: $651.66 (+3.42%)
Annualized yield
+26.74% over 75 days
Cost: 19069.19
Return: 19720.85
Sell: 0.00
Dividend: 126.71

Here are highlights:
1. Install JStock app on my Samsung phone;
2. Go over all links and features on
3. Go over all links on
4. Go over all Charles Schwab links on my Par 401 K
5. Download all statements, and study page by page
6. Keep learning on CNBC, try to understand basics, like yield curve, long term bond, etc.
7. Spend time wisely. Do not peek so often.
8. Talk about the portfolio, and then I can release some stress

What I did correctly

I just think that I should not time the market. I should stay positive, and take the loss if a bear market comes within six or 12 months; I still have to study how to rebalance my portfolio if a bear market comes.

I thought many times about whether I should dollar cost average, but I noticed that time is wasted if I park all the funds in a money market fund.

What I should do next

I would like to read more about investment, so that I can learn more about the basics.

Distributed File Systems - Part 1

Here is a 100-minute video given by a professor from the University of Texas at Austin. I need to figure out where I can find time to watch the video and learn.

July 21, 2019 4:24 PM - 4:54 PM

What is HDFS | Hadoop Distributed File System (HDFS) Introduction | Hadoop Training | Edureka

Here is the link.

1. What is DFS and Why Do We Need It? 2. What is HDFS? 3. HDFS Architecture 4. HDFS Replication Factor 5. HDFS Commands Demonstration on a Production Hadoop Cluster

MapReduce tutorial - Fundamentals of MapReduce with MapReduce example

I plan to spend 30 minutes to read the article. Here is the link.

System design basics: Learn about Distributed file systems

Here is the link.

System design basics: When to use distributed computing | how distributed computing works

Here is the link.

How to understand mapreduce?

Distributed Locks | System design basics

Here is the link.

Mutex, semaphore, ...

Sudden Wealth Radio: Avoid These Sudden Wealth Mistakes

Here is the link.

Resist making any promises - how much money can you afford to give?

Keep your finances private. No one should know. Others will tell you what to do with your money, and ask you for money.

Tremendous amount of anxiety. It becomes overwhelming.

Dealing with 'sudden wealth syndrome'

Here is the link.

It is hard to manage success, same as failure.

H.O.R.S.E with former NBA player Troy Murphy

Here is the link.

Thank You Troy Murphy

Here is the link.

NBA Sharpshooter Troy Murphy Takes Aim at Sudden Wealth

Here is the link.

Why NBA veteran Troy Murphy is dedicating his life to improving financial literacy

Here is the link.

Seven wealth - Ex-NBA player turns into good wealth advisor

Here is the link.


Here is the link.

Money EQ Assessment

Each of us has a unique emotional relationship with money that is influenced by a host of attitudes, opinions, and past experiences. We call this dynamic our Money EQ, or Emotional Intelligence regarding all things money-related, and it may drive our behaviors concerning money. This assessment explores topics that we feel influence our Money EQ. Please answer candidly and go with your gut. Once you have responded to a question, you will not be able to change your answer. Your responses will remain anonymous.

Your results affirm that you appear to hold certain attitudes or beliefs that run counter to a healthy emotional relationship with money. Your responses suggest that you may experience an unhealthy sense of concern about money.

You may feel...

worry and anxiety about money.
driven to accumulate money to reduce your level of stress.
uncertain about your ability to make smart decisions about money.
money has created problems for you and your current situation could be negatively impacted.
jealous of other people’s money, or a sense that you don’t have enough.

Challenge yourself by exploring the following:

Reflect on opportunities to be grateful for your current financial situation.
Be more open with family about money: share your concerns.
Reach out for advice when you feel challenged or experience a problem with money.
Continue to learn about money and how it might positively impact your life.

How You Can Become More Emotionally Intelligent About Money

Here is the article.

By taking this step of self-discovery to examine how money and your feelings about it shaped your experiences, you can begin to identify a variety of fear-based responses like hoarding, anxiety, guilt, and jealousy that may be a product of your money history. This understanding can help you build the capability to shift your thoughts and feelings, leading to greater freedom and contentment with money.

Why Bond Prices and Yields Move in Opposite Directions

Here is the article.

Morning hour tennis

July 21, 2019


It is a tough journey. I have to work on weight loss, and one thing I can do is weigh myself in the morning. I like to go out to play tennis in the morning, and then play tennis in the afternoon as well.

Tennis sports

I have to figure out how to lose weight over the weekend. I would like to lose 2 - 3 lb this weekend. This morning I weighed 190 lb.

Distributed Systems in One Lesson by Tim Berglund

Here is a 40-minute video. I plan to watch it and take some notes here.

How to Take Notes in Class: The 5 Best Methods - College Info Geek

Here is the popular video with more than 1 million views.

outline method
cornell method
mind map method
flow method
write on the slides method

Saturday, July 20, 2019

Learn System design : Distributed datastores | RDBMS scaling problems | CAP theorem

Here is the link.

Bond Bear Market Guide

Here is the link.

Consider this: In the period from May 1 to July 31, 2013, the bond market was hit hard as the yield on the 10-year note soared from 1.64% to 2.59%. (Keep in mind, prices and yields move in opposite directions.) During that time period, the Vanguard Long-Term Bond ETF (BLV) was hit for a loss of 10.1%, while the Vanguard Intermediate Term Bond ETF (BIV) declined 5.0%. In the same time period, however, the Vanguard Short Term Bond ETF (BSV) fell just 0.6%.
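
To see why prices and yields move in opposite directions, and why long-term bonds were hit hardest, here is a simplified pricing sketch. It discounts annual coupons and the face value at a single yield. The 2% coupon and the 2-year and 20-year maturities are my own hypothetical choices, and applying the 10-year note yields from the article to both bonds is a simplification, so the numbers will not match the ETF losses quoted above.

```python
def bond_price(face, coupon_rate, yield_rate, years):
    """Present value of annual coupons plus the face value at maturity."""
    coupon = face * coupon_rate
    pv_coupons = sum(coupon / (1 + yield_rate) ** t
                     for t in range(1, years + 1))
    pv_face = face / (1 + yield_rate) ** years
    return pv_coupons + pv_face


def price_change(years, old_yield=0.0164, new_yield=0.0259,
                 coupon_rate=0.02):
    """Fractional price change when the yield moves from old to new."""
    old = bond_price(100.0, coupon_rate, old_yield, years)
    new = bond_price(100.0, coupon_rate, new_yield, years)
    return (new - old) / old


for years in (2, 20):
    print(f"{years}-year bond: {price_change(years):+.1%}")
```

Each future payment is divided by a power of (1 + yield), so a higher yield shrinks every term, and payments further in the future shrink the most. That is why the longer bond falls further for the same rise in yield.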

Learn System design : Distributed Systems Introduction | Horizontal scaling vertical scaling

Here is the link.

Watch this before your System design interview!!

Here is the link.

Google Docs System design | Part 1| Operational transformation | differentail synchronisation

Here is the link.

Designing Data-Intensive Applications

Here is the link to download the book. I like to spend 30 minutes a day to read the book.

Here's how to play big tech into earnings next week

Here is the link.

I like to invest some time to learn from CNBC television.

Netflix isn't growing as fast as Wall Street wants and the stock gets hit

Here is the link.

Tech Dummies - Narendra L

Here is the link.

What is NoSQL and how is it used? Deep dive with Cassandra!

Here is the link.

NoSQL is a popular database storage method. It keeps data as key value pairs. The advantages and disadvantages of NoSQL compared with RDBMS are discussed here, using the Cassandra architecture as an example. We talk about sharding, redundancy, load balancing, compaction and some other features in NoSQL databases. This allows them to scale efficiently.


Here is the link.



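Buy the whole world with ETFs: why should you have a US stock / ETF account?

Here is the link.

It is not only foreign companies; many companies from both sides of the Strait have also gone to the United States to list: Alibaba (BABA), JD.com (JD), NIO, Foxconn and so on. After the travel company we all know, Ctrip (CTRP), listed on the US NASDAQ exchange in 2004, its share price grew from 2 US dollars in 2003 to a peak of 60 US dollars in 2017, a gain of as much as 30 times.
As mentioned in an earlier article, the platforms where you can currently buy these ETFs are the banks' sub-brokerage services and foreign brokerage platforms, but the fees of the former are much higher than those of the latter. In my own case, I use a US stock account at 雪盈证券.
The platform I currently use is a US stock account at @雪盈证券, and I have used it for about half a year. Its advantage is that it is the platform with the lowest fees and lowest cost of use that I have found so far; there are no commissions on buying and selling, and it has a Chinese-language interface. The downside is that, unlike domestic banks, nobody advises you when you buy, so you also need a certain level of understanding of investment products.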

Whatsapp System Design: Chat Messaging Systems for Interviews

Here is the link.

The Whatsapp system architecture is a common system design interview question. This interview question asks us to select a set of features like sending chat messages, read receipts, group messaging and last seen visibility. The chat system must be scalable and have other non functional requirements like message ordering, retrial, idempotency, load balancing and image sharing.

System Design: Designing Tinder's Server Side

July 20, 2019

Here is the link.

We design the system architecture of Tinder. Tinder is an app used for dating and socializing. Designing these apps starts with clarifying the system requirements. In an interview, Tinder has multiple requirements. When designing the system, we will be drawing logical components and defining their interactions with other services based on the microservice architecture. We assume that any internal details can be handled with our prior knowledge of system design concepts. Check the reference section for a set of links.

The Tinder system has four requirements: storing profiles, recommendations, noting matches and chatting with matches. Storing profiles is trivial except for the image storage, where we argue on BLOB vs File storage. The distributed file architecture seems best when storing images. Direct Messaging or chatting with matches can be done using the XMPP protocol, which uses web sockets to have peer to peer communications between client and server. Each connection is built over TCP, ensuring that the connection is maintained. The session microservice can send messages to the receiver based on connection to user mappings.

Our system is decoupled as much as possible. We try to maintain accept and reject information on the client. On swiping left or right, the client can note the action and avoid showing the same user again, perhaps using bloom filters. The server has a validation engine called the matcher microservice, which notes matches and allows or disallows chat between two users. The final requirement of recommendations needs city wise partitioning on the user data. This is achieved using NoSQL databases like Cassandra or Amazon Dynamo. The other option is to use relational databases with horizontal partitioning. The concept is now referred to as sharding.
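
The description above suggests bloom filters for remembering which profiles were already swiped. Here is a minimal sketch of that data structure as a study aid. The bit array size, the number of hash functions, the use of SHA-256 and the user ids are all my own assumptions, not details from the video.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, some false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for simplicity

    def _positions(self, item):
        # Derive several positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for position in self._positions(item):
            self.bits[position] = 1

    def might_contain(self, item):
        # False means definitely never added; True means probably added.
        return all(self.bits[position] for position in self._positions(item))


# Hypothetical user ids.
swiped = BloomFilter()
swiped.add("user-123")
print(swiped.might_contain("user-123"))
```

A profile that was swiped is always reported as seen. A profile that was never swiped is occasionally reported as seen, which for this use case only means one candidate is skipped, in exchange for storing a small bit array instead of every id.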

Follow up

August 3, 2019 8:57 PM

Start from four or five features since there is one hour limit.
1. Store profiles (images) - 5 images per user
2. Recommend matches (number of active users)
3. Note matches
4. Direct messaging

File vs Blob (Binary large object)

XMPP - take a look and read 10 minutes
TCP - connection, how to maintain?

Case study: My June Victoria portfolio monthly statement 2019-June - Part 6

Investment return


I like to work on investment return. I am looking for long term investment return, for example, 4 years return.

Case study: My June Victoria portfolio monthly statement 2019-June - Part 5

TFSA summary

The total contribution room available is $63,000, so I should be able to transfer another $20,000 to my TFSA account.

Case study: My June Victoria portfolio monthly statement 2019-June - Part 4

Balance changes

Case study: My June Victoria portfolio monthly statement 2019-June - Part 3

Total book cost

It is very easy to compare total book cost and total market value. But I do think that the quantity of shares is also important. The market may go up and down; over the long run, I only care about the number of shares and the final price at which I would like to sell, or the internal value in the current month.

Case study: My June Victoria portfolio monthly statement 2019-June - Part 2


5 Tips for System Design Interviews

Here is the link.

Here are 5 tips for system design interviews. They are helpful when preparing for a system design interview.

1) Don't get into details prematurely.
2) Avoid fitting requirements to a set architecture in mind.
3) Keep it simple, stupid! Remember to look at the big picture and avoid too many hacks when solving.
4) Have justifications for the points you make. Don't use buzzwords or half-hearted thoughts in your design.
5) Be aware of the current solutions and tech practices. A lot of solutions can be purchased off the shelf, which simplifies implementation. You should be able to argue for a custom implementation with its pros and cons.

Have these in the back of your mind during your interview, and all the best!

Here are three major points evaluated during the interview:

(1) Clarity of Thought
a) Express your thoughts in a clear manner.
b) Justify your decisions. Critical reasoning and argument are key to a successful software design.
c) When faced with a problem, use standard approaches to mitigate it. For example, say you are faced with an availability problem. State that replication and partitioning help increase availability in general, and move on to offer a solution.
d) Don't make points without thinking them through. Half-hearted attempts at solving problems are frowned upon heavily.

(2) Knowledge
a) Stay up to date with the current solutions in the market. This includes products and design practices. If NoSQL is being adopted left, right and center, you need to be aware of it.
b) Know when to pick a solution vs. building something custom. If you name a product, you should be (generally) aware of the features it provides.
c) Design practices enable you to meet custom requirements. Examples are decoupling systems, load balancing, sticky sessions, etc.

(3) Flexibility
a) Switch your targets as the requirements shift. If the interviewer wants to know about one particular part of the system, do it first.
b) Never have a set architecture in mind. We all try to fit requirements to a system, but only after it has been shaped by the initial ones. A rigid attitude creates a brittle architecture. It will break before you do.
c) Take a step back at times to make adjustments to the general architecture. Being focused on one part can narrow our vision and bloat those areas. There will be components which can be extracted out and extended to the rest of the system.

Case study: My June Victoria portfolio monthly statement 2019-June - Part I

July 20, 2019


This is my personal finance research. I would like to figure out how well the monthly statement is prepared for customers. I would also like to learn every detail of my investments and look into new concepts to catch up. If I invest time early, I will benefit a lot later when I rebalance my portfolio.

Case study

I like to see how good the presentation is.

A few things I like:

I learn the summary of report, and balance in dollar amount.
Account statement

6 Ways to Prepare for the Next Market Decline

Here is the link.

The U.S. economy is putting up some impressive numbers in GDP, jobs and wages, but many pundits fear that a slowdown is pending. Trade-war fears with China and the European Union remain front and center in the news. And the yield curve is threatening to invert, meaning short-term interest rates may be moving higher than long-term rates. That’s often a sign of pending recession on its own.

By some measures, the current expansion is now 10 years old, making it one of the longest on record. That seems ancient, but there’s no rule that says it can’t continue. Australia is in its 28th consecutive year of economic growth.

Even so, all good things do eventually come to an end. And for the U.S. (and for Australia, for that matter), economists are looking for slowdowns. Even the Federal Reserve has indicated it is ready to lower short-term interest rates to combat any problems that may arise.

Professional investment managers may look to sell a good deal of their holdings to step aside as the market falls. However, for most individuals, timing the market by selling when conditions seem dicey, and buying back when conditions firm up, is a big mistake. Even the pros don’t always get it right, and they have armies of analysts and rooms full of technology at their disposal.

Quick test

Here’s a quick test: Any stock that hit a 52-week low in April or May as the Standard & Poor’s 500 hit a 52-week (and all-time) high probably will not fare well on a market swoon. The same goes for stocks at 52-week lows at today’s highs.

Just remember that this is a tweak of your portfolio. You are not timing the market, per se, because you will remain largely invested.

Portfolios benefit from owning a percentage of bonds or other fixed-income investments. While bonds typically do not offer the same capital appreciation potential as stocks, their relative price stability and income streams can offset weakness in stocks.

One rule of thumb for a diversified portfolio across different asset classes is 55% stocks, 35% bonds and 10% cash. Of course, this will look a little or very different depending on your risk tolerance and how close you are to retirement.

How Bell Labs Creates Star Performers

It’s likely that IQ was determined to not be a factor because everyone employed by Bell Labs already had a baseline skill of raw intelligence. (Robert Kelley and Janet Caplan, “How Bell Labs Creates Star Performers,” Harvard Business Review, July–August 1993)

After we met with the experts in groups, they came to a consensus about the two categories—cognitive skills and work strategies—that influence high productivity. Since all Bell Labs engineers score at the top in IQ tests, cognitive abilities neither guarantee success nor differentiate stars from middle performers. However, the Bell engineers identified nine work strategies that do make a difference: taking initiative, networking, self-management, teamwork effectiveness, leadership, followership, perspective, show-and-tell, and organizational savvy (see the chart “An Expert Model for Engineers”).

What Great Problem Solvers Do Differently

Here is the link.

Friday, July 19, 2019

What is Distributed Caching? Explained with Redis!

Here is the link.

Case study: My Facebook phone screen on July 17, 2019

July 19, 2019


It is a good idea to write a case study about my past phone screen with Facebook. I would like to document how well the interviewer managed the time and expectations, and how efficiently he handled the assessment task.

Case study

I got a call at 10:15 AM. The interviewer introduced himself, described what he works on in his daily job, and asked a simple question. He told me that he would like to end the interview at 10:56. I checked the time when he said that; it was 10:17. He planned to spend 39 minutes on algorithm questions. He would give me five minutes to ask questions.

He copied and pasted the question, and then he said that he would read the problem statement word by word, so he gave me time to read and think about the statement.

I thought for a few minutes and shared my idea, and he liked it. He told me that I could start to write the code. He asked me to think out loud.

The algorithm was not a hard one, but I did say that I was thinking about how to determine the position in the array. The interviewer gave me a hint right away, just enough for me to come up with the idea of solving it in a single line of code.

I finished the coding and then reviewed my code, and I found that the code was buggy. I talked about two edge cases. The interviewer gave me a second hint about how to handle the edge cases, and I took his advice. I just added a few lines of code.

I checked the time; it was 10:28 or 10:38, and I was then asked to solve the second algorithm. I believed that it was a hard-level algorithm. I had ideas for how to solve the problem, but my design did not handle the possible issues with my choice well.

This was the most challenging part of the problem solving. I had to learn how to work with the interviewer. He quickly gave me all kinds of test cases, which helped me think thoroughly about the problem. I was nervous, but I knew that it was important to listen and think about the test cases he gave, and I was surprised that I came up with an idea for how to change my design.

The interviewer said that he believed I could code the algorithm if I were given the time. That was a wonderful experience.

Actionable Items

I do know that a strong algorithm problem solver can come up with the idea in less than five minutes, but it takes me over 20 minutes to work on a hard-level algorithm, and I also need a few hints to change my design.

I think time is so critical in an official phone screen. I understand that the interviewer will find out if I can perform under pressure, and how focused I can ...

What is Load Balancing?

Here is the link.

Asynchronous Processing in Web Applications, Part 2: Developers Need to Understand Message Queues

Here is the link.

What is a Message Queue and Where is it used?

Here is the link.

Message queues are widely used in asynchronous systems. Processing messages asynchronously frees the client from waiting for a task to complete, so it can do other work in the meantime. It also allows a server to process its jobs in the order it wants to. Message queues provide useful features such as persistence, routing and task management. We will be discussing the benefits of a message queue in future videos. A system with a message queue can focus on higher-level requirements while delegating the implementation details of message delivery and event handling to the queue. The 'queue' is just a name for this data structure; in practice, it could store and deliver messages using any policy. Some examples of message queues are Kafka and RabbitMQ. They are widely used for various purposes such as Command Query Responsibility Segregation (CQRS) and event sourcing.
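As a rough illustration of the asynchronous hand-off described above, here is a minimal in-process sketch using Python's standard `queue` and `threading` modules. It is only a stand-in for a real broker such as Kafka or RabbitMQ: there is no persistence or routing, and the job names are made up for the example.

```python
import queue
import threading

tasks = queue.Queue()   # in-memory FIFO queue standing in for a message broker
results = []
STOP = object()         # sentinel telling the worker to shut down


def worker() -> None:
    """Consumer: pulls messages off the queue and processes them one by one."""
    while True:
        msg = tasks.get()
        if msg is STOP:
            tasks.task_done()
            break
        results.append(f"processed {msg}")
        tasks.task_done()


consumer = threading.Thread(target=worker)
consumer.start()

# Producer: put() returns immediately, so the caller is free to do other work
# instead of waiting for each job to finish.
for job in ("resize-image-1", "send-email-2", "resize-image-3"):
    tasks.put(job)

tasks.put(STOP)
tasks.join()      # block only when we actually need everything to be done
consumer.join()
```

With a single consumer and a FIFO queue the jobs are handled in submission order; a real message queue may apply a different delivery policy.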

Thursday, July 18, 2019

What is Consistent Hashing and Where is it used?

Here is the link.

Consistent hashing

Here is the wiki article I like to read.

In computer science, consistent hashing is a special kind of hashing such that when a hash table is resized, only K/n keys need to be remapped on average, where K is the number of keys, and n is the number of slots. In contrast, in most traditional hash tables, a change in the number of array slots causes nearly all keys to be remapped because the mapping between the keys and the slots is defined by a modular operation.
Consistent hashing achieves some of the goals of rendezvous hashing (also called HRW Hashing), which is more general, since consistent hashing has been shown to be a special case of rendezvous hashing. Rendezvous hashing was first described in 1996, while consistent hashing appeared in 1997. The two techniques use different algorithms.

Consistent hashing maps objects to the same cache machine, as far as possible. It means when a cache machine is added, it takes its share of objects from all the other cache machines and when it is removed, its objects are shared among the remaining machines.
The main idea behind the consistent hashing algorithm is to associate each cache with one or more hash value intervals where the interval boundaries are determined by calculating the hash of each cache identifier. (The hash function used to define the intervals does not have to be the same function used to hash the cached values. Only the range of the two functions need match.) If the cache is removed its interval is taken over by a cache with an adjacent interval. All the remaining caches are unchanged.
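The interval idea above can be sketched as a small hash ring. This is one common way to implement consistent hashing, not the only one; the class name `HashRing`, the number of virtual points per cache (`replicas`) and the cache names are assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), replicas: int = 100):
        self.replicas = replicas   # virtual points per cache, to even out the load
        self._points = []          # sorted ring positions
        self._owner = {}           # ring position -> cache name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node: str) -> None:
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            del self._owner[point]
            self._points.remove(point)

    def lookup(self, key: str) -> str:
        # The first ring position clockwise from the key's hash owns the key.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"object-{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}

ring.add("cache-d")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Every key in `moved` now lives on `cache-d`, and no key moves between the three original caches. Removing `cache-d` again hands its intervals back to the neighbouring caches, so the original mapping is restored.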

System Design - Gaurav Sen

Here is the link.

System Design Introduction For Interview.

Here is the link.

System design introduction

A - Ask good questions
B - Don't use buzzwords
C - Clear and organized thinking
D - Drive discussions with 80-20 rule

Things to consider
- Features
- API
- Availability
- Latency
- Scalability
- Durability
- Class Diagram
- Security and Privacy
- Cost-effective

Concepts to know
- Vertical vs horizontal scaling
- CAP theorem
- ACID vs BASE
- Partitioning/Sharding
- Consistent Hashing
- Optimistic vs pessimistic locking
- Strong vs eventual consistency
- Relational DB vs NoSQL
- Types of NoSQL: Key value, Wide column, Document-based, Graph-based
- Caching
- Data center/racks/hosts
- CPU/memory/Hard drives/Network bandwidth
- Random vs sequential read/writes to disk
- HTTP vs HTTP/2 vs WebSocket
- TCP/IP model
- IPv4 vs IPv6
- TCP vs UDP
- DNS lookup
- HTTP & TLS
- Public key infrastructure and certificate authority (CA)
- Symmetric vs asymmetric encryption
- Load Balancer
- CDNs & Edges
- Bloom filters and Count-Min sketch
- Paxos
- Leader election
- Design patterns and Object-oriented design
- Virtual machines and containers
- Pub-sub architecture
- MapReduce
- Multithreading, locks, synchronization, CAS (compare and set)

Tools
- Cassandra
- MongoDB/Couchbase
- MySQL
- Memcached
- Redis
- ZooKeeper
- Kafka
- NGINX
- HAProxy
- Solr, Elasticsearch
- Amazon S3
- Docker, Kubernetes, Mesos
- Hadoop/Spark and HDFS

System design tips

Here is the link. 

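Operating Systems: Three Easy Pieces - baidu 1point3acres
Designing Data-Intensive Applications

- (This one is explained quite well; I went through it many times.) (This one covers the fundamentals well, but its concrete examples are only average.)
- Distributed Systems in One Lesson (the full version is on Safari, about 4 hours long, and very well explained)

When I got tired of grinding practice problems, I watched these videos for fun... They are quite interesting, much more interesting than doing problems. From 1point3acres bbs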
- Scalability Harvard Web Development
- Gaurav Sen
- Tushar Roy - Coding Made Simple
- Tech Dummies - Narendra L
- Coding Tech

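- 架构师之路 (strongly recommended!!!)
- 51CTO

Although I have not used these much in my actual work, they are mentioned on my résumé, so I carefully read the official documentation, design documents, and various tutorial videos...
I mainly focused on architecture rather than on API usage.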
- Kafka
- Cassandra
- Consul

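1. Discuss who the users are.
2. Discuss the features based on the users.
3. Ask how much traffic the system needs to handle, and ask whether calculations are needed. Out of 8 system design interviews, only Roblox required calculations. None of the others did...
4. Based on the features, discuss which data the system needs to store and serve, and what to store it in; discuss the tradeoffs among SQL/NoSQL/cache/object storage/HDFS, and so on...
5. Based on the data, design the services. Draw the diagram.
6. Work through one use case to connect all the services, revising the diagram along the way. For example, for Uber Eats, discuss the flow from a user ordering food, to the restaurant receiving the order, to the driver receiving the order...
7. Discuss the details of the use case, for example, how to detect that an Uber Eats driver has entered a certain area, and how to store that in the cache. The interviewer will drive your design the whole time and will not leave you talking to yourself.
8. The interviewer will ask how to handle a component going down. It comes down to either 1. replicas (master-slave, active-passive), or 2. periodically saving a snapshot to disk and then keeping an action log... so that the state can be restored after a failure...
9. How to scale certain components... multiple instances, partitioning, and the like... Occasionally mention service mesh...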
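The second recovery option in step 8 (periodic snapshot plus an action log) can be sketched as follows. This is a toy, in-memory illustration: the `Store` class, its method names and the driver/zone keys are hypothetical, and the snapshot string stands in for a file that would really be written to disk.

```python
import json


class Store:
    """Tiny key-value store that recovers from a snapshot plus an action log."""

    def __init__(self):
        self.data = {}
        self.log = []          # actions applied since the last snapshot
        self.snapshot = "{}"   # serialized state; stands in for a file on disk

    def set(self, key, value):
        self.log.append(["set", key, value])   # record the action first
        self.data[key] = value

    def take_snapshot(self):
        self.snapshot = json.dumps(self.data)
        self.log.clear()       # older actions are now covered by the snapshot

    def recover(self):
        # Rebuild state: load the last snapshot, then replay the logged actions.
        self.data = json.loads(self.snapshot)
        for op, key, value in self.log:
            if op == "set":
                self.data[key] = value


store = Store()
store.set("driver:7", "zone-A")
store.take_snapshot()
store.set("driver:7", "zone-B")   # happens after the snapshot, so only in the log
store.set("driver:9", "zone-A")

store.data = {}                   # simulate a crash that wipes in-memory state
store.recover()
```

After `recover()`, the store again holds both drivers with their latest zones, because the log replays the two writes made after the snapshot.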