Saturday, July 27, 2019

NETFLIX System design | software architecture for netflix

Here is the link.

1. OC
2. Backend
3. Client

AWS, Open Connect


Original, five more edge servers - videos are saved in those servers.



Study notes


The design uses Hystrix as a circuit breaker. Istio can do a similar job:
https://www.exoscale.com/syslog/istio-vs-hystrix-circuit-breaker/
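To make the circuit breaker idea concrete, here is a minimal sketch in Python. It is not Hystrix's or Istio's implementation. The class name, the parameters (max_failures, reset_timeout) and the injectable clock are all assumptions chosen for illustration.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    # Hypothetical sketch: opens after max_failures consecutive failures,
    # then allows one trial call once reset_timeout seconds have passed.
    def __init__(self, max_failures=3, reset_timeout=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Fail fast instead of hitting a dependency that is down.
                raise CircuitOpenError("circuit open, failing fast")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each remote call, e.g. breaker.call(fetch_profile, user_id), and fall back to a default response on CircuitOpenError (fetch_profile is a made-up name).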


On microservices: keep dependencies to a minimum for critical services; keep services stateless.

Cache in memory and on SSD
EVCache (Netflix's own cache, based on memcached)
I prefer a KV cache (Redis) as a general cache layer; it supports clustering.
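A rough sketch of how a service might read through such a cache layer (the cache-aside pattern). A plain dict with expiry stands in for Redis or EVCache here. TTLCache, get_with_cache and load_from_db are hypothetical names, not part of any real client library.

```python
import time


class TTLCache:
    # In-memory stand-in for a KV cache; entries expire after ttl_seconds.
    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._data = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._data[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key, value):
        self._data[key] = (self.clock() + self.ttl, value)


def get_with_cache(cache, key, load_from_db):
    # Cache-aside: try the cache first, fall back to the database on a miss,
    # then populate the cache so the next read is served from memory.
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)
        cache.set(key, value)
    return value
```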

Database:
RDBMS using MySQL
MySQL for anything transaction based: billing, user info
MySQL master-master synchronization, with several read-only slaves, vertically partitioned for the local data center
MySQL handles heavy read loads well.

Cassandra: NoSQL store for heavy read/write loads
Used for user history
Separate the data into recent and old (zipped and archived)
One Cassandra cluster keeps only recent data; another cluster keeps old, compressed data.
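The recent/old split can be sketched as a periodic archiving job. This is only an illustration: two dicts stand in for the two Cassandra clusters, and the 90-day cutoff, the field names (title, watched_at) and the function names are assumptions.

```python
import json
import zlib
from datetime import datetime, timedelta


def archive_old_history(recent_store, archive_store, now, max_age_days=90):
    # Move events older than the cutoff out of the "recent" store and
    # append them, compressed, to the "archive" store.
    cutoff = now - timedelta(days=max_age_days)
    for user_id, events in list(recent_store.items()):
        old = [e for e in events if e["watched_at"] < cutoff]
        if not old:
            continue
        recent_store[user_id] = [e for e in events if e["watched_at"] >= cutoff]
        rows = [{**e, "watched_at": e["watched_at"].isoformat()} for e in old]
        blob = zlib.compress(json.dumps(rows).encode("utf-8"))
        archive_store.setdefault(user_id, []).append(blob)


def read_archive(archive_store, user_id):
    # Decompress every archived blob for a user back into a list of events.
    events = []
    for blob in archive_store.get(user_id, []):
        events.extend(json.loads(zlib.decompress(blob).decode("utf-8")))
    return events
```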

Event log processing: Kafka & Chukwa
http://jasonwilder.com/blog/2012/01/03/centralized-logging/
Chukwa collects the logs generated by the microservices and saves them to HDFS.
(An alternative is to use fluentd to route log events directly to ElasticSearch.)

Kafka is a message queue system. With Chukwa as the producer, events are sent to Kafka, where they can be filtered, routed to different queues, copied, replayed, and so on. Each queue can have multiple consumers that process the received messages and save them to different destinations (Spark, ElasticSearch). It is cluster based and easy to scale out.
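The "multiple consumers" and "replay" points follow from the queue being an append-only log with a separate read position per consumer. A toy in-memory sketch of that idea (it does not use the Kafka client API; Topic, poll and seek are illustrative names):

```python
class Topic:
    # Toy model of one queue: an append-only log plus one offset per consumer group.
    def __init__(self):
        self._log = []      # events in arrival order, never removed on read
        self._offsets = {}  # consumer group -> index of the next event to read

    def publish(self, event):
        self._log.append(event)

    def poll(self, group, max_events=10):
        # Each group reads from its own offset, so groups do not steal
        # events from each other.
        start = self._offsets.get(group, 0)
        batch = self._log[start:start + max_events]
        self._offsets[group] = start + len(batch)
        return batch

    def seek(self, group, offset):
        # Replay: move a group's offset back and read the same events again.
        self._offsets[group] = offset
```

For example, a "spark" group and an "elasticsearch" group can each poll the same topic and both see every event. Calling seek("spark", 0) lets the first one reprocess from the start.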

ElasticSearch: index and search service, cluster based, easy to scale out
Dashboard: Kibana
Search events; events need to be correlated
https://docs.microsoft.com/en-us/azure/architecture/microservices/logging-monitoring#distributed-tracing
https://medium.com/@_jesus_rafael/designing-a-event-trace-microservice-86846efb951a
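One common way to correlate events is to have every service log the same correlation ID for a given request, then group on it. A small sketch, assuming each event is a dict with hypothetical fields correlation_id, ts and service:

```python
from collections import defaultdict


def build_traces(events):
    # Group events from all services by correlation ID, then order each
    # group by timestamp to reconstruct the path of one request.
    traces = defaultdict(list)
    for event in events:
        traces[event["correlation_id"]].append(event)
    for trace in traces.values():
        trace.sort(key=lambda e: e["ts"])
    return dict(traces)
```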

Spark: data mining / machine learning based on the events:
sorting / row selection / ranking / recommendation

Machine learning: get data from Kafka, train, build a model, generate rankings/recommendations

OCA: Open Connect Appliance
Hardware cache for Netflix content, deployed at ISPs.
Use a hashing ring to determine which OC server in the cluster holds the content.
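A minimal consistent hashing sketch of that lookup. This is a generic textbook version, not Netflix's actual scheme. The server names, the MD5 hash and the number of virtual nodes are assumptions.

```python
import bisect
import hashlib


class HashRing:
    # Each server is placed on the ring at several points (virtual nodes)
    # so that content spreads more evenly across servers.
    def __init__(self, servers, vnodes=100):
        self._ring = []  # sorted list of (hash, server)
        for server in servers:
            for i in range(vnodes):
                self._ring.append((self._hash(f"{server}#{i}"), server))
        self._ring.sort()
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def server_for(self, content_id):
        # Walk clockwise from the content's hash to the next server point,
        # wrapping around to the start of the ring if needed.
        if not self._ring:
            raise ValueError("ring has no servers")
        idx = bisect.bisect_right(self._keys, self._hash(content_id)) % len(self._ring)
        return self._ring[idx][1]
```

The point of the ring is that removing one server only remaps the content that lived on it; everything else stays where it was.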


