Tuesday, August 11, 2020

Netflix tech talk: Netflix architecture evolution

 Here is the link. 

Architecture evolution

Sessions - Oracle database - scale up - no scale out - ad hoc ... painful, not evolved 


Real time data - gen 1 pain points

- scalability - DB scaled up not out

- Event data analytics - ad hoc

- Fixed schema

More expensive Oracle hardware - compute, cannot scale out

Real time data - gen 2 motivation

Scalability - scale out not up 

Flexible schema - key/value attributes

Service oriented

Real time data - gen 2 pain points

- scale out - resharding was painful

- performance - hot spots

- Disaster recovery - SimpleDB had no backups
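
A rough sketch of why resharding can hurt. It assumes keys are placed with a simple hash-modulo scheme, which the talk did not state; the key names and helper functions are made up for illustration. The 50 comes from the viewing service partition count noted further down.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard with hash-modulo placement (an assumed scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def fraction_moved(keys, old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


# Hypothetical member keys; growing from 50 to 51 shards.
keys = [f"member-{i}" for i in range(10_000)]
moved = fraction_moved(keys, 50, 51)
print(f"{moved:.0%} of keys change shard")
```

With modulo placement, adding a single shard relocates nearly every key, so a reshard means rewriting most of the data set rather than a small slice of it.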

Real time data - gen 3 landscape

- Cassandra 0.6

- Before SSDs in AWS

- Netflix in 1 AWS region

Real time data - gen 3 motivations

- order of magnitude increase in requests

- scalability - actually scale out rather than up 

Real time data - gen 3 writes

start - stop 

viewing service - 

gen 3 - Cluster Scale 

cluster - scale 

Real time Data - gen 3 pain points

- stateful tier - hot spots, multi-region complexity

- monolithic service

- read-modify-write is poorly suited for memcached
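
A minimal sketch of the read-modify-write problem, using a toy in-memory cache rather than a real memcached client. The class, key names, and titles are hypothetical; the `gets`/`cas` names only mirror the memcached commands of the same name.

```python
class ToyCache:
    """In-memory stand-in for a cache with versioned writes (illustrative only)."""

    def __init__(self):
        self._data = {}  # key -> (value, version)

    def gets(self, key):
        """Return (value, version); a missing key reads as (None, 0)."""
        return self._data.get(key, (None, 0))

    def set(self, key, value):
        """Unconditional write: overwrites whatever is stored."""
        _, version = self._data.get(key, (None, 0))
        self._data[key] = (value, version + 1)

    def cas(self, key, value, version):
        """Write only if the stored version still matches the one that was read."""
        _, current = self._data.get(key, (None, 0))
        if current != version:
            return False
        self._data[key] = (value, current + 1)
        return True


def append_with_cas(cache, key, item):
    """Retry the read-modify-write until no other writer got in between."""
    while True:
        value, version = cache.gets(key)
        if cache.cas(key, (value or []) + [item], version):
            return


# Blind read-modify-write: both writers read the same state, the last one wins.
cache = ToyCache()
cache.set("history:42", ["title-a"])
seen_by_a, _ = cache.gets("history:42")
seen_by_b, _ = cache.gets("history:42")
cache.set("history:42", seen_by_a + ["title-b"])
cache.set("history:42", seen_by_b + ["title-c"])  # silently drops title-b
lost_update = cache.gets("history:42")[0]

# Version-checked writes: the stale write is rejected and has to be retried.
cache.set("history:43", ["title-a"])
stale_value, stale_version = cache.gets("history:43")
append_with_cas(cache, "history:43", "title-b")
rejected = not cache.cas("history:43", stale_value + ["title-c"], stale_version)
append_with_cas(cache, "history:43", "title-c")
safe = cache.gets("history:43")[0]
```

Every update costs at least a read plus a write, and contended keys add retries on top, which is one plausible reading of why this pattern was listed as a pain point.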


Real Time Data - gen 3 learnings

- Distributed stateful systems are hard - go stateless, use C* / memcached / redis...

- Decompose into microservices


Real Time Data - gen 4

stateless Microservices

- stream state / event collectors

- data processors

- data services

- data feeds


Session analytics

- summarize detailed event data

- non-real time, but near real time

- some shared logic with real time
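
A small sketch of what "summarize detailed event data" could look like. The event fields, the "heartbeat" event kind, and the summary shape are all assumptions for illustration; the talk notes only mention start/stop events and latest positions.

```python
from dataclasses import dataclass


@dataclass
class Event:
    session_id: str
    kind: str        # "start", "heartbeat", or "stop" (assumed event kinds)
    timestamp: int   # seconds since epoch
    position: int    # playback position in seconds


def summarize(events):
    """Collapse detailed events into one summary dict per session."""
    summaries = {}
    # Sort by timestamp so late-arriving events do not overwrite newer state.
    for e in sorted(events, key=lambda ev: ev.timestamp):
        s = summaries.setdefault(
            e.session_id,
            {"first_ts": e.timestamp, "events": 0, "stopped": False},
        )
        s["last_ts"] = e.timestamp
        s["latest_position"] = e.position
        s["events"] += 1
        s["stopped"] = s["stopped"] or e.kind == "stop"
    return summaries


# Hypothetical events, deliberately out of order.
events = [
    Event("s1", "stop", 1300, 290),
    Event("s1", "start", 1000, 0),
    Event("s1", "heartbeat", 1060, 60),
    Event("s2", "start", 2000, 120),
]
summary = summarize(events)
```

The same per-session fold could be shared with the real-time path, which fits the "some shared logic with real time" note above.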


Session analytics - processing 

Storage - processing 

processing - Storm - Mantis, Samza


Polyglot persistence - one size fits all doesn't fit all

Strong opinions, loosely held - design for long term, but be open to redesigns



Viewing service - 50 data partitions 

Scale out - resharding was painful 

Performance - hot spots 

Disaster recovery 

NoSQL -> Memcached, Cassandra

gen 3 motivation

order of magnitude - include ... 

Write / read stateful tier - active sessions, latest positions, view summary -> snapshot, viewing history Memcached

Access - ...
