Here is the link.
Harrison Fisk, engineering manager at Facebook, talks about the storage considerations and challenges in serving Facebook's graph and includes an overview of TAO, Facebook's distributed graph storage system built on top of MySQL, and Wormhole, a publish-subscribe system.
Data Focused
Messages
. Large historical dataset
. Mostly append-only data
. Private data
. Small number of viewers per message
. HBase + caches
What is HBase? (5-minute break here to look it up)
Titan server (10-minute break here to look it up)
Messages serving stack (top to bottom): Mobile / WWW / cell clients → Titan server → HBase → HDFS
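The append-mostly, per-user-read workload above maps naturally onto HBase's sorted row model, where appends are cheap and one user's messages sit in a contiguous key range. A minimal in-memory sketch of that idea (all names hypothetical, not Facebook's actual schema):

```python
import bisect

class MessageStore:
    """Toy sketch of an HBase-style store: rows kept sorted by key,
    appends are cheap, reads scan a contiguous key range."""

    def __init__(self):
        self.rows = []  # sorted list of ((user_id, ts), message)

    def put(self, user_id, ts, message):
        # Row key: user first, then timestamp, so one user's
        # messages are contiguous (as in an HBase row-key design).
        bisect.insort(self.rows, ((user_id, ts), message))

    def scan(self, user_id, ts_from, ts_to):
        # Range read: seek to the first matching key, then scan forward.
        lo = bisect.bisect_left(self.rows, ((user_id, ts_from), ""))
        out = []
        for (uid, ts), msg in self.rows[lo:]:
            if uid != user_id or ts > ts_to:
                break
            out.append(msg)
        return out
```

The key design choice is the composite row key: because the store is sorted, "all of user X's messages since time T" becomes one contiguous scan rather than a scatter of point lookups.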
Timeline
What are the considerations for Timeline?
. Lots of writes
. Single-user reads
. Reads of recent data
. Large range reads of older data
. MySQL + Aggregator + Memcache (MC)
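The Timeline pattern above — hot recent data in cache, large historical range reads against the backing store — can be sketched roughly as follows (class and field names are hypothetical; the dicts stand in for MySQL shards and Memcache):

```python
class TimelineService:
    """Toy sketch: recent timeline entries served from a cache (MC),
    older range reads go to the backing store (MySQL)."""

    RECENT = 10  # number of recent entries kept hot in cache

    def __init__(self):
        self.db = {}     # user_id -> full entry list (stand-in for MySQL)
        self.cache = {}  # user_id -> recent entries (stand-in for Memcache)

    def write(self, user_id, entry):
        self.db.setdefault(user_id, []).append(entry)
        # The write path also refreshes the hot cache.
        self.cache[user_id] = self.db[user_id][-self.RECENT:]

    def read_recent(self, user_id):
        if user_id in self.cache:        # common case: cache hit
            return self.cache[user_id]
        entries = self.db.get(user_id, [])[-self.RECENT:]
        self.cache[user_id] = entries    # fill cache on miss
        return entries

    def read_range(self, user_id, start, stop):
        # Large historical range read: bypass the cache, hit the store.
        return self.db.get(user_id, [])[start:stop]
```

The split matters because the two read patterns have different shapes: "recent data" reads are frequent and small (cache-friendly), while old-range reads are rare and large (better served by the store directly).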
Graph data
. Low-millisecond response times
. <1 ms average
. Near-100% correct
. Heavy read-to-write ratio
. 99.99%+ reads
. Moving fast
Conclusion
. Applications are increasingly data-driven
. Different systems for different needs
. Clearly identify your goals
. Build towards those goals
Graph serving
Why MySQL?
. Compression
. Range queries
TAO
. Distributed graph storage system
. Read-through and write-through
. Understands nodes and edges
. Complex graph logic
. Easy to define objects/associations
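The objects/associations model above can be sketched with a tiny in-memory stand-in. Method names loosely follow the API described in Facebook's TAO paper (`obj_get`, `assoc_add`, `assoc_get`); the dict-backed store and example IDs are toy assumptions, not the real implementation:

```python
class Tao:
    """Toy in-memory sketch of TAO's object/association model:
    typed nodes (objects) plus typed, directed edges (associations)."""

    def __init__(self):
        self.objects = {}  # object_id -> data dict
        self.assocs = {}   # (id1, assoc_type) -> {id2: edge data}

    def obj_add(self, oid, data):
        self.objects[oid] = data

    def obj_get(self, oid):
        return self.objects.get(oid)

    def assoc_add(self, id1, atype, id2, data=None):
        # Directed edge id1 --atype--> id2, with optional edge data.
        self.assocs.setdefault((id1, atype), {})[id2] = data or {}

    def assoc_get(self, id1, atype):
        # All targets of a given association type from id1.
        return list(self.assocs.get((id1, atype), {}))
```

The point of the model is that common graph queries ("who are user 1's friends?") become a single association-list lookup rather than ad-hoc SQL joins.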
TAO read path
TAO write path - local region
Master region - front end cluster (15:28 in the video)
Consistency model
. Read-after-write
. Write-through
. Sticky user-to-cluster assignment
. Always progresses forward
. Single master
. Eventual cache consistency
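The consistency properties listed above fit together as: writes go through the local cache on their way to the single master (write-through), so a user stuck to one cluster always reads their own writes, while other regions' caches converge eventually. A toy sketch under those assumptions (class and method names are hypothetical):

```python
class Region:
    """Toy sketch of the consistency model: write-through local cache
    over a single master; remote regions are eventually consistent."""

    def __init__(self, master_db):
        self.master_db = master_db  # shared stand-in for the master DB
        self.cache = {}             # this region's cache

    def write(self, key, value):
        self.master_db[key] = value  # forwarded to the single master
        self.cache[key] = value      # write-through: local cache updated

    def read(self, key):
        if key in self.cache:        # sticky users hit their own cache
            return self.cache[key]
        value = self.master_db.get(key)
        self.cache[key] = value      # fill on miss
        return value

    def refresh(self, key):
        # Eventual consistency: remote regions pull updates later.
        self.cache[key] = self.master_db.get(key)
```

Keeping the user sticky to one cluster is what turns "write-through cache" into "read-after-write consistency" from that user's point of view — a different cluster's cache may still be stale until it refreshes.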
Wormhole
. Pub-sub system
. Subscribe to certain types of data-change events
. E.g., all writes to a certain assoc type
. E.g., all deletes of objects
. Tails the MySQL binlog, consuming metadata embedded in SQL comments
. Other systems can be tailed as well.
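The subscription model above — consumers registering interest in particular kinds of change events — can be sketched as a minimal pub-sub dispatcher. Here `publish()` stands in for an event arriving from the tailed binlog; the table/operation naming is a hypothetical simplification:

```python
from collections import defaultdict

class Wormhole:
    """Toy pub-sub sketch: consumers subscribe to change events by
    (table, operation); publish() stands in for a binlog-tailed event."""

    def __init__(self):
        self.subs = defaultdict(list)  # (table, op) -> list of callbacks

    def subscribe(self, table, op, callback):
        self.subs[(table, op)].append(callback)

    def publish(self, table, op, row):
        # In the real system this event comes from tailing MySQL's binlog.
        for cb in self.subs[(table, op)]:
            cb(row)
```

A cache-invalidation consumer, for example, would subscribe to all deletes on the tables it caches and evict the matching keys as events arrive.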
Wormhole consumers
. Cache invalidation
. Indexing
. Graph search
. Secondary index services
. ETL to the data warehouse
Conclusion
. Social graph data needs worldwide real-time access
. Distributed caching can hide latency issues
. But it creates consistency issues
. TAO/MySQL/Wormhole is Facebook's solution
Actionable Items
I need to research storage systems for system design.