Sunday, August 9, 2020

Facebook tech talk: TAO: Facebook’s Distributed Data Store for the Social Graph

 Here is the link. 

Abstract: 

We introduce a simple data model and API tailored for serving the social graph, and TAO, an implementation of this model. TAO is a geographically distributed data store that provides efficient and timely access to the social graph for Facebook’s demanding workload using a fixed set of queries. It is deployed at Facebook, replacing memcache for many data types that fit its model. The system runs on thousands of machines, is widely distributed, and provides access to many petabytes of data. TAO can process a billion reads and millions of writes each second.


TAO summary

Efficiency at scale 

Read latency 

The solution: separate the cache from the database, use graph-specific caching, and subdivide data centers

Write timeliness

The solution: a write-through cache with asynchronous DB replication

Read availability 

The solution: fail over to alternate data sources


More details:

web server -> query the graph -> render HTML

Aggregation and rendering happen when content is viewed, not when it is created

Queries are issued dynamically and are hard to predict; every user sees a different view of the graph

TAO has to guarantee that the right content is served to each viewer

Every page is rendered dynamically on every request, which puts very heavy read load on the data stack ...

Queries hit TAO at a rate of multiple billions per second, over the whole graph, with petabytes of data in every data center

Limitation 

Scale - everything operates at a very large level

Efficiency at scale - serve the workload without excessive cost

Dynamic resolution of data dependencies - rendering a post can take three rounds of dependent queries within a single HTTP request (see the sketch after this list)

Low read latency - reads must be served from the local data center

Timeliness of writes - when a web server writes data, later reads must see it (read-after-write), even across data centers - TAO has to provide this
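To make the dependency rounds concrete, here is a minimal Python sketch of why one post forces several dependent rounds of graph queries: the comments cannot be fetched until the post is known, and the commenters cannot be fetched until the comments are known. The tao client object and its get_object / assoc_range helpers are hypothetical stand-ins, not TAO's real PHP API.

def render_post(tao, post_id):
    # Round 1: fetch the post object itself.
    post = tao.get_object(post_id)

    # Round 2: fetch the newest comment edges hanging off the post.
    comment_edges = tao.assoc_range(post_id, "COMMENT", pos=0, limit=10)

    # Round 3: only now do we know which comment and author objects to fetch.
    comments = [tao.get_object(edge.id2) for edge in comment_edges]
    authors = [tao.get_object(c.fields["author_id"]) for c in comments]
    return post, comments, authors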

Graph in Memcache 

A PHP abstraction exposed the objects & associations API on top of memcache (nodes, edges, edge lists)

Turning this into a service puts the cache and the data model under TAO's control

Objects = nodes

Identified by 64-bit ids, typed, with a per-type schema for the fields

Associations = edges

Association lists

(id1, type) - all associations of that type starting at id1, kept in descending order by time

Every edge carries a time; queries usually involve more than one edge, and association lists can grow large

The objects and associations API handles the read-heavy workload very well
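A rough Python sketch of the objects & associations data model and API described above (and in the TAO paper); the class shapes and method signatures are simplified assumptions, not the production PHP abstraction.

from dataclasses import dataclass, field

@dataclass
class TaoObject:
    id: int                  # 64-bit id
    otype: str               # object type; the type defines a schema for the fields
    fields: dict = field(default_factory=dict)

@dataclass
class Assoc:
    id1: int                 # edge (id1, atype, id2), e.g. (user, LIKED, post)
    atype: str
    id2: int
    time: int                # every association carries a time, used for ordering
    data: dict = field(default_factory=dict)

class TaoClient:
    # Point reads and writes on objects.
    def object_get(self, id): ...
    def object_update(self, id, fields): ...

    # Association lists: the edges (id1, atype, *), newest first.
    def assoc_get(self, id1, atype, id2_set): ...
    def assoc_count(self, id1, atype): ...
    def assoc_range(self, id1, atype, pos, limit): ...
    def assoc_add(self, id1, atype, id2, time, data): ...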

Web servers - stateless

Cache - objects, association lists, association counts; the cache plus the database make up TAO

Cache is sharded by id - add servers to scale read qps

Database is also sharded by id - add servers to scale storage (bytes)
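A minimal sketch of id-based sharding. The paper embeds a shard id in the 64-bit object id; the bit layout and shard count below are assumptions for illustration only.

NUM_SHARDS = 1 << 16            # assumed shard count, not Facebook's real number

def shard_of(obj_id):
    # Assume the low 16 bits of the 64-bit id carry the shard id;
    # the exact layout is not in these notes.
    return obj_id & (NUM_SHARDS - 1)

def server_for(obj_id, hosts):
    # Associations are stored on the shard of their id1, so a whole
    # association list is served by a single cache / database server.
    return hosts[shard_of(obj_id) % len(hosts)]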

Subdividing the data center

Problem: web servers, cache, database - with one flat cache tier, a web server may hit a cache server in a nearby building, every server holds many open sockets, and there are lots of hot spots

Solution: web servers - follower cache - leader cache - database

Distributed write control logic moves into the cache; the leader tier protects the database from thundering herds

Follower caches serve the web servers; the leader cache sits between the followers and the database
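A toy sketch of the follower/leader read path implied above: in-memory dicts stand in for the real cache servers, and only the leader tier ever talks to the database.

class Database:
    def __init__(self, rows):
        self.rows = rows                 # stand-in for a sharded MySQL instance

    def read(self, key):
        return self.rows.get(key)

class LeaderCache:
    def __init__(self, db):
        self.store = {}
        self.db = db

    def read(self, key):
        # Only leaders talk to the database, shielding it from thundering herds.
        if key not in self.store:
            self.store[key] = self.db.read(key)
        return self.store[key]

class FollowerCache:
    def __init__(self, leader):
        self.store = {}
        self.leader = leader

    def read(self, key):
        if key in self.store:            # hit: served inside the follower cluster
            return self.store[key]
        value = self.leader.read(key)    # miss: forwarded to the leader tier
        self.store[key] = value
        return value

# Many follower tiers share one leader tier per region (sketch):
db = Database({("user", 42): {"name": "alice"}})
leader = LeaderCache(db)
followers = [FollowerCache(leader) for _ in range(3)]
print(followers[0].read(("user", 42)))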

Timeliness of writes 


Async DB replication 

Master data center   |   Replica data center

Web servers run in both regions

Writes at a replica region are forwarded to the master region

Invalidation and refill messages are embedded in the SQL replication stream back to the replica
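A hedged sketch of the write path from a replica region, matching the notes above: the write is forwarded to the master region, the writer's own follower is updated synchronously (read-after-write for that user), and invalidations reach the replica's caches via the replication stream. Class and method names are illustrative; the invalidation is shown as a direct call rather than a real replication hook.

class Follower:
    def __init__(self):
        self.store = {}                # in-memory stand-in for a follower cache

class MasterLeader:
    def __init__(self):
        self.db = {}                   # stand-in for the sharded MySQL master

    def write(self, key, value):
        self.db[key] = value           # all writes commit in the master region

class ReplicaLeader:
    def __init__(self, followers):
        self.followers = followers

    def apply_replication(self, key):
        # In the real system, invalidation/refill messages arrive embedded
        # in the SQL replication stream; here it is just a direct call.
        for f in self.followers:
            f.store.pop(key, None)

def write_from_replica_region(follower, replica_leader, master_leader, key, value):
    master_leader.write(key, value)        # write-through, forwarded to the master region
    follower.store[key] = value            # the writer's follower is updated at once
    replica_leader.apply_replication(key)  # asynchronous in reality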


Improving availability: Read failover 
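The notes for this slide are thin; as a hedged sketch of the "alternate data sources" idea, a read that cannot be served by the normal local path falls over to other sources (another follower tier, the local leader, or another region). The source objects and their read method are assumptions.

def read_with_failover(key, local_follower, alternate_sources):
    # Normal path: the local follower tier serves the read.
    try:
        return local_follower.read(key)
    except ConnectionError:
        pass
    # Failover: try alternate data sources in order, e.g. another follower
    # tier, the local leader, or a cache/database in another region.
    for source in alternate_sources:
        try:
            return source.read(key)
        except ConnectionError:
            continue
    raise RuntimeError(f"no data source available for {key!r}")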

TAO summary 

Efficiency at scale / Read latency - separate cache and DB, graph-specific caching, subdivide data centers

Write timeliness - write-through cache, asynchronous replication 

Read availability - Alternate data sources

Questions: 

MySQL - 

graphic roles - 

clean separations - 

Size of a node - the largest is about 1 MB; the graph has a long tail of large values - an object stores a content link and metadata, not the photo itself, ...

17,000 lines of comments - TAO ... stretched to its limit

iPhone - ...

Leader node - half hour ...

What leaders do - reduce the temperature of hot spots -

TAO is not designed for write-heavy workloads - those need less timeliness and are more data-intensive

e.g., deciding whom to suggest as friends

Consistency model - TAO - opportunity - 

TAO summary - Efficiency at scale / Read latency / Write timeliness / Read availability






