https://www.youtube.com/watch?
F4 - Photo Storage at Facebook
world's largest photo storage
BLOB "hotness"
Haystack: 2008
Hot storage design goals
High throughput
- In-memory index
- Single I/O per request!
- Multiple copies
Failure tolerant
RAID
Multiple copies
RAID 6
2 redundant drives
2 of 12 = 1.2x replication
Over 3 arrays = 3.6x
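The arithmetic behind the 1.2x and 3.6x figures can be checked with a small sketch. It assumes a 12-drive RAID 6 array in which 2 drives' worth of capacity holds redundancy; the function name `raid6_overhead` is made up for illustration.

```python
from fractions import Fraction

def raid6_overhead(total_drives, redundant_drives=2):
    """Raw capacity divided by usable capacity for one array."""
    usable = total_drives - redundant_drives
    return Fraction(total_drives, usable)

per_array = raid6_overhead(12)   # 12 drives, 10 usable
total = per_array * 3            # the same data kept on 3 arrays

print(float(per_array), float(total))
```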
RAID 2, RAID 3, RAID 4, RAID 6 explained with diagram
https://www.thegeekstuff.com/
Haystack warm storage
- Redundancy = replication
Read throughput = replication
Total replication = 3.6x
Warm storage
Redundancy still required
Read throughput the ...
RS Encoding:
Redundancy
upload request -> web server -> Storage router -> haystack
F4
haystack -> Migration -> F4
Read request -> CDN -> Storage router -> F4
f4: What are we solving
Warm storage problem:
- Need to store (warm) data efficiently
- Storage must be highly fault tolerant
- Read latency should be comparable to haystack
- Load is NOT primary concern
Solution: f4
- 2.x replication factor compared to haystack's 3.6x
- yet more fault tolerant than haystack(!!)
f4: Data splitting RS(5,2)
10G Haystack volume
1G Data blocks
Data blocks
Parity blocks
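A rough sketch of the splitting step, assuming RS(5,2) here means 5 data blocks plus 2 parity blocks per stripe and that the 10G volume is cut into ten 1G blocks grouped five at a time. The helper `split_into_stripes` and the block labels are hypothetical.

```python
def split_into_stripes(blocks, data_per_stripe=5):
    """Group consecutive data blocks into stripes of `data_per_stripe`."""
    return [blocks[i:i + data_per_stripe]
            for i in range(0, len(blocks), data_per_stripe)]

PARITY_PER_STRIPE = 2  # the "2" in RS(5,2), as assumed above

volume = [f"block{i}" for i in range(10)]   # ten 1G data blocks
stripes = split_into_stripes(volume)

# Every stripe gets its own parity blocks on top of its data blocks.
stored_blocks = sum(len(s) + PARITY_PER_STRIPE for s in stripes)
print(len(stripes), stored_blocks)
```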
f4: RS Rebuild
RS decoding
Data blocks parity blocks
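A toy sketch of an RS(5,2) rebuild, not the f4 implementation. It assumes one small integer per block and arithmetic modulo the prime 257 (production codes commonly work over GF(2^8) on whole byte blocks). The data values are treated as points on a polynomial, the parity values are two more points on it, and any 5 surviving points recover the rest. The names `lagrange_eval`, `encode`, and `rebuild` are hypothetical.

```python
P = 257  # toy prime field, an assumption of this sketch

def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial through `points` ({pos: value}) mod P."""
    total = 0
    for xi, yi in points.items():
        num, den = 1, 1
        for xj in points:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data, parity_count):
    """Return the data values followed by `parity_count` parity values."""
    k = len(data)
    points = dict(enumerate(data))
    parity = [lagrange_eval(points, k + i) for i in range(parity_count)]
    return list(data) + parity

def rebuild(surviving, k, n):
    """Recover all n stripe values from any k surviving {pos: value} entries."""
    if len(surviving) < k:
        raise ValueError("too many blocks lost to rebuild the stripe")
    basis = dict(list(surviving.items())[:k])
    return [lagrange_eval(basis, x) for x in range(n)]

stripe = encode([72, 101, 108, 108, 111], parity_count=2)
# Lose one data block (position 1) and one parity block (position 5).
surviving = {i: v for i, v in enumerate(stripe) if i not in (1, 5)}
assert rebuild(surviving, k=5, n=7) == stripe
```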
f4: Block placement policy
Blocks of each stripe are placed in different racks (=> hosts)
RS(10,4) is used in practice (1.4x)
Tolerates 4 rack (-> 4 disk/host) failures
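The placement rule and the 1.4x figure can be illustrated with a short sketch. It assumes RS(10,4) means 10 data plus 4 parity blocks per stripe and picks racks at random; the rack names and the functions `place_stripe` and `readable_after` are invented for the example and do not describe the real placement logic.

```python
import random
from fractions import Fraction

DATA_BLOCKS, PARITY_BLOCKS = 10, 4
STRIPE_BLOCKS = DATA_BLOCKS + PARITY_BLOCKS

def place_stripe(racks):
    """Map each block index of one stripe to a distinct rack."""
    if len(racks) < STRIPE_BLOCKS:
        raise ValueError("need at least one rack per block in the stripe")
    chosen = random.sample(racks, STRIPE_BLOCKS)
    return dict(enumerate(chosen))

def readable_after(placement, failed_racks):
    """A stripe stays decodable while at least DATA_BLOCKS blocks survive."""
    surviving = sum(1 for rack in placement.values() if rack not in failed_racks)
    return surviving >= DATA_BLOCKS

overhead = Fraction(STRIPE_BLOCKS, DATA_BLOCKS)   # 14 / 10

racks = [f"rack{i}" for i in range(20)]
placement = place_stripe(racks)
used = list(placement.values())
print(float(overhead), readable_after(placement, set(used[:4])))
```

Because every block sits in a different rack, losing a rack costs the stripe at most one block, which is why four rack failures are survivable and a fifth is not.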
f4 Cell anatomy
f4 storage consists of a set of cells.
One cell resides completely in one data center
A cell consists of 3 kinds of nodes: storage, compute, coordinator
The index is distributed across storage nodes
f4 Reads
user request -> router -> index read (1), storage nodes, compute, cell
Data read (2)
Reads with datacenter failures (2.1x)
Router1, router2, router3
Datacenter1
Datacenter2
Datacenter3
Volume1
volume2
xorVolume
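A minimal sketch of the XOR idea, assuming volume1 and volume2 live in two datacenters and their byte-wise XOR lives in a third, with each of the three RS(10,4)-encoded locally. The sample byte strings and the helper `xor_blocks` are made up for illustration.

```python
from fractions import Fraction

def xor_blocks(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("volumes must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))

volume1 = b"photo-data-one"   # datacenter 1
volume2 = b"photo-data-two"   # datacenter 2
xor_volume = xor_blocks(volume1, volume2)   # datacenter 3

# Datacenter 1 is down: rebuild volume1 from the other two.
recovered = xor_blocks(volume2, xor_volume)
assert recovered == volume1

# Three RS(10,4)-encoded volumes stored for two volumes' worth of data.
effective = Fraction(14, 10) * Fraction(3, 2)
print(float(effective))
```

Under these assumptions the 2.1x figure falls out as 1.4 x 3 / 2.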