Sunday, August 9, 2020

Storage research: Finding a Needle in Haystack: Facebook's Photo Storage

 Here is the link. 

NFS based design

Typical website

  • Small working set
  • Infrequent access of ...
Metadata bottleneck
    Each image stored as a file
    Large metadata size severely limits the metadata hit ratio

Image read performance
10 iops / image read (large directories - thousands of files)
2 iops / image read (smaller directories - hundreds of files)
2.5 ..

Haystack based design 

Haystack store

Storage - 12 * 1TB SATA, RAID6
Filesystem
- single - 10TB xfs filesystem

Haystack
  Log structured, append only object store containing needles as object abstractions
 100 haystacks per node each 100GB in size
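
The capacity numbers above fit together, which a quick check makes visible. This is only a sanity-check sketch: it assumes decimal units and that RAID6 gives up two disks' worth of space to parity; the variable names are mine, not from the talk.

```python
TB = 10**12
GB = 10**9

disks = 12
disk_size = 1 * TB
parity_disks = 2  # RAID6 reserves two disks' worth of capacity for parity

# Capacity left for data after parity, before filesystem overhead
usable = (disks - parity_disks) * disk_size

# 100 haystacks per node, each 100GB
haystack_total = 100 * 100 * GB

print(usable // TB, haystack_total // TB)
```

So the 100 haystacks account for roughly the whole ~10TB xfs filesystem on a node.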

Haystack store - haystack file layout 
Haystack store - haystack index file layout
Haystack store - photo server
Accepts HTTP requests and translates them to corresponding Haystack operations
Builds and maintains an incore index of all images in the Haystack
32 bytes per photo (8 bytes per image vs. ~600 bytes per inode)
5GB index
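
A rough back-of-the-envelope on the index numbers above. This is my own arithmetic, not from the slides: it assumes decimal GB, that 32 bytes per photo at 8 bytes per image means four stored images per photo, and that in the file-per-image design each image would cost one ~600 byte inode.

```python
GB = 10**9

index_bytes = 5 * GB
bytes_per_photo = 32
bytes_per_image = 8
bytes_per_inode = 600  # approximate figure from the notes

# 32 / 8 suggests four stored images per photo (assumption)
images_per_photo = bytes_per_photo // bytes_per_image

# Number of photos a 5GB incore index can describe
photos = index_bytes // bytes_per_photo

# Metadata needed if every one of those images were a separate file
inode_bytes = photos * images_per_photo * bytes_per_inode

print(photos, inode_bytes // GB, inode_bytes // index_bytes)
```

Under these assumptions the same images would need hundreds of GB of inode metadata, which would not fit in memory, while the 5GB index does.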

Read operation
Read 
Lookup offset/ size of the image in the incore index
Read data (~1 iop)
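
The read path can be sketched as one memory lookup followed by one positioned read. This is a minimal illustration, not the photo server's actual code: the dict-based index, the `read_image` helper and the file contents are all hypothetical, and `os.pread` is available on Unix-like systems only.

```python
import os
import tempfile


def read_image(fd, incore_index, image_id):
    """Look up (offset, size) in memory, then issue a single positioned read."""
    offset, size = incore_index[image_id]
    return os.pread(fd, size, offset)


# Demo: a hypothetical haystack file holding two images back to back
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "haystack.dat")
    with open(path, "wb") as f:
        f.write(b"AAAA" + b"BBBBBB")

    # image id -> (offset, size); kept entirely in memory
    incore_index = {1: (0, 4), 2: (4, 6)}

    fd = os.open(path, os.O_RDONLY)
    try:
        data = read_image(fd, incore_index, 2)
    finally:
        os.close(fd)
```

No directory or inode lookup touches the disk here, which is where the roughly one I/O per read comes from.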

Multiwrite(Modify)
Async append images one by one to haystack file 
Flush haystack file
Async append index records to the index file
Flush index file if too many dirty index records
Update incore index
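
The write steps above could look roughly like the sketch below. Everything in it is an assumption for illustration: the `HaystackWriter` class, the index record layout and the `MAX_DIRTY` threshold are made up, and buffered writes followed by a flush stand in for the async appends mentioned in the talk.

```python
import os
import struct
import tempfile

INDEX_RECORD = struct.Struct("<QQI")  # hypothetical layout: image id, offset, size
MAX_DIRTY = 2                         # hypothetical flush threshold


class HaystackWriter:
    def __init__(self, data_path, index_path):
        self.data = open(data_path, "ab")
        self.index = open(index_path, "ab")
        self.incore = {}  # image id -> (offset, size)
        self.dirty = 0    # index records written but not yet flushed

    def multiwrite(self, images):
        offset = self.data.seek(0, os.SEEK_END)
        placed = []
        # 1. Append images one by one to the haystack file
        for image_id, payload in images:
            self.data.write(payload)
            placed.append((image_id, offset, len(payload)))
            offset += len(payload)
        # 2. Flush the haystack file
        self.data.flush()
        os.fsync(self.data.fileno())
        # 3. Append index records to the index file
        for record in placed:
            self.index.write(INDEX_RECORD.pack(*record))
        self.dirty += len(placed)
        # 4. Flush the index file only if too many records are dirty
        if self.dirty > MAX_DIRTY:
            self.index.flush()
            os.fsync(self.index.fileno())
            self.dirty = 0
        # 5. Update the incore index
        for image_id, off, size in placed:
            self.incore[image_id] = (off, size)

    def close(self):
        self.data.close()
        self.index.close()


tmp = tempfile.mkdtemp()
data_path = os.path.join(tmp, "haystack.dat")
index_path = os.path.join(tmp, "haystack.idx")

writer = HaystackWriter(data_path, index_path)
writer.multiwrite([(1, b"abc"), (2, b"de")])
writer.close()
```

The haystack file is flushed on every multiwrite while the index file is flushed lazily, so in this sketch the index file may lag behind the data.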

Delete 
Lookup offset of the image in the incore index
Synchronously mark image as "Deleted" in the needle header
Update incore index
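
A sketch of the delete steps, under assumptions: the needle header layout below (id, flags byte, size) is hypothetical since the real layout is not in these notes, the `delete_image` helper is mine, and dropping the entry from the dict is just one way to "update" the incore index. `os.pwrite` is available on Unix-like systems only.

```python
import os
import struct
import tempfile

HEADER = struct.Struct("<QBI")  # hypothetical needle header: image id, flags, data size
FLAGS_OFFSET = 8                # the flags byte follows the 8-byte image id
DELETED = 0x01


def delete_image(fd, incore_index, image_id):
    # 1. Look up the needle offset in the incore index
    offset, _size = incore_index[image_id]
    # 2. Overwrite the flags byte in place and force it to disk
    os.pwrite(fd, bytes([DELETED]), offset + FLAGS_OFFSET)
    os.fsync(fd)
    # 3. Update the incore index so reads no longer find the image
    del incore_index[image_id]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "haystack.dat")
    with open(path, "wb") as f:
        f.write(HEADER.pack(7, 0, 3) + b"xyz")

    incore_index = {7: (0, 3)}
    fd = os.open(path, os.O_RDWR)
    try:
        delete_image(fd, incore_index, 7)
        # Read the header back to inspect the flags byte
        _id, flags, _size = HEADER.unpack(os.pread(fd, HEADER.size, 0))
    finally:
        os.close(fd)
```

The image data itself stays in the file after the delete; reclaiming that space is left to compaction.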

Compaction
...

Haystack Directory
Logical to physical volume mapping 
URL generation 
http-> CDN->Cache->Node->Logical volume id, image id
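
The URL shape above can be illustrated with a small helper. This is only a guess at the concrete format: the separators, the `photo_url` function and the host names are hypothetical, and only the order of the components comes from the notes.

```python
def photo_url(cdn, cache, node, logical_volume_id, image_id):
    """Build a URL that names each hop in order: CDN, cache, node, then the photo."""
    return f"http://{cdn}/{cache}/{node}/{logical_volume_id},{image_id}"


url = photo_url("cdn.example.com", "cache1", "node7", 42, 1001)
print(url)
```

Each layer can strip its own component and forward the rest of the URL to the next hop.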

Load balancing 
writes across logical volumes
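
A toy version of the directory's mapping and write balancing. The notes do not say which balancing policy is used, so round-robin here is purely an assumption, as are the `Directory` class, the node names and the idea that a logical volume maps to a list of physical locations.

```python
import itertools


class Directory:
    def __init__(self, logical_to_physical):
        # logical volume id -> list of physical locations (hypothetical names)
        self.logical_to_physical = dict(logical_to_physical)
        # Round-robin over logical volume ids (assumed policy)
        self._next = itertools.cycle(sorted(self.logical_to_physical))

    def pick_volume_for_write(self):
        """Return the next logical volume and the physical locations behind it."""
        logical = next(self._next)
        return logical, self.logical_to_physical[logical]


directory = Directory({1: ["node-a", "node-b"], 2: ["node-c", "node-d"]})
picks = [directory.pick_volume_for_write()[0] for _ in range(4)]
print(picks)
```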

Photo upload 
Photo download - 

Conclusion

Haystack - simple and effective storage system
- optimized for random reads (~1 I/O per object read)
- cheap commodity storage
8500 LOC (C++)
2 engineers 4 months from inception to initial deployment 


Slides from the talk. Here is the link. 

