Aug. 11, 2021
Here is the link.
Google's billion-user services like Gmail and Google Maps depend on Bigtable to store data at massive scale and retrieve data with ultra-low latency. Today, many use cases, such as IoT, finance, mapping, advertising, and time-series data, face similar demands. In this video, you'll learn how to integrate Cloud Bigtable into your application architecture to solve the challenges of storing and retrieving data for these and other use cases. We'll cover the specifics of the Cloud Bigtable service and also dive into schema-level design considerations. You'll also hear how customers have successfully leveraged Bigtable to solve their problems at scale and the patterns they've implemented on top of Cloud Bigtable. Missed the conference? Watch all the talks here: https://goo.gl/c1Vs3h Watch more talks about Infrastructure & Operations here: https://goo.gl/k2LOYG
My notes
Google research in data technologies
- 2002, GFS
- 2004, MapReduce
- 2006, Bigtable
- 2008, Dremel
- 2010 - 2011, Colossus, Flume, Megastore
- 2012, Spanner
- Millwheel
- 2013, PubSub, F1
- NoSQL (no-join) distributed key-value store, designed to scale out
- has only one index (the row-key)
- supports atomic single-row transactions
- unwritten cells do not take up any space
- every cell is versioned (default is timestamp on server)
- garbage collection retains latest version (configurable)
- expiration (optional) can be set at column-family level
- periodic compaction reclaims unused space from cells
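To make the cell model above concrete, here is a minimal in-memory sketch in Python. It is not Bigtable code: the class `VersionedRow`, its methods, and the `max_versions` setting are hypothetical names chosen for illustration. It only mimics three ideas from the notes: unwritten cells occupy no space, every cell keeps timestamped versions, and garbage collection keeps the latest version(s).

```python
import time


class VersionedRow:
    """Toy model of a single row (hypothetical, not the Bigtable API)."""

    def __init__(self, max_versions=1):
        # Assumed GC policy: keep only the newest `max_versions` per cell.
        self.max_versions = max_versions
        # (family, qualifier) -> list of (timestamp, value), newest first.
        # A cell that was never written has no entry, so it costs nothing.
        self.cells = {}

    def put(self, family, qualifier, value, timestamp=None):
        # Default to a server-side style timestamp when none is supplied.
        ts = time.time() if timestamp is None else timestamp
        versions = self.cells.setdefault((family, qualifier), [])
        versions.append((ts, value))
        versions.sort(key=lambda v: v[0], reverse=True)

    def get(self, family, qualifier):
        # Return the newest version, or None for an unwritten cell.
        versions = self.cells.get((family, qualifier))
        return versions[0][1] if versions else None

    def collect_garbage(self):
        # Drop every version beyond the newest `max_versions`.
        for key in self.cells:
            del self.cells[key][self.max_versions:]
```

In this sketch, writing the same cell twice keeps both versions until `collect_garbage()` runs, after which only the newest remains.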
- Put
- Increment
- Append
- Conditional updates
- Bulk import
- Gets
- Range scan
- Filter
- Full scan
- Export
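The write and read operations listed above can be sketched against a toy sorted store. This is an illustration only, not the Cloud Bigtable client: `SortedStore` and its method names are made up for this example, and Python string ordering stands in for Bigtable's byte-wise ordering of row keys. The point it shows is that with a single sorted index, a range scan is just a slice between two key positions.

```python
import bisect


class SortedStore:
    """Toy key-value store with one sorted index (hypothetical names)."""

    def __init__(self):
        self._keys = []  # row keys, kept in sorted order
        self._rows = {}  # row key -> {column: value}

    def put(self, key, column, value):
        if key not in self._rows:
            bisect.insort(self._keys, key)
            self._rows[key] = {}
        self._rows[key][column] = value

    def increment(self, key, column, amount=1):
        # Read-modify-write on one row; a missing cell counts as 0.
        new_value = self._rows.get(key, {}).get(column, 0) + amount
        self.put(key, column, new_value)
        return new_value

    def put_if(self, key, column, expected, new_value):
        # Conditional update: write only if the current value matches.
        if self._rows.get(key, {}).get(column) != expected:
            return False
        self.put(key, column, new_value)
        return True

    def get(self, key):
        return self._rows.get(key)

    def scan(self, start, end):
        # Range scan over keys in [start, end), found by binary search.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._rows[k]) for k in self._keys[lo:hi]]
```

For example, `store.scan("user#", "user$")` returns every row whose key starts with `user#`, because `$` is the character immediately after `#`.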
- updates are atomic - but only at the row level
- store items related to a given entity in a single row
- where atomic updates aren't needed and the entity is large, split it up
- rows are sorted lexicographically by row-key
- store related entities in adjacent rows
- determine a key strategy that facilitates common queries
- choose keys that help distribute reads/writes and avoid hotspots
- avoid solely monotonically increasing keys (timestamp or sequence)
- a combined-key strategy is helpful
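As a sketch of the combined-key idea, the Python snippet below builds a row key from an entity identifier followed by a reversed timestamp. The specific layout is an assumption for illustration, not something prescribed in the talk notes: the function name `combined_key`, the `#` separator, the `MAX_TS_MS` bound, and the choice to reverse the timestamp are all hypothetical. Leading with the entity id spreads writes across entities instead of piling them onto the newest timestamp, while keeping each entity's rows adjacent.

```python
# Assumed upper bound for millisecond timestamps (13 digits).
MAX_TS_MS = 9_999_999_999_999


def combined_key(entity_id, timestamp_ms):
    """Build '<entity_id>#<reversed timestamp>' (hypothetical layout).

    Reversing the timestamp makes newer rows sort first within an
    entity; zero-padding keeps lexicographic and numeric order aligned.
    """
    reversed_ts = MAX_TS_MS - timestamp_ms
    return f"{entity_id}#{reversed_ts:013d}"


# Rows for the same entity sort next to each other, newest first.
keys = sorted([
    combined_key("sensor-a", 1000),
    combined_key("sensor-b", 1500),
    combined_key("sensor-a", 2000),
])
```

With this layout, a prefix scan on `sensor-a#` would read one sensor's data in a single contiguous range.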
- JanusGraph - graph database
- OpenTSDB - time-series database
- Spotify/Heroic - time-series database
- GeoMesa - geospatial querying
Sami Zuhuruddin
https://www.linkedin.com/in/samizuh/