Feb. 11, 2021
Manual sharding - my learning notes:
- Automatica sharding
- Manual sharding at the application layer
- common mistakes: uneven data distribution, hotspots, and storing data on too few shards
- challenges: database joins cross multiple servers
Sharding, including the day-1 shard creation and day-2 shard rebalancing, when completely automated can be a boon to high-volume data applications. Unfortunately, monolithic databases like Oracle, PostgreSQL, MySQL, and even newer distributed SQL databases like Amazon Aurora do not support automatic sharding. This means manual sharding at the application layer has to be performed if you want to continue to use these databases. The net result is a massive increase in development complexity. Your application now has additional sharding logic to know exactly how your data is distributed, and how to fetch it. You also have to decide what sharding approach to adopt, how many shards to create, and how many nodes to use. And also account for shard key as well as even sharding approach changes if your business needs change.