Episode 116 - Database Sharding

Published: Dec. 18, 2019, 5:05 p.m.

Database Sharding Crash Course (with Postgres examples)

Database sharding is the process of segmenting data into partitions that are spread across multiple database instances to speed up queries and scale the system.

What is sharding?

Sharding key / partition key

Consistent Hashing

Horizontal partitioning vs Sharding

Example

Pros and cons

What is Sharding? 1:30

Consistent Hashing 4:50

Horizontal partitioning vs Sharding 7:36

Example 8:45

Spin up Docker Postgres Shards 10:02

Write to the shard 17:25

Read from the Shard 39:20

Pros & Cons 51:10

Cards

Postgres pgadmin Docker 8:54

Postgres Javascript 18:18

URL vs Query param 22:30

CORS 29:30

sql injection 42:40

Source Code

https://github.com/hnasr/javascript_playground/tree/master/sharding

Docker commands (including pgadmin)

https://github.com/hnasr/javascript_playground/blob/master/sharding/shards/commands.txt

Dockerfile & init.sql

https://github.com/hnasr/javascript_playground/tree/master/sharding/shards

Horizontal partitioning vs Sharding

Horizontal partitioning: the partitions live in the same database instance, so you can still join

Sharding: the partitions live on different instances (different servers)

Pros

Extreme scale for an RDBMS

Optimal, smaller index size

Cons

Transactions across shards are a problem

Rollbacks

Schema changes

Complex client (aware of the shard)

Joins

The shard key has to be something you know in the query
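To illustrate the last point, here is a minimal sketch of what a client has to do when the shard key is not part of the query: it cannot pick a single shard, so it has to ask every shard and merge the results. Everything here is an assumption for illustration: the in-memory arrays stand in for database instances, and queryShard and findByHost are hypothetical helpers, not code from the episode.

```javascript
// Three in-memory arrays stand in for three database instances (hypothetical data).
const shards = {
  p1: [{ url_id: "a1b2c", url: "https://example.com/a" }],
  p2: [{ url_id: "d3e4f", url: "https://example.org/b" }],
  p3: [{ url_id: "0a9b8", url: "https://example.com/c" }],
};

// Pretend per-shard query. A real client would send SQL to one instance here.
async function queryShard(name, predicate) {
  return shards[name].filter(predicate);
}

// The host is not the shard key, so no single shard can be chosen up front.
// The client fans out to every shard and merges the partial results.
async function findByHost(host) {
  const perShard = await Promise.all(
    Object.keys(shards).map((name) =>
      queryShard(name, (row) => new URL(row.url).host === host)
    )
  );
  return perShard.flat();
}

findByHost("example.com").then((rows) => {
  console.log(rows.map((row) => row.url_id));
});
```

A lookup by the shard key touches one instance. A lookup like this one touches all of them, which is why the key should be something the query already knows.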

Example

URL shortener

create table

CREATE TABLE public.test1
(
    id serial NOT NULL primary key,
    url text,
    url_id character(5)
);

Spin up 3 instances

p1

p2

p3
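A minimal sketch of how consistent hashing could map a url_id to one of the three instances. This is not the code from the episode, just an illustration using Node's built-in crypto module. The HashRing class, the hashToInt helper, and the replica count of 100 are assumptions made for this sketch.

```javascript
const crypto = require("crypto");

// Map any string to a 32-bit unsigned integer position on the ring.
function hashToInt(value) {
  return crypto.createHash("sha256").update(value).digest().readUInt32BE(0);
}

// Hypothetical ring: each node is placed at many points ("virtual nodes")
// so that keys spread more evenly across the nodes.
class HashRing {
  constructor(nodes, replicas = 100) {
    this.replicas = replicas;
    this.ring = []; // array of { point, node }, kept sorted by point
    for (const node of nodes) this.add(node);
  }

  add(node) {
    for (let i = 0; i < this.replicas; i++) {
      this.ring.push({ point: hashToInt(`${node}#${i}`), node });
    }
    this.ring.sort((a, b) => a.point - b.point);
  }

  // Walk clockwise from the key's position to the first node point,
  // wrapping around to the start of the ring if none is found.
  get(key) {
    const h = hashToInt(key);
    const entry = this.ring.find((e) => e.point >= h) || this.ring[0];
    return entry.node;
  }
}

const ring = new HashRing(["p1", "p2", "p3"]);
console.log(ring.get("abcde")); // prints one of p1, p2, p3
```

The same key always lands on the same shard. When a fourth node is added, only the keys that now fall on the new node's points move. With plain hash modulo N, most keys would move.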

post

get
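A rough sketch of the write and read paths, runnable without Docker or Postgres. Several things here are assumptions for illustration only: the in-memory Maps stand in for the three instances, the url_id is taken as the first 5 hex characters of a SHA-256 hash of the URL (to fit the character(5) column), and pickShard uses plain hash modulo as a short stand-in for a consistent hashing ring.

```javascript
const crypto = require("crypto");

// In-memory stand-ins for the three Postgres instances (assumption).
const shards = { p1: new Map(), p2: new Map(), p3: new Map() };
const shardNames = Object.keys(shards);

// Derive a 5-character id from the URL (assumed scheme, fits character(5)).
function makeUrlId(url) {
  return crypto.createHash("sha256").update(url).digest("hex").slice(0, 5);
}

// Stand-in for the hash ring: hash the shard key, then take it modulo the shard count.
function pickShard(urlId) {
  const h = crypto.createHash("sha256").update(urlId).digest().readUInt32BE(0);
  return shardNames[h % shardNames.length];
}

// Write path: compute the shard key, pick the shard, store the row there.
function post(url) {
  const urlId = makeUrlId(url);
  const shard = pickShard(urlId);
  shards[shard].set(urlId, url);
  return { urlId, shard };
}

// Read path: the caller supplies the shard key, so only one shard is queried.
// With a real driver this would be a parameterized query, for example
// "SELECT url FROM public.test1 WHERE url_id = $1", to avoid SQL injection.
function get(urlId) {
  const shard = pickShard(urlId);
  const url = shards[shard].get(urlId);
  return url === undefined ? null : { urlId, url, shard };
}

const written = post("https://example.com/some/long/path");
console.log(written, get(written.urlId));
```

Both paths run the same pickShard function on the same key, which is what guarantees a read looks in the shard the write went to.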
