Scaling PostgreSQL: Replication, Sharding, and Beyond

November 14, 2016

PostgreSQL handles impressive workloads on a single server. Proper indexing, query optimization, and adequate hardware carry most applications far beyond their initial scale projections. But eventually, growth hits limits: CPU can’t process more queries, disk can’t handle more writes, or data exceeds practical storage limits.

Scaling PostgreSQL requires understanding the bottleneck and choosing appropriate strategies. This isn’t one-size-fits-all; different problems require different solutions.

Understanding the Bottleneck

Before choosing a scaling strategy, identify what’s actually limiting you:

Read capacity: Too many queries saturating CPU. Response times increasing under load. Solution: read replicas.

Write capacity: Transaction log writes can’t keep up. Write operations queuing. Solution: write optimization, partitioning, eventual sharding.

Storage: Data exceeds disk capacity or performance degrades with size. Solution: partitioning, archival, sharding.

Connection count: Too many concurrent connections. Solution: connection pooling.
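
One quick check while diagnosing: pg_stat_activity shows what current connections are doing, which helps separate connection pressure from genuinely slow queries.

-- How many connections are active, idle, or idle in transaction
SELECT state, count(*)
FROM pg_stat_activity
GROUP BY state
ORDER BY count(*) DESC;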

Don’t assume you need sharding because “that’s what big companies do.” Most applications never need sharding; simpler approaches solve their problems.

Connection Pooling

PostgreSQL creates a process for each connection. Many connections consume significant memory; connection storms can overwhelm the server.

Connection pooling solves this by maintaining a pool of database connections and multiplexing application connections onto them.

PgBouncer

PgBouncer is the standard PostgreSQL connection pooler:

[databases]
myapp = host=127.0.0.1 dbname=myapp

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25

Pool modes:

Session: a server connection is tied to a client for the life of the client connection. Safest, but allows the least sharing.

Transaction: a server connection is assigned for the duration of a transaction and returned to the pool afterward. The usual choice, though session state (prepared statements, SET, advisory locks) does not carry over between transactions.

Statement: a server connection is assigned per statement; multi-statement transactions are not allowed. Rarely appropriate.

Connection pooling is almost always worth implementing. It’s low-risk and solves a common bottleneck.
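
Switching the application over is usually just a connection change: point it at PgBouncer's port instead of PostgreSQL's. A rough psycopg2 example (host, credentials, and database name are placeholders):

import psycopg2

# Connect through PgBouncer (6432 here) rather than PostgreSQL directly;
# application code is otherwise unchanged.
conn = psycopg2.connect(host="127.0.0.1", port=6432,
                        dbname="myapp", user="app", password="secret")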

Read Replicas

When read queries saturate your primary server, add read replicas. Replicas receive all writes from the primary and can serve read queries independently.

Streaming Replication

PostgreSQL streaming replication is straightforward to configure:

Primary (postgresql.conf):

wal_level = replica
max_wal_senders = 5
wal_keep_segments = 32

Replica setup:

pg_basebackup -h primary-host -D /var/lib/postgresql/data -P -R

The -R flag creates a recovery.conf that connects to the primary for streaming. On the 9.x releases, also set hot_standby = on in the replica's postgresql.conf so it can serve read queries while replaying WAL.
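
Once the replica is running, confirm it is attached and streaming; on the primary, pg_stat_replication shows one row per connected replica (column names as in the 9.x releases):

-- Run on the primary
SELECT client_addr, state, sent_location, replay_location
FROM pg_stat_replication;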

Application Integration

Applications must route queries appropriately: writes must go to the primary, while reads can be served by replicas.

This requires application changes or middleware. Options:

Application-level routing: Application code explicitly routes reads to replicas (see the sketch after this list). Gives precise control but requires code changes.

Connection middleware: Tools like PgPool-II route queries based on content. More transparent but adds complexity.

Load balancer with read/write awareness: Some load balancers can parse queries and route accordingly.
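
As a sketch of the application-level approach (the connection strings and helper names here are hypothetical), a simple read/write split might look like:

import random

import psycopg2

# Hypothetical connection strings for the primary and its replicas
PRIMARY_DSN = "host=primary-host dbname=myapp"
REPLICA_DSNS = ["host=replica-1 dbname=myapp", "host=replica-2 dbname=myapp"]

def get_write_connection():
    # Every write goes to the primary
    return psycopg2.connect(PRIMARY_DSN)

def get_read_connection():
    # Reads are spread across the replicas
    return psycopg2.connect(random.choice(REPLICA_DSNS))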

Replication Lag

Replicas lag behind the primary: writes take time to propagate. This creates consistency challenges; a user who writes data and immediately reads it back may hit a replica that has not yet replayed the write.

Strategies for lag:

Read-your-writes: After writes, read from primary (or verify replication lag is acceptable).

Causal consistency: Track which replica has seen a particular write; read from that replica.

Accept eventual consistency: For many use cases (dashboards, analytics), slightly stale data is fine.

Monitor replication lag and alert when it exceeds acceptable thresholds.
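
Lag can be measured from either side; on the 9.x releases the relevant functions still carry xlog in their names:

-- On the primary: bytes of WAL each replica has yet to replay
SELECT client_addr,
       pg_xlog_location_diff(pg_current_xlog_location(), replay_location) AS lag_bytes
FROM pg_stat_replication;

-- On a replica: time since the last replayed transaction
-- (overestimates lag when the primary is idle)
SELECT now() - pg_last_xact_replay_timestamp() AS replication_delay;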

Partitioning

Partitioning divides a large table into smaller pieces. Queries that touch only one partition access less data; maintenance operations can work on individual partitions.

Table Inheritance (Pre-PostgreSQL 10)

Traditional PostgreSQL partitioning uses table inheritance:

-- Parent table
CREATE TABLE events (
    id SERIAL,
    event_time TIMESTAMP NOT NULL,
    data JSONB
);

-- Child tables for each month
CREATE TABLE events_2016_01 (
    CHECK (event_time >= '2016-01-01' AND event_time < '2016-02-01')
) INHERITS (events);

CREATE TABLE events_2016_02 (
    CHECK (event_time >= '2016-02-01' AND event_time < '2016-03-01')
) INHERITS (events);

-- Index child tables
CREATE INDEX ON events_2016_01 (event_time);
CREATE INDEX ON events_2016_02 (event_time);

This approach requires triggers for insert routing and manual partition management.
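
A minimal version of that routing trigger, matching the child tables above:

CREATE OR REPLACE FUNCTION events_insert_trigger() RETURNS TRIGGER AS $$
BEGIN
    IF NEW.event_time >= '2016-01-01' AND NEW.event_time < '2016-02-01' THEN
        INSERT INTO events_2016_01 VALUES (NEW.*);
    ELSIF NEW.event_time >= '2016-02-01' AND NEW.event_time < '2016-03-01' THEN
        INSERT INTO events_2016_02 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'No partition for event_time %', NEW.event_time;
    END IF;
    RETURN NULL;  -- the row lands in the child table, not the parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER events_insert_route
    BEFORE INSERT ON events
    FOR EACH ROW EXECUTE PROCEDURE events_insert_trigger();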

Declarative Partitioning (PostgreSQL 10+)

PostgreSQL 10, currently in development, introduces declarative partitioning:

CREATE TABLE events (
    id SERIAL,
    event_time TIMESTAMP NOT NULL,
    data JSONB
) PARTITION BY RANGE (event_time);

CREATE TABLE events_2016_01 PARTITION OF events
    FOR VALUES FROM ('2016-01-01') TO ('2016-02-01');

CREATE TABLE events_2016_02 PARTITION OF events
    FOR VALUES FROM ('2016-02-01') TO ('2016-03-01');

Declarative partitioning is cleaner and more efficient: the server routes rows to the correct partition itself, so no routing triggers or hand-written CHECK constraints are needed.
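
For example, a row is stored in the right partition without any trigger involved:

-- Routed to events_2016_01 automatically
INSERT INTO events (event_time, data)
VALUES ('2016-01-15 09:30:00', '{"type": "signup"}');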

Partitioning Strategies

Range partitioning: Divide by value ranges (dates, IDs). Good for time-series data.

List partitioning: Divide by discrete values (regions, categories). Good for known, stable categories; see the example below.

Hash partitioning: Divide by hash of a column. Distributes data evenly when no natural partitioning key exists.
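
As an illustration of list partitioning in the declarative syntax (the orders table and its regions are made up for the example):

CREATE TABLE orders (
    id BIGINT,
    region TEXT NOT NULL,
    total NUMERIC
) PARTITION BY LIST (region);

CREATE TABLE orders_us PARTITION OF orders FOR VALUES IN ('us-east', 'us-west');
CREATE TABLE orders_eu PARTITION OF orders FOR VALUES IN ('eu-west', 'eu-central');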

Benefits

Queries that filter on the partition key touch only the relevant partitions. Old data can be removed by dropping a partition instead of running a slow DELETE. Indexes and maintenance operations work against smaller, per-partition objects.

Limitations

Queries that do not filter on the partition key still scan every partition. Unique constraints cannot span partitions, and partitions must be created and managed over time as new ranges arrive.

Sharding

Sharding distributes data across multiple independent database servers. Unlike replication (all servers have all data), sharding divides data (each server has a subset).

When Sharding Makes Sense

Sharding is the most complex scaling approach. Consider it only when:

Write volume exceeds what a single primary can handle, even after optimization and partitioning.

The data set is too large for one server to store or serve with acceptable performance.

Simpler options (pooling, read replicas, partitioning, better hardware) have already been exhausted.

Most applications never need sharding. The complexity cost is significant.

Sharding Approaches

Application-level sharding: Application code determines which shard holds each piece of data and routes queries accordingly.

NUM_SHARDS = 4  # fixed shard count; changing it later means moving data

def get_shard(user_id):
    # The shard key (user_id) determines which database holds the row
    return "shard_%d" % (user_id % NUM_SHARDS)

def get_user(user_id):
    # get_connection() is the application's own per-shard connection lookup
    conn = get_connection(get_shard(user_id))
    return conn.execute("SELECT * FROM users WHERE id = %s", (user_id,))

Middleware sharding: A proxy layer routes queries to the appropriate shards. Tools like PL/Proxy provide this for PostgreSQL; Vitess plays the same role for MySQL.

Database-level sharding: Some databases handle sharding internally. Citus extends PostgreSQL with distributed table capability.

Sharding Challenges

Schema changes: Must apply to all shards consistently.

Cross-shard queries: Queries spanning multiple shards require aggregation from multiple sources.

Rebalancing: Adding shards requires redistributing data.

Transactions: Distributed transactions across shards are complex or impossible.

Operational complexity: Multiple databases to monitor, backup, upgrade.

Shard Key Selection

The shard key determines data distribution. Choose carefully:

Good shard keys: Provide even distribution and match query patterns. If most queries filter by user_id, shard by user_id.

Bad shard keys: Create hot spots (one shard gets disproportionate traffic) or require cross-shard queries for common operations.

Citus: PostgreSQL Sharding Extension

Citus extends PostgreSQL with distributed table capability. Tables can be distributed across multiple workers while queries use standard PostgreSQL syntax.

-- On coordinator
CREATE TABLE events (
    tenant_id INT,
    event_time TIMESTAMP,
    data JSONB
);

-- Distribute by tenant_id
SELECT create_distributed_table('events', 'tenant_id');

-- Queries work transparently
SELECT * FROM events WHERE tenant_id = 123;

Citus handles routing queries to appropriate shards and aggregating results. It’s particularly effective for multi-tenant applications where tenant_id is a natural shard key.
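
Cross-shard queries work too; a simple aggregate like the one below is executed on the workers and the partial results combined on the coordinator:

-- Fans out to every shard, then merges the per-shard counts
SELECT tenant_id, count(*)
FROM events
GROUP BY tenant_id
ORDER BY count(*) DESC
LIMIT 10;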

Scaling Strategy Progression

A typical scaling progression:

  1. Single server optimization: Proper indexing, query optimization, adequate hardware. This handles more than you expect.

  2. Connection pooling: Low-effort, high-reward. Implement early.

  3. Read replicas: When reads saturate the primary. Moderate application changes.

  4. Partitioning: When table size causes performance issues. Good for time-series data.

  5. Sharding: When nothing else suffices. Significant architecture change.

At each stage, measure whether the problem is actually solved before moving to more complex solutions.

Key Takeaways