Consistent Hashing

Consistent hashing is a distributed hashing scheme that minimizes key redistribution when nodes are added or removed from a distributed system. It's a fundamental technique used in distributed caches, databases, and load balancers.

[Interactive visualization: a hash ring (0-360°) with physical nodes A, B, and C, three virtual nodes per physical node (nine total), and test keys user123, session456, and cache7 shown with their assigned nodes.]

How it works: Each key is hashed to a position on the ring (0-360°). The key is assigned to the first node found clockwise from its position. Virtual nodes improve distribution by giving each physical node multiple positions on the ring.

The Problem

Traditional hashing (hash(key) % N) redistributes most keys when N changes:

4 servers: hash("user123") % 4 = server 0
5 servers: hash("user123") % 5 = server 2 (moved!)

Result: ~80% of keys need to be redistributed
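
To make the scale of the problem concrete, here is a small standalone sketch (not part of the library) that hashes 10,000 sample keys with CRC32 and counts how many of them land on a different server when the cluster grows from 4 to 5:

package main

import (
    "fmt"
    "hash/crc32"
)

func main() {
    const before, after = 4, 5 // server counts before and after scaling
    total, moved := 10000, 0

    for i := 0; i < total; i++ {
        h := crc32.ChecksumIEEE([]byte(fmt.Sprintf("user%d", i)))
        // mod-N placement: the same key maps to a different server
        // once N changes from 4 to 5
        if h%before != h%after {
            moved++
        }
    }

    // Expect roughly 80% of keys to move
    fmt.Printf("moved %d of %d keys (%.1f%%)\n",
        moved, total, 100*float64(moved)/float64(total))
}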

How Consistent Hashing Works

Imagine a circle (hash ring) with hash values from 0 to 2³²-1:

  1. Nodes are placed on the ring using hash(node_id)
  2. Keys are placed on the ring using hash(key)
  3. A key belongs to the first node clockwise from its position (see the sketch below)
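
Before introducing virtual nodes and the B-tree, a minimal standalone sketch (a sorted slice plus binary search; all names are illustrative) shows the clockwise lookup, including the wrap-around past 2³²-1 back to the start of the ring:

package main

import (
    "fmt"
    "hash/crc32"
    "sort"
)

type Ring struct {
    positions []uint32          // sorted node positions on the ring
    owner     map[uint32]string // position -> node ID
}

func NewRing(nodes []string) *Ring {
    r := &Ring{owner: make(map[uint32]string)}
    for _, n := range nodes {
        h := crc32.ChecksumIEEE([]byte(n)) // place node at hash(node_id)
        r.positions = append(r.positions, h)
        r.owner[h] = n
    }
    sort.Slice(r.positions, func(i, j int) bool { return r.positions[i] < r.positions[j] })
    return r
}

func (r *Ring) GetNode(key string) string {
    h := crc32.ChecksumIEEE([]byte(key)) // place key at hash(key)
    // first node position clockwise from the key
    i := sort.Search(len(r.positions), func(i int) bool { return r.positions[i] >= h })
    if i == len(r.positions) {
        i = 0 // wrap around to the start of the ring
    }
    return r.owner[r.positions[i]]
}

func main() {
    ring := NewRing([]string{"nodeA", "nodeB", "nodeC"})
    fmt.Println(ring.GetNode("user123"))
}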

Virtual Nodes

To improve load distribution, each physical node is mapped to multiple positions:

// ring maps a virtual node's hash to its physical node ID (map[uint32]string)
// Physical node A creates 150 virtual nodes
for i := 0; i < 150; i++ {
    hash := crc32.ChecksumIEEE([]byte(fmt.Sprintf("nodeA:%d", i)))
    ring[hash] = "nodeA"
}

This ensures:

  • Better load distribution (standard deviation < 10%)
  • Smoother rebalancing when nodes change
  • More predictable performance

Implementation with B-tree

We use Google's B-tree for O(log n) operations:

type ConsistentHash struct {
    tree     *btree.BTree             // B-tree of virtual nodes for fast lookups
    nodes    map[string]int           // physical nodes
    replicas int                      // virtual nodes per physical node
    hashFunc func(data []byte) uint32 // e.g. crc32.ChecksumIEEE
}

func (ch *ConsistentHash) GetNode(key string) string {
    hash := ch.hashFunc([]byte(key))

    // Find the first virtual node >= hash (clockwise from the key)
    var result string
    ch.tree.AscendGreaterOrEqual(VirtualNode{Hash: hash},
        func(item btree.Item) bool {
            result = item.(VirtualNode).NodeID
            return false // stop after first
        })

    // Wrap around: if no virtual node sits at or past the key's position,
    // the key belongs to the first virtual node on the ring
    if result == "" {
        if min := ch.tree.Min(); min != nil {
            result = min.(VirtualNode).NodeID
        }
    }

    return result
}
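
AddNode is not shown in the excerpt above; a sketch of how it might look with the same types (VirtualNode's shape is inferred from GetNode, and the bookkeeping in the nodes map is an assumption):

// Inferred from GetNode above: a virtual node is a ring position plus
// the physical node it belongs to. It must implement btree.Item.
type VirtualNode struct {
    Hash   uint32
    NodeID string
}

func (v VirtualNode) Less(than btree.Item) bool {
    return v.Hash < than.(VirtualNode).Hash // order by ring position
}

// Sketch: adding a physical node inserts `replicas` virtual nodes,
// giving the O(r log n) cost listed in the performance section.
func (ch *ConsistentHash) AddNode(nodeID string) {
    for i := 0; i < ch.replicas; i++ {
        hash := ch.hashFunc([]byte(fmt.Sprintf("%s:%d", nodeID, i)))
        ch.tree.ReplaceOrInsert(VirtualNode{Hash: hash, NodeID: nodeID})
    }
    ch.nodes[nodeID] = ch.replicas // exact bookkeeping is assumed
}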

Distribution Quality

With 150 virtual nodes per physical node:

Key Distribution Across Nodes

  • Node A: 24.8% (-0.8% from target)
  • Node B: 25.3% (+1.2% from target)
  • Node C: 24.5% (-2.0% from target)
  • Node D: 25.4% (+1.6% from target)

Standard deviation: 0.45% (excellent)
Max deviation: 2.0% (within target)

Target: Each node should handle ~25% of keys. With 150 virtual nodes per physical node, distribution stays within ±5% of the target, ensuring balanced load across the cluster.

Real-World Usage

1. Distributed Cache (Memcached/Redis)

func (cache *DistributedCache) Get(key string) ([]byte, error) {
    node := cache.ring.GetNode(key)
    return cache.clients[node].Get(key)
}

2. Database Sharding

func (db *ShardedDB) GetShard(userID string) *DBConnection {
    shard := db.ring.GetNode(userID)
    return db.connections[shard]
}

3. Load Balancing

func (lb *LoadBalancer) Route(sessionID string) *Backend {
    backend := lb.ring.GetNode(sessionID)
    return lb.backends[backend]
}

Data Migration

When topology changes, consistent hashing tells you where data should go, but doesn't move it:

Adding a Node

// Before: 3 nodes
"user123" → NodeA
"user456" → NodeB

// After: 4 nodes (NodeD added)
"user123" → NodeD  // needs migration
"user456" → NodeB  // stays put

// Only ~25% of keys move (optimal)
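
Which keys actually need to move can be computed by comparing lookups before and after the topology change; a sketch using the public API from the code example later in this document (the key list is illustrative):

package main

import (
    "fmt"

    "github.com/hawaijar/system-design-go/pkg/consistenthash"
)

func main() {
    ring := consistenthash.NewDefault()
    ring.AddNode("NodeA")
    ring.AddNode("NodeB")
    ring.AddNode("NodeC")

    keys := []string{"user123", "user456", "user789"}

    // Record ownership under the old topology
    before := make(map[string]string)
    for _, k := range keys {
        before[k] = ring.GetNode(k)
    }

    // Add a node; only keys whose owner changed need migration
    ring.AddNode("NodeD")
    for _, k := range keys {
        if owner := ring.GetNode(k); owner != before[k] {
            fmt.Printf("%s: migrate %s → %s\n", k, before[k], owner)
        }
    }
}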

Migration Strategies

  1. Lazy Migration: Migrate on access (sketched after this list)
  2. Active Migration: Background job moves data
  3. Dual Reads: Check both old and new locations
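
A sketch of the lazy-migration read path (strategy 1); Store and Ring are assumed minimal interfaces for illustration, not part of the library described here:

// Assumed interfaces for this sketch
type Store interface {
    Get(key string) ([]byte, bool)
    Set(key string, val []byte)
}

type Ring interface {
    GetNode(key string) string
}

// getLazy reads from the key's current owner; on a miss it falls back to
// the owner under the previous topology and migrates the value on access.
func getLazy(key string, newRing, oldRing Ring, clients map[string]Store) ([]byte, bool) {
    newNode := newRing.GetNode(key)
    if val, ok := clients[newNode].Get(key); ok {
        return val, true
    }

    oldNode := oldRing.GetNode(key)
    val, ok := clients[oldNode].Get(key)
    if !ok {
        return nil, false // not present anywhere
    }

    clients[newNode].Set(key, val) // migrate on access
    return val, true
}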

Performance

Performance Characteristics

Operation     Complexity    Description                    Time
GetNode       O(log n)      Find which node owns a key     109ns
AddNode       O(r log n)    Add a new node to the ring     86μs
RemoveNode    O(r log n)    Remove a node from the ring    90μs

(n = total virtual nodes, r = virtual nodes per physical node)

GetNode Operation Details

  • Time complexity: O(log n), where n = total virtual nodes
  • Measured latency: 109ns with 600 virtual nodes
  • Throughput: ~9.2 million operations/second (theoretical maximum at that latency)

GetNode is the most frequent operation, called on every request. With the B-tree implementation it maintains O(log n) complexity even with thousands of virtual nodes, and at ~109ns per lookup it can handle millions of requests per second.

Benchmarked on Intel Core i7 @ 2.6GHz with Go 1.21 | Times vary based on hardware and load
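
The GetNode number can be reproduced with a standard Go benchmark; a sketch assuming 4 physical nodes at the default 150 replicas each (600 virtual nodes), using the public API shown in the code example below:

package consistenthash_test

import (
    "fmt"
    "testing"

    "github.com/hawaijar/system-design-go/pkg/consistenthash"
)

func BenchmarkGetNode(b *testing.B) {
    // 4 physical nodes x 150 replicas = 600 virtual nodes
    ring := consistenthash.NewDefault()
    for i := 1; i <= 4; i++ {
        ring.AddNode(fmt.Sprintf("cache-%d", i))
    }

    keys := make([]string, 1024)
    for i := range keys {
        keys[i] = fmt.Sprintf("user:%d", i)
    }

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        _ = ring.GetNode(keys[i%len(keys)])
    }
}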

Production Considerations

1. Choose Replica Count

  • 100-200: Good balance of memory vs distribution
  • More replicas = better distribution but more memory

2. Hash Function

  • CRC32: Fast, good distribution (recommended)
  • MD5/SHA: Cryptographic but slower
  • MurmurHash: Excellent distribution, very fast
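
If the ring accepts a pluggable hash function, switching algorithms is a one-line change; a standalone sketch using only the standard library (FNV-1a stands in for MurmurHash, which needs a third-party package):

package main

import (
    "fmt"
    "hash/crc32"
    "hash/fnv"
)

// HashFunc is the shape of hash function a ring would accept.
type HashFunc func(data []byte) uint32

var (
    // CRC32: fast, good distribution, in the standard library
    crc32Hash HashFunc = crc32.ChecksumIEEE

    // FNV-1a: a standard-library alternative
    fnv1aHash HashFunc = func(data []byte) uint32 {
        h := fnv.New32a()
        h.Write(data)
        return h.Sum32()
    }
)

func main() {
    key := []byte("user:12345")
    fmt.Printf("crc32=%d fnv1a=%d\n", crc32Hash(key), fnv1aHash(key))
}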

3. Monitoring

stats := ring.GetStats()
distribution := ring.Distribution(sampleKeys)
// Log if any node has >130% average load
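
The ">130% of average load" check in the comment above could be implemented roughly as follows; the counts map (node → sampled key count) is assumed to come from whatever distribution sampling the ring exposes:

package main

import "fmt"

// checkBalance returns the nodes handling more than 130% of the average load.
func checkBalance(counts map[string]int) []string {
    if len(counts) == 0 {
        return nil
    }
    total := 0
    for _, c := range counts {
        total += c
    }
    avg := float64(total) / float64(len(counts))

    var overloaded []string
    for node, c := range counts {
        if float64(c) > 1.3*avg {
            overloaded = append(overloaded, node)
        }
    }
    return overloaded
}

func main() {
    // Illustrative counts; in practice these come from sampling the ring
    counts := map[string]int{"cache-1": 340, "cache-2": 310, "cache-3": 350}
    fmt.Println(checkBalance(counts)) // empty: all nodes within 130% of average
}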

Code Example

package main

import (
    "fmt"
    "github.com/hawaijar/system-design-go/pkg/consistenthash"
)

func main() {
    // Create ring with 150 virtual nodes per physical node
    ring := consistenthash.NewDefault()
    
    // Add nodes
    ring.AddNode("cache-1")
    ring.AddNode("cache-2")
    ring.AddNode("cache-3")
    
    // Route requests
    key := "user:12345"
    node := ring.GetNode(key)
    fmt.Printf("%s → %s\n", key, node)
    
    // Get replicas for redundancy
    replicas := ring.GetNodes(key, 3)
    fmt.Printf("Replicas: %v\n", replicas)
}

Comparison with Alternatives

Hashing Methods Comparison

Understanding the Methods: Each method makes different trade-offs in memory usage, lookup speed, and flexibility when adding or removing nodes.

Method            Best for           Memory    Lookup       Scaling cost
Mod-N Hash        Static clusters    O(1)      O(1)         O(n) redistribution
Consistent Hash   Dynamic clusters   O(vn)     O(log vn)    O(k/n) movement
Jump Hash         Fixed growth       O(1)      O(ln n)      Limited flexibility
Maglev            Load balancers     O(M)      O(1)         O(M) rebuild
Rendezvous Hash   Small clusters     O(n)      O(n)         O(k/n) movement

(n = nodes, v = virtual nodes per physical node, k = keys, M = Maglev lookup table size)

Consistent Hash in Detail

Best for: Dynamic clusters

  • Memory usage: O(vn)
  • Lookup performance: O(log vn)
  • Dynamic operations: O(k/n) movement

Advantages:
  • Minimal data movement
  • Smooth scaling
  • Good load distribution with virtual nodes

Limitations:
  • More complex than mod-N
  • Requires virtual nodes for balance
  • Slightly slower lookups

When to use Consistent Hash: Ideal for distributed caches, databases, and any system where nodes frequently join or leave. Provides the best balance of features for most use cases.
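
For contrast with the table above, Jump Hash gets its O(1) memory by computing a bucket directly from the key rather than storing a ring; a Go transcription of the published jump consistent hash algorithm (Lamping and Veach), with the caveat that buckets can only be added or removed at the end:

package main

import "fmt"

// jumpHash maps a 64-bit key to a bucket in [0, numBuckets) with no lookup
// table; when numBuckets grows by one, only ~1/numBuckets of keys move.
func jumpHash(key uint64, numBuckets int32) int32 {
    var b, j int64 = -1, 0
    for j < int64(numBuckets) {
        b = j
        key = key*2862933555777941757 + 1
        j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
    }
    return int32(b)
}

func main() {
    fmt.Println(jumpHash(123456789, 10)) // bucket for this key among 10 buckets
}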

Implementation

Explore the Go implementation in the pkg/consistenthash package of github.com/hawaijar/system-design-go, used in the code examples above.
