Distributed Unique ID Generator

Designing a distributed ID generator that can produce unique, ordered identifiers at massive scale. Think Twitter's Snowflake, Instagram's ID generation, or Discord's unique message IDs.

System Requirements

Functional Requirements

  • Uniqueness: IDs must be globally unique across all servers
  • Numerical: IDs are numerical values only (primary requirement)
  • 64-bit constraint: Must fit in a 64-bit integer (long/bigint)
  • Time-ordered: IDs generated later have larger values
  • High throughput: Generate 10,000+ unique IDs per second
  • Low latency: ID generation should be fast (~1ms)

Why Not Use Traditional Approaches?

❌ Auto-increment Database ID

  • Single point of failure
  • Doesn't scale horizontally
  • Database bottleneck at high throughput
  • Hard to shard

❌ UUID/GUID

  • 128-bit (doesn't fit 64-bit requirement)
  • Not time-ordered (UUID v4)
  • Poor index performance
  • Not sequential

❌ Multi-Master Replication

  • ID collision risk across masters
  • Complex conflict resolution
  • Network partition issues
  • Synchronization overhead

🔍 Why Multi-Master Fails at Scale

Scenario: Two masters generating auto-increment IDs independently:

Master A: 1001, 1002, 1003...
Master B: 1001, 1002, 1003...
Result: Duplicate IDs! 💥

Common "Solutions" and Their Limitations:

• Increment by K (K = number of servers):

Master A: 1, 4, 7, 10... (start=1, increment=3)
Master B: 2, 5, 8, 11... (start=2, increment=3)
Master C: 3, 6, 9, 12... (start=3, increment=3)
✓ No collisions, but ✗ Adding/removing servers is complex
✗ IDs reveal infrastructure (3→6→9 shows 3 servers)
✗ Gaps grow with more servers (100 servers = increment by 100)

• ID ranges: Master A gets 1 to 1M, Master B gets 1M+1 to 2M (range exhaustion, rebalancing needed)

• Coordination: Masters sync before generating (network latency kills throughput)

• Central allocator: Request ID blocks from a central service (single point of failure)

The Real Problem: While increment-by-K prevents collisions, it doesn't meet our requirement of time-ordered IDs. IDs from different servers interleave randomly:

Time order: A₁(t=1), B₁(t=2), A₂(t=3), C₁(t=4), B₂(t=5)
Actual IDs: 1, 2, 4, 3, 5
Result: IDs are NOT time-ordered! 🔀
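The interleaving above can be reproduced with a small simulation. This is a sketch: the three masters, their offsets, and the arrival order all follow the example directly.

```python
def increment_by_k_demo():
    """Simulate 3 masters using start=i, increment=3 and show that
    IDs taken in arrival order are not monotonically increasing."""
    counters = {"A": 1, "B": 2, "C": 3}  # per-master start offsets

    def next_id(master):
        value = counters[master]
        counters[master] += 3  # increment by K=3
        return value

    # Arrival order from the example: A₁, B₁, A₂, C₁, B₂
    arrivals = ["A", "B", "A", "C", "B"]
    return [next_id(m) for m in arrivals]

ids = increment_by_k_demo()
print(ids)                 # [1, 2, 4, 3, 5]
print(ids == sorted(ids))  # False -> not time-ordered
```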

Approach 1: Twitter Snowflake

❄️ Snowflake Algorithm (Most Popular)

Twitter's Snowflake packs multiple components into a 64-bit integer to guarantee uniqueness and ordering.

Snowflake Bit Layout

64-bit ID Structure:
┌─────────────────────────────────────────────────────────────────┐
│ 0 │   41 bits timestamp   │ 10 bits machine │ 12 bits sequence │
└─────────────────────────────────────────────────────────────────┘
  ↑            ↑                      ↑                ↑
Sign    Milliseconds since     Data center +    Sequence counter
bit     custom epoch           Worker ID         (per millisecond)

Example ID: 419430400000135169 (datacenter 1, worker 1, sequence 1, ~March 2023 with a 2020 epoch)
Binary:     0|00001011101001000011101101110100000000000|00001|00001|000000000001
            ↑ └────────────────41 bits────────────────┘└─5──┘└─5──┘└────12────┘
         Sign        timestamp (ms since epoch)      datacenter worker sequence

Component Breakdown

| Component  | Bits    | Range         | Purpose                                     |
|------------|---------|---------------|---------------------------------------------|
| Sign bit   | 1 bit   | Always 0      | Keep number positive                        |
| Timestamp  | 41 bits | ~69 years     | Milliseconds since epoch (ensures ordering) |
| Machine ID | 10 bits | 1024 machines | Unique server identifier                    |
| Sequence   | 12 bits | 4096/ms       | Counter for IDs in same millisecond         |

Implementation

import time
import threading

class SnowflakeIDGenerator:
    def __init__(self, datacenter_id, worker_id):
        # Bit allocation
        self.TIMESTAMP_BITS = 41
        self.DATACENTER_BITS = 5
        self.WORKER_BITS = 5
        self.SEQUENCE_BITS = 12
        
        # Max values
        self.MAX_DATACENTER_ID = (1 << self.DATACENTER_BITS) - 1  # 31
        self.MAX_WORKER_ID = (1 << self.WORKER_BITS) - 1          # 31
        self.MAX_SEQUENCE = (1 << self.SEQUENCE_BITS) - 1         # 4095
        
        # Bit shift amounts
        self.WORKER_SHIFT = self.SEQUENCE_BITS                    # 12
        self.DATACENTER_SHIFT = self.SEQUENCE_BITS + self.WORKER_BITS  # 17
        self.TIMESTAMP_SHIFT = self.DATACENTER_SHIFT + self.DATACENTER_BITS  # 22
        
        # Custom epoch (January 1, 2020)
        self.EPOCH = 1577836800000
        
        # Instance configuration
        if datacenter_id > self.MAX_DATACENTER_ID or datacenter_id < 0:
            raise ValueError(f"Datacenter ID must be between 0 and {self.MAX_DATACENTER_ID}")
        if worker_id > self.MAX_WORKER_ID or worker_id < 0:
            raise ValueError(f"Worker ID must be between 0 and {self.MAX_WORKER_ID}")
            
        self.datacenter_id = datacenter_id
        self.worker_id = worker_id
        self.sequence = 0
        self.last_timestamp = -1
        self.lock = threading.Lock()
    
    def _current_millis(self):
        return int(time.time() * 1000)
    
    def _wait_next_millis(self, last_timestamp):
        timestamp = self._current_millis()
        while timestamp <= last_timestamp:
            timestamp = self._current_millis()
        return timestamp
    
    def generate(self):
        with self.lock:
            timestamp = self._current_millis()
            
            # Clock moved backwards - refuse to generate
            if timestamp < self.last_timestamp:
                raise RuntimeError(f"Clock moved backwards. Refusing to generate IDs for {self.last_timestamp - timestamp} ms")
            
            # Same millisecond - increment sequence
            if timestamp == self.last_timestamp:
                self.sequence = (self.sequence + 1) & self.MAX_SEQUENCE
                # Sequence overflow - wait for next millisecond
                if self.sequence == 0:
                    timestamp = self._wait_next_millis(self.last_timestamp)
            else:
                # New millisecond - reset sequence
                self.sequence = 0
            
            self.last_timestamp = timestamp
            
            # Generate ID by combining all components
            id_value = ((timestamp - self.EPOCH) << self.TIMESTAMP_SHIFT) | \
                       (self.datacenter_id << self.DATACENTER_SHIFT) | \
                       (self.worker_id << self.WORKER_SHIFT) | \
                       self.sequence
            
            return id_value

# Usage
generator = SnowflakeIDGenerator(datacenter_id=1, worker_id=1)
for _ in range(5):
    print(generator.generate())
# Sample output (actual values depend on the current time; five increasing IDs):
# 412376492842516481
# 412376492842516482
# 412376492842516483
# ...
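Because the bit layout is fixed, any Snowflake ID can be decoded back into its components with shifts and masks. A minimal decoder for the 5/5/12 layout and 2020 epoch used above:

```python
EPOCH = 1577836800000  # 2020-01-01 in ms, the same custom epoch as above

def decode_snowflake(id_value):
    """Split a 64-bit Snowflake ID (5/5/12 layout) back into components."""
    sequence = id_value & 0xFFF                # low 12 bits
    worker_id = (id_value >> 12) & 0x1F        # next 5 bits
    datacenter_id = (id_value >> 17) & 0x1F    # next 5 bits
    timestamp_ms = (id_value >> 22) + EPOCH    # high 41 bits, plus epoch
    return timestamp_ms, datacenter_id, worker_id, sequence

# Round-trip an ID built from known components
ts = EPOCH + 100_000_000_000                   # ~March 2023
id_value = ((ts - EPOCH) << 22) | (1 << 17) | (1 << 12) | 7
print(decode_snowflake(id_value))              # (1677836800000, 1, 1, 7)
```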

Snowflake Analysis

⚠️ The 69-Year Limit: What Happens After?

Question: What happens when the 41-bit timestamp overflows after ~69 years?

The Problem: Simply resetting to a new epoch would cause ID collisions - the same IDs would be generated again!

2020 epoch:       t=1ms in, machine=1, seq=0 → ID X
2089 epoch reset: t=1ms in, machine=1, seq=0 → same ID X 💥 Collision!

Real-World Solutions:

  • Most common: Migrate to a new scheme before the epoch runs out (most systems are redesigned long before 69 years pass)
  • Version prefix: Add epoch version outside 64-bit (v1_id, v2_id)
  • Reserve bits: Use 2 bits for epoch version (4 epochs × 17 years each)

Reality: Twitter (2010 epoch) has until 2079. They'll likely migrate to a completely different system decades before then.
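The ~69-year and ~17-year figures fall directly out of the timestamp bit widths:

```python
MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365.25

years_41_bits = (2 ** 41) / MS_PER_YEAR  # full 41-bit timestamp
years_39_bits = (2 ** 39) / MS_PER_YEAR  # after reserving 2 bits for an epoch version

print(round(years_41_bits, 1))  # 69.7 years
print(round(years_39_bits, 1))  # 17.4 years per epoch version
```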

✅ Advantages

  • Time-ordered (sortable)
  • Highly scalable (1024 machines)
  • No network calls needed
  • 4096 IDs per millisecond per machine
  • Works for 69 years from epoch

⚠️ Considerations

  • Requires machine ID coordination
  • Clock synchronization critical
  • Clock rollback handling needed
  • ID reveals timestamp (privacy)

Approach 2: Database Ticket Server

🎫 Centralized Ticket Server

Flickr's approach: Use dedicated MySQL servers to generate IDs with auto-increment.

Architecture

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│ App Server 1 │     │ App Server 2 │     │ App Server 3 │
└──────┬───────┘     └──────┬───────┘     └──────┬───────┘
       │                    │                    │
       └────────────────────┼────────────────────┘
                    ┌──────▼──────┐
                    │Load Balancer│
                    └──────┬──────┘
            ┌──────────────┼──────────────┐
            │                             │
    ┌───────▼────────┐          ┌────────▼───────┐
    │ Ticket Server 1│          │ Ticket Server 2│
    │  (Odd IDs)     │          │  (Even IDs)    │
    │  AUTO_INCREMENT│          │  AUTO_INCREMENT│
    │  increment = 2 │          │  increment = 2 │
    └────────────────┘          └────────────────┘

MySQL Configuration

-- Ticket Server 1 (generates odd IDs)
CREATE TABLE tickets64 (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    stub CHAR(1) NOT NULL DEFAULT '',
    PRIMARY KEY (id),
    UNIQUE KEY (stub)
);

ALTER TABLE tickets64 AUTO_INCREMENT = 1;
SET @@auto_increment_increment = 2;
SET @@auto_increment_offset = 1;

-- Ticket Server 2 (generates even IDs; same CREATE TABLE as Server 1)
ALTER TABLE tickets64 AUTO_INCREMENT = 2;
SET @@auto_increment_increment = 2;
SET @@auto_increment_offset = 2;

-- To get a new ID:
REPLACE INTO tickets64 (stub) VALUES ('a');
SELECT LAST_INSERT_ID();
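The odd/even allocation can be simulated without MySQL. This is a sketch of the offset/increment behavior only; real ticket servers would run the SQL above.

```python
class TicketServer:
    """Simulates one MySQL ticket server configured with
    auto_increment_offset and auto_increment_increment."""
    def __init__(self, offset, increment):
        self.next_value = offset
        self.increment = increment

    def get_id(self):
        value = self.next_value
        self.next_value += self.increment
        return value

server1 = TicketServer(offset=1, increment=2)  # odd IDs
server2 = TicketServer(offset=2, increment=2)  # even IDs

odd_ids = [server1.get_id() for _ in range(4)]
even_ids = [server2.get_id() for _ in range(4)]
print(odd_ids)   # [1, 3, 5, 7]
print(even_ids)  # [2, 4, 6, 8]
```

The two ID streams never overlap, which is exactly why Flickr could run two ticket servers behind a load balancer.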

✅ Advantages

  • Simple implementation
  • IDs are roughly time-ordered
  • Works with any MySQL setup
  • No clock sync issues

❌ Disadvantages

  • Network call for every ID
  • Single point of failure risk
  • Database becomes bottleneck
  • Not suitable for 10K+ IDs/sec

Approach 3: UUID with Timestamp (UUIDv7)

🔑 Time-Ordered UUID (128-bit Alternative)

📝 Note: This doesn't fit the 64-bit requirement but is included for completeness as it's widely used when 128-bit is acceptable.

UUIDv7 Structure (Time-ordered):
┌────────────────────────────────────────────────────────────────┐
│  48-bit timestamp  │ 4-bit │  12-bit random  │ 2-bit │ 62-bit │
│   (milliseconds)   │version│    or counter   │variant│ random │
└────────────────────────────────────────────────────────────────┘

Example: 018e3f7a-5ec9-7a2b-9e4c-4f2b6c3d8e9f
         └─timestamp─┘└ver┘└─────random──────┘
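Recent Python (3.14+) ships uuid.uuid7() natively; on older versions, a UUIDv7-style value can be assembled by hand from the bit layout above. A sketch:

```python
import secrets
import time
import uuid

def uuid7_sketch():
    """Build a UUIDv7-style value: 48-bit ms timestamp, version 7,
    RFC 4122 variant, remaining 74 bits random."""
    timestamp_ms = int(time.time() * 1000) & ((1 << 48) - 1)
    rand_a = secrets.randbits(12)
    rand_b = secrets.randbits(62)
    value = (timestamp_ms << 80) | (0x7 << 76) | (rand_a << 64) \
            | (0b10 << 62) | rand_b
    return uuid.UUID(int=value)

u = uuid7_sketch()
print(u)          # timestamp-prefixed, so IDs sort by creation time
print(u.version)  # 7
```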

Approach 4: Alphanumeric IDs

🔤 Alphanumeric ID Generation

When alphanumeric IDs are acceptable, we gain more density and flexibility.

Approach 4a: Encoded Snowflake

def generate_alphanumeric_id():
    # Generate numeric Snowflake ID
    numeric_id = snowflake_generator.generate()  # e.g., 412376492842516481
    
    # Encode to Base62 for URL-safe alphanumeric
    base62_id = base62_encode(numeric_id)  # ~10-11 characters
    
    return base62_id

# Benefits:
# - 10-11 characters instead of up to 19 digits
# - Still time-ordered (pad to a fixed length to preserve lexicographic order)
# - URL-safe without encoding
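Both snippets in this section assume a base62_encode helper that isn't defined above. A minimal version:

```python
import string

# 0-9, A-Z, a-z: 62 URL-safe symbols
BASE62_ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase

def base62_encode(number):
    """Encode a non-negative integer as a Base62 string."""
    if number == 0:
        return BASE62_ALPHABET[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, 62)
        digits.append(BASE62_ALPHABET[remainder])
    return ''.join(reversed(digits))

print(base62_encode(412376492842516481))  # 19 digits -> 10 Base62 characters
```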

Approach 4b: Nano ID

import string
import secrets
import time

class NanoIDGenerator:
    def __init__(self, alphabet=None, size=21):
        self.alphabet = alphabet or string.ascii_letters + string.digits + '-_'
        self.size = size
        
    def generate(self):
        # Add timestamp prefix for ordering (optional)
        timestamp_part = base62_encode(int(time.time() * 1000))
        
        # Random suffix for uniqueness
        random_part = ''.join(secrets.choice(self.alphabet) 
                             for _ in range(self.size - len(timestamp_part)))
        
        return timestamp_part + random_part

# Example: "1Ky8_dXl9pRb2C4mNvQ7F"

Approach 4c: KSUID (K-Sortable Unique ID)

KSUID Structure (160 bits = 20 bytes):
┌─────────────────────────────────────────┐
│ 32-bit timestamp │ 128-bit random payload│
│  (seconds since  │   (16 random bytes)   │
│   custom epoch)  │                       │
└─────────────────────────────────────────┘

Base62 encoded: 27 characters
Example: "2LjpHjBdPXKDSJ4gVcNvYjZ9MqT"

Comparison Matrix

| Approach         | 64-bit     | Ordered  | Throughput | Complexity | Use Case                    |
|------------------|------------|----------|------------|------------|-----------------------------|
| Snowflake        | ✅ Yes     | ✅ Yes   | 4M/sec     | Medium     | Twitter, Discord, Instagram |
| Ticket Server    | ✅ Yes     | ⚠️ Rough | 10K/sec    | Simple     | Flickr, small scale         |
| UUIDv7           | ❌ 128-bit | ✅ Yes   | Unlimited  | Simple     | When 128-bit OK             |
| Base62 Snowflake | ✅ Yes*    | ✅ Yes   | 4M/sec     | Medium     | URL shorteners              |
| KSUID            | ❌ 160-bit | ✅ Yes   | Unlimited  | Simple     | Segment, Stripe             |

*64-bit integer underneath; stored and shared as a short string.

Real-World Implementations

🐦 Twitter Snowflake

Original Snowflake implementation for tweet IDs

41 bits: timestamp
10 bits: machine ID
12 bits: sequence
= 64-bit sortable IDs

📸 Instagram

Modified Snowflake with PostgreSQL

41 bits: timestamp
13 bits: shard ID
10 bits: sequence
Uses PostgreSQL schemas

💬 Discord

Snowflake for messages, channels, users

Discord epoch: 2015-01-01
42 bits: timestamp
5 bits: worker ID
5 bits: process ID
12 bits: increment
Visible in all Discord IDs

🔧 MongoDB ObjectId

96-bit IDs with embedded timestamp

32 bits: Unix timestamp
40 bits: random value
24 bits: counter
Hex encoded: 24 chars

Handling Edge Cases

⚠️ Critical Considerations

Clock Synchronization

  • Use NTP to keep clocks in sync
  • Detect clock rollback and refuse to generate
  • Consider monotonic clocks
  • Add jitter buffer (wait if clock moves back)
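A common refinement combines the last two bullets: wait out a small backward drift, but refuse to generate on a large one. A sketch; the 5 ms tolerance is an assumed value to tune against your NTP drift:

```python
import time

MAX_BACKWARD_DRIFT_MS = 5  # assumed tolerance; tune for your environment

def handle_clock_rollback(current_ms, last_ms):
    """Small rollback: wait it out. Large rollback: refuse to generate."""
    drift = last_ms - current_ms
    if drift <= 0:
        return current_ms                       # clock is fine
    if drift <= MAX_BACKWARD_DRIFT_MS:
        time.sleep(drift / 1000.0)              # wait for the clock to catch up
        return max(int(time.time() * 1000), last_ms)
    raise RuntimeError(f"Clock moved back {drift} ms; refusing to generate IDs")

print(handle_clock_rollback(1000, 998))  # 1000 (no rollback)
```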

Machine ID Assignment

  • Use ZooKeeper for dynamic assignment
  • Environment variables for static assignment
  • IP address hashing (risk of collisions)
  • Container orchestrator labels (K8s)
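The static-assignment option can be as simple as reading and validating an environment variable (the WORKER_ID name here is hypothetical; in K8s it might be derived from a StatefulSet ordinal):

```python
import os

MAX_WORKER_ID = (1 << 10) - 1  # 10 machine-ID bits -> 0..1023

def worker_id_from_env(env_var="WORKER_ID"):
    """Read a statically assigned worker ID, failing loudly rather than
    guessing (a duplicate machine ID silently breaks uniqueness)."""
    raw = os.environ.get(env_var)
    if raw is None:
        raise RuntimeError(f"{env_var} is not set; refusing to guess a machine ID")
    worker_id = int(raw)
    if not 0 <= worker_id <= MAX_WORKER_ID:
        raise ValueError(f"{env_var}={worker_id} outside 0..{MAX_WORKER_ID}")
    return worker_id

os.environ["WORKER_ID"] = "42"  # e.g. injected by deployment tooling
print(worker_id_from_env())      # 42
```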

Sequence Overflow

  • Wait for next millisecond if sequence exhausted
  • Pre-generate ID pools during low traffic
  • Use larger sequence bits if needed
  • Monitor sequence usage patterns

Final Recommendation

🎯 For Your Requirements

Given the requirements (64-bit, numerical, ordered, 10K+ IDs/sec), Snowflake is the best choice.

Implementation Plan:

  1. Use 41-bit timestamp (69 years from custom epoch)
  2. Allocate 10 bits for machine ID (1024 servers)
  3. Use 12-bit sequence (4096 IDs/ms = 4M IDs/sec per machine)
  4. Implement clock rollback protection
  5. Use ZooKeeper/etcd for machine ID coordination
  6. For alphanumeric: Encode Snowflake IDs with Base62

Summary

Distributed ID generation is a fundamental problem with multiple solutions. Snowflake provides the best balance of:

  • ✅ Fits in 64-bit
  • ✅ Time-ordered for database indexing
  • ✅ No coordination needed after setup
  • ✅ Scales to thousands of machines
  • ✅ Proven at Twitter, Discord, Instagram scale

For alphanumeric IDs, encoding Snowflake IDs with Base62 gives you shorter, URL-safe identifiers while maintaining all the benefits.

💡 Pro Tip: Start with Snowflake for numerical IDs. If you need shorter strings for URLs or user-facing IDs, Base62-encode them. This gives you the best of both worlds: sortable 64-bit integers in your database and compact strings for your users.