Step 4: Scale Refinement

Step 4 of 6: R - Refinement for Scale

Optimize the architecture to handle billions of users with database sharding, global distribution, and advanced caching

🔥 Scale Challenges at 2 Billion Users

⚠️ Where Our Core Architecture Breaks

🔥 Current Bottlenecks
  • Single DynamoDB table: 40K WCU/RCU limits hit
  • Celebrity posts: 100M followers = 100M writes
  • Hot partitions: Popular users overload specific shards
  • Global latency: 500ms+ for cross-continent requests
  • Cache misses: Redis memory limits at 100TB scale
📊 Scale Requirements
  • Read QPS: 7 million/sec (peak: 10M)
  • Write QPS: 65K/sec (peak: 100K)
  • Storage: 456 PB/year growth
  • Bandwidth: 950 GB/sec egress
  • P95 Latency: Must stay under 100ms
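
A rough sanity check against these numbers (and the 40K WCU/RCU per-table limit noted above) shows the shape of the problem: peak writes of 100K/sec already exceed a single table's 40K WCU, while 32 shards bring that down to roughly 3K writes/sec per shard. On the read side, even 32 feed shards at 100K RCU each cover only about 3.2M reads/sec, so caches must absorb most of the 10M/sec peak, implying a combined hit rate of roughly 70% or better.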

🗄️ DynamoDB Sharding Strategy

📊 Horizontal Sharding Architecture

DynamoDB Sharding Strategy (32 Shards)

[Diagram: sharding router in front of 32-way sharded tables]

  • Sharding Router (Feed Service): shard = hash(userId) % 32
  • Posts tables (sharded by userId): Posts_0 … Posts_31, each ~62.5M users, ~15M posts/day, 40K WCU
  • Follow tables (sharded by followerId): Follow_0 … Follow_31, each with a GSI, ~62.5M users, 20K WCU
  • PrecomputedFeed tables (sharded by userId): Feed_0 … Feed_31, TTL 30 days, 100K RCU each
  • Cross-Shard Query Service: Elasticsearch + aggregation layer for analytics and admin queries

✅ Sharding Implementation

Shard Routing Logic:
// Deterministic shard routing. Any stable, non-cryptographic hash works;
// murmurHash3 is assumed to come from a hashing library.
const SHARD_COUNT = 32;

function getShardNumber(userId: string): number {
  const hash = murmurHash3(userId);
  return hash % SHARD_COUNT;
}

// Service implementation (dynamodb is a promise-based DynamoDB client)
class FeedService {
  async createPost(userId: string, post: Post): Promise<void> {
    const shard = getShardNumber(userId);
    const tableName = `Posts_${shard}`;

    await dynamodb.putItem({
      TableName: tableName,
      Item: {
        postId: generateId(),
        userId: userId,
        content: post.content,
        createdAt: Date.now()
      }
    });
  }

  async getFollowers(userId: string) {
    // Follow tables are sharded by followerId, so a single-shard GSI query
    // only returns all of a user's followers if each follow edge is also
    // written to the followee's shard; this routing assumes that double-write.
    const shard = getShardNumber(userId);
    const tableName = `Follow_${shard}`;

    // Query the GSI on the sharded table
    return await dynamodb.query({
      TableName: tableName,
      IndexName: 'FollowingId-FollowerId-Index',
      KeyConditionExpression: 'followingId = :userId',
      ExpressionAttributeValues: {
        ':userId': userId
      }
    });
  }
}

Benefits:

  • Each shard stays under DynamoDB limits (40K WCU/RCU)
  • Linear scaling: add more shards as needed
  • Fault isolation: a shard failure affects only 1/32 of users
  • Predictable performance per shard

⚠️ Sharding Challenges & Solutions

Hot Shard Problem:

Celebrity users can overload their shard. Solution: virtual sharding, where a popular user's data is spread across several physical shards (sketch below).
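
A minimal sketch of virtual sharding, building on the getShardNumber routing above; the hot-user threshold and sub-shard count are illustrative assumptions:

const SHARD_COUNT = 32;
const HOT_USER_THRESHOLD = 1_000_000; // illustrative cutoff
const SUB_SHARDS = 4;                 // hot users span 4 physical shards

// Writes: salt the shard choice with the post id, so a hot user's posts
// spread across SUB_SHARDS physical shards instead of saturating one.
function getWriteShard(userId: string, postId: string, followerCount: number): number {
  if (followerCount < HOT_USER_THRESHOLD) {
    return getShardNumber(userId); // regular users: one shard
  }
  const salt = murmurHash3(postId) % SUB_SHARDS;
  return (getShardNumber(userId) + salt) % SHARD_COUNT;
}

// Reads: a hot user's posts must be gathered from all sub-shards and merged.
function getReadShards(userId: string, followerCount: number): number[] {
  if (followerCount < HOT_USER_THRESHOLD) {
    return [getShardNumber(userId)];
  }
  const base = getShardNumber(userId);
  return Array.from({ length: SUB_SHARDS }, (_, i) => (base + i) % SHARD_COUNT);
}

The cost is a small fan-out read for hot users; since such users are few, that is far cheaper than a saturated shard.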

Cross-Shard Queries:

Analytics need data from all shards. Solution: Async ETL to data warehouse (ClickHouse) for aggregated queries.

Resharding:

Growing from 32 to 64 shards requires migrating data. Solution: consistent hashing with virtual nodes, which moves only a small, predictable fraction of keys per added shard (sketch below).
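
A minimal consistent-hash ring with virtual nodes, again assuming the murmurHash3 helper from the routing code; the virtual-node count is an illustrative assumption:

class ConsistentHashRing {
  // Sorted ring of (position, shard) entries; each shard appears vnodes times
  private ring: { pos: number; shard: string }[] = [];

  constructor(shards: string[], vnodes = 128) {
    for (const shard of shards) {
      for (let i = 0; i < vnodes; i++) {
        this.ring.push({ pos: murmurHash3(`${shard}#vn${i}`), shard });
      }
    }
    this.ring.sort((a, b) => a.pos - b.pos);
  }

  // The first virtual node clockwise from the key's hash owns the key
  getShard(userId: string): string {
    const h = murmurHash3(userId);
    let lo = 0, hi = this.ring.length;
    while (lo < hi) { // binary search for the first pos >= h
      const mid = (lo + hi) >> 1;
      if (this.ring[mid].pos < h) lo = mid + 1;
      else hi = mid;
    }
    return this.ring[lo % this.ring.length].shard;
  }
}

// Usage: adding one shard to a 32-shard ring remaps only ~1/33 of keys
// (those whose nearest virtual node changed), instead of reshuffling
// nearly every key the way hash(userId) % N does when N changes.
const ring = new ConsistentHashRing(
  Array.from({ length: 32 }, (_, i) => `Posts_${i}`)
);
const table = ring.getShard('user-123'); // e.g. 'Posts_17'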

Shard Key Selection:

userId ensures all of a user's data is co-located, but creates hotspots for very active users. Alternative: composite keys (userId + timestamp) for time-series optimization (example below).
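
A hypothetical Posts item keyed both ways, to make the trade-off concrete; the day-bucket scheme is an illustrative assumption, not part of the design above:

// Option A: co-located. Partition key = userId, sort key = createdAt.
// A user's timeline is a single-partition range query, but a very hot
// user concentrates all load on that one partition.
const colocatedItem = {
  pk: 'user-123',
  sk: 1700000000000, // createdAt (epoch ms) as sort key
  content: '...'
};

// Option B: composite key. Partition key = userId + day bucket.
// One user's writes now spread over one partition per day; reading a
// timeline means querying the few most recent buckets and merging.
const dayBucket = Math.floor(Date.now() / 86_400_000);
const bucketedItem = {
  pk: `user-123#${dayBucket}`,
  sk: Date.now(), // createdAt still orders posts within the bucket
  content: '...'
};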

🌍 Global Multi-Region Architecture

Global Distribution Strategy - 6 Regions

[Diagram: global distribution across 6 regions]

  • US East (Primary): full stack (services, Redis, DynamoDB, S3 primary, Kafka, ML models); shards 0-7; 500M users; read/write
  • US West (Secondary): services, Redis, DynamoDB; shards 8-15; 400M users; read/write
  • EU (GDPR Primary): services, Redis, DynamoDB; shards 16-23; 600M users; read/write
  • Asia Pacific (Singapore + Mumbai): shards 24-27; 300M users; read/write
  • South America (São Paulo): shards 28-29; 150M users; read-heavy
  • Africa/Middle East (Cape Town + Dubai): shards 30-31; 50M users; read-heavy
  • Global Load Balancer (GeoDNS + Anycast) in front; Global CDN with 200+ edge locations; async replication between regions

🌐 Regional Strategy

Data Residency:

  • EU users' data stays in the EU (GDPR compliance)
  • China users are served from separate infrastructure
  • Cross-region replication for public content only

Request Routing:

  • GeoDNS routes each request to the nearest region
  • Anycast IPs for ultra-low latency
  • Sticky sessions for consistency

Failover Strategy:

  • US East → US West failover within 10 seconds
  • EU → US East for non-GDPR users
  • Health checks every 5 seconds

⚡ Cross-Region Optimization

DynamoDB Global Tables:
{
  "GlobalTableName": "Posts",
  "Regions": [
    {
      "RegionName": "us-east-1",
      "ReplicaStatus": "ACTIVE",
      "Role": "PRIMARY"
    },
    {
      "RegionName": "eu-west-1",
      "ReplicaStatus": "ACTIVE",
      "Role": "SECONDARY",
      "ReplicationLag": "< 1 second"
    }
  ],
  "ConsistencyModel": "EVENTUAL",
  "ConflictResolution": "LAST_WRITER_WINS"
}
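
Last-writer-wins is worth making concrete. A minimal sketch of reconciling two replica versions of the same item; the version shape here is an illustrative assumption:

interface ReplicaVersion<T> {
  value: T;
  updatedAt: number; // wall-clock timestamp of the write
  region: string;    // deterministic tiebreaker for equal timestamps
}

// Last-writer-wins: the newer write survives, the older one is discarded.
// The tiebreaker guarantees every region converges on the same value.
function resolveLastWriterWins<T>(
  a: ReplicaVersion<T>,
  b: ReplicaVersion<T>
): ReplicaVersion<T> {
  if (a.updatedAt !== b.updatedAt) {
    return a.updatedAt > b.updatedAt ? a : b;
  }
  return a.region > b.region ? a : b;
}

The cost of last-writer-wins is that the losing write is silently dropped, which is acceptable for posts and profile edits but not for counters or anything transactional.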

Performance Targets:

  • Intra-region latency: <50ms
  • Cross-region replication: <1 second
  • Global cache hit rate: >85%

🚀 Multi-Tier Caching Strategy

🏗️ Cache Architecture

L1: Application Cache
  • Location: In-memory on app servers
  • Size: 16GB per instance
  • TTL: 60 seconds
  • Content: Hot user sessions, API responses
  • Hit Rate: 30-40%
L2: Redis Cluster
  • Location: Dedicated cache tier
  • Size: 10TB per region
  • TTL: 15 minutes
  • Content: Feeds, user data, social graph
  • Hit Rate: 70-80%
L3: CDN Edge Cache
  • Location: 200+ edge locations
  • Size: 100TB globally
  • TTL: 24 hours
  • Content: Media files, static content
  • Hit Rate: 85-95%

⚡ Caching Strategies

Feed Caching Logic:
class FeedCache {
  async getFeed(userId: string): Promise<Feed> {
    const key = `feed:${userId}`; // same logical key in both tiers

    // L1: check the in-process application cache
    let feed = this.appCache.get(key);
    if (feed) return feed;

    // L2: check Redis
    feed = await this.redis.get(key);
    if (feed) {
      this.appCache.set(key, feed, 60); // backfill L1
      return feed;
    }

    // Cache miss: generate the feed
    feed = await this.generateFeed(userId);

    // Write-through to both tiers
    await Promise.all([
      this.redis.setex(key, 900, feed), // 15 min
      this.appCache.set(key, feed, 60)  // 1 min
    ]);

    return feed;
  }

  async invalidateFeed(userId: string): Promise<void> {
    const key = `feed:${userId}`;

    // Cascade invalidation through both tiers
    this.appCache.delete(key);
    await this.redis.del(key);

    // Pre-warm the cache for active users
    if (await this.isActiveUser(userId)) {
      const feed = await this.generateFeed(userId);
      await this.redis.setex(key, 900, feed);
    }
  }
}

Cache Warming:

  • Pre-compute feeds for users active in the last hour (sketch below)
  • Celebrity posts cached immediately
  • Trending content pre-cached globally
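
A minimal warming job built on the FeedCache above; the ActivityService interface and batch size are illustrative assumptions:

interface ActivityService {
  getUsersActiveSince(sinceMs: number): Promise<string[]>;
}

class FeedWarmer {
  constructor(private cache: FeedCache, private activity: ActivityService) {}

  // Runs on a schedule and pre-computes feeds for recently active users,
  // so their next request lands as a cache hit instead of a feed rebuild.
  async warmActiveUsers(): Promise<void> {
    const oneHourAgo = Date.now() - 60 * 60 * 1000;
    const userIds = await this.activity.getUsersActiveSince(oneHourAgo);

    // Warm in small batches so the warmer itself doesn't stampede the DB
    const BATCH = 100;
    for (let i = 0; i < userIds.length; i += BATCH) {
      await Promise.all(
        userIds.slice(i, i + BATCH).map(id => this.cache.getFeed(id))
      );
    }
  }
}

Since getFeed populates both cache tiers on a miss, warming is just a read.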

Smart Eviction:

  • LRU with a frequency bias (sketch below)
  • Celebrity content is never evicted
  • Predictive eviction based on user access patterns
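
A toy version of LRU with a frequency bias, plus a pin flag for celebrity content; the scoring weight is an illustrative assumption:

interface CacheEntry<V> {
  value: V;
  lastAccess: number;
  hits: number;
  pinned: boolean; // celebrity content: never evicted
}

class BiasedLruCache<V> {
  private entries = new Map<string, CacheEntry<V>>();

  // hitBonusMs: each hit "ages" an entry forward, so frequently read
  // entries outlive merely recent ones (pure LRU would use recency alone)
  constructor(private maxSize: number, private hitBonusMs = 10_000) {}

  get(key: string): V | undefined {
    const e = this.entries.get(key);
    if (!e) return undefined;
    e.lastAccess = Date.now();
    e.hits++;
    return e.value;
  }

  set(key: string, value: V, pinned = false): void {
    if (!this.entries.has(key) && this.entries.size >= this.maxSize) {
      this.evictOne();
    }
    this.entries.set(key, { value, lastAccess: Date.now(), hits: 1, pinned });
  }

  // Evict the unpinned entry with the lowest recency+frequency score
  private evictOne(): void {
    let victim: string | null = null;
    let worstScore = Infinity;
    for (const [key, e] of this.entries) {
      if (e.pinned) continue;
      const score = e.lastAccess + e.hits * this.hitBonusMs;
      if (score < worstScore) {
        worstScore = score;
        victim = key;
      }
    }
    if (victim !== null) this.entries.delete(victim);
  }
}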

📊 Cache Performance Impact

  • Response Time: 45ms P95 with caching (vs ~200ms without)
  • Database Load: 85% reduction in DB queries (7M → 1M QPS)
  • Cost Savings: $2.5M/month from reduced DB capacity

🌟 Celebrity User Optimization

🎯 The Celebrity Problem at Scale

Scenario: Cristiano Ronaldo (600M followers) posts a photo

Without Optimization:

  • 600M write operations to PrecomputedFeed
  • 600M notifications to send
  • ~10 minutes of 100% CPU across the cluster
  • $500 in immediate compute costs
  • A system-wide latency spike

Solution: Hybrid Pull + Heavy Caching

With Optimization:

  • 0 immediate writes (pull model)
  • Post cached in all regions
  • Push only to verified/VIP followers
  • Notifications batched over 30 minutes
  • <$5 in compute costs

Celebrity Detection & Routing:
class CelebrityOptimizer {
  // Celebrity threshold configuration
  private readonly CELEBRITY_THRESHOLD = 100_000;
  private readonly MEGA_CELEBRITY_THRESHOLD = 10_000_000;

  async handlePost(userId: string, post: Post) {
    const user = await this.userService.getUser(userId);

    if (user.followerCount >= this.MEGA_CELEBRITY_THRESHOLD) {
      return this.handleMegaCelebrityPost(user, post);
    } else if (user.followerCount >= this.CELEBRITY_THRESHOLD) {
      return this.handleCelebrityPost(user, post);
    } else {
      return this.handleRegularPost(user, post);
    }
  }

  async handleMegaCelebrityPost(user: User, post: Post) {
    // 1. Store the post (no fanout)
    await this.postService.create(post);

    // 2. Cache aggressively
    await Promise.all([
      this.cacheGlobally(post, 86400), // 24-hour TTL
      this.indexForSearch(post),
      this.addToTrendingPosts(post)
    ]);

    // 3. Push only to VIP followers (< 1000 feeds)
    const vipFollowers = await this.getVipFollowers(user.id);
    await this.pushToFeeds(vipFollowers, post);

    // 4. Batch-notify regular followers asynchronously
    await this.queueBatchNotifications(user.id, post.id);
  }

  async handleCelebrityPost(user: User, post: Post) {
    // Hybrid approach: push to followers active in the last 24h
    // (typically 10-20% of the total); everyone else pulls on demand
    const activeFollowers = await this.getActiveFollowers(user.id, 24);
    await this.pushToFeeds(activeFollowers, post);

    // Cache for the pull model
    await this.cacheRegionally(post, 3600); // 1-hour TTL
  }

  private async cacheGlobally(post: Post, ttl: number) {
    const regions = ['us-east', 'us-west', 'eu', 'asia', 'sa', 'africa'];

    await Promise.all(regions.map(region =>
      this.regionalCache[region].setex(
        `celebrity:post:${post.id}`,
        ttl,
        post
      )
    ));
  }
}

✅ Optimization Techniques

  • Dedicated Infrastructure: Celebrity content on separate shards
  • Pre-caching: Celebrity posts cached before viral
  • Rate Limiting: Throttle fanout to prevent overload (sketch after this list)
  • Prioritization: VIP followers get updates first
  • Lazy Loading: Regular users pull on-demand
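
A minimal token-bucket throttle around fanout, assuming a pushToFeeds writer like the one in CelebrityOptimizer; the rate and burst numbers are illustrative:

class FanoutThrottler {
  private tokens: number;
  private lastRefill = Date.now();

  constructor(
    private pushToFeeds: (ids: string[], post: Post) => Promise<void>,
    private ratePerSec = 10_000, // sustained feed writes/sec
    private burst = 50_000       // short-term burst allowance
  ) {
    this.tokens = burst;
  }

  private refill(): void {
    const now = Date.now();
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.burst, this.tokens + elapsedSec * this.ratePerSec);
    this.lastRefill = now;
  }

  // Fan out in chunks, pausing whenever the bucket runs dry, so a viral
  // post drains over minutes instead of spiking the whole cluster.
  async throttledFanout(followerIds: string[], post: Post): Promise<void> {
    const CHUNK = 1_000;
    for (let i = 0; i < followerIds.length; i += CHUNK) {
      this.refill();
      while (this.tokens < CHUNK) {
        await new Promise(resolve => setTimeout(resolve, 100));
        this.refill();
      }
      this.tokens -= CHUNK;
      await this.pushToFeeds(followerIds.slice(i, i + CHUNK), post);
    }
  }
}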

📊 Performance Impact

  • Write Reduction: 99.9% fewer writes for celebrities
  • Latency: No impact on regular users
  • Cost Savings: $5M/month in compute
  • Scalability: Can handle users with 1B+ followers
  • Reliability: No system-wide failures from viral posts

📊 Performance Monitoring & SLAs

  • Feed Load Time: 45ms P95 latency (target: <100ms)
  • Availability: 99.99% uptime SLA (≤4.38 min downtime/month)
  • Cache Hit Rate: 87% global average (target: >85%)
  • Error Rate: 0.001% request failures (target: <0.01%)

🔍 Monitoring Stack

Metrics Collection
  • Prometheus: Time-series metrics
  • Grafana: Real-time dashboards
  • CloudWatch: AWS resource monitoring
  • Custom Metrics: Business KPIs
Distributed Tracing
  • Jaeger: End-to-end request tracing
  • X-Ray: AWS service maps
  • OpenTelemetry: Standard instrumentation
  • Correlation IDs: Cross-service tracking
Alerting & Response
  • PagerDuty: On-call management
  • Severity Levels: P0-P3 classification
  • Auto-remediation: Self-healing systems
  • Runbooks: Automated response

💰 Cost Optimization at Scale

📊 Monthly Cost Breakdown (2B Users)

  • DynamoDB (32 shards): $1.2M
  • Redis cache clusters: $800K
  • S3 storage (456 PB): $9.5M
  • CDN (CloudFront): $2.5M
  • Compute (EC2/Lambda): $3.2M
  • Data transfer: $1.8M
  • ML/Analytics: $1.0M
  • Total monthly cost: $20M

Cost per DAU: $0.01/month ($20M ÷ 2B users)

💡 Optimization Strategies

Storage Optimization:

  • S3 Intelligent-Tiering: save 40% on cold data
  • Media compression: 30% size reduction
  • Deduplication: 15% storage savings

Compute Optimization:

  • Spot instances for batch jobs: 70% savings
  • Reserved instances: 40% discount
  • Auto-scaling: right-size capacity

Database Optimization:

  • On-demand → provisioned capacity: 50% savings
  • TTL for old data: reduce storage 30%
  • Query optimization: reduce read costs 25%

Potential Savings: $6M/month (30%)

🎯 Key Scale Refinements

✅ Database Sharding

32 shards handle 2B users with room for growth to 64 shards

✅ Global Distribution

6 regions provide <50ms latency for 95% of users worldwide

✅ Multi-Tier Caching

87% cache hit rate reduces database load by 85%

✅ Celebrity Optimization

Pull model for celebrities prevents system overload

✅ Performance Monitoring

Real-time dashboards ensure 99.99% availability SLA

✅ Cost Optimization

$0.01 per DAU/month through intelligent resource management

🔮 Coming Up Next

With our architecture optimized for scale, we'll address edge cases and failure scenarios:

  • Network partitions - CAP theorem trade-offs and eventual consistency
  • Cascading failures - Circuit breakers and graceful degradation
  • Data corruption - Recovery strategies and data integrity checks
  • Security breaches - Privacy controls and data protection
  • Disaster recovery - Multi-region failover and backup strategies