Step 3: Core Architecture

Step 3 of 6: C - Core High-Level Design

Build the fundamental microservices architecture to satisfy functional requirements

🏗️ Facebook Feed System Architecture

Facebook Feed - Microservices Architecture (diagram)

Client Applications: Mobile App, Web Browser, Third Party
CDN (CloudFront): Media Delivery
API Gateway (Load Balancer)
Microservices:
  • Post Service: Create/Edit Posts, Media Upload, Content Validation
  • Feed Service: Timeline Generation, Push/Pull Logic, Feed Aggregation
  • Follow Service: Follow/Unfollow, Relationship Mgmt, Social Graph
  • Ranking Service: ML Model Scoring, Content Relevance, Personalization
  • Notification Service: WebSocket, Push Notifications, Real-time Updates
Message Queue (Kafka)
Data Layer:
  • DynamoDB: Posts (500M/day), Follow (+ GSI), Feed (Precomputed)
  • Redis Cluster: Hot Feeds & Sessions
  • S3 Storage: Media Files
  • Elasticsearch: Content Search
  • ClickHouse: Analytics

📱 Client Layer

Mobile, web, and third-party applications connect through a unified API Gateway for authentication and routing.

⚙️ Service Layer

Independent microservices handle specific domains: posts, feeds, follows, ranking, and notifications.

💾 Data Layer

Multi-database approach: DynamoDB for core data, Redis for caching, S3 for media, Elasticsearch for search.

🎯 The Heart of Social Media: Push vs Pull Feed Models

🤔 The Core Dilemma

When Alice posts a photo, how do we get it into Bob's feed? This is the fundamental challenge that shapes the entire architecture. Do we push the post to all followers' feeds immediately, or wait for Bob to pull posts when he opens the app?

The Scale Problem:

With 2 billion users, 500M posts/day, and some users having 100M+ followers, the wrong choice can bring down the entire system.
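To make the scale concrete, here is a back-of-the-envelope calculation. The 500M posts/day figure comes from the text above; the average follower count of 200 is purely an illustrative assumption, not a number from this design.

```javascript
// Back-of-the-envelope fanout estimate.
// postsPerDay comes from the text; avgFollowers is an ASSUMED value.
const SECONDS_PER_DAY = 24 * 60 * 60;

function estimateFanout(postsPerDay, avgFollowers) {
  const postsPerSecond = postsPerDay / SECONDS_PER_DAY;
  return {
    postsPerSecond: Math.round(postsPerSecond),
    // Under pure push, every post becomes one feed write per follower.
    feedWritesPerSecond: Math.round(postsPerSecond * avgFollowers),
  };
}

const estimate = estimateFanout(500_000_000, 200);
console.log(estimate);
```

Under these assumptions the system sees roughly 5,800 posts per second and about 1.16 million feed writes per second if every post is pushed. A single post from an account with 100M followers would by itself require 100M feed writes under pure push.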

📤 Push Model (Write-Heavy)

How it Works:
  1. Alice posts: "Beautiful sunset today! 🌅"
  2. Query followers: GSI lookup → [bob, carol, dave, ...]
  3. Fanout writes: Insert into each follower's PrecomputedFeed table
  4. Notification: Send real-time updates via WebSocket
Example Data Flow:
// 1. Alice posts (postId: post_123)
POST /api/posts

// 2. Feed Service queries followers
GSI Query: followingId = "alice"
→ Result: [bob, carol, dave]

// 3. Fanout to PrecomputedFeed table
Write to:
- userId: bob,   postId: post_123, score: 0.95
- userId: carol, postId: post_123, score: 0.92
- userId: dave,  postId: post_123, score: 0.88

// 4. Bob opens app → Fast read from his feed
Query: userId = "bob" → Gets pre-computed posts
✅ Advantages
  • Fast reads (<50ms)
  • Predictable latency
  • Pre-ranked content
  • Offline feed available
❌ Disadvantages
  • Write amplification
  • Expensive storage
  • Celebrity problem
  • Stale feed data
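The push flow can be sketched with in-memory Maps standing in for the data stores. This is a minimal illustration, not the production implementation: `followersOf` and `precomputedFeed` are hypothetical stand-ins for the Follow GSI and the PrecomputedFeed table, and ranking scores are left out.

```javascript
// Fan-out-on-write sketch using in-memory Maps in place of DynamoDB.
// followersOf and precomputedFeed are hypothetical stand-ins.
const followersOf = new Map([["alice", ["bob", "carol", "dave"]]]);
const precomputedFeed = new Map(); // userId -> array of { postId, createdAt }

function fanoutOnWrite(authorId, postId, createdAt) {
  const followers = followersOf.get(authorId) ?? [];
  for (const followerId of followers) {
    const feed = precomputedFeed.get(followerId) ?? [];
    feed.unshift({ postId, createdAt }); // newest first
    precomputedFeed.set(followerId, feed);
  }
  return followers.length; // one write per follower: the write amplification
}

function readFeed(userId, limit = 20) {
  // A read is a single lookup; nothing is aggregated at request time.
  return (precomputedFeed.get(userId) ?? []).slice(0, limit);
}

const writes = fanoutOnWrite("alice", "post_123", 1705314600000);
console.log(writes, readFeed("bob"));
```

The return value makes the trade-off visible: the cost of one post grows linearly with the author's follower count, while each read stays a single lookup.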

📥 Pull Model (Read-Heavy)

How it Works:
  1. Bob opens app: Requests his timeline
  2. Query following: Primary table lookup → [alice, carol, eve]
  3. Gather posts: Get recent posts from each user he follows
  4. Merge & rank: Combine and sort by relevance algorithm
Example Data Flow:
// 1. Bob opens feed
GET /api/feed/timeline

// 2. Feed Service queries who Bob follows
Primary Query: followerId = "bob"
→ Result: [alice, carol, eve]

// 3. Get recent posts from each
Parallel queries:
- Posts by alice (last 24hrs)
- Posts by carol (last 24hrs)
- Posts by eve (last 24hrs)

// 4. Merge, rank, and return
Ranking Service scores posts → Final feed
✅ Advantages
  • No write amplification
  • Fresh content
  • Handles celebrities
  • Lower storage costs
❌ Disadvantages
  • Slow reads (200-500ms)
  • Complex aggregation
  • High CPU usage
  • No offline feeds
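For comparison, the pull flow can be sketched the same way. This is a simplified illustration: `followingOf` and `postsBy` are hypothetical in-memory stand-ins for the Follow and Posts tables, and sorting by recency stands in for the Ranking Service.

```javascript
// Fan-out-on-read sketch: gather recent posts from followed users, merge, sort.
// followingOf and postsBy are hypothetical in-memory stand-ins.
const followingOf = new Map([["bob", ["alice", "carol", "eve"]]]);
const postsBy = new Map([
  ["alice", [{ postId: "a1", createdAt: 300 }, { postId: "a0", createdAt: 10 }]],
  ["carol", [{ postId: "c1", createdAt: 200 }]],
  ["eve", []],
]);

function pullFeed(userId, sinceTimestamp, limit = 20) {
  const following = followingOf.get(userId) ?? [];
  // One lookup per followed account: read cost grows with the following count.
  const candidates = following.flatMap((authorId) =>
    (postsBy.get(authorId) ?? []).filter((p) => p.createdAt >= sinceTimestamp)
  );
  // Recency stands in for the Ranking Service here.
  candidates.sort((a, b) => b.createdAt - a.createdAt);
  return candidates.slice(0, limit);
}

const bobFeed = pullFeed("bob", 100);
console.log(bobFeed.map((p) => p.postId));
```

Nothing is written when a post is created, but every feed request pays for one lookup per followed account plus the merge.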

🎯 Hybrid Strategy: Best of Both Worlds

📊 User Classification Strategy

Regular Users (<1K followers)

  • Use Push Model
  • Pre-compute feeds for all followers
  • Fast reads, acceptable write cost

Influencers (1K - 100K followers)

  • Smart Push - Active users only
  • Push to users active in last 24h
  • Pull for inactive users

Celebrities (>100K followers)

  • Use Pull Model
  • Heavy caching of their posts
  • On-demand feed generation
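The Smart Push tier depends on splitting an account's followers by recent activity. Below is a minimal sketch, assuming each follower record carries a `lastActiveAt` timestamp. That field is hypothetical; a real system would likely read activity from a presence or session store.

```javascript
// Smart Push sketch: split followers by recent activity.
// lastActiveAt is a hypothetical field used for illustration.
const ACTIVE_WINDOW_MS = 24 * 60 * 60 * 1000; // "active in last 24h"

function partitionFollowers(followers, now) {
  const pushTargets = [];
  const pullOnly = [];
  for (const follower of followers) {
    const isActive = now - follower.lastActiveAt <= ACTIVE_WINDOW_MS;
    (isActive ? pushTargets : pullOnly).push(follower.id);
  }
  return { pushTargets, pullOnly };
}

const now = Date.UTC(2024, 0, 15, 10, 30);
const result = partitionFollowers(
  [
    { id: "bob", lastActiveAt: now - 2 * 60 * 60 * 1000 }, // 2 hours ago
    { id: "carol", lastActiveAt: now - 3 * 24 * 60 * 60 * 1000 }, // 3 days ago
  ],
  now
);
console.log(result);
```

Only `pushTargets` receive a precomputed feed entry; accounts in `pullOnly` get the post merged in on demand when they next open the app.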
⚡ Smart Optimizations

Cache Layer Strategy

  • Hot feeds in Redis (active users)
  • Celebrity posts cached for 24h
  • Popular content pre-warmed

Async Processing

  • Kafka for fanout jobs
  • Background feed pre-computation
  • Batch processing for efficiency
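One way to batch fanout work is to split a follower list into fixed-size jobs, each of which could then be published as a single queue message. The batch size of 500 and the job shape below are assumptions for illustration.

```javascript
// Split a follower list into fixed-size fanout jobs.
// The batch size and the job shape are assumptions.
function buildFanoutJobs(postId, followerIds, batchSize = 500) {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const jobs = [];
  for (let i = 0; i < followerIds.length; i += batchSize) {
    jobs.push({ postId, followerIds: followerIds.slice(i, i + batchSize) });
  }
  return jobs;
}

const followerIds = Array.from({ length: 1200 }, (_, i) => `user_${i}`);
const jobs = buildFanoutJobs("post_123", followerIds);
console.log(jobs.map((job) => job.followerIds.length));
```

Smaller jobs spread the work across more consumers and make a failed job cheaper to retry, at the cost of more messages on the queue.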

ML-Based Ranking

  • Real-time scoring via Ranking Service
  • Personalized relevance models
  • A/B testing for algorithms
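A/B testing of ranking algorithms needs a stable way to assign each user to a variant. The sketch below hashes the user ID together with an experiment name; the experiment name, the 50/50 split, and the choice of an FNV-1a hash over UTF-16 code units are all illustrative assumptions.

```javascript
// FNV-1a 32-bit hash over UTF-16 code units: simple, deterministic,
// and non-cryptographic.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Deterministically assign a user to a variant of a (hypothetical) experiment.
function assignVariant(userId, experiment, treatmentPercent = 50) {
  const bucket = fnv1a(`${experiment}:${userId}`) % 100; // 0..99
  return bucket < treatmentPercent ? "treatment" : "control";
}

console.log(assignVariant("bob", "ranking_v2"));
```

Including the experiment name in the hash input means the same user can land in different buckets across experiments, while always seeing the same variant within one.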
🎯 Production Decision Logic:
// Feed Generation Strategy Selection
const PUSH_MODEL = 'push';
const SMART_PUSH = 'smart_push';
const PULL_MODEL = 'pull';

function decideFeedStrategy(user) {
  if (user.followerCount < 1000) {
    return PUSH_MODEL;  // Fanout to all followers
  } else if (user.followerCount < 100000) {
    return SMART_PUSH;  // Active followers only
  } else {
    return PULL_MODEL;  // Heavy caching
  }
}

// Feed Retrieval Strategy
async function getFeed(userId) {
  // Serve the cached feed when one exists
  const cachedFeed = await redis.get(`feed:${userId}`);
  if (cachedFeed) return cachedFeed;

  // Cache miss: combine pulled celebrity posts with the precomputed feed
  const followingList = await getFollowing(userId);
  const celebrityPosts = await getCelebrityPosts(followingList);
  const regularPosts = await getPrecomputedFeed(userId);

  return rankingService.merge(celebrityPosts, regularPosts);
}

🧩 Microservices Deep Dive

📝 Post Service

🎯 Core Responsibilities
  • Post Creation: Handle text, image, and video posts
  • Media Processing: Resize images, compress videos
  • Content Validation: Spam detection, policy enforcement
  • Privacy Controls: Public/private post settings
  • Edit/Delete: Post modification and removal
🔄 Workflow Example
POST /api/posts
{
  "content": "Amazing day at the beach!",
  "mediaUrls": ["beach.jpg"],
  "privacy": "public"
}

// Post Service Flow:
1. Validate content & user permissions
2. Generate unique postId
3. Upload media to S3 → get URLs
4. Store post in DynamoDB Posts table
5. Publish "PostCreated" event to Kafka
6. Return success response

// Kafka Event Published:
{
  "eventType": "PostCreated",
  "postId": "post_123",
  "userId": "alice",
  "timestamp": "2024-01-15T10:30:00Z"
}
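Steps 1, 2, 4, and 5 of the flow above can be sketched as a single function with injected dependencies. This is only an illustration: `store` and `publish` are hypothetical stand-ins for the DynamoDB write and the Kafka producer, validation is reduced to a non-empty content check, the "public" default is an assumption, and the media upload step is omitted.

```javascript
const { randomUUID } = require("crypto");

// Sketch of validate -> generate ID -> persist -> publish.
// `store` and `publish` are hypothetical stand-ins for DynamoDB and Kafka.
function createPost(request, userId, { store, publish, now = () => new Date() }) {
  if (typeof request.content !== "string" || request.content.trim() === "") {
    throw new Error("content must be a non-empty string");
  }
  const post = {
    postId: `post_${randomUUID()}`,
    userId,
    content: request.content,
    mediaUrls: request.mediaUrls ?? [],
    privacy: request.privacy ?? "public", // assumed default
    createdAt: now().toISOString(),
  };
  store(post); // persist before announcing the post
  publish({
    eventType: "PostCreated",
    postId: post.postId,
    userId,
    timestamp: post.createdAt,
  });
  return post;
}

const saved = [];
const events = [];
const post = createPost(
  { content: "Amazing day at the beach!", mediaUrls: ["beach.jpg"], privacy: "public" },
  "alice",
  { store: (p) => saved.push(p), publish: (e) => events.push(e) }
);
console.log(post.postId, events[0].eventType);
```

Persisting before publishing means a consumer that reacts to `PostCreated` should find the post already stored when it looks it up.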

📰 Feed Service (The Brain)

🧠 Core Responsibilities
  • Timeline Generation: Create personalized feeds
  • Fanout Logic: Push posts to followers' feeds
  • Feed Aggregation: Pull and merge posts on-demand
  • Cache Management: Hot feed caching in Redis
  • Real-time Updates: Live feed refresh
⚙️ Event Processing
// Kafka Consumer: PostCreated Event
// Simplified to two tiers; the Smart Push tier (1K-100K followers) is not shown here.
async function onPostCreated(event) {
  const { postId, userId } = event;

  // The event carries only IDs, so load the author and the post first
  // (getPost is an assumed lookup helper)
  const user = await getUserProfile(userId);
  const post = await getPost(postId);

  if (user.followerCount < 1000) {
    // Push Model: Fanout to all followers
    const followers = await getFollowers(userId);
    for (const follower of followers) {
      await precomputedFeedTable.put({
        userId: follower.id,
        postId: postId,
        createdAt: Date.now(),
        score: calculateScore(post, follower)
      });
    }
  } else {
    // Pull Model: Cache post for on-demand retrieval (TTL of 3600 seconds)
    await redis.set(`post:${postId}`, post, 3600);
  }
}

👥 Follow Service

🔗 Core Responsibilities
  • Follow/Unfollow: Manage user relationships
  • Social Graph: Query followers and following lists
  • Relationship Status: Check follow status between users
  • Privacy Controls: Handle private accounts and blocks
  • Batch Operations: Mass follow/unfollow processing
📊 DynamoDB Integration
// Follow Operation
POST /api/users/bob/follow

// Follow Service Implementation:
async function followUser(followerId, followingId) {
  // 1. Write to Follow table
  await followTable.put({
    followerId: followerId,    // PK
    followingId: followingId,  // SK
    status: 'active',
    createdAt: Date.now()
  });

  // 2. Update follower counts
  await updateUserStats(followingId, 'followers', +1);
  await updateUserStats(followerId, 'following', +1);

  // 3. Publish event for feed updates
  await kafka.publish('UserFollowed', {
    followerId, followingId, timestamp: Date.now()
  });
}
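The Follow table is keyed by `followerId`, which answers "who does Bob follow?". The push model also needs the reverse lookup ("who follows Alice?"), which is what the GSI on `followingId` provides. An in-memory analogue with two indexes illustrates the idea; the class and method names are hypothetical.

```javascript
// Return the Set stored under `key`, creating it on first use.
function setFor(index, key) {
  if (!index.has(key)) index.set(key, new Set());
  return index.get(key);
}

// In-memory analogue of the Follow table plus its GSI.
class FollowGraph {
  constructor() {
    this.byFollower = new Map();  // followerId -> Set of followingIds (primary key)
    this.byFollowing = new Map(); // followingId -> Set of followerIds (GSI analogue)
  }

  follow(followerId, followingId) {
    const following = setFor(this.byFollower, followerId);
    if (following.has(followingId)) return false; // already following
    following.add(followingId);
    setFor(this.byFollowing, followingId).add(followerId);
    return true;
  }

  unfollow(followerId, followingId) {
    const removed = this.byFollower.get(followerId)?.delete(followingId) ?? false;
    if (removed) this.byFollowing.get(followingId).delete(followerId);
    return removed;
  }

  getFollowing(userId) {
    return [...(this.byFollower.get(userId) ?? [])];
  }

  getFollowers(userId) {
    return [...(this.byFollowing.get(userId) ?? [])];
  }
}

const graph = new FollowGraph();
graph.follow("bob", "alice");
graph.follow("carol", "alice");
console.log(graph.getFollowers("alice"), graph.getFollowing("bob"));
```

Both indexes are updated on every follow and unfollow, so either direction can be read with a single lookup.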

🎯 Ranking Service (ML-Powered)

🤖 Core Responsibilities
  • Content Scoring: ML model-based post relevance
  • Personalization: User preference learning
  • Engagement Prediction: Likelihood to interact
  • Content Freshness: Time-decay scoring
  • A/B Testing: Algorithm experimentation
📊 Scoring Algorithm
// Ranking Score Calculation
function calculateScore(post, user) {
  const features = {
    // Content features
    postType: post.type,
    hasMedia: post.mediaUrls.length > 0,
    contentLength: post.content.length,

    // Social features
    authorFollowers: post.author.followerCount,
    userFollowsAuthor: user.following.includes(post.authorId),
    mutualFriends: getMutualFriends(user.id, post.authorId),

    // Engagement features
    likeCount: post.likeCount,
    commentCount: post.commentCount,
    shareCount: post.shareCount,

    // Temporal features
    timeSincePost: Date.now() - post.createdAt,
    timeOfDay: new Date().getHours()
  };

  // ML Model Inference
  return mlModel.predict(features);
}
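The responsibilities above list time-decay scoring for content freshness, while the feature set passes the raw post age to the model. One common way to express freshness explicitly is exponential decay with a half-life. The sketch below is an illustration only; the 6-hour half-life is an assumed value, not one from this design.

```javascript
const MS_PER_HOUR = 60 * 60 * 1000;

// Exponential time decay: the multiplier halves every `halfLifeHours`.
// The default half-life is an ASSUMED value.
function timeDecay(ageMs, halfLifeHours = 6) {
  const ageHours = Math.max(0, ageMs) / MS_PER_HOUR;
  return Math.pow(0.5, ageHours / halfLifeHours);
}

// Scale a base relevance score by the post's freshness.
function freshnessAdjustedScore(baseScore, createdAt, now, halfLifeHours = 6) {
  return baseScore * timeDecay(now - createdAt, halfLifeHours);
}

const now = Date.UTC(2024, 0, 15, 16, 30);
const sixHoursAgo = now - 6 * MS_PER_HOUR;
console.log(freshnessAdjustedScore(0.9, sixHoursAgo, now));
```

With a 6-hour half-life, a post with a base score of 0.9 counts as 0.45 after six hours and roughly 0.225 after twelve, so older posts need stronger engagement signals to stay near the top.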

🔔 Notification Service

📡 Core Responsibilities
  • WebSocket Management: Real-time connections
  • Push Notifications: Mobile APNs/FCM delivery
  • Event Broadcasting: Live feed updates
  • Presence Tracking: Online/offline status
  • Notification Preferences: User-specific settings
⚡ Real-time Updates
// WebSocket Event Broadcasting
class NotificationService {
  // When post is created
  async onPostCreated(event) {
    // The PostCreated event identifies the author as userId
    const { postId, userId: authorId } = event;

    // Load the post and author for the notification text
    // (getPost is an assumed lookup helper)
    const post = await getPost(postId);
    const author = await getUserProfile(authorId);

    // Get online followers
    const onlineFollowers = await getOnlineFollowers(authorId);

    // Send real-time update
    const message = {
      type: 'NEW_POST',
      postId: postId,
      author: authorId,
      timestamp: Date.now()
    };

    // Broadcast via WebSocket
    onlineFollowers.forEach(follower => {
      this.webSocketManager.send(follower.connectionId, message);
    });

    // Send push notification to offline users
    const offlineFollowers = await getOfflineFollowers(authorId);
    await this.pushNotificationService.send(offlineFollowers, {
      title: `New post from ${author.name}`,
      body: post.content.substring(0, 100)
    });
  }
}
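The code above relies on `getOnlineFollowers` and `getOfflineFollowers`, which the text does not define. One possible shape for the presence tracking behind them is a registry of open connections; the class and method names below are hypothetical.

```javascript
// Presence registry sketch: userId -> connectionId for users with an open socket.
class PresenceRegistry {
  constructor() {
    this.connections = new Map();
  }

  connect(userId, connectionId) {
    this.connections.set(userId, connectionId);
  }

  disconnect(userId) {
    this.connections.delete(userId);
  }

  // Split followers into WebSocket targets and push-notification targets.
  route(followerIds) {
    const online = [];
    const offline = [];
    for (const userId of followerIds) {
      const connectionId = this.connections.get(userId);
      if (connectionId !== undefined) {
        online.push({ userId, connectionId });
      } else {
        offline.push(userId);
      }
    }
    return { online, offline };
  }
}

const presence = new PresenceRegistry();
presence.connect("bob", "conn_1");
const routed = presence.route(["bob", "carol"]);
console.log(routed);
```

A single pass over the follower list yields both groups, which avoids querying the follower list twice as the class above does.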

🔄 Complete Data Flow Examples

📝 Post Creation Flow

Step-by-Step Process:
  1. Client Request: Alice uploads "Beautiful sunset! 🌅" with photo
     POST /api/posts → API Gateway → Post Service
  2. Media Processing: Upload image to S3, generate thumbnails
     Post Service → S3 → CDN integration
  3. Data Persistence: Store post metadata in DynamoDB
     Post Service → DynamoDB Posts table
  4. Event Publishing: Trigger fanout process
     Post Service → Kafka "PostCreated" event
  5. Feed Fanout: Update followers' feeds based on strategy
     Feed Service consumes Kafka event
  6. Real-time Notifications: Notify online followers
     Notification Service → WebSocket/Push notifications

Performance Metrics:
  • Post Creation: ~150ms (including S3 upload)
  • Fanout Processing: ~2-5 seconds (async via Kafka)
  • Real-time Delivery: ~100ms (WebSocket notification)

📱 Feed Retrieval Flow

Step-by-Step Process:
  1. Client Request: Bob opens his feed
     GET /api/feed/timeline → API Gateway → Feed Service
  2. Cache Check: Look for pre-computed feed in Redis
     Feed Service → Redis cache lookup
  3. Feed Generation: If cache miss, generate feed using hybrid strategy
     Query PrecomputedFeed + Pull celebrity posts
  4. Content Ranking: Score and sort posts by relevance
     Feed Service → Ranking Service ML models
  5. Response Assembly: Enrich posts with user data and media URLs
     Parallel queries to User Service, CDN URLs
  6. Cache Update: Store generated feed for future requests
     Feed Service → Redis cache (15min TTL)
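Steps 2, 3, and 6 together form a cache-aside pattern. The sketch below uses a small in-memory TTL cache in place of Redis so that expiry is easy to follow; `TtlCache`, `getFeedCached`, and the injected clock are hypothetical names introduced for illustration.

```javascript
const FEED_TTL_MS = 15 * 60 * 1000; // 15-minute TTL from step 6

// Minimal TTL cache with an injectable clock, standing in for Redis.
class TtlCache {
  constructor(now = () => Date.now()) {
    this.now = now;
    this.entries = new Map();
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: treat as a miss
      return undefined;
    }
    return entry.value;
  }

  set(key, value, ttlMs) {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }
}

// Cache-aside: check the cache, generate on a miss, then store the result.
function getFeedCached(cache, userId, generateFeed) {
  const key = `feed:${userId}`;
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const feed = generateFeed(userId);
  cache.set(key, feed, FEED_TTL_MS);
  return feed;
}

const cache = new TtlCache();
const feed = getFeedCached(cache, "bob", () => ["post_123"]);
console.log(feed);
```

Within the TTL window repeated requests skip feed generation entirely; the trade-off is that a cached feed can be up to 15 minutes stale unless it is invalidated when new posts arrive.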

Performance Scenarios:
  • Cache Hit (90% of requests): ~50ms (direct Redis retrieval)
  • Cache Miss (10% of requests): ~200ms (full feed generation + ranking)
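Combining the two scenarios gives the expected (mean) read latency as a weighted average. The calculation below uses only the figures stated above.

```javascript
// Expected latency = sum of (share of requests x latency) over all scenarios.
function expectedLatencyMs(scenarios) {
  return scenarios.reduce((sum, s) => sum + s.share * s.latencyMs, 0);
}

const meanLatency = expectedLatencyMs([
  { share: 0.9, latencyMs: 50 },  // cache hit
  { share: 0.1, latencyMs: 200 }, // cache miss
]);
console.log(meanLatency.toFixed(0));
```

The mean comes to about 65ms. Tail latency is still set by the miss path, so raising the hit rate or speeding up feed generation matters more for slow requests than for the average.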

🎯 Key Architecture Decisions

✅ Microservices Design

Domain-driven service boundaries enable independent scaling and team ownership

✅ Hybrid Feed Strategy

User-based routing between push and pull models optimizes for both performance and cost

✅ Event-Driven Architecture

Kafka enables loose coupling and async processing of high-volume social interactions

✅ Multi-Database Strategy

Right database for each use case: DynamoDB for scale, Redis for speed, S3 for media

✅ Real-time Layer

WebSocket connections and push notifications provide instant social interactions

✅ ML-Powered Ranking

Dedicated ranking service enables sophisticated personalization algorithms

🔮 Coming Up Next

Our core architecture handles the functional requirements well. Next, we'll optimize for massive scale:

  • Database sharding - Horizontal scaling across multiple DynamoDB tables
  • Global distribution - Multi-region deployment and data replication
  • Caching strategies - Multi-tier caching for sub-100ms responses
  • Celebrity optimizations - Special handling for high-follower accounts
  • Performance monitoring - SLAs, alerting, and cost optimization