Step 3: Core Architecture
Step 3 of 6: C - Core High-Level Design
Build the fundamental microservices architecture to satisfy functional requirements
Facebook Feed System Architecture
Client Layer
Mobile, web, and third-party applications connect through a unified API Gateway for authentication and routing.
Service Layer
Independent microservices handle specific domains: posts, feeds, follows, ranking, and notifications.
Data Layer
Multi-database approach: DynamoDB for core data, Redis for caching, S3 for media, Elasticsearch for search.
The Heart of Social Media: Push vs Pull Feed Models
The Core Dilemma
When Alice posts a photo, how do we get it into Bob's feed? This is the fundamental challenge that shapes the entire architecture. Do we push the post to all followers' feeds immediately, or wait for Bob to pull posts when he opens the app?
The Scale Problem:
With 2 billion users, 500M posts/day, and some users having 100M+ followers, the wrong choice can bring down the entire system.
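To make the scale concrete, here is a rough back-of-the-envelope sketch of what a pure push model would cost. The 500M posts/day and 100M-follower figures come from the text above; the average of 200 followers per author is purely an assumption chosen for illustration, not a number from this design.

```javascript
// Back-of-the-envelope fanout cost for a pure push model.
const POSTS_PER_DAY = 500_000_000;       // from the text above
const SECONDS_PER_DAY = 24 * 60 * 60;    // 86,400
const ASSUMED_AVG_FOLLOWERS = 200;       // ASSUMPTION: illustrative value only
const CELEBRITY_FOLLOWERS = 100_000_000; // "100M+ followers" from the text above

// Under push, every post is written once per follower.
function fanoutWritesPerSecond(postsPerDay, avgFollowers) {
  return (postsPerDay * avgFollowers) / SECONDS_PER_DAY;
}

const avgWrites = fanoutWritesPerSecond(POSTS_PER_DAY, ASSUMED_AVG_FOLLOWERS);
console.log(`Average fanout writes/sec: ${Math.round(avgWrites).toLocaleString("en-US")}`);
console.log(`Writes for one celebrity post: ${CELEBRITY_FOLLOWERS.toLocaleString("en-US")}`);
```

Under these assumptions the average load is roughly 1.16 million feed writes per second, and a single celebrity post triggers 100 million writes on its own, which is why the choice of model matters so much.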
Push Model (Write-Heavy)
How it Works:
- Alice posts: "Beautiful sunset today!"
- Query followers: GSI lookup → [bob, carol, dave, ...]
- Fanout writes: Insert into each follower's PrecomputedFeed table
- Notification: Send real-time updates via WebSocket
Example Data Flow:
// 1. Alice posts (postId: post_123)
POST /api/posts
// 2. Feed Service queries followers
GSI Query: followingId = "alice" → Result: [bob, carol, dave]
// 3. Fanout to PrecomputedFeed table
Write to:
- userId: bob, postId: post_123, score: 0.95
- userId: carol, postId: post_123, score: 0.92
- userId: dave, postId: post_123, score: 0.88
// 4. Bob opens app → Fast read from his feed
Query: userId = "bob" → Gets pre-computed posts
Advantages
- Fast reads (<50ms)
- Predictable latency
- Pre-ranked content
- Offline feed available
Disadvantages
- Write amplification
- Storage expensive
- Celebrity problem
- Stale feed data
Pull Model (Read-Heavy)
How it Works:
- Bob opens app: Requests his timeline
- Query following: Primary table lookup → [alice, carol, eve]
- Gather posts: Get recent posts from each user he follows
- Merge & rank: Combine and sort by relevance algorithm
Example Data Flow:
// 1. Bob opens feed
GET /api/feed/timeline
// 2. Feed Service queries who Bob follows
Primary Query: followerId = "bob" → Result: [alice, carol, eve]
// 3. Get recent posts from each
Parallel queries:
- Posts by alice (last 24hrs)
- Posts by carol (last 24hrs)
- Posts by eve (last 24hrs)
// 4. Merge, rank, and return
Ranking Service scores posts → Final feed
Advantages
- No write amplification
- Fresh content
- Handles celebrities
- Lower storage costs
Disadvantages
- Slow reads (200-500ms)
- Complex aggregation
- High CPU usage
- No offline feeds
Hybrid Strategy: Best of Both Worlds
User Classification Strategy
Regular Users (<1K followers)
- Use Push Model
- Pre-compute feeds for all followers
- Fast reads, acceptable write cost
Influencers (1K - 100K followers)
- Smart Push - Active users only
- Push to users active in last 24h
- Pull for inactive users
Celebrities (100K+ followers)
- Use Pull Model
- Heavy caching of their posts
- On-demand feed generation
Smart Optimizations
Cache Layer Strategy
- Hot feeds in Redis (active users)
- Celebrity posts cached for 24h
- Popular content pre-warmed
Async Processing
- Kafka for fanout jobs
- Background feed pre-computation
- Batch processing for efficiency
ML-Based Ranking
- Real-time scoring via Ranking Service
- Personalized relevance models
- A/B testing for algorithms
Production Decision Logic:
// Feed Generation Strategy Selection
const PUSH_MODEL = "PUSH_MODEL";
const SMART_PUSH = "SMART_PUSH";
const PULL_MODEL = "PULL_MODEL";

function decideFeedStrategy(user) {
  if (user.followerCount < 1000) {
    return PUSH_MODEL;
  } else if (user.followerCount < 100000) {
    return SMART_PUSH; // Active followers only
  } else {
    return PULL_MODEL; // Heavy caching
  }
}

// Feed Retrieval Strategy
async function getFeed(userId) {
  // Serve the cached feed when one exists
  const cachedFeed = await redis.get(`feed:${userId}`);
  if (cachedFeed) return cachedFeed;
  // Cache miss: merge pulled celebrity posts with the precomputed (pushed) feed
  const followingList = await getFollowing(userId);
  const celebrityPosts = await getCelebrityPosts(followingList);
  const regularPosts = await getPrecomputedFeed(userId);
  return rankingService.merge(celebrityPosts, regularPosts);
}
Microservices Deep Dive
Post Service
Core Responsibilities
- Post Creation: Handle text, image, and video posts
- Media Processing: Resize images, compress videos
- Content Validation: Spam detection, policy enforcement
- Privacy Controls: Public/private post settings
- Edit/Delete: Post modification and removal
Workflow Example
POST /api/posts
{
  "content": "Amazing day at the beach!",
  "mediaUrls": ["beach.jpg"],
  "privacy": "public"
}
// Post Service Flow:
1. Validate content & user permissions
2. Generate unique postId
3. Upload media to S3 → get URLs
4. Store post in DynamoDB Posts table
5. Publish "PostCreated" event to Kafka
6. Return success response
// Kafka Event Published:
{
  "eventType": "PostCreated",
  "postId": "post_123",
  "userId": "alice",
  "timestamp": "2024-01-15T10:30:00Z"
}
Feed Service (The Brain)
Core Responsibilities
- Timeline Generation: Create personalized feeds
- Fanout Logic: Push posts to followers' feeds
- Feed Aggregation: Pull and merge posts on-demand
- Cache Management: Hot feed caching in Redis
- Real-time Updates: Live feed refresh
Event Processing
// Kafka Consumer: PostCreated Event
async function onPostCreated(event) {
  const { postId, userId } = event;
  // Load the post body; getPost is a hypothetical helper that reads the Posts table
  const post = await getPost(postId);
  // Determine strategy based on user type
  const user = await getUserProfile(userId);
  if (user.followerCount < 1000) {
    // Push Model: Fanout to all followers
    const followers = await getFollowers(userId);
    for (const follower of followers) {
      await precomputedFeedTable.put({
        userId: follower.id,
        postId: postId,
        createdAt: Date.now(),
        score: calculateScore(post, follower)
      });
    }
  } else {
    // Pull Model: Cache post for on-demand retrieval (1-hour TTL, in seconds)
    // Assumes a node-redis v4 style client, where the expiry is passed as { EX: seconds }
    await redis.set(`post:${postId}`, JSON.stringify(post), { EX: 3600 });
  }
}
Follow Service
Core Responsibilities
- Follow/Unfollow: Manage user relationships
- Social Graph: Query followers and following lists
- Relationship Status: Check follow status between users
- Privacy Controls: Handle private accounts and blocks
- Batch Operations: Mass follow/unfollow processing
DynamoDB Integration
// Follow Operation
POST /api/users/bob/follow
// Follow Service Implementation:
async function followUser(followerId, followingId) {
  // 1. Write to Follow table
  await followTable.put({
    followerId: followerId, // PK
    followingId: followingId, // SK
    status: 'active',
    createdAt: Date.now()
  });
  // 2. Update follower and following counts concurrently
  await Promise.all([
    updateUserStats(followingId, 'followers', +1),
    updateUserStats(followerId, 'following', +1)
  ]);
  // 3. Publish event for feed updates
  await kafka.publish('UserFollowed', {
    followerId, followingId, timestamp: Date.now()
  });
}
Ranking Service (ML-Powered)
Core Responsibilities
- Content Scoring: ML model-based post relevance
- Personalization: User preference learning
- Engagement Prediction: Likelihood to interact
- Content Freshness: Time-decay scoring
- A/B Testing: Algorithm experimentation
Scoring Algorithm
// Ranking Score Calculation
function calculateScore(post, user) {
  const features = {
    // Content features
    postType: post.type,
    hasMedia: post.mediaUrls.length > 0,
    contentLength: post.content.length,
    // Social features
    authorFollowers: post.author.followerCount,
    userFollowsAuthor: user.following.includes(post.authorId),
    mutualFriends: getMutualFriends(user.id, post.authorId),
    // Engagement features
    likeCount: post.likeCount,
    commentCount: post.commentCount,
    shareCount: post.shareCount,
    // Temporal features
    timeSincePost: Date.now() - post.createdAt,
    timeOfDay: new Date().getHours()
  };
  // ML Model Inference
  return mlModel.predict(features);
}
Notification Service
Core Responsibilities
- WebSocket Management: Real-time connections
- Push Notifications: Mobile APNs/FCM delivery
- Event Broadcasting: Live feed updates
- Presence Tracking: Online/offline status
- Notification Preferences: User-specific settings
Real-time Updates
// WebSocket Event Broadcasting
class NotificationService {
  // When post is created
  async onPostCreated(event) {
    // The PostCreated event carries the author's ID in its userId field
    const { postId, userId: authorId } = event;
    // Load data for the notification text; getPost is a hypothetical helper
    const author = await getUserProfile(authorId);
    const post = await getPost(postId);
    // Get online followers
    const onlineFollowers = await getOnlineFollowers(authorId);
    // Send real-time update
    const message = {
      type: 'NEW_POST',
      postId: postId,
      author: authorId,
      timestamp: Date.now()
    };
    // Broadcast via WebSocket
    onlineFollowers.forEach(follower => {
      this.webSocketManager.send(follower.connectionId, message);
    });
    // Send push notification to offline users
    const offlineFollowers = await getOfflineFollowers(authorId);
    await this.pushNotificationService.send(offlineFollowers, {
      title: `New post from ${author.name}`,
      body: post.content.substring(0, 100)
    });
  }
}
Complete Data Flow Examples
Post Creation Flow
Step-by-Step Process:
1. Client Request: Alice uploads "Beautiful sunset!" with photo
   POST /api/posts → API Gateway → Post Service
2. Media Processing: Upload image to S3, generate thumbnails
   Post Service → S3 → CDN integration
3. Data Persistence: Store post metadata in DynamoDB
   Post Service → DynamoDB Posts table
4. Event Publishing: Trigger fanout process
   Post Service → Kafka "PostCreated" event
5. Feed Fanout: Update followers' feeds based on strategy
   Feed Service consumes Kafka event
6. Real-time Notifications: Notify online followers
   Notification Service → WebSocket/Push notifications
Performance Metrics:
- Post Creation: ~150ms (including S3 upload)
- Fanout Processing: ~2-5 seconds (async via Kafka)
- Real-time Delivery: ~100ms (WebSocket notification)
Feed Retrieval Flow
Step-by-Step Process:
1. Client Request: Bob opens his feed
   GET /api/feed/timeline → API Gateway → Feed Service
2. Cache Check: Look for pre-computed feed in Redis
   Feed Service → Redis cache lookup
3. Feed Generation: If cache miss, generate feed using hybrid strategy
   Query PrecomputedFeed + Pull celebrity posts
4. Content Ranking: Score and sort posts by relevance
   Feed Service → Ranking Service ML models
5. Response Assembly: Enrich posts with user data and media URLs
   Parallel queries to User Service, CDN URLs
6. Cache Update: Store generated feed for future requests
   Feed Service → Redis cache (15min TTL)
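The cache check and cache update steps above follow the cache-aside pattern. Below is a minimal sketch of it with the 15-minute TTL from the flow. The TtlCache class is an in-memory stand-in for Redis, and it, getFeedCached, and the injected clock are all assumptions introduced so the example runs on its own.

```javascript
// Minimal TTL cache standing in for Redis. ASSUMPTION: in-memory only.
class TtlCache {
  constructor(now = () => Date.now()) {
    this.now = now;
    this.entries = new Map(); // key -> { value, expiresAt }
  }
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // lazily evict expired entries
      return undefined;
    }
    return entry.value;
  }
  set(key, value, ttlMs) {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }
}

const FEED_TTL_MS = 15 * 60 * 1000; // 15-minute TTL from the flow above

// Cache-aside: try the cache, fall back to generation, then populate the cache.
function getFeedCached(cache, userId, generateFeed) {
  const key = `feed:${userId}`;
  const cached = cache.get(key);
  if (cached !== undefined) return { feed: cached, cacheHit: true };
  const feed = generateFeed(userId);
  cache.set(key, feed, FEED_TTL_MS);
  return { feed, cacheHit: false };
}

let clock = 0; // fake clock so expiry can be shown deterministically
const cache = new TtlCache(() => clock);
const generate = (userId) => [`post_for_${userId}`];
const first = getFeedCached(cache, "bob", generate);  // miss: feed is generated
const second = getFeedCached(cache, "bob", generate); // hit: served from cache
clock += FEED_TTL_MS;                                 // the TTL elapses
const third = getFeedCached(cache, "bob", generate);  // miss again after expiry
console.log(first.cacheHit, second.cacheHit, third.cacheHit);
```

The TTL bounds how stale a cached feed can get: a shorter value gives fresher feeds but sends more requests down the slower generation path.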
Performance Scenarios:
- Cache Hit (90% of requests): ~50ms (direct Redis retrieval)
- Cache Miss (10% of requests): ~200ms (full feed generation + ranking)
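Combining the two scenarios above gives the expected read latency as a weighted average. The short sketch below works it out from the stated figures (90% hits at ~50ms, 10% misses at ~200ms); the function name expectedLatencyMs is hypothetical.

```javascript
// Expected latency = sum over scenarios of (share of requests * latency).
function expectedLatencyMs(scenarios) {
  const totalShare = scenarios.reduce((sum, s) => sum + s.share, 0);
  if (Math.abs(totalShare - 1) > 1e-9) {
    throw new Error("scenario shares must sum to 1");
  }
  return scenarios.reduce((sum, s) => sum + s.share * s.latencyMs, 0);
}

const scenarios = [
  { name: "cache hit", share: 0.9, latencyMs: 50 },   // figures from the text above
  { name: "cache miss", share: 0.1, latencyMs: 200 },
];
console.log(`${expectedLatencyMs(scenarios).toFixed(1)} ms`);
```

This works out to 0.9 × 50 + 0.1 × 200 = 65ms on average. The same function shows how sensitive the average is to the hit rate: at an 80% hit rate it rises to 80ms.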
Key Architecture Decisions
Microservices Design
Domain-driven service boundaries enable independent scaling and team ownership
Hybrid Feed Strategy
User-based routing between push and pull models optimizes for both performance and cost
Event-Driven Architecture
Kafka enables loose coupling and async processing of high-volume social interactions
Multi-Database Strategy
Right database for each use case: DynamoDB for scale, Redis for speed, S3 for media
Real-time Layer
WebSocket connections and push notifications provide instant social interactions
ML-Powered Ranking
Dedicated ranking service enables sophisticated personalization algorithms
Coming Up Next
Our core architecture handles the functional requirements well. Next, we'll optimize for massive scale:
- Database sharding - Horizontal scaling across multiple DynamoDB tables
- Global distribution - Multi-region deployment and data replication
- Caching strategies - Multi-tier caching for sub-100ms responses
- Celebrity optimizations - Special handling for high-follower accounts
- Performance monitoring - SLAs, alerting, and cost optimization