Step 1 of 6: S - Scope & Requirements
Define what we're building, who will use it, and the scale we need to handle
🎯 What We're Building
Design a social media feed system, similar to Facebook or Instagram, that serves personalized content to billions of users in real time. Users should see relevant posts from the people they follow, with the most engaging content ranked first.
✅ Functional Requirements
🎯 Core Requirements (MVP)
- Post Creation: Users can create posts with text, images, and videos
- Follow System: Users can follow/unfollow other users
- Feed Generation: Display a personalized timeline of posts from followed users
- Basic Interactions: Like, comment, and share posts
- Real-time Updates: New posts appear in feeds without a refresh
🌟 Enhanced Features
- Content Ranking: ML-based feed ranking by relevance
- Media Support: High-quality image/video upload and streaming
- Privacy Controls: Public/private posts, block users
- Push Notifications: Notify users of likes, comments, mentions
- Analytics: Track post engagement, user growth
⚡ Non-Functional Requirements
📊 Scale
- Daily Active Users: 2 billion
- Total Users: 3 billion
- Posts per day: 500 million
- Average posts per user: 0.25 per day (500M posts ÷ 2B daily active users, about one post every four days)
- Average follows: 200 users
- Celebrity accounts: Up to 100M followers
🚀 Performance
- Feed Generation: < 100ms
- Post Creation: < 200ms
- Like/Comment: < 50ms
- Real-time Updates: < 1 second
- Image Upload: < 5 seconds
- Video Processing: < 30 seconds
🛡️ Reliability
- Uptime: 99.99% (4.38 min/month downtime)
- Data Consistency: Eventually consistent
- Disaster Recovery: < 1 hour RTO
- Data Retention: Indefinite (user-controlled)
- Regional Failover: Automatic
- Data Backup: 3x replication
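The downtime figure quoted for the uptime target can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `downtime_budget_minutes` is hypothetical, and it assumes an average month of 365.25 / 12 days.

```python
AVG_MONTH_DAYS = 365.25 / 12  # assumption: average month length, about 30.44 days


def downtime_budget_minutes(availability: float, period_days: float) -> float:
    """Return the downtime allowed (in minutes) over a period for an availability target."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    minutes_in_period = period_days * 24 * 60
    return (1.0 - availability) * minutes_in_period


if __name__ == "__main__":
    # 99.99% uptime, as stated in the reliability requirements
    print(f"{downtime_budget_minutes(0.9999, AVG_MONTH_DAYS):.2f} min/month")
    print(f"{downtime_budget_minutes(0.9999, 365.25):.1f} min/year")
```

With these assumptions, 99.99% works out to roughly 4.38 minutes per month, which matches the number in the list, or a little under an hour per year.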
🧮 Back-of-Envelope Estimations
📖 Read vs Write Analysis
Assumption: The average user checks their feed 10 times per day, and each check loads 20 posts.
- Writes per day: 500M posts
- Reads per day: 2B users × 10 checks × 20 posts = 400B post reads
- Read:Write Ratio: 400B ÷ 500M = 800:1 (read-heavy)
💡 Design Implication
A workload this read-heavy calls for aggressive caching, read replicas, and feed generation that is optimized for the read path.
⚡ QPS (Queries Per Second) Analysis
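✍️ Write QPS
- Average: 500M posts ÷ 86,400 s ≈ 5,800 writes/sec
📖 Read QPS
- Feed requests: 2B users × 10 checks ÷ 86,400 s ≈ 231K requests/sec
- Post reads: 400B ÷ 86,400 s ≈ 4.6M reads/sec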
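Average write and read QPS follow directly from the daily volumes in the scale section and the read/write assumption. The sketch below only divides those figures by the seconds in a day; the variable and function names are illustrative, and it deliberately models no peak-traffic multiplier, since the requirements do not state one.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Figures taken from the scale section and the read/write assumption.
DAILY_ACTIVE_USERS = 2_000_000_000
POSTS_PER_DAY = 500_000_000
FEED_CHECKS_PER_USER = 10
POSTS_PER_FEED_LOAD = 20


def average_qps(events_per_day: int) -> float:
    """Spread a daily event count evenly over the day."""
    return events_per_day / SECONDS_PER_DAY


write_qps = average_qps(POSTS_PER_DAY)

feed_requests_per_day = DAILY_ACTIVE_USERS * FEED_CHECKS_PER_USER
feed_request_qps = average_qps(feed_requests_per_day)

post_reads_per_day = feed_requests_per_day * POSTS_PER_FEED_LOAD
post_read_qps = average_qps(post_reads_per_day)

if __name__ == "__main__":
    print(f"Write QPS:        {write_qps:,.0f}")
    print(f"Feed request QPS: {feed_request_qps:,.0f}")
    print(f"Post read QPS:    {post_read_qps:,.0f}")
    print(f"Read:write ratio: {post_read_qps / write_qps:,.0f}:1")
```

These are averages only. Real traffic is bursty, so capacity planning would normally apply a peak factor on top of them.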
💾 Storage Requirements
📝 Text & Metadata
🎬 Media Content
👤 User & Relationships
📊 Total Storage Requirement
🌐 Bandwidth Requirements
📤 Ingress (Upload)
📥 Egress (Download)
🎯 Key Design Challenges
🔥 Celebrity Problem
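How do we handle accounts with 100M+ followers without overwhelming the system when they post? Push-based fanout (writing each new post into every follower's feed) would create massive write amplification: a single post becomes up to 100M feed writes.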
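One common way to limit this write amplification is a hybrid model: push posts from ordinary accounts into follower feeds at write time, and pull posts from high-follower accounts at read time. The sketch below is a minimal in-memory illustration of that idea, not the design this series settles on. The class `FeedStore`, its methods, and the `CELEBRITY_THRESHOLD` cutoff are all hypothetical.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 1_000_000  # hypothetical cutoff; tune per system


class FeedStore:
    """Toy hybrid fanout: push for normal authors, pull for celebrities."""

    def __init__(self, threshold: int = CELEBRITY_THRESHOLD):
        self.threshold = threshold
        self.followers = defaultdict(set)  # author -> follower ids
        self.following = defaultdict(set)  # user -> authors they follow
        self.inbox = defaultdict(list)     # user -> [(ts, post_id)] pushed at write time
        self.outbox = defaultdict(list)    # author -> [(ts, post_id)] of own posts

    def follow(self, user: str, author: str) -> None:
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author: str) -> bool:
        return len(self.followers[author]) >= self.threshold

    def publish(self, author: str, post_id: str, ts: int) -> int:
        """Store a post and return how many follower inboxes were written."""
        self.outbox[author].append((ts, post_id))
        if self.is_celebrity(author):
            return 0  # no fanout; followers pull this post at read time
        for follower in self.followers[author]:
            self.inbox[follower].append((ts, post_id))
        return len(self.followers[author])

    def feed(self, user: str, limit: int = 20) -> list:
        """Merge pushed posts with celebrity posts pulled at read time, newest first."""
        entries = set(self.inbox[user])  # set removes duplicates across both paths
        for author in self.following[user]:
            if self.is_celebrity(author):
                entries.update(self.outbox[author])
        newest_first = sorted(entries, reverse=True)
        return [post_id for _, post_id in newest_first[:limit]]


if __name__ == "__main__":
    store = FeedStore(threshold=3)  # tiny threshold for demonstration
    store.follow("bob", "alice")
    for user in ("bob", "carol", "dave"):
        store.follow(user, "star")
    print(store.publish("alice", "p1", ts=1))  # pushed to alice's one follower
    print(store.publish("star", "p2", ts=2))   # celebrity: zero inbox writes
    print(store.feed("bob"))
```

The trade-off is that celebrity posts cost nothing at write time but add merge work to every read by their followers, so the threshold is a tuning knob rather than a fixed constant.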
⚡ Real-time Updates
Deliver new posts to millions of users within the 1-second real-time target while keeping feed generation latency under 100ms.
🎯 Content Ranking
Use ML algorithms to rank billions of posts by relevance and engagement while respecting user preferences.
💾 Hot Data Problem
Recent posts need to be instantly accessible while older content can be moved to slower storage tiers.
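As a rough illustration of age-based tiering, the sketch below maps a post's age to a storage tier. The tier names and the 7-day and 90-day windows are assumptions chosen for the example; the requirements above do not specify them, and a real system would likely also consider access frequency.

```python
from datetime import datetime, timedelta, timezone

HOT_WINDOW = timedelta(days=7)    # hypothetical: keep in cache / fast storage
WARM_WINDOW = timedelta(days=90)  # hypothetical: keep in the primary database


def storage_tier(created_at: datetime, now: datetime) -> str:
    """Pick a storage tier from a post's age alone."""
    age = now - created_at
    if age <= HOT_WINDOW:
        return "hot"
    if age <= WARM_WINDOW:
        return "warm"
    return "cold"  # candidate for slower, cheaper storage


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    for days in (1, 30, 365):
        print(days, "days old ->", storage_tier(now - timedelta(days=days), now))
```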
🌍 Global Distribution
Serve users worldwide with low latency while handling different privacy laws and content regulations.
🔄 Data Consistency
Balance between immediate consistency (likes/comments) and eventual consistency (feed updates) across regions.
📈 Success Metrics
User Engagement
Performance
Content Quality
Business
❌ Out of Scope (For This Design)
📱 Advanced Features
- Stories/temporary content
- Live streaming
- Video calling
- Marketplace/commerce
- Advanced content moderation
🛠️ Implementation Details
- Detailed ML model training
- Mobile app implementation
- Specific security protocols
- Payment processing
- Legal compliance frameworks
📌 Focus: We'll concentrate on the core feed generation, posting, and user interaction systems that handle the massive scale requirements defined above.
🔮 Coming Up Next
Now that we have clear requirements and scale estimates, we'll design the core entities and API contracts:
- User, Post, Follow entities with relationships
- REST API design for posts, feeds, and interactions
- Database schema optimized for our read-heavy workload
- Feed generation strategies (push vs pull models)