System Design Framework
A structured methodology for approaching system design problems. This framework helps you cover the critical aspects while keeping a logical flow that interviewers can easily follow.
The SACRED Framework
SACRED: Systematic Approach to Complex Requirements & Efficient Design
S - Scope & Requirements: Define functional & non-functional requirements
A - API Design & Core Entities: Define entities, relationships, and API contracts
C - Core High-Level Design: Basic architecture to satisfy functional requirements
R - Refinement for Scale: Address non-functional requirements (scale, performance)
E - Edge Cases: Failure handling and special scenarios
D - Deep Dives: Detailed component implementation
Step 1: Scope & Requirements (5-10 mins)
Requirements Gathering
Functional Requirements: what the system should DO
Non-Functional Requirements: HOW WELL the system should perform
Pro Tip: Write down specific numbers! "Handle 1B requests/day" is better than "handle high traffic". If not given, make reasonable assumptions and state them clearly.
Example: URL Shortener Requirements
Real Interview Dialogue
Interviewer: "Design a URL shortening service like bit.ly."
You: "Great! Let me clarify the requirements. For functional requirements - should users be able to shorten any URL, and do we need custom aliases?"
Interviewer: "Yes to shortening URLs. Custom aliases would be nice but not required initially."
You: "Got it. Do we need analytics - like click counts, geographic data? Also, should URLs expire or last forever?"
Interviewer: "Basic click counts are important. For expiration, let's say URLs last forever unless deleted."
You: "Perfect. Now for scale - how many URL shortenings per day should we handle? And what about read vs write ratio?"
Interviewer: "Let's say 100M URLs created per day. Reads are much higher - assume 100:1 read/write ratio."
You: "Excellent. So 100M writes/day and 10B reads/day. For availability - is 99.9% uptime acceptable? Any latency requirements?"
Interviewer: "99.9% is fine. URL redirection should be under 100ms."
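Summary:
- Functional: shorten URLs (custom aliases optional), basic click counts, URLs persist until deleted
- Non-functional: 100M writes/day, 10B reads/day (100:1 read/write ratio), 99.9% availability, redirects under 100ms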
Step 2: API Design & Core Entities (10-15 mins)
Entities & API Contracts
2.1 Core Entities
Identify main objects:
- Primary entities (User, Post, Order)
- Relationships (1:1, 1:N, N:M)
- Key attributes per entity
- Data size estimations
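For the URL shortener example, the entity checklist above might translate into something like the following sketch. The entity names (`User`, `ShortUrl`), their fields, and the 8-byte width used for fixed fields are illustrative assumptions, not part of the original requirements.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class User:
    user_id: int
    email: str


@dataclass
class ShortUrl:
    short_code: str            # lookup key for redirects
    long_url: str
    owner_id: Optional[int]    # 1:N relationship: one User owns many ShortUrls
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    click_count: int = 0       # basic analytics from the requirements


def estimated_row_bytes(url: ShortUrl) -> int:
    """Rough per-row size estimate (assumes 8 bytes per fixed-width field)."""
    fixed = 8 + 8 + 8  # owner_id, created_at, click_count
    return len(url.short_code.encode()) + len(url.long_url.encode()) + fixed
```

A per-row estimate like this feeds directly into the storage math in the back-of-envelope step.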
2.2 API Design
Define endpoints:
- REST vs GraphQL decision
- Request/Response format
- Authentication method
- Rate limiting approach
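One way to pin down a request/response format is to write it as typed structures with validation rules. The sketch below assumes a REST-style `POST /api/v1/urls` endpoint; the path, field names, and validation rules are hypothetical and only illustrate the level of detail worth stating in an interview.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CreateUrlRequest:
    """Body of a hypothetical POST /api/v1/urls call."""
    long_url: str
    custom_alias: Optional[str] = None  # optional per the requirements


@dataclass
class CreateUrlResponse:
    short_code: str
    short_url: str


@dataclass
class ApiError:
    status: int
    message: str


def validate_create(req: CreateUrlRequest) -> Optional[ApiError]:
    """Return an ApiError for a bad request, or None if the request is valid."""
    if not req.long_url.startswith(("http://", "https://")):
        return ApiError(400, "long_url must be an http(s) URL")
    if req.custom_alias is not None and not req.custom_alias.isalnum():
        return ApiError(400, "custom_alias must be alphanumeric")
    return None
```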
2.3 Back-of-Envelope Calculations
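As a worked instance for the URL shortener example, the arithmetic might look like the sketch below. The 100M writes/day and the 100:1 ratio come from the requirements dialogue; the 500-byte record size and the five-year planning horizon are assumptions chosen purely for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

writes_per_day = 100_000_000   # from the requirements dialogue
read_write_ratio = 100         # from the requirements dialogue
bytes_per_record = 500         # assumption: long URL + short code + metadata
retention_years = 5            # assumption: planning horizon

reads_per_day = writes_per_day * read_write_ratio
write_qps = writes_per_day / SECONDS_PER_DAY
read_qps = reads_per_day / SECONDS_PER_DAY

total_records = writes_per_day * 365 * retention_years
storage_tb = total_records * bytes_per_record / 1e12

print(f"reads/day:      {reads_per_day:,}")
print(f"avg write QPS:  {write_qps:,.0f}")
print(f"avg read QPS:   {read_qps:,.0f}")
print(f"records stored: {total_records:,}")
print(f"storage (TB):   {storage_tb:,.1f}")
```

Under these assumptions the averages come out to roughly 1,160 writes/s and 116,000 reads/s, with storage on the order of 90 TB. Peak traffic is usually a multiple of the average, so state the multiplier you assume.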
Step 3: Core High-Level Design (10-15 mins)
Basic Architecture (Functional Requirements)
3.1 Simple Architecture
3.2 Database Design
Choose database type:
- SQL for ACID compliance
- NoSQL for scale & flexibility
- Schema design for entities
- Primary/foreign key relationships
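If the SQL route is chosen, the schema and key relationships could be sketched as below. This uses Python's built-in `sqlite3` with an in-memory database only so the example runs anywhere; the table and column names are assumptions for the URL shortener example, and a production system would use a different engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    email   TEXT NOT NULL UNIQUE
);
CREATE TABLE short_urls (
    short_code  TEXT PRIMARY KEY,                   -- indexed lookup key
    long_url    TEXT NOT NULL,
    owner_id    INTEGER REFERENCES users(user_id),  -- 1:N user -> URLs
    created_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    click_count INTEGER NOT NULL DEFAULT 0
);
""")

conn.execute("INSERT INTO users (user_id, email) VALUES (1, 'a@example.com')")
conn.execute(
    "INSERT INTO short_urls (short_code, long_url, owner_id) VALUES (?, ?, ?)",
    ("abc1234", "https://example.com/page", 1),
)

# The redirect path is a single primary-key lookup.
row = conn.execute(
    "SELECT long_url FROM short_urls WHERE short_code = ?", ("abc1234",)
).fetchone()
print(row[0])
```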
3.3 Basic Component Flow
Draw simple architecture that satisfies functional requirements:
3.4 Database Selection
SQL
- ACID compliance
- Complex queries
- Strong consistency
NoSQL
- Horizontal scaling
- Flexible schema
- Eventual consistency (often tunable)
Step 4: Refinement for Scale (10-15 mins)
Scale & Performance Optimization
4.1 Scale Architecture
4.2 Performance Optimizations
Read Path (10B/day):
- 1. CDN cache (popular URLs)
- 2. Application cache (Redis)
- 3. Database read replicas
- 4. Geographic distribution
- 5. <100ms latency target
Write Path (100M/day):
- 1. Rate limiting (per user)
- 2. Database sharding (by short_code)
- 3. Async analytics processing
- 4. Write-through cache
- 5. Batch writes for analytics
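The cache-related parts of the two paths above can be sketched as follows: reads check the cache first and fall back to the database, while writes go through to both. Plain dictionaries stand in for Redis and the database here, and the `UrlStore` class and its method names are hypothetical.

```python
from typing import Dict, Optional


class UrlStore:
    def __init__(self) -> None:
        self.cache: Dict[str, str] = {}     # stand-in for Redis
        self.database: Dict[str, str] = {}  # stand-in for the primary store
        self.db_reads = 0                   # counts cache misses that hit the DB

    def create(self, short_code: str, long_url: str) -> None:
        # Write-through: persist to the database and populate the cache together.
        self.database[short_code] = long_url
        self.cache[short_code] = long_url

    def resolve(self, short_code: str) -> Optional[str]:
        # Read path: serve from cache when possible.
        cached = self.cache.get(short_code)
        if cached is not None:
            return cached
        # Cache miss: read from the database and backfill the cache.
        self.db_reads += 1
        long_url = self.database.get(short_code)
        if long_url is not None:
            self.cache[short_code] = long_url
        return long_url
```

With a 100:1 read/write ratio, most of the benefit comes from keeping `db_reads` low; a real cache would also need eviction and TTLs, which this sketch omits.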
4.3 Scaling Strategies
Database Scaling:
- Read replicas (handle 100:1 ratio)
- Sharding by short_code hash
- Connection pooling
- Indexing on short_code
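Sharding by short_code hash can be as simple as the sketch below. The function name `shard_for` and the choice of SHA-256 are assumptions; any stable hash works, but Python's built-in `hash()` does not, because it is randomized per process for strings.

```python
import hashlib


def shard_for(short_code: str, num_shards: int) -> int:
    """Map a short code to a shard index using a stable hash."""
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce modulo shards.
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    for code in ("abc1234", "zzTop99", "q7Lm2x0"):
        print(code, "-> shard", shard_for(code, 16))
```

Plain modulo sharding remaps most keys when `num_shards` changes, which is why consistent hashing appears later as a deep-dive topic.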
Caching Strategy:
- CDN for popular URLs (80/20 rule)
- Redis for recent URLs
- Application-level caching
- Write-through for new URLs
Step 5: Edge Cases & Failures (10 mins)
Resilience & Failure Handling
Edge Cases & Failure Scenarios
- Hotspots: Celebrity problem, viral content
- Race conditions: Distributed locks
- Server failure: Health checks, auto-restart
- Database failure: Replicas, failover
- Network partition: CAP theorem trade-offs
- Cascading failures: Circuit breakers
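To make the circuit breaker item concrete, here is a minimal sketch. The class name, the thresholds, and the injectable `clock` parameter are illustrative assumptions; production libraries add metrics, per-exception policies, and thread safety.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected immediately."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, fn: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = fn()
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Failing fast keeps threads and connections from piling up behind a dependency that is already down, which is how one failure turns into a cascade.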
Performance Refinements
- Caching: Multi-level (CDN, Redis, application)
- Database: Indexing, query optimization
- Load balancing: Geographic distribution
- Async processing: Queue for heavy operations
- Data partitioning: Sharding strategies
- Connection pooling: Resource management
Caching Strategy
CDN: Static content, edge caching
Application Cache: Redis/Memcached for hot data
Database Cache: Query result caching
Database Scaling
Vertical Scaling
- Upgrade hardware (CPU, RAM)
- Limited by single machine
- Simple but expensive
Horizontal Scaling
- Replication (Master-Slave)
- Sharding (partition data)
- Federation (split databases)
Load Distribution
- Load Balancer: Round-robin, least connections, IP hash
- Message Queue: Decouple components, handle spikes
- Service Mesh: Microservices communication
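The three load balancer strategies named above differ only in how they pick a backend. The sketch below shows one possible shape for each; the class and function names are hypothetical, and real load balancers also track server health and weights.

```python
import hashlib
import itertools
from typing import Dict, List


class RoundRobin:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers: List[str]) -> None:
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        return next(self._cycle)


class LeastConnections:
    """Pick the server with the fewest in-flight requests."""

    def __init__(self, servers: List[str]) -> None:
        self.active: Dict[str, int] = {s: 0 for s in servers}

    def pick(self) -> str:
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        self.active[server] -= 1


def ip_hash(client_ip: str, servers: List[str]) -> str:
    """Send the same client IP to the same server (while the list is unchanged)."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]
```

Round-robin suits uniform requests, least connections suits requests of uneven duration, and IP hash gives session affinity at the cost of less even distribution.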
Step 6: Deep Dives (Time Permitting)
Component Deep Dives
Based on interviewer interest or remaining time, dive deep into specific components:
Algorithm Details
- Consistent hashing implementation
- Rate limiting algorithms
- Ranking/recommendation logic
- Consensus protocols (Raft, Paxos)
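As an example of the depth expected for the first item, a consistent hash ring with virtual nodes can be sketched in a few dozen lines. The `HashRing` name, the default of 100 virtual nodes per server, and the use of SHA-256 are assumptions made for illustration.

```python
import bisect
import hashlib
from typing import Dict, List


def _hash(key: str) -> int:
    """Stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes: List[str], vnodes: int = 100) -> None:
        self.vnodes = vnodes
        self._points: List[int] = []       # sorted ring positions
        self._owner: Dict[int, str] = {}   # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        # Each node is placed at many positions to even out the load.
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key: str) -> str:
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]
```

The property worth stating aloud: when a node is removed, only the keys that node owned move; every other key keeps its owner.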
Data Structures
- Bloom filters for deduplication
- LSM trees for write-heavy loads
- B-trees for indexing
- Tries for autocomplete
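A Bloom filter is small enough to sketch on a whiteboard. The version below is a minimal illustration; the `BloomFilter` class, its default sizes, and the seeded SHA-256 hashing scheme are assumptions rather than a tuned implementation.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive several bit positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means definitely absent; True means possibly present.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )
```

For deduplication, a negative answer lets you skip the database lookup entirely, while a positive answer still needs a real check because false positives are possible.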
Specific Optimizations
- Database query optimization
- Connection pooling
- Batch processing strategies
- Compression techniques
Trade-off Analysis
- SQL vs NoSQL for specific use case
- Synchronous vs asynchronous processing
- Push vs pull architecture
- Monolith vs microservices
Time Management Guide
45-Minute Interview Timeline
Common Pitfalls to Avoid
What NOT to Do
Design Mistakes
- Jumping to implementation details too early
- Over-engineering for unlikely scenarios
- Ignoring data consistency requirements
- Not considering failure modes
- Forgetting about data growth over time
Communication Mistakes
- Not asking clarifying questions
- Making assumptions without stating them
- Getting stuck on one approach
- Not explaining trade-offs
- Avoiding areas you're unsure about
Framework Application Examples
Applied to Common Problems
URL Shortener
Chat Application
Video Streaming
Quick Reference Card
Interview Checklist
Before You Start Drawing
- Clarified functional requirements
- Clarified scale and performance needs
- Stated your assumptions
- Did back-of-envelope math
Core Design Must-Haves
- Data model defined
- API endpoints listed
- High-level architecture drawn
- Database choice justified
Scaling Considerations
- Identified bottlenecks
- Added caching layers
- Discussed database scaling
- Considered geographic distribution
Production Readiness
- Handled failure scenarios
- Added monitoring/alerting
- Discussed security concerns
- Considered edge cases
Remember: System design is about trade-offs. There's no perfect solution. Always explain your reasoning, acknowledge alternatives, and be prepared to adjust based on new requirements or constraints.