System Design Framework

A structured methodology for approaching system design problems. This framework ensures you cover all critical aspects while maintaining a logical flow that interviewers can easily follow.

The SACRED Framework

πŸ›οΈ SACRED: Systematic Approach to Complex Requirements & Efficient Design

S - Scope & Requirements

Define functional & non-functional requirements

A - API Design & Core Entities

Define entities, relationships, and API contracts

C - Core High-Level Design

Basic architecture to satisfy functional requirements

R - Refinement for Scale

Address non-functional requirements (scale, performance)

E - Edge Cases

Failure handling and special scenarios

D - Deep Dives

Detailed component implementation

Step 1: Scope & Requirements (5-10 mins)

🎯 Requirements Gathering

Functional Requirements

What the system should DO

Ask about:
β€’ Core features (must-have vs nice-to-have)
β€’ User actions and workflows
β€’ Data operations (CRUD)
β€’ Business logic rules
β€’ User types and permissions

Non-Functional Requirements

HOW WELL the system should perform

Clarify:
β€’ Scale (users, requests, data volume)
β€’ Performance (latency, throughput)
β€’ Availability (uptime targets)
β€’ Consistency requirements
β€’ Geographic distribution

πŸ’‘ Pro Tip: Write down specific numbers! "Handle 1B requests/day" is better than "handle high traffic". If not given, make reasonable assumptions and state them clearly.

Example: URL Shortener Requirements

πŸ”— Real Interview Dialogue

πŸ‘¨β€πŸ’Ό Interviewer:

"Design a URL shortening service like bit.ly."

πŸ§‘β€πŸ’» You:

"Great! Let me clarify the requirements. For functional requirements - should users be able to shorten any URL, and do we need custom aliases?"

πŸ‘¨β€πŸ’Ό Interviewer:

"Yes to shortening URLs. Custom aliases would be nice but not required initially."

πŸ§‘β€πŸ’» You:

"Got it. Do we need analytics - like click counts, geographic data? Also, should URLs expire or last forever?"

πŸ‘¨β€πŸ’Ό Interviewer:

"Basic click counts are important. For expiration, let's say URLs last forever unless deleted."

πŸ§‘β€πŸ’» You:

"Perfect. Now for scale - how many URL shortenings per day should we handle? And what about read vs write ratio?"

πŸ‘¨β€πŸ’Ό Interviewer:

"Let's say 100M URLs created per day. Reads are much higher - assume 100:1 read/write ratio."

πŸ§‘β€πŸ’» You:

"Excellent. So 100M writes/day and 10B reads/day. For availability - is 99.9% uptime acceptable? Any latency requirements?"

πŸ‘¨β€πŸ’Ό Interviewer:

"99.9% is fine. URL redirection should be under 100ms."

πŸ“ Summary:

Functional Requirements:
β€’ Shorten long URLs to short URLs
β€’ Redirect short URLs to original URLs
β€’ Basic click analytics
β€’ URLs persist forever (no expiration)
Non-Functional Requirements:
β€’ 100M URL shortenings/day
β€’ 10B redirections/day (100:1 read/write)
β€’ 99.9% availability
β€’ <100ms redirect latency

Step 2: API Design & Core Entities (10-15 mins)

πŸ”— Entities & API Contracts

2.1 Core Entities

Identify main objects:

β€’ Primary entities (User, Post, Order)
β€’ Relationships (1:1, 1:N, N:M)
β€’ Key attributes per entity
β€’ Data size estimations
Example - URL Shortener:
URL: {id, long_url, short_code, created_at}
User: {id, email, api_key}
Analytics: {url_id, clicks, timestamp}
User β†’ URLs (1:N)
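The entities above can be sketched as plain data classes. Field names follow the example; `AnalyticsEvent` is an illustrative name, and the 1:N User→URL link would live as a `user_id` column in the actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class User:
    id: int
    email: str
    api_key: str

@dataclass
class Url:
    id: int
    long_url: str
    short_code: str
    # Default to the creation time; stored as UTC to avoid timezone bugs.
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

@dataclass
class AnalyticsEvent:
    url_id: int
    clicks: int
    timestamp: datetime
```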

2.2 API Design

Define endpoints:

β€’ REST vs GraphQL decision
β€’ Request/response format
β€’ Authentication method
β€’ Rate limiting approach
URL Shortener APIs:
POST /shorten
{long_url} β†’ {short_url}
GET /{short_code}
β†’ 301 Redirect to long_url
GET /stats/{short_code}
β†’ {clicks, referrers, timeline}
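A minimal in-memory sketch of these three endpoints. The class name and random 7-character codes are illustrative; a real service would persist to a database, handle code collisions, and track referrers and a timeline rather than only a click count:

```python
import secrets
import string

_ALPHABET = string.ascii_letters + string.digits  # Base62 character set

class UrlShortener:
    def __init__(self):
        self._by_code = {}   # short_code -> long_url
        self._clicks = {}    # short_code -> click count

    def shorten(self, long_url: str) -> str:
        """POST /shorten"""
        code = "".join(secrets.choice(_ALPHABET) for _ in range(7))
        self._by_code[code] = long_url
        self._clicks[code] = 0
        return code

    def redirect(self, short_code: str) -> str:
        """GET /{short_code} -> target of the 301 redirect"""
        self._clicks[short_code] += 1
        return self._by_code[short_code]

    def stats(self, short_code: str) -> dict:
        """GET /stats/{short_code}"""
        return {"clicks": self._clicks[short_code]}
```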

2.3 Back-of-Envelope Calculations

Based on requirements & entities:
β€’ QPS = Daily requests / 86,400 seconds
β€’ Storage = Entity size Γ— Number Γ— Retention
β€’ Bandwidth = QPS Γ— Average payload size
β€’ Cache = 20% of frequently accessed data
Example (100M URLs/day):
β€’ Write QPS: 100M / 86400 β‰ˆ 1200 QPS
β€’ Read QPS: 1200 Γ— 100 = 120K QPS
β€’ Storage: 500 bytes Γ— 100M Γ— 365 β‰ˆ 18 TB/year

Step 3: Core High-Level Design (10-15 mins)

πŸ›οΈ Basic Architecture (Functional Requirements)

3.1 Simple Architecture

Draw components to satisfy APIs:
β€’ Client β†’ Load Balancer
β€’ Load Balancer β†’ Application Servers
β€’ Application β†’ Database (single master initially)
β€’ Application β†’ Object Storage (for media)
Focus: Make it work functionally!

3.2 Database Design

Choose database type:

β€’ SQL for ACID compliance
β€’ NoSQL for scale & flexibility
β€’ Schema design for entities
β€’ Primary/foreign key relationships
URL Shortener Tables:
urls: id, long_url, short_code, user_id
users: id, email, api_key
analytics: url_id, timestamp, ip
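The three tables above, sketched as SQLite DDL via Python's `sqlite3` (column types and constraints are illustrative, not a production schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email TEXT UNIQUE NOT NULL,
    api_key TEXT UNIQUE NOT NULL
);
CREATE TABLE urls (
    id INTEGER PRIMARY KEY,
    long_url TEXT NOT NULL,
    -- UNIQUE gives short_code an index, which the 100:1 read path needs.
    short_code TEXT UNIQUE NOT NULL,
    user_id INTEGER REFERENCES users(id)
);
CREATE TABLE analytics (
    url_id INTEGER REFERENCES urls(id),
    timestamp TEXT NOT NULL,
    ip TEXT
);
""")
```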

3.3 Basic Component Flow

Draw simple architecture that satisfies functional requirements:

URL Shortener Basic Flow:
Client β†’ Load Balancer β†’ App Server
App Server β†’ Database (read/write)
App Server β†’ Cache (for hot URLs)
Focus: Make functional requirements work!

3.4 Database Selection

SQL
β€’ ACID compliance
β€’ Complex queries
β€’ Strong consistency
NoSQL
β€’ Horizontal scaling
β€’ Flexible schema
β€’ Eventual consistency

Step 4: Refinement for Scale (10-15 mins)

⚑ Scale & Performance Optimization

4.1 Scale Architecture

Now address non-functional requirements:
β€’ Multiple Load Balancers (availability)
β€’ Auto-scaling Application Servers
β€’ Multi-level Caching (CDN + Redis)
β€’ Database Replication (Master-Slave)
β€’ Database Sharding (for 100M+ URLs)
β€’ Message Queue (analytics processing)

4.2 Performance Optimizations

Read Path (10B/day):

1. CDN cache (popular URLs)
2. Application cache (Redis)
3. Database read replicas
4. Geographic distribution
5. <100ms latency target

Write Path (100M/day):

1. Rate limiting (per user)
2. Database sharding (by short_code)
3. Async analytics processing
4. Write-through cache
5. Batch writes for analytics
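The per-user rate limiting in the write path can be sketched as a token bucket (in production the bucket state would live in Redis, keyed per user, rather than in process memory; the rate and capacity here are illustrative):

```python
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate = rate            # tokens refilled per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill proportionally to the time elapsed since the last call.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```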

4.3 Scaling Strategies

Database Scaling:
β€’ Read replicas (handle 100:1 ratio)
β€’ Sharding by short_code hash
β€’ Connection pooling
β€’ Indexing on short_code
Caching Strategy:
β€’ CDN for popular URLs (80/20 rule)
β€’ Redis for recent URLs
β€’ Application-level caching
β€’ Write-through for new URLs
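The read side of this strategy can be sketched as cache-aside over a small LRU (`db` stands in for the real datastore; a production deployment would use Redis with a TTL instead of an in-process dict):

```python
from collections import OrderedDict

class LruCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used

def resolve(short_code, cache, db):
    url = cache.get(short_code)
    if url is None:          # cache miss: fall back to DB, then backfill
        url = db[short_code]
        cache.put(short_code, url)
    return url
```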

Step 5: Edge Cases & Failures (10 mins)

πŸ›‘οΈ Resilience & Failure Handling

Edge Cases & Failure Scenarios

β€’ Hotspots: Celebrity problem, viral content
β€’ Race conditions: Distributed locks
β€’ Server failure: Health checks, auto-restart
β€’ Database failure: Replicas, failover
β€’ Network partition: CAP theorem trade-offs
β€’ Cascading failures: Circuit breakers
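The circuit-breaker idea from the last bullet, as a minimal sketch (thresholds and timeouts are illustrative): after a run of consecutive failures the circuit opens and calls fail fast, protecting the struggling downstream service.

```python
import time

class CircuitBreaker:
    def __init__(self, threshold: int = 3, reset_after: float = 30.0):
        self.threshold = threshold      # consecutive failures before opening
        self.reset_after = reset_after  # seconds before a half-open retry
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None       # half-open: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0               # success resets the failure count
        return result
```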

Performance Refinements

β€’ Caching: Multi-level (CDN, Redis, application)
β€’ Database: Indexing, query optimization
β€’ Load balancing: Geographic distribution
β€’ Async processing: Queue for heavy operations
β€’ Data partitioning: Sharding strategies
β€’ Connection pooling: Resource management

Caching Strategy

CDN

Static content, edge caching

Application Cache

Redis/Memcached for hot data

Database Cache

Query result caching

Database Scaling

Vertical Scaling
β€’ Upgrade hardware (CPU, RAM)
β€’ Limited by single machine
β€’ Simple but expensive
Horizontal Scaling
β€’ Replication (Master-Slave)
β€’ Sharding (partition data)
β€’ Federation (split databases)

Load Distribution

β€’ Load Balancer: Round-robin, least connections, IP hash
β€’ Message Queue: Decouple components, handle spikes
β€’ Service Mesh: Microservices communication

Step 6: Deep Dives (Time Permitting)

πŸ” Component Deep Dives

Based on interviewer interest or remaining time, dive deep into specific components:

Algorithm Details

β€’ Consistent hashing implementation
β€’ Rate limiting algorithms
β€’ Ranking/recommendation logic
β€’ Consensus protocols (Raft, Paxos)
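A consistent-hashing sketch with virtual nodes, the usual way to assign keys to shards so that adding or removing a shard only remaps a small fraction of keys (node names and the vnode count are illustrative):

```python
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes, vnodes: int = 100):
        # Each physical node gets `vnodes` points on the ring,
        # smoothing the key distribution across shards.
        self.ring = []   # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):
                self.ring.append((self._hash(f"{node}:{i}"), node))
        self.ring.sort()

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first ring point at or after the key's hash.
        h = self._hash(key)
        idx = bisect.bisect(self.ring, (h, "")) % len(self.ring)
        return self.ring[idx][1]
```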

Data Structures

β€’ Bloom filters for deduplication
β€’ LSM trees for write-heavy loads
β€’ B-trees for indexing
β€’ Tries for autocomplete
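A Bloom filter sketch for the deduplication use case: membership tests may report false positives but never false negatives, so a "no" answer safely skips a database lookup. Sizes here are illustrative; real deployments derive bit and hash counts from the expected item count and target false-positive rate.

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits: int = 1 << 16, hashes: int = 4):
        self.size = size_bits
        self.hashes = hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        # Derive k positions by salting the item with the hash index.
        for i in range(self.hashes):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```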

Specific Optimizations

β€’ Database query optimization
β€’ Connection pooling
β€’ Batch processing strategies
β€’ Compression techniques

Trade-off Analysis

β€’ SQL vs NoSQL for specific use case
β€’ Synchronous vs asynchronous processing
β€’ Push vs pull architecture
β€’ Monolith vs microservices

Time Management Guide

⏱️ 45-Minute Interview Timeline

0-5 min
S - Scope: Functional & non-functional requirements
5-15 min
A - API & Entities: Data model, API design, calculations
15-25 min
C - Core Design: Basic architecture for functional requirements
25-35 min
R - Refinement: Scale optimization, performance tuning
35-40 min
E - Edge Cases: Failures, monitoring, security
40-45 min
D - Deep Dives: Specific components, algorithms, Q&A

Common Pitfalls to Avoid

❌ What NOT to Do

Design Mistakes

❌ Jumping to implementation details too early
❌ Over-engineering for unlikely scenarios
❌ Ignoring data consistency requirements
❌ Not considering failure modes
❌ Forgetting about data growth over time

Communication Mistakes

❌ Not asking clarifying questions
❌ Making assumptions without stating them
❌ Getting stuck on one approach
❌ Not explaining trade-offs
❌ Avoiding areas you're unsure about

Framework Application Examples

πŸ“š Applied to Common Problems

URL Shortener

S: 100M URLs/day, <100ms latency
A: URL entity, POST/GET APIs
C: LB→App→DB+Cache basic flow
R: CDN, sharding, read replicas
E: Rate limiting, failover
D: Base62 encoding, Snowflake ID
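The Base62 deep dive in a few lines: encode a numeric ID (auto-increment or Snowflake) into a short code. Six Base62 characters cover 62^6 β‰ˆ 56.8B IDs, comfortably more than a year of URLs at 100M/day.

```python
import string

# 0-9, a-z, A-Z: the order is a convention; pick one and keep it stable,
# since decoding depends on it.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def base62_encode(n: int) -> str:
    if n == 0:
        return ALPHABET[0]
    out = []
    while n > 0:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))
```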

Chat Application

S: Real-time, 1M users
A: User, Message entities, WebSocket
C: WebSocket servers + message queue
R: Connection pooling, partitioning
E: Offline sync, delivery guarantees
D: Message ordering, encryption

Video Streaming

S: Netflix-like, global scale
A: Video, User entities, streaming APIs
C: Upload→Encode→CDN→Stream
R: Global CDN, adaptive bitrates
E: Buffering, bandwidth limits
D: Video encoding, caching strategy

Quick Reference Card

πŸ“‹ Interview Checklist

Before You Start Drawing

☐ Clarified functional requirements
☐ Clarified scale and performance needs
☐ Stated your assumptions
☐ Did back-of-envelope math

Core Design Must-Haves

☐ Data model defined
☐ API endpoints listed
☐ High-level architecture drawn
☐ Database choice justified

Scaling Considerations

☐ Identified bottlenecks
☐ Added caching layers
☐ Discussed database scaling
☐ Considered geographic distribution

Production Readiness

☐ Handled failure scenarios
☐ Added monitoring/alerting
☐ Discussed security concerns
☐ Considered edge cases

πŸ’‘ Remember: System design is about trade-offs. There's no perfect solution. Always explain your reasoning, acknowledge alternatives, and be prepared to adjust based on new requirements or constraints.