Communication Patterns
Understanding different communication patterns is crucial for system design. Each pattern has its own trade-offs in terms of latency, complexity, resource usage, and real-time capabilities.
Table of Contents
Request-Response Patterns
- HTTP/HTTPS (Traditional)
- Long Polling
Streaming & RPC Patterns
- Server-Sent Events (SSE)
- WebSocket (Bidirectional)
- gRPC (HTTP/2 Streaming)
- WebRTC (Peer-to-Peer)
1. HTTP/HTTPS (Traditional Request-Response)
Standard HTTP/HTTPS Communication
✅ Advantages
- Simple and well-understood
- Stateless protocol
- Wide browser/server support
- Works with proxies/firewalls
❌ Disadvantages
- Connection overhead for each request
- No real-time updates
- Client must initiate all communication
- Inefficient for frequent updates
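A minimal request-response round trip using only Python's standard library (the endpoint and payload are illustrative) shows the per-request setup that makes this pattern expensive for frequent updates: every request opens a connection, sends one request, and reads one response.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b'{"status": "ok"}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client initiates; server can only answer, never push.
conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
conn.request("GET", "/status")
resp = conn.getresponse()
print(resp.status, resp.read().decode())  # → 200 {"status": "ok"}
conn.close()
server.shutdown()
```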
2. Long Polling
Long Polling Pattern
✅ Advantages
- Near real-time updates
- Works with HTTP/1.1
- No special protocols needed
- Reduces unnecessary polling
❌ Disadvantages
- Keeps connections open (resource intensive)
- Reconnection overhead
- Timeout handling complexity
- Not truly bidirectional
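The pattern can be sketched as a client loop that immediately re-issues a request whenever the held connection times out or delivers data; the `fetch` function here is a stand-in for a real HTTP call to a server that holds requests open.

```python
import itertools

def long_poll(fetch, handle, max_polls=5):
    """Repeatedly issue requests the server holds open until it has data
    (or times out), then reconnect right away."""
    for _ in range(max_polls):
        try:
            update = fetch(timeout=30)  # server holds the request open
        except TimeoutError:
            continue  # empty poll: reconnect immediately
        handle(update)

# Simulated server: two empty polls (timeouts), then three updates.
responses = itertools.chain(
    [TimeoutError(), TimeoutError()],
    ["price=101", "price=102", "price=103"],
)

def fake_fetch(timeout):
    item = next(responses)
    if isinstance(item, TimeoutError):
        raise item
    return item

received = []
long_poll(fake_fetch, received.append)
print(received)  # → ['price=101', 'price=102', 'price=103']
```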
3. Server-Sent Events (SSE)
Server-Sent Events (EventSource)
✅ Advantages
- Real-time server-to-client updates
- Automatic reconnection
- Simple API (EventSource)
- Works over HTTP
- Built-in event IDs for replay
❌ Disadvantages
- Server-to-client only (unidirectional)
- Text data only (no binary)
- Browsers cap ~6 connections per domain over HTTP/1.1
- No custom request headers with the EventSource API
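The text/event-stream wire format the EventSource API consumes is simple enough to parse by hand. The sketch below handles the `data`, `event`, and `id` fields (simplified relative to the full spec, which also defines `retry` and comment lines); the `id` field is what enables replay via the Last-Event-ID header on reconnect.

```python
def parse_sse(stream: str):
    """Parse a Server-Sent Events stream. Events are separated by blank
    lines; each field line is 'field: value'."""
    events = []
    current = {"event": "message", "data": [], "id": None}
    for line in stream.splitlines():
        if line == "":  # blank line dispatches the accumulated event
            if current["data"]:
                events.append({"event": current["event"],
                               "data": "\n".join(current["data"]),
                               "id": current["id"]})
            current = {"event": "message", "data": [], "id": current["id"]}
        elif line.startswith("data:"):
            current["data"].append(line[5:].lstrip())
        elif line.startswith("event:"):
            current["event"] = line[6:].lstrip()
        elif line.startswith("id:"):
            current["id"] = line[3:].lstrip()
    return events

stream = 'id: 1\nevent: tick\ndata: {"price": 101}\n\nid: 2\ndata: hello\n\n'
for ev in parse_sse(stream):
    print(ev["id"], ev["event"], ev["data"])
```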
4. WebSocket
WebSocket (Full Duplex)
✅ Advantages
- Full-duplex bidirectional communication
- Low latency real-time updates
- Supports binary data
- Less overhead than HTTP polling
- Wide browser support
❌ Disadvantages
- More complex to implement
- Requires stateful connections
- Proxy/firewall complications
- No automatic reconnection
- Scaling challenges (sticky sessions)
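The upgrade handshake that turns an HTTP connection into a WebSocket is easy to illustrate: per RFC 6455, the server answers the client's Sec-WebSocket-Key header with a SHA-1-based accept token, proving it actually speaks the protocol.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept header value for a client key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode()).digest()
    return base64.b64encode(digest).decode()

# Test vector from RFC 6455, section 1.3:
print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))
# → s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```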
5. gRPC (Remote Procedure Call)
gRPC - High Performance RPC
✅ Advantages
- High performance (binary Protocol Buffers are typically much faster than JSON over HTTP/1.1)
- Strongly typed with Protocol Buffers
- Built-in streaming support
- HTTP/2 multiplexing
- Auto-generated client/server code
- Efficient binary serialization
- Language agnostic
❌ Disadvantages
- Not browser-native (needs a proxy such as gRPC-Web)
- Binary format not human-readable
- Steeper learning curve
- Limited ecosystem vs REST
- Requires HTTP/2 support
- Schema versioning complexity
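As a sketch, the streaming shapes mentioned above are declared directly in a Protocol Buffers service definition; the `PriceFeed` service and its messages are hypothetical, but the three RPC shapes (unary, server streaming, bidirectional streaming) are standard.

```protobuf
syntax = "proto3";

service PriceFeed {
  // Unary: one request, one response.
  rpc GetQuote (QuoteRequest) returns (Quote);
  // Server streaming: one request, a stream of responses.
  rpc StreamQuotes (QuoteRequest) returns (stream Quote);
  // Bidirectional streaming: both sides stream independently.
  rpc Trade (stream Order) returns (stream Fill);
}

message QuoteRequest { string symbol = 1; }
message Quote  { string symbol = 1; double price = 2; }
message Order  { string symbol = 1; int32 quantity = 2; }
message Fill   { string order_id = 1; double price = 2; }
```

From this one file, gRPC tooling generates typed client and server stubs in each target language.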
6. WebRTC (Peer-to-Peer)
WebRTC - P2P Communication over UDP
📌 Key Point: WebRTC uses UDP (not TCP) for media transport to achieve ultra-low latency. UDP's lack of guaranteed delivery is actually beneficial for real-time media where dropping old packets is better than waiting for retransmission.
🌐 Deep Dive: STUN vs TURN Servers
🔍 STUN Server
Purpose: NAT Discovery & Public IP Detection
- Tells clients their public IP address
- Determines NAT type and behavior
- Enables direct P2P connections
- Low cost - just UDP packet reflection
- Works for ~80% of connections
Example: stun.l.google.com:19302
🔄 TURN Server
Purpose: Relay Server (Fallback)
- Relays traffic when direct P2P fails
- Handles symmetric NATs & firewalls
- Uses server bandwidth for media
- Higher latency than direct connection
- Required for ~20% of connections
Fallback: When STUN-assisted P2P fails
🔗 The Connection Process: Peers first contact STUN to discover their public endpoints. They exchange these "ICE candidates" via the signaling server. If direct UDP connection succeeds, they communicate P2P. If blocked by NAT/firewalls, they fall back to TURN relay.
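The ICE candidates peers exchange are just text lines carried over the signaling channel. A simplified parser (ignoring the optional attributes a real stack handles) shows what peers actually trade: `host` candidates are local interfaces, `srflx` candidates are public addresses learned via STUN, and `relay` candidates are TURN addresses.

```python
def parse_ice_candidate(line: str) -> dict:
    """Parse the core fields of an ICE candidate line.
    Format: candidate:<foundation> <component> <transport> <priority>
            <ip> <port> typ <type> ..."""
    fields = line.split()
    return {
        "foundation": fields[0].split(":", 1)[1],
        "component": int(fields[1]),
        "transport": fields[2],
        "priority": int(fields[3]),
        "ip": fields[4],
        "port": int(fields[5]),
        "type": fields[fields.index("typ") + 1],
    }

c = parse_ice_candidate("candidate:1 1 UDP 2130706431 192.168.1.5 54400 typ host")
print(c["type"], c["ip"], c["port"])  # → host 192.168.1.5 54400
```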
✅ Advantages
- Lowest latency (direct P2P)
- No server bandwidth for media
- Built-in audio/video codecs
- End-to-end encryption
- NAT traversal capabilities
❌ Disadvantages
- Complex setup (SDP, ICE, STUN/TURN)
- Requires signaling server
- May need TURN relay servers
- Browser compatibility issues
- Difficult debugging
Comparison Table
| Pattern | Direction | Real-time | Protocol | Use Cases |
|---|---|---|---|---|
| HTTP/HTTPS | Request-Response | No | HTTP | REST APIs, Web pages |
| Long Polling | Server → Client (client-initiated) | Near real-time | HTTP | Notifications, Updates |
| SSE | Server → Client | Yes | HTTP | Live feeds, Dashboards |
| WebSocket | Bidirectional | Yes | WS/WSS | Chat, Gaming, Trading |
| gRPC | Bidirectional Streaming | Yes | HTTP/2 | Microservices, APIs |
| WebRTC | P2P Bidirectional | Yes | SRTP/SCTP | Video calls, Screen sharing |
When to Use What?
Use HTTP/HTTPS
- • RESTful APIs
- • CRUD operations
- • File uploads/downloads
- • Traditional web apps
Use Long Polling
- • Fallback for older browsers
- • Simple notifications
- • Infrequent updates
- • Behind restrictive proxies
Use SSE
- • Live news feeds
- • Stock price updates
- • Server monitoring
- • Progress indicators
Use WebSocket
- • Chat applications
- • Multiplayer games
- • Collaborative editing
- • Trading platforms
Use gRPC
- • Microservice communication
- • Internal APIs
- • High-performance systems
- • Mobile app backends
Use WebRTC
- • Video conferencing
- • Screen sharing
- • P2P file transfer
- • Low-latency gaming
Performance & Resource Comparison
Resource Usage & Performance
| Metric | HTTP | Long Poll | SSE | WebSocket | gRPC | WebRTC |
|---|---|---|---|---|---|---|
| Latency | High | Medium | Low | Very Low | Very Low | Ultra Low |
| Server Resources | Low | High | Medium | Medium | Low | Low* |
| Complexity | Simple | Simple | Simple | Medium | Medium | Complex |
| Scalability | Excellent | Poor | Good | Medium | Excellent | Good** |
* Low server resources for media after connection establishment
** Good scalability for P2P, but TURN servers may be needed
Summary
Each communication pattern serves different needs:
- HTTP/HTTPS: Best for traditional request-response scenarios
- Long Polling: Simple real-time updates when WebSocket isn't available
- SSE: Perfect for server-to-client streaming (news feeds, notifications)
- WebSocket: Ideal for bidirectional real-time communication (chat, gaming)
- gRPC: Excellent for high-performance microservice communication
- WebRTC: Essential for peer-to-peer media streaming (video calls)
Choose based on your specific requirements for latency, directionality, browser support, and infrastructure complexity.
Communication Reliability Patterns
Communication protocols are only as reliable as the patterns you use to handle failures. These patterns ensure your systems gracefully handle network partitions, service outages, and cascading failures.
🔄 Timeouts & Retries
📌 Key Point: Timeouts prevent hanging requests, retries handle transient failures, but both need careful configuration to avoid making things worse.
⏱️ Timeout Strategies
- Connection Timeout: 3-5 seconds
- Read Timeout: 10-30 seconds
- Total Request Timeout: 1-2 minutes
- Service-specific: Database (5s), Cache (100ms)
🔁 Retry Strategies
- Exponential Backoff: 1s, 2s, 4s, 8s...
- Jitter: ±25% randomization
- Max Retries: Usually 3-5 attempts
- Retry Budget: Limit retry rate (e.g., 10%)
🎲 Deep Dive: Why Jitter Prevents Retry Storms
❌ Without Jitter
All clients retry at exactly the same intervals. Result: synchronized load spikes overwhelm the recovering service.
✅ With ±25% Jitter
Randomized retry intervals spread the load. Result: a smooth load distribution gives the service room to recover.
🔧 Implementation Example
Adding ±25% jitter spreads a 1000 ms retry into the 750-1250 ms range, preventing a thundering herd.
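A minimal sketch of exponential backoff with ±25% jitter (the function name and 30-second cap are illustrative):

```python
import random

def backoff_with_jitter(attempt: int, base_ms: int = 1000,
                        cap_ms: int = 30_000, jitter: float = 0.25) -> float:
    """Exponential backoff (1s, 2s, 4s, ...) capped at cap_ms,
    then randomized by +/- jitter to desynchronize clients."""
    delay = min(cap_ms, base_ms * 2 ** attempt)
    return delay * random.uniform(1 - jitter, 1 + jitter)

# Attempt 0 lands anywhere in 750-1250 ms; attempt 3 in 6000-10000 ms.
for attempt in range(4):
    print(f"attempt {attempt}: {backoff_with_jitter(attempt):.0f} ms")
```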
⚠️ Common Mistakes
- Retry Storms: All clients retry simultaneously
- No Jitter: Synchronized retries amplify load spikes
- Retrying Non-Idempotent Operations: Payment processing, user creation
- No Circuit Breaking: Retrying when service is clearly down
⚡ Circuit Breaker Pattern
✅ Benefits
- Prevents cascading failures
- Fast failure responses
- Automatic recovery testing
- Resource protection
🔧 Configuration
- Failure threshold: 5 failures
- Success threshold: 3 successes
- Timeout: 60 seconds
- Request volume: 20 requests
🎯 Use Cases
- Database connections
- External API calls
- Microservice communication
- Third-party integrations
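A minimal circuit breaker sketch using the thresholds above (class and state names are illustrative): CLOSED trips to OPEN after consecutive failures, OPEN fails fast until a cooldown elapses, then HALF_OPEN probes the service and closes again after enough successes.

```python
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=5, success_threshold=3,
                 timeout_s=60.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.timeout_s = timeout_s
        self.clock = clock
        self.state = "CLOSED"
        self.failures = 0
        self.successes = 0
        self.opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        if self.state == "OPEN":
            if self.clock() - self.opened_at >= self.timeout_s:
                self.state = "HALF_OPEN"  # cooldown over: probe the service
                self.successes = 0
            else:
                raise RuntimeError("circuit open: failing fast")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self):
        self.failures += 1
        if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
            self.state = "OPEN"
            self.opened_at = self.clock()

    def _on_success(self):
        self.failures = 0
        if self.state == "HALF_OPEN":
            self.successes += 1
            if self.successes >= self.success_threshold:
                self.state = "CLOSED"

breaker = CircuitBreaker(failure_threshold=2, timeout_s=60)

def flaky():
    raise ConnectionError("service down")

for _ in range(2):
    try:
        breaker.call(flaky)
    except ConnectionError:
        pass
print(breaker.state)  # → OPEN: further calls fail fast, sparing the service
```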
🌊 Cascading Failures
📌 Definition: When the failure of one component causes the failure of other components, creating a domino effect that can bring down entire systems.
Common Cascading Failure Scenarios:
🔥 Retry Amplification
Service A fails → All clients retry → 10x load on Service A → Service A crashes completely
⚡ Resource Exhaustion
Database slow → Connection pool exhausted → Web servers hang → Load balancer marks them unhealthy
📈 Thundering Herd
Cache expires → All requests hit database → Database overloaded → More cache misses
🛡️ Prevention Strategies
- Bulkhead Pattern: Isolate critical resources
- Load Shedding: Drop low-priority requests
- Graceful Degradation: Reduce functionality, stay up
- Health Checks: Proactive failure detection
- Rate Limiting: Protect downstream services
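One of these strategies, rate limiting with load shedding, can be sketched as a token bucket that drops excess requests instead of queueing them (the rate and capacity values are illustrative):

```python
class TokenBucket:
    """Token-bucket limiter: refills at `rate` tokens/sec up to `capacity`.
    A request without an available token is shed, not queued."""
    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # shed the request to protect downstream services

bucket = TokenBucket(rate=10, capacity=5)  # 10 req/s sustained, bursts of 5
allowed = sum(bucket.allow(now=0.0) for _ in range(20))
print(allowed)  # → 5: burst capacity admitted, the other 15 requests shed
```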
🚨 Detection & Recovery
- Monitoring: Error rates, latency, throughput
- Alerting: Early warning systems
- Automated Rollback: Quick recovery
- Chaos Engineering: Proactive failure testing
- Runbooks: Clear incident response
💀 Dead Letter Queues
📌 Purpose: Capture messages that cannot be processed after multiple retry attempts, preventing message loss and infinite retry loops.
✅ Benefits
- Prevents message loss
- Stops infinite retries
- Enables failure analysis
- Manual recovery possible
- Maintains system stability
🔄 Processing Flow
1. Message fails processing
2. Retry with exponential backoff
3. After max retries → DLQ
4. Monitor DLQ for patterns
5. Fix issues and replay
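This flow can be sketched with an in-memory queue (the handler and the "poison" message are hypothetical; real systems would also apply backoff between retries):

```python
from collections import deque

MAX_ATTEMPTS = 3

def process_with_dlq(messages, handler):
    """Run each message through `handler`; after MAX_ATTEMPTS failures the
    message goes to the dead letter queue instead of retrying forever."""
    queue = deque((msg, 0) for msg in messages)
    dead_letter_queue = []
    while queue:
        msg, attempts = queue.popleft()
        try:
            handler(msg)
        except Exception as exc:
            attempts += 1
            if attempts >= MAX_ATTEMPTS:
                dead_letter_queue.append({"message": msg,
                                          "error": str(exc),
                                          "attempts": attempts})
            else:
                queue.append((msg, attempts))  # retry later
    return dead_letter_queue

def handler(msg):
    if msg == "poison":  # a message that can never be processed
        raise ValueError("malformed payload")

dlq = process_with_dlq(["ok-1", "poison", "ok-2"], handler)
print(dlq)  # → [{'message': 'poison', 'error': 'malformed payload', 'attempts': 3}]
```

Captured entries keep the error and attempt count, which is what makes later failure analysis and replay possible.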