Communication Patterns

Understanding different communication patterns is crucial for system design. Each pattern has its own trade-offs in terms of latency, complexity, resource usage, and real-time capabilities.

Table of Contents

Overview of Communication Patterns

Request-Response Patterns

  • HTTP/HTTPS (Traditional)
  • Long Polling
  • HTTP/2 Server Push

Streaming Patterns

  • Server-Sent Events (SSE)
  • WebSocket (Bidirectional)
  • WebRTC (Peer-to-Peer)

1. HTTP/HTTPS (Traditional Request-Response)

Standard HTTP/HTTPS Communication

Client ↔ Server sequence:

1. TCP 3-way handshake: SYN → SYN-ACK → ACK
2. TLS handshake (HTTPS): certificate exchange
3. HTTP request: GET /api/data HTTP/1.1
4. HTTP response: HTTP/1.1 200 OK + data
5. Connection close: TCP 4-way handshake

One request = one response = connection closed
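
A minimal client-side sketch of this request-response cycle using the fetch API; the /api/data endpoint is just a placeholder:

// One request, one response: send the request, wait for the full response, done.
async function fetchData(): Promise<unknown> {
  const response = await fetch("https://example.com/api/data", {
    method: "GET",
    headers: { Accept: "application/json" },
  });
  if (!response.ok) {
    throw new Error(`HTTP ${response.status}`);
  }
  return response.json(); // parse the body once the full response has arrived
}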

✅ Advantages

  • Simple and well-understood
  • Stateless protocol
  • Wide browser/server support
  • Works with proxies/firewalls

❌ Disadvantages

  • Connection overhead for each request
  • No real-time updates
  • Client must initiate all communication
  • Inefficient for frequent updates

2. Long Polling

Long Polling Pattern

Client ↔ Server sequence:

1. Client sends a request (timeout: 30s); the server holds the connection open (waiting...)
2. An event occurs on the server
3. Server responds immediately: HTTP 200 + event data
4. Client sends a new request immediately; if no events arrive within the 30s timeout...
5. Server returns an empty response: HTTP 204 No Content

Connection held open until data is available or the timeout expires
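
A rough client-side sketch of this loop, assuming a hypothetical /events endpoint that holds the request open and answers 200 with data or 204 on timeout:

// Repeatedly issue a request that the server holds open until data arrives or times out.
async function longPoll(handle: (event: unknown) => void): Promise<void> {
  while (true) {
    try {
      const response = await fetch("https://example.com/events");
      if (response.status === 200) {
        handle(await response.json()); // event delivered
      }
      // 204 No Content: the server timed out with no events; loop and reconnect
    } catch {
      // network error: back off briefly before reconnecting
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }
}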

✅ Advantages

  • Near real-time updates
  • Works with HTTP/1.1
  • No special protocols needed
  • Reduces unnecessary polling

❌ Disadvantages

  • Keeps connections open (resource intensive)
  • Reconnection overhead
  • Timeout handling complexity
  • Not truly bidirectional

3. Server-Sent Events (SSE)

Server-Sent Events (EventSource)

Client ↔ Server sequence:

1. Client connects with EventSource: GET /events, Accept: text/event-stream
2. Server responds: HTTP 200 OK, Content-Type: text/event-stream (persistent keep-alive connection)
3. Server pushes events over the open connection:
   data: {"message": "update1"}
   data: {"message": "update2"}
   data: {"message": "update3"}
4. Periodic heartbeat comments (": keep-alive") keep the connection from timing out

One-way stream: Server → Client (HTTP connection stays open)
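
On the client this maps directly onto the EventSource API; a minimal sketch (the /events URL is a placeholder):

// The browser opens GET /events with Accept: text/event-stream and keeps it open.
const source = new EventSource("https://example.com/events");

source.onmessage = (event: MessageEvent) => {
  // Each "data: {...}" line from the server arrives as one message
  console.log("update:", JSON.parse(event.data));
};

source.onerror = () => {
  // EventSource reconnects automatically; this fires when the connection drops
  console.warn("connection lost, retrying...");
};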

✅ Advantages

  • Real-time server-to-client updates
  • Automatic reconnection
  • Simple API (EventSource)
  • Works over HTTP
  • Built-in event IDs for replay

❌ Disadvantages

  • Server-to-client only (unidirectional)
  • Text data only (no binary)
  • Limited to ~6 connections per domain over HTTP/1.1
  • No custom request headers via the EventSource API

4. WebSocket

WebSocket (Full Duplex)

Client ↔ Server sequence:

1. Client requests an upgrade: GET /ws HTTP/1.1, Upgrade: websocket
2. Server switches protocols: HTTP 101 Switching Protocols, Upgrade: websocket
3. Full-duplex WebSocket connection; either side can send at any time:
   Client: {"action": "subscribe"}
   Server: {"data": "real-time update"}
   Client: {"action": "send_message"}
   Server broadcast: {"type": "broadcast"}
4. Ping/Pong frames keep the connection alive

Bidirectional: Client ↔ Server (persistent TCP connection)
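
A minimal browser-side sketch of this exchange; the wss://example.com/ws endpoint and message shapes are placeholders:

// The constructor performs the HTTP Upgrade handshake shown above.
const ws = new WebSocket("wss://example.com/ws");

ws.onopen = () => {
  // Client → Server: subscribe once the connection is established
  ws.send(JSON.stringify({ action: "subscribe" }));
};

ws.onmessage = (event: MessageEvent) => {
  // Server → Client: real-time updates arrive as frames on the same connection
  console.log("server says:", JSON.parse(event.data));
};

ws.onclose = () => {
  // No built-in reconnection: the application has to reconnect itself
  console.warn("socket closed");
};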

✅ Advantages

  • Full-duplex bidirectional communication
  • Low latency real-time updates
  • Supports binary data
  • Less overhead than HTTP polling
  • Wide browser support

❌ Disadvantages

  • More complex to implement
  • Requires stateful connections
  • Proxy/firewall complications
  • No automatic reconnection
  • Scaling challenges (sticky sessions)

5. gRPC (Remote Procedure Call)

gRPC - High Performance RPC

gRPC client ↔ gRPC server over a single multiplexed HTTP/2 connection.

4 RPC types:
1. Unary: request → response
2. Server streaming: request → stream
3. Client streaming: stream → response
4. Bidirectional streaming: stream ↔ stream

Messages use Protocol Buffers: binary serialization, strongly typed
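
The four RPC shapes can be sketched as a hypothetical TypeScript client interface; the service and method names are illustrative, not generated from a real .proto file:

interface UserRequest { id: number }
interface UserReply { id: number; name: string }

// How the four gRPC call types typically surface in a generated client.
interface UserServiceClient {
  // 1. Unary: one request, one response
  getUser(req: UserRequest): Promise<UserReply>;
  // 2. Server streaming: one request, a stream of responses
  watchUser(req: UserRequest): AsyncIterable<UserReply>;
  // 3. Client streaming: a stream of requests, one response
  importUsers(reqs: AsyncIterable<UserRequest>): Promise<UserReply>;
  // 4. Bidirectional streaming: both sides stream independently
  syncUsers(reqs: AsyncIterable<UserRequest>): AsyncIterable<UserReply>;
}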

✅ Advantages

  • High performance (often several times faster than JSON over HTTP/1.1)
  • Strongly typed with Protocol Buffers
  • Built-in streaming support
  • HTTP/2 multiplexing
  • Auto-generated client/server code
  • Efficient binary serialization
  • Language agnostic

❌ Disadvantages

  • Not browser-native (requires gRPC-Web or a proxy)
  • Binary format not human-readable
  • Steeper learning curve
  • Limited ecosystem vs REST
  • Requires HTTP/2 support
  • Schema versioning complexity

6. WebRTC (Peer-to-Peer)

WebRTC - P2P Communication over UDP

📌 Key Point: WebRTC uses UDP (not TCP) for media transport to achieve ultra-low latency. UDP's lack of guaranteed delivery is actually beneficial for real-time media where dropping old packets is better than waiting for retransmission.

WebRTC connection establishment with STUN. Participants: Peer A (behind NAT, 192.168.1.100), a signaling server (WebSocket/HTTP), a STUN server (stun.l.google.com:19302), and Peer B (behind NAT, 10.0.0.50).

1. Both peers send STUN requests to discover their public endpoints ("Your IP: 203.0.113.45:51234" for Peer A, "Your IP: 198.51.100.78:43210" for Peer B)
2. Peer A creates an offer + ICE candidates (public IP from STUN)
3. Peer A sends the offer to the signaling server
4. The signaling server forwards the offer to Peer B
5. Peer B creates an answer + ICE candidates
6. Peer B sends the answer to the signaling server
7. The signaling server forwards the answer to Peer A
8. Direct P2P connection (UDP): audio/video over SRTP, data channel over SCTP, 🔒 DTLS encryption for security

Peer-to-Peer: direct connection after signaling (no server for data)

🌐 Deep Dive: STUN vs TURN Servers

🔍 STUN Server

Purpose: NAT Discovery & Public IP Detection

  • Tells clients their public IP address
  • Determines NAT type and behavior
  • Enables direct P2P connections
  • Low cost - just UDP packet reflection
  • Works for ~80% of connections

Example: stun.l.google.com:19302

🔄 TURN Server

Purpose: Relay Server (Fallback)

  • Relays traffic when direct P2P fails
  • Handles symmetric NATs & firewalls
  • Uses server bandwidth for media
  • Higher latency than direct connection
  • Required for ~20% of connections

Fallback: When STUN-assisted P2P fails

🔗 The Connection Process: Peers first contact STUN to discover their public endpoints. They exchange these "ICE candidates" via the signaling server. If direct UDP connection succeeds, they communicate P2P. If blocked by NAT/firewalls, they fall back to TURN relay.
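
A condensed sketch of the offer side of this process using the browser RTCPeerConnection API; the signaling transport (sendToPeer) is a placeholder you would implement over WebSocket or HTTP:

// The STUN server lets the browser discover its public IP/port (ICE candidates).
const pc = new RTCPeerConnection({
  iceServers: [{ urls: "stun:stun.l.google.com:19302" }],
});

// Hypothetical signaling helper: relay SDP/candidates via your own server.
declare function sendToPeer(message: object): void;

pc.onicecandidate = (event) => {
  if (event.candidate) sendToPeer({ candidate: event.candidate });
};

async function startCall(): Promise<void> {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true, video: true });
  stream.getTracks().forEach((track) => pc.addTrack(track, stream));

  const offer = await pc.createOffer();   // create offer
  await pc.setLocalDescription(offer);    // starts ICE candidate gathering
  sendToPeer({ offer });                  // send via the signaling server
  // The remote peer answers; apply it with pc.setRemoteDescription(answer)
}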

✅ Advantages

  • Lowest latency (direct P2P)
  • No server bandwidth for media
  • Built-in audio/video codecs
  • End-to-end encryption
  • NAT traversal capabilities

❌ Disadvantages

  • Complex setup (SDP, ICE, STUN/TURN)
  • Requires signaling server
  • May need TURN relay servers
  • Browser compatibility issues
  • Difficult debugging

Comparison Table

Pattern      | Direction               | Real-time      | Protocol  | Use Cases
HTTP/HTTPS   | Request-Response        | No             | HTTP      | REST APIs, Web pages
Long Polling | Client → Server         | Near real-time | HTTP      | Notifications, Updates
SSE          | Server → Client         | Yes            | HTTP      | Live feeds, Dashboards
WebSocket    | Bidirectional           | Yes            | WS/WSS    | Chat, Gaming, Trading
gRPC         | Bidirectional Streaming | Yes            | HTTP/2    | Microservices, APIs
WebRTC       | P2P Bidirectional       | Yes            | SRTP/SCTP | Video calls, Screen sharing

When to Use What?

Use HTTP/HTTPS

  • RESTful APIs
  • CRUD operations
  • File uploads/downloads
  • Traditional web apps

Use Long Polling

  • Fallback for older browsers
  • Simple notifications
  • Infrequent updates
  • Behind restrictive proxies

Use SSE

  • Live news feeds
  • Stock price updates
  • Server monitoring
  • Progress indicators

Use WebSocket

  • Chat applications
  • Multiplayer games
  • Collaborative editing
  • Trading platforms

Use gRPC

  • Microservice communication
  • Internal APIs
  • High-performance systems
  • Mobile app backends

Use WebRTC

  • Video conferencing
  • Screen sharing
  • P2P file transfer
  • Low-latency gaming

Performance & Resource Comparison

Resource Usage & Performance

Metric           | HTTP      | Long Poll | SSE    | WebSocket | gRPC      | WebRTC
Latency          | High      | Medium    | Low    | Very Low  | Very Low  | Ultra Low
Server Resources | Low       | High      | Medium | Medium    | Low       | Low*
Complexity       | Simple    | Simple    | Simple | Medium    | Medium    | Complex
Scalability      | Excellent | Poor      | Good   | Medium    | Excellent | Good**

  * Low server resources for media after connection establishment
  ** Good scalability for P2P, but TURN servers may be needed

Summary

Each communication pattern serves different needs:

  • HTTP/HTTPS: Best for traditional request-response scenarios
  • Long Polling: Simple real-time updates when WebSocket isn't available
  • SSE: Perfect for server-to-client streaming (news feeds, notifications)
  • WebSocket: Ideal for bidirectional real-time communication (chat, gaming)
  • gRPC: Excellent for high-performance microservice communication
  • WebRTC: Essential for peer-to-peer media streaming (video calls)

Choose based on your specific requirements for latency, directionality, browser support, and infrastructure complexity.

Communication Reliability Patterns

Communication protocols are only as reliable as the patterns you use to handle failures. These patterns ensure your systems gracefully handle network partitions, service outages, and cascading failures.

🔄 Timeouts & Retries

📌 Key Point: Timeouts prevent hanging requests, retries handle transient failures, but both need careful configuration to avoid making things worse.

⏱️ Timeout Strategies

  • Connection Timeout: 3-5 seconds
  • Read Timeout: 10-30 seconds
  • Total Request Timeout: 1-2 minutes
  • Service-specific: Database (5s), Cache (100ms)
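
A sketch of enforcing such a timeout on the client with AbortController; the 30-second default mirrors the read-timeout range above and the URL is a placeholder:

// Abort the request if the server has not responded within `timeoutMs`.
async function fetchWithTimeout(url: string, timeoutMs = 30_000): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetch(url, { signal: controller.signal });
  } finally {
    clearTimeout(timer); // always clean up the timer
  }
}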

🔁 Retry Strategies

  • Exponential Backoff: 1s, 2s, 4s, 8s...
  • Jitter: ±25% randomization
  • Max Retries: Usually 3-5 attempts
  • Retry Budget: Limit retry rate (e.g., 10%)

🎲 Deep Dive: Why Jitter Prevents Retry Storms

❌ Without Jitter

All clients retry at exact same intervals:

T+1s: 1000 clients retry
T+2s: 1000 clients retry
T+4s: 1000 clients retry

Result: Synchronized load spikes overwhelm recovering service

✅ With ±25% Jitter

Randomized retry intervals spread the load:

T+0.75-1.25s: Clients spread out
T+1.5-2.5s: Clients spread out
T+3-5s: Clients spread out

Result: Smooth load distribution allows recovery

🔧 Implementation Example
const baseDelay = Math.pow(2, attempt) * 1000 // 1s, 2s, 4s...
const jitter = baseDelay * 0.25 * (Math.random() * 2 - 1) // uniform within ±25% of baseDelay
const actualDelay = baseDelay + jitter

This spreads a 1000ms retry into the 750-1250ms range, preventing a thundering herd.
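
Putting the pieces together, a rough sketch of a full retry helper combining exponential backoff, ±25% jitter, and a max-attempt cap; only use it around idempotent operations (see the mistakes below):

async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxRetries = 4,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (attempt >= maxRetries) throw err;       // give up after the last retry
      const base = Math.pow(2, attempt) * 1000;   // 1s, 2s, 4s, 8s...
      const jitter = base * 0.25 * (Math.random() * 2 - 1); // ±25% jitter
      await new Promise((resolve) => setTimeout(resolve, base + jitter));
    }
  }
}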

⚠️ Common Mistakes

  • Retry Storms: All clients retry simultaneously
  • No Jitter: Synchronized retries amplify load spikes
  • Retrying Non-Idempotent Operations: Payment processing, user creation
  • No Circuit Breaking: Retrying when service is clearly down

⚡ Circuit Breaker Pattern

Circuit Breaker State Transitions

  • CLOSED: normal operation, requests pass through
  • OPEN: failing fast, no requests sent
  • HALF-OPEN: testing recovery, limited requests allowed

Transitions: CLOSED → OPEN when the failure threshold is reached; OPEN → HALF-OPEN when the timeout expires; HALF-OPEN → CLOSED when the success threshold is met; HALF-OPEN → OPEN if requests are still failing.
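
A minimal in-memory sketch of these transitions (thresholds match the configuration listed below; the request-volume threshold is omitted for brevity, so this is not production-ready):

class CircuitBreaker {
  private failures = 0;
  private successes = 0;
  private state: "CLOSED" | "OPEN" | "HALF_OPEN" = "CLOSED";
  private openedAt = 0;

  constructor(
    private failureThreshold = 5,
    private successThreshold = 3,
    private timeoutMs = 60_000,
  ) {}

  async call<T>(operation: () => Promise<T>): Promise<T> {
    if (this.state === "OPEN") {
      if (Date.now() - this.openedAt < this.timeoutMs) {
        throw new Error("circuit open: failing fast"); // no request sent
      }
      this.state = "HALF_OPEN"; // timeout expired: test recovery
    }
    try {
      const result = await operation();
      this.onSuccess();
      return result;
    } catch (err) {
      this.onFailure();
      throw err;
    }
  }

  private onSuccess(): void {
    if (this.state === "HALF_OPEN" && ++this.successes >= this.successThreshold) {
      this.state = "CLOSED"; // success threshold met
      this.successes = 0;
    }
    this.failures = 0;
  }

  private onFailure(): void {
    if (this.state === "HALF_OPEN" || ++this.failures >= this.failureThreshold) {
      this.state = "OPEN"; // still failing, or failure threshold reached
      this.openedAt = Date.now();
      this.failures = 0;
      this.successes = 0;
    }
  }
}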

✅ Benefits

  • Prevents cascading failures
  • Fast failure responses
  • Automatic recovery testing
  • Resource protection

🔧 Configuration

  • Failure threshold: 5 failures
  • Success threshold: 3 successes
  • Timeout: 60 seconds
  • Request volume: 20 requests

🎯 Use Cases

  • Database connections
  • External API calls
  • Microservice communication
  • Third-party integrations

🌊 Cascading Failures

📌 Definition: When the failure of one component causes the failure of other components, creating a domino effect that can bring down entire systems.

Common Cascading Failure Scenarios:

🔥 Retry Amplification

Service A fails → All clients retry → 10x load on Service A → Service A crashes completely

Solution: Exponential backoff, circuit breakers, retry budgets

⚡ Resource Exhaustion

Database slow → Connection pool exhausted → Web servers hang → Load balancer marks them unhealthy

Solution: Timeouts, connection limits, bulkhead pattern

📈 Thundering Herd

Cache expires → All requests hit database → Database overloaded → More cache misses

Solution: Cache warming, staggered expiration, circuit breakers
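
Staggered expiration is just a jittered TTL, so keys cached at the same moment do not all expire together; a small sketch (the cache.set call is a placeholder for whatever cache client you use):

// Spread a nominal 10-minute TTL over roughly 10-12 minutes per key.
function jitteredTtl(baseSeconds: number, spread = 0.2): number {
  return Math.round(baseSeconds * (1 + Math.random() * spread));
}

// e.g. cache.set("user:42", value, { ttl: jitteredTtl(600) });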

🛡️ Prevention Strategies

  • Bulkhead Pattern: Isolate critical resources
  • Load Shedding: Drop low-priority requests
  • Graceful Degradation: Reduce functionality, stay up
  • Health Checks: Proactive failure detection
  • Rate Limiting: Protect downstream services

🚨 Detection & Recovery

  • Monitoring: Error rates, latency, throughput
  • Alerting: Early warning systems
  • Automated Rollback: Quick recovery
  • Chaos Engineering: Proactive failure testing
  • Runbooks: Clear incident response

💀 Dead Letter Queues

📌 Purpose: Capture messages that cannot be processed after multiple retry attempts, preventing message loss and infinite retry loops.

✅ Benefits

  • Prevents message loss
  • Stops infinite retries
  • Enables failure analysis
  • Manual recovery possible
  • Maintains system stability

🔄 Processing Flow

  1. Message fails processing
  2. Retry with exponential backoff
  3. After max retries → DLQ
  4. Monitor DLQ for patterns
  5. Fix issues and replay
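
A rough consumer-side sketch of that flow; the message shape, deliveryCount field, and queue helpers are placeholders for whatever your broker provides:

interface Message { body: unknown; deliveryCount: number }

declare function processMessage(msg: Message): Promise<void>;
declare function sendToDeadLetterQueue(msg: Message, reason: string): Promise<void>;

const MAX_DELIVERIES = 5;

// Called for each delivery attempt by the (hypothetical) queue consumer.
async function handleDelivery(msg: Message): Promise<void> {
  try {
    await processMessage(msg);            // normal path: acknowledge on success
  } catch (err) {
    if (msg.deliveryCount >= MAX_DELIVERIES) {
      // Give up: park the message for analysis and manual replay
      await sendToDeadLetterQueue(msg, String(err));
    } else {
      throw err;                          // reject: the broker redelivers with backoff
    }
  }
}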