Rate Limiting
Discover powerful algorithms that control request flow and protect your systems from overload
Rate limiting is a fundamental technique in system design to control the rate at which requests are processed. It protects services from being overwhelmed, ensures fair resource allocation, and maintains quality of service.
Why Rate Limiting?
- Prevent DoS attacks: Stop malicious actors from overwhelming your service
- Fair usage: Ensure all users get reasonable access to resources
- Cost control: Limit expensive API calls or resource usage
- Quality of service: Maintain consistent performance for all users
Implementations
We've implemented four different rate limiting algorithms, each with its own trade-offs:
1. Token Bucket
The token bucket algorithm is one of the most popular rate limiting approaches. Think of it as a bucket that holds tokens:
- Tokens are added at a constant rate (e.g., 10 tokens per second)
- Each request consumes one or more tokens
- Requests are rejected when the bucket is empty
- Burst capacity allows handling spike traffic up to bucket size
```go
type TokenBucket struct {
	rate     float64   // tokens per second
	burst    int       // max tokens in bucket
	tokens   float64   // current tokens
	lastTime time.Time // last refill time
}
```
Core Algorithm Logic:
```go
func (tb *TokenBucket) Allow() bool {
	// 1. Calculate elapsed time and refill tokens
	elapsed := time.Since(tb.lastTime).Seconds()
	tb.tokens = math.Min(float64(tb.burst), tb.tokens+elapsed*tb.rate)
	tb.lastTime = time.Now()

	// 2. Check if a token is available
	if tb.tokens >= 1 {
		tb.tokens-- // Consume a token
		return true // Allow request
	}
	return false // Reject request
}
```
Pros:
- Smooth rate limiting
- Allows burst traffic
- Memory efficient
Cons:
- Not strictly accurate over short intervals
- Requires token calculation on each request
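To see the burst-then-refill behavior concretely, here is a minimal, self-contained sketch of the bucket above. The `NewTokenBucket` constructor is our own helper (not part of the original code); it starts the bucket full so the initial burst is allowed:

```go
package main

import (
	"fmt"
	"math"
	"time"
)

type TokenBucket struct {
	rate     float64   // tokens per second
	burst    int       // max tokens in bucket
	tokens   float64   // current tokens
	lastTime time.Time // last refill time
}

// NewTokenBucket starts the bucket full so an initial burst passes.
func NewTokenBucket(rate float64, burst int) *TokenBucket {
	return &TokenBucket{rate: rate, burst: burst, tokens: float64(burst), lastTime: time.Now()}
}

func (tb *TokenBucket) Allow() bool {
	elapsed := time.Since(tb.lastTime).Seconds()
	tb.tokens = math.Min(float64(tb.burst), tb.tokens+elapsed*tb.rate)
	tb.lastTime = time.Now()
	if tb.tokens >= 1 {
		tb.tokens--
		return true
	}
	return false
}

func main() {
	tb := NewTokenBucket(5, 3) // refill 5 tokens/sec, burst of 3
	for i := 1; i <= 5; i++ {
		// The back-to-back calls leave no time to refill, so the
		// first 3 requests pass and requests 4 and 5 are rejected.
		fmt.Printf("request %d allowed: %v\n", i, tb.Allow())
	}
}
```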
2. Fixed Window Counter
Divides time into fixed windows and counts requests per window:
```go
type FixedWindow struct {
	window      time.Duration // window size (e.g., 1 minute)
	limit       int           // max requests per window
	counter     int           // current window request count
	windowStart time.Time     // current window start time
}
```
Core Algorithm Logic:
```go
func (fw *FixedWindow) Allow() bool {
	// 1. Calculate the current window
	now := time.Now()
	currentWindow := now.Truncate(fw.window) // Rounds down to window start

	// 2. Reset the counter if a new window has started
	if currentWindow != fw.windowStart {
		fw.counter = 0
		fw.windowStart = currentWindow
	}

	// 3. Check if under the limit
	if fw.counter < fw.limit {
		fw.counter++ // Increment counter
		return true  // Allow request
	}
	return false // Reject request
}
```
The Boundary Problem: because the counter resets at fixed boundaries, a client can send the full limit in the last moments of one window and the full limit again in the first moments of the next, squeezing up to 2× the limit through in a span much shorter than one window.
3. Sliding Window Log
Stores timestamps of all requests within the current time window:
```go
type SlidingWindowLog struct {
	window     time.Duration
	limit      int
	timestamps []time.Time // all requests in window
}
```
Core Algorithm Logic:
```go
func (swl *SlidingWindowLog) Allow() bool {
	now := time.Now()
	cutoff := now.Add(-swl.window) // Window boundary

	// 1. Remove expired timestamps
	for len(swl.timestamps) > 0 && swl.timestamps[0].Before(cutoff) {
		swl.timestamps = swl.timestamps[1:] // Drop the oldest entry
	}

	// 2. Check if under the limit
	if len(swl.timestamps) < swl.limit {
		swl.timestamps = append(swl.timestamps, now) // Record this request
		return true                                  // Allow request
	}
	return false // Reject request
}
```
How Sliding Window Log Fixes the Boundary Problem: because the log evaluates the exact timestamps in a window that slides with each request, rather than resetting at fixed boundaries, no window-sized span of time can ever contain more than `limit` accepted requests.
Most accurate but memory intensive (O(n) where n = requests in window)
4. Sliding Window Counter
Hybrid approach combining fixed windows with weighted calculation:
```
Previous window: 80 requests
Current window:  40 requests
Position: 25% into current window

Weighted count = 40 + (80 × 0.75) = 100
```
```go
type SlidingWindowCounter struct {
	window        time.Duration // window size (e.g., 1 minute)
	limit         int           // max requests per window
	currentCount  int           // requests in current window
	previousCount int           // requests in previous window
	currentWindow time.Time     // current window start time
}
```
Core Algorithm Logic:
```go
func (swc *SlidingWindowCounter) Allow() bool {
	now := time.Now()
	currentWindow := now.Truncate(swc.window) // Current window start

	// 1. Roll the counters over if a new window has started
	if currentWindow != swc.currentWindow {
		swc.previousCount = swc.currentCount
		swc.currentCount = 0
		swc.currentWindow = currentWindow
	}

	// 2. Calculate the weighted count
	elapsed := now.Sub(currentWindow)
	weight := 1.0 - float64(elapsed)/float64(swc.window)
	estimatedCount := swc.currentCount + int(float64(swc.previousCount)*weight)

	// 3. Check if under the limit
	if estimatedCount < swc.limit {
		swc.currentCount++ // Count this request in the current window
		return true        // Allow request
	}
	return false // Reject request
}
```
How Sliding Window Counter Approximates the Solution: instead of storing every timestamp, it assumes requests in the previous window were evenly distributed and weights them by how much of that window still overlaps the sliding window. This comes close to the sliding log's accuracy while using O(1) memory.
Performance Comparison
Note: Benchmarks run on Apple M4 Pro. Lower latency is better. All algorithms except Sliding Log have zero memory allocations per operation.
Algorithm Comparison
Token Bucket
- Burst handling
- Simple implementation
- Memory efficient

Fixed Window
- Simplest algorithm
- Reset at intervals
- Minimal overhead

Sliding Log
- Most accurate
- Request-level tracking
- Higher memory usage

Sliding Counter
- Good balance
- Approximates sliding log
- Memory efficient
Production Considerations
1. Distributed Rate Limiting
In a distributed system, you need to coordinate rate limits across multiple servers:
```go
// Using Redis for distributed rate limiting
type RedisRateLimiter struct {
	client *redis.Client
	local  RateLimiter // fallback
}
```
2. Rate Limit Headers
Always return rate limit information to clients:
```
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1640995200
```
3. Graceful Degradation
```go
func (rl *RateLimiter) AllowWithFallback(key string) bool {
	// Try the distributed rate limiter first
	allowed, err := rl.distributed.Allow(key)
	if err != nil {
		// Fall back to the local rate limiter
		return rl.local.Allow(key)
	}
	return allowed
}
```
Code Examples
Basic Usage
```go
// Create a rate limiter
config := ratelimiter.Config{
	Rate:  100.0, // 100 requests per second
	Burst: 200,   // burst up to 200
}
rl := ratelimiter.NewTokenBucket(config)

// Check if the request is allowed
if rl.Allow(ctx, userID) {
	// Process request
} else {
	// Return 429 Too Many Requests
}
```
API Gateway Example
```go
func RateLimitMiddleware(rl ratelimiter.RateLimiter) func(next http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			key := getClientIP(r)
			if !rl.Allow(r.Context(), key) {
				w.Header().Set("Retry-After", "60")
				http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}
```
Implementation
Explore our Go implementations:
- Source Code - All 4 rate limiting algorithms
- Example Usage - Comprehensive examples
- Test Suite - Unit tests and benchmarks
- Documentation - Algorithm comparison guide
Next Steps
- Learn about Consistent Hashing for distributed systems
- Implement rate limiting in your own projects