release: bump version to 0.3.0

- Refactor Redis backend connection handling and pool management
- Update algorithm implementations with improved type annotations
- Enhance config loader validation with stricter Pydantic schemas
- Improve decorator and middleware error handling
- Expand example scripts with better docstrings and usage patterns
- Add new 00_basic_usage.py example for quick start
- Reorganize examples directory structure
- Fix type annotation inconsistencies across core modules
- Update dependencies in pyproject.toml
docs/advanced/distributed-systems.rst
Distributed Systems
===================

Running rate limiting across multiple application instances requires careful
consideration. This guide covers the patterns and pitfalls.

The Challenge
-------------

In a distributed system, you might have:

- Multiple application instances behind a load balancer
- Kubernetes pods that scale up and down
- Serverless functions that run independently

Each instance needs to share rate limit state. Otherwise, a client could make
100 requests to instance A and another 100 to instance B, effectively bypassing
a 100-request limit.
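The bypass arithmetic can be seen with two independent in-memory counters standing in for two unsynchronized instances (an illustrative sketch, not library code):

```python
LIMIT = 100

def make_instance():
    """Each app instance keeps its own counter when state is not shared."""
    counts = {}

    def allow(client_id: str) -> bool:
        counts[client_id] = counts.get(client_id, 0) + 1
        return counts[client_id] <= LIMIT

    return allow

instance_a = make_instance()
instance_b = make_instance()

# The same client sends 100 requests to each instance...
allowed = sum(instance_a("client-1") for _ in range(100))
allowed += sum(instance_b("client-1") for _ in range(100))
print(allowed)  # 200 requests allowed despite a limit of 100
```

A shared backend collapses the two counters into one, which is exactly what the Redis setup below provides.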
Redis: The Standard Solution
----------------------------

Redis is the go-to choice for distributed rate limiting:

.. code-block:: python

    from fastapi import FastAPI
    from fastapi_traffic import RateLimiter
    from fastapi_traffic.backends.redis import RedisBackend
    from fastapi_traffic.core.limiter import get_limiter, set_limiter

    app = FastAPI()

    @app.on_event("startup")
    async def startup():
        backend = await RedisBackend.from_url(
            "redis://redis-server:6379/0",
            key_prefix="myapp:ratelimit",
        )
        limiter = RateLimiter(backend)
        set_limiter(limiter)
        await limiter.initialize()

    @app.on_event("shutdown")
    async def shutdown():
        limiter = get_limiter()
        await limiter.close()

All instances connect to the same Redis server and share state.

High Availability Redis
-----------------------

For production, you'll want Redis with high availability:

**Redis Sentinel:**

.. code-block:: python

    backend = await RedisBackend.from_url(
        "redis://sentinel1:26379,sentinel2:26379,sentinel3:26379/0",
        sentinel_master="mymaster",
    )

**Redis Cluster:**

.. code-block:: python

    backend = await RedisBackend.from_url(
        "redis://node1:6379,node2:6379,node3:6379/0",
    )

Atomic Operations
-----------------

Race conditions are a real concern in distributed systems. Consider this scenario:

1. Instance A reads: 99 requests made
2. Instance B reads: 99 requests made
3. Instance A writes: 100 requests (allows the request)
4. Instance B writes: 100 requests (allows the request)

Now you've allowed 101 requests when the limit was 100.
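The interleaving above can be reproduced in a few lines with a deliberate yield between the read and the write (an illustrative sketch; the `await` stands in for network latency to the store):

```python
import asyncio

LIMIT = 100
store = {"count": 99}  # 99 requests already recorded
allowed = []

async def handle(instance: str):
    # Non-atomic check-then-act: read, yield control, then write.
    current = store["count"]          # both instances read 99
    await asyncio.sleep(0)            # the other instance runs here
    if current < LIMIT:
        store["count"] = current + 1  # lost update: both write 100
        allowed.append(instance)

async def main():
    await asyncio.gather(handle("A"), handle("B"))

asyncio.run(main())
print(store["count"], allowed)  # 100 ['A', 'B'] — both requests allowed
```

Both handlers observe 99, so both pass the check, even though only one slot remained. Making the read-check-write sequence atomic is the fix, which is where Lua scripts come in.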
FastAPI Traffic's Redis backend uses Lua scripts to make operations atomic:

.. code-block:: lua

    -- Simplified example of atomic check-and-increment;
    -- the limit is supplied by the caller as ARGV[1]
    local limit = tonumber(ARGV[1])
    local current = redis.call('GET', KEYS[1])
    if current and tonumber(current) >= limit then
        return 0  -- Reject
    end
    redis.call('INCR', KEYS[1])
    return 1  -- Allow

The entire check-and-update happens in a single Redis operation.
Network Latency
---------------

Redis adds network latency to every request. Some strategies to minimize the impact:

**1. Connection pooling (automatic):**

The Redis backend maintains a connection pool, so you're not creating new
connections for each request.

**2. Local caching:**

For very high-traffic endpoints, consider a two-tier approach:

.. code-block:: python

    from fastapi import Request
    from fastapi_traffic import MemoryBackend, RateLimiter
    from fastapi_traffic.backends.redis import RedisBackend

    # Local memory backend for the fast path
    local_backend = MemoryBackend()
    local_limiter = RateLimiter(local_backend)

    # Redis backend for distributed state
    redis_backend = await RedisBackend.from_url("redis://localhost:6379/0")
    distributed_limiter = RateLimiter(redis_backend)

    async def check_rate_limit(request: Request, config: RateLimitConfig):
        # Quick local check (may allow some extra requests)
        local_result = await local_limiter.check(request, config)
        if not local_result.allowed:
            return local_result

        # Authoritative distributed check
        return await distributed_limiter.check(request, config)

**3. Skip on error:**

If Redis latency is causing issues, you might prefer to allow requests through
rather than block:

.. code-block:: python

    @rate_limit(100, 60, skip_on_error=True)
    async def endpoint(request: Request):
        return {"status": "ok"}

Handling Redis Failures
-----------------------

What happens when Redis goes down?

**Fail closed (default):**

Requests fail. This is safer but impacts availability.

**Fail open:**

Allow requests through:

.. code-block:: python

    @rate_limit(100, 60, skip_on_error=True)
**Circuit breaker pattern:**

Implement a circuit breaker to avoid hammering a failing Redis:

.. code-block:: python

    import time

    class CircuitBreaker:
        def __init__(self, failure_threshold=5, reset_timeout=60):
            self.failures = 0
            self.threshold = failure_threshold
            self.reset_timeout = reset_timeout
            self.last_failure = 0
            self.open = False

        def record_failure(self):
            self.failures += 1
            self.last_failure = time.time()
            if self.failures >= self.threshold:
                self.open = True

        def record_success(self):
            self.failures = 0
            self.open = False

        def should_allow(self) -> bool:
            if not self.open:
                return True
            # Check if we should try again
            if time.time() - self.last_failure > self.reset_timeout:
                return True
            return False
Kubernetes Deployment
---------------------

Here's a typical Kubernetes setup:

.. code-block:: yaml

    # redis-deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: redis
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: redis
      template:
        metadata:
          labels:
            app: redis
        spec:
          containers:
            - name: redis
              image: redis:7-alpine
              ports:
                - containerPort: 6379
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: redis
    spec:
      selector:
        app: redis
      ports:
        - port: 6379

.. code-block:: yaml

    # app-deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: api
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: api
      template:
        metadata:
          labels:
            app: api
        spec:
          containers:
            - name: api
              image: myapp:latest
              env:
                - name: REDIS_URL
                  value: "redis://redis:6379/0"

Your app connects to Redis via the service name:

.. code-block:: python

    import os

    redis_url = os.getenv("REDIS_URL", "redis://localhost:6379/0")
    backend = await RedisBackend.from_url(redis_url)

Monitoring
----------

Keep an eye on:

1. **Redis latency:** High latency means slow requests
2. **Redis memory:** Rate limit data shouldn't use much, but monitor it
3. **Connection count:** Make sure you're not exhausting connections
4. **Rate limit hits:** Track how often clients are being limited

.. code-block:: python

    import logging

    logger = logging.getLogger(__name__)

    def on_rate_limited(request: Request, result):
        logger.info(
            "Rate limited: client=%s path=%s remaining=%d",
            request.client.host,
            request.url.path,
            result.info.remaining,
        )

    @rate_limit(100, 60, on_blocked=on_rate_limited)
    async def endpoint(request: Request):
        return {"status": "ok"}
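Beyond per-event logs, aggregate counts are often more useful for dashboards. A minimal in-process tally is sketched below (hypothetical helper with a simplified signature, not the library's `on_blocked` callback; a real deployment would export these numbers to a metrics system such as Prometheus):

```python
from collections import Counter

# Tally of blocked requests, keyed by dimension.
rate_limit_hits: Counter = Counter()

def on_rate_limited(path: str, client_host: str) -> None:
    """Record one blocked request per endpoint and per client."""
    rate_limit_hits[("path", path)] += 1
    rate_limit_hits[("client", client_host)] += 1

# e.g. after a burst of blocked requests from one client:
for _ in range(3):
    on_rate_limited("/api/data", "10.0.0.5")
print(rate_limit_hits[("path", "/api/data")])  # 3
```

A spike in one client's tally points at abuse; a uniform rise across clients suggests the limit itself is too low.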
Testing Distributed Rate Limits
-------------------------------

Testing distributed behavior is tricky. Here's an approach:

.. code-block:: python

    import asyncio

    import httpx

    async def test_distributed_limit():
        """Simulate requests from multiple 'instances'."""
        async with httpx.AsyncClient() as client:
            # Fire 150 requests concurrently
            tasks = [
                client.get("http://localhost:8000/api/data")
                for _ in range(150)
            ]
            responses = await asyncio.gather(*tasks)

        # Count successes and rate limits
        successes = sum(1 for r in responses if r.status_code == 200)
        limited = sum(1 for r in responses if r.status_code == 429)

        print(f"Successes: {successes}, Rate limited: {limited}")
        # With a limit of 100, expect ~100 successes and ~50 limited

    asyncio.run(test_distributed_limit())