release: bump version to 0.3.0

- Refactor Redis backend connection handling and pool management
- Update algorithm implementations with improved type annotations
- Enhance config loader validation with stricter Pydantic schemas
- Improve decorator and middleware error handling
- Expand example scripts with better docstrings and usage patterns
- Add new 00_basic_usage.py example for quick start
- Reorganize examples directory structure
- Fix type annotation inconsistencies across core modules
- Update dependencies in pyproject.toml
docs/advanced/distributed-systems.rst
Distributed Systems
===================

Running rate limiting across multiple application instances requires careful
consideration. This guide covers the patterns and pitfalls.

The Challenge
-------------

In a distributed system, you might have:

- Multiple application instances behind a load balancer
- Kubernetes pods that scale up and down
- Serverless functions that run independently

Each instance needs to share rate limit state. Otherwise, a client could make
100 requests to instance A and another 100 to instance B, effectively bypassing
a 100-request limit.
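The bypass arithmetic can be seen with two independent in-memory counters standing in for two unsynchronized instances (an illustrative sketch, not library code):

```python
LIMIT = 100

def make_instance():
    """Each app instance keeps its own counter when state is not shared."""
    counts = {}

    def allow(client_id: str) -> bool:
        counts[client_id] = counts.get(client_id, 0) + 1
        return counts[client_id] <= LIMIT

    return allow

instance_a = make_instance()
instance_b = make_instance()

# The same client sends 100 requests to each instance...
allowed = sum(instance_a("client-1") for _ in range(100))
allowed += sum(instance_b("client-1") for _ in range(100))
print(allowed)  # 200 requests allowed despite a limit of 100
```

A shared backend collapses the two counters into one, which is exactly what the Redis setup below provides.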
Redis: The Standard Solution
----------------------------

Redis is the go-to choice for distributed rate limiting:

.. code-block:: python

    from fastapi import FastAPI
    from fastapi_traffic import RateLimiter
    from fastapi_traffic.backends.redis import RedisBackend
    from fastapi_traffic.core.limiter import get_limiter, set_limiter

    app = FastAPI()

    @app.on_event("startup")
    async def startup():
        backend = await RedisBackend.from_url(
            "redis://redis-server:6379/0",
            key_prefix="myapp:ratelimit",
        )
        limiter = RateLimiter(backend)
        set_limiter(limiter)
        await limiter.initialize()

    @app.on_event("shutdown")
    async def shutdown():
        limiter = get_limiter()
        await limiter.close()

All instances connect to the same Redis server and share state.

High Availability Redis
-----------------------

For production, you'll want Redis with high availability:

**Redis Sentinel:**

.. code-block:: python

    backend = await RedisBackend.from_url(
        "redis://sentinel1:26379,sentinel2:26379,sentinel3:26379/0",
        sentinel_master="mymaster",
    )

**Redis Cluster:**

.. code-block:: python

    backend = await RedisBackend.from_url(
        "redis://node1:6379,node2:6379,node3:6379/0",
    )

Atomic Operations
-----------------

Race conditions are a real concern in distributed systems. Consider this scenario:

1. Instance A reads: 99 requests made
2. Instance B reads: 99 requests made
3. Instance A writes: 100 requests (allows the request)
4. Instance B writes: 100 requests (allows the request)

Now you've allowed 101 requests when the limit was 100.
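The interleaving above can be reproduced in a few lines with a deliberate yield between the read and the write (an illustrative sketch; the `await` stands in for network latency to the store):

```python
import asyncio

LIMIT = 100
store = {"count": 99}  # 99 requests already recorded
allowed = []

async def handle(instance: str):
    # Non-atomic check-then-act: read, yield control, then write.
    current = store["count"]          # both instances read 99
    await asyncio.sleep(0)            # the other instance runs here
    if current < LIMIT:
        store["count"] = current + 1  # lost update: both write 100
        allowed.append(instance)

async def main():
    await asyncio.gather(handle("A"), handle("B"))

asyncio.run(main())
print(store["count"], allowed)  # 100 ['A', 'B'] — both requests allowed
```

Both handlers observe 99, so both pass the check, even though only one slot remained. Making the read-check-write sequence atomic is the fix, which is where Lua scripts come in.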
FastAPI Traffic's Redis backend uses Lua scripts to make operations atomic:

.. code-block:: lua

    -- Simplified example of atomic check-and-increment;
    -- the limit is supplied by the caller as ARGV[1]
    local limit = tonumber(ARGV[1])
    local current = redis.call('GET', KEYS[1])
    if current and tonumber(current) >= limit then
        return 0  -- Reject
    end
    redis.call('INCR', KEYS[1])
    return 1  -- Allow

The entire check-and-update happens in a single Redis operation.
Network Latency
---------------

Redis adds network latency to every request. Some strategies to minimize the impact:

**1. Connection pooling (automatic):**

The Redis backend maintains a connection pool, so you're not creating new
connections for each request.

**2. Local caching:**

For very high-traffic endpoints, consider a two-tier approach:

.. code-block:: python

    from fastapi import Request
    from fastapi_traffic import MemoryBackend, RateLimiter
    from fastapi_traffic.backends.redis import RedisBackend

    # Local memory backend for the fast path
    local_backend = MemoryBackend()
    local_limiter = RateLimiter(local_backend)

    # Redis backend for distributed state
    redis_backend = await RedisBackend.from_url("redis://localhost:6379/0")
    distributed_limiter = RateLimiter(redis_backend)

    async def check_rate_limit(request: Request, config: RateLimitConfig):
        # Quick local check (may allow some extra requests)
        local_result = await local_limiter.check(request, config)
        if not local_result.allowed:
            return local_result

        # Authoritative distributed check
        return await distributed_limiter.check(request, config)

**3. Skip on error:**

If Redis latency is causing issues, you might prefer to allow requests through
rather than block:

.. code-block:: python

    @rate_limit(100, 60, skip_on_error=True)
    async def endpoint(request: Request):
        return {"status": "ok"}

Handling Redis Failures
-----------------------

What happens when Redis goes down?

**Fail closed (default):**

Requests fail. This is safer but impacts availability.

**Fail open:**

Allow requests through:

.. code-block:: python

    @rate_limit(100, 60, skip_on_error=True)
**Circuit breaker pattern:**

Implement a circuit breaker to avoid hammering a failing Redis:

.. code-block:: python

    import time

    class CircuitBreaker:
        def __init__(self, failure_threshold=5, reset_timeout=60):
            self.failures = 0
            self.threshold = failure_threshold
            self.reset_timeout = reset_timeout
            self.last_failure = 0
            self.open = False

        def record_failure(self):
            self.failures += 1
            self.last_failure = time.time()
            if self.failures >= self.threshold:
                self.open = True

        def record_success(self):
            self.failures = 0
            self.open = False

        def should_allow(self) -> bool:
            if not self.open:
                return True
            # Check if we should try again
            if time.time() - self.last_failure > self.reset_timeout:
                return True
            return False
Kubernetes Deployment
---------------------

Here's a typical Kubernetes setup:

.. code-block:: yaml

    # redis-deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: redis
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: redis
      template:
        metadata:
          labels:
            app: redis
        spec:
          containers:
            - name: redis
              image: redis:7-alpine
              ports:
                - containerPort: 6379
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: redis
    spec:
      selector:
        app: redis
      ports:
        - port: 6379

.. code-block:: yaml

    # app-deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: api
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: api
      template:
        metadata:
          labels:
            app: api
        spec:
          containers:
            - name: api
              image: myapp:latest
              env:
                - name: REDIS_URL
                  value: "redis://redis:6379/0"

Your app connects to Redis via the service name:

.. code-block:: python

    import os

    redis_url = os.getenv("REDIS_URL", "redis://localhost:6379/0")
    backend = await RedisBackend.from_url(redis_url)

Monitoring
----------

Keep an eye on:

1. **Redis latency:** High latency means slow requests
2. **Redis memory:** Rate limit data shouldn't use much, but monitor it
3. **Connection count:** Make sure you're not exhausting connections
4. **Rate limit hits:** Track how often clients are being limited

.. code-block:: python

    import logging

    logger = logging.getLogger(__name__)

    def on_rate_limited(request: Request, result):
        logger.info(
            "Rate limited: client=%s path=%s remaining=%d",
            request.client.host,
            request.url.path,
            result.info.remaining,
        )

    @rate_limit(100, 60, on_blocked=on_rate_limited)
    async def endpoint(request: Request):
        return {"status": "ok"}
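Beyond per-event logs, aggregate counts are often more useful for dashboards. A minimal in-process tally is sketched below (hypothetical helper with a simplified signature, not the library's `on_blocked` callback; a real deployment would export these numbers to a metrics system such as Prometheus):

```python
from collections import Counter

# Tally of blocked requests, keyed by dimension.
rate_limit_hits: Counter = Counter()

def on_rate_limited(path: str, client_host: str) -> None:
    """Record one blocked request per endpoint and per client."""
    rate_limit_hits[("path", path)] += 1
    rate_limit_hits[("client", client_host)] += 1

# e.g. after a burst of blocked requests from one client:
for _ in range(3):
    on_rate_limited("/api/data", "10.0.0.5")
print(rate_limit_hits[("path", "/api/data")])  # 3
```

A spike in one client's tally points at abuse; a uniform rise across clients suggests the limit itself is too low.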
Testing Distributed Rate Limits
-------------------------------

Testing distributed behavior is tricky. Here's an approach:

.. code-block:: python

    import asyncio

    import httpx

    async def test_distributed_limit():
        """Simulate requests from multiple 'instances'."""
        async with httpx.AsyncClient() as client:
            # Fire 150 requests concurrently
            tasks = [
                client.get("http://localhost:8000/api/data")
                for _ in range(150)
            ]
            responses = await asyncio.gather(*tasks)

        # Count successes and rate limits
        successes = sum(1 for r in responses if r.status_code == 200)
        limited = sum(1 for r in responses if r.status_code == 429)

        print(f"Successes: {successes}, Rate limited: {limited}")
        # With a limit of 100, expect ~100 successes and ~50 limited

    asyncio.run(test_distributed_limit())