Middleware
==========

Sometimes you want rate limiting applied to your entire API, not just individual
endpoints. That's where middleware comes in.

Middleware sits between the client and your application, checking every request
before it reaches your endpoints.

Basic Usage
-----------

Add the middleware to your FastAPI app:

.. code-block:: python

    from fastapi import FastAPI
    from fastapi_traffic.middleware import RateLimitMiddleware

    app = FastAPI()

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,        # 1000 requests
        window_size=60,    # per minute
    )

    @app.get("/api/users")
    async def get_users():
        return {"users": []}

    @app.get("/api/posts")
    async def get_posts():
        return {"posts": []}

Now every endpoint shares the same rate limit pool. A client who makes 500 requests
to ``/api/users`` only has 500 left for ``/api/posts``.

Exempting Paths
---------------

You probably don't want to rate limit your health checks or documentation:

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        exempt_paths={
            "/health",
            "/ready",
            "/docs",
            "/redoc",
            "/openapi.json",
        },
    )

These paths bypass rate limiting entirely.

Exempting IPs
-------------

Internal services, monitoring systems, or your own infrastructure might need
unrestricted access:

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        exempt_ips={
            "127.0.0.1",
            "10.0.0.0/8",       # Internal network
            "192.168.1.100",    # Monitoring server
        },
    )

.. note::

    IP exemptions are checked against the client IP extracted by the key extractor.
    Make sure your proxy headers are configured correctly if you're behind a load
    balancer.

Custom Key Extraction
---------------------

By default, clients are identified by IP address. You can change this:

.. code-block:: python

    from starlette.requests import Request

    def get_client_id(request: Request) -> str:
        """Identify clients by API key, fall back to IP."""
        api_key = request.headers.get("X-API-Key")
        if api_key:
            return f"api:{api_key}"
        return request.client.host if request.client else "unknown"

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        key_extractor=get_client_id,
    )

Choosing an Algorithm
---------------------

The middleware supports all five algorithms:

.. code-block:: python

    from fastapi_traffic.core.algorithms import Algorithm

    # Token bucket for burst-friendly limiting
    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        algorithm=Algorithm.TOKEN_BUCKET,
    )

    # Sliding window for precise limiting
    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        algorithm=Algorithm.SLIDING_WINDOW,
    )

Using a Custom Backend
----------------------

By default, middleware uses the memory backend. For production, you'll want
something persistent:

.. code-block:: python

    from fastapi_traffic import SQLiteBackend
    from fastapi_traffic.middleware import RateLimitMiddleware

    backend = SQLiteBackend("rate_limits.db")

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        backend=backend,
    )

    @app.on_event("shutdown")
    async def shutdown():
        await backend.close()

For Redis:

.. code-block:: python

    from fastapi_traffic.backends.redis import RedisBackend

    # Create backend at startup
    redis_backend = None

    @app.on_event("startup")
    async def startup():
        global redis_backend
        redis_backend = await RedisBackend.from_url("redis://localhost:6379/0")

    # Note: You'll need to configure middleware after startup
    # or use a factory pattern

Convenience Middleware Classes
------------------------------

For common use cases, we provide pre-configured middleware:

.. code-block:: python

    from fastapi_traffic.middleware import (
        SlidingWindowMiddleware,
        TokenBucketMiddleware,
    )

    # Sliding window algorithm
    app.add_middleware(
        SlidingWindowMiddleware,
        limit=1000,
        window_size=60,
    )

    # Token bucket algorithm
    app.add_middleware(
        TokenBucketMiddleware,
        limit=1000,
        window_size=60,
    )

Combining with Decorator
------------------------

You can use both middleware and decorators. The middleware provides a baseline
limit, and decorators can add stricter limits to specific endpoints:

.. code-block:: python

    from starlette.requests import Request

    from fastapi_traffic import rate_limit
    from fastapi_traffic.middleware import RateLimitMiddleware

    # Global limit: 1000 req/min
    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
    )

    # This endpoint has an additional, stricter limit
    @app.post("/api/expensive-operation")
    @rate_limit(10, 60)  # Only 10 req/min for this endpoint
    async def expensive_operation(request: Request):
        return {"result": "done"}

    # This endpoint uses only the global limit
    @app.get("/api/cheap-operation")
    async def cheap_operation():
        return {"result": "done"}

Both limits are checked. A request must pass both the middleware limit AND the
decorator limit.

Error Responses
---------------

When a client exceeds the rate limit, they get a 429 response:

.. code-block:: json

    {
        "detail": "Rate limit exceeded. Please try again later.",
        "retry_after": 45.2
    }

You can customize the message and status code:

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        error_message="Whoa there! You're making requests too fast.",
        status_code=429,
    )

Response Headers
----------------

By default, rate limit headers are included in every response:

.. code-block:: http

    X-RateLimit-Limit: 1000
    X-RateLimit-Remaining: 847
    X-RateLimit-Reset: 1709834400

When rate limited:

.. code-block:: http

    Retry-After: 45

Disable headers if you don't want to expose this information:

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        include_headers=False,
    )

Handling Backend Errors
-----------------------

What happens if your Redis server goes down? By default, the middleware will raise
an exception. You can change this behavior:

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        skip_on_error=True,  # Allow requests through if backend fails
    )

With ``skip_on_error=True``, requests are allowed through when the backend is
unavailable. This is a tradeoff between availability and protection.

Full Configuration Reference
----------------------------

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,                      # Max requests per window
        window_size=60.0,                # Window size in seconds
        algorithm=Algorithm.SLIDING_WINDOW_COUNTER,  # Algorithm to use
        backend=None,                    # Storage backend (default: MemoryBackend)
        key_prefix="middleware",         # Prefix for rate limit keys
        include_headers=True,            # Add rate limit headers to responses
        error_message="Rate limit exceeded. Please try again later.",
        status_code=429,                 # HTTP status when limited
        skip_on_error=False,             # Allow requests if backend fails
        exempt_paths=None,               # Set of paths to exempt
        exempt_ips=None,                 # Set of IPs to exempt
        key_extractor=default_key_extractor,  # Function to identify clients
    )
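
If it helps to see how a few of these options interact, here is a toy model of the
per-request decision: exemptions first, then one shared counter per client, with
``skip_on_error`` deciding what happens when the store fails. This is a sketch under
assumptions, not the library's implementation. ``FixedWindowCounter`` and
``should_allow`` are hypothetical names, the fixed-window counter stands in for
whichever algorithm you pick, the order of the checks is assumed, and exempt IPs are
matched as exact strings only.

.. code-block:: python

    import time
    from typing import Callable, Dict, FrozenSet, Tuple


    class FixedWindowCounter:
        """Toy in-memory store: at most ``limit`` hits per key per window."""

        def __init__(self, limit: int, window_size: float,
                     clock: Callable[[], float] = time.monotonic) -> None:
            self.limit = limit
            self.window_size = window_size
            self.clock = clock
            # key -> (window start time, hits counted in that window)
            self._windows: Dict[str, Tuple[float, int]] = {}

        def hit(self, key: str) -> bool:
            """Record one request for ``key``; return False once over the limit."""
            now = self.clock()
            start, count = self._windows.get(key, (now, 0))
            if now - start >= self.window_size:
                start, count = now, 0  # the old window expired, start a new one
            if count >= self.limit:
                self._windows[key] = (start, count)
                return False
            self._windows[key] = (start, count + 1)
            return True


    def should_allow(path: str, client_ip: str, limiter,
                     exempt_paths: FrozenSet[str] = frozenset(),
                     exempt_ips: FrozenSet[str] = frozenset(),
                     skip_on_error: bool = False) -> bool:
        """Assumed decision order: exemptions, then the shared counter."""
        if path in exempt_paths or client_ip in exempt_ips:
            return True
        try:
            # The key is the client, not the path, so all endpoints share one pool.
            return limiter.hit(client_ip)
        except Exception:
            if skip_on_error:
                return True  # fail open: availability over protection
            raise


    # A fake clock keeps the demo deterministic.
    fake_now = [0.0]
    demo = FixedWindowCounter(limit=3, window_size=60, clock=lambda: fake_now[0])

    results = [
        should_allow("/api/users", "203.0.113.7", demo),
        should_allow("/api/users", "203.0.113.7", demo),
        should_allow("/api/posts", "203.0.113.7", demo),
        should_allow("/api/posts", "203.0.113.7", demo),  # pool is used up
    ]
    print(results)  # [True, True, True, False]

The fourth request is rejected even though ``/api/posts`` was only called once before
it, because both endpoints draw from the same per-client pool. A request to a path in
``exempt_paths`` never reaches the counter at all.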