Middleware
==========

Sometimes you want rate limiting applied to your entire API, not just individual
endpoints. That's where middleware comes in.

Middleware sits between the client and your application, checking every request
before it reaches your endpoints.

Basic Usage
-----------

Add the middleware to your FastAPI app:
.. code-block:: python

    from fastapi import FastAPI
    from fastapi_traffic.middleware import RateLimitMiddleware

    app = FastAPI()

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,      # 1000 requests
        window_size=60,  # per minute
    )

    @app.get("/api/users")
    async def get_users():
        return {"users": []}

    @app.get("/api/posts")
    async def get_posts():
        return {"posts": []}
Now every endpoint shares the same rate limit pool. A client who makes 500 requests
to ``/api/users`` only has 500 left for ``/api/posts``.
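The sharing happens because every path counts against the same per-client counter. A toy sketch of that bookkeeping (plain dict counters, not the library's implementation):

```python
from collections import defaultdict

LIMIT = 1000
counts = defaultdict(int)  # one counter per client, shared by all paths

def record(client: str) -> int:
    """Count a request against the client's shared pool; return what's left."""
    counts[client] += 1
    return LIMIT - counts[client]

for _ in range(500):
    record("1.2.3.4")      # 500 requests to /api/users

print(record("1.2.3.4"))   # 499: a request to /api/posts draws from the same pool
```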
Exempting Paths
---------------

You probably don't want to rate limit your health checks or documentation:

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        exempt_paths={
            "/health",
            "/ready",
            "/docs",
            "/redoc",
            "/openapi.json",
        },
    )

These paths bypass rate limiting entirely.
Exempting IPs
-------------

Internal services, monitoring systems, or your own infrastructure might need
unrestricted access:

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        exempt_ips={
            "127.0.0.1",
            "10.0.0.0/8",     # Internal network
            "192.168.1.100",  # Monitoring server
        },
    )

.. note::

   IP exemptions are checked against the client IP extracted by the key extractor.
   Make sure your proxy headers are configured correctly if you're behind a load
   balancer.
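Matching a client address against a CIDR entry like ``10.0.0.0/8`` is exactly what the standard-library ``ipaddress`` module does. A minimal sketch of the mechanism (the ``is_exempt`` helper is illustrative, not part of fastapi_traffic):

```python
import ipaddress

def is_exempt(client_ip: str, exempt_ips: set[str]) -> bool:
    """Return True if client_ip matches any exempt entry (exact IP or CIDR)."""
    addr = ipaddress.ip_address(client_ip)
    for entry in exempt_ips:
        if "/" in entry:
            # CIDR entry: membership test against the whole network
            if addr in ipaddress.ip_network(entry, strict=False):
                return True
        elif addr == ipaddress.ip_address(entry):
            return True
    return False

print(is_exempt("10.1.2.3", {"127.0.0.1", "10.0.0.0/8"}))  # True: inside 10.0.0.0/8
print(is_exempt("8.8.8.8", {"127.0.0.1", "10.0.0.0/8"}))   # False
```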
Custom Key Extraction
---------------------

By default, clients are identified by IP address. You can change this:

.. code-block:: python

    from starlette.requests import Request

    def get_client_id(request: Request) -> str:
        """Identify clients by API key, fall back to IP."""
        api_key = request.headers.get("X-API-Key")
        if api_key:
            return f"api:{api_key}"
        return request.client.host if request.client else "unknown"

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        key_extractor=get_client_id,
    )
Choosing an Algorithm
---------------------

The middleware supports all five algorithms:

.. code-block:: python

    from fastapi_traffic.core.algorithms import Algorithm

    # Token bucket for burst-friendly limiting
    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        algorithm=Algorithm.TOKEN_BUCKET,
    )

    # Sliding window for precise limiting
    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        algorithm=Algorithm.SLIDING_WINDOW,
    )
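For intuition on why the token bucket is burst-friendly, here is a minimal, self-contained sketch of the algorithm (not the library's implementation): tokens refill at a steady rate, and a burst is admitted as long as tokens remain.

```python
import time

class TokenBucket:
    """Minimal token bucket: `capacity` tokens, refilled at `rate` tokens/sec."""

    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=5, rate=1.0)  # 5-request burst, 1 req/sec sustained
results = [bucket.allow() for _ in range(7)]
print(results)  # the first 5 are allowed immediately; then denied until refill
```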
Using a Custom Backend
----------------------

By default, the middleware uses the memory backend. For production, you'll want
something persistent:

.. code-block:: python

    from fastapi_traffic import SQLiteBackend
    from fastapi_traffic.middleware import RateLimitMiddleware

    backend = SQLiteBackend("rate_limits.db")

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        backend=backend,
    )

    @app.on_event("shutdown")
    async def shutdown():
        await backend.close()

For Redis:

.. code-block:: python

    from fastapi_traffic.backends.redis import RedisBackend

    # Create the backend at startup
    redis_backend = None

    @app.on_event("startup")
    async def startup():
        global redis_backend
        redis_backend = await RedisBackend.from_url("redis://localhost:6379/0")

    # Note: you'll need to configure the middleware after startup,
    # or use a factory pattern.
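One way around the startup-ordering problem is to hand the middleware an object that creates the real backend on first use. A generic sketch of that lazy/factory idea, shown with a synchronous factory for brevity (``LazyBackend`` and ``FakeBackend`` are illustrative, not part of fastapi_traffic, and this assumes the middleware only touches the backend per-request):

```python
class LazyBackend:
    """Proxy that defers backend creation until first attribute access (illustrative)."""

    def __init__(self, factory):
        self._factory = factory  # callable producing the real backend
        self._backend = None

    def _resolve(self):
        if self._backend is None:
            self._backend = self._factory()
        return self._backend

    def __getattr__(self, name):
        # Called only for attributes not found normally: forward to the
        # real backend, creating it on demand.
        return getattr(self._resolve(), name)

# Usage sketch with a stand-in backend object:
class FakeBackend:
    def ping(self):
        return "pong"

lazy = LazyBackend(lambda: FakeBackend())
print(lazy.ping())  # the backend is created here, on first use
```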
Convenience Middleware Classes
------------------------------

For common use cases, we provide pre-configured middleware:

.. code-block:: python

    from fastapi_traffic.middleware import (
        SlidingWindowMiddleware,
        TokenBucketMiddleware,
    )

    # Sliding window algorithm
    app.add_middleware(
        SlidingWindowMiddleware,
        limit=1000,
        window_size=60,
    )

    # Token bucket algorithm
    app.add_middleware(
        TokenBucketMiddleware,
        limit=1000,
        window_size=60,
    )
Combining with Decorator
------------------------

You can use both middleware and decorators. The middleware provides a baseline
limit, and decorators can add stricter limits to specific endpoints:

.. code-block:: python

    from fastapi import Request

    from fastapi_traffic import rate_limit
    from fastapi_traffic.middleware import RateLimitMiddleware

    # Global limit: 1000 req/min
    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
    )

    # This endpoint has an additional, stricter limit
    @app.post("/api/expensive-operation")
    @rate_limit(10, 60)  # Only 10 req/min for this endpoint
    async def expensive_operation(request: Request):
        return {"result": "done"}

    # This endpoint uses only the global limit
    @app.get("/api/cheap-operation")
    async def cheap_operation():
        return {"result": "done"}

Both limits are checked. A request must pass both the middleware limit AND the
decorator limit.
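The layering amounts to a logical AND of two independent counters. A minimal sketch of the semantics (plain fixed counters with no expiry, not the library's implementation):

```python
from collections import defaultdict

class WindowCounter:
    """Allow up to `limit` hits per key (no window expiry, for illustration)."""

    def __init__(self, limit: int):
        self.limit = limit
        self.hits = defaultdict(int)

    def allow(self, key: str) -> bool:
        if self.hits[key] >= self.limit:
            return False
        self.hits[key] += 1
        return True

global_limit = WindowCounter(limit=3)    # stands in for the middleware limit
endpoint_limit = WindowCounter(limit=1)  # stands in for the decorator limit

def allowed(client: str, endpoint: str) -> bool:
    # A request must pass BOTH limits; checking the global one first
    # mirrors middleware running before the endpoint's decorator.
    return global_limit.allow(client) and endpoint_limit.allow(f"{client}:{endpoint}")

print(allowed("1.2.3.4", "/api/expensive-operation"))  # True
print(allowed("1.2.3.4", "/api/expensive-operation"))  # False: endpoint limit hit
```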
Error Responses
---------------

When a client exceeds the rate limit, they get a 429 response:

.. code-block:: json

    {
        "detail": "Rate limit exceeded. Please try again later.",
        "retry_after": 45.2
    }

You can customize the message:

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        error_message="Whoa there! You're making requests too fast.",
        status_code=429,
    )
Response Headers
----------------

By default, rate limit headers are included in every response:

.. code-block:: http

    X-RateLimit-Limit: 1000
    X-RateLimit-Remaining: 847
    X-RateLimit-Reset: 1709834400

When rate limited:

.. code-block:: http

    Retry-After: 45

Disable headers if you don't want to expose this information:

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        include_headers=False,
    )
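Well-behaved clients can use these headers to decide how long to back off. A small sketch of that logic (a plain dict stands in for real response headers; the helper name is illustrative):

```python
import time

def backoff_seconds(headers: dict, now: float) -> float:
    """Derive how long to wait before retrying from rate limit headers."""
    if "Retry-After" in headers:
        return float(headers["Retry-After"])
    # Fall back to the reset timestamp (epoch seconds) if present.
    if "X-RateLimit-Reset" in headers:
        return max(0.0, float(headers["X-RateLimit-Reset"]) - now)
    return 1.0  # conservative default when no header is available

print(backoff_seconds({"Retry-After": "45"}, now=time.time()))                 # 45.0
print(backoff_seconds({"X-RateLimit-Reset": "1709834400"}, now=1709834380.0))  # 20.0
```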
Handling Backend Errors
-----------------------

What happens if your Redis server goes down? By default, the middleware will
raise an exception. You can change this behavior:

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        skip_on_error=True,  # Allow requests through if backend fails
    )

With ``skip_on_error=True``, requests are allowed through when the backend is
unavailable. This is a tradeoff between availability and protection.
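The fail-open behavior amounts to catching backend errors and admitting the request anyway. A sketch of the idea (function names are illustrative, not the library's internals):

```python
def check_with_fail_open(check, skip_on_error: bool) -> bool:
    """Run a rate limit check; on backend failure, fail open or re-raise."""
    try:
        return check()
    except ConnectionError:
        if skip_on_error:
            return True  # fail open: let the request through
        raise

def broken_backend_check() -> bool:
    # Simulates a rate limit check whose backend is unreachable.
    raise ConnectionError("backend unavailable")

print(check_with_fail_open(broken_backend_check, skip_on_error=True))  # True
```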
Full Configuration Reference
----------------------------

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,                                  # Max requests per window
        window_size=60.0,                            # Window size in seconds
        algorithm=Algorithm.SLIDING_WINDOW_COUNTER,  # Algorithm to use
        backend=None,                                # Storage backend (default: MemoryBackend)
        key_prefix="middleware",                     # Prefix for rate limit keys
        include_headers=True,                        # Add rate limit headers to responses
        error_message="Rate limit exceeded. Please try again later.",
        status_code=429,                             # HTTP status when limited
        skip_on_error=False,                         # Allow requests if backend fails
        exempt_paths=None,                           # Set of paths to exempt
        exempt_ips=None,                             # Set of IPs to exempt
        key_extractor=default_key_extractor,         # Function to identify clients
    )