release: bump version to 0.3.0

- Refactor Redis backend connection handling and pool management
- Update algorithm implementations with improved type annotations
- Enhance config loader validation with stricter Pydantic schemas
- Improve decorator and middleware error handling
- Expand example scripts with better docstrings and usage patterns
- Add new 00_basic_usage.py example for quick start
- Reorganize examples directory structure
- Fix type annotation inconsistencies across core modules
- Update dependencies in pyproject.toml
2026-03-17 20:55:38 +00:00
parent 492410614f
commit f3453cb0fc
51 changed files with 6507 additions and 166 deletions

docs/api/algorithms.rst
Algorithms API
==============

Rate limiting algorithms and the factory function to create them.

Algorithm Enum
--------------

.. py:class:: Algorithm

   Enumeration of available rate limiting algorithms.

   .. py:attribute:: TOKEN_BUCKET
      :value: "token_bucket"

      Token bucket algorithm. Allows bursts up to bucket capacity, then refills
      at a steady rate.

   .. py:attribute:: SLIDING_WINDOW
      :value: "sliding_window"

      Sliding window log algorithm. Tracks exact timestamps for precise
      limiting. Higher memory usage.

   .. py:attribute:: FIXED_WINDOW
      :value: "fixed_window"

      Fixed window algorithm. Simple time-based windows. Efficient, but has
      boundary issues.

   .. py:attribute:: LEAKY_BUCKET
      :value: "leaky_bucket"

      Leaky bucket algorithm. Smooths out the request rate for consistent
      throughput.

   .. py:attribute:: SLIDING_WINDOW_COUNTER
      :value: "sliding_window_counter"

      Sliding window counter algorithm. Balances precision and efficiency.
      This is the default.

   **Usage:**

   .. code-block:: python

      from fastapi_traffic import Algorithm, rate_limit

      @rate_limit(100, 60, algorithm=Algorithm.TOKEN_BUCKET)
      async def endpoint(request: Request):
          return {"status": "ok"}

BaseAlgorithm
-------------

.. py:class:: BaseAlgorithm(limit, window_size, backend, *, burst_size=None)

   Abstract base class for rate limiting algorithms.

   :param limit: Maximum requests allowed in the window.
   :type limit: int
   :param window_size: Time window in seconds.
   :type window_size: float
   :param backend: Storage backend for rate limit state.
   :type backend: Backend
   :param burst_size: Maximum burst size. Defaults to ``limit``.
   :type burst_size: int | None

   .. py:method:: check(key)
      :async:

      Check if a request is allowed and update state.

      :param key: The rate limit key.
      :type key: str
      :returns: Tuple of ``(allowed, RateLimitInfo)``.
      :rtype: tuple[bool, RateLimitInfo]

   .. py:method:: reset(key)
      :async:

      Reset the rate limit state for a key.

      :param key: The rate limit key.
      :type key: str

   .. py:method:: get_state(key)
      :async:

      Get the current state without consuming a token.

      :param key: The rate limit key.
      :type key: str
      :returns: Current rate limit info, or None.
      :rtype: RateLimitInfo | None

TokenBucketAlgorithm
--------------------

.. py:class:: TokenBucketAlgorithm(limit, window_size, backend, *, burst_size=None)

   Token bucket algorithm implementation.

   Tokens are added to the bucket at a rate of ``limit / window_size`` per
   second. Each request consumes one token. If no tokens are available, the
   request is rejected.

   The ``burst_size`` parameter controls the maximum bucket capacity, allowing
   short bursts of traffic.

   **State stored:**

   - ``tokens``: Current number of tokens in the bucket
   - ``last_update``: Timestamp of the last update
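The refill arithmetic described above can be sketched in a few lines. This is an illustration of the technique only, not the library's implementation; the ``token_bucket_allow`` helper and its plain-dict state are made up for this example, and a real backend must also make the read-modify-write atomic:

```python
import time

def token_bucket_allow(state: dict, limit: int, window_size: float,
                       burst_size: int) -> bool:
    """Refill tokens based on elapsed time, then try to consume one."""
    now = time.monotonic()
    rate = limit / window_size  # tokens added per second
    elapsed = now - state["last_update"]
    # Refill, capped at the bucket capacity (burst_size)
    state["tokens"] = min(burst_size, state["tokens"] + elapsed * rate)
    state["last_update"] = now
    if state["tokens"] >= 1:
        state["tokens"] -= 1
        return True
    return False

state = {"tokens": 2.0, "last_update": time.monotonic()}
assert token_bucket_allow(state, 100, 60, 20)  # first request consumes a token
```

Because refill is continuous, a client that pauses briefly earns back fractional tokens rather than waiting for a whole window to reset.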

SlidingWindowAlgorithm
----------------------

.. py:class:: SlidingWindowAlgorithm(limit, window_size, backend, *, burst_size=None)

   Sliding window log algorithm implementation.

   Stores the timestamp of every request within the window. Provides the most
   accurate rate limiting, but uses more memory.

   **State stored:**

   - ``timestamps``: List of request timestamps within the window

FixedWindowAlgorithm
--------------------

.. py:class:: FixedWindowAlgorithm(limit, window_size, backend, *, burst_size=None)

   Fixed window algorithm implementation.

   Divides time into fixed windows and counts requests in each window. Simple
   and efficient, but allows up to 2x the limit at window boundaries.

   **State stored:**

   - ``count``: Number of requests in the current window
   - ``window_start``: Start timestamp of the current window
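The boundary issue is easy to demonstrate: a client can send the full limit at the end of one window and again at the start of the next, so a span straddling the boundary sees up to twice the limit. A minimal simulation (illustrative only; ``fixed_window_allow`` is not the library's API):

```python
def fixed_window_allow(state: dict, now: float, limit: int,
                       window_size: float) -> bool:
    """Count requests per fixed window; reset the counter at each boundary."""
    window_start = (now // window_size) * window_size
    if state.get("window_start") != window_start:
        state["window_start"] = window_start
        state["count"] = 0
    if state["count"] < limit:
        state["count"] += 1
        return True
    return False

# 100 requests just before the 60s boundary and 100 just after: all 200
# are allowed, even though they fall within a single 60-second span.
state: dict = {}
accepted = sum(fixed_window_allow(state, t, 100, 60.0)
               for t in [59.9] * 100 + [60.1] * 100)
# accepted == 200
```

This is the trade-off the sliding window variants address.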

LeakyBucketAlgorithm
--------------------

.. py:class:: LeakyBucketAlgorithm(limit, window_size, backend, *, burst_size=None)

   Leaky bucket algorithm implementation.

   Requests fill a bucket that "leaks" at a constant rate. Smooths out traffic
   for consistent throughput.

   **State stored:**

   - ``water_level``: Current water level in the bucket
   - ``last_update``: Timestamp of the last update

SlidingWindowCounterAlgorithm
-----------------------------

.. py:class:: SlidingWindowCounterAlgorithm(limit, window_size, backend, *, burst_size=None)

   Sliding window counter algorithm implementation.

   Maintains counters for the current and previous windows, calculating a
   weighted average based on window progress. Balances precision and memory
   efficiency.

   **State stored:**

   - ``prev_count``: Count from the previous window
   - ``curr_count``: Count in the current window
   - ``current_window``: Start timestamp of the current window
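The weighted average works like this: if the current window is 25% elapsed, the previous window still overlaps 75% of the sliding window, so its count is weighted at 0.75. A sketch of the standard estimate (illustrative; this assumes the common formulation of the algorithm, not necessarily the library's exact code):

```python
def sliding_counter_estimate(prev_count: int, curr_count: int,
                             elapsed_fraction: float) -> float:
    """Estimate requests in the sliding window from two fixed-window counters.

    elapsed_fraction is how far we are into the current window (0.0 to 1.0);
    the previous window contributes only the portion that still overlaps.
    """
    return prev_count * (1.0 - elapsed_fraction) + curr_count

# 25% into the current window: 75% of the previous window's 80 requests
# still count, plus 10 requests so far in the current window.
estimate = sliding_counter_estimate(prev_count=80, curr_count=10,
                                    elapsed_fraction=0.25)
# estimate == 80 * 0.75 + 10 == 70.0
```

A request is allowed when the estimate (plus its cost) stays at or below the limit, which needs only three small values per key instead of a full timestamp log.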

get_algorithm
-------------

.. py:function:: get_algorithm(algorithm, limit, window_size, backend, *, burst_size=None)

   Factory function to create algorithm instances.

   :param algorithm: The algorithm type to create.
   :type algorithm: Algorithm
   :param limit: Maximum requests allowed.
   :type limit: int
   :param window_size: Time window in seconds.
   :type window_size: float
   :param backend: Storage backend.
   :type backend: Backend
   :param burst_size: Maximum burst size.
   :type burst_size: int | None
   :returns: An algorithm instance.
   :rtype: BaseAlgorithm

   **Usage:**

   .. code-block:: python

      from fastapi_traffic.core.algorithms import get_algorithm, Algorithm
      from fastapi_traffic import MemoryBackend

      backend = MemoryBackend()
      algorithm = get_algorithm(
          Algorithm.TOKEN_BUCKET,
          limit=100,
          window_size=60,
          backend=backend,
          burst_size=20,
      )

      allowed, info = await algorithm.check("user:123")

docs/api/backends.rst
Backends API
============

Storage backends for rate limit state.

Backend (Base Class)
--------------------

.. py:class:: Backend

   Abstract base class for rate limit storage backends.

   All backends must implement these methods:

   .. py:method:: get(key)
      :async:

      Get the current state for a key.

      :param key: The rate limit key.
      :type key: str
      :returns: The stored state dictionary, or None if not found.
      :rtype: dict[str, Any] | None

   .. py:method:: set(key, value, *, ttl)
      :async:

      Set the state for a key with a TTL.

      :param key: The rate limit key.
      :type key: str
      :param value: The state dictionary to store.
      :type value: dict[str, Any]
      :param ttl: Time-to-live in seconds.
      :type ttl: float

   .. py:method:: delete(key)
      :async:

      Delete the state for a key.

      :param key: The rate limit key.
      :type key: str

   .. py:method:: exists(key)
      :async:

      Check if a key exists.

      :param key: The rate limit key.
      :type key: str
      :returns: True if the key exists.
      :rtype: bool

   .. py:method:: increment(key, amount=1)
      :async:

      Atomically increment a counter.

      :param key: The rate limit key.
      :type key: str
      :param amount: The amount to increment by.
      :type amount: int
      :returns: The new value after incrementing.
      :rtype: int

   .. py:method:: clear()
      :async:

      Clear all rate limit data.

   .. py:method:: close()
      :async:

      Close the backend connection.

Backends support the async context manager protocol:

.. code-block:: python

   async with MemoryBackend() as backend:
       await backend.set("key", {"count": 1}, ttl=60)

MemoryBackend
-------------

.. py:class:: MemoryBackend(max_size=10000, cleanup_interval=60)

   In-memory storage backend with LRU eviction and TTL cleanup.

   :param max_size: Maximum number of keys to store.
   :type max_size: int
   :param cleanup_interval: How often to clean up expired entries, in seconds.
   :type cleanup_interval: float

   **Usage:**

   .. code-block:: python

      from fastapi_traffic import MemoryBackend, RateLimiter

      backend = MemoryBackend(max_size=10000)
      limiter = RateLimiter(backend)

   .. py:method:: get_stats()

      Get statistics about the backend.

      :returns: Dictionary with stats such as key count and memory usage.
      :rtype: dict[str, Any]

   .. py:method:: start_cleanup()
      :async:

      Start the background cleanup task.

   .. py:method:: stop_cleanup()
      :async:

      Stop the background cleanup task.

SQLiteBackend
-------------

.. py:class:: SQLiteBackend(db_path, cleanup_interval=300)

   SQLite storage backend for persistent rate limiting.

   :param db_path: Path to the SQLite database file.
   :type db_path: str | Path
   :param cleanup_interval: How often to clean up expired entries, in seconds.
   :type cleanup_interval: float

   **Usage:**

   .. code-block:: python

      from fastapi_traffic import SQLiteBackend, RateLimiter

      backend = SQLiteBackend("rate_limits.db")
      limiter = RateLimiter(backend)

      @app.on_event("startup")
      async def startup():
          await limiter.initialize()

      @app.on_event("shutdown")
      async def shutdown():
          await limiter.close()

   .. py:method:: initialize()
      :async:

      Initialize the database schema.

   Features:

   - WAL mode for better concurrent performance
   - Automatic schema creation
   - Connection pooling
   - Background cleanup of expired entries

RedisBackend
------------

.. py:class:: RedisBackend

   Redis storage backend for distributed rate limiting.

   .. py:method:: from_url(url, *, key_prefix="", **kwargs)
      :classmethod:
      :async:

      Create a RedisBackend from a Redis URL.

      :param url: Redis connection URL.
      :type url: str
      :param key_prefix: Prefix for all keys.
      :type key_prefix: str
      :returns: Configured RedisBackend instance.
      :rtype: RedisBackend

   **Usage:**

   .. code-block:: python

      from fastapi_traffic.backends.redis import RedisBackend
      from fastapi_traffic import RateLimiter

      @app.on_event("startup")
      async def startup():
          backend = await RedisBackend.from_url("redis://localhost:6379/0")
          limiter = RateLimiter(backend)

   **Connection examples:**

   .. code-block:: python

      # Simple connection
      backend = await RedisBackend.from_url("redis://localhost:6379/0")

      # With password
      backend = await RedisBackend.from_url("redis://:password@localhost:6379/0")

      # With key prefix
      backend = await RedisBackend.from_url(
          "redis://localhost:6379/0",
          key_prefix="myapp:ratelimit:",
      )

   .. py:method:: get_stats()
      :async:

      Get statistics about the Redis backend.

      :returns: Dictionary with stats such as key count and memory usage.
      :rtype: dict[str, Any]

   Features:

   - Atomic operations via Lua scripts
   - Automatic key expiration
   - Connection pooling
   - Support for Redis Sentinel and Cluster

Implementing Custom Backends
----------------------------

To create a custom backend, inherit from ``Backend`` and implement all abstract
methods:

.. code-block:: python

   from fastapi_traffic.backends.base import Backend
   from typing import Any

   class MyBackend(Backend):
       async def get(self, key: str) -> dict[str, Any] | None:
           # Retrieve state from your storage
           ...

       async def set(self, key: str, value: dict[str, Any], *, ttl: float) -> None:
           # Store state with expiration
           ...

       async def delete(self, key: str) -> None:
           # Remove a key
           ...

       async def exists(self, key: str) -> bool:
           # Check if the key exists
           ...

       async def increment(self, key: str, amount: int = 1) -> int:
           # Atomically increment (important for accuracy)
           ...

       async def clear(self) -> None:
           # Clear all data
           ...

       async def close(self) -> None:
           # Clean up connections
           ...

The ``value`` dictionary contains algorithm-specific state. Your backend should
serialize it appropriately (JSON works well for most cases).

docs/api/config.rst
Configuration API
=================

Configuration classes and loaders for rate limiting.

RateLimitConfig
---------------

.. py:class:: RateLimitConfig(limit, window_size=60.0, algorithm=Algorithm.SLIDING_WINDOW_COUNTER, key_prefix="ratelimit", key_extractor=default_key_extractor, burst_size=None, include_headers=True, error_message="Rate limit exceeded", status_code=429, skip_on_error=False, cost=1, exempt_when=None, on_blocked=None)

   Configuration for a rate limit rule.

   :param limit: Maximum requests allowed in the window. Must be positive.
   :type limit: int
   :param window_size: Time window in seconds. Must be positive.
   :type window_size: float
   :param algorithm: Rate limiting algorithm to use.
   :type algorithm: Algorithm
   :param key_prefix: Prefix for the rate limit key.
   :type key_prefix: str
   :param key_extractor: Function to extract client identifier from request.
   :type key_extractor: Callable[[Request], str]
   :param burst_size: Maximum burst size for token/leaky bucket.
   :type burst_size: int | None
   :param include_headers: Whether to include rate limit headers.
   :type include_headers: bool
   :param error_message: Error message when rate limited.
   :type error_message: str
   :param status_code: HTTP status code when rate limited.
   :type status_code: int
   :param skip_on_error: Skip rate limiting on backend errors.
   :type skip_on_error: bool
   :param cost: Cost per request.
   :type cost: int
   :param exempt_when: Function to check if a request is exempt.
   :type exempt_when: Callable[[Request], bool] | None
   :param on_blocked: Callback when a request is blocked.
   :type on_blocked: Callable[[Request, Any], Any] | None

   **Usage:**

   .. code-block:: python

      from fastapi_traffic import RateLimitConfig, Algorithm

      config = RateLimitConfig(
          limit=100,
          window_size=60,
          algorithm=Algorithm.TOKEN_BUCKET,
          burst_size=20,
      )

GlobalConfig
------------

.. py:class:: GlobalConfig(backend=None, enabled=True, default_limit=100, default_window_size=60.0, default_algorithm=Algorithm.SLIDING_WINDOW_COUNTER, key_prefix="fastapi_traffic", include_headers=True, error_message="Rate limit exceeded. Please try again later.", status_code=429, skip_on_error=False, exempt_ips=set(), exempt_paths=set(), headers_prefix="X-RateLimit")

   Global configuration for the rate limiter.

   :param backend: Storage backend for rate limit data.
   :type backend: Backend | None
   :param enabled: Whether rate limiting is enabled.
   :type enabled: bool
   :param default_limit: Default maximum requests per window.
   :type default_limit: int
   :param default_window_size: Default time window in seconds.
   :type default_window_size: float
   :param default_algorithm: Default rate limiting algorithm.
   :type default_algorithm: Algorithm
   :param key_prefix: Global prefix for all rate limit keys.
   :type key_prefix: str
   :param include_headers: Include rate limit headers by default.
   :type include_headers: bool
   :param error_message: Default error message.
   :type error_message: str
   :param status_code: Default HTTP status code.
   :type status_code: int
   :param skip_on_error: Skip rate limiting on backend errors.
   :type skip_on_error: bool
   :param exempt_ips: IP addresses exempt from rate limiting.
   :type exempt_ips: set[str]
   :param exempt_paths: URL paths exempt from rate limiting.
   :type exempt_paths: set[str]
   :param headers_prefix: Prefix for rate limit headers.
   :type headers_prefix: str

   **Usage:**

   .. code-block:: python

      from fastapi_traffic import GlobalConfig, RateLimiter

      config = GlobalConfig(
          enabled=True,
          default_limit=100,
          exempt_paths={"/health", "/docs"},
          exempt_ips={"127.0.0.1"},
      )
      limiter = RateLimiter(config=config)

ConfigLoader
------------

.. py:class:: ConfigLoader(prefix="FASTAPI_TRAFFIC")

   Load rate limit configuration from various sources.

   :param prefix: Environment variable prefix.
   :type prefix: str

   .. py:method:: load_rate_limit_config_from_env(env_vars=None, **overrides)

      Load RateLimitConfig from environment variables.

      :param env_vars: Dictionary of environment variables. Uses ``os.environ`` if None.
      :type env_vars: dict[str, str] | None
      :param overrides: Values to override after loading.
      :returns: Loaded configuration.
      :rtype: RateLimitConfig

   .. py:method:: load_rate_limit_config_from_json(file_path, **overrides)

      Load RateLimitConfig from a JSON file.

      :param file_path: Path to the JSON file.
      :type file_path: str | Path
      :param overrides: Values to override after loading.
      :returns: Loaded configuration.
      :rtype: RateLimitConfig

   .. py:method:: load_rate_limit_config_from_env_file(file_path, **overrides)

      Load RateLimitConfig from a ``.env`` file.

      :param file_path: Path to the ``.env`` file.
      :type file_path: str | Path
      :param overrides: Values to override after loading.
      :returns: Loaded configuration.
      :rtype: RateLimitConfig

   .. py:method:: load_global_config_from_env(env_vars=None, **overrides)

      Load GlobalConfig from environment variables.

   .. py:method:: load_global_config_from_json(file_path, **overrides)

      Load GlobalConfig from a JSON file.

   .. py:method:: load_global_config_from_env_file(file_path, **overrides)

      Load GlobalConfig from a ``.env`` file.

   **Usage:**

   .. code-block:: python

      from fastapi_traffic import ConfigLoader

      loader = ConfigLoader()

      # From environment
      config = loader.load_rate_limit_config_from_env()

      # From JSON file
      config = loader.load_rate_limit_config_from_json("config.json")

      # From .env file
      config = loader.load_rate_limit_config_from_env_file(".env")

      # With overrides
      config = loader.load_rate_limit_config_from_json(
          "config.json",
          limit=200,  # Override the limit
      )

Convenience Functions
---------------------

.. py:function:: load_rate_limit_config(file_path, **overrides)

   Load RateLimitConfig with automatic format detection.

   :param file_path: Path to the config file (``.json`` or ``.env``).
   :type file_path: str | Path
   :returns: Loaded configuration.
   :rtype: RateLimitConfig

.. py:function:: load_rate_limit_config_from_env(**overrides)

   Load RateLimitConfig from environment variables.

   :returns: Loaded configuration.
   :rtype: RateLimitConfig

.. py:function:: load_global_config(file_path, **overrides)

   Load GlobalConfig with automatic format detection.

   :param file_path: Path to the config file (``.json`` or ``.env``).
   :type file_path: str | Path
   :returns: Loaded configuration.
   :rtype: GlobalConfig

.. py:function:: load_global_config_from_env(**overrides)

   Load GlobalConfig from environment variables.

   :returns: Loaded configuration.
   :rtype: GlobalConfig

**Usage:**

.. code-block:: python

   from fastapi_traffic import (
       load_rate_limit_config,
       load_rate_limit_config_from_env,
   )

   # Auto-detect format
   config = load_rate_limit_config("config.json")
   config = load_rate_limit_config(".env")

   # From environment
   config = load_rate_limit_config_from_env()

default_key_extractor
---------------------

.. py:function:: default_key_extractor(request)

   Extract the client IP as the default rate limit key.

   Checks, in order:

   1. ``X-Forwarded-For`` header (first IP)
   2. ``X-Real-IP`` header
   3. Direct connection IP
   4. Falls back to ``"unknown"``

   :param request: The incoming request.
   :type request: Request
   :returns: Client identifier string.
   :rtype: str

docs/api/decorator.rst
Decorator API
=============

The ``@rate_limit`` decorator is the primary way to add rate limiting to your
FastAPI endpoints.

rate_limit
----------

.. py:function:: rate_limit(limit, window_size=60.0, *, algorithm=Algorithm.SLIDING_WINDOW_COUNTER, key_prefix="ratelimit", key_extractor=default_key_extractor, burst_size=None, include_headers=True, error_message="Rate limit exceeded", status_code=429, skip_on_error=False, cost=1, exempt_when=None, on_blocked=None)

   Apply rate limiting to a FastAPI endpoint.

   :param limit: Maximum number of requests allowed in the window.
   :type limit: int
   :param window_size: Time window in seconds. Defaults to 60.
   :type window_size: float
   :param algorithm: Rate limiting algorithm to use.
   :type algorithm: Algorithm
   :param key_prefix: Prefix for the rate limit key.
   :type key_prefix: str
   :param key_extractor: Function to extract client identifier from request.
   :type key_extractor: Callable[[Request], str]
   :param burst_size: Maximum burst size for token bucket/leaky bucket algorithms.
   :type burst_size: int | None
   :param include_headers: Whether to include rate limit headers in the response.
   :type include_headers: bool
   :param error_message: Error message when the rate limit is exceeded.
   :type error_message: str
   :param status_code: HTTP status code when the rate limit is exceeded.
   :type status_code: int
   :param skip_on_error: Skip rate limiting if backend errors occur.
   :type skip_on_error: bool
   :param cost: Cost of each request (default 1).
   :type cost: int
   :param exempt_when: Function to determine if a request should be exempt.
   :type exempt_when: Callable[[Request], bool] | None
   :param on_blocked: Callback when a request is blocked.
   :type on_blocked: Callable[[Request, Any], Any] | None
   :returns: Decorated function with rate limiting applied.
   :rtype: Callable

   **Basic usage:**

   .. code-block:: python

      from fastapi import FastAPI, Request
      from fastapi_traffic import rate_limit

      app = FastAPI()

      @app.get("/api/data")
      @rate_limit(100, 60)  # 100 requests per minute
      async def get_data(request: Request):
          return {"data": "here"}

   **With an algorithm:**

   .. code-block:: python

      from fastapi_traffic import rate_limit, Algorithm

      @app.get("/api/burst")
      @rate_limit(100, 60, algorithm=Algorithm.TOKEN_BUCKET, burst_size=20)
      async def burst_endpoint(request: Request):
          return {"status": "ok"}

   **With a custom key extractor:**

   .. code-block:: python

      def get_api_key(request: Request) -> str:
          return request.headers.get("X-API-Key", "anonymous")

      @app.get("/api/data")
      @rate_limit(1000, 3600, key_extractor=get_api_key)
      async def api_endpoint(request: Request):
          return {"data": "here"}

   **With an exemption:**

   .. code-block:: python

      def is_admin(request: Request) -> bool:
          return getattr(request.state, "is_admin", False)

      @app.get("/api/admin")
      @rate_limit(100, 60, exempt_when=is_admin)
      async def admin_endpoint(request: Request):
          return {"admin": "data"}

RateLimitDependency
-------------------

.. py:class:: RateLimitDependency(limit, window_size=60.0, *, algorithm=Algorithm.SLIDING_WINDOW_COUNTER, key_prefix="ratelimit", key_extractor=default_key_extractor, burst_size=None, error_message="Rate limit exceeded", status_code=429, skip_on_error=False, cost=1, exempt_when=None)
   :no-index:

   FastAPI dependency for rate limiting. Returns rate limit info that can be
   used in your endpoint. See :doc:`dependency` for full documentation.

   :param limit: Maximum number of requests allowed in the window.
   :type limit: int
   :param window_size: Time window in seconds.
   :type window_size: float

   **Usage:**

   .. code-block:: python

      from fastapi import FastAPI, Depends, Request
      from fastapi_traffic.core.decorator import RateLimitDependency

      app = FastAPI()
      rate_dep = RateLimitDependency(limit=100, window_size=60)

      @app.get("/api/data")
      async def get_data(request: Request, rate_info=Depends(rate_dep)):
          return {
              "data": "here",
              "remaining_requests": rate_info.remaining,
              "reset_at": rate_info.reset_at,
          }

   The dependency returns a ``RateLimitInfo`` object with:

   - ``limit``: The configured limit
   - ``remaining``: Remaining requests in the current window
   - ``reset_at``: Unix timestamp when the window resets
   - ``retry_after``: Seconds until retry (if rate limited)

create_rate_limit_response
--------------------------

.. py:function:: create_rate_limit_response(exc, *, include_headers=True)

   Create a standard rate limit response from a RateLimitExceeded exception.

   :param exc: The RateLimitExceeded exception.
   :type exc: RateLimitExceeded
   :param include_headers: Whether to include rate limit headers.
   :type include_headers: bool
   :returns: A JSONResponse with rate limit information.
   :rtype: Response

   **Usage:**

   .. code-block:: python

      from fastapi_traffic import RateLimitExceeded
      from fastapi_traffic.core.decorator import create_rate_limit_response

      @app.exception_handler(RateLimitExceeded)
      async def handler(request: Request, exc: RateLimitExceeded):
          return create_rate_limit_response(exc)

docs/api/dependency.rst
Dependency Injection API
========================

If you're already using FastAPI's dependency injection system, you'll feel right
at home with ``RateLimitDependency``. It plugs directly into ``Depends``, giving
you rate limiting that works just like any other dependency, plus access to the
rate limit info right inside your endpoint.

RateLimitDependency
-------------------

.. py:class:: RateLimitDependency(limit, window_size=60.0, *, algorithm=Algorithm.SLIDING_WINDOW_COUNTER, key_prefix="ratelimit", key_extractor=default_key_extractor, burst_size=None, error_message="Rate limit exceeded", status_code=429, skip_on_error=False, cost=1, exempt_when=None)

   This is the main class you'll use for dependency-based rate limiting. Create
   an instance, pass it to ``Depends()``, and you're done.

   :param limit: Maximum number of requests allowed in the window.
   :type limit: int
   :param window_size: Time window in seconds. Defaults to 60.
   :type window_size: float
   :param algorithm: Rate limiting algorithm to use.
   :type algorithm: Algorithm
   :param key_prefix: Prefix for the rate limit key.
   :type key_prefix: str
   :param key_extractor: Function to extract client identifier from request.
   :type key_extractor: Callable[[Request], str]
   :param burst_size: Maximum burst size for token bucket/leaky bucket algorithms.
   :type burst_size: int | None
   :param error_message: Error message when the rate limit is exceeded.
   :type error_message: str
   :param status_code: HTTP status code when the rate limit is exceeded.
   :type status_code: int
   :param skip_on_error: Skip rate limiting if backend errors occur.
   :type skip_on_error: bool
   :param cost: Cost of each request (default 1).
   :type cost: int
   :param exempt_when: Function to determine if a request should be exempt.
   :type exempt_when: Callable[[Request], bool] | None

   **Returns:** A ``RateLimitInfo`` object with details about the current rate
   limit state.

RateLimitInfo
-------------

When the dependency runs, it hands you back a ``RateLimitInfo`` object. Here's
what's inside:

.. py:class:: RateLimitInfo

   :param limit: The configured request limit.
   :type limit: int
   :param remaining: Remaining requests in the current window.
   :type remaining: int
   :param reset_at: Unix timestamp when the window resets.
   :type reset_at: float
   :param retry_after: Seconds until retry is allowed (if rate limited).
   :type retry_after: float | None
   :param window_size: The configured window size in seconds.
   :type window_size: float

   .. py:method:: to_headers() -> dict[str, str]

      Converts the rate limit info into standard HTTP headers. Handy if you
      want to add these headers to your response manually.

      :returns: A dictionary with ``X-RateLimit-Limit``, ``X-RateLimit-Remaining``,
         ``X-RateLimit-Reset``, and ``Retry-After`` (when applicable).

Setup
-----

Before you can use the dependency, you need to set up the rate limiter. The
cleanest way is with FastAPI's lifespan context manager:

.. code-block:: python

   from contextlib import asynccontextmanager

   from fastapi import FastAPI
   from fastapi_traffic import MemoryBackend, RateLimiter
   from fastapi_traffic.core.limiter import set_limiter

   backend = MemoryBackend()
   limiter = RateLimiter(backend)

   @asynccontextmanager
   async def lifespan(app: FastAPI):
       await limiter.initialize()
       set_limiter(limiter)
       yield
       await limiter.close()

   app = FastAPI(lifespan=lifespan)

Basic Usage
-----------

Here's the simplest way to get started. Create a dependency instance and inject
it with ``Depends``:

.. code-block:: python

   from fastapi import Depends, FastAPI, Request
   from fastapi_traffic.core.decorator import RateLimitDependency

   app = FastAPI()

   # Create the rate limit dependency
   rate_limit_dep = RateLimitDependency(limit=100, window_size=60)

   @app.get("/api/data")
   async def get_data(
       request: Request,
       rate_info=Depends(rate_limit_dep),
   ):
       return {
           "data": "here",
           "remaining_requests": rate_info.remaining,
           "reset_at": rate_info.reset_at,
       }

Using Type Aliases
------------------

If you're using the same rate limit across multiple endpoints, type aliases
with ``Annotated`` make your code much cleaner:

.. code-block:: python

   from typing import Annotated, TypeAlias

   from fastapi import Depends, FastAPI, Request
   from fastapi_traffic.core.decorator import RateLimitDependency
   from fastapi_traffic.core.models import RateLimitInfo

   app = FastAPI()

   rate_limit_dep = RateLimitDependency(limit=100, window_size=60)

   # Create a type alias for cleaner signatures
   RateLimit: TypeAlias = Annotated[RateLimitInfo, Depends(rate_limit_dep)]

   @app.get("/api/data")
   async def get_data(request: Request, rate_info: RateLimit):
       return {
           "data": "here",
           "remaining": rate_info.remaining,
       }

Tiered Rate Limits
------------------

This is where dependency injection really shines. You can apply different rate
limits based on who's making the request: free users get 10 requests per
minute, pro users get 100, and enterprise gets 1000:

.. code-block:: python

   from typing import Annotated, TypeAlias

   from fastapi import Depends, FastAPI, Request
   from fastapi_traffic.core.decorator import RateLimitDependency
   from fastapi_traffic.core.models import RateLimitInfo

   app = FastAPI()

   # Define tier-specific limits
   free_tier_limit = RateLimitDependency(
       limit=10,
       window_size=60,
       key_prefix="free",
   )
   pro_tier_limit = RateLimitDependency(
       limit=100,
       window_size=60,
       key_prefix="pro",
   )
   enterprise_tier_limit = RateLimitDependency(
       limit=1000,
       window_size=60,
       key_prefix="enterprise",
   )

   def get_user_tier(request: Request) -> str:
       """Get user tier from a header (in a real app, from a JWT or database)."""
       return request.headers.get("X-User-Tier", "free")

   TierDep: TypeAlias = Annotated[str, Depends(get_user_tier)]

   async def tiered_rate_limit(
       request: Request,
       tier: TierDep,
   ) -> RateLimitInfo:
       """Apply different rate limits based on user tier."""
       if tier == "enterprise":
           return await enterprise_tier_limit(request)
       elif tier == "pro":
           return await pro_tier_limit(request)
       else:
           return await free_tier_limit(request)

   TieredRateLimit: TypeAlias = Annotated[RateLimitInfo, Depends(tiered_rate_limit)]

   @app.get("/api/resource")
   async def get_resource(request: Request, rate_info: TieredRateLimit):
       tier = get_user_tier(request)
       return {
           "tier": tier,
           "remaining": rate_info.remaining,
           "limit": rate_info.limit,
       }

Custom Key Extraction
---------------------

By default, rate limits are tracked by IP address. But what if you want to rate
limit by API key instead? Just pass a custom ``key_extractor``:

.. code-block:: python

   from fastapi import Depends, FastAPI, Request
   from fastapi_traffic.core.decorator import RateLimitDependency

   app = FastAPI()

   def api_key_extractor(request: Request) -> str:
       """Extract the API key for rate limiting."""
       api_key = request.headers.get("X-API-Key", "anonymous")
       return f"api:{api_key}"

   api_rate_limit = RateLimitDependency(
       limit=100,
       window_size=3600,  # 100 requests per hour
       key_extractor=api_key_extractor,
   )

   @app.get("/api/resource")
   async def api_resource(
       request: Request,
       rate_info=Depends(api_rate_limit),
   ):
       return {
           "data": "Resource data",
           "requests_remaining": rate_info.remaining,
       }

Multiple Rate Limits
--------------------

Sometimes you need layered protection: say, 10 requests per minute *and* 100
requests per hour. Dependencies make this easy to compose:

.. code-block:: python

   from typing import Annotated, Any, TypeAlias

   from fastapi import Depends, FastAPI, Request
   from fastapi_traffic.core.decorator import RateLimitDependency
   from fastapi_traffic.core.models import RateLimitInfo

   app = FastAPI()

   per_minute_limit = RateLimitDependency(
       limit=10,
       window_size=60,
       key_prefix="minute",
   )
   per_hour_limit = RateLimitDependency(
       limit=100,
       window_size=3600,
       key_prefix="hour",
   )

   PerMinuteLimit: TypeAlias = Annotated[RateLimitInfo, Depends(per_minute_limit)]
   PerHourLimit: TypeAlias = Annotated[RateLimitInfo, Depends(per_hour_limit)]

   async def combined_rate_limit(
       request: Request,
       minute_info: PerMinuteLimit,
       hour_info: PerHourLimit,
   ) -> dict[str, Any]:
       """Apply both per-minute and per-hour limits."""
       return {
           "minute": {
               "limit": minute_info.limit,
               "remaining": minute_info.remaining,
           },
           "hour": {
               "limit": hour_info.limit,
               "remaining": hour_info.remaining,
           },
       }

   CombinedRateLimit: TypeAlias = Annotated[dict[str, Any], Depends(combined_rate_limit)]

   @app.get("/api/combined")
   async def combined_endpoint(
       request: Request,
       rate_info: CombinedRateLimit,
   ):
       return {
           "message": "Success",
           "rate_limits": rate_info,
       }

Exemption Logic
---------------

Need to let certain requests bypass rate limiting entirely? Maybe internal
services or admin users? Use the ``exempt_when`` parameter:

.. code-block:: python

   from fastapi import Depends, FastAPI, Request
   from fastapi_traffic.core.decorator import RateLimitDependency

   app = FastAPI()

   def is_internal_request(request: Request) -> bool:
       """Check if the request comes from an internal service."""
       internal_token = request.headers.get("X-Internal-Token")
       return internal_token == "internal-secret-token"

   internal_exempt_limit = RateLimitDependency(
       limit=10,
       window_size=60,
       exempt_when=is_internal_request,
   )

   @app.get("/api/internal")
   async def internal_endpoint(
       request: Request,
       rate_info=Depends(internal_exempt_limit),
   ):
       is_internal = is_internal_request(request)
       return {
           "message": "Success",
           "is_internal": is_internal,
           "rate_limit": None if is_internal else {
               "remaining": rate_info.remaining,
           },
       }
Exception Handling
------------------
When a request exceeds the rate limit, a ``RateLimitExceeded`` exception is
raised. You'll want to catch this and return a proper response:
.. code-block:: python
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from fastapi_traffic import RateLimitExceeded
app = FastAPI()
@app.exception_handler(RateLimitExceeded)
async def rate_limit_handler(
request: Request,
exc: RateLimitExceeded,
) -> JSONResponse:
return JSONResponse(
status_code=429,
content={
"error": "rate_limit_exceeded",
"message": exc.message,
"retry_after": exc.retry_after,
},
)
Or if you prefer, there's a built-in helper that does the work for you:
.. code-block:: python
from fastapi import FastAPI, Request
from fastapi_traffic import RateLimitExceeded
from fastapi_traffic.core.decorator import create_rate_limit_response
app = FastAPI()
@app.exception_handler(RateLimitExceeded)
async def rate_limit_handler(request: Request, exc: RateLimitExceeded):
return create_rate_limit_response(exc, include_headers=True)
Complete Example
----------------
Here's everything put together in a working example you can copy and run:
.. code-block:: python
from contextlib import asynccontextmanager
from typing import Annotated, TypeAlias
from fastapi import Depends, FastAPI, Request
from fastapi.responses import JSONResponse
from fastapi_traffic import (
MemoryBackend,
RateLimiter,
RateLimitExceeded,
)
from fastapi_traffic.core.decorator import RateLimitDependency
from fastapi_traffic.core.limiter import set_limiter
from fastapi_traffic.core.models import RateLimitInfo
# Initialize backend and limiter
backend = MemoryBackend()
limiter = RateLimiter(backend)
@asynccontextmanager
async def lifespan(app: FastAPI):
await limiter.initialize()
set_limiter(limiter)
yield
await limiter.close()
app = FastAPI(lifespan=lifespan)
# Exception handler
@app.exception_handler(RateLimitExceeded)
async def rate_limit_handler(
request: Request,
exc: RateLimitExceeded,
) -> JSONResponse:
return JSONResponse(
status_code=429,
content={
"error": "rate_limit_exceeded",
"retry_after": exc.retry_after,
},
)
# Create dependency
api_rate_limit = RateLimitDependency(limit=100, window_size=60)
ApiRateLimit: TypeAlias = Annotated[RateLimitInfo, Depends(api_rate_limit)]
@app.get("/api/data")
async def get_data(request: Request, rate_info: ApiRateLimit):
return {
"data": "Your data here",
"rate_limit": {
"limit": rate_info.limit,
"remaining": rate_info.remaining,
"reset_at": rate_info.reset_at,
},
}
Decorator vs Dependency
-----------------------
Not sure which approach to use? Here's a quick guide:
**Go with the ``@rate_limit`` decorator if:**
- You just want to slap a rate limit on an endpoint and move on
- You don't care about the remaining request count inside your endpoint
- You're applying the same limit to a bunch of endpoints
**Go with ``RateLimitDependency`` if:**
- You want to show users how many requests they have left
- You need different limits for different user tiers
- You're stacking multiple rate limits (per-minute + per-hour)
- You're already using FastAPI's dependency system and want consistency
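The two styles look like this side by side; a minimal sketch that assumes a limiter has already been initialized (as in the complete example above) and uses illustrative paths:

.. code-block:: python

    from typing import Annotated, TypeAlias

    from fastapi import Depends, FastAPI, Request
    from fastapi_traffic import rate_limit
    from fastapi_traffic.core.decorator import RateLimitDependency
    from fastapi_traffic.core.models import RateLimitInfo

    app = FastAPI()

    # Decorator: one line, but the endpoint never sees the limit state.
    @app.get("/public")
    @rate_limit(100, 60)
    async def public_endpoint(request: Request):
        return {"status": "ok"}

    # Dependency: a little more setup, but rate_info is available to
    # report the remaining quota back to the caller.
    account_limit = RateLimitDependency(limit=100, window_size=60)
    AccountLimit: TypeAlias = Annotated[RateLimitInfo, Depends(account_limit)]

    @app.get("/account")
    async def account_endpoint(request: Request, rate_info: AccountLimit):
        return {"remaining": rate_info.remaining}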
See Also
--------
- :doc:`decorator` - Decorator-based rate limiting
- :doc:`middleware` - Global middleware rate limiting
- :doc:`config` - Configuration options
- :doc:`exceptions` - Exception handling

.. docs/api/exceptions.rst
Exceptions API
==============
Custom exceptions raised by FastAPI Traffic.
FastAPITrafficError
-------------------
.. py:exception:: FastAPITrafficError
Base exception for all FastAPI Traffic errors.
All other exceptions in this library inherit from this class, so you can
catch all FastAPI Traffic errors with a single handler:
.. code-block:: python
from fastapi_traffic.exceptions import FastAPITrafficError
@app.exception_handler(FastAPITrafficError)
async def handle_traffic_error(request: Request, exc: FastAPITrafficError):
return JSONResponse(
status_code=500,
content={"error": str(exc)},
)
RateLimitExceeded
-----------------
.. py:exception:: RateLimitExceeded(message="Rate limit exceeded", *, retry_after=None, limit_info=None)
Raised when a rate limit has been exceeded.
:param message: Error message.
:type message: str
:param retry_after: Seconds until the client can retry.
:type retry_after: float | None
:param limit_info: Detailed rate limit information.
:type limit_info: RateLimitInfo | None
.. py:attribute:: message
:type: str
The error message.
.. py:attribute:: retry_after
:type: float | None
Seconds until the client can retry. May be None if not calculable.
.. py:attribute:: limit_info
:type: RateLimitInfo | None
Detailed information about the rate limit state.
**Usage:**
.. code-block:: python
from fastapi import Request
from fastapi.responses import JSONResponse
from fastapi_traffic import RateLimitExceeded
@app.exception_handler(RateLimitExceeded)
async def rate_limit_handler(request: Request, exc: RateLimitExceeded):
headers = {}
if exc.limit_info:
headers = exc.limit_info.to_headers()
return JSONResponse(
status_code=429,
content={
"error": "rate_limit_exceeded",
"message": exc.message,
"retry_after": exc.retry_after,
},
headers=headers,
)
BackendError
------------
.. py:exception:: BackendError(message="Backend operation failed", *, original_error=None)
Raised when a backend operation fails.
:param message: Error message.
:type message: str
:param original_error: The original exception that caused this error.
:type original_error: Exception | None
.. py:attribute:: message
:type: str
The error message.
.. py:attribute:: original_error
:type: Exception | None
The underlying exception, if any.
**Usage:**
.. code-block:: python
from fastapi_traffic import BackendError
@app.exception_handler(BackendError)
async def backend_error_handler(request: Request, exc: BackendError):
# Log the original error for debugging
if exc.original_error:
logger.error("Backend error: %s", exc.original_error)
return JSONResponse(
status_code=503,
content={"error": "service_unavailable"},
)
This exception is raised when:
- The Redis connection fails
- The SQLite database is locked or corrupted
- Any other backend storage operation fails
ConfigurationError
------------------
.. py:exception:: ConfigurationError
Raised when there is a configuration error.
This exception is raised when:
- A configuration file contains invalid values
- Required configuration is missing
- A value cannot be converted to its expected type
- A configuration file contains unknown fields
**Usage:**
.. code-block:: python
from fastapi_traffic import ConfigLoader, ConfigurationError, RateLimitConfig
loader = ConfigLoader()
try:
config = loader.load_rate_limit_config_from_json("config.json")
except ConfigurationError as e:
print(f"Configuration error: {e}")
# Use default configuration
config = RateLimitConfig(limit=100, window_size=60)
Exception Hierarchy
-------------------
.. code-block:: text
FastAPITrafficError
├── RateLimitExceeded
├── BackendError
└── ConfigurationError
All exceptions inherit from ``FastAPITrafficError``, which inherits from
Python's built-in ``Exception``.
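Because of this hierarchy, a single ``except FastAPITrafficError`` clause is enough to catch any of the subclasses; a quick sketch using the imports shown above:

.. code-block:: python

    from fastapi_traffic import BackendError, RateLimitExceeded
    from fastapi_traffic.exceptions import FastAPITrafficError

    # Both specific errors are instances of the shared base class, so one
    # except clause handles rate limiting and backend failures alike.
    for exc in (RateLimitExceeded(), BackendError()):
        try:
            raise exc
        except FastAPITrafficError as err:
            print(f"caught {type(err).__name__}")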

.. docs/api/middleware.rst
Middleware API
==============
Middleware for applying rate limiting globally across your application.
RateLimitMiddleware
-------------------
.. py:class:: RateLimitMiddleware(app, *, limit=100, window_size=60.0, algorithm=Algorithm.SLIDING_WINDOW_COUNTER, backend=None, key_prefix="middleware", include_headers=True, error_message="Rate limit exceeded. Please try again later.", status_code=429, skip_on_error=False, exempt_paths=None, exempt_ips=None, key_extractor=default_key_extractor)
Middleware for global rate limiting across all endpoints.
:param app: The ASGI application.
:type app: ASGIApp
:param limit: Maximum requests per window.
:type limit: int
:param window_size: Time window in seconds.
:type window_size: float
:param algorithm: Rate limiting algorithm.
:type algorithm: Algorithm
:param backend: Storage backend. Defaults to MemoryBackend.
:type backend: Backend | None
:param key_prefix: Prefix for rate limit keys.
:type key_prefix: str
:param include_headers: Include rate limit headers in response.
:type include_headers: bool
:param error_message: Error message when rate limited.
:type error_message: str
:param status_code: HTTP status code when rate limited.
:type status_code: int
:param skip_on_error: Skip rate limiting on backend errors.
:type skip_on_error: bool
:param exempt_paths: Paths to exempt from rate limiting.
:type exempt_paths: set[str] | None
:param exempt_ips: IP addresses to exempt from rate limiting.
:type exempt_ips: set[str] | None
:param key_extractor: Function to extract client identifier.
:type key_extractor: Callable[[Request], str]
**Basic usage:**
.. code-block:: python
from fastapi import FastAPI
from fastapi_traffic.middleware import RateLimitMiddleware
app = FastAPI()
app.add_middleware(
RateLimitMiddleware,
limit=1000,
window_size=60,
)
**With exemptions:**
.. code-block:: python
app.add_middleware(
RateLimitMiddleware,
limit=1000,
window_size=60,
exempt_paths={"/health", "/docs"},
exempt_ips={"127.0.0.1"},
)
**With custom backend:**
.. code-block:: python
from fastapi_traffic import SQLiteBackend
backend = SQLiteBackend("rate_limits.db")
app.add_middleware(
RateLimitMiddleware,
limit=1000,
window_size=60,
backend=backend,
)
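**With a custom key extractor:**

Any ``Callable[[Request], str]`` works as ``key_extractor``. A sketch that limits per user when an (illustrative) ``X-User-Id`` header is present and falls back to the client IP otherwise:

.. code-block:: python

    from fastapi import FastAPI, Request
    from fastapi_traffic.middleware import RateLimitMiddleware

    def user_or_ip_key(request: Request) -> str:
        """Rate limit per user when identified, otherwise per client IP."""
        user_id = request.headers.get("X-User-Id")  # illustrative header
        if user_id:
            return f"user:{user_id}"
        return f"ip:{request.client.host if request.client else 'unknown'}"

    app = FastAPI()
    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        key_extractor=user_or_ip_key,
    )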
SlidingWindowMiddleware
-----------------------
.. py:class:: SlidingWindowMiddleware(app, *, limit=100, window_size=60.0, **kwargs)
Convenience middleware using the sliding window algorithm.
Accepts all the same parameters as ``RateLimitMiddleware``.
.. code-block:: python
from fastapi_traffic.middleware import SlidingWindowMiddleware
app.add_middleware(
SlidingWindowMiddleware,
limit=1000,
window_size=60,
)
TokenBucketMiddleware
---------------------
.. py:class:: TokenBucketMiddleware(app, *, limit=100, window_size=60.0, **kwargs)
Convenience middleware using the token bucket algorithm.
Accepts all the same parameters as ``RateLimitMiddleware``.
.. code-block:: python
from fastapi_traffic.middleware import TokenBucketMiddleware
app.add_middleware(
TokenBucketMiddleware,
limit=1000,
window_size=60,
)