release: bump version to 0.3.0
- Refactor Redis backend connection handling and pool management
- Update algorithm implementations with improved type annotations
- Enhance config loader validation with stricter Pydantic schemas
- Improve decorator and middleware error handling
- Expand example scripts with better docstrings and usage patterns
- Add new 00_basic_usage.py example for quick start
- Reorganize examples directory structure
- Fix type annotation inconsistencies across core modules
- Update dependencies in pyproject.toml
docs/Makefile (new file, 14 lines)
@@ -0,0 +1,14 @@
# Minimal makefile for Sphinx documentation

SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
docs/advanced/distributed-systems.rst (new file, 319 lines)
@@ -0,0 +1,319 @@
Distributed Systems
===================

Running rate limiting across multiple application instances requires careful
consideration. This guide covers the patterns and pitfalls.

The Challenge
-------------

In a distributed system, you might have:

- Multiple application instances behind a load balancer
- Kubernetes pods that scale up and down
- Serverless functions that run independently

Each instance needs to share rate limit state. Otherwise, a client could make
100 requests to instance A and another 100 to instance B, effectively bypassing
a 100-request limit.

Redis: The Standard Solution
----------------------------

Redis is the go-to choice for distributed rate limiting:

.. code-block:: python

    from fastapi import FastAPI
    from fastapi_traffic import RateLimiter
    from fastapi_traffic.backends.redis import RedisBackend
    from fastapi_traffic.core.limiter import get_limiter, set_limiter

    app = FastAPI()

    @app.on_event("startup")
    async def startup():
        backend = await RedisBackend.from_url(
            "redis://redis-server:6379/0",
            key_prefix="myapp:ratelimit",
        )
        limiter = RateLimiter(backend)
        set_limiter(limiter)
        await limiter.initialize()

    @app.on_event("shutdown")
    async def shutdown():
        limiter = get_limiter()
        await limiter.close()

All instances connect to the same Redis server and share state.
High Availability Redis
-----------------------

For production, you'll want Redis with high availability:

**Redis Sentinel:**

.. code-block:: python

    backend = await RedisBackend.from_url(
        "redis://sentinel1:26379,sentinel2:26379,sentinel3:26379/0",
        sentinel_master="mymaster",
    )

**Redis Cluster:**

.. code-block:: python

    backend = await RedisBackend.from_url(
        "redis://node1:6379,node2:6379,node3:6379/0",
    )

Atomic Operations
-----------------

Race conditions are a real concern in distributed systems. Consider this scenario:

1. Instance A reads: 99 requests made
2. Instance B reads: 99 requests made
3. Instance A writes: 100 requests (allows request)
4. Instance B writes: 100 requests (allows request)

Now you've allowed 101 requests when the limit was 100.

FastAPI Traffic's Redis backend uses Lua scripts to make operations atomic:

.. code-block:: lua

    -- Simplified example of atomic check-and-increment
    local limit = tonumber(ARGV[1])
    local current = redis.call('GET', KEYS[1])
    if current and tonumber(current) >= limit then
        return 0  -- Reject
    end
    redis.call('INCR', KEYS[1])
    return 1  -- Allow

The entire check-and-update happens in a single Redis operation.
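
The interleaving in the numbered scenario can be reproduced in a few lines. This is a toy simulation, with a plain dict standing in for Redis; nothing here is library code:

```python
# Two "instances" both read the counter at 99 before either writes back.
store = {"client": 99}  # 99 requests already made; the limit is 100

read_a = store["client"]  # instance A reads 99
read_b = store["client"]  # instance B reads 99 (before A's write lands)

allowed_a = read_a < 100
store["client"] = read_a + 1  # A writes 100 and admits request #100

allowed_b = read_b < 100      # B's stale read also passes the check
store["client"] = read_b + 1  # B also writes 100 and admits request #101

# Both requests were admitted, yet the counter only reflects one of them.
print(allowed_a, allowed_b, store["client"])  # True True 100
```

The atomic Lua script avoids this because no other client can run between the ``GET`` and the ``INCR``.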
Network Latency
---------------

Redis adds network latency to every request. Some strategies to minimize the impact:

**1. Connection pooling (automatic):**

The Redis backend maintains a connection pool, so you're not creating new
connections for each request.

**2. Local caching:**

For very high-traffic endpoints, consider a two-tier approach:

.. code-block:: python

    from fastapi import Request
    from fastapi_traffic import MemoryBackend, RateLimiter, RateLimitConfig
    from fastapi_traffic.backends.redis import RedisBackend

    # Local memory backend for the fast path
    local_backend = MemoryBackend()
    local_limiter = RateLimiter(local_backend)

    # Redis backend for distributed state
    redis_backend = await RedisBackend.from_url("redis://localhost:6379/0")
    distributed_limiter = RateLimiter(redis_backend)

    async def check_rate_limit(request: Request, config: RateLimitConfig):
        # Quick local check (may allow some extra requests)
        local_result = await local_limiter.check(request, config)
        if not local_result.allowed:
            return local_result

        # Authoritative distributed check
        return await distributed_limiter.check(request, config)

**3. Skip on error:**

If Redis latency is causing issues, you might prefer to allow requests through
rather than block:

.. code-block:: python

    @rate_limit(100, 60, skip_on_error=True)
    async def endpoint(request: Request):
        return {"status": "ok"}
Handling Redis Failures
-----------------------

What happens when Redis goes down?

**Fail closed (default):**

Requests fail. This is safer but impacts availability.

**Fail open:**

Allow requests through:

.. code-block:: python

    @rate_limit(100, 60, skip_on_error=True)

**Circuit breaker pattern:**

Implement a circuit breaker to avoid hammering a failing Redis:

.. code-block:: python

    import time

    class CircuitBreaker:
        def __init__(self, failure_threshold=5, reset_timeout=60):
            self.failures = 0
            self.threshold = failure_threshold
            self.reset_timeout = reset_timeout
            self.last_failure = 0
            self.open = False

        def record_failure(self):
            self.failures += 1
            self.last_failure = time.time()
            if self.failures >= self.threshold:
                self.open = True

        def record_success(self):
            self.failures = 0
            self.open = False

        def should_allow(self) -> bool:
            if not self.open:
                return True
            # Check if we should try again
            if time.time() - self.last_failure > self.reset_timeout:
                return True
            return False
Kubernetes Deployment
---------------------

Here's a typical Kubernetes setup:

.. code-block:: yaml

    # redis-deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: redis
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: redis
      template:
        metadata:
          labels:
            app: redis
        spec:
          containers:
            - name: redis
              image: redis:7-alpine
              ports:
                - containerPort: 6379
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: redis
    spec:
      selector:
        app: redis
      ports:
        - port: 6379

.. code-block:: yaml

    # app-deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: api
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: api
      template:
        metadata:
          labels:
            app: api
        spec:
          containers:
            - name: api
              image: myapp:latest
              env:
                - name: REDIS_URL
                  value: "redis://redis:6379/0"

Your app connects to Redis via the service name:

.. code-block:: python

    import os

    redis_url = os.getenv("REDIS_URL", "redis://localhost:6379/0")
    backend = await RedisBackend.from_url(redis_url)
Monitoring
----------

Keep an eye on:

1. **Redis latency:** High latency means slow requests
2. **Redis memory:** Rate limit data shouldn't use much, but monitor it
3. **Connection count:** Make sure you're not exhausting connections
4. **Rate limit hits:** Track how often clients are being limited

.. code-block:: python

    import logging

    from fastapi import Request

    logger = logging.getLogger(__name__)

    def on_rate_limited(request: Request, result):
        logger.info(
            "Rate limited: client=%s path=%s remaining=%d",
            request.client.host,
            request.url.path,
            result.info.remaining,
        )

    @rate_limit(100, 60, on_blocked=on_rate_limited)
    async def endpoint(request: Request):
        return {"status": "ok"}
|
||||
Testing Distributed Rate Limits
|
||||
-------------------------------
|
||||
|
||||
Testing distributed behavior is tricky. Here's an approach:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
import asyncio
|
||||
import httpx
|
||||
|
||||
async def test_distributed_limit():
|
||||
"""Simulate requests from multiple 'instances'."""
|
||||
async with httpx.AsyncClient() as client:
|
||||
# Fire 150 requests concurrently
|
||||
tasks = [
|
||||
client.get("http://localhost:8000/api/data")
|
||||
for _ in range(150)
|
||||
]
|
||||
responses = await asyncio.gather(*tasks)
|
||||
|
||||
# Count successes and rate limits
|
||||
successes = sum(1 for r in responses if r.status_code == 200)
|
||||
limited = sum(1 for r in responses if r.status_code == 429)
|
||||
|
||||
print(f"Successes: {successes}, Rate limited: {limited}")
|
||||
# With a limit of 100, expect ~100 successes and ~50 limited
|
||||
|
||||
asyncio.run(test_distributed_limit())
|
||||
docs/advanced/performance.rst (new file, 291 lines)
@@ -0,0 +1,291 @@
Performance
===========

FastAPI Traffic is designed to be fast. But when you're handling thousands of
requests per second, every microsecond counts. Here's how to squeeze out the
best performance.

Baseline Performance
--------------------

On typical hardware, you can expect:

- **Memory backend:** ~0.01ms per check
- **SQLite backend:** ~0.1ms per check
- **Redis backend:** ~1ms per check (network dependent)

For most applications, this overhead is negligible compared to your actual
business logic.

Choosing the Right Algorithm
----------------------------

Algorithms have different performance characteristics:

.. list-table::
   :header-rows: 1

   * - Algorithm
     - Time Complexity
     - Space Complexity
     - Notes
   * - Token Bucket
     - O(1)
     - O(1)
     - Two floats per key
   * - Fixed Window
     - O(1)
     - O(1)
     - One int + one float per key
   * - Sliding Window Counter
     - O(1)
     - O(1)
     - Three values per key
   * - Leaky Bucket
     - O(1)
     - O(1)
     - Two floats per key
   * - Sliding Window
     - O(n)
     - O(n)
     - Stores every timestamp

**Recommendation:** Use Sliding Window Counter (the default) unless you have
specific requirements. It's O(1) and provides good accuracy.

**Avoid Sliding Window for high-volume endpoints.** If you're allowing 10,000
requests per minute, that's 10,000 timestamps to store and filter per key.
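
The counter variant stays O(1) because it approximates the sliding window from just two fixed-window counters: the previous window's count is weighted by how much of it still overlaps the sliding window. A minimal sketch of the estimate (the function name here is ours, not the library's):

```python
def sliding_window_estimate(prev_count, curr_count, window_size, elapsed_in_window):
    """Approximate requests in the last `window_size` seconds from two counters.

    `elapsed_in_window` is how far we are into the current fixed window.
    """
    # Fraction of the previous fixed window still covered by the sliding window
    overlap = (window_size - elapsed_in_window) / window_size
    return prev_count * overlap + curr_count

# 15s into the current minute: 75% of the previous window still counts.
estimate = sliding_window_estimate(prev_count=100, curr_count=20,
                                   window_size=60, elapsed_in_window=15)
# estimate == 95.0
```

The estimate assumes requests were evenly spread across the previous window, which is where the small accuracy loss comes from.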
Memory Backend Optimization
---------------------------

The memory backend is already fast, but you can tune it:

.. code-block:: python

    from fastapi_traffic import MemoryBackend

    backend = MemoryBackend(
        max_size=10000,       # Limit memory usage
        cleanup_interval=60,  # Less frequent cleanup = less overhead
    )

**max_size:** Limits the number of keys stored. When exceeded, LRU eviction kicks
in. Set this based on your expected number of unique clients.

**cleanup_interval:** How often to scan for expired entries. Higher values mean
less CPU overhead but more memory usage from expired entries.
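
The LRU eviction triggered by ``max_size`` can be sketched with an ``OrderedDict`` (illustrative only, not the actual ``MemoryBackend`` internals):

```python
from collections import OrderedDict

class LRUStore:
    """Toy key-value store with max_size LRU eviction."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._data = OrderedDict()

    def set(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)  # mark as most recently used
        self._data[key] = value
        if len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict the least recently used key

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)
        return self._data[key]

store = LRUStore(max_size=2)
store.set("a", 1)
store.set("b", 2)
store.set("c", 3)  # evicts "a", the least recently used key
```

An evicted client simply starts a fresh window on its next request, which is why ``max_size`` trades a little accuracy for bounded memory.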
SQLite Backend Optimization
---------------------------

SQLite is surprisingly fast for rate limiting:

.. code-block:: python

    from fastapi_traffic import SQLiteBackend

    backend = SQLiteBackend(
        "rate_limits.db",
        cleanup_interval=300,  # Clean every 5 minutes
    )

**Tips:**

1. **Use an SSD.** SQLite performance depends heavily on disk I/O.

2. **Put the database on a local disk.** Network-attached storage adds latency.

3. **WAL mode is enabled by default.** This allows concurrent reads and writes.

4. **Increase cleanup_interval** if you have many keys. Cleanup scans the entire
   table.
Redis Backend Optimization
--------------------------

Redis is the bottleneck in most distributed setups:

**1. Use connection pooling (automatic):**

The backend maintains a pool of connections. You don't need to do anything.

**2. Use pipelining for batch operations:**

If you're checking multiple rate limits, batch them:

.. code-block:: python

    # Instead of multiple round trips
    result1 = await limiter.check(request, config1)
    result2 = await limiter.check(request, config2)

    # Consider combining into one check with a higher cost
    combined_config = RateLimitConfig(limit=100, window_size=60, cost=2)
    result = await limiter.check(request, combined_config)

**3. Run Redis close to your application:**

Network latency is usually the biggest factor. Run Redis in the same datacenter,
or better yet, the same availability zone.

**4. Consider Redis Cluster for high throughput:**

Distributes load across multiple Redis nodes.
Reducing Overhead
-----------------

**1. Exempt paths that don't need limiting:**

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        exempt_paths={"/health", "/metrics", "/ready"},
    )

**2. Use coarse-grained limits when possible:**

Instead of limiting every endpoint separately, use middleware for a global limit:

.. code-block:: python

    # One check per request
    app.add_middleware(RateLimitMiddleware, limit=1000, window_size=60)

    # vs. multiple checks per request
    @rate_limit(100, 60)   # Check 1
    @another_decorator     # Check 2
    async def endpoint():
        pass

**3. Prefer longer windows:**

The number of state updates always equals the number of requests, regardless of
window size. Longer windows still help, though:

- Fewer unique window boundaries
- Better cache efficiency
- More stable rate limiting

.. code-block:: python

    # One window boundary per minute per client
    @rate_limit(100, 60)

    # A new window boundary every second
    @rate_limit(2, 1)  # roughly the same rate, expressed per-second

**4. Skip headers when not needed:**

.. code-block:: python

    @rate_limit(100, 60, include_headers=False)

Saves a tiny bit of response processing.
Benchmarking
------------

Here's a simple benchmark script:

.. code-block:: python

    import asyncio
    import time
    from unittest.mock import MagicMock

    from fastapi_traffic import MemoryBackend, RateLimiter, RateLimitConfig

    async def benchmark():
        backend = MemoryBackend()
        limiter = RateLimiter(backend)
        await limiter.initialize()

        config = RateLimitConfig(limit=10000, window_size=60)

        # Mock request
        request = MagicMock()
        request.client.host = "127.0.0.1"
        request.url.path = "/test"
        request.method = "GET"
        request.headers = {}

        # Warm up
        for _ in range(100):
            await limiter.check(request, config)

        # Benchmark
        iterations = 10000
        start = time.perf_counter()

        for _ in range(iterations):
            await limiter.check(request, config)

        elapsed = time.perf_counter() - start

        print(f"Total time: {elapsed:.3f}s")
        print(f"Per check: {elapsed/iterations*1000:.3f}ms")
        print(f"Checks/sec: {iterations/elapsed:.0f}")

        await limiter.close()

    asyncio.run(benchmark())

Typical output:

.. code-block:: text

    Total time: 0.150s
    Per check: 0.015ms
    Checks/sec: 66666
Profiling
---------

If you suspect rate limiting is a bottleneck, profile it:

.. code-block:: python

    import asyncio
    import cProfile
    import pstats

    async def profile_rate_limiting():
        # Your rate limiting code here
        pass

    cProfile.run('asyncio.run(profile_rate_limiting())', 'rate_limit.prof')

    stats = pstats.Stats('rate_limit.prof')
    stats.sort_stats('cumulative')
    stats.print_stats(20)

Look for:

- Time spent in backend operations
- Time spent in algorithm calculations
- Unexpected hotspots
When Performance Really Matters
-------------------------------

If you're handling millions of requests per second and rate limiting overhead
is significant:

1. **Consider sampling:** Only check rate limits for a percentage of requests
   and extrapolate.

2. **Use probabilistic data structures:** Bloom filters or Count-Min Sketch can
   approximate rate limiting with less overhead.

3. **Push to the edge:** Use CDN-level rate limiting (Cloudflare, AWS WAF) to
   handle the bulk of traffic.

4. **Accept some inaccuracy:** Fixed window with ``skip_on_error=True`` is very
   fast and "good enough" for many use cases.
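
Sampling (point 1 above) can be sketched as: count only a fraction of requests and weight each counted hit by the inverse of the sampling rate. Names and parameters here are illustrative, not library API:

```python
import random

def sampled_check(counter, key, limit, sample_rate, rng=random):
    """Count ~sample_rate of requests; each counted hit stands for 1/sample_rate requests."""
    if rng.random() >= sample_rate:
        return True  # unsampled requests pass through uncounted
    counter[key] = counter.get(key, 0) + 1
    estimated_total = counter[key] / sample_rate
    return estimated_total <= limit

# With sample_rate=1.0 this degenerates to an exact fixed-window check:
counter = {}
results = [sampled_check(counter, "c", limit=3, sample_rate=1.0) for _ in range(4)]
```

With, say, ``sample_rate=0.1``, only ~10% of requests touch the counter, at the cost of noisier enforcement near the limit.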

For most applications, though, the default configuration is plenty fast.
docs/advanced/testing.rst (new file, 367 lines)
@@ -0,0 +1,367 @@
Testing
=======

Testing rate-limited endpoints requires some care. You don't want your tests to
be flaky because of timing issues, and you need to verify that limits actually work.

Basic Testing Setup
-------------------

Use pytest with pytest-asyncio for async tests:

.. code-block:: python

    # conftest.py
    import pytest
    from fastapi.testclient import TestClient

    from fastapi_traffic import MemoryBackend, RateLimiter
    from fastapi_traffic.core.limiter import set_limiter

    @pytest.fixture
    def app():
        """Create a fresh app for each test."""
        from myapp import create_app
        return create_app()

    @pytest.fixture
    def client(app):
        """Test client with a fresh rate limiter."""
        backend = MemoryBackend()
        limiter = RateLimiter(backend)
        set_limiter(limiter)

        with TestClient(app) as client:
            yield client
Testing Rate Limit Enforcement
------------------------------

Verify that the limit is actually enforced:

.. code-block:: python

    def test_rate_limit_enforced(client):
        """Test that requests are blocked after the limit is reached."""
        # Make requests up to the limit
        for i in range(10):
            response = client.get("/api/data")
            assert response.status_code == 200, f"Request {i+1} should succeed"

        # The next request should be rate limited
        response = client.get("/api/data")
        assert response.status_code == 429
        assert "retry_after" in response.json()

Testing Rate Limit Headers
--------------------------

Check that headers are included correctly:

.. code-block:: python

    def test_rate_limit_headers(client):
        """Test that rate limit headers are present."""
        response = client.get("/api/data")

        assert "X-RateLimit-Limit" in response.headers
        assert "X-RateLimit-Remaining" in response.headers
        assert "X-RateLimit-Reset" in response.headers

        # Verify the values make sense
        limit = int(response.headers["X-RateLimit-Limit"])
        remaining = int(response.headers["X-RateLimit-Remaining"])

        assert limit == 100     # Your configured limit
        assert remaining == 99  # One request made

Testing Different Clients
-------------------------

Verify that different clients have separate limits:

.. code-block:: python

    def test_separate_limits_per_client(client):
        """Test that different IPs have separate limits."""
        # Client A makes requests
        for _ in range(10):
            response = client.get(
                "/api/data",
                headers={"X-Forwarded-For": "1.1.1.1"},
            )
            assert response.status_code == 200

        # Client A is now limited
        response = client.get(
            "/api/data",
            headers={"X-Forwarded-For": "1.1.1.1"},
        )
        assert response.status_code == 429

        # Client B should still have its full quota
        response = client.get(
            "/api/data",
            headers={"X-Forwarded-For": "2.2.2.2"},
        )
        assert response.status_code == 200
Testing Window Reset
--------------------

Test that limits reset after the window expires:

.. code-block:: python

    import time
    from unittest.mock import patch

    def test_limit_resets_after_window(client):
        """Test that limits reset after the window expires."""
        # Exhaust the limit
        for _ in range(10):
            client.get("/api/data")

        # Should be limited
        response = client.get("/api/data")
        assert response.status_code == 429

        # Fast-forward time (mock time.time); capture the real time first,
        # since calling time.time() inside the patch would hit the mock
        real_now = time.time()
        with patch('time.time') as mock_time:
            # Move 61 seconds into the future
            mock_time.return_value = real_now + 61

            # Should be allowed again
            response = client.get("/api/data")
            assert response.status_code == 200
Testing Exemptions
------------------

Verify that exemptions work:

.. code-block:: python

    def test_exempt_paths(client):
        """Test that exempt paths bypass rate limiting."""
        # Exhaust the limit on a regular endpoint
        for _ in range(100):
            client.get("/api/data")

        # The regular endpoint should be limited
        response = client.get("/api/data")
        assert response.status_code == 429

        # The health check should still work
        response = client.get("/health")
        assert response.status_code == 200

    def test_exempt_ips(client):
        """Test that exempt IPs bypass rate limiting."""
        # Make many requests from an exempt IP
        for _ in range(1000):
            response = client.get(
                "/api/data",
                headers={"X-Forwarded-For": "127.0.0.1"},
            )
            assert response.status_code == 200  # Never limited
Testing with Async Client
-------------------------

For async endpoints, use httpx:

.. code-block:: python

    import asyncio

    import httpx
    import pytest

    @pytest.mark.asyncio
    async def test_async_rate_limiting():
        """Test rate limiting with an async client."""
        transport = httpx.ASGITransport(app=app)
        async with httpx.AsyncClient(transport=transport, base_url="http://test") as client:
            # Make concurrent requests
            responses = await asyncio.gather(*[
                client.get("/api/data")
                for _ in range(15)
            ])

        successes = sum(1 for r in responses if r.status_code == 200)
        limited = sum(1 for r in responses if r.status_code == 429)

        assert successes == 10  # Limit
        assert limited == 5     # Over the limit
Testing Backend Failures
------------------------

Test behavior when the backend fails:

.. code-block:: python

    from unittest.mock import patch

    from fastapi_traffic import BackendError, MemoryBackend

    def test_skip_on_error(client):
        """Requests are allowed when the backend fails and skip_on_error=True."""
        with patch.object(
            MemoryBackend, 'get',
            side_effect=BackendError("Connection failed"),
        ):
            # With skip_on_error=True, should still work
            response = client.get("/api/data")
            assert response.status_code == 200

    def test_fail_on_error(client):
        """Requests fail when the backend fails and skip_on_error=False."""
        with patch.object(
            MemoryBackend, 'get',
            side_effect=BackendError("Connection failed"),
        ):
            # With skip_on_error=False (the default), should fail
            response = client.get("/api/strict-data")
            assert response.status_code == 500
Mocking the Rate Limiter
------------------------

For unit tests, you might want to mock the rate limiter entirely:

.. code-block:: python

    import time
    from unittest.mock import AsyncMock, MagicMock

    from fastapi_traffic.core.limiter import set_limiter
    from fastapi_traffic.core.models import RateLimitInfo, RateLimitResult

    def test_with_mocked_limiter(client):
        """Test endpoint logic without actual rate limiting."""
        mock_limiter = MagicMock()
        mock_limiter.hit = AsyncMock(return_value=RateLimitResult(
            allowed=True,
            info=RateLimitInfo(
                limit=100,
                remaining=99,
                reset_at=time.time() + 60,
                window_size=60,
            ),
            key="test",
        ))

        set_limiter(mock_limiter)

        response = client.get("/api/data")
        assert response.status_code == 200
        mock_limiter.hit.assert_called_once()
Integration Testing with Redis
------------------------------

For integration tests with Redis:

.. code-block:: python

    import pytest

    from fastapi_traffic import RateLimiter, RateLimitConfig
    from fastapi_traffic.backends.redis import RedisBackend

    @pytest.fixture
    async def redis_backend():
        """Create a Redis backend for testing."""
        backend = await RedisBackend.from_url(
            "redis://localhost:6379/15",  # Use a test database
            key_prefix="test:",
        )
        yield backend
        await backend.clear()  # Clean up after the test
        await backend.close()

    @pytest.mark.asyncio
    async def test_redis_rate_limiting(redis_backend):
        """Test rate limiting with real Redis."""
        limiter = RateLimiter(redis_backend)
        await limiter.initialize()

        config = RateLimitConfig(limit=5, window_size=60)
        request = create_mock_request("1.1.1.1")

        # Make requests up to the limit
        for _ in range(5):
            result = await limiter.check(request, config)
            assert result.allowed

        # The next should be blocked
        result = await limiter.check(request, config)
        assert not result.allowed

        await limiter.close()
Fixtures for Common Scenarios
-----------------------------

.. code-block:: python

    # conftest.py
    from unittest.mock import MagicMock

    import pytest

    from fastapi_traffic import MemoryBackend, RateLimiter, RateLimitConfig
    from fastapi_traffic.core.limiter import set_limiter

    @pytest.fixture
    def fresh_limiter():
        """Fresh rate limiter for each test."""
        backend = MemoryBackend()
        limiter = RateLimiter(backend)
        set_limiter(limiter)
        return limiter

    @pytest.fixture
    def rate_limit_config():
        """Standard rate limit config for tests."""
        return RateLimitConfig(
            limit=10,
            window_size=60,
        )

    @pytest.fixture
    def mock_request():
        """Create a mock request."""
        def _create(ip="127.0.0.1", path="/test"):
            request = MagicMock()
            request.client.host = ip
            request.url.path = path
            request.method = "GET"
            request.headers = {}
            return request
        return _create
Avoiding Flaky Tests
--------------------

Rate limiting tests can be flaky due to timing. Tips:

1. **Use short windows for tests:**

   .. code-block:: python

       @rate_limit(10, 1)  # 10 per second, not 10 per minute

2. **Mock time instead of sleeping:**

   .. code-block:: python

       with patch('time.time', return_value=future_time):
           ...  # Test window reset

3. **Reset state between tests:**

   .. code-block:: python

       @pytest.fixture(autouse=True)
       async def reset_limiter():
           yield
           limiter = get_limiter()
           await limiter.backend.clear()

4. **Use unique keys per test:**

   .. code-block:: python

       import uuid

       def test_something(mock_request):
           request = mock_request(ip=f"test-{uuid.uuid4()}")
docs/api/algorithms.rst (new file, 211 lines)
@@ -0,0 +1,211 @@
Algorithms API
==============

Rate limiting algorithms and the factory function to create them.

Algorithm Enum
--------------

.. py:class:: Algorithm

   Enumeration of available rate limiting algorithms.

   .. py:attribute:: TOKEN_BUCKET
      :value: "token_bucket"

      Token bucket algorithm. Allows bursts up to bucket capacity, then refills
      at a steady rate.

   .. py:attribute:: SLIDING_WINDOW
      :value: "sliding_window"

      Sliding window log algorithm. Tracks exact timestamps for precise limiting.
      Higher memory usage.

   .. py:attribute:: FIXED_WINDOW
      :value: "fixed_window"

      Fixed window algorithm. Simple time-based windows. Efficient but has
      boundary issues.

   .. py:attribute:: LEAKY_BUCKET
      :value: "leaky_bucket"

      Leaky bucket algorithm. Smooths out the request rate for consistent throughput.

   .. py:attribute:: SLIDING_WINDOW_COUNTER
      :value: "sliding_window_counter"

      Sliding window counter algorithm. Balances precision and efficiency.
      This is the default.

**Usage:**

.. code-block:: python

    from fastapi_traffic import Algorithm, rate_limit

    @rate_limit(100, 60, algorithm=Algorithm.TOKEN_BUCKET)
    async def endpoint(request: Request):
        return {"status": "ok"}
|
||||
BaseAlgorithm
-------------

.. py:class:: BaseAlgorithm(limit, window_size, backend, *, burst_size=None)

   Abstract base class for rate limiting algorithms.

   :param limit: Maximum requests allowed in the window.
   :type limit: int
   :param window_size: Time window in seconds.
   :type window_size: float
   :param backend: Storage backend for rate limit state.
   :type backend: Backend
   :param burst_size: Maximum burst size. Defaults to ``limit``.
   :type burst_size: int | None

   .. py:method:: check(key)
      :async:

      Check if a request is allowed and update state.

      :param key: The rate limit key.
      :type key: str
      :returns: Tuple of (allowed, RateLimitInfo).
      :rtype: tuple[bool, RateLimitInfo]

   .. py:method:: reset(key)
      :async:

      Reset the rate limit state for a key.

      :param key: The rate limit key.
      :type key: str

   .. py:method:: get_state(key)
      :async:

      Get the current state without consuming a token.

      :param key: The rate limit key.
      :type key: str
      :returns: Current rate limit info, or None.
      :rtype: RateLimitInfo | None

TokenBucketAlgorithm
--------------------

.. py:class:: TokenBucketAlgorithm(limit, window_size, backend, *, burst_size=None)

   Token bucket algorithm implementation.

   Tokens are added to the bucket at a rate of ``limit / window_size`` per
   second. Each request consumes one token. If no tokens are available, the
   request is rejected.

   The ``burst_size`` parameter controls the maximum bucket capacity, allowing
   short bursts of traffic.

   **State stored:**

   - ``tokens``: Current number of tokens in the bucket
   - ``last_update``: Timestamp of the last update

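The refill arithmetic described above can be sketched as a plain, synchronous function. This is an illustrative model only, not the library's implementation: the real algorithm stores its state in the backend, and the function and variable names here are hypothetical.

```python
def token_bucket_check(state: dict, limit: int, window_size: float,
                       burst_size: int, now: float) -> bool:
    """Refill tokens at limit/window_size per second, then try to spend one."""
    rate = limit / window_size
    elapsed = now - state["last_update"]
    state["tokens"] = min(burst_size, state["tokens"] + elapsed * rate)
    state["last_update"] = now
    if state["tokens"] >= 1:
        state["tokens"] -= 1
        return True
    return False

# 10 requests/second, bucket capacity 5: a burst of 5 passes, the 6th is rejected.
state = {"tokens": 5.0, "last_update": 0.0}
results = [token_bucket_check(state, 10, 1.0, 5, now=0.0) for _ in range(6)]
```

Half a second later the bucket has refilled five tokens (capped at the capacity of 5), so requests are allowed again.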
SlidingWindowAlgorithm
----------------------

.. py:class:: SlidingWindowAlgorithm(limit, window_size, backend, *, burst_size=None)

   Sliding window log algorithm implementation.

   Stores the timestamp of every request within the window. Provides the most
   accurate rate limiting, but uses more memory.

   **State stored:**

   - ``timestamps``: List of request timestamps within the window

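The log-pruning idea can be shown in a few lines: drop timestamps that have aged out of the window, then count what remains. A simplified, synchronous sketch; the names are illustrative, not the library's API.

```python
def sliding_window_check(timestamps: list[float], limit: int,
                         window_size: float, now: float) -> bool:
    """Drop timestamps older than the window, then count what remains."""
    timestamps[:] = [t for t in timestamps if t > now - window_size]
    if len(timestamps) < limit:
        timestamps.append(now)
        return True
    return False

# Limit 3 per 10 seconds: three requests pass, the fourth is rejected
# until an old timestamp ages out of the window.
log: list[float] = []
early = [sliding_window_check(log, 3, 10.0, t) for t in (0.0, 1.0, 2.0, 3.0)]
later = sliding_window_check(log, 3, 10.0, 10.5)  # t=0.0 has aged out
```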
FixedWindowAlgorithm
--------------------

.. py:class:: FixedWindowAlgorithm(limit, window_size, backend, *, burst_size=None)

   Fixed window algorithm implementation.

   Divides time into fixed windows and counts requests in each window. Simple
   and efficient, but allows up to 2x the limit at window boundaries.

   **State stored:**

   - ``count``: Number of requests in the current window
   - ``window_start``: Start timestamp of the current window

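The boundary issue is easiest to see with numbers. The toy model below (illustrative names, not the library's implementation) allows 5 requests just before a window boundary and 5 just after it, so 10 requests land within about two seconds of wall-clock time:

```python
def fixed_window_check(state: dict, limit: int, window_size: float,
                       now: float) -> bool:
    """Reset the counter whenever a new fixed window starts."""
    window_start = now - (now % window_size)
    if state.get("window_start") != window_start:
        state["window_start"] = window_start
        state["count"] = 0
    if state["count"] < limit:
        state["count"] += 1
        return True
    return False

# Limit 5 per 60s window: 5 requests at t=59s and 5 at t=61s all pass.
state: dict = {}
before = [fixed_window_check(state, 5, 60.0, 59.0) for _ in range(5)]
after = [fixed_window_check(state, 5, 60.0, 61.0) for _ in range(5)]
```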
LeakyBucketAlgorithm
--------------------

.. py:class:: LeakyBucketAlgorithm(limit, window_size, backend, *, burst_size=None)

   Leaky bucket algorithm implementation.

   Requests fill a bucket that "leaks" at a constant rate. Smooths out traffic
   for consistent throughput.

   **State stored:**

   - ``water_level``: Current water level in the bucket
   - ``last_update``: Timestamp of the last update

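The leak can be sketched the same way as the token bucket, but inverted: each request adds a unit of "water", and the level drains at a constant rate. A simplified model with hypothetical names, not the library's implementation.

```python
def leaky_bucket_check(state: dict, limit: int, window_size: float,
                       capacity: int, now: float) -> bool:
    """Leak water at limit/window_size per second; each request adds one unit."""
    leak_rate = limit / window_size
    elapsed = now - state["last_update"]
    state["water_level"] = max(0.0, state["water_level"] - elapsed * leak_rate)
    state["last_update"] = now
    if state["water_level"] + 1 <= capacity:
        state["water_level"] += 1
        return True
    return False

# Capacity 2, leaking 1 unit/second: two immediate requests fit, the third
# overflows, and after a 1-second pause there is room again.
state = {"water_level": 0.0, "last_update": 0.0}
burst = [leaky_bucket_check(state, 1, 1.0, 2, now=0.0) for _ in range(3)]
paused = leaky_bucket_check(state, 1, 1.0, 2, now=1.0)
```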
SlidingWindowCounterAlgorithm
-----------------------------

.. py:class:: SlidingWindowCounterAlgorithm(limit, window_size, backend, *, burst_size=None)

   Sliding window counter algorithm implementation.

   Maintains counters for the current and previous windows, calculating a
   weighted average based on window progress. Balances precision and memory
   efficiency.

   **State stored:**

   - ``prev_count``: Count from the previous window
   - ``curr_count``: Count in the current window
   - ``current_window``: Start timestamp of the current window

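The weighted average works like this: the further you are into the current window, the less the previous window's count matters. A sketch of the standard formula (illustrative names; the library's exact arithmetic may differ):

```python
def weighted_count(prev_count: int, curr_count: int, window_size: float,
                   current_window: float, now: float) -> float:
    """Estimate requests in the last window_size seconds from two counters."""
    elapsed = now - current_window            # progress into the current window
    prev_weight = 1.0 - elapsed / window_size  # previous window's remaining share
    return prev_count * prev_weight + curr_count

# 40 requests in the previous 60s window, 30 so far in the current one.
# Halfway through the current window, the previous window counts at 50%:
# 40 * 0.5 + 30 = 50.
estimate = weighted_count(40, 30, 60.0, current_window=60.0, now=90.0)
```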
get_algorithm
-------------

.. py:function:: get_algorithm(algorithm, limit, window_size, backend, *, burst_size=None)

   Factory function to create algorithm instances.

   :param algorithm: The algorithm type to create.
   :type algorithm: Algorithm
   :param limit: Maximum requests allowed.
   :type limit: int
   :param window_size: Time window in seconds.
   :type window_size: float
   :param backend: Storage backend.
   :type backend: Backend
   :param burst_size: Maximum burst size.
   :type burst_size: int | None
   :returns: An algorithm instance.
   :rtype: BaseAlgorithm

**Usage:**

.. code-block:: python

   from fastapi_traffic.core.algorithms import get_algorithm, Algorithm
   from fastapi_traffic import MemoryBackend

   backend = MemoryBackend()
   algorithm = get_algorithm(
       Algorithm.TOKEN_BUCKET,
       limit=100,
       window_size=60,
       backend=backend,
       burst_size=20,
   )

   allowed, info = await algorithm.check("user:123")
266
docs/api/backends.rst
Normal file
@@ -0,0 +1,266 @@
Backends API
============

Storage backends for rate limit state.

Backend (Base Class)
--------------------

.. py:class:: Backend

   Abstract base class for rate limit storage backends.

   All backends must implement these methods:

   .. py:method:: get(key)
      :async:

      Get the current state for a key.

      :param key: The rate limit key.
      :type key: str
      :returns: The stored state dictionary, or None if not found.
      :rtype: dict[str, Any] | None

   .. py:method:: set(key, value, *, ttl)
      :async:

      Set the state for a key with a TTL.

      :param key: The rate limit key.
      :type key: str
      :param value: The state dictionary to store.
      :type value: dict[str, Any]
      :param ttl: Time-to-live in seconds.
      :type ttl: float

   .. py:method:: delete(key)
      :async:

      Delete the state for a key.

      :param key: The rate limit key.
      :type key: str

   .. py:method:: exists(key)
      :async:

      Check if a key exists.

      :param key: The rate limit key.
      :type key: str
      :returns: True if the key exists.
      :rtype: bool

   .. py:method:: increment(key, amount=1)
      :async:

      Atomically increment a counter.

      :param key: The rate limit key.
      :type key: str
      :param amount: The amount to increment by.
      :type amount: int
      :returns: The new value after incrementing.
      :rtype: int

   .. py:method:: clear()
      :async:

      Clear all rate limit data.

   .. py:method:: close()
      :async:

      Close the backend connection.

Backends support the async context manager protocol:

.. code-block:: python

   async with MemoryBackend() as backend:
       await backend.set("key", {"count": 1}, ttl=60)

MemoryBackend
-------------

.. py:class:: MemoryBackend(max_size=10000, cleanup_interval=60)

   In-memory storage backend with LRU eviction and TTL cleanup.

   :param max_size: Maximum number of keys to store.
   :type max_size: int
   :param cleanup_interval: How often to clean expired entries (seconds).
   :type cleanup_interval: float

   **Usage:**

   .. code-block:: python

      from fastapi_traffic import MemoryBackend, RateLimiter

      backend = MemoryBackend(max_size=10000)
      limiter = RateLimiter(backend)

   .. py:method:: get_stats()

      Get statistics about the backend.

      :returns: Dictionary with stats such as key count and memory usage.
      :rtype: dict[str, Any]

   .. py:method:: start_cleanup()
      :async:

      Start the background cleanup task.

   .. py:method:: stop_cleanup()
      :async:

      Stop the background cleanup task.

SQLiteBackend
-------------

.. py:class:: SQLiteBackend(db_path, cleanup_interval=300)

   SQLite storage backend for persistent rate limiting.

   :param db_path: Path to the SQLite database file.
   :type db_path: str | Path
   :param cleanup_interval: How often to clean expired entries (seconds).
   :type cleanup_interval: float

   **Usage:**

   .. code-block:: python

      from fastapi_traffic import SQLiteBackend, RateLimiter

      backend = SQLiteBackend("rate_limits.db")
      limiter = RateLimiter(backend)

      @app.on_event("startup")
      async def startup():
          await limiter.initialize()

      @app.on_event("shutdown")
      async def shutdown():
          await limiter.close()

   .. py:method:: initialize()
      :async:

      Initialize the database schema.

   Features:

   - WAL mode for better concurrent performance
   - Automatic schema creation
   - Connection pooling
   - Background cleanup of expired entries

RedisBackend
------------

.. py:class:: RedisBackend

   Redis storage backend for distributed rate limiting.

   .. py:method:: from_url(url, *, key_prefix="", **kwargs)
      :classmethod:

      Create a RedisBackend from a Redis URL. This is an async classmethod.

      :param url: Redis connection URL.
      :type url: str
      :param key_prefix: Prefix for all keys.
      :type key_prefix: str
      :returns: Configured RedisBackend instance.
      :rtype: RedisBackend

   **Usage:**

   .. code-block:: python

      from fastapi_traffic.backends.redis import RedisBackend
      from fastapi_traffic import RateLimiter

      @app.on_event("startup")
      async def startup():
          backend = await RedisBackend.from_url("redis://localhost:6379/0")
          limiter = RateLimiter(backend)

   **Connection examples:**

   .. code-block:: python

      # Simple connection
      backend = await RedisBackend.from_url("redis://localhost:6379/0")

      # With password
      backend = await RedisBackend.from_url("redis://:password@localhost:6379/0")

      # With key prefix
      backend = await RedisBackend.from_url(
          "redis://localhost:6379/0",
          key_prefix="myapp:ratelimit:",
      )

   .. py:method:: get_stats()
      :async:

      Get statistics about the Redis backend.

      :returns: Dictionary with stats such as key count and memory usage.
      :rtype: dict[str, Any]

   Features:

   - Atomic operations via Lua scripts
   - Automatic key expiration
   - Connection pooling
   - Support for Redis Sentinel and Cluster

Implementing Custom Backends
----------------------------

To create a custom backend, inherit from ``Backend`` and implement all abstract
methods:

.. code-block:: python

   from typing import Any

   from fastapi_traffic.backends.base import Backend

   class MyBackend(Backend):
       async def get(self, key: str) -> dict[str, Any] | None:
           # Retrieve state from your storage
           pass

       async def set(self, key: str, value: dict[str, Any], *, ttl: float) -> None:
           # Store state with expiration
           pass

       async def delete(self, key: str) -> None:
           # Remove a key
           pass

       async def exists(self, key: str) -> bool:
           # Check if the key exists
           pass

       async def increment(self, key: str, amount: int = 1) -> int:
           # Atomically increment (important for accuracy)
           pass

       async def clear(self) -> None:
           # Clear all data
           pass

       async def close(self) -> None:
           # Clean up connections
           pass

The ``value`` dictionary contains algorithm-specific state. Your backend should
serialize it appropriately (JSON works well for most cases).
245
docs/api/config.rst
Normal file
@@ -0,0 +1,245 @@
Configuration API
=================

Configuration classes and loaders for rate limiting.

RateLimitConfig
---------------

.. py:class:: RateLimitConfig(limit, window_size=60.0, algorithm=Algorithm.SLIDING_WINDOW_COUNTER, key_prefix="ratelimit", key_extractor=default_key_extractor, burst_size=None, include_headers=True, error_message="Rate limit exceeded", status_code=429, skip_on_error=False, cost=1, exempt_when=None, on_blocked=None)

   Configuration for a rate limit rule.

   :param limit: Maximum requests allowed in the window. Must be positive.
   :type limit: int
   :param window_size: Time window in seconds. Must be positive.
   :type window_size: float
   :param algorithm: Rate limiting algorithm to use.
   :type algorithm: Algorithm
   :param key_prefix: Prefix for the rate limit key.
   :type key_prefix: str
   :param key_extractor: Function to extract the client identifier from the request.
   :type key_extractor: Callable[[Request], str]
   :param burst_size: Maximum burst size for the token/leaky bucket algorithms.
   :type burst_size: int | None
   :param include_headers: Whether to include rate limit headers.
   :type include_headers: bool
   :param error_message: Error message when rate limited.
   :type error_message: str
   :param status_code: HTTP status code when rate limited.
   :type status_code: int
   :param skip_on_error: Skip rate limiting on backend errors.
   :type skip_on_error: bool
   :param cost: Cost per request.
   :type cost: int
   :param exempt_when: Function to check whether a request is exempt.
   :type exempt_when: Callable[[Request], bool] | None
   :param on_blocked: Callback invoked when a request is blocked.
   :type on_blocked: Callable[[Request, Any], Any] | None

   **Usage:**

   .. code-block:: python

      from fastapi_traffic import RateLimitConfig, Algorithm

      config = RateLimitConfig(
          limit=100,
          window_size=60,
          algorithm=Algorithm.TOKEN_BUCKET,
          burst_size=20,
      )

GlobalConfig
------------

.. py:class:: GlobalConfig(backend=None, enabled=True, default_limit=100, default_window_size=60.0, default_algorithm=Algorithm.SLIDING_WINDOW_COUNTER, key_prefix="fastapi_traffic", include_headers=True, error_message="Rate limit exceeded. Please try again later.", status_code=429, skip_on_error=False, exempt_ips=set(), exempt_paths=set(), headers_prefix="X-RateLimit")

   Global configuration for the rate limiter.

   :param backend: Storage backend for rate limit data.
   :type backend: Backend | None
   :param enabled: Whether rate limiting is enabled.
   :type enabled: bool
   :param default_limit: Default maximum requests per window.
   :type default_limit: int
   :param default_window_size: Default time window in seconds.
   :type default_window_size: float
   :param default_algorithm: Default rate limiting algorithm.
   :type default_algorithm: Algorithm
   :param key_prefix: Global prefix for all rate limit keys.
   :type key_prefix: str
   :param include_headers: Include rate limit headers by default.
   :type include_headers: bool
   :param error_message: Default error message.
   :type error_message: str
   :param status_code: Default HTTP status code.
   :type status_code: int
   :param skip_on_error: Skip rate limiting on backend errors.
   :type skip_on_error: bool
   :param exempt_ips: IP addresses exempt from rate limiting.
   :type exempt_ips: set[str]
   :param exempt_paths: URL paths exempt from rate limiting.
   :type exempt_paths: set[str]
   :param headers_prefix: Prefix for rate limit headers.
   :type headers_prefix: str

   **Usage:**

   .. code-block:: python

      from fastapi_traffic import GlobalConfig, RateLimiter

      config = GlobalConfig(
          enabled=True,
          default_limit=100,
          exempt_paths={"/health", "/docs"},
          exempt_ips={"127.0.0.1"},
      )

      limiter = RateLimiter(config=config)

ConfigLoader
------------

.. py:class:: ConfigLoader(prefix="FASTAPI_TRAFFIC")

   Load rate limit configuration from various sources.

   :param prefix: Environment variable prefix.
   :type prefix: str

   .. py:method:: load_rate_limit_config_from_env(env_vars=None, **overrides)

      Load a RateLimitConfig from environment variables.

      :param env_vars: Dictionary of environment variables. Uses ``os.environ`` if None.
      :type env_vars: dict[str, str] | None
      :param overrides: Values to override after loading.
      :returns: Loaded configuration.
      :rtype: RateLimitConfig

   .. py:method:: load_rate_limit_config_from_json(file_path, **overrides)

      Load a RateLimitConfig from a JSON file.

      :param file_path: Path to the JSON file.
      :type file_path: str | Path
      :param overrides: Values to override after loading.
      :returns: Loaded configuration.
      :rtype: RateLimitConfig

   .. py:method:: load_rate_limit_config_from_env_file(file_path, **overrides)

      Load a RateLimitConfig from a .env file.

      :param file_path: Path to the .env file.
      :type file_path: str | Path
      :param overrides: Values to override after loading.
      :returns: Loaded configuration.
      :rtype: RateLimitConfig

   .. py:method:: load_global_config_from_env(env_vars=None, **overrides)

      Load a GlobalConfig from environment variables.

   .. py:method:: load_global_config_from_json(file_path, **overrides)

      Load a GlobalConfig from a JSON file.

   .. py:method:: load_global_config_from_env_file(file_path, **overrides)

      Load a GlobalConfig from a .env file.

   **Usage:**

   .. code-block:: python

      from fastapi_traffic import ConfigLoader

      loader = ConfigLoader()

      # From environment
      config = loader.load_rate_limit_config_from_env()

      # From JSON file
      config = loader.load_rate_limit_config_from_json("config.json")

      # From .env file
      config = loader.load_rate_limit_config_from_env_file(".env")

      # With overrides
      config = loader.load_rate_limit_config_from_json(
          "config.json",
          limit=200,  # Override the limit
      )

Convenience Functions
---------------------

.. py:function:: load_rate_limit_config(file_path, **overrides)

   Load a RateLimitConfig with automatic format detection.

   :param file_path: Path to a config file (.json or .env).
   :type file_path: str | Path
   :returns: Loaded configuration.
   :rtype: RateLimitConfig

.. py:function:: load_rate_limit_config_from_env(**overrides)

   Load a RateLimitConfig from environment variables.

   :returns: Loaded configuration.
   :rtype: RateLimitConfig

.. py:function:: load_global_config(file_path, **overrides)

   Load a GlobalConfig with automatic format detection.

   :param file_path: Path to a config file (.json or .env).
   :type file_path: str | Path
   :returns: Loaded configuration.
   :rtype: GlobalConfig

.. py:function:: load_global_config_from_env(**overrides)

   Load a GlobalConfig from environment variables.

   :returns: Loaded configuration.
   :rtype: GlobalConfig

**Usage:**

.. code-block:: python

   from fastapi_traffic import (
       load_rate_limit_config,
       load_rate_limit_config_from_env,
   )

   # Auto-detect format
   config = load_rate_limit_config("config.json")
   config = load_rate_limit_config(".env")

   # From environment
   config = load_rate_limit_config_from_env()

default_key_extractor
---------------------

.. py:function:: default_key_extractor(request)

   Extract the client IP as the default rate limit key.

   Checks, in order:

   1. ``X-Forwarded-For`` header (first IP)
   2. ``X-Real-IP`` header
   3. Direct connection IP
   4. Falls back to "unknown"

   :param request: The incoming request.
   :type request: Request
   :returns: Client identifier string.
   :rtype: str
154
docs/api/decorator.rst
Normal file
@@ -0,0 +1,154 @@
Decorator API
=============

The ``@rate_limit`` decorator is the primary way to add rate limiting to your
FastAPI endpoints.

rate_limit
----------

.. py:function:: rate_limit(limit, window_size=60.0, *, algorithm=Algorithm.SLIDING_WINDOW_COUNTER, key_prefix="ratelimit", key_extractor=default_key_extractor, burst_size=None, include_headers=True, error_message="Rate limit exceeded", status_code=429, skip_on_error=False, cost=1, exempt_when=None, on_blocked=None)

   Apply rate limiting to a FastAPI endpoint.

   :param limit: Maximum number of requests allowed in the window.
   :type limit: int
   :param window_size: Time window in seconds. Defaults to 60.
   :type window_size: float
   :param algorithm: Rate limiting algorithm to use.
   :type algorithm: Algorithm
   :param key_prefix: Prefix for the rate limit key.
   :type key_prefix: str
   :param key_extractor: Function to extract the client identifier from the request.
   :type key_extractor: Callable[[Request], str]
   :param burst_size: Maximum burst size for the token bucket/leaky bucket algorithms.
   :type burst_size: int | None
   :param include_headers: Whether to include rate limit headers in the response.
   :type include_headers: bool
   :param error_message: Error message when the rate limit is exceeded.
   :type error_message: str
   :param status_code: HTTP status code when the rate limit is exceeded.
   :type status_code: int
   :param skip_on_error: Skip rate limiting if backend errors occur.
   :type skip_on_error: bool
   :param cost: Cost of each request (default 1).
   :type cost: int
   :param exempt_when: Function to determine whether a request should be exempt.
   :type exempt_when: Callable[[Request], bool] | None
   :param on_blocked: Callback invoked when a request is blocked.
   :type on_blocked: Callable[[Request, Any], Any] | None
   :returns: Decorated function with rate limiting applied.
   :rtype: Callable

**Basic usage:**

.. code-block:: python

   from fastapi import FastAPI, Request
   from fastapi_traffic import rate_limit

   app = FastAPI()

   @app.get("/api/data")
   @rate_limit(100, 60)  # 100 requests per minute
   async def get_data(request: Request):
       return {"data": "here"}

**With an algorithm:**

.. code-block:: python

   from fastapi_traffic import rate_limit, Algorithm

   @app.get("/api/burst")
   @rate_limit(100, 60, algorithm=Algorithm.TOKEN_BUCKET, burst_size=20)
   async def burst_endpoint(request: Request):
       return {"status": "ok"}

**With a custom key extractor:**

.. code-block:: python

   def get_api_key(request: Request) -> str:
       return request.headers.get("X-API-Key", "anonymous")

   @app.get("/api/data")
   @rate_limit(1000, 3600, key_extractor=get_api_key)
   async def api_endpoint(request: Request):
       return {"data": "here"}

**With an exemption:**

.. code-block:: python

   def is_admin(request: Request) -> bool:
       return getattr(request.state, "is_admin", False)

   @app.get("/api/admin")
   @rate_limit(100, 60, exempt_when=is_admin)
   async def admin_endpoint(request: Request):
       return {"admin": "data"}

RateLimitDependency
-------------------

.. py:class:: RateLimitDependency(limit, window_size=60.0, *, algorithm=Algorithm.SLIDING_WINDOW_COUNTER, key_prefix="ratelimit", key_extractor=default_key_extractor, burst_size=None, error_message="Rate limit exceeded", status_code=429, skip_on_error=False, cost=1, exempt_when=None)
   :no-index:

   FastAPI dependency for rate limiting. Returns rate limit info that can be
   used in your endpoint. See :doc:`dependency` for full documentation.

   :param limit: Maximum number of requests allowed in the window.
   :type limit: int
   :param window_size: Time window in seconds.
   :type window_size: float

   **Usage:**

   .. code-block:: python

      from fastapi import FastAPI, Depends, Request
      from fastapi_traffic.core.decorator import RateLimitDependency

      app = FastAPI()
      rate_dep = RateLimitDependency(limit=100, window_size=60)

      @app.get("/api/data")
      async def get_data(request: Request, rate_info=Depends(rate_dep)):
          return {
              "data": "here",
              "remaining_requests": rate_info.remaining,
              "reset_at": rate_info.reset_at,
          }

   The dependency returns a ``RateLimitInfo`` object with:

   - ``limit``: The configured limit
   - ``remaining``: Remaining requests in the current window
   - ``reset_at``: Unix timestamp when the window resets
   - ``retry_after``: Seconds until retry (if rate limited)

create_rate_limit_response
--------------------------

.. py:function:: create_rate_limit_response(exc, *, include_headers=True)

   Create a standard rate limit response from a RateLimitExceeded exception.

   :param exc: The RateLimitExceeded exception.
   :type exc: RateLimitExceeded
   :param include_headers: Whether to include rate limit headers.
   :type include_headers: bool
   :returns: A JSONResponse with rate limit information.
   :rtype: Response

   **Usage:**

   .. code-block:: python

      from fastapi_traffic import RateLimitExceeded
      from fastapi_traffic.core.decorator import create_rate_limit_response

      @app.exception_handler(RateLimitExceeded)
      async def handler(request: Request, exc: RateLimitExceeded):
          return create_rate_limit_response(exc)
473
docs/api/dependency.rst
Normal file
@@ -0,0 +1,473 @@
Dependency Injection API
========================

If you're already using FastAPI's dependency injection system, you'll feel right
at home with ``RateLimitDependency``. It plugs directly into ``Depends``, giving
you rate limiting that works just like any other dependency, plus access to
rate limit info right inside your endpoint.

RateLimitDependency
-------------------

.. py:class:: RateLimitDependency(limit, window_size=60.0, *, algorithm=Algorithm.SLIDING_WINDOW_COUNTER, key_prefix="ratelimit", key_extractor=default_key_extractor, burst_size=None, error_message="Rate limit exceeded", status_code=429, skip_on_error=False, cost=1, exempt_when=None)

   This is the main class you'll use for dependency-based rate limiting. Create
   an instance, pass it to ``Depends()``, and you're done.

   :param limit: Maximum number of requests allowed in the window.
   :type limit: int
   :param window_size: Time window in seconds. Defaults to 60.
   :type window_size: float
   :param algorithm: Rate limiting algorithm to use.
   :type algorithm: Algorithm
   :param key_prefix: Prefix for the rate limit key.
   :type key_prefix: str
   :param key_extractor: Function to extract the client identifier from the request.
   :type key_extractor: Callable[[Request], str]
   :param burst_size: Maximum burst size for the token bucket/leaky bucket algorithms.
   :type burst_size: int | None
   :param error_message: Error message when the rate limit is exceeded.
   :type error_message: str
   :param status_code: HTTP status code when the rate limit is exceeded.
   :type status_code: int
   :param skip_on_error: Skip rate limiting if backend errors occur.
   :type skip_on_error: bool
   :param cost: Cost of each request (default 1).
   :type cost: int
   :param exempt_when: Function to determine whether a request should be exempt.
   :type exempt_when: Callable[[Request], bool] | None

   **Returns:** A ``RateLimitInfo`` object with details about the current rate limit state.

RateLimitInfo
-------------

When the dependency runs, it hands you back a ``RateLimitInfo`` object. Here's
what's inside:

.. py:class:: RateLimitInfo

   :param limit: The configured request limit.
   :type limit: int
   :param remaining: Remaining requests in the current window.
   :type remaining: int
   :param reset_at: Unix timestamp when the window resets.
   :type reset_at: float
   :param retry_after: Seconds until retry is allowed (if rate limited).
   :type retry_after: float | None
   :param window_size: The configured window size in seconds.
   :type window_size: float

   .. py:method:: to_headers() -> dict[str, str]

      Converts the rate limit info into standard HTTP headers. Handy if you
      want to add these headers to your response manually.

      :returns: A dictionary with ``X-RateLimit-Limit``, ``X-RateLimit-Remaining``,
         ``X-RateLimit-Reset``, and ``Retry-After`` (when applicable).

Setup
|
||||
-----
|
||||
|
||||
Before you can use the dependency, you need to set up the rate limiter. The
|
||||
cleanest way is with FastAPI's lifespan context manager:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from contextlib import asynccontextmanager
|
||||
from fastapi import FastAPI
|
||||
from fastapi_traffic import MemoryBackend, RateLimiter
|
||||
from fastapi_traffic.core.limiter import set_limiter
|
||||
|
||||
backend = MemoryBackend()
|
||||
limiter = RateLimiter(backend)
|
||||
|
||||
@asynccontextmanager
|
||||
async def lifespan(app: FastAPI):
|
||||
await limiter.initialize()
|
||||
set_limiter(limiter)
|
||||
yield
|
||||
await limiter.close()
|
||||
|
||||
app = FastAPI(lifespan=lifespan)
|
||||
|
||||
Basic Usage
-----------

Here's the simplest way to get started. Create a dependency instance and inject
it with ``Depends``:

.. code-block:: python

   from fastapi import Depends, FastAPI, Request
   from fastapi_traffic.core.decorator import RateLimitDependency

   app = FastAPI()

   # Create the rate limit dependency
   rate_limit_dep = RateLimitDependency(limit=100, window_size=60)

   @app.get("/api/data")
   async def get_data(
       request: Request,
       rate_info=Depends(rate_limit_dep),
   ):
       return {
           "data": "here",
           "remaining_requests": rate_info.remaining,
           "reset_at": rate_info.reset_at,
       }

Using Type Aliases
------------------

If you're using the same rate limit across multiple endpoints, type aliases
with ``Annotated`` make your code much cleaner:

.. code-block:: python

   from typing import Annotated, TypeAlias
   from fastapi import Depends, FastAPI, Request
   from fastapi_traffic.core.decorator import RateLimitDependency
   from fastapi_traffic.core.models import RateLimitInfo

   app = FastAPI()

   rate_limit_dep = RateLimitDependency(limit=100, window_size=60)

   # Create a type alias for cleaner signatures
   RateLimit: TypeAlias = Annotated[RateLimitInfo, Depends(rate_limit_dep)]

   @app.get("/api/data")
   async def get_data(request: Request, rate_info: RateLimit):
       return {
           "data": "here",
           "remaining": rate_info.remaining,
       }

Tiered Rate Limits
------------------

This is where dependency injection really shines. You can apply different rate
limits based on who's making the request: free users get 10 requests per minute,
pro users get 100, and enterprise gets 1000:

.. code-block:: python

   from typing import Annotated, TypeAlias
   from fastapi import Depends, FastAPI, Request
   from fastapi_traffic.core.decorator import RateLimitDependency
   from fastapi_traffic.core.models import RateLimitInfo

   app = FastAPI()

   # Define tier-specific limits
   free_tier_limit = RateLimitDependency(
       limit=10,
       window_size=60,
       key_prefix="free",
   )

   pro_tier_limit = RateLimitDependency(
       limit=100,
       window_size=60,
       key_prefix="pro",
   )

   enterprise_tier_limit = RateLimitDependency(
       limit=1000,
       window_size=60,
       key_prefix="enterprise",
   )

   def get_user_tier(request: Request) -> str:
       """Get user tier from header (in real app, from JWT/database)."""
       return request.headers.get("X-User-Tier", "free")

   TierDep: TypeAlias = Annotated[str, Depends(get_user_tier)]

   async def tiered_rate_limit(
       request: Request,
       tier: TierDep,
   ) -> RateLimitInfo:
       """Apply different rate limits based on user tier."""
       if tier == "enterprise":
           return await enterprise_tier_limit(request)
       elif tier == "pro":
           return await pro_tier_limit(request)
       else:
           return await free_tier_limit(request)

   TieredRateLimit: TypeAlias = Annotated[RateLimitInfo, Depends(tiered_rate_limit)]

   @app.get("/api/resource")
   async def get_resource(request: Request, rate_info: TieredRateLimit):
       tier = get_user_tier(request)
       return {
           "tier": tier,
           "remaining": rate_info.remaining,
           "limit": rate_info.limit,
       }

Custom Key Extraction
---------------------

By default, rate limits are tracked by IP address. But what if you want to rate
limit by API key instead? Just pass a custom ``key_extractor``:

.. code-block:: python

   from fastapi import Depends, FastAPI, Request
   from fastapi_traffic.core.decorator import RateLimitDependency

   app = FastAPI()

   def api_key_extractor(request: Request) -> str:
       """Extract API key for rate limiting."""
       api_key = request.headers.get("X-API-Key", "anonymous")
       return f"api:{api_key}"

   api_rate_limit = RateLimitDependency(
       limit=100,
       window_size=3600,  # 100 requests per hour
       key_extractor=api_key_extractor,
   )

   @app.get("/api/resource")
   async def api_resource(
       request: Request,
       rate_info=Depends(api_rate_limit),
   ):
       return {
           "data": "Resource data",
           "requests_remaining": rate_info.remaining,
       }

Multiple Rate Limits
--------------------

Sometimes you need layered protection, say 10 requests per minute *and* 100
requests per hour. Dependencies make this easy to compose:

.. code-block:: python

   from typing import Annotated, Any, TypeAlias
   from fastapi import Depends, FastAPI, Request
   from fastapi_traffic.core.decorator import RateLimitDependency
   from fastapi_traffic.core.models import RateLimitInfo

   app = FastAPI()

   per_minute_limit = RateLimitDependency(
       limit=10,
       window_size=60,
       key_prefix="minute",
   )

   per_hour_limit = RateLimitDependency(
       limit=100,
       window_size=3600,
       key_prefix="hour",
   )

   PerMinuteLimit: TypeAlias = Annotated[RateLimitInfo, Depends(per_minute_limit)]
   PerHourLimit: TypeAlias = Annotated[RateLimitInfo, Depends(per_hour_limit)]

   async def combined_rate_limit(
       request: Request,
       minute_info: PerMinuteLimit,
       hour_info: PerHourLimit,
   ) -> dict[str, Any]:
       """Apply both per-minute and per-hour limits."""
       return {
           "minute": {
               "limit": minute_info.limit,
               "remaining": minute_info.remaining,
           },
           "hour": {
               "limit": hour_info.limit,
               "remaining": hour_info.remaining,
           },
       }

   CombinedRateLimit: TypeAlias = Annotated[dict[str, Any], Depends(combined_rate_limit)]

   @app.get("/api/combined")
   async def combined_endpoint(
       request: Request,
       rate_info: CombinedRateLimit,
   ):
       return {
           "message": "Success",
           "rate_limits": rate_info,
       }

Exemption Logic
---------------

Need to let certain requests bypass rate limiting entirely? Maybe internal
services or admin users? Use the ``exempt_when`` parameter:

.. code-block:: python

   from fastapi import Depends, FastAPI, Request
   from fastapi_traffic.core.decorator import RateLimitDependency

   app = FastAPI()

   def is_internal_request(request: Request) -> bool:
       """Check if request is from internal service."""
       internal_token = request.headers.get("X-Internal-Token")
       return internal_token == "internal-secret-token"

   internal_exempt_limit = RateLimitDependency(
       limit=10,
       window_size=60,
       exempt_when=is_internal_request,
   )

   @app.get("/api/internal")
   async def internal_endpoint(
       request: Request,
       rate_info=Depends(internal_exempt_limit),
   ):
       is_internal = is_internal_request(request)
       return {
           "message": "Success",
           "is_internal": is_internal,
           "rate_limit": None if is_internal else {
               "remaining": rate_info.remaining,
           },
       }

Exception Handling
------------------

When a request exceeds the rate limit, a ``RateLimitExceeded`` exception is
raised. You'll want to catch this and return a proper response:

.. code-block:: python

   from fastapi import FastAPI, Request
   from fastapi.responses import JSONResponse
   from fastapi_traffic import RateLimitExceeded

   app = FastAPI()

   @app.exception_handler(RateLimitExceeded)
   async def rate_limit_handler(
       request: Request,
       exc: RateLimitExceeded,
   ) -> JSONResponse:
       return JSONResponse(
           status_code=429,
           content={
               "error": "rate_limit_exceeded",
               "message": exc.message,
               "retry_after": exc.retry_after,
           },
       )

Or if you prefer, there's a built-in helper that does the work for you:

.. code-block:: python

   from fastapi import FastAPI, Request
   from fastapi_traffic import RateLimitExceeded
   from fastapi_traffic.core.decorator import create_rate_limit_response

   app = FastAPI()

   @app.exception_handler(RateLimitExceeded)
   async def rate_limit_handler(request: Request, exc: RateLimitExceeded):
       return create_rate_limit_response(exc, include_headers=True)

Complete Example
----------------

Here's everything put together in a working example you can copy and run:

.. code-block:: python

   from contextlib import asynccontextmanager
   from typing import Annotated, TypeAlias

   from fastapi import Depends, FastAPI, Request
   from fastapi.responses import JSONResponse

   from fastapi_traffic import (
       MemoryBackend,
       RateLimiter,
       RateLimitExceeded,
   )
   from fastapi_traffic.core.decorator import RateLimitDependency
   from fastapi_traffic.core.limiter import set_limiter
   from fastapi_traffic.core.models import RateLimitInfo

   # Initialize backend and limiter
   backend = MemoryBackend()
   limiter = RateLimiter(backend)

   @asynccontextmanager
   async def lifespan(app: FastAPI):
       await limiter.initialize()
       set_limiter(limiter)
       yield
       await limiter.close()

   app = FastAPI(lifespan=lifespan)

   # Exception handler
   @app.exception_handler(RateLimitExceeded)
   async def rate_limit_handler(
       request: Request,
       exc: RateLimitExceeded,
   ) -> JSONResponse:
       return JSONResponse(
           status_code=429,
           content={
               "error": "rate_limit_exceeded",
               "retry_after": exc.retry_after,
           },
       )

   # Create dependency
   api_rate_limit = RateLimitDependency(limit=100, window_size=60)
   ApiRateLimit: TypeAlias = Annotated[RateLimitInfo, Depends(api_rate_limit)]

   @app.get("/api/data")
   async def get_data(request: Request, rate_info: ApiRateLimit):
       return {
           "data": "Your data here",
           "rate_limit": {
               "limit": rate_info.limit,
               "remaining": rate_info.remaining,
               "reset_at": rate_info.reset_at,
           },
       }

Decorator vs Dependency
-----------------------

Not sure which approach to use? Here's a quick guide:

Go with the ``@rate_limit`` decorator if:

- You just want to slap a rate limit on an endpoint and move on
- You don't care about the remaining request count inside your endpoint
- You're applying the same limit to a bunch of endpoints

Go with ``RateLimitDependency`` if:

- You want to show users how many requests they have left
- You need different limits for different user tiers
- You're stacking multiple rate limits (per-minute + per-hour)
- You're already using FastAPI's dependency system and want consistency

See Also
--------

- :doc:`decorator` - Decorator-based rate limiting
- :doc:`middleware` - Global middleware rate limiting
- :doc:`config` - Configuration options
- :doc:`exceptions` - Exception handling
165
docs/api/exceptions.rst
Normal file
@@ -0,0 +1,165 @@
Exceptions API
==============

Custom exceptions raised by FastAPI Traffic.

FastAPITrafficError
-------------------

.. py:exception:: FastAPITrafficError

   Base exception for all FastAPI Traffic errors.

   All other exceptions in this library inherit from this class, so you can
   catch all FastAPI Traffic errors with a single handler:

   .. code-block:: python

      from fastapi import Request
      from fastapi.responses import JSONResponse
      from fastapi_traffic.exceptions import FastAPITrafficError

      @app.exception_handler(FastAPITrafficError)
      async def handle_traffic_error(request: Request, exc: FastAPITrafficError):
          return JSONResponse(
              status_code=500,
              content={"error": str(exc)},
          )

RateLimitExceeded
-----------------

.. py:exception:: RateLimitExceeded(message="Rate limit exceeded", *, retry_after=None, limit_info=None)

   Raised when a rate limit has been exceeded.

   :param message: Error message.
   :type message: str
   :param retry_after: Seconds until the client can retry.
   :type retry_after: float | None
   :param limit_info: Detailed rate limit information.
   :type limit_info: RateLimitInfo | None

   .. py:attribute:: message
      :type: str

      The error message.

   .. py:attribute:: retry_after
      :type: float | None

      Seconds until the client can retry. May be None if not calculable.

   .. py:attribute:: limit_info
      :type: RateLimitInfo | None

      Detailed information about the rate limit state.

   **Usage:**

   .. code-block:: python

      from fastapi import Request
      from fastapi.responses import JSONResponse
      from fastapi_traffic import RateLimitExceeded

      @app.exception_handler(RateLimitExceeded)
      async def rate_limit_handler(request: Request, exc: RateLimitExceeded):
          headers = {}
          if exc.limit_info:
              headers = exc.limit_info.to_headers()

          return JSONResponse(
              status_code=429,
              content={
                  "error": "rate_limit_exceeded",
                  "message": exc.message,
                  "retry_after": exc.retry_after,
              },
              headers=headers,
          )

BackendError
------------

.. py:exception:: BackendError(message="Backend operation failed", *, original_error=None)

   Raised when a backend operation fails.

   :param message: Error message.
   :type message: str
   :param original_error: The original exception that caused this error.
   :type original_error: Exception | None

   .. py:attribute:: message
      :type: str

      The error message.

   .. py:attribute:: original_error
      :type: Exception | None

      The underlying exception, if any.

   **Usage:**

   .. code-block:: python

      import logging

      from fastapi import Request
      from fastapi.responses import JSONResponse
      from fastapi_traffic import BackendError

      logger = logging.getLogger(__name__)

      @app.exception_handler(BackendError)
      async def backend_error_handler(request: Request, exc: BackendError):
          # Log the original error for debugging
          if exc.original_error:
              logger.error("Backend error: %s", exc.original_error)

          return JSONResponse(
              status_code=503,
              content={"error": "service_unavailable"},
          )

   This exception is raised when:

   - Redis connection fails
   - SQLite database is locked or corrupted
   - Any other backend storage operation fails

ConfigurationError
------------------

.. py:exception:: ConfigurationError

   Raised when there is a configuration error.

   This exception is raised when:

   - A configuration file contains invalid values
   - Required configuration is missing
   - A value cannot be converted to the expected type
   - A configuration file contains unknown fields

   **Usage:**

   .. code-block:: python

      from fastapi_traffic import ConfigLoader, ConfigurationError
      from fastapi_traffic.core.config import RateLimitConfig

      loader = ConfigLoader()

      try:
          config = loader.load_rate_limit_config_from_json("config.json")
      except ConfigurationError as e:
          print(f"Configuration error: {e}")
          # Use default configuration
          config = RateLimitConfig(limit=100, window_size=60)

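For reference, a minimal ``config.json`` accepted by ``load_rate_limit_config_from_json`` might look like this. The field names here are an assumption based on the ``limit`` and ``window_size`` parameters of ``RateLimitConfig`` shown above; check the configuration documentation for the authoritative schema.

```json
{
    "limit": 100,
    "window_size": 60
}
```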
Exception Hierarchy
-------------------

.. code-block:: text

   FastAPITrafficError
   ├── RateLimitExceeded
   ├── BackendError
   └── ConfigurationError

All exceptions inherit from ``FastAPITrafficError``, which inherits from
Python's built-in ``Exception``.
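The practical consequence of this hierarchy is that a single ``except`` clause catches every library error. This self-contained sketch uses stand-in classes mirroring the tree above (the real classes live in ``fastapi_traffic.exceptions``):

```python
# Stand-in classes mirroring the hierarchy above; illustrative only.
class FastAPITrafficError(Exception):
    """Base for all library errors."""


class RateLimitExceeded(FastAPITrafficError):
    pass


class BackendError(FastAPITrafficError):
    pass


class ConfigurationError(FastAPITrafficError):
    pass


# One except clause is enough to catch any library error:
try:
    raise BackendError("redis down")
except FastAPITrafficError as exc:
    print(type(exc).__name__)
```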
118
docs/api/middleware.rst
Normal file
@@ -0,0 +1,118 @@
Middleware API
==============

Middleware for applying rate limiting globally across your application.

RateLimitMiddleware
-------------------

.. py:class:: RateLimitMiddleware(app, *, limit=100, window_size=60.0, algorithm=Algorithm.SLIDING_WINDOW_COUNTER, backend=None, key_prefix="middleware", include_headers=True, error_message="Rate limit exceeded. Please try again later.", status_code=429, skip_on_error=False, exempt_paths=None, exempt_ips=None, key_extractor=default_key_extractor)

   Middleware for global rate limiting across all endpoints.

   :param app: The ASGI application.
   :type app: ASGIApp
   :param limit: Maximum requests per window.
   :type limit: int
   :param window_size: Time window in seconds.
   :type window_size: float
   :param algorithm: Rate limiting algorithm.
   :type algorithm: Algorithm
   :param backend: Storage backend. Defaults to ``MemoryBackend``.
   :type backend: Backend | None
   :param key_prefix: Prefix for rate limit keys.
   :type key_prefix: str
   :param include_headers: Include rate limit headers in the response.
   :type include_headers: bool
   :param error_message: Error message when rate limited.
   :type error_message: str
   :param status_code: HTTP status code when rate limited.
   :type status_code: int
   :param skip_on_error: Skip rate limiting on backend errors.
   :type skip_on_error: bool
   :param exempt_paths: Paths to exempt from rate limiting.
   :type exempt_paths: set[str] | None
   :param exempt_ips: IP addresses to exempt from rate limiting.
   :type exempt_ips: set[str] | None
   :param key_extractor: Function to extract the client identifier.
   :type key_extractor: Callable[[Request], str]

   **Basic usage:**

   .. code-block:: python

      from fastapi import FastAPI
      from fastapi_traffic.middleware import RateLimitMiddleware

      app = FastAPI()

      app.add_middleware(
          RateLimitMiddleware,
          limit=1000,
          window_size=60,
      )

   **With exemptions:**

   .. code-block:: python

      app.add_middleware(
          RateLimitMiddleware,
          limit=1000,
          window_size=60,
          exempt_paths={"/health", "/docs"},
          exempt_ips={"127.0.0.1"},
      )

   **With custom backend:**

   .. code-block:: python

      from fastapi_traffic import SQLiteBackend

      backend = SQLiteBackend("rate_limits.db")

      app.add_middleware(
          RateLimitMiddleware,
          limit=1000,
          window_size=60,
          backend=backend,
      )

SlidingWindowMiddleware
-----------------------

.. py:class:: SlidingWindowMiddleware(app, *, limit=100, window_size=60.0, **kwargs)

   Convenience middleware using the sliding window algorithm.

   Accepts all the same parameters as ``RateLimitMiddleware``.

   .. code-block:: python

      from fastapi_traffic.middleware import SlidingWindowMiddleware

      app.add_middleware(
          SlidingWindowMiddleware,
          limit=1000,
          window_size=60,
      )

TokenBucketMiddleware
---------------------

.. py:class:: TokenBucketMiddleware(app, *, limit=100, window_size=60.0, **kwargs)

   Convenience middleware using the token bucket algorithm.

   Accepts all the same parameters as ``RateLimitMiddleware``.

   .. code-block:: python

      from fastapi_traffic.middleware import TokenBucketMiddleware

      app.add_middleware(
          TokenBucketMiddleware,
          limit=1000,
          window_size=60,
      )
91
docs/changelog.rst
Normal file
@@ -0,0 +1,91 @@
Changelog
=========

All notable changes to FastAPI Traffic are documented here.

The format is based on `Keep a Changelog <https://keepachangelog.com/en/1.1.0/>`_,
and this project adheres to `Semantic Versioning <https://semver.org/spec/v2.0.0.html>`_.

[0.2.1] - 2026-03-07
--------------------

Changed
^^^^^^^

- Improved config loader validation using Pydantic schemas
- Added pydantic>=2.0 as a core dependency
- Fixed sync wrapper in decorator to properly handle rate limiting
- Updated pyright settings for stricter type checking
- Fixed repository URL in pyproject.toml

Removed
^^^^^^^

- Removed unused main.py

[0.2.0] - 2026-02-04
--------------------

Added
^^^^^

- **Configuration Loader** — Load rate limiting configuration from external files:

  - ``ConfigLoader`` class for loading ``RateLimitConfig`` and ``GlobalConfig``
  - Support for ``.env`` files with ``FASTAPI_TRAFFIC_*`` prefixed variables
  - Support for JSON configuration files
  - Environment variable loading with ``load_rate_limit_config_from_env()`` and ``load_global_config_from_env()``
  - Auto-detection of file format with ``load_rate_limit_config()`` and ``load_global_config()``
  - Custom environment variable prefix support
  - Type validation and comprehensive error handling
  - 47 new tests for configuration loading

- Example ``11_config_loader.py`` demonstrating all configuration loading patterns
- ``get_stats()`` method to ``MemoryBackend`` for consistency with ``RedisBackend``
- Comprehensive test suite with 134 tests covering:

  - All five rate limiting algorithms with timing and concurrency tests
  - Backend tests for Memory and SQLite with edge cases
  - Decorator and middleware integration tests
  - Exception handling and configuration validation
  - End-to-end integration tests with FastAPI apps

- ``httpx`` and ``pytest-asyncio`` as dev dependencies for testing

Changed
^^^^^^^

- Improved documentation in README.md and DEVELOPMENT.md
- Added ``asyncio_default_fixture_loop_scope`` config for pytest-asyncio compatibility

[0.1.0] - 2025-01-09
--------------------

Initial release.

Added
^^^^^

- Core rate limiting with ``@rate_limit`` decorator
- Five algorithms:

  - Token Bucket
  - Sliding Window
  - Fixed Window
  - Leaky Bucket
  - Sliding Window Counter

- Three storage backends:

  - Memory (default) — In-memory with LRU eviction
  - SQLite — Persistent storage with WAL mode
  - Redis — Distributed storage with Lua scripts

- Middleware support for global rate limiting via ``RateLimitMiddleware``
- Dependency injection support with ``RateLimitDependency``
- Custom key extractors for flexible rate limit grouping (by IP, API key, user, etc.)
- Configurable exemptions with ``exempt_when`` callback
- Rate limit headers (``X-RateLimit-Limit``, ``X-RateLimit-Remaining``, ``X-RateLimit-Reset``)
- ``RateLimitExceeded`` exception with ``retry_after`` and ``limit_info``
- Full async support throughout
- Strict type hints (pyright/mypy compatible)
103
docs/conf.py
Normal file
@@ -0,0 +1,103 @@
# Configuration file for the Sphinx documentation builder.
#
# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

import sys
from pathlib import Path

# Add the project root to the path so autodoc can find the modules
sys.path.insert(0, str(Path(__file__).parent.parent.resolve()))

# -- Project information -----------------------------------------------------
project = "fastapi-traffic"
copyright = "2026, zanewalker"
author = "zanewalker"
release = "0.2.1"
version = "0.2.1"

# -- General configuration ---------------------------------------------------
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.napoleon",
    "sphinx.ext.viewcode",
    "sphinx.ext.intersphinx",
    "sphinx.ext.autosummary",
    "sphinx_copybutton",
    "sphinx_design",
    "myst_parser",
]

templates_path = ["_templates"]
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]

# The suffix(es) of source filenames.
source_suffix = {
    ".rst": "restructuredtext",
    ".md": "markdown",
}

# The master toctree document.
master_doc = "index"

# -- Options for HTML output -------------------------------------------------
html_theme = "furo"
html_title = "fastapi-traffic"
html_static_path = ["_static"]

html_theme_options = {
    "light_css_variables": {
        "color-brand-primary": "#009485",
        "color-brand-content": "#009485",
    },
    "dark_css_variables": {
        "color-brand-primary": "#00d4aa",
        "color-brand-content": "#00d4aa",
    },
    "sidebar_hide_name": False,
    "navigation_with_keys": True,
}

# -- Options for autodoc -----------------------------------------------------
autodoc_default_options = {
    "members": True,
    "member-order": "bysource",
    "special-members": "__init__",
    "undoc-members": True,
    "exclude-members": "__weakref__",
}

autodoc_typehints = "description"
autodoc_class_signature = "separated"

# -- Options for Napoleon (Google/NumPy docstrings) --------------------------
napoleon_google_docstring = True
napoleon_numpy_docstring = True
napoleon_include_init_with_doc = True
napoleon_include_private_with_doc = False
napoleon_include_special_with_doc = True
napoleon_use_admonition_for_examples = True
napoleon_use_admonition_for_notes = True
napoleon_use_admonition_for_references = True
napoleon_use_ivar = False
napoleon_use_param = True
napoleon_use_rtype = True
napoleon_preprocess_types = False
napoleon_type_aliases = None
napoleon_attr_annotations = True

# -- Options for intersphinx -------------------------------------------------
intersphinx_mapping = {
    "python": ("https://docs.python.org/3", None),
    "starlette": ("https://www.starlette.io", None),
    "fastapi": ("https://fastapi.tiangolo.com", None),
}

# -- MyST Parser options -----------------------------------------------------
myst_enable_extensions = [
    "colon_fence",
    "deflist",
    "fieldlist",
    "tasklist",
]
myst_heading_anchors = 3
204
docs/contributing.rst
Normal file
@@ -0,0 +1,204 @@
Contributing
============

Thanks for your interest in contributing to FastAPI Traffic! This guide will help
you get started.

Development Setup
-----------------

1. **Clone the repository:**

   .. code-block:: bash

      git clone https://gitlab.com/zanewalker/fastapi-traffic.git
      cd fastapi-traffic

2. **Install uv** (if you don't have it):

   .. code-block:: bash

      curl -LsSf https://astral.sh/uv/install.sh | sh

3. **Create a virtual environment and install dependencies:**

   .. code-block:: bash

      uv venv
      source .venv/bin/activate  # or .venv\Scripts\activate on Windows
      uv pip install -e ".[dev]"

4. **Verify everything works:**

   .. code-block:: bash

      pytest

Running Tests
-------------

Run the full test suite:

.. code-block:: bash

   pytest

Run with coverage:

.. code-block:: bash

   pytest --cov=fastapi_traffic --cov-report=html

Run specific tests:

.. code-block:: bash

   pytest tests/test_algorithms.py
   pytest -k "test_token_bucket"

Code Style
----------

We use ruff for linting and formatting:

.. code-block:: bash

   # Check for issues
   ruff check .

   # Auto-fix issues
   ruff check --fix .

   # Format code
   ruff format .

Type Checking
-------------

We use pyright for type checking:

.. code-block:: bash

   pyright

The codebase is strictly typed. All public APIs should have complete type hints.

Making Changes
--------------

1. **Create a branch:**

   .. code-block:: bash

      git checkout -b feature/my-feature

2. **Make your changes.** Follow the existing code style.

3. **Add tests.** All new features should have tests.

4. **Run the checks:**

   .. code-block:: bash

      ruff check .
      ruff format .
      pyright
      pytest

5. **Commit your changes:**

   .. code-block:: bash

      git commit -m "feat: add my feature"

   We follow `Conventional Commits <https://www.conventionalcommits.org/>`_:

   - ``feat:`` New features
   - ``fix:`` Bug fixes
   - ``docs:`` Documentation changes
   - ``style:`` Code style changes (formatting, etc.)
   - ``refactor:`` Code refactoring
   - ``test:`` Adding or updating tests
   - ``chore:`` Maintenance tasks

6. **Push and create a merge request:**

   .. code-block:: bash

      git push origin feature/my-feature

Project Structure
-----------------

.. code-block:: text

   fastapi-traffic/
   ├── fastapi_traffic/
   │   ├── __init__.py          # Public API exports
   │   ├── exceptions.py        # Custom exceptions
   │   ├── middleware.py        # Rate limit middleware
   │   ├── backends/
   │   │   ├── base.py          # Backend abstract class
   │   │   ├── memory.py        # In-memory backend
   │   │   ├── sqlite.py        # SQLite backend
   │   │   └── redis.py         # Redis backend
   │   └── core/
   │       ├── algorithms.py    # Rate limiting algorithms
   │       ├── config.py        # Configuration classes
   │       ├── config_loader.py # Configuration loading
   │       ├── decorator.py     # @rate_limit decorator
   │       ├── limiter.py       # Main RateLimiter class
   │       └── models.py        # Data models
   ├── tests/
   │   ├── test_algorithms.py
   │   ├── test_backends.py
   │   ├── test_decorator.py
   │   └── ...
   ├── examples/
   │   ├── 01_quickstart.py
   │   └── ...
   └── docs/
       └── ...

Guidelines
----------

**Code:**

- Keep functions focused and small
- Use descriptive variable names
- Add docstrings to public functions and classes
- Follow existing patterns in the codebase

**Tests:**

- Test both happy path and edge cases
- Use descriptive test names
- Mock external dependencies (Redis, etc.)
- Keep tests fast and isolated

**Documentation:**

- Update docs when adding features
- Include code examples
- Keep language clear and concise

Reporting Issues
----------------

Found a bug? Have a feature request? Please open an issue on GitLab:

https://gitlab.com/zanewalker/fastapi-traffic/issues

Include:

- What you expected to happen
- What actually happened
- Steps to reproduce
- Python version and OS
- FastAPI Traffic version

Questions?
----------

Feel free to open an issue for questions. We're happy to help!
105
docs/getting-started/installation.rst
Normal file
@@ -0,0 +1,105 @@
Installation
============

FastAPI Traffic supports Python 3.10 and above. You can install it using pip, uv, or
any other Python package manager.

Basic Installation
------------------

The basic installation includes the memory backend, which is perfect for development
and single-process applications:

.. tab-set::

   .. tab-item:: pip

      .. code-block:: bash

         pip install git+https://gitlab.com/zanewalker/fastapi-traffic.git

   .. tab-item:: uv

      .. code-block:: bash

         uv add git+https://gitlab.com/zanewalker/fastapi-traffic.git

   .. tab-item:: poetry

      .. code-block:: bash

         poetry add git+https://gitlab.com/zanewalker/fastapi-traffic.git

With Redis Support
------------------

If you're running a distributed system with multiple application instances, you'll
want the Redis backend:

.. tab-set::

   .. tab-item:: pip

      .. code-block:: bash

         pip install "git+https://gitlab.com/zanewalker/fastapi-traffic.git[redis]"

   .. tab-item:: uv

      .. code-block:: bash

         uv add "git+https://gitlab.com/zanewalker/fastapi-traffic.git[redis]"

Everything
----------

Want it all? Install with the ``all`` extra:

.. code-block:: bash

   pip install "git+https://gitlab.com/zanewalker/fastapi-traffic.git[all]"

This includes Redis support and ensures FastAPI is installed as well.

Dependencies
------------

FastAPI Traffic has minimal dependencies:

- **pydantic** (>=2.0) — For configuration validation
- **starlette** (>=0.27.0) — The ASGI framework that FastAPI is built on

Optional dependencies:

- **redis** (>=5.0.0) — Required for the Redis backend
- **fastapi** (>=0.100.0) — While not strictly required (we work with Starlette directly),
  you probably want this

Verifying the Installation
--------------------------

After installation, you can verify everything is working:
.. code-block:: python

   import fastapi_traffic
   print(fastapi_traffic.__version__)
   # Should print: 0.3.0
Or check which backends are available:

.. code-block:: python

   from fastapi_traffic import MemoryBackend, SQLiteBackend
   print("Memory and SQLite backends available!")

   try:
       from fastapi_traffic import RedisBackend
       print("Redis backend available!")
   except ImportError:
       print("Redis backend not installed (install with [redis] extra)")

What's Next?
------------

Head over to the :doc:`quickstart` guide to start rate limiting your endpoints.
220
docs/getting-started/quickstart.rst
Normal file
@@ -0,0 +1,220 @@
Quickstart
==========

Let's get rate limiting working in your FastAPI app. This guide covers the basics —
you'll have something running in under five minutes.

Your First Rate Limit
---------------------

The simplest way to add rate limiting is with the ``@rate_limit`` decorator:

.. code-block:: python

   from fastapi import FastAPI, Request
   from fastapi_traffic import rate_limit

   app = FastAPI()

   @app.get("/api/hello")
   @rate_limit(10, 60)  # 10 requests per 60 seconds
   async def hello(request: Request):
       return {"message": "Hello, World!"}

That's the whole thing. Let's break down what's happening:

1. The decorator takes two arguments: ``limit`` (max requests) and ``window_size`` (in seconds)
2. Each client is identified by their IP address by default
3. When a client exceeds the limit, they get a 429 response with a ``Retry-After`` header

.. note::

   The ``request: Request`` parameter is required. FastAPI Traffic needs access to the
   request to identify the client and track their usage.

Testing It Out
--------------

Fire up your app and hit the endpoint a few times:

.. code-block:: bash

   # Start your app
   uvicorn main:app --reload

   # In another terminal, make some requests
   curl -i http://localhost:8000/api/hello

You'll see headers like these in the response:

.. code-block:: http

   HTTP/1.1 200 OK
   X-RateLimit-Limit: 10
   X-RateLimit-Remaining: 9
   X-RateLimit-Reset: 1709834400

After 10 requests, you'll get:

.. code-block:: http

   HTTP/1.1 429 Too Many Requests
   Retry-After: 45
   X-RateLimit-Limit: 10
   X-RateLimit-Remaining: 0

Choosing an Algorithm
---------------------

Different situations call for different rate limiting strategies. Here's a quick guide:

.. code-block:: python

   from fastapi_traffic import rate_limit, Algorithm

   # Token Bucket - great for APIs that need burst handling
   # Allows short bursts of traffic, then smooths out
   @app.get("/api/burst-friendly")
   @rate_limit(100, 60, algorithm=Algorithm.TOKEN_BUCKET, burst_size=20)
   async def burst_endpoint(request: Request):
       return {"status": "ok"}

   # Sliding Window - most accurate, but uses more memory
   # Perfect when you need precise rate limiting
   @app.get("/api/precise")
   @rate_limit(100, 60, algorithm=Algorithm.SLIDING_WINDOW)
   async def precise_endpoint(request: Request):
       return {"status": "ok"}

   # Fixed Window - simple and efficient
   # Good for most use cases, slight edge case at window boundaries
   @app.get("/api/simple")
   @rate_limit(100, 60, algorithm=Algorithm.FIXED_WINDOW)
   async def simple_endpoint(request: Request):
       return {"status": "ok"}

See :doc:`/user-guide/algorithms` for a deep dive into each algorithm.

Rate Limiting by API Key
------------------------

IP-based limiting is fine for public endpoints, but for authenticated APIs you
probably want to limit by API key:

.. code-block:: python

   def get_api_key(request: Request) -> str:
       """Extract API key from header, fall back to IP."""
       api_key = request.headers.get("X-API-Key")
       if api_key:
           return f"key:{api_key}"
       # Fall back to IP for unauthenticated requests
       return request.client.host if request.client else "unknown"

   @app.get("/api/data")
   @rate_limit(1000, 3600, key_extractor=get_api_key)  # 1000/hour per API key
   async def get_data(request: Request):
       return {"data": "sensitive stuff"}

Global Rate Limiting with Middleware
------------------------------------

Sometimes you want a blanket rate limit across your entire API. That's what
middleware is for:

.. code-block:: python

   from fastapi_traffic.middleware import RateLimitMiddleware

   app = FastAPI()

   app.add_middleware(
       RateLimitMiddleware,
       limit=1000,
       window_size=60,
       exempt_paths={"/health", "/docs", "/openapi.json"},
   )

   # All endpoints now have a shared 1000 req/min limit
   @app.get("/api/users")
   async def get_users():
       return {"users": []}

   @app.get("/api/posts")
   async def get_posts():
       return {"posts": []}

Using a Persistent Backend
--------------------------

The default memory backend works great for development, but it doesn't survive
restarts and doesn't work across multiple processes. For production, use SQLite
or Redis:

**SQLite** — Good for single-node deployments:

.. code-block:: python

   from fastapi_traffic import RateLimiter, SQLiteBackend
   from fastapi_traffic.core.limiter import set_limiter

   # Set up persistent storage
   backend = SQLiteBackend("rate_limits.db")
   limiter = RateLimiter(backend)
   set_limiter(limiter)

   @app.on_event("startup")
   async def startup():
       await limiter.initialize()

   @app.on_event("shutdown")
   async def shutdown():
       await limiter.close()

**Redis** — Required for distributed systems:

.. code-block:: python

   from fastapi_traffic import RateLimiter
   from fastapi_traffic.backends.redis import RedisBackend
   from fastapi_traffic.core.limiter import set_limiter

   @app.on_event("startup")
   async def startup():
       backend = await RedisBackend.from_url("redis://localhost:6379/0")
       limiter = RateLimiter(backend)
       set_limiter(limiter)
       await limiter.initialize()

Handling Rate Limit Errors
--------------------------

By default, exceeding the rate limit raises a ``RateLimitExceeded`` exception that
returns a 429 response. You can customize this:

.. code-block:: python

   from fastapi import Request
   from fastapi.responses import JSONResponse
   from fastapi_traffic import RateLimitExceeded

   @app.exception_handler(RateLimitExceeded)
   async def rate_limit_handler(request: Request, exc: RateLimitExceeded):
       return JSONResponse(
           status_code=429,
           content={
               "error": "slow_down",
               "message": "You're making too many requests. Take a breather.",
               "retry_after": exc.retry_after,
           },
       )

What's Next?
------------

You've got the basics down. Here's where to go from here:

- :doc:`/user-guide/algorithms` — Understand when to use each algorithm
- :doc:`/user-guide/backends` — Learn about storage options
- :doc:`/user-guide/key-extractors` — Advanced client identification
- :doc:`/user-guide/configuration` — Load settings from files and environment variables
148
docs/index.rst
Normal file
@@ -0,0 +1,148 @@
FastAPI Traffic
===============

**Production-grade rate limiting for FastAPI that just works.**

.. image:: https://img.shields.io/badge/python-3.10+-blue.svg
   :target: https://www.python.org/downloads/

.. image:: https://img.shields.io/badge/license-Apache%202.0-green.svg
   :target: https://www.apache.org/licenses/LICENSE-2.0

----

FastAPI Traffic is a rate limiting library designed for real-world FastAPI applications.
It gives you five battle-tested algorithms, three storage backends, and a clean API that
stays out of your way.

Whether you're building a public API that needs to handle thousands of requests per second
or a small internal service that just needs basic protection, this library has you covered.

Quick Example
-------------

Here's how simple it is to add rate limiting to your FastAPI app:

.. code-block:: python

   from fastapi import FastAPI, Request
   from fastapi_traffic import rate_limit

   app = FastAPI()

   @app.get("/api/users")
   @rate_limit(100, 60)  # 100 requests per minute
   async def get_users(request: Request):
       return {"users": ["alice", "bob"]}

That's it. Your endpoint is now rate limited. Clients get helpful headers telling them
how many requests they have left, and when they can try again if they hit the limit.

Why FastAPI Traffic?
--------------------

Most rate limiting libraries fall into one of two camps: either they're too simple
(fixed window only, no persistence) or they're way too complicated (requires reading
a 50-page manual just to get started).

We tried to hit the sweet spot:

- **Five algorithms** — Pick the one that fits your use case. Token bucket for burst
  handling, sliding window for precision, fixed window for simplicity.

- **Three backends** — Memory for development, SQLite for single-node production,
  Redis for distributed systems.

- **Works how you'd expect** — Decorator for endpoints, middleware for global limits,
  dependency injection if that's your style.

- **Fully async** — Built from the ground up for async Python. No blocking calls,
  no thread pool hacks.

- **Type-checked** — Full type hints throughout. Works great with pyright and mypy.

What's in the Box
-----------------

.. grid:: 2
   :gutter: 3

   .. grid-item-card:: 🚦 Rate Limiting
      :link: getting-started/quickstart
      :link-type: doc

      Decorator-based rate limiting with sensible defaults.

   .. grid-item-card:: 🔧 Algorithms
      :link: user-guide/algorithms
      :link-type: doc

      Token bucket, sliding window, fixed window, leaky bucket, and more.

   .. grid-item-card:: 💾 Backends
      :link: user-guide/backends
      :link-type: doc

      Memory, SQLite, and Redis storage options.

   .. grid-item-card:: ⚙️ Configuration
      :link: user-guide/configuration
      :link-type: doc

      Load settings from environment variables or config files.

.. toctree::
   :maxdepth: 2
   :caption: Getting Started
   :hidden:

   getting-started/installation
   getting-started/quickstart

.. toctree::
   :maxdepth: 2
   :caption: User Guide
   :hidden:

   user-guide/algorithms
   user-guide/backends
   user-guide/middleware
   user-guide/configuration
   user-guide/key-extractors
   user-guide/exception-handling

.. toctree::
   :maxdepth: 2
   :caption: Advanced Topics
   :hidden:

   advanced/distributed-systems
   advanced/performance
   advanced/testing

.. toctree::
   :maxdepth: 2
   :caption: API Reference
   :hidden:

   api/decorator
   api/middleware
   api/algorithms
   api/backends
   api/config
   api/exceptions

.. toctree::
   :maxdepth: 1
   :caption: Project
   :hidden:

   changelog
   contributing

Indices and tables
------------------

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
5
docs/requirements.txt
Normal file
@@ -0,0 +1,5 @@
sphinx>=7.0.0
furo>=2024.0.0
sphinx-copybutton>=0.5.0
myst-parser>=2.0.0
sphinx-design>=0.5.0
290
docs/user-guide/algorithms.rst
Normal file
@@ -0,0 +1,290 @@
Rate Limiting Algorithms
========================

FastAPI Traffic ships with five rate limiting algorithms. Each has its own strengths,
and picking the right one depends on what you're trying to achieve.

This guide will help you understand the tradeoffs and choose wisely.

Overview
--------

Here's the quick comparison:

.. list-table::
   :header-rows: 1
   :widths: 20 40 40

   * - Algorithm
     - Best For
     - Tradeoffs
   * - **Token Bucket**
     - APIs that need burst handling
     - Allows temporary spikes above average rate
   * - **Sliding Window**
     - Precise rate limiting
     - Higher memory usage
   * - **Fixed Window**
     - Simple, low-overhead limiting
     - Boundary issues (2x burst at window edges)
   * - **Leaky Bucket**
     - Consistent throughput
     - No burst handling
   * - **Sliding Window Counter**
     - General purpose (default)
     - Good balance of precision and efficiency

Token Bucket
------------

Think of this as a bucket that holds tokens. Each request consumes a token, and
tokens refill at a steady rate. If the bucket is empty, requests are rejected.

.. code-block:: python

   from fastapi_traffic import rate_limit, Algorithm

   @app.get("/api/data")
   @rate_limit(
       100,  # 100 tokens refill per minute
       60,
       algorithm=Algorithm.TOKEN_BUCKET,
       burst_size=20,  # bucket can hold up to 20 tokens
   )
   async def get_data(request: Request):
       return {"data": "here"}

**How it works:**

1. The bucket starts full (at ``burst_size`` capacity)
2. Each request removes one token
3. Tokens refill at ``limit / window_size`` per second
4. If no tokens are available, the request is rejected

**When to use it:**

- Your API has legitimate burst traffic (e.g., page loads that trigger multiple requests)
- You want to allow short spikes while maintaining an average rate
- Mobile apps that batch requests when coming online

**Example scenario:** A mobile app that syncs data when it reconnects. You want to
allow it to catch up quickly, but not overwhelm your servers.
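The refill-and-consume mechanics above fit in a few lines of plain Python. This is an illustrative sketch, not the library's implementation; the ``TokenBucket`` class and its names are made up for this example.

```python
import time


class TokenBucket:
    """Sketch of a token bucket: `rate` tokens refill per second,
    up to `capacity` (the burst size)."""

    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity            # the bucket starts full
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill based on elapsed time, capped at the bucket capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1              # this request consumes one token
            return True
        return False


# A 20-token burst, refilling at 100 tokens per 60 seconds
bucket = TokenBucket(capacity=20, rate=100 / 60)
print(sum(bucket.allow() for _ in range(25)))  # prints 20: the burst passes, then requests are rejected
```

Note how the bucket stores just two numbers (the token count and a timestamp), which is why the storage cost stays constant no matter how much traffic arrives.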
Sliding Window
--------------

This algorithm tracks the exact timestamp of every request within the window. It's
the most accurate approach, but uses more memory.

.. code-block:: python

   @app.get("/api/transactions")
   @rate_limit(100, 60, algorithm=Algorithm.SLIDING_WINDOW)
   async def get_transactions(request: Request):
       return {"transactions": []}

**How it works:**

1. Every request timestamp is stored
2. When checking, we count requests in the last ``window_size`` seconds
3. Old timestamps are cleaned up automatically

**When to use it:**

- You need precise rate limiting (financial APIs, compliance requirements)
- Memory isn't a major concern
- The rate limit is relatively low (not millions of requests)

**Tradeoffs:**

- Memory usage grows with request volume
- Slightly more CPU for timestamp management
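The timestamp log described above can be sketched with a ``deque``. This is illustrative only (the ``SlidingWindowLog`` name is invented for the example), but it shows both the cleanup step and why memory grows with request volume:

```python
from collections import deque


class SlidingWindowLog:
    """Sketch of a sliding window: keep one timestamp per request."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.hits: deque = deque()

    def allow(self, now: float) -> bool:
        # Drop timestamps that have aged out of the window (automatic cleanup)
        while self.hits and self.hits[0] <= now - self.window:
            self.hits.popleft()
        if len(self.hits) < self.limit:
            self.hits.append(now)
            return True
        return False


limiter = SlidingWindowLog(limit=3, window=60.0)
print([limiter.allow(t) for t in (0.0, 1.0, 2.0, 3.0, 61.5)])
# [True, True, True, False, True]; the 4th request is over the limit,
# and by t=61.5 the t=0 and t=1 hits have expired
```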
Fixed Window
------------

The simplest algorithm. Divide time into fixed windows (e.g., every minute) and
count requests in each window.

.. code-block:: python

   @app.get("/api/simple")
   @rate_limit(100, 60, algorithm=Algorithm.FIXED_WINDOW)
   async def simple_endpoint(request: Request):
       return {"status": "ok"}

**How it works:**

1. Time is divided into fixed windows (0:00-1:00, 1:00-2:00, etc.)
2. Each request increments the counter for the current window
3. When the window changes, the counter resets

**When to use it:**

- You want the simplest, most efficient option
- Slight inaccuracy at window boundaries is acceptable
- High-volume scenarios where memory matters

**The boundary problem:**

A client could make 100 requests at 0:59 and another 100 at 1:01, effectively
getting 200 requests in 2 seconds. If this matters for your use case, use
sliding window counter instead.
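The boundary problem is easy to demonstrate with a minimal fixed-window counter in plain Python (an illustrative sketch, not the library's code):

```python
class FixedWindow:
    """Sketch of a fixed window: one counter per time bucket."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.bucket = None   # index of the current window
        self.count = 0

    def allow(self, now: float) -> bool:
        bucket = int(now // self.window)
        if bucket != self.bucket:   # new window: reset the counter
            self.bucket = bucket
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


limiter = FixedWindow(limit=100, window=60.0)
burst_before = sum(limiter.allow(59.9) for _ in range(150))  # 100 allowed at 0:59.9
burst_after = sum(limiter.allow(60.1) for _ in range(150))   # counter reset: 100 more at 1:00.1
print(burst_before + burst_after)  # 200 requests accepted in 0.2 seconds
```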
Leaky Bucket
------------

Imagine a bucket with a hole in the bottom. Requests fill the bucket, and it
"leaks" at a constant rate. If the bucket overflows, requests are rejected.

.. code-block:: python

   @app.get("/api/steady")
   @rate_limit(
       100,
       60,
       algorithm=Algorithm.LEAKY_BUCKET,
       burst_size=10,  # bucket capacity
   )
   async def steady_endpoint(request: Request):
       return {"status": "ok"}

**How it works:**

1. The bucket has a maximum capacity (``burst_size``)
2. Each request adds "water" to the bucket
3. Water leaks out at ``limit / window_size`` per second
4. If the bucket would overflow, the request is rejected

**When to use it:**

- You need consistent, smooth throughput
- Downstream systems can't handle bursts
- Processing capacity is truly fixed (e.g., hardware limitations)

**Difference from token bucket:**

- Token bucket allows bursts up to the bucket size
- Leaky bucket smooths out traffic to a constant rate
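The fill-and-drain behavior can be sketched in a few lines of plain Python (illustrative only; the ``LeakyBucket`` name is invented for this example):

```python
class LeakyBucket:
    """Sketch of a leaky bucket: the bucket drains at `rate` per second."""

    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.level = 0.0      # how full the bucket is
        self.updated = 0.0

    def allow(self, now: float) -> bool:
        # Water leaks out continuously at the fixed rate
        self.level = max(0.0, self.level - (now - self.updated) * self.rate)
        self.updated = now
        if self.level + 1 <= self.capacity:
            self.level += 1   # this request adds one unit of "water"
            return True
        return False


limiter = LeakyBucket(capacity=10, rate=100 / 60)  # drains ~1.67 req/s
print(sum(limiter.allow(0.0) for _ in range(15)))  # prints 10: only the first 10 fit
```

Compare this with the token bucket sketch: the state is the same (two floats), but a full leaky bucket forces callers down to the drain rate instead of letting them spend saved-up tokens.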
Sliding Window Counter
----------------------

This is the default algorithm, and it's a good choice for most use cases. It
combines the efficiency of fixed windows with better accuracy.

.. code-block:: python

   @app.get("/api/default")
   @rate_limit(100, 60, algorithm=Algorithm.SLIDING_WINDOW_COUNTER)
   async def default_endpoint(request: Request):
       return {"status": "ok"}

**How it works:**

1. Maintains counters for the current and previous windows
2. Calculates a weighted average based on how far into the current window we are
3. At 30 seconds into a 60-second window: ``count = prev_count * 0.5 + curr_count``
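The weighting step above can be written as a tiny helper. This is an illustrative function (``weighted_count`` is a made-up name) that matches the formula in step 3:

```python
def weighted_count(prev_count: int, curr_count: int,
                   elapsed: float, window: float) -> float:
    """Sliding-window-counter estimate: weight the previous window's
    count by how much of it still overlaps the sliding window."""
    weight = (window - elapsed) / window
    return prev_count * weight + curr_count


# 30 s into a 60 s window: half of the previous window still counts
print(weighted_count(prev_count=80, curr_count=50, elapsed=30, window=60))  # 90.0
```

At the very start of a window the previous count carries full weight, and at the end it carries none, which is how the estimate avoids the fixed-window reset cliff.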
**When to use it:**

- General purpose rate limiting
- You want better accuracy than fixed window without the memory cost of sliding window
- Most APIs fall into this category

**Why it's the default:**

It gives you 90% of the accuracy of sliding window with the memory efficiency of
fixed window. Unless you have specific requirements, this is probably what you want.

Choosing the Right Algorithm
----------------------------

Here's a decision tree:

1. **Do you need to allow bursts?**

   - Yes → Token Bucket
   - No, I need smooth traffic → Leaky Bucket

2. **Do you need exact precision?**

   - Yes, compliance/financial → Sliding Window
   - No, good enough is fine → Continue

3. **Is memory a concern?**

   - Yes, high volume → Fixed Window
   - No → Sliding Window Counter (default)

Performance Comparison
----------------------

All algorithms are O(1) for the check operation, but they differ in storage:

.. list-table::
   :header-rows: 1

   * - Algorithm
     - Storage per Key
     - Operations
   * - Token Bucket
     - 2 floats
     - 1 read, 1 write
   * - Sliding Window
     - N timestamps
     - 1 read, 1 write, cleanup
   * - Fixed Window
     - 1 int, 1 float
     - 1 read, 1 write
   * - Leaky Bucket
     - 2 floats
     - 1 read, 1 write
   * - Sliding Window Counter
     - 3 values
     - 1 read, 1 write

For most applications, the performance difference is negligible. Choose based on
behavior, not performance, unless you're handling millions of requests per second.

Code Examples
-------------

Here's a complete example showing all algorithms:

.. code-block:: python

   from fastapi import FastAPI, Request
   from fastapi_traffic import rate_limit, Algorithm

   app = FastAPI()

   # Burst-friendly endpoint
   @app.get("/api/burst")
   @rate_limit(100, 60, algorithm=Algorithm.TOKEN_BUCKET, burst_size=25)
   async def burst_endpoint(request: Request):
       return {"type": "token_bucket"}

   # Precise limiting
   @app.get("/api/precise")
   @rate_limit(100, 60, algorithm=Algorithm.SLIDING_WINDOW)
   async def precise_endpoint(request: Request):
       return {"type": "sliding_window"}

   # Simple and efficient
   @app.get("/api/simple")
   @rate_limit(100, 60, algorithm=Algorithm.FIXED_WINDOW)
   async def simple_endpoint(request: Request):
       return {"type": "fixed_window"}

   # Smooth throughput
   @app.get("/api/steady")
   @rate_limit(100, 60, algorithm=Algorithm.LEAKY_BUCKET)
   async def steady_endpoint(request: Request):
       return {"type": "leaky_bucket"}

   # Best of both worlds (default)
   @app.get("/api/balanced")
   @rate_limit(100, 60, algorithm=Algorithm.SLIDING_WINDOW_COUNTER)
   async def balanced_endpoint(request: Request):
       return {"type": "sliding_window_counter"}
312
docs/user-guide/backends.rst
Normal file
@@ -0,0 +1,312 @@
Storage Backends
================

FastAPI Traffic needs somewhere to store rate limit state — how many requests each
client has made, when their window resets, and so on. That's what backends are for.

You have three options, each suited to different deployment scenarios.

Choosing a Backend
------------------

Here's the quick guide:

.. list-table::
   :header-rows: 1
   :widths: 20 30 50

   * - Backend
     - Use When
     - Limitations
   * - **Memory**
     - Development, single-process apps
     - Lost on restart, doesn't share across processes
   * - **SQLite**
     - Single-node production
     - Doesn't share across machines
   * - **Redis**
     - Distributed systems, multiple nodes
     - Requires Redis infrastructure

Memory Backend
--------------

The default backend. It stores everything in memory using a dictionary with LRU
eviction and automatic TTL cleanup.

.. code-block:: python

   from fastapi_traffic import MemoryBackend, RateLimiter
   from fastapi_traffic.core.limiter import set_limiter

   # This is what happens by default, but you can configure it:
   backend = MemoryBackend(
       max_size=10000,       # Maximum number of keys to store
       cleanup_interval=60,  # How often to clean expired entries (seconds)
   )
   limiter = RateLimiter(backend)
   set_limiter(limiter)

**When to use it:**

- Local development
- Single-process applications
- Testing and CI/CD pipelines
- When you don't need persistence

**Limitations:**

- State is lost when the process restarts
- Doesn't work with multiple workers (each worker has its own memory)
- Not suitable for ``gunicorn`` with multiple workers or Kubernetes pods

**Memory management:**

The backend automatically evicts old entries when it hits ``max_size``. It uses
LRU (Least Recently Used) eviction, so inactive clients get cleaned up first.
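A dictionary with TTL and LRU eviction can be sketched with ``collections.OrderedDict``. This is an illustrative sketch of the idea, not the backend's actual implementation; ``TTLStore`` is a made-up name:

```python
from collections import OrderedDict


class TTLStore:
    """Sketch of a TTL dict with LRU eviction."""

    def __init__(self, max_size: int):
        self.max_size = max_size
        self.data: OrderedDict = OrderedDict()

    def set(self, key: str, value, expires_at: float) -> None:
        self.data[key] = (value, expires_at)
        self.data.move_to_end(key)              # mark as most recently used
        while len(self.data) > self.max_size:   # evict least recently used
            self.data.popitem(last=False)

    def get(self, key: str, now: float):
        item = self.data.get(key)
        if item is None or item[1] <= now:      # missing or expired
            self.data.pop(key, None)
            return None
        self.data.move_to_end(key)
        return item[0]


store = TTLStore(max_size=2)
store.set("a", 1, expires_at=100.0)
store.set("b", 2, expires_at=100.0)
store.set("c", 3, expires_at=100.0)            # "a" is evicted (least recently used)
print(store.get("a", now=0.0), store.get("b", now=0.0))  # None 2
```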
|
||||
|
||||
SQLite Backend
|
||||
--------------
|
||||
|
||||
For single-node production deployments where you need persistence. Rate limits
|
||||
survive restarts and work across multiple processes on the same machine.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from fastapi_traffic import SQLiteBackend, RateLimiter
|
||||
from fastapi_traffic.core.limiter import set_limiter
|
||||
|
||||
backend = SQLiteBackend(
|
||||
"rate_limits.db", # Database file path
|
||||
cleanup_interval=300, # Clean expired entries every 5 minutes
|
||||
)
|
||||
limiter = RateLimiter(backend)
|
||||
set_limiter(limiter)
|
||||
|
||||
@app.on_event("startup")
|
||||
async def startup():
|
||||
await limiter.initialize()
|
||||
|
||||
@app.on_event("shutdown")
|
||||
async def shutdown():
|
||||
await limiter.close()
|
||||
|
||||
**When to use it:**
|
||||
|
||||
- Single-server deployments
|
||||
- When you need rate limits to survive restarts
|
||||
- Multiple workers on the same machine (gunicorn, uvicorn with workers)
|
||||
- When Redis is overkill for your use case
|
||||
|
||||
**Performance notes:**
|
||||
|
||||
- Uses WAL (Write-Ahead Logging) mode for better concurrent performance
|
||||
- Connection pooling is handled automatically
|
||||
- Writes are batched where possible
|
||||
|
||||
**File location:**
|
||||
|
||||
Put the database file somewhere persistent. For Docker deployments, mount a volume:
|
||||
|
||||
.. code-block:: yaml
|
||||
|
||||
# docker-compose.yml
|
||||
services:
|
||||
api:
|
||||
volumes:
|
||||
- ./data:/app/data
|
||||
environment:
|
||||
- RATE_LIMIT_DB=/app/data/rate_limits.db
|
||||
|
||||
Redis Backend
|
||||
-------------
|
||||
|
||||
The go-to choice for distributed systems. All your application instances share
|
||||
the same rate limit state.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from fastapi_traffic import RateLimiter
|
||||
from fastapi_traffic.backends.redis import RedisBackend
|
||||
from fastapi_traffic.core.limiter import set_limiter
|
||||
|
||||
@app.on_event("startup")
|
||||
async def startup():
|
||||
backend = await RedisBackend.from_url(
|
||||
"redis://localhost:6379/0",
|
||||
key_prefix="myapp:ratelimit", # Optional prefix for all keys
|
||||
)
|
||||
limiter = RateLimiter(backend)
|
||||
set_limiter(limiter)
|
||||
await limiter.initialize()
|
||||
|
||||
@app.on_event("shutdown")
|
||||
async def shutdown():
|
||||
await limiter.close()
|
||||
|
||||
**When to use it:**
|
||||
|
||||
- Multiple application instances (Kubernetes, load-balanced servers)
|
||||
- When you need rate limits shared across your entire infrastructure
|
||||
- High-availability requirements
|
||||
|
||||
**Connection options:**
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
# Simple connection
|
||||
backend = await RedisBackend.from_url("redis://localhost:6379/0")
|
||||
|
||||
# With authentication
|
||||
backend = await RedisBackend.from_url("redis://:password@localhost:6379/0")
|
||||
|
||||
# Redis Sentinel for HA
|
||||
backend = await RedisBackend.from_url(
|
||||
"redis://sentinel1:26379/0",
|
||||
sentinel_master="mymaster",
|
||||
)
|
||||
|
||||
# Redis Cluster
|
||||
backend = await RedisBackend.from_url("redis://node1:6379,node2:6379,node3:6379/0")
|
||||
|
||||
**Atomic operations:**

The Redis backend uses Lua scripts to ensure atomic operations. This means rate
limit checks are accurate even under high concurrency, with no race conditions.

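To illustrate the guarantee (this is a sketch, not the library's actual Lua script), here is what an atomic check-and-increment looks like in plain Python, with a lock standing in for the single-threaded execution Redis gives a Lua script:

```python
import threading

LIMIT = 100
counts: dict[str, int] = {}
lock = threading.Lock()


def allow(key: str) -> bool:
    """Check-and-increment as one indivisible step, like a Lua script in Redis."""
    # Without the lock, two clients could read the same count and both pass
    # the check, exceeding the limit by one.
    with lock:
        current = counts.get(key, 0)
        if current >= LIMIT:
            return False
        counts[key] = current + 1
        return True
```

Redis achieves the same effect server-side: a Lua script runs as a single operation, so no other command can interleave between the read and the write.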
**Key expiration:**

Keys automatically expire based on the rate limit window. You don't need to worry
about Redis filling up with stale data.

Switching Backends
------------------

You can switch backends without changing your rate limiting code. Just configure
a different backend at startup:

.. code-block:: python

    import os
    from fastapi_traffic import RateLimiter, MemoryBackend, SQLiteBackend
    from fastapi_traffic.core.limiter import set_limiter


    async def get_backend():
        """Choose backend based on environment."""
        env = os.getenv("ENVIRONMENT", "development")

        if env == "production":
            redis_url = os.getenv("REDIS_URL")
            if redis_url:
                from fastapi_traffic.backends.redis import RedisBackend
                return await RedisBackend.from_url(redis_url)
            return SQLiteBackend("/app/data/rate_limits.db")

        return MemoryBackend()


    @app.on_event("startup")
    async def startup():
        backend = await get_backend()
        limiter = RateLimiter(backend)
        set_limiter(limiter)
        await limiter.initialize()

Custom Backends
---------------

Need something different? Maybe you want to use PostgreSQL, DynamoDB, or some
other storage system. You can implement your own backend:

.. code-block:: python

    from fastapi_traffic.backends.base import Backend
    from typing import Any


    class MyCustomBackend(Backend):
        async def get(self, key: str) -> dict[str, Any] | None:
            """Retrieve state for a key."""
            # Your implementation here
            pass

        async def set(self, key: str, value: dict[str, Any], *, ttl: float) -> None:
            """Store state with TTL."""
            pass

        async def delete(self, key: str) -> None:
            """Delete a key."""
            pass

        async def exists(self, key: str) -> bool:
            """Check if key exists."""
            pass

        async def increment(self, key: str, amount: int = 1) -> int:
            """Atomically increment a counter."""
            pass

        async def clear(self) -> None:
            """Clear all data."""
            pass

        async def close(self) -> None:
            """Clean up resources."""
            pass

The key methods are ``get``, ``set``, and ``delete``. The state is stored as a
dictionary, and the backend is responsible for serialization.

Backend Comparison
------------------

.. list-table::
   :header-rows: 1

   * - Feature
     - Memory
     - SQLite
     - Redis
   * - Persistence
     - ❌
     - ✅
     - ✅
   * - Multi-process
     - ❌
     - ✅
     - ✅
   * - Multi-node
     - ❌
     - ❌
     - ✅
   * - Setup complexity
     - None
     - Low
     - Medium
   * - Latency
     - ~0.01ms
     - ~0.1ms
     - ~1ms
   * - Dependencies
     - None
     - None
     - redis package

Best Practices
--------------

1. **Start with Memory, upgrade when needed.** Don't over-engineer. Memory is
   fine for development and many production scenarios.

2. **Use Redis for distributed systems.** If you have multiple application
   instances, Redis is the only option that works correctly.

3. **Handle backend errors gracefully.** Set ``skip_on_error=True`` if you'd
   rather allow requests through than fail when the backend is down:

   .. code-block:: python

       @rate_limit(100, 60, skip_on_error=True)
       async def endpoint(request: Request):
           return {"status": "ok"}

4. **Monitor your backend.** Keep an eye on memory usage (Memory backend),
   disk space (SQLite), or Redis memory and connections.
315
docs/user-guide/configuration.rst
Normal file
@@ -0,0 +1,315 @@
Configuration
=============

FastAPI Traffic supports loading configuration from environment variables and files.
This makes it easy to manage settings across different environments without changing code.

Configuration Loader
--------------------

The ``ConfigLoader`` class handles loading configuration from various sources:

.. code-block:: python

    from fastapi_traffic import ConfigLoader, RateLimitConfig

    loader = ConfigLoader()

    # Load from environment variables
    config = loader.load_rate_limit_config_from_env()

    # Load from a JSON file
    config = loader.load_rate_limit_config_from_json("config/rate_limits.json")

    # Load from a .env file
    config = loader.load_rate_limit_config_from_env_file(".env")

Environment Variables
---------------------

Set rate limit configuration using environment variables with the ``FASTAPI_TRAFFIC_``
prefix:

.. code-block:: bash

    # Basic settings
    export FASTAPI_TRAFFIC_RATE_LIMIT_LIMIT=100
    export FASTAPI_TRAFFIC_RATE_LIMIT_WINDOW_SIZE=60
    export FASTAPI_TRAFFIC_RATE_LIMIT_ALGORITHM=sliding_window_counter

    # Optional settings
    export FASTAPI_TRAFFIC_RATE_LIMIT_KEY_PREFIX=myapp
    export FASTAPI_TRAFFIC_RATE_LIMIT_BURST_SIZE=20
    export FASTAPI_TRAFFIC_RATE_LIMIT_INCLUDE_HEADERS=true
    export FASTAPI_TRAFFIC_RATE_LIMIT_ERROR_MESSAGE="Too many requests"
    export FASTAPI_TRAFFIC_RATE_LIMIT_STATUS_CODE=429
    export FASTAPI_TRAFFIC_RATE_LIMIT_SKIP_ON_ERROR=false
    export FASTAPI_TRAFFIC_RATE_LIMIT_COST=1

Then load them in your app:

.. code-block:: python

    from fastapi_traffic import load_rate_limit_config_from_env, rate_limit

    # Load config from environment
    config = load_rate_limit_config_from_env()

    # Use it with the decorator
    @app.get("/api/data")
    @rate_limit(config.limit, config.window_size, algorithm=config.algorithm)
    async def get_data(request: Request):
        return {"data": "here"}

Custom Prefix
-------------

If ``FASTAPI_TRAFFIC_`` conflicts with something else, use a custom prefix:

.. code-block:: python

    loader = ConfigLoader(prefix="MYAPP_RATELIMIT")
    config = loader.load_rate_limit_config_from_env()

    # Now reads from:
    # MYAPP_RATELIMIT_RATE_LIMIT_LIMIT=100
    # MYAPP_RATELIMIT_RATE_LIMIT_WINDOW_SIZE=60
    # etc.

JSON Configuration
------------------

For more complex setups, use a JSON file:

.. code-block:: json

    {
      "limit": 100,
      "window_size": 60,
      "algorithm": "token_bucket",
      "burst_size": 25,
      "key_prefix": "api",
      "include_headers": true,
      "error_message": "Rate limit exceeded. Please slow down.",
      "status_code": 429,
      "skip_on_error": false,
      "cost": 1
    }

Load it:

.. code-block:: python

    from fastapi_traffic import ConfigLoader

    loader = ConfigLoader()
    config = loader.load_rate_limit_config_from_json("config/rate_limits.json")

.env Files
----------

You can also use ``.env`` files, which is handy for local development:

.. code-block:: bash

    # .env
    FASTAPI_TRAFFIC_RATE_LIMIT_LIMIT=100
    FASTAPI_TRAFFIC_RATE_LIMIT_WINDOW_SIZE=60
    FASTAPI_TRAFFIC_RATE_LIMIT_ALGORITHM=sliding_window

Load it:

.. code-block:: python

    loader = ConfigLoader()
    config = loader.load_rate_limit_config_from_env_file(".env")

Global Configuration
--------------------

Besides per-endpoint configuration, you can set global defaults:

.. code-block:: bash

    # Global settings
    export FASTAPI_TRAFFIC_GLOBAL_ENABLED=true
    export FASTAPI_TRAFFIC_GLOBAL_DEFAULT_LIMIT=100
    export FASTAPI_TRAFFIC_GLOBAL_DEFAULT_WINDOW_SIZE=60
    export FASTAPI_TRAFFIC_GLOBAL_DEFAULT_ALGORITHM=sliding_window_counter
    export FASTAPI_TRAFFIC_GLOBAL_KEY_PREFIX=fastapi_traffic
    export FASTAPI_TRAFFIC_GLOBAL_INCLUDE_HEADERS=true
    export FASTAPI_TRAFFIC_GLOBAL_ERROR_MESSAGE="Rate limit exceeded"
    export FASTAPI_TRAFFIC_GLOBAL_STATUS_CODE=429
    export FASTAPI_TRAFFIC_GLOBAL_SKIP_ON_ERROR=false
    export FASTAPI_TRAFFIC_GLOBAL_HEADERS_PREFIX=X-RateLimit

Load global config:

.. code-block:: python

    from fastapi_traffic import load_global_config_from_env, RateLimiter
    from fastapi_traffic.core.limiter import set_limiter

    global_config = load_global_config_from_env()
    limiter = RateLimiter(config=global_config)
    set_limiter(limiter)

Auto-Detection
--------------

The convenience functions automatically detect file format:

.. code-block:: python

    from fastapi_traffic import load_rate_limit_config, load_global_config

    # Detects JSON by extension
    config = load_rate_limit_config("config/limits.json")

    # Detects .env file
    config = load_rate_limit_config("config/.env")

    # Works for global config too
    global_config = load_global_config("config/global.json")

Overriding Values
-----------------

You can override loaded values programmatically:

.. code-block:: python

    loader = ConfigLoader()

    # Load base config from file
    config = loader.load_rate_limit_config_from_json(
        "config/base.json",
        limit=200,            # Override the limit
        key_prefix="custom",  # Override the prefix
    )

This is useful for environment-specific overrides:

.. code-block:: python

    import os

    base_config = loader.load_rate_limit_config_from_json("config/base.json")

    # Apply environment-specific overrides
    if os.getenv("ENVIRONMENT") == "production":
        config = loader.load_rate_limit_config_from_json(
            "config/base.json",
            limit=base_config.limit * 2,  # Double the limit in production
        )

Validation
----------

Configuration is validated when loaded. Invalid values raise ``ConfigurationError``:

.. code-block:: python

    from fastapi_traffic import ConfigLoader, ConfigurationError

    loader = ConfigLoader()

    try:
        config = loader.load_rate_limit_config_from_env()
    except ConfigurationError as e:
        print(f"Invalid configuration: {e}")
        # Handle the error appropriately

Common validation errors:

- ``limit`` must be a positive integer
- ``window_size`` must be a positive number
- ``algorithm`` must be one of the valid algorithm names
- ``status_code`` must be a valid HTTP status code

Algorithm Names
---------------

When specifying algorithms in configuration, use these names:

.. list-table::
   :header-rows: 1

   * - Config Value
     - Algorithm
   * - ``token_bucket``
     - Token Bucket
   * - ``sliding_window``
     - Sliding Window
   * - ``fixed_window``
     - Fixed Window
   * - ``leaky_bucket``
     - Leaky Bucket
   * - ``sliding_window_counter``
     - Sliding Window Counter (default)

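Conceptually, validating the configured name is a normalize-and-check-membership step. This sketch is illustrative only (``validate_algorithm`` is a hypothetical helper, not the library's API; the actual loader validates through its schemas):

```python
VALID_ALGORITHMS = {
    "token_bucket",
    "sliding_window",
    "fixed_window",
    "leaky_bucket",
    "sliding_window_counter",
}


def validate_algorithm(name: str) -> str:
    """Normalize the configured name and reject anything unknown."""
    normalized = name.strip().lower()
    if normalized not in VALID_ALGORITHMS:
        raise ValueError(f"unknown algorithm: {name!r}")
    return normalized
```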
Boolean Values
--------------

Boolean settings accept various formats:

- **True:** ``true``, ``1``, ``yes``, ``on``
- **False:** ``false``, ``0``, ``no``, ``off``

Case doesn't matter.

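A sketch of how such parsing typically works (the loader's actual implementation may differ; ``parse_bool`` is a hypothetical name):

```python
TRUE_VALUES = {"true", "1", "yes", "on"}
FALSE_VALUES = {"false", "0", "no", "off"}


def parse_bool(raw: str) -> bool:
    """Normalize case and whitespace, then map the value to a boolean."""
    value = raw.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"not a boolean: {raw!r}")
```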
Complete Example
----------------

Here's a full example showing configuration loading in a real app:

.. code-block:: python

    import os
    from fastapi import FastAPI, Request
    from fastapi_traffic import (
        ConfigLoader,
        ConfigurationError,
        RateLimiter,
        rate_limit,
    )
    from fastapi_traffic.core.limiter import set_limiter

    app = FastAPI()

    @app.on_event("startup")
    async def startup():
        loader = ConfigLoader()

        try:
            # Try to load from environment first
            global_config = loader.load_global_config_from_env()
        except ConfigurationError:
            # Fall back to defaults
            global_config = None

        limiter = RateLimiter(config=global_config)
        set_limiter(limiter)
        await limiter.initialize()

    @app.get("/api/data")
    @rate_limit(100, 60)
    async def get_data(request: Request):
        return {"data": "here"}

    # Or load endpoint-specific config
    loader = ConfigLoader()
    try:
        api_config = loader.load_rate_limit_config_from_json("config/api_limits.json")
    except (FileNotFoundError, ConfigurationError):
        api_config = None

    if api_config:
        @app.get("/api/special")
        @rate_limit(
            api_config.limit,
            api_config.window_size,
            algorithm=api_config.algorithm,
        )
        async def special_endpoint(request: Request):
            return {"special": "data"}
277
docs/user-guide/exception-handling.rst
Normal file
@@ -0,0 +1,277 @@
Exception Handling
==================

When a client exceeds their rate limit, FastAPI Traffic raises a ``RateLimitExceeded``
exception. This guide covers how to handle it gracefully.

Default Behavior
----------------

By default, when a rate limit is exceeded, the library raises ``RateLimitExceeded``.
FastAPI will convert this to a 500 error unless you handle it.

The exception contains useful information:

.. code-block:: python

    from fastapi_traffic import RateLimitExceeded

    try:
        # Rate limited operation
        pass
    except RateLimitExceeded as exc:
        print(exc.message)      # "Rate limit exceeded"
        print(exc.retry_after)  # Seconds until they can retry (e.g., 45.2)
        print(exc.limit_info)   # RateLimitInfo object with full details

Custom Exception Handler
------------------------

The most common approach is to register a custom exception handler:

.. code-block:: python

    from fastapi import FastAPI, Request
    from fastapi.responses import JSONResponse
    from fastapi_traffic import RateLimitExceeded

    app = FastAPI()

    @app.exception_handler(RateLimitExceeded)
    async def rate_limit_handler(request: Request, exc: RateLimitExceeded):
        return JSONResponse(
            status_code=429,
            content={
                "error": "rate_limit_exceeded",
                "message": "You're making too many requests. Please slow down.",
                "retry_after": exc.retry_after,
            },
            headers={
                "Retry-After": str(int(exc.retry_after or 60)),
            },
        )

Now clients get a clean JSON response instead of a generic error.

Including Rate Limit Headers
----------------------------

The ``limit_info`` object can generate standard rate limit headers:

.. code-block:: python

    @app.exception_handler(RateLimitExceeded)
    async def rate_limit_handler(request: Request, exc: RateLimitExceeded):
        headers = {}
        if exc.limit_info:
            headers = exc.limit_info.to_headers()

        return JSONResponse(
            status_code=429,
            content={
                "error": "rate_limit_exceeded",
                "retry_after": exc.retry_after,
            },
            headers=headers,
        )

This adds headers like:

.. code-block:: text

    X-RateLimit-Limit: 100
    X-RateLimit-Remaining: 0
    X-RateLimit-Reset: 1709834400
    Retry-After: 45

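A sketch of the kind of mapping ``to_headers()`` performs. The field shapes here are assumptions inferred from the output above, and ``rate_limit_headers`` is a hypothetical stand-in, not the library's function:

```python
import math
import time


def rate_limit_headers(limit: int, remaining: int, reset_epoch: float) -> dict[str, str]:
    """Build standard X-RateLimit-* headers from the current limit state."""
    # Retry-After is the whole seconds left until the window resets, never negative
    retry_after = max(0, math.ceil(reset_epoch - time.time()))
    return {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(remaining),
        "X-RateLimit-Reset": str(int(reset_epoch)),
        "Retry-After": str(retry_after),
    }
```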
Different Responses for Different Endpoints
-------------------------------------------

You might want different error messages for different parts of your API:

.. code-block:: python

    from fastapi.responses import HTMLResponse, JSONResponse

    @app.exception_handler(RateLimitExceeded)
    async def rate_limit_handler(request: Request, exc: RateLimitExceeded):
        path = request.url.path

        if path.startswith("/api/v1/"):
            # API clients get JSON
            return JSONResponse(
                status_code=429,
                content={"error": "rate_limit_exceeded", "retry_after": exc.retry_after},
            )
        elif path.startswith("/web/"):
            # Web users get a friendly HTML page
            return HTMLResponse(
                status_code=429,
                content="<h1>Slow down!</h1><p>Please wait a moment before trying again.</p>",
            )
        else:
            # Default response
            return JSONResponse(
                status_code=429,
                content={"detail": exc.message},
            )

Using the on_blocked Callback
-----------------------------

Instead of (or in addition to) exception handling, you can use the ``on_blocked``
callback to run code when a request is blocked:

.. code-block:: python

    import logging

    logger = logging.getLogger(__name__)

    def log_blocked_request(request: Request, result):
        """Log when a request is rate limited."""
        client_ip = request.client.host if request.client else "unknown"
        logger.warning(
            "Rate limit exceeded for %s on %s %s",
            client_ip,
            request.method,
            request.url.path,
        )

    @app.get("/api/data")
    @rate_limit(100, 60, on_blocked=log_blocked_request)
    async def get_data(request: Request):
        return {"data": "here"}

The callback receives the request and the rate limit result. It runs before the
exception is raised.

Exempting Certain Requests
--------------------------

Use ``exempt_when`` to skip rate limiting for certain requests:

.. code-block:: python

    def is_admin(request: Request) -> bool:
        """Check if request is from an admin."""
        user = getattr(request.state, "user", None)
        return user is not None and user.is_admin

    @app.get("/api/data")
    @rate_limit(100, 60, exempt_when=is_admin)
    async def get_data(request: Request):
        return {"data": "here"}

Admin requests bypass rate limiting entirely.

Graceful Degradation
--------------------

Sometimes you'd rather serve a degraded response than reject the request entirely:

.. code-block:: python

    from fastapi_traffic import RateLimitConfig
    from fastapi_traffic.core.limiter import get_limiter

    @app.get("/api/search")
    async def search(request: Request, q: str):
        limiter = get_limiter()
        config = RateLimitConfig(limit=100, window_size=60)

        result = await limiter.check(request, config)

        if not result.allowed:
            # Return cached/simplified results instead of blocking
            return {
                "results": get_cached_results(q),
                "note": "Results may be stale. Please try again later.",
                "retry_after": result.info.retry_after,
            }

        # Full search
        return {"results": perform_full_search(q)}

Backend Errors
--------------

If the rate limit backend fails (Redis down, SQLite locked, etc.), you have options:

**Option 1: Fail closed (default)**

Requests fail when the backend is unavailable. Safer, but impacts availability.

**Option 2: Fail open**

Allow requests through when the backend fails:

.. code-block:: python

    @app.get("/api/data")
    @rate_limit(100, 60, skip_on_error=True)
    async def get_data(request: Request):
        return {"data": "here"}

**Option 3: Handle the error explicitly**

.. code-block:: python

    from fastapi_traffic import BackendError

    @app.exception_handler(BackendError)
    async def backend_error_handler(request: Request, exc: BackendError):
        # Log the error
        logger.error("Rate limit backend error: %s", exc.original_error)

        # An exception handler must return a response. If you want to let
        # requests through instead, use skip_on_error=True on the decorator.
        return JSONResponse(
            status_code=503,
            content={"error": "service_unavailable"},
        )

Other Exceptions
----------------

FastAPI Traffic defines a few exception types:

.. code-block:: python

    from fastapi_traffic import (
        RateLimitExceeded,   # Rate limit was exceeded
        BackendError,        # Storage backend failed
        ConfigurationError,  # Invalid configuration
    )

All inherit from ``FastAPITrafficError``:

.. code-block:: python

    from fastapi_traffic.exceptions import FastAPITrafficError

    @app.exception_handler(FastAPITrafficError)
    async def traffic_error_handler(request: Request, exc: FastAPITrafficError):
        """Catch-all for FastAPI Traffic errors."""
        if isinstance(exc, RateLimitExceeded):
            return JSONResponse(status_code=429, content={"error": "rate_limited"})
        elif isinstance(exc, BackendError):
            return JSONResponse(status_code=503, content={"error": "backend_error"})
        else:
            return JSONResponse(status_code=500, content={"error": "internal_error"})

Helper Function
---------------

FastAPI Traffic provides a helper to create rate limit responses:

.. code-block:: python

    from fastapi_traffic.core.decorator import create_rate_limit_response

    @app.exception_handler(RateLimitExceeded)
    async def rate_limit_handler(request: Request, exc: RateLimitExceeded):
        return create_rate_limit_response(exc, include_headers=True)

This creates a standard 429 response with all the appropriate headers.
258
docs/user-guide/key-extractors.rst
Normal file
@@ -0,0 +1,258 @@
Key Extractors
==============

A key extractor is a function that identifies who's making a request. By default,
FastAPI Traffic uses the client's IP address, but you can customize this to fit
your authentication model.

How It Works
------------

Every rate limit needs a way to group requests. The key extractor returns a string
that identifies the client:

.. code-block:: python

    def my_key_extractor(request: Request) -> str:
        return "some-unique-identifier"

All requests that return the same identifier share the same rate limit bucket.

Default Behavior
----------------

The default extractor looks for the client IP in this order:

1. ``X-Forwarded-For`` header (first IP in the list)
2. ``X-Real-IP`` header
3. Direct connection IP (``request.client.host``)
4. Falls back to ``"unknown"``

This handles most reverse proxy setups automatically.

Rate Limiting by API Key
------------------------

For authenticated APIs, you probably want to limit by API key:

.. code-block:: python

    from fastapi import Request
    from fastapi_traffic import rate_limit

    def api_key_extractor(request: Request) -> str:
        """Rate limit by API key."""
        api_key = request.headers.get("X-API-Key")
        if api_key:
            return f"apikey:{api_key}"
        # Fall back to IP for unauthenticated requests
        return f"ip:{request.client.host}" if request.client else "ip:unknown"

    @app.get("/api/data")
    @rate_limit(1000, 3600, key_extractor=api_key_extractor)
    async def get_data(request: Request):
        return {"data": "here"}

Now each API key gets its own rate limit bucket.

Rate Limiting by User
---------------------

If you're using authentication middleware that sets the user:

.. code-block:: python

    def user_extractor(request: Request) -> str:
        """Rate limit by authenticated user."""
        # Assuming your auth middleware sets request.state.user
        user = getattr(request.state, "user", None)
        if user:
            return f"user:{user.id}"
        return f"ip:{request.client.host}" if request.client else "ip:unknown"

    @app.get("/api/profile")
    @rate_limit(100, 60, key_extractor=user_extractor)
    async def get_profile(request: Request):
        return {"profile": "data"}

Rate Limiting by Tenant
-----------------------

For multi-tenant applications:

.. code-block:: python

    def tenant_extractor(request: Request) -> str:
        """Rate limit by tenant."""
        # From subdomain
        host = request.headers.get("host", "")
        if "." in host:
            tenant = host.split(".")[0]
            return f"tenant:{tenant}"

        # Or from header
        tenant = request.headers.get("X-Tenant-ID")
        if tenant:
            return f"tenant:{tenant}"

        return "tenant:default"

Combining Identifiers
---------------------

Sometimes you want to combine multiple factors:

.. code-block:: python

    def combined_extractor(request: Request) -> str:
        """Rate limit by user AND endpoint."""
        user = getattr(request.state, "user", None)
        user_id = user.id if user else "anonymous"
        endpoint = request.url.path
        return f"{user_id}:{endpoint}"

This gives each user a separate limit for each endpoint.

Tiered Rate Limits
------------------

Different users might have different limits. Handle this with a custom extractor
that includes the tier:

.. code-block:: python

    def tiered_extractor(request: Request) -> str:
        """Include tier in the key for different limits."""
        user = getattr(request.state, "user", None)
        if user:
            # Premium users get a different bucket
            tier = "premium" if user.is_premium else "free"
            return f"{tier}:{user.id}"
        return f"anonymous:{request.client.host if request.client else 'unknown'}"

Then apply different limits based on tier:

.. code-block:: python

    # You'd typically do this with middleware or dependency injection
    # to check the tier and apply the appropriate limit

    @app.get("/api/data")
    async def get_data(request: Request):
        user = getattr(request.state, "user", None)
        if user and user.is_premium:
            # Premium: 10000 req/hour
            limit, window = 10000, 3600
        else:
            # Free: 100 req/hour
            limit, window = 100, 3600

        # Apply rate limit manually
        limiter = get_limiter()
        config = RateLimitConfig(limit=limit, window_size=window)
        await limiter.hit(request, config)

        return {"data": "here"}

Geographic Rate Limiting
------------------------

Limit by country or region:

.. code-block:: python

    def geo_extractor(request: Request) -> str:
        """Rate limit by country."""
        # Assuming you have a GeoIP lookup
        country = request.headers.get("CF-IPCountry", "XX")  # Cloudflare header
        ip = request.client.host if request.client else "unknown"
        return f"{country}:{ip}"

This lets you apply different limits to different regions if needed.

Endpoint-Specific Keys
----------------------

Rate limit the same user differently per endpoint:

.. code-block:: python

    def endpoint_user_extractor(request: Request) -> str:
        """Separate limits per endpoint per user."""
        user = getattr(request.state, "user", None)
        user_id = user.id if user else (request.client.host if request.client else "unknown")
        method = request.method
        path = request.url.path
        return f"{user_id}:{method}:{path}"

Best Practices
--------------

1. **Always have a fallback.** If your primary identifier isn't available, fall
   back to IP:

   .. code-block:: python

       def safe_extractor(request: Request) -> str:
           api_key = request.headers.get("X-API-Key")
           if api_key:
               return f"key:{api_key}"
           return f"ip:{request.client.host if request.client else 'unknown'}"

2. **Use prefixes.** When mixing identifier types, prefix them to avoid collisions:

   .. code-block:: python

       # Good - clear what each key represents
       return f"user:{user_id}"
       return f"ip:{ip_address}"
       return f"key:{api_key}"

       # Bad - could collide
       return user_id
       return ip_address

3. **Keep it fast.** The extractor runs on every request. Avoid database lookups
   or expensive operations:

   .. code-block:: python

       # Bad - database lookup on every request
       def slow_extractor(request: Request) -> str:
           user = db.get_user(request.headers.get("Authorization"))
           return user.id

       # Good - use data already in the request
       def fast_extractor(request: Request) -> str:
           return request.state.user.id  # Set by auth middleware

4. **Be consistent.** The same client should always get the same key. Watch out
   for things like:

   - IP addresses changing (mobile users)
   - Case sensitivity (normalize to lowercase)
   - Whitespace (strip it)

   .. code-block:: python

       def normalized_extractor(request: Request) -> str:
           api_key = request.headers.get("X-API-Key", "").strip().lower()
           if api_key:
               return f"key:{api_key}"
           return f"ip:{request.client.host}"

Using with Middleware
---------------------

Key extractors work the same way with middleware:

.. code-block:: python

    from fastapi_traffic.middleware import RateLimitMiddleware

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        key_extractor=api_key_extractor,
    )
322
docs/user-guide/middleware.rst
Normal file
@@ -0,0 +1,322 @@
Middleware
==========

Sometimes you want rate limiting applied to your entire API, not just individual
endpoints. That's where middleware comes in.

Middleware sits between the client and your application, checking every request
before it reaches your endpoints.

Basic Usage
-----------

Add the middleware to your FastAPI app:

.. code-block:: python

    from fastapi import FastAPI
    from fastapi_traffic.middleware import RateLimitMiddleware

    app = FastAPI()

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,      # 1000 requests
        window_size=60,  # per minute
    )

    @app.get("/api/users")
    async def get_users():
        return {"users": []}

    @app.get("/api/posts")
    async def get_posts():
        return {"posts": []}

Now every endpoint shares the same rate limit pool. A client who makes 500 requests
to ``/api/users`` only has 500 left for ``/api/posts``.

Exempting Paths
|
||||
---------------
|
||||
|
||||
You probably don't want to rate limit your health checks or documentation:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
app.add_middleware(
|
||||
RateLimitMiddleware,
|
||||
limit=1000,
|
||||
window_size=60,
|
||||
exempt_paths={
|
||||
"/health",
|
||||
"/ready",
|
||||
"/docs",
|
||||
"/redoc",
|
||||
"/openapi.json",
|
||||
},
|
||||
)
|
||||
|
||||
These paths bypass rate limiting entirely.
|
||||
|
||||
Exempting IPs
-------------

Internal services, monitoring systems, or your own infrastructure might need
unrestricted access:

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        exempt_ips={
            "127.0.0.1",
            "10.0.0.0/8",     # Internal network
            "192.168.1.100",  # Monitoring server
        },
    )

.. note::

   IP exemptions are checked against the client IP extracted by the key extractor.
   Make sure your proxy headers are configured correctly if you're behind a load
   balancer.
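The example above mixes plain addresses with a CIDR range. If you want to check how
such an exemption set behaves, the matching logic can be sketched with the standard
library's ``ipaddress`` module (this is an illustration, not fastapi_traffic's
internal implementation):

```python
import ipaddress

def is_exempt(client_ip: str, exempt: set[str]) -> bool:
    """Return True if client_ip matches any exempt entry (plain IP or CIDR)."""
    addr = ipaddress.ip_address(client_ip)
    for entry in exempt:
        if "/" in entry:
            # CIDR entry: membership test against the network
            if addr in ipaddress.ip_network(entry):
                return True
        elif addr == ipaddress.ip_address(entry):
            return True
    return False

exempt = {"127.0.0.1", "10.0.0.0/8", "192.168.1.100"}
print(is_exempt("10.42.0.7", exempt))    # → True (inside 10.0.0.0/8)
print(is_exempt("203.0.113.9", exempt))  # → False
```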
Custom Key Extraction
---------------------

By default, clients are identified by IP address. You can change this:

.. code-block:: python

    from starlette.requests import Request

    def get_client_id(request: Request) -> str:
        """Identify clients by API key, fall back to IP."""
        api_key = request.headers.get("X-API-Key")
        if api_key:
            return f"api:{api_key}"
        return request.client.host if request.client else "unknown"

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        key_extractor=get_client_id,
    )
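A key extractor is just a function of the request, so you can sanity-check it
without running a server. The stand-in objects below carry only the attributes the
extractor reads (``headers`` and ``client``); they are a testing convenience, not
part of the library:

```python
from types import SimpleNamespace

def get_client_id(request) -> str:
    """Same logic as above, duck-typed so a stand-in object works."""
    api_key = request.headers.get("X-API-Key")
    if api_key:
        return f"api:{api_key}"
    return request.client.host if request.client else "unknown"

with_key = SimpleNamespace(headers={"X-API-Key": "abc123"}, client=None)
no_key = SimpleNamespace(headers={}, client=SimpleNamespace(host="203.0.113.9"))

print(get_client_id(with_key))  # → api:abc123
print(get_client_id(no_key))    # → 203.0.113.9
```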
Choosing an Algorithm
---------------------

The middleware supports all five algorithms:

.. code-block:: python

    from fastapi_traffic.core.algorithms import Algorithm

    # Token bucket for burst-friendly limiting
    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        algorithm=Algorithm.TOKEN_BUCKET,
    )

    # Sliding window for precise limiting
    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        algorithm=Algorithm.SLIDING_WINDOW,
    )
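To see why the token bucket is described as burst-friendly, here is a minimal
standalone sketch of the algorithm (an illustration only; fastapi_traffic's actual
implementation lives in ``core.algorithms``). A full bucket absorbs a burst up to
its capacity, after which requests are throttled to the refill rate:

```python
class TokenBucket:
    """Minimal token bucket: `capacity` tokens, refilled at `rate` tokens/sec."""

    def __init__(self, capacity: float, rate: float, now: float = 0.0):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity  # start full
        self.updated = now

    def allow(self, now: float) -> bool:
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=5, rate=1.0)  # burst of 5, 1 req/sec sustained
burst = [bucket.allow(now=i * 0.01) for i in range(6)]
print(burst)                  # → [True, True, True, True, True, False]
print(bucket.allow(now=2.0))  # → True (tokens refilled while waiting)
```

A fixed window of the same size would admit the whole burst and then block
everything until the window rolls over; the bucket instead recovers smoothly.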
Using a Custom Backend
----------------------

By default, middleware uses the memory backend. For production, you'll want
something persistent:

.. code-block:: python

    from fastapi_traffic import SQLiteBackend
    from fastapi_traffic.middleware import RateLimitMiddleware

    backend = SQLiteBackend("rate_limits.db")

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        backend=backend,
    )

    @app.on_event("shutdown")
    async def shutdown():
        await backend.close()

For Redis:

.. code-block:: python

    from fastapi_traffic.backends.redis import RedisBackend

    # Create backend at startup
    redis_backend = None

    @app.on_event("startup")
    async def startup():
        global redis_backend
        redis_backend = await RedisBackend.from_url("redis://localhost:6379/0")

    # Note: You'll need to configure middleware after startup
    # or use a factory pattern
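One way to realize the factory pattern mentioned above is a thin wrapper that
connects on first use, so the object can be handed to ``add_middleware`` at import
time while the real connection is only opened once requests start flowing.
Everything below is a sketch with illustrative names; ``check`` stands in for
whichever methods your backend actually exposes:

```python
import asyncio

class LazyBackend:
    """Defers backend creation until the first call."""

    def __init__(self, factory):
        self._factory = factory  # async callable that builds the real backend
        self._backend = None

    async def _connect(self):
        if self._backend is None:
            self._backend = await self._factory()
        return self._backend

    async def check(self, *args, **kwargs):
        backend = await self._connect()
        return await backend.check(*args, **kwargs)

# Demo with a stand-in backend instead of a live Redis connection
class FakeBackend:
    async def check(self, key):
        return f"checked {key}"

async def make_backend():
    return FakeBackend()  # e.g. await RedisBackend.from_url(...) in real code

lazy = LazyBackend(make_backend)
print(asyncio.run(lazy.check("client-1")))  # → checked client-1
```

In real code you would guard ``_connect`` with an ``asyncio.Lock`` so that
concurrent first requests don't open two connections.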
Convenience Middleware Classes
------------------------------

For common use cases, we provide pre-configured middleware:

.. code-block:: python

    from fastapi_traffic.middleware import (
        SlidingWindowMiddleware,
        TokenBucketMiddleware,
    )

    # Sliding window algorithm
    app.add_middleware(
        SlidingWindowMiddleware,
        limit=1000,
        window_size=60,
    )

    # Token bucket algorithm
    app.add_middleware(
        TokenBucketMiddleware,
        limit=1000,
        window_size=60,
    )
Combining with Decorator
------------------------

You can use both middleware and decorators. The middleware provides a baseline
limit, and decorators can add stricter limits to specific endpoints:

.. code-block:: python

    from fastapi import Request

    from fastapi_traffic import rate_limit
    from fastapi_traffic.middleware import RateLimitMiddleware

    # Global limit: 1000 req/min
    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
    )

    # This endpoint has an additional, stricter limit
    @app.post("/api/expensive-operation")
    @rate_limit(10, 60)  # Only 10 req/min for this endpoint
    async def expensive_operation(request: Request):
        return {"result": "done"}

    # This endpoint uses only the global limit
    @app.get("/api/cheap-operation")
    async def cheap_operation():
        return {"result": "done"}

Both limits are checked: a request must pass both the middleware limit AND the
decorator limit.
Error Responses
---------------

When a client exceeds the rate limit, they get a 429 response:

.. code-block:: json

    {
        "detail": "Rate limit exceeded. Please try again later.",
        "retry_after": 45.2
    }

You can customize the message:

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        error_message="Whoa there! You're making requests too fast.",
        status_code=429,
    )
Response Headers
----------------

By default, rate limit headers are included in every response:

.. code-block:: http

    X-RateLimit-Limit: 1000
    X-RateLimit-Remaining: 847
    X-RateLimit-Reset: 1709834400

When rate limited:

.. code-block:: http

    Retry-After: 45

Disable headers if you don't want to expose this information:

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        include_headers=False,
    )
Handling Backend Errors
-----------------------

What happens if your Redis server goes down? By default, the middleware will
raise an exception. You can change this behavior:

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        skip_on_error=True,  # Allow requests through if backend fails
    )

With ``skip_on_error=True``, requests are allowed through when the backend is
unavailable. This is a tradeoff between availability and protection.
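The tradeoff boils down to a policy decision around the backend call. A minimal
sketch of the two policies, with illustrative names rather than the middleware's
internals:

```python
import asyncio

async def check_with_policy(check, key, skip_on_error: bool):
    """Apply a fail-open / fail-closed policy around a backend check."""
    try:
        return await check(key)
    except ConnectionError:
        if skip_on_error:
            return True  # fail open: let the request through unprotected
        raise            # fail closed: surface the backend error

async def broken_check(key):
    raise ConnectionError("redis unreachable")

print(asyncio.run(check_with_policy(broken_check, "client-1", skip_on_error=True)))  # → True
```

Fail open keeps your API serving during a backend outage; fail closed keeps the
rate limit guarantee at the cost of rejecting every request.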
Full Configuration Reference
----------------------------

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,               # Max requests per window
        window_size=60.0,         # Window size in seconds
        algorithm=Algorithm.SLIDING_WINDOW_COUNTER,  # Algorithm to use
        backend=None,             # Storage backend (default: MemoryBackend)
        key_prefix="middleware",  # Prefix for rate limit keys
        include_headers=True,     # Add rate limit headers to responses
        error_message="Rate limit exceeded. Please try again later.",
        status_code=429,          # HTTP status when limited
        skip_on_error=False,      # Allow requests if backend fails
        exempt_paths=None,        # Set of paths to exempt
        exempt_ips=None,          # Set of IPs to exempt
        key_extractor=default_key_extractor,  # Function to identify clients
    )