release: bump version to 0.3.0
- Refactor Redis backend connection handling and pool management
- Update algorithm implementations with improved type annotations
- Enhance config loader validation with stricter Pydantic schemas
- Improve decorator and middleware error handling
- Expand example scripts with better docstrings and usage patterns
- Add new 00_basic_usage.py example for quick start
- Reorganize examples directory structure
- Fix type annotation inconsistencies across core modules
- Update dependencies in pyproject.toml
docs/advanced/distributed-systems.rst
Distributed Systems
===================

Running rate limiting across multiple application instances requires careful
consideration. This guide covers the patterns and pitfalls.

The Challenge
-------------

In a distributed system, you might have:

- Multiple application instances behind a load balancer
- Kubernetes pods that scale up and down
- Serverless functions that run independently

Each instance needs to share rate limit state. Otherwise, a client could make
100 requests to instance A and another 100 to instance B, effectively bypassing
a 100 request limit.
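To see why shared state matters, here is a minimal plain-Python sketch (the ``Instance`` class is a hypothetical stand-in, not part of FastAPI Traffic) of two instances that each keep a purely local counter:

```python
LIMIT = 100

class Instance:
    """One app instance with purely local rate-limit state."""

    def __init__(self):
        self.count = 0

    def allow(self) -> bool:
        if self.count >= LIMIT:
            return False
        self.count += 1
        return True

a, b = Instance(), Instance()
# A single client fails over from instance A to instance B once A says no.
allowed = sum(a.allow() or b.allow() for _ in range(250))
print(allowed)  # 200 requests get through: double the intended limit
```

Each instance honestly enforces its own limit of 100, yet together they allow 200 requests from one client.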

Redis: The Standard Solution
----------------------------

Redis is the go-to choice for distributed rate limiting:

.. code-block:: python

    from fastapi import FastAPI
    from fastapi_traffic import RateLimiter
    from fastapi_traffic.backends.redis import RedisBackend
    from fastapi_traffic.core.limiter import get_limiter, set_limiter

    app = FastAPI()

    @app.on_event("startup")
    async def startup():
        backend = await RedisBackend.from_url(
            "redis://redis-server:6379/0",
            key_prefix="myapp:ratelimit",
        )
        limiter = RateLimiter(backend)
        set_limiter(limiter)
        await limiter.initialize()

    @app.on_event("shutdown")
    async def shutdown():
        limiter = get_limiter()
        await limiter.close()

All instances connect to the same Redis server and share state.

High Availability Redis
-----------------------

For production, you'll want Redis with high availability:

**Redis Sentinel:**

.. code-block:: python

    backend = await RedisBackend.from_url(
        "redis://sentinel1:26379,sentinel2:26379,sentinel3:26379/0",
        sentinel_master="mymaster",
    )

**Redis Cluster:**

.. code-block:: python

    backend = await RedisBackend.from_url(
        "redis://node1:6379,node2:6379,node3:6379/0",
    )

Atomic Operations
-----------------

Race conditions are a real concern in distributed systems. Consider this scenario:

1. Instance A reads: 99 requests made
2. Instance B reads: 99 requests made
3. Instance A writes: 100 requests (allows request)
4. Instance B writes: 100 requests (allows request)

Now you've allowed 101 requests when the limit was 100.
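The four steps above can be simulated directly by interleaving the reads and writes of two instances against one shared counter (a plain dict stands in for the Redis key):

```python
LIMIT = 100
store = {"count": 99}  # shared counter, standing in for a Redis key

def read():
    return store["count"]

def write(value):
    store["count"] = value

# Steps 1 and 2: both instances read before either one writes.
seen_by_a = read()
seen_by_b = read()

# Steps 3 and 4: each instance decides on its stale read, then writes.
allowed = 0
for seen in (seen_by_a, seen_by_b):
    if seen < LIMIT:
        allowed += 1
        write(seen + 1)

print(store["count"], allowed)  # counter reads 100, yet 2 requests were allowed
```

An atomic check-and-increment closes exactly this window, which is what the Lua script below provides.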

FastAPI Traffic's Redis backend uses Lua scripts to make operations atomic:

.. code-block:: lua

    -- Simplified example of atomic check-and-increment
    local limit = tonumber(ARGV[1])
    local current = redis.call('GET', KEYS[1])
    if current and tonumber(current) >= limit then
        return 0  -- Reject
    end
    redis.call('INCR', KEYS[1])
    return 1  -- Allow

The entire check-and-update happens in a single Redis operation.

Network Latency
---------------

Redis adds network latency to every request. Some strategies to minimize impact:

**1. Connection pooling (automatic):**

The Redis backend maintains a connection pool, so you're not creating new
connections for each request.

**2. Local caching:**

For very high-traffic endpoints, consider a two-tier approach:

.. code-block:: python

    from fastapi import Request
    from fastapi_traffic import MemoryBackend, RateLimiter, RateLimitConfig
    from fastapi_traffic.backends.redis import RedisBackend

    # Local memory backend for fast path
    local_backend = MemoryBackend()
    local_limiter = RateLimiter(local_backend)

    # Redis backend for distributed state
    redis_backend = await RedisBackend.from_url("redis://localhost:6379/0")
    distributed_limiter = RateLimiter(redis_backend)

    async def check_rate_limit(request: Request, config: RateLimitConfig):
        # Quick local check (may allow some extra requests)
        local_result = await local_limiter.check(request, config)
        if not local_result.allowed:
            return local_result

        # Authoritative distributed check
        return await distributed_limiter.check(request, config)

**3. Skip on error:**

If Redis latency is causing issues, you might prefer to allow requests through
rather than block:

.. code-block:: python

    @rate_limit(100, 60, skip_on_error=True)
    async def endpoint(request: Request):
        return {"status": "ok"}

Handling Redis Failures
-----------------------

What happens when Redis goes down?

**Fail closed (default):**

Requests fail. This is safer but impacts availability.

**Fail open:**

Allow requests through:

.. code-block:: python

    @rate_limit(100, 60, skip_on_error=True)

**Circuit breaker pattern:**

Implement a circuit breaker to avoid hammering a failing Redis:

.. code-block:: python

    import time

    class CircuitBreaker:
        def __init__(self, failure_threshold=5, reset_timeout=60):
            self.failures = 0
            self.threshold = failure_threshold
            self.reset_timeout = reset_timeout
            self.last_failure = 0
            self.open = False

        def record_failure(self):
            self.failures += 1
            self.last_failure = time.time()
            if self.failures >= self.threshold:
                self.open = True

        def record_success(self):
            self.failures = 0
            self.open = False

        def should_allow(self) -> bool:
            if not self.open:
                return True
            # Check if we should try again
            if time.time() - self.last_failure > self.reset_timeout:
                return True
            return False
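A sketch of how this breaker might wrap backend calls, failing open while it is tripped. The ``guarded_check`` helper and the choice of ``ConnectionError`` are illustrative, not part of FastAPI Traffic's API:

```python
import asyncio
import time

class CircuitBreaker:  # the class from the snippet above
    def __init__(self, failure_threshold=5, reset_timeout=60):
        self.failures = 0
        self.threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.last_failure = 0
        self.open = False

    def record_failure(self):
        self.failures += 1
        self.last_failure = time.time()
        if self.failures >= self.threshold:
            self.open = True

    def record_success(self):
        self.failures = 0
        self.open = False

    def should_allow(self) -> bool:
        if not self.open:
            return True
        return time.time() - self.last_failure > self.reset_timeout

breaker = CircuitBreaker(failure_threshold=2, reset_timeout=60)

async def guarded_check(check):
    """Run a backend check through the breaker, failing open when tripped."""
    if not breaker.should_allow():
        return True  # breaker open: skip the backend entirely, allow the request
    try:
        allowed = await check()
    except ConnectionError:
        breaker.record_failure()
        return True  # fail open on backend errors as well
    breaker.record_success()
    return allowed

async def failing_backend():
    raise ConnectionError("redis down")

async def demo():
    results = [await guarded_check(failing_backend) for _ in range(5)]
    return results, breaker.open

results, tripped = asyncio.run(demo())
print(results, tripped)  # all requests allowed; breaker trips after 2 failures
```

After two consecutive failures the breaker opens and the remaining requests never touch the backend at all, so a dead Redis no longer adds timeout latency to every request.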

Kubernetes Deployment
---------------------

Here's a typical Kubernetes setup:

.. code-block:: yaml

    # redis-deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: redis
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: redis
      template:
        metadata:
          labels:
            app: redis
        spec:
          containers:
            - name: redis
              image: redis:7-alpine
              ports:
                - containerPort: 6379
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: redis
    spec:
      selector:
        app: redis
      ports:
        - port: 6379

.. code-block:: yaml

    # app-deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: api
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: api
      template:
        metadata:
          labels:
            app: api
        spec:
          containers:
            - name: api
              image: myapp:latest
              env:
                - name: REDIS_URL
                  value: "redis://redis:6379/0"

Your app connects to Redis via the service name:

.. code-block:: python

    import os

    redis_url = os.getenv("REDIS_URL", "redis://localhost:6379/0")
    backend = await RedisBackend.from_url(redis_url)

Monitoring
----------

Keep an eye on:

1. **Redis latency:** High latency means slow requests
2. **Redis memory:** Rate limit data shouldn't use much, but monitor it
3. **Connection count:** Make sure you're not exhausting connections
4. **Rate limit hits:** Track how often clients are being limited

.. code-block:: python

    import logging

    from fastapi import Request

    logger = logging.getLogger(__name__)

    def on_rate_limited(request: Request, result):
        logger.info(
            "Rate limited: client=%s path=%s remaining=%d",
            request.client.host,
            request.url.path,
            result.info.remaining,
        )

    @rate_limit(100, 60, on_blocked=on_rate_limited)
    async def endpoint(request: Request):
        return {"status": "ok"}

Testing Distributed Rate Limits
-------------------------------

Testing distributed behavior is tricky. Here's an approach:

.. code-block:: python

    import asyncio

    import httpx

    async def test_distributed_limit():
        """Simulate requests from multiple 'instances'."""
        async with httpx.AsyncClient() as client:
            # Fire 150 requests concurrently
            tasks = [
                client.get("http://localhost:8000/api/data")
                for _ in range(150)
            ]
            responses = await asyncio.gather(*tasks)

        # Count successes and rate limits
        successes = sum(1 for r in responses if r.status_code == 200)
        limited = sum(1 for r in responses if r.status_code == 429)

        print(f"Successes: {successes}, Rate limited: {limited}")
        # With a limit of 100, expect ~100 successes and ~50 limited

    asyncio.run(test_distributed_limit())
docs/advanced/performance.rst
Performance
===========

FastAPI Traffic is designed to be fast. But when you're handling thousands of
requests per second, every microsecond counts. Here's how to squeeze out the
best performance.

Baseline Performance
--------------------

On typical hardware, you can expect:

- **Memory backend:** ~0.01ms per check
- **SQLite backend:** ~0.1ms per check
- **Redis backend:** ~1ms per check (network dependent)

For most applications, this overhead is negligible compared to your actual
business logic.

Choosing the Right Algorithm
----------------------------

Algorithms have different performance characteristics:

.. list-table::
    :header-rows: 1

    * - Algorithm
      - Time Complexity
      - Space Complexity
      - Notes
    * - Token Bucket
      - O(1)
      - O(1)
      - Two floats per key
    * - Fixed Window
      - O(1)
      - O(1)
      - One int + one float per key
    * - Sliding Window Counter
      - O(1)
      - O(1)
      - Three values per key
    * - Leaky Bucket
      - O(1)
      - O(1)
      - Two floats per key
    * - Sliding Window
      - O(n)
      - O(n)
      - Stores every timestamp

**Recommendation:** Use Sliding Window Counter (the default) unless you have
specific requirements. It's O(1) and provides good accuracy.

**Avoid Sliding Window for high-volume endpoints.** If you're allowing 10,000
requests per minute, that's 10,000 timestamps to store and filter per key.
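To make the O(n) cost concrete, here is an illustrative toy sliding-window log (not the library's implementation): it keeps one timestamp per allowed request and must filter them all on every check:

```python
import time
from collections import deque

class SlidingWindowLog:
    """Toy sliding-window log: one stored timestamp per allowed request."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.timestamps = deque()

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Evict timestamps that fell out of the window: O(n) in the worst case.
        while self.timestamps and now - self.timestamps[0] > self.window:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.limit:
            return False
        self.timestamps.append(now)  # one float retained per request
        return True

lim = SlidingWindowLog(limit=10_000, window=60.0)
for i in range(10_000):
    lim.allow(now=i / 1000)
print(len(lim.timestamps))  # 10000 timestamps held for a single busy key
```

Compare that with the constant-size counters above: Token Bucket or Sliding Window Counter would hold two or three numbers for the same key, no matter how high the limit.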

Memory Backend Optimization
---------------------------

The memory backend is already fast, but you can tune it:

.. code-block:: python

    from fastapi_traffic import MemoryBackend

    backend = MemoryBackend(
        max_size=10000,       # Limit memory usage
        cleanup_interval=60,  # Less frequent cleanup = less overhead
    )

**max_size:** Limits the number of keys stored. When exceeded, LRU eviction kicks
in. Set this based on your expected number of unique clients.

**cleanup_interval:** How often to scan for expired entries. Higher values mean
less CPU overhead but more memory usage from expired entries.
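An illustrative sketch of the ``max_size`` behaviour described above, using an ``OrderedDict`` as a tiny LRU map (this is not the library's actual code):

```python
from collections import OrderedDict

class LRUStore:
    """Minimal LRU map: evicts the least-recently-used key past max_size."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.data = OrderedDict()

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)  # mark as most recently used
        self.data[key] = value
        if len(self.data) > self.max_size:
            self.data.popitem(last=False)  # evict the least recently used key

store = LRUStore(max_size=3)
for client in ["1.1.1.1", "2.2.2.2", "3.3.3.3"]:
    store.set(client, 1)
store.set("1.1.1.1", 2)  # touch 1.1.1.1 so it becomes most recent
store.set("4.4.4.4", 1)  # over capacity: evicts the LRU key, 2.2.2.2
print(list(store.data))  # ['3.3.3.3', '1.1.1.1', '4.4.4.4']
```

Active clients stay resident while idle ones are evicted first, which is why sizing ``max_size`` to your expected number of concurrently active clients works well.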

SQLite Backend Optimization
---------------------------

SQLite is surprisingly fast for rate limiting:

.. code-block:: python

    from fastapi_traffic import SQLiteBackend

    backend = SQLiteBackend(
        "rate_limits.db",
        cleanup_interval=300,  # Clean every 5 minutes
    )

**Tips:**

1. **Use an SSD.** SQLite performance depends heavily on disk I/O.

2. **Put the database on a local disk.** Network-attached storage adds latency.

3. **WAL mode is enabled by default.** This allows concurrent reads and writes.

4. **Increase cleanup_interval** if you have many keys. Cleanup scans the entire
   table.
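If you want to verify the WAL point yourself, the standard library's ``sqlite3`` module shows the pragma in action, independent of FastAPI Traffic (the file path here is a throwaway temp file):

```python
import os
import sqlite3
import tempfile

# WAL requires a file-backed database; :memory: databases ignore it.
path = os.path.join(tempfile.mkdtemp(), "rate_limits_demo.db")
conn = sqlite3.connect(path)
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # 'wal': readers no longer block the single writer
conn.close()
```

In WAL mode, reads proceed concurrently with a write, which is what keeps per-check latency low under load.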

Redis Backend Optimization
--------------------------

Redis is the bottleneck in most distributed setups:

**1. Use connection pooling (automatic):**

The backend maintains a pool of connections. You don't need to do anything.

**2. Use pipelining for batch operations:**

If you're checking multiple rate limits, batch them:

.. code-block:: python

    # Instead of multiple round trips
    result1 = await limiter.check(request, config1)
    result2 = await limiter.check(request, config2)

    # Consider combining into one check with higher cost
    combined_config = RateLimitConfig(limit=100, window_size=60, cost=2)
    result = await limiter.check(request, combined_config)

**3. Use Redis close to your application:**

Network latency is usually the biggest factor. Run Redis in the same datacenter,
or better yet, the same availability zone.

**4. Consider Redis Cluster for high throughput:**

Distributes load across multiple Redis nodes.

Reducing Overhead
-----------------

**1. Exempt paths that don't need limiting:**

.. code-block:: python

    app.add_middleware(
        RateLimitMiddleware,
        limit=1000,
        window_size=60,
        exempt_paths={"/health", "/metrics", "/ready"},
    )

**2. Use coarse-grained limits when possible:**

Instead of limiting every endpoint separately, use middleware for a global limit:

.. code-block:: python

    # One check per request
    app.add_middleware(RateLimitMiddleware, limit=1000, window_size=60)

    # vs. multiple checks per request
    @rate_limit(100, 60)  # Check 1
    @another_decorator    # Check 2
    async def endpoint():
        pass

**3. Increase window size:**

The number of state updates equals the number of requests, regardless of window
size, so a longer window does not reduce write load. It does help in other ways:

- Fewer unique window boundaries
- Better cache efficiency
- More stable rate limiting

**4. Skip headers when not needed:**

.. code-block:: python

    @rate_limit(100, 60, include_headers=False)

Saves a tiny bit of response processing.

Benchmarking
------------

Here's a simple benchmark script:

.. code-block:: python

    import asyncio
    import time
    from unittest.mock import MagicMock

    from fastapi_traffic import MemoryBackend, RateLimiter, RateLimitConfig

    async def benchmark():
        backend = MemoryBackend()
        limiter = RateLimiter(backend)
        await limiter.initialize()

        config = RateLimitConfig(limit=10000, window_size=60)

        # Mock request
        request = MagicMock()
        request.client.host = "127.0.0.1"
        request.url.path = "/test"
        request.method = "GET"
        request.headers = {}

        # Warm up
        for _ in range(100):
            await limiter.check(request, config)

        # Benchmark
        iterations = 10000
        start = time.perf_counter()

        for _ in range(iterations):
            await limiter.check(request, config)

        elapsed = time.perf_counter() - start

        print(f"Total time: {elapsed:.3f}s")
        print(f"Per check: {elapsed/iterations*1000:.3f}ms")
        print(f"Checks/sec: {iterations/elapsed:.0f}")

        await limiter.close()

    asyncio.run(benchmark())

Typical output:

.. code-block:: text

    Total time: 0.150s
    Per check: 0.015ms
    Checks/sec: 66666

Profiling
---------

If you suspect rate limiting is a bottleneck, profile it:

.. code-block:: python

    import cProfile
    import pstats

    async def profile_rate_limiting():
        # Your rate limiting code here
        pass

    cProfile.run('asyncio.run(profile_rate_limiting())', 'rate_limit.prof')

    stats = pstats.Stats('rate_limit.prof')
    stats.sort_stats('cumulative')
    stats.print_stats(20)

Look for:

- Time spent in backend operations
- Time spent in algorithm calculations
- Unexpected hotspots

When Performance Really Matters
-------------------------------

If you're handling millions of requests per second and rate limiting overhead
is significant:

1. **Consider sampling:** Only check rate limits for a percentage of requests
   and extrapolate.

2. **Use probabilistic data structures:** Bloom filters or Count-Min Sketch can
   approximate rate limiting with less overhead.

3. **Push to the edge:** Use CDN-level rate limiting (Cloudflare, AWS WAF) to
   handle the bulk of traffic.

4. **Accept some inaccuracy:** Fixed window with ``skip_on_error=True`` is very
   fast and "good enough" for many use cases.
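The sampling idea in item 1 can be sketched as: check only a fraction ``P`` of requests and scale the limit down by ``P``, so the expected enforced rate is unchanged. This is a hedged illustration, not the library's API; note that only sampled requests can ever be rejected, so enforcement is approximate:

```python
import random

LIMIT = 1000   # the real per-window limit
P = 0.1        # fraction of requests that actually hit the backend
SCALED_LIMIT = LIMIT * P

rng = random.Random(42)  # seeded so the sketch is reproducible
count = 0                # the counter only ever sees sampled requests
allowed = 0

for _ in range(2000):
    if rng.random() >= P:
        allowed += 1  # unsampled request: no rate-limit check at all
        continue
    if count < SCALED_LIMIT:
        count += 1
        allowed += 1
    # sampled requests beyond the scaled limit are rejected

print(count, allowed)
```

Only about 10% of requests pay the backend cost, and the counter saturates at the scaled limit once the client exceeds its extrapolated rate.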

For most applications, though, the default configuration is plenty fast.
docs/advanced/testing.rst
Testing
=======

Testing rate-limited endpoints requires some care. You don't want your tests to
be flaky because of timing issues, and you need to verify that limits actually work.

Basic Testing Setup
-------------------

Use pytest with pytest-asyncio for async tests:

.. code-block:: python

    # conftest.py
    import pytest
    from fastapi.testclient import TestClient

    from fastapi_traffic import MemoryBackend, RateLimiter
    from fastapi_traffic.core.limiter import set_limiter

    @pytest.fixture
    def app():
        """Create a fresh app for each test."""
        from myapp import create_app
        return create_app()

    @pytest.fixture
    def client(app):
        """Test client with fresh rate limiter."""
        backend = MemoryBackend()
        limiter = RateLimiter(backend)
        set_limiter(limiter)

        with TestClient(app) as client:
            yield client

Testing Rate Limit Enforcement
------------------------------

Verify that the limit is actually enforced:

.. code-block:: python

    def test_rate_limit_enforced(client):
        """Test that requests are blocked after limit is reached."""
        # Make requests up to the limit
        for i in range(10):
            response = client.get("/api/data")
            assert response.status_code == 200, f"Request {i+1} should succeed"

        # Next request should be rate limited
        response = client.get("/api/data")
        assert response.status_code == 429
        assert "retry_after" in response.json()

Testing Rate Limit Headers
--------------------------

Check that headers are included correctly:

.. code-block:: python

    def test_rate_limit_headers(client):
        """Test that rate limit headers are present."""
        response = client.get("/api/data")

        assert "X-RateLimit-Limit" in response.headers
        assert "X-RateLimit-Remaining" in response.headers
        assert "X-RateLimit-Reset" in response.headers

        # Verify values make sense
        limit = int(response.headers["X-RateLimit-Limit"])
        remaining = int(response.headers["X-RateLimit-Remaining"])

        assert limit == 100      # Your configured limit
        assert remaining == 99   # One request made

Testing Different Clients
-------------------------

Verify that different clients have separate limits:

.. code-block:: python

    def test_separate_limits_per_client(client):
        """Test that different IPs have separate limits."""
        # Client A makes requests
        for _ in range(10):
            response = client.get(
                "/api/data",
                headers={"X-Forwarded-For": "1.1.1.1"},
            )
            assert response.status_code == 200

        # Client A is now limited
        response = client.get(
            "/api/data",
            headers={"X-Forwarded-For": "1.1.1.1"},
        )
        assert response.status_code == 429

        # Client B should still have full quota
        response = client.get(
            "/api/data",
            headers={"X-Forwarded-For": "2.2.2.2"},
        )
        assert response.status_code == 200

Testing Window Reset
--------------------

Test that limits reset after the window expires:

.. code-block:: python

    import time
    from unittest.mock import patch

    def test_limit_resets_after_window(client):
        """Test that limits reset after window expires."""
        # Exhaust the limit
        for _ in range(10):
            client.get("/api/data")

        # Should be limited
        response = client.get("/api/data")
        assert response.status_code == 429

        # Fast-forward time (mock time.time). Capture the real time before
        # patching, since time.time() inside the block returns the mock.
        real_now = time.time()
        with patch('time.time') as mock_time:
            # Move 61 seconds into the future
            mock_time.return_value = real_now + 61

            # Should be allowed again
            response = client.get("/api/data")
            assert response.status_code == 200
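An alternative to patching ``time.time`` globally is to inject a fake clock you can advance explicitly. ``FakeClock`` and ``FixedWindow`` below are hypothetical stand-ins to show the pattern, not part of FastAPI Traffic's API:

```python
class FakeClock:
    """A clock the test controls deterministically."""

    def __init__(self, start=0.0):
        self.now = start

    def time(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds

class FixedWindow:
    """Minimal fixed-window counter that takes its clock as a dependency."""

    def __init__(self, limit, window, clock):
        self.limit, self.window, self.clock = limit, window, clock
        self.window_start = clock.time()
        self.count = 0

    def allow(self):
        now = self.clock.time()
        if now - self.window_start >= self.window:
            self.window_start, self.count = now, 0  # window rolled over
        if self.count >= self.limit:
            return False
        self.count += 1
        return True

clock = FakeClock()
limiter = FixedWindow(limit=10, window=60.0, clock=clock)

assert all(limiter.allow() for _ in range(10))  # fill the window
assert not limiter.allow()                      # 11th request is blocked
clock.advance(61)                               # deterministic fast-forward
assert limiter.allow()                          # window reset, allowed again
print("window reset verified")
```

Because the clock is injected rather than patched, there is no global state to leak between tests and no real sleeping, which keeps the suite fast and deterministic.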

Testing Exemptions
------------------

Verify that exemptions work:

.. code-block:: python

    def test_exempt_paths(client):
        """Test that exempt paths bypass rate limiting."""
        # Exhaust limit on a regular endpoint
        for _ in range(100):
            client.get("/api/data")

        # Regular endpoint should be limited
        response = client.get("/api/data")
        assert response.status_code == 429

        # Health check should still work
        response = client.get("/health")
        assert response.status_code == 200

    def test_exempt_ips(client):
        """Test that exempt IPs bypass rate limiting."""
        # Make many requests from an exempt IP
        for _ in range(1000):
            response = client.get(
                "/api/data",
                headers={"X-Forwarded-For": "127.0.0.1"},
            )
            assert response.status_code == 200  # Never limited

Testing with Async Client
-------------------------

For async endpoints, use httpx:

.. code-block:: python

    import asyncio

    import httpx
    import pytest

    @pytest.mark.asyncio
    async def test_async_rate_limiting():
        """Test rate limiting with async client."""
        async with httpx.AsyncClient(app=app, base_url="http://test") as client:
            # Make concurrent requests
            responses = await asyncio.gather(*[
                client.get("/api/data")
                for _ in range(15)
            ])

        successes = sum(1 for r in responses if r.status_code == 200)
        limited = sum(1 for r in responses if r.status_code == 429)

        assert successes == 10  # Limit
        assert limited == 5     # Over limit

Testing Backend Failures
------------------------

Test behavior when the backend fails:

.. code-block:: python

    from unittest.mock import patch

    from fastapi_traffic import BackendError, MemoryBackend

    def test_skip_on_error(client):
        """Test that requests are allowed when backend fails and skip_on_error=True."""
        with patch.object(
            MemoryBackend, 'get',
            side_effect=BackendError("Connection failed"),
        ):
            # With skip_on_error=True, should still work
            response = client.get("/api/data")
            assert response.status_code == 200

    def test_fail_on_error(client):
        """Test that requests fail when backend fails and skip_on_error=False."""
        with patch.object(
            MemoryBackend, 'get',
            side_effect=BackendError("Connection failed"),
        ):
            # With skip_on_error=False (default), should fail
            response = client.get("/api/strict-data")
            assert response.status_code == 500

Mocking the Rate Limiter
------------------------

For unit tests, you might want to mock the rate limiter entirely:

.. code-block:: python

    import time
    from unittest.mock import AsyncMock, MagicMock

    from fastapi_traffic.core.limiter import set_limiter
    from fastapi_traffic.core.models import RateLimitInfo, RateLimitResult

    def test_with_mocked_limiter(client):
        """Test endpoint logic without actual rate limiting."""
        mock_limiter = MagicMock()
        mock_limiter.hit = AsyncMock(return_value=RateLimitResult(
            allowed=True,
            info=RateLimitInfo(
                limit=100,
                remaining=99,
                reset_at=time.time() + 60,
                window_size=60,
            ),
            key="test",
        ))

        set_limiter(mock_limiter)

        response = client.get("/api/data")
        assert response.status_code == 200
        mock_limiter.hit.assert_called_once()

Integration Testing with Redis
------------------------------

For integration tests with Redis:

.. code-block:: python

    import pytest

    from fastapi_traffic import RateLimiter, RateLimitConfig
    from fastapi_traffic.backends.redis import RedisBackend

    @pytest.fixture
    async def redis_backend():
        """Create a Redis backend for testing."""
        backend = await RedisBackend.from_url(
            "redis://localhost:6379/15",  # Use a test database
            key_prefix="test:",
        )
        yield backend
        await backend.clear()  # Clean up after test
        await backend.close()

    @pytest.mark.asyncio
    async def test_redis_rate_limiting(redis_backend):
        """Test rate limiting with real Redis."""
        limiter = RateLimiter(redis_backend)
        await limiter.initialize()

        config = RateLimitConfig(limit=5, window_size=60)
        request = create_mock_request("1.1.1.1")

        # Make requests up to limit
        for _ in range(5):
            result = await limiter.check(request, config)
            assert result.allowed

        # Next should be blocked
        result = await limiter.check(request, config)
        assert not result.allowed

        await limiter.close()

Fixtures for Common Scenarios
-----------------------------

.. code-block:: python

    # conftest.py
    from unittest.mock import MagicMock

    import pytest

    from fastapi_traffic import MemoryBackend, RateLimiter, RateLimitConfig
    from fastapi_traffic.core.limiter import set_limiter

    @pytest.fixture
    def fresh_limiter():
        """Fresh rate limiter for each test."""
        backend = MemoryBackend()
        limiter = RateLimiter(backend)
        set_limiter(limiter)
        return limiter

    @pytest.fixture
    def rate_limit_config():
        """Standard rate limit config for tests."""
        return RateLimitConfig(
            limit=10,
            window_size=60,
        )

    @pytest.fixture
    def mock_request():
        """Create a mock request."""
        def _create(ip="127.0.0.1", path="/test"):
            request = MagicMock()
            request.client.host = ip
            request.url.path = path
            request.method = "GET"
            request.headers = {}
            return request
        return _create

Avoiding Flaky Tests
--------------------

Rate limiting tests can be flaky due to timing. Tips:

1. **Use short windows for tests:**

   .. code-block:: python

       @rate_limit(10, 1)  # 10 per second, not 10 per minute

2. **Mock time instead of sleeping:**

   .. code-block:: python

       with patch('time.time', return_value=future_time):
           # Test window reset
           ...

3. **Reset state between tests:**

   .. code-block:: python

       @pytest.fixture(autouse=True)
       async def reset_limiter():
           yield
           limiter = get_limiter()
           await limiter.backend.clear()

4. **Use unique keys per test:**

   .. code-block:: python

       import uuid

       def test_something(mock_request):
           request = mock_request(ip=f"test-{uuid.uuid4()}")