release: bump version to 0.3.0

- Refactor Redis backend connection handling and pool management
- Update algorithm implementations with improved type annotations
- Enhance config loader validation with stricter Pydantic schemas
- Improve decorator and middleware error handling
- Expand example scripts with better docstrings and usage patterns
- Add new 00_basic_usage.py example for quick start
- Reorganize examples directory structure
- Fix type annotation inconsistencies across core modules
- Update dependencies in pyproject.toml
Date: 2026-03-17 20:55:38 +00:00
Parent: 492410614f
Commit: f3453cb0fc
51 changed files with 6507 additions and 166 deletions

Distributed Systems
===================
Running rate limiting across multiple application instances requires careful
consideration. This guide covers the patterns and pitfalls.
The Challenge
-------------
In a distributed system, you might have:
- Multiple application instances behind a load balancer
- Kubernetes pods that scale up and down
- Serverless functions that run independently
Each instance needs to share rate limit state. Otherwise, a client could make
100 requests to instance A and another 100 to instance B, effectively bypassing
a 100 request limit.
Redis: The Standard Solution
----------------------------
Redis is the go-to choice for distributed rate limiting:
.. code-block:: python

   from fastapi import FastAPI
   from fastapi_traffic import RateLimiter
   from fastapi_traffic.backends.redis import RedisBackend
   from fastapi_traffic.core.limiter import get_limiter, set_limiter

   app = FastAPI()

   @app.on_event("startup")
   async def startup():
       backend = await RedisBackend.from_url(
           "redis://redis-server:6379/0",
           key_prefix="myapp:ratelimit",
       )
       limiter = RateLimiter(backend)
       set_limiter(limiter)
       await limiter.initialize()

   @app.on_event("shutdown")
   async def shutdown():
       limiter = get_limiter()
       await limiter.close()
All instances connect to the same Redis server and share state.
High Availability Redis
-----------------------
For production, you'll want Redis with high availability:
**Redis Sentinel:**
.. code-block:: python

   backend = await RedisBackend.from_url(
       "redis://sentinel1:26379,sentinel2:26379,sentinel3:26379/0",
       sentinel_master="mymaster",
   )

**Redis Cluster:**

.. code-block:: python

   backend = await RedisBackend.from_url(
       "redis://node1:6379,node2:6379,node3:6379/0",
   )
Atomic Operations
-----------------
Race conditions are a real concern in distributed systems. Consider this scenario:
1. Instance A reads: 99 requests made
2. Instance B reads: 99 requests made
3. Instance A writes: 100 requests (allows request)
4. Instance B writes: 100 requests (allows request)
Now you've allowed 101 requests when the limit was 100.
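The interleaving above is easy to reproduce. Here is a toy simulation (illustrative only, not library code) where two concurrent "instances" run the read-check-write sequence against shared state, with and without an atomic critical section:

```python
import asyncio

async def run(limit: int, atomic: bool) -> int:
    """Run two 'instances' doing check-then-increment against shared state."""
    state = {"count": limit - 1}  # one request left before the limit
    lock = asyncio.Lock()
    allowed = 0

    async def instance():
        nonlocal allowed
        if atomic:
            async with lock:  # check and update as one critical section
                if state["count"] < limit:
                    state["count"] += 1
                    allowed += 1
        else:
            current = state["count"]   # 1./2. both instances read
            await asyncio.sleep(0)     # yield so the other instance reads too
            if current < limit:
                state["count"] = current + 1  # 3./4. both write and allow
                allowed += 1

    await asyncio.gather(instance(), instance())
    return allowed

# Non-atomic check-then-act admits both requests; atomic admits only one.
print(asyncio.run(run(100, atomic=False)))  # 2
print(asyncio.run(run(100, atomic=True)))   # 1
```

The non-atomic variant over-admits exactly as in the numbered scenario; the lock plays the role that the Lua script plays in Redis.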
FastAPI Traffic's Redis backend uses Lua scripts to make operations atomic:
.. code-block:: lua

   -- Simplified example of atomic check-and-increment
   local limit = tonumber(ARGV[1])
   local current = tonumber(redis.call('GET', KEYS[1]))
   if current and current >= limit then
       return 0  -- Reject
   end
   redis.call('INCR', KEYS[1])
   return 1  -- Allow
The entire check-and-update happens in a single Redis operation.
Network Latency
---------------
Redis adds network latency to every request. Some strategies to minimize impact:
**1. Connection pooling (automatic):**
The Redis backend maintains a connection pool, so you're not creating new
connections for each request.
**2. Local caching:**
For very high-traffic endpoints, consider a two-tier approach:
.. code-block:: python

   from fastapi import Request
   from fastapi_traffic import MemoryBackend, RateLimiter, RateLimitConfig
   from fastapi_traffic.backends.redis import RedisBackend

   # Local memory backend for the fast path
   local_backend = MemoryBackend()
   local_limiter = RateLimiter(local_backend)

   # Redis backend for distributed state (created inside an async startup hook)
   redis_backend = await RedisBackend.from_url("redis://localhost:6379/0")
   distributed_limiter = RateLimiter(redis_backend)

   async def check_rate_limit(request: Request, config: RateLimitConfig):
       # Quick local check (may allow some extra requests)
       local_result = await local_limiter.check(request, config)
       if not local_result.allowed:
           return local_result
       # Authoritative distributed check
       return await distributed_limiter.check(request, config)
**3. Skip on error:**
If Redis latency is causing issues, you might prefer to allow requests through
rather than block:
.. code-block:: python

   @rate_limit(100, 60, skip_on_error=True)
   async def endpoint(request: Request):
       return {"status": "ok"}
Handling Redis Failures
-----------------------
What happens when Redis goes down?
**Fail closed (default):**
Requests fail. This is safer but impacts availability.
**Fail open:**
Allow requests through:
.. code-block:: python

   @rate_limit(100, 60, skip_on_error=True)
**Circuit breaker pattern:**
Implement a circuit breaker to avoid hammering a failing Redis:
.. code-block:: python

   import time

   class CircuitBreaker:
       def __init__(self, failure_threshold=5, reset_timeout=60):
           self.failures = 0
           self.threshold = failure_threshold
           self.reset_timeout = reset_timeout
           self.last_failure = 0
           self.open = False

       def record_failure(self):
           self.failures += 1
           self.last_failure = time.time()
           if self.failures >= self.threshold:
               self.open = True

       def record_success(self):
           self.failures = 0
           self.open = False

       def should_allow(self) -> bool:
           if not self.open:
               return True
           # Check if we should try again
           if time.time() - self.last_failure > self.reset_timeout:
               return True
           return False
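Wiring the breaker into the rate-limit path might look like the sketch below. The names (`check_with_breaker`, `do_check`) and the fail-open policy are illustrative choices, not library API; the breaker class here is a condensed copy of the one above so the snippet is self-contained:

```python
import time

class CircuitBreaker:
    # Condensed version of the breaker shown above
    def __init__(self, failure_threshold=5, reset_timeout=60):
        self.failures = 0
        self.threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.last_failure = 0.0
        self.open = False

    def record_failure(self):
        self.failures += 1
        self.last_failure = time.time()
        if self.failures >= self.threshold:
            self.open = True

    def record_success(self):
        self.failures = 0
        self.open = False

    def should_allow(self) -> bool:
        if not self.open:
            return True
        return time.time() - self.last_failure > self.reset_timeout

breaker = CircuitBreaker(failure_threshold=2, reset_timeout=30)

def check_with_breaker(do_check) -> bool:
    """Fail open while the breaker is open; otherwise try the backend."""
    if not breaker.should_allow():
        return True  # Redis is known-bad: skip the check, allow the request
    try:
        allowed = do_check()
        breaker.record_success()
        return allowed
    except ConnectionError:
        breaker.record_failure()
        return True  # fail open on this error too

# Two failures trip the breaker; later calls skip the backend entirely.
def failing_check():
    raise ConnectionError("redis down")

print(check_with_breaker(failing_check))  # True (failure 1)
print(check_with_breaker(failing_check))  # True (failure 2, breaker opens)
print(breaker.open)                       # True
```

Once `reset_timeout` elapses, `should_allow` returns True again and the next request probes the backend.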
Kubernetes Deployment
---------------------
Here's a typical Kubernetes setup:
.. code-block:: yaml

   # redis-deployment.yaml
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: redis
   spec:
     replicas: 1
     selector:
       matchLabels:
         app: redis
     template:
       metadata:
         labels:
           app: redis
       spec:
         containers:
           - name: redis
             image: redis:7-alpine
             ports:
               - containerPort: 6379
   ---
   apiVersion: v1
   kind: Service
   metadata:
     name: redis
   spec:
     selector:
       app: redis
     ports:
       - port: 6379

.. code-block:: yaml

   # app-deployment.yaml
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: api
   spec:
     replicas: 3
     selector:
       matchLabels:
         app: api
     template:
       metadata:
         labels:
           app: api
       spec:
         containers:
           - name: api
             image: myapp:latest
             env:
               - name: REDIS_URL
                 value: "redis://redis:6379/0"
Your app connects to Redis via the service name:
.. code-block:: python

   import os

   redis_url = os.getenv("REDIS_URL", "redis://localhost:6379/0")
   backend = await RedisBackend.from_url(redis_url)
Monitoring
----------
Keep an eye on:
1. **Redis latency:** High latency means slow requests
2. **Redis memory:** Rate limit data shouldn't use much, but monitor it
3. **Connection count:** Make sure you're not exhausting connections
4. **Rate limit hits:** Track how often clients are being limited
.. code-block:: python

   import logging

   logger = logging.getLogger(__name__)

   def on_rate_limited(request: Request, result):
       logger.info(
           "Rate limited: client=%s path=%s remaining=%d",
           request.client.host,
           request.url.path,
           result.info.remaining,
       )

   @rate_limit(100, 60, on_blocked=on_rate_limited)
   async def endpoint(request: Request):
       return {"status": "ok"}
Testing Distributed Rate Limits
-------------------------------
Testing distributed behavior is tricky. Here's an approach:
.. code-block:: python

   import asyncio
   import httpx

   async def test_distributed_limit():
       """Simulate requests from multiple 'instances'."""
       async with httpx.AsyncClient() as client:
           # Fire 150 requests concurrently
           tasks = [
               client.get("http://localhost:8000/api/data")
               for _ in range(150)
           ]
           responses = await asyncio.gather(*tasks)

       # Count successes and rate limits
       successes = sum(1 for r in responses if r.status_code == 200)
       limited = sum(1 for r in responses if r.status_code == 429)
       print(f"Successes: {successes}, Rate limited: {limited}")
       # With a limit of 100, expect ~100 successes and ~50 limited

   asyncio.run(test_distributed_limit())

Performance
===========
FastAPI Traffic is designed to be fast. But when you're handling thousands of
requests per second, every microsecond counts. Here's how to squeeze out the
best performance.
Baseline Performance
--------------------
On typical hardware, you can expect:
- **Memory backend:** ~0.01ms per check
- **SQLite backend:** ~0.1ms per check
- **Redis backend:** ~1ms per check (network dependent)
For most applications, this overhead is negligible compared to your actual
business logic.
Choosing the Right Algorithm
----------------------------
Algorithms have different performance characteristics:
.. list-table::
   :header-rows: 1

   * - Algorithm
     - Time Complexity
     - Space Complexity
     - Notes
   * - Token Bucket
     - O(1)
     - O(1)
     - Two floats per key
   * - Fixed Window
     - O(1)
     - O(1)
     - One int + one float per key
   * - Sliding Window Counter
     - O(1)
     - O(1)
     - Three values per key
   * - Leaky Bucket
     - O(1)
     - O(1)
     - Two floats per key
   * - Sliding Window
     - O(n)
     - O(n)
     - Stores every timestamp
**Recommendation:** Use Sliding Window Counter (the default) unless you have
specific requirements. It's O(1) and provides good accuracy.
**Avoid Sliding Window for high-volume endpoints.** If you're allowing 10,000
requests per minute, that's 10,000 timestamps to store and filter per key.
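The reason Sliding Window Counter stays O(1) is that it keeps only the previous and current window counts and weights the previous one by how much of it still overlaps the sliding window. A sketch of the estimate (function name and signature are illustrative, not the library's API):

```python
import time

def sliding_window_counter(prev_count, curr_count, window_start, window_size, now=None):
    """Approximate the request count over the last `window_size` seconds
    from just two bucket counters (previous and current window)."""
    now = time.time() if now is None else now
    elapsed = now - window_start
    # Fraction of the previous window still inside the sliding window
    prev_weight = max(0.0, 1.0 - elapsed / window_size)
    return curr_count + prev_count * prev_weight

# 15s into a 60s window: 75% of the previous window still counts.
estimate = sliding_window_counter(
    prev_count=80, curr_count=10, window_start=100.0, window_size=60, now=115.0
)
print(estimate)  # 10 + 80 * 0.75 = 70.0
```

Compare that to the plain Sliding Window algorithm, which must store and filter every individual timestamp to compute the same answer exactly.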
Memory Backend Optimization
---------------------------
The memory backend is already fast, but you can tune it:
.. code-block:: python

   from fastapi_traffic import MemoryBackend

   backend = MemoryBackend(
       max_size=10000,       # Limit memory usage
       cleanup_interval=60,  # Less frequent cleanup = less overhead
   )
**max_size:** Limits the number of keys stored. When exceeded, LRU eviction kicks
in. Set this based on your expected number of unique clients.
**cleanup_interval:** How often to scan for expired entries. Higher values mean
less CPU overhead but more memory usage from expired entries.
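For intuition, LRU eviction at `max_size` behaves like the toy store below (this is an illustration of the concept, not the library's implementation):

```python
from collections import OrderedDict

class LRUStore:
    """Toy max_size-bounded store with least-recently-used eviction."""

    def __init__(self, max_size: int):
        self.max_size = max_size
        self.data: OrderedDict[str, int] = OrderedDict()

    def set(self, key: str, value: int) -> None:
        if key in self.data:
            self.data.move_to_end(key)  # refresh recency
        self.data[key] = value
        if len(self.data) > self.max_size:
            self.data.popitem(last=False)  # evict least recently used

    def get(self, key: str):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as recently used
        return self.data[key]

store = LRUStore(max_size=2)
store.set("client-a", 1)
store.set("client-b", 2)
store.get("client-a")     # touch a, so b becomes least recent
store.set("client-c", 3)  # over capacity: evicts client-b
print(list(store.data))   # ['client-a', 'client-c']
```

The practical takeaway: size `max_size` to your expected unique-client count, or recently inactive clients will lose their counters and effectively get a fresh quota.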
SQLite Backend Optimization
---------------------------
SQLite is surprisingly fast for rate limiting:
.. code-block:: python

   from fastapi_traffic import SQLiteBackend

   backend = SQLiteBackend(
       "rate_limits.db",
       cleanup_interval=300,  # Clean every 5 minutes
   )
**Tips:**
1. **Use an SSD.** SQLite performance depends heavily on disk I/O.
2. **Put the database on a local disk.** Network-attached storage adds latency.
3. **WAL mode is enabled by default.** This allows concurrent reads and writes.
4. **Increase cleanup_interval** if you have many keys. Cleanup scans the entire
table.
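WAL mode is the key to tip 3: readers keep reading while a writer appends to the log. The backend enables it for you, but this is the underlying SQLite pragma if you want to verify it on a database file yourself:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "rate_limits.db")
conn = sqlite3.connect(path)

# Switch the journal to write-ahead logging; SQLite reports the active mode.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # wal

conn.close()
```

The mode is persistent: it is stored in the database file, so it survives reconnects.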
Redis Backend Optimization
--------------------------
Redis is the bottleneck in most distributed setups:
**1. Use connection pooling (automatic):**
The backend maintains a pool of connections. You don't need to do anything.
**2. Use pipelining for batch operations:**
If you're checking multiple rate limits, batch them:
.. code-block:: python

   # Instead of multiple round trips
   result1 = await limiter.check(request, config1)
   result2 = await limiter.check(request, config2)

   # Consider combining into one check with a higher cost
   combined_config = RateLimitConfig(limit=100, window_size=60, cost=2)
   result = await limiter.check(request, combined_config)
**3. Use Redis close to your application:**
Network latency is usually the biggest factor. Run Redis in the same datacenter,
or better yet, the same availability zone.
**4. Consider Redis Cluster for high throughput:**
Distributes load across multiple Redis nodes.
Reducing Overhead
-----------------
**1. Exempt paths that don't need limiting:**
.. code-block:: python

   app.add_middleware(
       RateLimitMiddleware,
       limit=1000,
       window_size=60,
       exempt_paths={"/health", "/metrics", "/ready"},
   )
**2. Use coarse-grained limits when possible:**
Instead of limiting every endpoint separately, use middleware for a global limit:
.. code-block:: python

   # One check per request
   app.add_middleware(RateLimitMiddleware, limit=1000, window_size=60)

   # vs. multiple checks per request
   @rate_limit(100, 60)  # Check 1
   @another_decorator    # Check 2
   async def endpoint():
       pass
**3. Increase window size:**

The number of state updates equals the number of requests, regardless of window
size, so a longer window doesn't reduce write volume. It does mean:

- Fewer unique window boundaries
- Better cache efficiency
- More stable rate limiting

.. code-block:: python

   # 60 requests per 60-second window
   @rate_limit(60, 60)

   # Same average rate, but a new window boundary every second
   @rate_limit(1, 1)
**4. Skip headers when not needed:**
.. code-block:: python

   @rate_limit(100, 60, include_headers=False)

Saves a tiny bit of response processing.
Benchmarking
------------
Here's a simple benchmark script:
.. code-block:: python

   import asyncio
   import time
   from unittest.mock import MagicMock

   from fastapi_traffic import MemoryBackend, RateLimiter, RateLimitConfig

   async def benchmark():
       backend = MemoryBackend()
       limiter = RateLimiter(backend)
       await limiter.initialize()
       config = RateLimitConfig(limit=10000, window_size=60)

       # Mock request
       request = MagicMock()
       request.client.host = "127.0.0.1"
       request.url.path = "/test"
       request.method = "GET"
       request.headers = {}

       # Warm up
       for _ in range(100):
           await limiter.check(request, config)

       # Benchmark
       iterations = 10000
       start = time.perf_counter()
       for _ in range(iterations):
           await limiter.check(request, config)
       elapsed = time.perf_counter() - start

       print(f"Total time: {elapsed:.3f}s")
       print(f"Per check: {elapsed/iterations*1000:.3f}ms")
       print(f"Checks/sec: {iterations/elapsed:.0f}")

       await limiter.close()

   asyncio.run(benchmark())
Typical output:
.. code-block:: text

   Total time: 0.150s
   Per check: 0.015ms
   Checks/sec: 66666
Profiling
---------
If you suspect rate limiting is a bottleneck, profile it:
.. code-block:: python

   import asyncio
   import cProfile
   import pstats

   async def profile_rate_limiting():
       # Your rate limiting code here
       pass

   cProfile.run('asyncio.run(profile_rate_limiting())', 'rate_limit.prof')

   stats = pstats.Stats('rate_limit.prof')
   stats.sort_stats('cumulative')
   stats.print_stats(20)
Look for:
- Time spent in backend operations
- Time spent in algorithm calculations
- Unexpected hotspots
When Performance Really Matters
-------------------------------
If you're handling millions of requests per second and rate limiting overhead
is significant:
1. **Consider sampling:** Only check rate limits for a percentage of requests
and extrapolate.
2. **Use probabilistic data structures:** Bloom filters or Count-Min Sketch can
approximate rate limiting with less overhead.
3. **Push to the edge:** Use CDN-level rate limiting (Cloudflare, AWS WAF) to
handle the bulk of traffic.
4. **Accept some inaccuracy:** Fixed window with ``skip_on_error=True`` is very
fast and "good enough" for many use cases.
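Sampling (point 1 above) can be sketched in a few lines: check only a fraction ``p`` of requests and record each sampled request with cost ``1/p`` so the counter still tracks the true rate. Everything here is illustrative; `sampled_check` and the toy counter are not library API, and the enforcement is approximate by design:

```python
import random

def sampled_check(limiter_check, rate_cost: int, sample_p: float = 0.1) -> bool:
    """Check only a fraction of requests, scaling the recorded cost to compensate.

    `limiter_check(cost)` stands in for the real backend call; with
    sample_p=0.1 only ~10% of requests pay the backend round trip.
    """
    if random.random() >= sample_p:
        return True  # skip the backend entirely for this request
    return limiter_check(int(round(rate_cost / sample_p)))  # count it 1/p times

# Toy counter standing in for the backend: limit 100 per window
state = {"count": 0, "limit": 100}

def fake_check(cost: int) -> bool:
    if state["count"] + cost > state["limit"]:
        return False
    state["count"] += cost
    return True

random.seed(42)
allowed = sum(sampled_check(fake_check, 1) for _ in range(200))
print(f"allowed {allowed} of 200")  # approximate, not exact, enforcement
```

The trade-off is exactly the one named above: far fewer backend round trips in exchange for a limit that is only statistically accurate.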
For most applications, though, the default configuration is plenty fast.

Testing
=======
Testing rate-limited endpoints requires some care. You don't want your tests to
be flaky because of timing issues, and you need to verify that limits actually work.
Basic Testing Setup
-------------------
Use pytest with pytest-asyncio for async tests:
.. code-block:: python

   # conftest.py
   import pytest
   from fastapi.testclient import TestClient

   from fastapi_traffic import MemoryBackend, RateLimiter
   from fastapi_traffic.core.limiter import set_limiter

   @pytest.fixture
   def app():
       """Create a fresh app for each test."""
       from myapp import create_app
       return create_app()

   @pytest.fixture
   def client(app):
       """Test client with a fresh rate limiter."""
       backend = MemoryBackend()
       limiter = RateLimiter(backend)
       set_limiter(limiter)
       with TestClient(app) as client:
           yield client
Testing Rate Limit Enforcement
------------------------------
Verify that the limit is actually enforced:
.. code-block:: python

   def test_rate_limit_enforced(client):
       """Test that requests are blocked after the limit is reached."""
       # Make requests up to the limit
       for i in range(10):
           response = client.get("/api/data")
           assert response.status_code == 200, f"Request {i+1} should succeed"

       # The next request should be rate limited
       response = client.get("/api/data")
       assert response.status_code == 429
       assert "retry_after" in response.json()
Testing Rate Limit Headers
--------------------------
Check that headers are included correctly:
.. code-block:: python

   def test_rate_limit_headers(client):
       """Test that rate limit headers are present."""
       response = client.get("/api/data")
       assert "X-RateLimit-Limit" in response.headers
       assert "X-RateLimit-Remaining" in response.headers
       assert "X-RateLimit-Reset" in response.headers

       # Verify the values make sense
       limit = int(response.headers["X-RateLimit-Limit"])
       remaining = int(response.headers["X-RateLimit-Remaining"])
       assert limit == 100  # Your configured limit
       assert remaining == 99  # One request made
Testing Different Clients
-------------------------
Verify that different clients have separate limits:
.. code-block:: python

   def test_separate_limits_per_client(client):
       """Test that different IPs have separate limits."""
       # Client A makes requests
       for _ in range(10):
           response = client.get(
               "/api/data",
               headers={"X-Forwarded-For": "1.1.1.1"},
           )
           assert response.status_code == 200

       # Client A is now limited
       response = client.get(
           "/api/data",
           headers={"X-Forwarded-For": "1.1.1.1"},
       )
       assert response.status_code == 429

       # Client B should still have its full quota
       response = client.get(
           "/api/data",
           headers={"X-Forwarded-For": "2.2.2.2"},
       )
       assert response.status_code == 200
Testing Window Reset
--------------------
Test that limits reset after the window expires:
.. code-block:: python

   import time
   from unittest.mock import patch

   def test_limit_resets_after_window(client):
       """Test that limits reset after the window expires."""
       # Exhaust the limit
       for _ in range(10):
           client.get("/api/data")

       # Should be limited
       response = client.get("/api/data")
       assert response.status_code == 429

       # Compute the future timestamp *before* patching time.time
       future = time.time() + 61
       with patch('time.time', return_value=future):
           # Should be allowed again
           response = client.get("/api/data")
           assert response.status_code == 200
Testing Exemptions
------------------
Verify that exemptions work:
.. code-block:: python

   def test_exempt_paths(client):
       """Test that exempt paths bypass rate limiting."""
       # Exhaust the limit on a regular endpoint
       for _ in range(100):
           client.get("/api/data")

       # The regular endpoint should be limited
       response = client.get("/api/data")
       assert response.status_code == 429

       # The health check should still work
       response = client.get("/health")
       assert response.status_code == 200

   def test_exempt_ips(client):
       """Test that exempt IPs bypass rate limiting."""
       # Make many requests from an exempt IP
       for _ in range(1000):
           response = client.get(
               "/api/data",
               headers={"X-Forwarded-For": "127.0.0.1"},
           )
           assert response.status_code == 200  # Never limited
Testing with Async Client
-------------------------
For async endpoints, use httpx:
.. code-block:: python

   import asyncio

   import httpx
   import pytest

   from myapp import create_app

   @pytest.mark.asyncio
   async def test_async_rate_limiting():
       """Test rate limiting with an async client."""
       app = create_app()
       transport = httpx.ASGITransport(app=app)
       async with httpx.AsyncClient(transport=transport, base_url="http://test") as client:
           # Make concurrent requests
           responses = await asyncio.gather(*[
               client.get("/api/data")
               for _ in range(15)
           ])

       successes = sum(1 for r in responses if r.status_code == 200)
       limited = sum(1 for r in responses if r.status_code == 429)
       assert successes == 10  # Limit
       assert limited == 5  # Over limit
Testing Backend Failures
------------------------
Test behavior when the backend fails:
.. code-block:: python
from unittest.mock import AsyncMock, patch
from fastapi_traffic import BackendError
def test_skip_on_error(client):
"""Test that requests are allowed when backend fails and skip_on_error=True."""
with patch.object(
MemoryBackend, 'get',
side_effect=BackendError("Connection failed")
):
# With skip_on_error=True, should still work
response = client.get("/api/data")
assert response.status_code == 200
def test_fail_on_error(client):
"""Test that requests fail when backend fails and skip_on_error=False."""
with patch.object(
MemoryBackend, 'get',
side_effect=BackendError("Connection failed")
):
# With skip_on_error=False (default), should fail
response = client.get("/api/strict-data")
assert response.status_code == 500
Mocking the Rate Limiter
------------------------
For unit tests, you might want to mock the rate limiter entirely:
.. code-block:: python

   import time
   from unittest.mock import AsyncMock, MagicMock

   from fastapi_traffic.core.limiter import set_limiter
   from fastapi_traffic.core.models import RateLimitInfo, RateLimitResult

   def test_with_mocked_limiter(client):
       """Test endpoint logic without actual rate limiting."""
       mock_limiter = MagicMock()
       mock_limiter.hit = AsyncMock(return_value=RateLimitResult(
           allowed=True,
           info=RateLimitInfo(
               limit=100,
               remaining=99,
               reset_at=time.time() + 60,
               window_size=60,
           ),
           key="test",
       ))
       set_limiter(mock_limiter)

       response = client.get("/api/data")
       assert response.status_code == 200
       mock_limiter.hit.assert_called_once()
Integration Testing with Redis
------------------------------
For integration tests with Redis:
.. code-block:: python

   import pytest

   from fastapi_traffic import RateLimiter, RateLimitConfig
   from fastapi_traffic.backends.redis import RedisBackend

   @pytest.fixture
   async def redis_backend():
       """Create a Redis backend for testing."""
       backend = await RedisBackend.from_url(
           "redis://localhost:6379/15",  # Use a test database
           key_prefix="test:",
       )
       yield backend
       await backend.clear()  # Clean up after the test
       await backend.close()

   @pytest.mark.asyncio
   async def test_redis_rate_limiting(redis_backend):
       """Test rate limiting with real Redis."""
       limiter = RateLimiter(redis_backend)
       await limiter.initialize()
       config = RateLimitConfig(limit=5, window_size=60)
       request = create_mock_request("1.1.1.1")  # helper like the mock_request fixture

       # Make requests up to the limit
       for _ in range(5):
           result = await limiter.check(request, config)
           assert result.allowed

       # The next one should be blocked
       result = await limiter.check(request, config)
       assert not result.allowed

       await limiter.close()
Fixtures for Common Scenarios
-----------------------------
.. code-block:: python

   # conftest.py
   from unittest.mock import MagicMock

   import pytest

   from fastapi_traffic import MemoryBackend, RateLimiter, RateLimitConfig
   from fastapi_traffic.core.limiter import set_limiter

   @pytest.fixture
   def fresh_limiter():
       """Fresh rate limiter for each test."""
       backend = MemoryBackend()
       limiter = RateLimiter(backend)
       set_limiter(limiter)
       return limiter

   @pytest.fixture
   def rate_limit_config():
       """Standard rate limit config for tests."""
       return RateLimitConfig(
           limit=10,
           window_size=60,
       )

   @pytest.fixture
   def mock_request():
       """Create a mock request."""
       def _create(ip="127.0.0.1", path="/test"):
           request = MagicMock()
           request.client.host = ip
           request.url.path = path
           request.method = "GET"
           request.headers = {}
           return request
       return _create
Avoiding Flaky Tests
--------------------
Rate limiting tests can be flaky due to timing. Tips:
1. **Use short windows for tests:**
.. code-block:: python
@rate_limit(10, 1) # 10 per second, not 10 per minute
2. **Mock time instead of sleeping:**
.. code-block:: python
with patch('time.time', return_value=future_time):
# Test window reset
3. **Reset state between tests:**
.. code-block:: python
@pytest.fixture(autouse=True)
async def reset_limiter():
yield
limiter = get_limiter()
await limiter.backend.clear()
4. **Use unique keys per test:**
.. code-block:: python
def test_something(mock_request):
request = mock_request(ip=f"test-{uuid.uuid4()}")