Rate Limiting Algorithms
========================

FastAPI Traffic ships with five rate limiting algorithms. Each has its own strengths,
and picking the right one depends on what you're trying to achieve.
This guide will help you understand the tradeoffs and choose wisely.

Overview
--------

Here's the quick comparison:

.. list-table::
   :header-rows: 1
   :widths: 20 40 40

   * - Algorithm
     - Best For
     - Tradeoffs
   * - **Token Bucket**
     - APIs that need burst handling
     - Allows temporary spikes above average rate
   * - **Sliding Window**
     - Precise rate limiting
     - Higher memory usage
   * - **Fixed Window**
     - Simple, low-overhead limiting
     - Boundary issues (2x burst at window edges)
   * - **Leaky Bucket**
     - Consistent throughput
     - No burst handling
   * - **Sliding Window Counter**
     - General purpose (default)
     - Good balance of precision and efficiency

Token Bucket
------------

Think of this as a bucket that holds tokens. Each request consumes a token, and
tokens refill at a steady rate. If the bucket is empty, requests are rejected.

.. code-block:: python

   from fastapi import FastAPI, Request

   from fastapi_traffic import rate_limit, Algorithm

   app = FastAPI()

   @app.get("/api/data")
   @rate_limit(
       100,  # 100 tokens refill per minute
       60,
       algorithm=Algorithm.TOKEN_BUCKET,
       burst_size=20,  # bucket can hold up to 20 tokens
   )
   async def get_data(request: Request):
       return {"data": "here"}

**How it works:**

1. The bucket starts full (at ``burst_size`` capacity)
2. Each request removes one token
3. Tokens refill at ``limit / window_size`` per second
4. If no tokens are available, the request is rejected

**When to use it:**

- Your API has legitimate burst traffic (e.g., page loads that trigger multiple requests)
- You want to allow short spikes while maintaining an average rate
- Mobile apps that batch requests when coming online

**Example scenario:** A mobile app that syncs data when it reconnects. You want to
allow it to catch up quickly, but not overwhelm your servers.

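The refill-and-consume cycle above can be sketched as a minimal in-memory model. This is illustrative only, not the library's actual implementation, which lives in its storage backends:

```python
import time

class TokenBucket:
    """Minimal token bucket sketch (illustrative, not the library's code)."""

    def __init__(self, limit: int, window_size: float, burst_size: int):
        self.refill_rate = limit / window_size   # tokens added per second
        self.capacity = burst_size
        self.tokens = float(burst_size)          # bucket starts full
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill for the elapsed time, capped at the bucket's capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1   # each request consumes one token
            return True
        return False
```

A burst of up to ``burst_size`` requests is accepted immediately from a full bucket; after that, requests are admitted at the average ``limit / window_size`` rate.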
Sliding Window
--------------

This algorithm tracks the exact timestamp of every request within the window. It's
the most accurate approach, but uses more memory.

.. code-block:: python

   @app.get("/api/transactions")
   @rate_limit(100, 60, algorithm=Algorithm.SLIDING_WINDOW)
   async def get_transactions(request: Request):
       return {"transactions": []}

**How it works:**

1. Every request timestamp is stored
2. When checking, we count requests in the last ``window_size`` seconds
3. Old timestamps are cleaned up automatically

**When to use it:**

- You need precise rate limiting (financial APIs, compliance requirements)
- Memory isn't a major concern
- The rate limit is relatively low (not millions of requests)

**Tradeoffs:**

- Memory usage grows with request volume
- Slightly more CPU for timestamp management

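The timestamp log can be sketched like this (an in-memory illustration; the library's backend-based implementation may differ):

```python
import time
from collections import deque

class SlidingWindowLog:
    """Minimal sliding window log sketch: one timestamp per request."""

    def __init__(self, limit: int, window_size: float):
        self.limit = limit
        self.window_size = window_size
        self.timestamps = deque()  # oldest first

    def allow(self) -> bool:
        now = time.monotonic()
        # Evict timestamps that have aged out of the window
        while self.timestamps and self.timestamps[0] <= now - self.window_size:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

Note that storage is up to one timestamp per allowed request in the window, which is exactly why memory grows with the limit and the request volume.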
Fixed Window
------------

The simplest algorithm. Divide time into fixed windows (e.g., every minute) and
count requests in each window.

.. code-block:: python

   @app.get("/api/simple")
   @rate_limit(100, 60, algorithm=Algorithm.FIXED_WINDOW)
   async def simple_endpoint(request: Request):
       return {"status": "ok"}

**How it works:**

1. Time is divided into fixed windows (0:00-1:00, 1:00-2:00, etc.)
2. Each request increments the counter for the current window
3. When the window changes, the counter resets

**When to use it:**

- You want the simplest, most efficient option
- Slight inaccuracy at window boundaries is acceptable
- High-volume scenarios where memory matters

**The boundary problem:**

A client could make 100 requests at 0:59 and another 100 at 1:01, effectively
getting 200 requests in 2 seconds. If this matters for your use case, use
sliding window counter instead.

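A minimal sketch of the counter, with an explicit ``now`` parameter so the boundary problem is easy to reproduce (illustrative only, not the library's implementation):

```python
import time

class FixedWindowCounter:
    """Minimal fixed window sketch: one counter per aligned time window."""

    def __init__(self, limit: int, window_size: float):
        self.limit = limit
        self.window_size = window_size
        self.window_start = 0.0
        self.count = 0

    def allow(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        # Align to the window grid: 0-60, 60-120, ...
        window_start = now - (now % self.window_size)
        if window_start != self.window_start:
            self.window_start = window_start
            self.count = 0  # counter resets on window change
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```

With a limit of 2 per 60 seconds, requests at t=59.0, t=59.5, and t=61.0 are all accepted: three requests in two seconds, which is the 2x boundary burst described above.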
Leaky Bucket
------------

Imagine a bucket with a hole in the bottom. Requests fill the bucket, and it
"leaks" at a constant rate. If the bucket overflows, requests are rejected.

.. code-block:: python

   @app.get("/api/steady")
   @rate_limit(
       100,
       60,
       algorithm=Algorithm.LEAKY_BUCKET,
       burst_size=10,  # bucket capacity
   )
   async def steady_endpoint(request: Request):
       return {"status": "ok"}

**How it works:**

1. The bucket has a maximum capacity (``burst_size``)
2. Each request adds "water" to the bucket
3. Water leaks out at ``limit / window_size`` per second
4. If the bucket would overflow, the request is rejected

**When to use it:**

- You need consistent, smooth throughput
- Downstream systems can't handle bursts
- Processing capacity is truly fixed (e.g., hardware limitations)

**Difference from token bucket:**

- Token bucket allows bursts up to the bucket size
- Leaky bucket smooths out traffic to a constant rate

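The drain-and-fill logic above can be sketched as follows (an in-memory illustration; not the library's actual code). Note how it mirrors the token bucket but inverts the direction, the counter starts empty and rises toward capacity instead of starting full and falling:

```python
import time

class LeakyBucket:
    """Minimal leaky bucket sketch: the level drains at a constant rate."""

    def __init__(self, limit: int, window_size: float, burst_size: int):
        self.leak_rate = limit / window_size  # units drained per second
        self.capacity = burst_size
        self.level = 0.0
        self.last_leak = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Drain for the elapsed time, never below empty
        self.level = max(0.0, self.level - (now - self.last_leak) * self.leak_rate)
        self.last_leak = now
        if self.level + 1 <= self.capacity:
            self.level += 1  # each request adds one unit of "water"
            return True
        return False
```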
Sliding Window Counter
----------------------

This is the default algorithm, and it's a good choice for most use cases. It
combines the efficiency of fixed windows with better accuracy.

.. code-block:: python

   @app.get("/api/default")
   @rate_limit(100, 60, algorithm=Algorithm.SLIDING_WINDOW_COUNTER)
   async def default_endpoint(request: Request):
       return {"status": "ok"}

**How it works:**

1. Maintains counters for the current and previous windows
2. Calculates a weighted average based on how far into the current window we are
3. At 30 seconds into a 60-second window: ``count = prev_count * 0.5 + curr_count``

**When to use it:**

- General purpose rate limiting
- You want better accuracy than fixed window without the memory cost of sliding window
- Most APIs fall into this category

**Why it's the default:**

It gives you 90% of the accuracy of sliding window with the memory efficiency of
fixed window. Unless you have specific requirements, this is probably what you want.

Choosing the Right Algorithm
----------------------------

Here's a decision tree:

1. **Do you need to allow bursts?**

   - Yes → Token Bucket
   - No, I need smooth traffic → Leaky Bucket

2. **Do you need exact precision?**

   - Yes, compliance/financial → Sliding Window
   - No, good enough is fine → Continue

3. **Is memory a concern?**

   - Yes, high volume → Fixed Window
   - No → Sliding Window Counter (default)

Performance Comparison
----------------------

All algorithms are O(1) for the check operation, but they differ in storage:

.. list-table::
   :header-rows: 1

   * - Algorithm
     - Storage per Key
     - Operations
   * - Token Bucket
     - 2 floats
     - 1 read, 1 write
   * - Sliding Window
     - N timestamps
     - 1 read, 1 write, cleanup
   * - Fixed Window
     - 1 int, 1 float
     - 1 read, 1 write
   * - Leaky Bucket
     - 2 floats
     - 1 read, 1 write
   * - Sliding Window Counter
     - 3 values
     - 1 read, 1 write

For most applications, the performance difference is negligible. Choose based on
behavior, not performance, unless you're handling millions of requests per second.

Code Examples
-------------

Here's a complete example showing all algorithms:

.. code-block:: python

   from fastapi import FastAPI, Request

   from fastapi_traffic import rate_limit, Algorithm

   app = FastAPI()

   # Burst-friendly endpoint
   @app.get("/api/burst")
   @rate_limit(100, 60, algorithm=Algorithm.TOKEN_BUCKET, burst_size=25)
   async def burst_endpoint(request: Request):
       return {"type": "token_bucket"}

   # Precise limiting
   @app.get("/api/precise")
   @rate_limit(100, 60, algorithm=Algorithm.SLIDING_WINDOW)
   async def precise_endpoint(request: Request):
       return {"type": "sliding_window"}

   # Simple and efficient
   @app.get("/api/simple")
   @rate_limit(100, 60, algorithm=Algorithm.FIXED_WINDOW)
   async def simple_endpoint(request: Request):
       return {"type": "fixed_window"}

   # Smooth throughput
   @app.get("/api/steady")
   @rate_limit(100, 60, algorithm=Algorithm.LEAKY_BUCKET)
   async def steady_endpoint(request: Request):
       return {"type": "leaky_bucket"}

   # Best of both worlds (default)
   @app.get("/api/balanced")
   @rate_limit(100, 60, algorithm=Algorithm.SLIDING_WINDOW_COUNTER)
   async def balanced_endpoint(request: Request):
       return {"type": "sliding_window_counter"}