release: bump version to 0.3.0
- Refactor Redis backend connection handling and pool management
- Update algorithm implementations with improved type annotations
- Enhance config loader validation with stricter Pydantic schemas
- Improve decorator and middleware error handling
- Expand example scripts with better docstrings and usage patterns
- Add new 00_basic_usage.py example for quick start
- Reorganize examples directory structure
- Fix type annotation inconsistencies across core modules
- Update dependencies in pyproject.toml
docs/user-guide/algorithms.rst
Rate Limiting Algorithms
========================

FastAPI Traffic ships with five rate limiting algorithms. Each has its own strengths,
and picking the right one depends on what you're trying to achieve.

This guide will help you understand the tradeoffs and choose wisely.
Overview
--------

Here's the quick comparison:

.. list-table::
   :header-rows: 1
   :widths: 20 40 40

   * - Algorithm
     - Best For
     - Tradeoffs
   * - **Token Bucket**
     - APIs that need burst handling
     - Allows temporary spikes above average rate
   * - **Sliding Window**
     - Precise rate limiting
     - Higher memory usage
   * - **Fixed Window**
     - Simple, low-overhead limiting
     - Boundary issues (2x burst at window edges)
   * - **Leaky Bucket**
     - Consistent throughput
     - No burst handling
   * - **Sliding Window Counter**
     - General purpose (default)
     - Good balance of precision and efficiency
Token Bucket
------------

Think of this as a bucket that holds tokens. Each request consumes a token, and
tokens refill at a steady rate. If the bucket is empty, requests are rejected.

.. code-block:: python

   from fastapi_traffic import rate_limit, Algorithm

   @app.get("/api/data")
   @rate_limit(
       100,  # 100 tokens refill per minute
       60,
       algorithm=Algorithm.TOKEN_BUCKET,
       burst_size=20,  # bucket can hold up to 20 tokens
   )
   async def get_data(request: Request):
       return {"data": "here"}

**How it works:**

1. The bucket starts full (at ``burst_size`` capacity)
2. Each request removes one token
3. Tokens refill at ``limit / window_size`` per second
4. If no tokens are available, the request is rejected

**When to use it:**

- Your API has legitimate burst traffic (e.g., page loads that trigger multiple requests)
- You want to allow short spikes while maintaining an average rate
- Mobile apps that batch requests when coming online

**Example scenario:** A mobile app that syncs data when it reconnects. You want to
allow it to catch up quickly, but not overwhelm your servers.
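The four steps above can be sketched in a few lines of plain Python. This is an illustrative model, not the library's actual implementation; the ``TokenBucket`` class name and ``allow`` method are assumptions made for the sketch.

```python
import time


class TokenBucket:
    """Minimal token bucket sketch: refills at limit/window_size tokens per second."""

    def __init__(self, limit: int, window_size: float, burst_size: int):
        self.rate = limit / window_size      # tokens added per second
        self.capacity = burst_size
        self.tokens = float(burst_size)      # bucket starts full
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill for the elapsed time, never exceeding capacity
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0               # each request consumes one token
            return True
        return False


# A bucket holding 3 tokens admits 3 back-to-back requests, then rejects
bucket = TokenBucket(limit=60, window_size=60, burst_size=3)
print([bucket.allow() for _ in range(4)])  # → [True, True, True, False]
```

Because the bucket starts full, a fresh client gets an immediate burst of ``burst_size`` requests; sustained traffic is then capped at the refill rate.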
Sliding Window
--------------

This algorithm tracks the exact timestamp of every request within the window. It's
the most accurate approach, but uses more memory.

.. code-block:: python

   @app.get("/api/transactions")
   @rate_limit(100, 60, algorithm=Algorithm.SLIDING_WINDOW)
   async def get_transactions(request: Request):
       return {"transactions": []}

**How it works:**

1. Every request timestamp is stored
2. When checking, we count requests in the last ``window_size`` seconds
3. Old timestamps are cleaned up automatically

**When to use it:**

- You need precise rate limiting (financial APIs, compliance requirements)
- Memory isn't a major concern
- The rate limit is relatively low (not millions of requests)

**Tradeoffs:**

- Memory usage grows with request volume
- Slightly more CPU for timestamp management
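The timestamp-log approach can be sketched as follows. Again this is a standalone illustration, not the library's internals; the ``SlidingWindowLog`` name is an assumption.

```python
import time
from collections import deque


class SlidingWindowLog:
    """Illustrative sliding window log: one stored timestamp per admitted request."""

    def __init__(self, limit: int, window_size: float):
        self.limit = limit
        self.window = window_size
        self.timestamps = deque()  # oldest timestamp at the left end

    def allow(self) -> bool:
        now = time.monotonic()
        # Evict timestamps that have aged out of the window
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False


limiter = SlidingWindowLog(limit=2, window_size=60)
print([limiter.allow() for _ in range(3)])  # → [True, True, False]
```

Note how memory is proportional to the number of admitted requests in the window, which is exactly the tradeoff listed above.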
Fixed Window
------------

The simplest algorithm. Divide time into fixed windows (e.g., every minute) and
count requests in each window.

.. code-block:: python

   @app.get("/api/simple")
   @rate_limit(100, 60, algorithm=Algorithm.FIXED_WINDOW)
   async def simple_endpoint(request: Request):
       return {"status": "ok"}

**How it works:**

1. Time is divided into fixed windows (0:00-1:00, 1:00-2:00, etc.)
2. Each request increments the counter for the current window
3. When the window changes, the counter resets

**When to use it:**

- You want the simplest, most efficient option
- Slight inaccuracy at window boundaries is acceptable
- High-volume scenarios where memory matters

**The boundary problem:**

A client could make 100 requests at 0:59 and another 100 at 1:01, effectively
getting 200 requests in 2 seconds. If this matters for your use case, use
sliding window counter instead.
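Both the counter-reset mechanics and the boundary problem are easy to see in a small sketch. The clock is passed in explicitly to make the behavior deterministic; ``FixedWindow`` is an illustrative name, not the library's class.

```python
class FixedWindow:
    """Illustrative fixed window: one counter that resets when the window rolls over."""

    def __init__(self, limit: int, window_size: float):
        self.limit = limit
        self.window = window_size
        self.window_id = -1   # which fixed window the counter belongs to
        self.count = 0

    def allow(self, now: float) -> bool:
        window_id = int(now // self.window)
        if window_id != self.window_id:  # boundary crossed: reset the counter
            self.window_id = window_id
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


# Demonstrating the boundary problem with limit=100 per 60s:
fw = FixedWindow(limit=100, window_size=60)
print(all(fw.allow(59.0) for _ in range(100)))   # 100 requests at 0:59 all pass
print(fw.allow(59.5))                            # the 101st in that window is rejected
print(all(fw.allow(61.0) for _ in range(100)))   # 100 more at 1:01 all pass again
```

The last two windows together admit 200 requests in about 2 seconds, which is the 2x burst at window edges noted in the comparison table.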
Leaky Bucket
------------

Imagine a bucket with a hole in the bottom. Requests fill the bucket, and it
"leaks" at a constant rate. If the bucket overflows, requests are rejected.

.. code-block:: python

   @app.get("/api/steady")
   @rate_limit(
       100,
       60,
       algorithm=Algorithm.LEAKY_BUCKET,
       burst_size=10,  # bucket capacity
   )
   async def steady_endpoint(request: Request):
       return {"status": "ok"}

**How it works:**

1. The bucket has a maximum capacity (``burst_size``)
2. Each request adds "water" to the bucket
3. Water leaks out at ``limit / window_size`` per second
4. If the bucket would overflow, the request is rejected

**When to use it:**

- You need consistent, smooth throughput
- Downstream systems can't handle bursts
- Processing capacity is truly fixed (e.g., hardware limitations)

**Difference from token bucket:**

- Token bucket allows bursts up to the bucket size
- Leaky bucket smooths out traffic to a constant rate
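The water-level mechanics can be sketched like this (an illustrative model; ``LeakyBucket`` and ``allow`` are names chosen for the sketch, not the library's API). Note the symmetry with the token bucket: here the level starts empty and each request raises it, so there is no initial burst allowance.

```python
import time


class LeakyBucket:
    """Illustrative leaky bucket: level rises 1 per request, drains at a fixed rate."""

    def __init__(self, limit: int, window_size: float, burst_size: int):
        self.leak_rate = limit / window_size  # units drained per second
        self.capacity = burst_size
        self.level = 0.0                      # bucket starts empty
        self.last_check = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Drain for the elapsed time, never below empty
        elapsed = now - self.last_check
        self.level = max(0.0, self.level - elapsed * self.leak_rate)
        self.last_check = now
        if self.level + 1.0 <= self.capacity:  # would this request overflow?
            self.level += 1.0
            return True
        return False


# Capacity 2: two immediate requests fit, the third overflows
lb = LeakyBucket(limit=60, window_size=60, burst_size=2)
print([lb.allow() for _ in range(3)])  # → [True, True, False]
```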
Sliding Window Counter
----------------------

This is the default algorithm, and it's a good choice for most use cases. It
combines the efficiency of fixed windows with better accuracy.

.. code-block:: python

   @app.get("/api/default")
   @rate_limit(100, 60, algorithm=Algorithm.SLIDING_WINDOW_COUNTER)
   async def default_endpoint(request: Request):
       return {"status": "ok"}

**How it works:**

1. Maintains counters for the current and previous windows
2. Calculates a weighted average based on how far into the current window we are
3. At 30 seconds into a 60-second window: ``count = prev_count * 0.5 + curr_count``

**When to use it:**

- General purpose rate limiting
- You want better accuracy than fixed window without the memory cost of sliding window
- Most APIs fall into this category

**Why it's the default:**

It gives you 90% of the accuracy of sliding window with the memory efficiency of
fixed window. Unless you have specific requirements, this is probably what you want.
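The weighted-average step can be written out directly; ``weighted_count`` is a hypothetical helper name used here to show the arithmetic, not a function the library exposes.

```python
def weighted_count(prev_count: float, curr_count: float,
                   elapsed: float, window: float) -> float:
    """Estimate the number of requests in the sliding window.

    The previous fixed window's count is scaled by the fraction of it
    that still overlaps the sliding window.
    """
    prev_weight = 1.0 - (elapsed / window)
    return prev_count * prev_weight + curr_count


# 30 seconds into a 60-second window, the previous count is weighted by 0.5:
print(weighted_count(prev_count=80, curr_count=30, elapsed=30, window=60))  # → 70.0
```

Only three values per key (previous count, current count, current window start) are needed, which is where the memory efficiency comes from.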
Choosing the Right Algorithm
----------------------------

Here's a decision tree:

1. **Do you need to allow bursts?**

   - Yes → Token Bucket
   - No, I need smooth traffic → Leaky Bucket

2. **Do you need exact precision?**

   - Yes, compliance/financial → Sliding Window
   - No, good enough is fine → Continue

3. **Is memory a concern?**

   - Yes, high volume → Fixed Window
   - No → Sliding Window Counter (default)
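One way to encode the decision tree above as a helper, if you want it in code form; ``pick_algorithm`` is purely illustrative and not part of the library.

```python
def pick_algorithm(needs_bursts: bool = False, needs_smoothing: bool = False,
                   needs_precision: bool = False, memory_tight: bool = False) -> str:
    """Walk the decision tree in order; the first matching need wins."""
    if needs_bursts:
        return "TOKEN_BUCKET"
    if needs_smoothing:
        return "LEAKY_BUCKET"
    if needs_precision:
        return "SLIDING_WINDOW"
    if memory_tight:
        return "FIXED_WINDOW"
    return "SLIDING_WINDOW_COUNTER"  # the default


print(pick_algorithm(needs_bursts=True))      # → TOKEN_BUCKET
print(pick_algorithm(needs_precision=True))   # → SLIDING_WINDOW
print(pick_algorithm())                       # → SLIDING_WINDOW_COUNTER
```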
Performance Comparison
----------------------

All algorithms are O(1) for the check operation, but they differ in storage:

.. list-table::
   :header-rows: 1

   * - Algorithm
     - Storage per Key
     - Operations
   * - Token Bucket
     - 2 floats
     - 1 read, 1 write
   * - Sliding Window
     - N timestamps
     - 1 read, 1 write, cleanup
   * - Fixed Window
     - 1 int, 1 float
     - 1 read, 1 write
   * - Leaky Bucket
     - 2 floats
     - 1 read, 1 write
   * - Sliding Window Counter
     - 3 values
     - 1 read, 1 write

For most applications, the performance difference is negligible. Choose based on
behavior, not performance, unless you're handling millions of requests per second.
Code Examples
-------------

Here's a complete example showing all algorithms:

.. code-block:: python

   from fastapi import FastAPI, Request
   from fastapi_traffic import rate_limit, Algorithm

   app = FastAPI()

   # Burst-friendly endpoint
   @app.get("/api/burst")
   @rate_limit(100, 60, algorithm=Algorithm.TOKEN_BUCKET, burst_size=25)
   async def burst_endpoint(request: Request):
       return {"type": "token_bucket"}

   # Precise limiting
   @app.get("/api/precise")
   @rate_limit(100, 60, algorithm=Algorithm.SLIDING_WINDOW)
   async def precise_endpoint(request: Request):
       return {"type": "sliding_window"}

   # Simple and efficient
   @app.get("/api/simple")
   @rate_limit(100, 60, algorithm=Algorithm.FIXED_WINDOW)
   async def simple_endpoint(request: Request):
       return {"type": "fixed_window"}

   # Smooth throughput
   @app.get("/api/steady")
   @rate_limit(100, 60, algorithm=Algorithm.LEAKY_BUCKET)
   async def steady_endpoint(request: Request):
       return {"type": "leaky_bucket"}

   # Best of both worlds (default)
   @app.get("/api/balanced")
   @rate_limit(100, 60, algorithm=Algorithm.SLIDING_WINDOW_COUNTER)
   async def balanced_endpoint(request: Request):
       return {"type": "sliding_window_counter"}