
429 Too Many Requests — Hitting Rate Limits

Symptoms

- API returns HTTP 429 with a `Retry-After` header indicating wait time
- Response body contains rate limit details: `{"error":"rate_limit_exceeded","remaining":0,"reset":1709251200}`
- Requests succeed normally, then suddenly all fail for a rolling time window
- Different API keys or OAuth tokens hit limits independently at different rates
- Logs show bursts of 429s followed by recovery once the window resets

Root Causes

- Sending requests in a tight loop without any delay between calls
- Sharing a single API key across multiple workers or processes without coordination
- Retry logic that immediately retries on failure, compounding the rate limit violation
- Not reading or respecting the `Retry-After` or `X-RateLimit-Reset` response headers
- Burst traffic from batch jobs scheduled to run simultaneously (e.g., top of the hour)
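The last cause is the easiest to fix in code: a small random delay at the start of each batch job spreads top-of-the-hour cron fire-ups across a window instead of bursting together. A minimal sketch (the 60-second window is an assumption to tune to your quota):

```python
import random
import time

def start_jitter(max_delay: float = 60.0) -> float:
    """Pick a random delay, in seconds, to wait before a batch job's
    first API call, so jobs scheduled at the same minute don't burst."""
    return random.uniform(0.0, max_delay)

# At the top of the batch job:
# time.sleep(start_jitter())
```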

Diagnosis

1. **Inspect the response headers** to understand the rate limit policy:
```bash
curl -i -X GET https://api.example.com/endpoint \
-H 'Authorization: Bearer YOUR_TOKEN'
# Look for: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, Retry-After
```
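The same headers can be inspected programmatically. A small sketch that summarizes them from any headers mapping (header names vary by provider; the `X-RateLimit-*` names and epoch-based reset here are common but assumed):

```python
from datetime import datetime, timezone

def parse_rate_limit(headers: dict) -> dict:
    """Summarize common (provider-dependent) rate-limit headers."""
    info = {
        'limit': int(headers.get('X-RateLimit-Limit', 0)),
        'remaining': int(headers.get('X-RateLimit-Remaining', 0)),
    }
    reset = headers.get('X-RateLimit-Reset')
    if reset is not None:
        # Reset is usually a Unix epoch; convert to an aware datetime
        info['reset_at'] = datetime.fromtimestamp(int(reset), tz=timezone.utc)
    return info
```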

2. **Decode the reset timestamp** if it is epoch-based:
```bash
date -d @1709251200 # Linux
date -r 1709251200 # macOS
```

3. **Calculate your actual request rate** by counting logs over a rolling window:
```bash
grep 'POST /api/' access.log | awk '{print $4}' | cut -c1-18 | sort | uniq -c
# $4 is the '[10/Oct/2024:13:55:36' timestamp; the first 18 chars give per-minute buckets
```
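The same per-minute bucketing is easy to do in Python if you want to post-process logs further. A sketch, assuming nginx/Apache combined log format where field 4 is the timestamp:

```python
from collections import Counter

def per_minute_counts(log_lines: list[str]) -> Counter:
    """Bucket combined-log-format lines by minute.

    Field 4 looks like '[10/Oct/2024:13:55:36'; its first 18
    characters uniquely identify the minute.
    """
    counts: Counter = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) > 3:
            counts[fields[3][:18]] += 1
    return counts
```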

4. **Check if multiple processes share the same key** — grep your process list or config:
```bash
grep -r 'API_KEY' .env* config/ | head -20
ps aux | grep worker
```

5. **Simulate the limit** with a quick burst test to confirm the threshold:
```bash
for i in $(seq 1 20); do
  curl -s -o /dev/null -w '%{http_code}\n' https://api.example.com/endpoint \
    -H 'Authorization: Bearer YOUR_TOKEN'
done
```

Solution

**1. Respect the `Retry-After` header with exponential backoff:**
```python
import time, random, httpx
from email.utils import parsedate_to_datetime

def retry_after_seconds(value: str | None, attempt: int) -> float:
    """Retry-After may be delta-seconds or an HTTP-date (RFC 9110)."""
    if value is None:
        return float(2 ** attempt)  # exponential backoff when the header is absent
    try:
        return float(value)
    except ValueError:
        return max(parsedate_to_datetime(value).timestamp() - time.time(), 0.0)

def call_with_retry(url: str, headers: dict, max_retries: int = 5) -> httpx.Response:
    for attempt in range(max_retries):
        resp = httpx.get(url, headers=headers)
        if resp.status_code == 429:
            wait = retry_after_seconds(resp.headers.get('Retry-After'), attempt)
            time.sleep(wait + random.uniform(0, 1))  # jitter avoids synchronized retries
            continue
        resp.raise_for_status()
        return resp
    raise RuntimeError('Max retries exceeded')
```

**2. Add a token bucket in front of outbound calls:**
```python
import asyncio, time

class TokenBucket:
    """Allow roughly `rate` requests/second, with bursts up to `capacity`."""
    def __init__(self, rate: float, capacity: int):
        self.rate, self.capacity = rate, capacity
        self.tokens, self.updated = float(capacity), time.monotonic()

    async def acquire(self) -> None:
        while True:
            now = time.monotonic()
            # Refill tokens for the time elapsed since the last call
            self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
            self.updated = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            await asyncio.sleep((1 - self.tokens) / self.rate)

bucket = TokenBucket(rate=10, capacity=10)  # ~10 requests/second

async def safe_fetch(client, url):
    await bucket.acquire()  # waits until a token is available
    return await client.get(url)
```

**3. Distribute load across multiple API keys (if the provider allows):**
```python
import itertools

# cycle() yields keys round-robin; call next() per request, not once at startup
key_pool = itertools.cycle(['key_a', 'key_b', 'key_c'])

def auth_headers() -> dict:
    return {'Authorization': f'Bearer {next(key_pool)}'}
```

**4. Cache responses so repeated calls do not hit the API:**
```python
from django.core.cache import cache

def get_user_data(user_id: int) -> dict:
    key = f'api:user:{user_id}'
    cached = cache.get(key)
    if cached is not None:  # distinguish a cached empty dict from a cache miss
        return cached
    data = call_api(user_id)  # your underlying API call
    cache.set(key, data, timeout=300)  # cache for 5 minutes
    return data
```

Prevention

- **Track your usage** against the documented quota before hitting limits; use `X-RateLimit-Remaining` to self-throttle before reaching zero
- **Stagger batch jobs** using randomised start times or a queue (Celery, django-tasks) rather than cron jobs that fire simultaneously
- **Cache aggressively** for read-heavy endpoints — a 60-second cache can reduce outbound calls by 99% for popular resources
- **Set up alerts** when `X-RateLimit-Remaining` drops below 20% so you can investigate before requests start failing

Related Status Codes

Related Terms