Difficulty: Advanced | Time: 15 min | Status: HTTP 502

502 Bad Gateway — WebSocket Upgrade Failure Through Proxy

Symptoms

- WebSocket connections fail with a 502 error during the initial handshake — the connection never upgrades to WebSocket protocol
- The application's WebSocket feature works when connecting directly to the backend server (bypassing the proxy) but fails through Nginx, HAProxy, or ALB
- The proxy access log records the initial HTTP GET/Upgrade request with status 502 instead of 101 Switching Protocols
- Browser console shows: `WebSocket connection to 'wss://...' failed: Error during WebSocket handshake: Unexpected response code: 502`
- Connections that do upgrade successfully are dropped once they have been idle for the proxy's `proxy_read_timeout` (60 seconds by default in Nginx)

Root causes

- Nginx proxy configuration missing `proxy_http_version 1.1`, `proxy_set_header Upgrade $http_upgrade`, or `proxy_set_header Connection 'upgrade'`. `Upgrade` and `Connection` are hop-by-hop headers, so Nginx does not forward them to the backend unless these directives are present
- Load balancer configured for HTTP/2-only mode. HTTP/2 has no `Upgrade` header mechanism, so the HTTP/1.1-style WebSocket handshake cannot pass through it
- Proxy `proxy_read_timeout` set too short for long-lived WebSocket connections: Nginx closes a connection that has carried no data for the timeout period, even though both ends still consider it open
- CDN not supporting WebSocket, or the WebSocket feature being disabled in the CDN configuration (e.g., the Cloudflare WebSockets setting turned off for the zone)
- Multiple proxy hops (e.g., CDN → Load Balancer → Nginx) where one intermediate layer strips the Upgrade header, breaking the protocol negotiation chain
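
To make the header-stripping causes above concrete, here is a minimal Python sketch of how a chain of proxies treats hop-by-hop headers. It is a simplified model, not how any particular proxy is implemented; `forward_through_hop` and `forward_through_chain` are hypothetical names, and each hop is reduced to a single flag saying whether it re-adds the upgrade headers.

```python
# Headers that apply to a single connection and are not forwarded by default.
HOP_BY_HOP = frozenset({
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
})


def forward_through_hop(headers, websocket_aware):
    """Return the headers one proxy hop would send upstream (simplified model)."""
    incoming = {name.lower(): value for name, value in headers.items()}
    # Every hop drops hop-by-hop headers by default.
    outgoing = {n: v for n, v in incoming.items() if n not in HOP_BY_HOP}
    # A WebSocket-aware hop re-adds them, but only if they actually arrived.
    if websocket_aware and incoming.get("upgrade"):
        outgoing["upgrade"] = incoming["upgrade"]
        outgoing["connection"] = "upgrade"
    return outgoing


def forward_through_chain(headers, chain):
    """Apply forward_through_hop once per hop; chain is a list of booleans."""
    for websocket_aware in chain:
        headers = forward_through_hop(headers, websocket_aware)
    return headers


client_headers = {
    "Host": "example.com",
    "Connection": "Upgrade",
    "Upgrade": "websocket",
}
# CDN and Nginx are configured correctly, the load balancer in between is not.
print(forward_through_chain(client_headers, [True, False, True]))
```

In this model a single unaware hop is enough to lose the `Upgrade` header for good, because later hops have nothing left to forward.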

Diagnosis

**Step 1 — Check the WebSocket handshake request and response headers**

Open DevTools → Network tab → filter by WS → click the failed connection:
```
# Expected request headers for WebSocket upgrade:
GET /ws/chat/ HTTP/1.1
Host: example.com
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

# Expected response for successful upgrade:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

# Failing: gets 502 instead of 101
```
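
When a 101 response does come back, the `Sec-WebSocket-Accept` value can be checked against the request key. The sketch below follows the derivation in RFC 6455 (SHA-1 of the key concatenated with a fixed GUID, then base64); the function name `expected_accept` is only illustrative.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept value a server should return."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# The key from the request headers shown above.
print(expected_accept("dGhlIHNhbXBsZSBub25jZQ=="))
```

If a proxy or cache in the path returns a mismatched value, the browser rejects the handshake even though the status code is 101.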

**Step 2 — Test the WebSocket handshake directly with curl**

```bash
# Test through Nginx proxy:
curl -v -N \
-H 'Connection: Upgrade' \
-H 'Upgrade: websocket' \
-H 'Sec-WebSocket-Version: 13' \
-H 'Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==' \
http://example.com/ws/chat/
# 502 here → proxy issue. 101 here → proxy is fine, client code is the issue.

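# Test directly on the backend (bypassing proxy).
# Send the same handshake headers, otherwise the backend may reject the request:
curl -v -N \
-H 'Connection: Upgrade' \
-H 'Upgrade: websocket' \
-H 'Sec-WebSocket-Version: 13' \
-H 'Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==' \
http://127.0.0.1:8000/ws/chat/
# 101 here → backend works, proxy is misconfigured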
```
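
Where curl is not available, the same probe can be approximated with the Python standard library. This is a rough sketch that assumes plain HTTP (no TLS) and reads only the first response chunk; `build_handshake_request`, `parse_status_code`, and `probe_status` are hypothetical helper names.

```python
import base64
import os
import socket


def build_handshake_request(host, path):
    """Build a minimal WebSocket upgrade request as raw bytes."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: Upgrade",
        "Upgrade: websocket",
        "Sec-WebSocket-Version: 13",
        f"Sec-WebSocket-Key: {key}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_code(response):
    """Extract the numeric status code from the first response line."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or not parts[1].isdigit():
        raise ValueError(f"not an HTTP status line: {status_line!r}")
    return int(parts[1])


def probe_status(host, port, path, timeout=5.0):
    """Send the handshake over plain TCP and return the status code."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_handshake_request(host, path))
        return parse_status_code(sock.recv(4096))


if __name__ == "__main__":
    # Compare the proxy and the backend, as in the curl commands above.
    for host, port in (("example.com", 80), ("127.0.0.1", 8000)):
        try:
            print(host, port, probe_status(host, port, "/ws/chat/"))
        except (OSError, ValueError) as exc:
            print(host, port, "probe failed:", exc)
```

A 502 from the first target and a 101 from the second points at the proxy, matching the interpretation of the curl output.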

**Step 3 — Check Nginx proxy configuration for Upgrade headers**

```bash
ssh server
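# -R follows the symlinks in sites-enabled (GNU grep's plain -r skips them); -i ignores case
grep -Ri 'upgrade\|websocket' /etc/nginx/sites-enabled/ 2>/dev/null
# If nothing found → WebSocket proxy headers are missing
grep -R 'proxy_read_timeout' /etc/nginx/sites-enabled/
# If unset, Nginx uses the 60s default, which may cause idle WebSocket connections to drop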
```
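
The grep check can also be scripted. The following is a naive sketch that scans config text for the three directives used in Fix 1; it uses regular expressions rather than a real Nginx parser, ignores `include` files, and the names `REQUIRED` and `missing_directives` are made up for this example.

```python
import re

# Directive patterns taken from the Nginx fix in this article.
REQUIRED = {
    "proxy_http_version 1.1": r"proxy_http_version\s+1\.1\s*;",
    "proxy_set_header Upgrade": r"proxy_set_header\s+Upgrade\s+\$http_upgrade\s*;",
    "proxy_set_header Connection": r"proxy_set_header\s+Connection\s+\S+\s*;",
}


def missing_directives(config_text):
    """Return the names of required WebSocket directives not found in the text."""
    # Drop comments so a commented-out directive does not count as present.
    uncommented = re.sub(r"#.*", "", config_text)
    return [name for name, pattern in REQUIRED.items()
            if not re.search(pattern, uncommented)]


sample = """
location /ws/ {
    proxy_pass http://127.0.0.1:8000;
    # proxy_set_header Upgrade $http_upgrade;
}
"""
print(missing_directives(sample))
```

An empty list means all three directives appear somewhere in the text; it does not prove they sit in the right `location` block.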

**Step 4 — Check the Nginx error log for specific errors**

```bash
sudo tail -f /var/log/nginx/error.log
# Look for:
# upstream sent invalid header while reading response header from upstream
# connect() failed (111: Connection refused) while connecting to upstream
# recv() failed (104: Connection reset by peer)
```
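
As a reading aid for the three messages above, this sketch maps each one to a likely cause. The hints are common interpretations rather than anything Nginx reports itself, and the sample log line is invented for illustration.

```python
# (substring to look for, likely cause) pairs; the causes are assumptions.
HINTS = [
    ("connect() failed (111: Connection refused)",
     "nothing is listening on the proxy_pass address"),
    ("recv() failed (104: Connection reset by peer)",
     "the backend closed the connection during the handshake"),
    ("upstream sent invalid header",
     "the backend reply could not be parsed as an HTTP response"),
]


def classify(log_line):
    """Return a likely cause for an Nginx error-log line, or None if unknown."""
    for needle, hint in HINTS:
        if needle in log_line:
            return hint
    return None


# Invented sample line in the general shape of an Nginx error-log entry.
sample = ("2024/01/01 12:00:00 [error] 1234#0: *1 connect() failed "
          "(111: Connection refused) while connecting to upstream")
print(classify(sample))
```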

**Step 5 — Verify CDN WebSocket support**

```bash
# Cloudflare: Check if WebSockets are enabled
# Dashboard → Your Domain → Network → WebSockets → Enabled
# Or check via API:
curl -s -X GET \
'https://api.cloudflare.com/client/v4/zones/{zone_id}/settings/websockets' \
-H 'Authorization: Bearer {cf_token}' | jq '.result.value'
# 'on' = enabled, 'off' = WebSocket upgrade will fail at CDN
```

Resolution

**Fix 1: Add WebSocket proxy headers to Nginx configuration**

```nginx
# /etc/nginx/sites-available/example.com
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}

server {
    listen 80;
    server_name example.com;

    location /ws/ {
        proxy_pass http://127.0.0.1:8000;
        proxy_http_version 1.1;                          # Required for WebSocket
        proxy_set_header Upgrade $http_upgrade;          # Forward the Upgrade header
        proxy_set_header Connection $connection_upgrade; # 'upgrade' or 'close'
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_read_timeout 3600s;                        # 1 hour for long-lived connections
        proxy_send_timeout 3600s;
    }

    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
    }
}
```
```bash
sudo nginx -t && sudo systemctl reload nginx
```
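
The `map` block in the Nginx fix is easy to misread, so here is its logic restated as a small Python function. This only mirrors the two branches of the map for explanation; the function name is hypothetical and nothing here runs inside Nginx.

```python
def connection_upgrade(http_upgrade):
    """Mirror the map: empty Upgrade header gives 'close', anything else 'upgrade'."""
    return "upgrade" if http_upgrade else "close"


# A WebSocket handshake carries "Upgrade: websocket"; a normal request has no Upgrade header.
for value in ("websocket", ""):
    print(repr(value), "->", connection_upgrade(value))
```

The effect is that ordinary HTTP requests hitting the same location still get `Connection: close` toward the backend, while handshake requests get `Connection: upgrade`.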

**Fix 2: AWS ALB — enable WebSocket stickiness**

```bash
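# AWS ALB natively supports WebSocket upgrades, so no protocol-specific config is needed
# Set the target group protocol to what the backend actually serves
# (HTTP when TLS terminates at the ALB)
# Route WebSocket paths to a dedicated target group:
aws elbv2 create-rule --listener-arn <listener-arn> \
--conditions '[{"Field":"path-pattern","Values":["/ws/*"]}]' \
--actions '[{"Type":"forward","TargetGroupArn":"<ws-target-group>"}]' \
--priority 10
# Enable sticky sessions if your WebSocket state is not shared between targets:
aws elbv2 modify-target-group-attributes --target-group-arn <ws-target-group> \
--attributes Key=stickiness.enabled,Value=true Key=stickiness.type,Value=lb_cookie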
```

**Fix 3: Cloudflare — enable WebSockets for the zone**

```bash
curl -X PATCH \
'https://api.cloudflare.com/client/v4/zones/{zone_id}/settings/websockets' \
-H 'Authorization: Bearer {cf_token}' \
-H 'Content-Type: application/json' \
--data '{"value": "on"}'
```

**Fix 4: Add WebSocket keep-alive pings to prevent idle timeout**

```python
# Django Channels: send periodic keep-alive messages from consumers.py
import asyncio
from channels.generic.websocket import AsyncWebsocketConsumer

class ChatConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        await self.accept()
        # Run the keep-alive loop in the background for this connection
        self.ping_task = asyncio.create_task(self.keep_alive())

    async def keep_alive(self):
        while True:
            await asyncio.sleep(30)  # Ping every 30s to prevent idle timeout
            await self.send(text_data='{"type": "ping"}')

    async def disconnect(self, close_code):
        # Stop the loop once the client has gone away
        self.ping_task.cancel()
```

Prevention

- **Include WebSocket proxy configuration in your Nginx template from day one** — the `map $http_upgrade $connection_upgrade` block and `proxy_set_header Upgrade` directives should be in every reverse proxy config for WebSocket-capable apps
- **Set `proxy_read_timeout` well above the WebSocket idle timeout** — WebSocket connections can be idle for minutes between messages; the default 60s timeout will cause spurious disconnects
- **Test the full WebSocket path (client → CDN → proxy → backend) in staging** before production — use `wscat` or a simple test client to verify the 101 handshake succeeds through every layer
- **Implement application-level WebSocket heartbeats** (a ping message every 20-30 seconds, answered by the peer) so intermediate proxies and NAT gateways keep the connection alive and dropped connections are detected within a few missed intervals rather than at the next send
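
The detection half of the heartbeat advice can be sketched as a small tracker that records when the last reply arrived. This is an illustrative design rather than part of any WebSocket library; `HeartbeatMonitor`, its parameters, and the two-missed-intervals threshold are assumptions for the example.

```python
import time


class HeartbeatMonitor:
    """Track heartbeat replies and report when a connection looks dead."""

    def __init__(self, interval=30.0, max_missed=2, clock=time.monotonic):
        self.interval = interval      # seconds between pings
        self.max_missed = max_missed  # missed replies tolerated before giving up
        self._clock = clock           # injectable so the logic can be tested
        self._last_reply = clock()

    def record_reply(self):
        """Call whenever a heartbeat reply (or any message) arrives."""
        self._last_reply = self._clock()

    def is_dead(self):
        """True once more than max_missed intervals passed without a reply."""
        silence = self._clock() - self._last_reply
        return silence > self.interval * self.max_missed


monitor = HeartbeatMonitor(interval=30.0, max_missed=2)
monitor.record_reply()
print(monitor.is_dead())
```

With a 30-second interval and two tolerated misses, a silent connection is flagged after 60 seconds, so the interval should stay comfortably below the shortest idle timeout in the proxy chain.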
