Topic: mcp server request smuggling security

MCP server request smuggling security — HTTP/1.1 CL-TE ambiguity, chunked encoding, pipelined isolation

HTTP request smuggling arises when two systems that handle the same byte stream, typically a reverse proxy and the MCP server backend behind it, disagree on where one HTTP request ends and the next begins. The disagreement is caused by the two body-length signaling mechanisms in HTTP/1.1: Content-Length and Transfer-Encoding: chunked. When both are present, RFC 7230 says Transfer-Encoding overrides Content-Length and that such a message ought to be handled as an error, but proxies and servers do not all implement this rule the same way. The discrepancy lets an attacker "smuggle" the beginning of a second request inside the body of the first: the proxy sees one request where the backend sees two, which can bypass authentication, poison request queues, or expose another session's response.

Quick reference

CL-TE ambiguity: Reject any request that contains both Content-Length and Transfer-Encoding headers with a 400 response — do not attempt to resolve the conflict.
Pipelining: HTTP/1.1 pipelining lets multiple requests share a TCP connection sequentially; disable keep-alive or enforce strict sequential processing to prevent session state interleaving.
Reverse proxy hygiene: Front proxies must normalize hop-by-hop headers and strip attacker-injected Transfer-Encoding obfuscations before forwarding to the MCP backend.
WebSocket upgrades: The Upgrade header in HTTP/1.1 can be injected into a smuggled request prefix; validate upgrade headers strictly and reject unexpected Connection header values.
HTTP/2: Migrating the MCP transport to HTTP/2 end to end eliminates CL-TE ambiguity, because HTTP/2 uses a single binary framing layer with no competing body-length mechanisms; a proxy that downgrades to HTTP/1.1 toward the backend reintroduces the risk.

1. CL-TE and TE-CL ambiguity — the HTTP/1.1 request smuggling root cause

RFC 7230 §3.3.3 is unambiguous: if a request has both Content-Length and Transfer-Encoding, servers must ignore Content-Length and use Transfer-Encoding to determine message body length. Despite this, many reverse proxies and origin servers implement the rule differently. A CL-TE smuggle exploits a proxy that prioritizes Content-Length (so it forwards exactly N bytes) while the backend prioritizes Transfer-Encoding (so it reads until a zero-length chunk). The "extra" bytes the backend reads after the zero chunk are the prefix of the next request — which the backend processes as a new, attacker-authored request.

The TE-CL variant is the mirror image: the proxy uses Transfer-Encoding and the backend uses Content-Length. Either way, the result is one physical TCP request being parsed as two logical HTTP requests with different boundaries at the proxy and at the backend.

# CL-TE smuggle: proxy uses Content-Length (51), backend uses chunked encoding.
# Proxy forwards all 51 bytes of body; backend reads the chunk header "0\r\n\r\n"
# (5 bytes = the zero-length terminator) and stops, leaving "GET /admin..." in its
# buffer as the start of the next request it will process.

POST /tools/call HTTP/1.1
Host: mcp.example.com
Content-Length: 51
Transfer-Encoding: chunked

0

GET /admin HTTP/1.1
Host: mcp.example.com

# TE-CL smuggle: proxy uses Transfer-Encoding; backend uses Content-Length.
# Proxy forwards the full chunked body; backend reads only the 3 bytes given by
# Content-Length ("8\r\n"), leaving "SMUGGLED..." as a new request prefix.

POST /tools/call HTTP/1.1
Host: mcp.example.com
Content-Length: 3
Transfer-Encoding: chunked

8
SMUGGLED
0

In an MCP context, a successful smuggle against a shared MCP proxy can cause tool-call responses belonging to one session to be delivered to another session, or can inject attacker-controlled tool-call arguments into a victim session's request queue — a session hijacking primitive without requiring any credential theft.
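The boundary disagreement described above can be reproduced without a network. The sketch below is illustrative only: `chunkedBodyEnd` is a hypothetical helper written for this example (not part of any library), it assumes ASCII input so that string length equals byte length, and it ignores chunked trailers. It compares how many bytes a Content-Length parser and a chunked parser would each treat as the body.

```typescript
// Hypothetical helper: number of bytes a chunked-decoding parser consumes from
// `raw` before it considers the body finished, or -1 if the body is malformed
// or incomplete. Assumes ASCII input and no trailer fields.
function chunkedBodyEnd(raw: string): number {
  let pos = 0;
  for (;;) {
    const lineEnd = raw.indexOf("\r\n", pos);
    if (lineEnd === -1) return -1;
    // Drop any chunk extension (";name=value") before reading the hex size.
    const sizeText = raw.slice(pos, lineEnd).replace(/;.*$/, "");
    if (!/^[0-9a-fA-F]+$/.test(sizeText)) return -1;
    const size = parseInt(sizeText, 16);
    pos = lineEnd + 2;
    if (size === 0) {
      // Zero-length chunk: the body ends after the closing CRLF.
      return raw.startsWith("\r\n", pos) ? pos + 2 : -1;
    }
    // Skip the chunk data and require the CRLF that terminates it.
    if (raw.slice(pos + size, pos + size + 2) !== "\r\n") return -1;
    pos += size + 2;
  }
}

// CL-TE: the proxy trusts a Content-Length that covers the whole body,
// the backend stops at the zero-length chunk.
const body = "0\r\n\r\nGET /admin HTTP/1.1\r\nHost: mcp.example.com\r\n\r\n";
const proxyView = body.length;             // bytes forwarded under Content-Length framing
const backendView = chunkedBodyEnd(body);  // bytes consumed under chunked framing
const leftover = body.slice(backendView);  // what the backend parses as the next request

// TE-CL: the proxy forwards the full chunked body, the backend trusts Content-Length: 3.
const teClBody = "8\r\nSMUGGLED\r\n0\r\n\r\n";
const teClProxyView = chunkedBodyEnd(teClBody);
const teClLeftover = teClBody.slice(3);

console.log({ proxyView, backendView, leftover, teClProxyView, teClLeftover });
```

Any non-empty leftover is the smuggled prefix: bytes that one hop counted as body and the other will parse as a new request.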

2. Rejecting ambiguous requests in Express

The correct defense at the application layer is to reject any request that contains both Content-Length and Transfer-Encoding outright with a 400 Bad Request. Do not attempt to resolve the ambiguity — the RFC-correct resolution (prefer TE) is not universally implemented by front proxies, so the only safe response is rejection. This must be applied as early middleware, before any body parser runs:

import express, { Request, Response, NextFunction } from 'express';

const app = express();

// Reject CL-TE and TE-CL ambiguity as the very first middleware.
// Must run before body parsers (express.json, express.raw, etc.)
app.use((req: Request, res: Response, next: NextFunction) => {
  const hasContentLength = 'content-length' in req.headers;
  const hasTransferEncoding = 'transfer-encoding' in req.headers;

  if (hasContentLength && hasTransferEncoding) {
    res.status(400).json({
      error: 'Ambiguous request: Content-Length and Transfer-Encoding both present',
      code: 'CL_TE_CONFLICT',
    });
    return;
  }

  // Also reject obfuscated Transfer-Encoding values used to bypass proxy parsing.
  // Proxies often strip "Transfer-Encoding: chunked" but may pass
  // "Transfer-Encoding: chunked, identity" or "Transfer-Encoding: xchunked".
  // Coding names are case-insensitive, so normalize before comparing.
  const te = req.headers['transfer-encoding'];
  const teToken = te?.trim().toLowerCase();
  if (te !== undefined && teToken !== 'chunked' && teToken !== 'identity') {
    res.status(400).json({
      error: `Unrecognized Transfer-Encoding value: ${te}`,
      code: 'INVALID_TRANSFER_ENCODING',
    });
    return;
  }

  next();
});

// Body parser runs only after the ambiguity check passes
app.use(express.json({ limit: '1mb' }));

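// MCP tool endpoint (authenticate and mcpToolCallHandler are application-specific
// handlers assumed to be defined elsewhere)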
app.post('/tools/call', authenticate, mcpToolCallHandler);

In addition to rejecting the ambiguous request, log the source IP and the session ID when the check fires. In a shared MCP deployment, a burst of CL-TE conflict rejections from a single source is a strong signal of active smuggling probe activity and warrants rate-limiting or blocking that source.
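One way to turn those rejections into a signal is a sliding-window counter per source. The sketch below is a minimal in-memory version under stated assumptions: the class name `ConflictTracker`, the 60-second window, and the threshold of 5 are all hypothetical choices for illustration, and a multi-instance deployment would need a shared store instead of a local Map.

```typescript
// Hypothetical sliding-window counter for CL-TE conflict rejections per source IP.
class ConflictTracker {
  private readonly hits = new Map<string, number[]>();
  private readonly windowMs: number;
  private readonly threshold: number;

  constructor(windowMs: number = 60_000, threshold: number = 5) {
    this.windowMs = windowMs;
    this.threshold = threshold;
  }

  // Records one rejection and returns true when the source has reached the
  // threshold within the window (i.e. it should be rate-limited or blocked).
  record(sourceIp: string, now: number = Date.now()): boolean {
    const recent = (this.hits.get(sourceIp) ?? []).filter(
      (timestamp) => now - timestamp < this.windowMs,
    );
    recent.push(now);
    this.hits.set(sourceIp, recent);
    return recent.length >= this.threshold;
  }
}

const tracker = new ConflictTracker(60_000, 3);
const first = tracker.record("203.0.113.7", 1_000);
const second = tracker.record("203.0.113.7", 2_000);
const third = tracker.record("203.0.113.7", 3_000);        // third hit inside the window
const otherSource = tracker.record("198.51.100.9", 3_000); // unrelated source
const afterWindow = tracker.record("203.0.113.7", 120_000); // earlier hits have expired

console.log({ first, second, third, otherSource, afterWindow });
```

In the rejection branch of the middleware, the call would look like `tracker.record(req.ip ?? 'unknown')`, with the session ID written to the log line alongside it.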

3. HTTP pipelining risk with MCP sessions

HTTP/1.1 pipelining allows a client to send multiple requests over a single TCP connection without waiting for each response. The server must respond to pipelined requests in order — but the association between request and response is entirely positional, not tagged. If the server processes pipelined requests from different sessions on the same connection (via a multiplexing proxy), a session context leak becomes possible: request N's session state influences the processing of request N+1.

MCP session state — the active tool list, resource subscriptions, sampling context, and authentication credentials — is especially sensitive. An MCP server that processes pipelined requests from a shared proxy connection may mix session state between callers if the tool handler reads from a connection-scoped store rather than a request-scoped one.

The safest mitigation at the MCP server level is to disable HTTP keep-alive, forcing each MCP exchange onto its own TCP connection and eliminating the shared-connection attack surface entirely. The performance trade-off is worth it unless the MCP server is on a high-throughput path where connection setup cost dominates:

import http from 'node:http';
import express from 'express';

const app = express();
// ... middleware and routes ...

const server = http.createServer(app);

// Do not set server.keepAliveTimeout = 0 for this purpose: in Node.js a value of 0
// disables the keep-alive idle timeout, so idle connections stay open indefinitely.
// Instead, cap each socket at a single request. Node.js then sets Connection: close
// on the response and answers any further request on that socket with
// 503 Service Unavailable, which rules out pipelining.
// (node:http exposes maxRequestsPerSocket from Node.js 16.10.0 onward)
server.maxRequestsPerSocket = 1;

server.listen(3000, () => {
  console.log('MCP HTTP server listening on :3000 (keep-alive disabled)');
});

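// Alternative: set Connection: close on every response to signal to proxies
// and clients not to reuse the connection. Register this middleware before the
// routes (it is shown last here only for illustration), or it will not run for them.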
app.use((_req, res, next) => {
  res.set('Connection', 'close');
  next();
});

If disabling keep-alive is not viable, ensure all session state is stored in a per-request context (e.g., Express res.locals) and never in connection-scoped or module-scoped variables. Use async-context tracking (node:async_hooks AsyncLocalStorage) to bind session state to the request lifecycle rather than to the TCP connection or module scope.
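As a minimal sketch of that request-scoped binding, the example below uses AsyncLocalStorage from node:async_hooks. The `SessionContext` shape, `runInSession`, `currentSession`, and this `handleToolCall` are hypothetical names invented for the illustration; the point is only that two overlapping calls each read their own context, even though nothing is passed explicitly and nothing is stored at module or connection scope.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical per-request session context.
interface SessionContext {
  sessionId: string;
  tools: string[];
}

const sessionStore = new AsyncLocalStorage<SessionContext>();

// Reads the context bound to the current async call chain.
function currentSession(): SessionContext {
  const ctx = sessionStore.getStore();
  if (ctx === undefined) throw new Error("no session bound to this request");
  return ctx;
}

// Hypothetical tool handler: it awaits (so other requests can interleave) and
// only then reads the session, which is where connection-scoped state would leak.
async function handleToolCall(toolName: string, delayMs: number): Promise<string> {
  await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  const { sessionId, tools } = currentSession();
  return `${sessionId}:${toolName}:${tools.includes(toolName) ? "ok" : "denied"}`;
}

// Binds a context to everything that runs inside fn, including awaited work.
function runInSession<T>(ctx: SessionContext, fn: () => Promise<T>): Promise<T> {
  return sessionStore.run(ctx, fn);
}

const results = Promise.all([
  runInSession({ sessionId: "A", tools: ["search"] }, () => handleToolCall("search", 20)),
  runInSession({ sessionId: "B", tools: [] }, () => handleToolCall("search", 5)),
]);

results.then((values) => console.log(values));
```

In Express, the same binding can be done in a middleware that calls `sessionStore.run(ctx, next)` once the session has been authenticated.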

4. Reverse proxy awareness and hop-by-hop header hygiene

A common MCP deployment topology is: client → Nginx or Caddy (TLS termination + rate limiting) → Node.js MCP server. Request smuggling vulnerabilities most often arise at the boundary between the proxy and the origin, not at either end independently. The CL-TE check in the Express middleware above defends the origin — but the proxy must also be configured to not silently resolve the CL-TE conflict before forwarding.

Nginx's default behavior in proxy_pass mode is to rewrite requests to HTTP/1.0 or to strip Transfer-Encoding: chunked and substitute Content-Length, which prevents the classic CL-TE smuggle but can hide obfuscated TE headers from the origin. The critical configuration is to ensure Nginx strips all hop-by-hop headers — including Transfer-Encoding, Connection, and Keep-Alive — from forwarded requests, so the origin server sees only well-formed HTTP/1.1 with no ambiguous framing. The X-Request-Id header injected by Nginx threads through to the MCP server's access log, enabling trace-back of smuggled requests to the originating TCP connection and client IP:

# nginx.conf — MCP reverse proxy configuration

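upstream mcp_backend {
    server 127.0.0.1:3000;
    # No "keepalive" directive here: without it Nginx does not pool upstream
    # connections, so each proxied request uses a fresh backend connection.
    # (The directive takes a positive connection count; "keepalive 0" is
    # expected to fail configuration validation as an invalid value.)
}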

server {
    listen 443 ssl http2;
    server_name mcp.example.com;

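    # Strip hop-by-hop and potentially dangerous headers before forwarding
    proxy_set_header Connection "";         # removes hop-by-hop Connection header
    # Do not override Transfer-Encoding here: Nginx sets the request framing
    # headers (Content-Length or Transfer-Encoding) itself when it forwards the
    # body, and overriding them risks headers that disagree with the bytes sent.
    proxy_set_header X-Request-Id $request_id;  # traceability across proxy hops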

    # Force HTTP/1.1 to the backend (Nginx handles chunked <-> CL translation)
    proxy_http_version 1.1;

    location /tools/ {
        proxy_pass http://mcp_backend;

        # Prevent Nginx from buffering request bodies — important for streaming MCP
        proxy_request_buffering off;

        # Hard limit on request size to prevent body amplification
        client_max_body_size 2m;
    }
}

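When a smuggled request surfaces as an unexpected tool call in your logs, the X-Request-Id value (which Nginx generates as a unique 32-character hex string per request via $request_id) lets you correlate the surrounding traffic back to the physical TCP connection. Log the header value from both the Nginx access log and the Express request handler, and alert on tool calls that arrive without a valid X-Request-Id: those either bypassed the proxy entirely or were smuggled inside another request's body, where Nginx never saw them as a separate request and so never stamped them. Because an attacker can write a forged header into a smuggled prefix, also check that each value seen by Express appears in the Nginx access log.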
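The origin-side half of that check can be sketched as a small classifier. It assumes the proxy forwards Nginx's `$request_id`, which is 32 lowercase hexadecimal characters; `classifyRequestId` and its three result labels are hypothetical names for this example, and the pattern would need adjusting for any other ID format.

```typescript
// Nginx's $request_id is 16 random bytes rendered as 32 lowercase hex characters.
const NGINX_REQUEST_ID = /^[0-9a-f]{32}$/;

type RequestIdStatus = "ok" | "missing" | "malformed";

// Hypothetical classifier for the X-Request-Id header as Node.js exposes it
// (a string, or undefined when the header is absent).
function classifyRequestId(header: string | string[] | undefined): RequestIdStatus {
  if (header === undefined) return "missing";
  if (Array.isArray(header)) return "malformed"; // repeated header: treat as suspicious
  return NGINX_REQUEST_ID.test(header) ? "ok" : "malformed";
}

const stamped = classifyRequestId("9f86d081884c7d659a2feaa0c55ad015");
const absent = classifyRequestId(undefined);
const forged = classifyRequestId("not-a-request-id");

console.log({ stamped, absent, forged });
```

A format check alone cannot detect a well-formed forged value, which is why the result is best treated as an alerting signal and cross-checked against the proxy's access log.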

5. Migrating to HTTP/2 to eliminate the vulnerability class

HTTP/2 does not have a Content-Length vs. Transfer-Encoding ambiguity. HTTP/2 uses a binary framing layer where each frame carries an explicit length field in its frame header, and DATA frames on a stream carry the body. There is no chunked transfer encoding in HTTP/2, and if a request includes a Transfer-Encoding header it is a stream error. CL-TE request smuggling is structurally impossible in a pure HTTP/2 deployment.
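To make the framing difference concrete, the sketch below decodes the fixed 9-byte HTTP/2 frame header: a 24-bit payload length, an 8-bit type, an 8-bit flags field, and a reserved bit followed by a 31-bit stream identifier. `parseFrameHeader` is a hypothetical helper written for this illustration and is not part of node:http2.

```typescript
interface FrameHeader {
  length: number;   // payload length in bytes (24 bits)
  type: number;     // 0x0 = DATA, 0x1 = HEADERS, ...
  flags: number;    // for DATA frames, 0x1 = END_STREAM
  streamId: number; // 31 bits; the top bit is reserved and ignored
}

// Hypothetical helper: decodes the 9-byte header that precedes every HTTP/2 frame.
function parseFrameHeader(bytes: Uint8Array): FrameHeader {
  if (bytes.byteLength < 9) throw new Error("an HTTP/2 frame header is 9 bytes");
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return {
    length: (view.getUint8(0) << 16) | view.getUint16(1), // big-endian 24-bit value
    type: view.getUint8(3),
    flags: view.getUint8(4),
    streamId: view.getUint32(5) & 0x7fffffff,             // mask off the reserved bit
  };
}

// A DATA frame header: 5-byte payload, END_STREAM set, on stream 3.
const frame = parseFrameHeader(
  new Uint8Array([0x00, 0x00, 0x05, 0x00, 0x01, 0x00, 0x00, 0x00, 0x03]),
);

console.log(frame);
```

Because every frame states its own length and stream, a receiver never has to infer where a body ends from header fields inside the message.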

The caveat is HTTP/2 downgrade: if Nginx speaks HTTP/2 to clients but HTTP/1.1 to the backend (the most common configuration), the proxy must faithfully translate semantics. The most robust fix is to run HTTP/2 end-to-end between the MCP client, the proxy, and the origin. The example below shows a minimal node:http2 server for an MCP tool endpoint, with a migration path from HTTP/1.1:

import http2 from 'node:http2';
import fs from 'node:fs';

// Express does not natively support http2: use a compatibility shim
// or switch to a framework with native h2 support (e.g., Fastify with its http2 option).
// This example uses the low-level node:http2 API directly.

const server = http2.createSecureServer({
  key: fs.readFileSync('./tls/key.pem'),
  cert: fs.readFileSync('./tls/cert.pem'),
  // ALPN negotiation: advertise h2 only; reject http/1.1 fallback
  allowHTTP1: false,
});

server.on('stream', (stream, headers) => {
  const method = headers[':method'];
  const path   = headers[':path'];

  // HTTP/2 uses pseudo-headers (:method, :path) instead of a request line
  if (method !== 'POST' || path !== '/tools/call') {
    stream.respond({ ':status': 404 });
    stream.end();
    return;
  }

  // Transfer-Encoding is not valid in HTTP/2 — reject if present (malformed client)
  if (headers['transfer-encoding']) {
    stream.respond({ ':status': 400 });
    stream.end(JSON.stringify({ error: 'Transfer-Encoding not valid in HTTP/2' }));
    return;
  }

  // Collect DATA frames into a body buffer
  const chunks: Buffer[] = [];
  stream.on('data', (chunk: Buffer) => chunks.push(chunk));
  stream.on('end', () => {
    const body = Buffer.concat(chunks).toString('utf8');
    let parsed: unknown;
    try {
      parsed = JSON.parse(body);
    } catch {
      stream.respond({ ':status': 400 });
      stream.end(JSON.stringify({ error: 'Invalid JSON body' }));
      return;
    }

    const response = handleToolCall(parsed);
    stream.respond({ ':status': 200, 'content-type': 'application/json' });
    stream.end(JSON.stringify(response));
  });
});

server.listen(3000, () => {
  console.log('MCP HTTP/2 server on :3000 — CL-TE smuggling structurally impossible');
});

// Migration path from HTTP/1.1:
// 1. Deploy the HTTP/2 listener on a new port (e.g., 3001) alongside the existing one.
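// 2. Point the proxy's upstream at the new listener over HTTP/2. Verify that your
//    proxy version can speak HTTP/2 to a backend: the http2 directive added in
//    Nginx 1.25.1 configures client-facing listeners, and grpc_pass (which does
//    speak HTTP/2 upstream) is designed for gRPC traffic. The server above uses
//    TLS, so cleartext h2c does not apply to it.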
// 3. Run both listeners in parallel behind a feature flag per client.
// 4. Migrate MCP clients to the HTTP/2 endpoint incrementally.
// 5. Retire the HTTP/1.1 listener once all clients have migrated.

function handleToolCall(_body: unknown): object {
  // ... MCP tool dispatch logic ...
  return { result: 'ok' };
}
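For deployments that still downgrade HTTP/2 to HTTP/1.1 at the proxy, the translation step is where smuggling can re-enter. The sketch below shows the kind of validation a downgrading hop could apply before re-serializing a request; `downgradeProblem` and the header set are hypothetical and written for this example, not taken from Nginx or any other proxy. It rejects connection-specific headers, control characters that would split an HTTP/1.1 header block, and a content-length that disagrees with the DATA bytes actually received.

```typescript
type H2Headers = Record<string, string>;

// Headers that are connection-specific in HTTP/1.1 and not allowed in HTTP/2 requests.
const CONNECTION_SPECIFIC = new Set([
  "connection",
  "keep-alive",
  "proxy-connection",
  "transfer-encoding",
  "upgrade",
]);

// Hypothetical check: returns a reason to reject the request, or null if it
// looks safe to rewrite as HTTP/1.1. dataBytes is the total DATA payload received.
function downgradeProblem(headers: H2Headers, dataBytes: number): string | null {
  for (const [name, value] of Object.entries(headers)) {
    if (CONNECTION_SPECIFIC.has(name.toLowerCase())) {
      return `connection-specific header: ${name}`;
    }
    // CR, LF, or NUL would let a value break out of its header line after rewriting.
    if (/[\r\n\0]/.test(name) || /[\r\n\0]/.test(value)) {
      return `control character in header: ${JSON.stringify(name)}`;
    }
  }
  const declared = headers["content-length"];
  if (declared !== undefined && (!/^\d+$/.test(declared) || Number(declared) !== dataBytes)) {
    return "content-length does not match DATA bytes";
  }
  return null;
}

const clean = downgradeProblem({ ":method": "POST", "content-length": "12" }, 12);
const mismatch = downgradeProblem({ ":method": "POST", "content-length": "0" }, 40);
const injected = downgradeProblem({ "x-note": "a\r\nTransfer-Encoding: chunked" }, 0);

console.log({ clean, mismatch, injected });
```

Running HTTP/2 end to end removes the need for this step; the check only matters at a hop that rewrites requests into HTTP/1.1.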

SkillAudit's Security axis probes MCP HTTP servers with CL-TE conflicting headers and observes whether the server rejects them with a 400 or processes them silently. A server that accepts both headers without rejection is flagged as High severity — the issue is a precondition for request smuggling even if the server is not currently behind a vulnerable proxy, because deployment configurations change independently of application code. Servers running on HTTP/2 with allowHTTP1: false receive a pass on this check automatically. Run a free audit at skillaudit.dev to check your MCP HTTP transport configuration.