Topic: HTTP request smuggling

MCP server HTTP request smuggling security

MCP servers deployed behind a reverse proxy are exposed to HTTP request smuggling when the proxy and the upstream disagree on where one request ends and the next begins. CL.TE and TE.CL desync let an attacker prepend arbitrary content to another user's request — bypassing authentication headers the proxy added and injecting requests that appear to come from other users.

CL.TE desync: proxy reads Content-Length, upstream reads Transfer-Encoding

In a CL.TE attack the front-end proxy (Caddy, nginx, a CDN) determines the body boundary using the Content-Length header and forwards the complete request to the upstream. The upstream Node.js server — configured to prefer Transfer-Encoding: chunked — sees the chunked framing and stops reading at the chunk terminator before the CL boundary is reached. The leftover bytes become the prefix of the next request on the same keep-alive connection.

POST /mcp/tool HTTP/1.1
Host: api.example.com
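Content-Length: 43
Transfer-Encoding: chunked

0

GET /admin/reset HTTP/1.1
X-Ignore: x

A proxy that frames by Content-Length reads 43 bytes as the complete body (the 5-byte chunk terminator 0\r\n\r\n plus the 38-byte smuggled prefix, which deliberately has no trailing line break) and forwards the whole thing. The upstream reads the chunked body, finds the terminator 0\r\n\r\n, and considers the request done. The trailing GET /admin/reset … bytes sit in its read buffer for that connection. When the next legitimate request arrives on the same connection, the upstream parses those poisoned bytes first: the victim's request line is absorbed into the X-Ignore header value, and the victim's remaining headers complete the smuggled request. The result is a GET /admin/reset executed on behalf of the next user in the queue, carrying that user's session cookie or Bearer token and any headers the proxy injected.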

The attack works because HTTP/1.1 permits both headers simultaneously. RFC 7230 §3.3.3 says Content-Length must be ignored when Transfer-Encoding is present, but the two ends of the proxy chain do not agree on which one should enforce this rule — and neither generates an error when both headers appear. The gap between their interpretations is the attack surface.
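To make the disagreement concrete, here is a minimal Python sketch of the two framing rules applied to the same body bytes. The helper names split_by_content_length and split_by_chunked are illustrative and not taken from any proxy or server, and the chunked reader is a toy that assumes well-formed input with no trailer fields.

```python
"""Two toy body readers that frame the same bytes differently (CL.TE)."""

CRLF = b"\r\n"

# Smuggled prefix with no trailing CRLF, so the next request's first line
# would be absorbed into the X-Ignore header value.
smuggled = b"GET /admin/reset HTTP/1.1\r\nX-Ignore: x"
body = b"0\r\n\r\n" + smuggled


def split_by_content_length(data: bytes, content_length: int):
    """Front-end view: the body is exactly content_length bytes."""
    return data[:content_length], data[content_length:]


def split_by_chunked(data: bytes):
    """Back-end view: read chunks until the zero-length chunk."""
    pos, decoded = 0, b""
    while True:
        line_end = data.index(CRLF, pos)
        # Ignore any chunk extension after ';' when parsing the hex size.
        size = int(data[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:
            # Skip the CRLF that closes the (assumed empty) trailer section.
            pos = data.index(CRLF, pos) + 2
            return decoded, data[pos:]
        decoded += data[pos:pos + size]
        pos += size + 2  # chunk data plus its trailing CRLF


content_length = len(body)  # computed from the bytes, never hand-counted
cl_body, cl_leftover = split_by_content_length(body, content_length)
te_body, te_leftover = split_by_chunked(body)

print("Content-Length view leaves", len(cl_leftover), "bytes unread")
print("chunked view leaves", len(te_leftover), "bytes unread")
```

The Content-Length reader consumes everything, while the chunked reader stops at the terminator and leaves the smuggled prefix unread. That leftover is what the upstream would parse as the start of the next request.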

CL.TE is particularly dangerous in MCP deployments because the proxy often injects trust headers (X-User-Id, X-Tenant, X-Internal-Token) that the upstream trusts unconditionally. A smuggled request is never inspected by the proxy as a request in its own right, yet the upstream parses it on the proxy's trusted connection, often completed by the trust headers of the victim request that follows it. The attacker gains any privilege the proxy would have granted.

TE.CL desync: proxy reads Transfer-Encoding, upstream reads Content-Length

The mirror image: the proxy honours Transfer-Encoding: chunked, reads chunks until the zero-length terminator, and forwards the request to the upstream with both headers intact. The upstream ignores Transfer-Encoding and uses the Content-Length value to determine body length. The attacker controls that value, so the upstream stops reading early and the remainder of the chunked body sits in its read buffer.

POST /mcp/tool HTTP/1.1
Host: api.example.com
Content-Length: 4
Transfer-Encoding: chunked

5c
POST /internal/token HTTP/1.1
Host: api.example.com
Content-Length: 16

{"steal": "yes"}
0

The proxy sees a single chunk of 92 bytes (0x5c = 92) followed by the zero-length terminator, and forwards the request as one unit. The upstream reads only 4 bytes, the chunk size line 5c\r\n, as the body because Content-Length: 4 is what the attacker declared in the outer headers. Everything after those 4 bytes, starting with the inner POST /internal/token, stays buffered and is parsed as the start of the next request on the same connection.

The chunk size has to equal the exact byte length of the smuggled request. If it is off by even one byte, the proxy rejects or misframes the chunked body and the smuggled bytes no longer begin at a method token. When the counts line up, the upstream parses the inner POST as a request in its own right, arriving on the proxy's trusted connection. If the attacker instead declares an inner Content-Length larger than the body supplied, the upstream fills the shortfall from the opening bytes of the next user's request, producing a composite: the attacker's request line and headers first, the victim's data appended as body. Internal token endpoints that are only supposed to be reachable from the proxy's localhost interface are suddenly reachable via this channel.
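The length bookkeeping in a TE.CL layout is easy to get wrong by hand, so it helps to compute it. A minimal Python sketch, assuming the inner request is the complete byte string that should form exactly one chunk; te_cl_lengths is a hypothetical helper name used only for illustration.

```python
from __future__ import annotations


def te_cl_lengths(inner_request: bytes) -> tuple[str, int]:
    """Return (chunk size in hex, outer Content-Length) for a TE.CL layout.

    The chunk size covers the whole inner request. The outer Content-Length
    covers only the chunk-size line, CRLF included, which is where an
    upstream that frames by Content-Length stops reading.
    """
    chunk_size_hex = format(len(inner_request), "x")
    outer_content_length = len(chunk_size_hex) + len(b"\r\n")
    return chunk_size_hex, outer_content_length


json_body = b'{"steal": "yes"}'
inner = (
    b"POST /internal/token HTTP/1.1\r\n"
    b"Host: api.example.com\r\n"
    b"Content-Length: %d\r\n"
    b"\r\n" % len(json_body)
) + json_body

chunk_hex, outer_cl = te_cl_lengths(inner)
print("chunk size line:", chunk_hex, "outer Content-Length:", outer_cl)
```

Running the same calculation in a staging probe script avoids the hand-counting mistakes that make a test silently ineffective.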

CONNECT request smuggling through established tunnels

HTTP CONNECT establishes a raw TCP tunnel through the proxy to a target host. Once the proxy returns 200 Connection Established, it forwards bytes bidirectionally without further inspection. An attacker who can issue a CONNECT request can embed a complete HTTP/1.1 request in the tunnel payload that the upstream treats as a new, fresh connection — bypassing all proxy-layer auth middleware, IP allowlisting, and header injection the proxy performs on normal requests.

CONNECT internal-service.cluster.local:80 HTTP/1.1
Host: internal-service.cluster.local:80

GET /mcp/admin/impersonate?user=root HTTP/1.1
Host: internal-service.cluster.local
Authorization: Bearer <token-from-prior-request>

After the proxy opens the tunnel, the attacker sends the embedded GET. The upstream receives it as a brand-new HTTP/1.1 request on a trusted internal socket — the proxy has already authenticated the outer CONNECT but does not re-authenticate the tunnel payload. MCP servers that expose an HTTP CONNECT endpoint for any reason are fully exposed unless the upstream explicitly rejects tunnel-borne requests from untrusted origins.

CONNECT smuggling also enables protocol confusion. Once a tunnel is open, the attacker can send WebSocket upgrade frames, raw TLS ClientHello bytes, or other protocols the upstream might handle but the proxy was never told to inspect. Administrative sidechannels (health-check endpoints, metrics endpoints, debug endpoints) accessible only over the internal network become reachable via the tunnel even when they are not on the proxy's routing table.

A related technique called "CONNECT request splitting" combines CONNECT with CL.TE: the outer CONNECT request carries a smuggled HTTP/1.1 request in its body using Content-Length desync. The proxy opens the tunnel and starts forwarding. The upstream receives the CONNECT response and then immediately sees the smuggled request — which, from its perspective, arrived on an already-trusted internal socket that the proxy authenticated. The fix is identical: either reject all CONNECT at the proxy, or enforce HTTP/2 end-to-end so that binary framing prevents the content-length ambiguity in the first place.

Proxy header sanitization checklist for MCP deployments

The following checklist covers the most common proxy-upstream header mismatches seen in MCP server audits. Apply each normalization at the proxy before requests reach the upstream application server.

Remove Content-Length when Transfer-Encoding: chunked is present. This eliminates the CL.TE surface (Caddy: header_up -Content-Length; nginx: handled automatically in HTTP mode).
Reject requests with multiple Transfer-Encoding headers. RFC 7230 §3.3.1 forbids applying chunked more than once, and parsers differ on how they merge repeated header lines, so a request with two Transfer-Encoding headers is either malformed or a smuggling probe.
Normalize Transfer-Encoding values to lowercase canonical tokens. Strip whitespace, reject unknown tokens, and refuse mixed lists like chunked, gzip where chunked is not the last entry.
Reject Content-Length values that contain non-digit characters. Parsers disagree on values such as Content-Length: 10 with a trailing space or Content-Length: 0x0a in hex.
Strip proxy-internal trust headers from client requests. Before processing, remove any X-User-Id, X-Roles, X-Internal-Token headers that clients could forge; add them fresh from authenticated session context.
Set Connection: close on error responses. If the proxy detects an ambiguous header, closing the connection prevents any buffered smuggled bytes from being parsed as the next request.
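The checklist can be expressed as a small validation routine. The following Python sketch is illustrative only: framing_violations and strip_trust_headers are hypothetical names, the canonical token set is an assumption (identity is left out here), and headers are assumed to arrive as raw (name, value) pairs before any whitespace trimming.

```python
from __future__ import annotations

import re

# Assumed canonical transfer codings; adjust to what your upstream supports.
CANONICAL_TE = {"chunked", "compress", "deflate", "gzip"}
# Trust headers named in the checklist; extend with your own.
TRUST_HEADERS = {"x-user-id", "x-roles", "x-internal-token"}
DIGITS_ONLY = re.compile(r"[0-9]+")


def framing_violations(headers: list[tuple[str, str]]) -> list[str]:
    """Return reasons to reject; an empty list means framing is unambiguous."""
    problems = []
    cl = [v for k, v in headers if k.lower() == "content-length"]
    te = [v for k, v in headers if k.lower() == "transfer-encoding"]

    if cl and te:
        problems.append("both Content-Length and Transfer-Encoding present")
    if len(te) > 1:
        problems.append("multiple Transfer-Encoding headers")
    for value in te:
        if value != value.strip() or "\t" in value:
            problems.append("Transfer-Encoding has stray whitespace")
        tokens = [t.strip() for t in value.split(",")]
        if any(t not in CANONICAL_TE for t in tokens):
            problems.append("Transfer-Encoding has unknown or non-lowercase token")
        elif "chunked" in tokens and tokens[-1] != "chunked":
            problems.append("chunked is not the final transfer coding")
    for value in cl:
        if not DIGITS_ONLY.fullmatch(value):
            problems.append("Content-Length is not a plain decimal number")
    return problems


def strip_trust_headers(headers: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Drop client-supplied trust headers before the proxy adds its own."""
    return [(k, v) for k, v in headers if k.lower() not in TRUST_HEADERS]


print(framing_violations([("Content-Length", "5"), ("Transfer-Encoding", "chunked")]))
```

A proxy plugin or middleware would return 400 and close the connection whenever the list is non-empty.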

Detecting smuggling vulnerability in your MCP server stack

Before applying mitigations, confirm whether your proxy-upstream pair is actually vulnerable. The standard test is the differential timing probe: send a CL.TE payload where the smuggled prefix is designed to make the next request time out (by appending enough bytes that the upstream waits for more body data). If the response time for a subsequent innocent request suddenly increases, the stack is vulnerable.

A safer detection method that avoids poisoning real user requests is to test in a staging environment with a known canary endpoint. Use Burp Suite's HTTP Request Smuggler extension or the open-source smuggler.py tool to send the full battery of CL.TE, TE.CL, TE.TE (obfuscated Transfer-Encoding), and HTTP/2 downgrade variants. Many MCP servers run behind CDNs that advertise HTTP/2 to the client but downgrade to HTTP/1.1 internally — the CDN-to-origin leg is often where the vulnerability lives even though the client-to-CDN leg is HTTP/2.

# Test for CL.TE and TE.CL using smuggler.py (staging environment only)
# Flags below follow the defparam/smuggler README; run --help to confirm for your copy.
python3 smuggler.py -u https://staging.example.com/mcp/tool \
  --method POST \
  --timeout 10 \
  --log smuggler.log

# What to look for (exact wording varies by tool and version):
# CL.TE probe: canary request stalls or times out → vulnerable
# TE.CL probe: canary request returns 400 or a garbled response → may be vulnerable
# TE.TE mutations: normal timing on every mutation → likely safe for this variant

Pay attention to responses that return a 400 with an unexpected body — this often means the upstream received a partial or garbled request that it rejected, which is a sign of partial desync. A 400 is better than a silently poisoned connection, but it still indicates the framing is inconsistent between the two ends.

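For MCP servers running on Node.js, also check whether the upstream uses the default http.Server or a custom parser, and how that parser treats a request carrying both Content-Length and Transfer-Encoding. Depending on version and configuration it may prefer Transfer-Encoding or reject the request outright, and lenient options such as insecureHTTPParser relax the checks. Some popular MCP framework libraries wrap undici or node-fetch with different header-handling defaults than the built-in server. Review the parser configuration explicitly rather than assuming default behaviour.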

In addition to smuggler.py, use raw netcat or socat to send hand-crafted HTTP/1.1 bytes directly to the upstream on its internal port (from a privileged network position or in your staging environment). On a single connection, send a request whose Content-Length claims 10 bytes but whose body holds only 6, immediately followed by a second request. An upstream that frames strictly by Content-Length takes the first 4 bytes of the second request as the missing body, so what is left of the second request is malformed and is typically rejected with 400. That outcome shows how bytes carry over between requests on one connection, which is the mechanism every desync variant relies on. If both requests come back with normal responses, the upstream is not framing bodies by the declared length; treat the stack as suspect until the parser configuration is understood, fixed, and retested.

# Manual test: how does the upstream frame two requests on one connection?
# (Run against staging upstream directly on internal port, not through the proxy)
# Both requests travel down the same TCP connection. The first declares
# 10 body bytes but supplies only 6 (ABCDEF).
printf 'POST /mcp/health HTTP/1.1\r\nHost: localhost\r\nContent-Length: 10\r\n\r\nABCDEFGET /mcp/health HTTP/1.1\r\nHost: localhost\r\n\r\n' \
  | nc localhost 3000

# Strict Content-Length framing: "GET " (4 bytes) completes the POST body,
# and the remainder "/mcp/health HTTP/1.1..." is rejected, usually with 400.
# Two normal responses mean the declared length was not honoured.

Mitigations: HTTP/2, keep-alive discipline, and header normalization

Enforce HTTP/2 end-to-end. HTTP/2 uses binary framing; each frame carries an explicit length and type field. There is no ambiguity between Content-Length and Transfer-Encoding because chunked encoding does not exist in HTTP/2. Smuggling requires HTTP/1.1 framing ambiguity — eliminating HTTP/1.1 on the proxy-to-upstream leg eliminates the attack surface entirely. Use h2c (HTTP/2 cleartext) on the internal leg even when TLS is terminated at the proxy:

# Caddy — force h2c (HTTP/2 cleartext) to upstream Node.js server
reverse_proxy localhost:3000 {
  transport http {
    versions h2c
  }
}

Confirm the upstream HTTP server supports HTTP/2. Node.js's built-in http2 module does; Express does not natively but can be wrapped with spdy or replaced with Fastify, which has h2 support built in.

Disable keep-alive between proxy and upstream. Smuggling injects into a subsequent request on the same TCP connection. If every request uses a fresh TCP connection, there is no accumulated inter-request state to smuggle into. The cost is connection overhead per request — acceptable for an internal leg where the latency is sub-millisecond.

# nginx: HTTP/1.0 to upstream with an explicit Connection: close on every request
# (do not also configure a keepalive directive in the upstream block)
proxy_http_version 1.0;
proxy_set_header Connection "close";

Normalize ambiguous headers at the proxy. When both Content-Length and Transfer-Encoding appear on the same request, the proxy should strip Content-Length before forwarding, following RFC 7230 §3.3.3. Caddy's header_up directive removes upstream-bound headers:

reverse_proxy localhost:3000 {
  header_up -Content-Length
}

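This prevents the upstream from ever seeing a Content-Length it can prefer over Transfer-Encoding, removing the precondition for TE.CL desync. Note that the directive as written is unconditional: it applies to every proxied request, not only those that also carry Transfer-Encoding, so confirm on staging that ordinary request bodies still reach the upstream correctly framed. For CL.TE desync, where the proxy itself is the side trusting Content-Length, reject any request that contains both headers at the proxy and return 400.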

Disable CONNECT on public-facing endpoints. Unless your MCP server genuinely needs to proxy outbound connections, return 405 on all CONNECT requests at the ingress before they reach the upstream. In Caddy:

@connect method CONNECT
respond @connect "Method Not Allowed" 405

If CONNECT is required for a specific use case (outbound proxy for tool calls), restrict it to authenticated callers and to an allowlist of target hosts — never allow CONNECT to internal cluster addresses.
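A target allowlist for CONNECT can be very small. The sketch below, in Python, assumes an exact-match allowlist of (host, port) pairs and refuses literal IP addresses altogether; ALLOWED_TARGETS, its single entry, and connect_target_allowed are hypothetical names and values chosen for illustration.

```python
import ipaddress

# Hypothetical allowlist: the only outbound targets tool calls may tunnel to.
ALLOWED_TARGETS = {("tools.example.com", 443)}


def connect_target_allowed(authority: str, allowed=ALLOWED_TARGETS) -> bool:
    """Check the host:port from a CONNECT request line against the allowlist."""
    host, sep, port = authority.rpartition(":")
    if not sep or not port.isdigit():
        return False  # CONNECT targets must carry an explicit numeric port
    host = host.strip("[]").lower()
    try:
        ipaddress.ip_address(host)
    except ValueError:
        pass  # not an IP literal, fall through to the allowlist check
    else:
        return False  # literal IPs are never allowed in this sketch
    return (host, int(port)) in allowed


print(connect_target_allowed("tools.example.com:443"))
print(connect_target_allowed("internal-service.cluster.local:80"))
```

Exact matching is deliberately strict: suffix or wildcard matching makes it easier for an internal cluster name to slip through.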

Log and alert on ambiguous header combinations. Even if you cannot immediately enforce HTTP/2 end-to-end or disable keep-alive, detect smuggling attempts in real time by logging any request where both Content-Length and Transfer-Encoding headers are present at the proxy. This should be vanishingly rare in legitimate traffic. A spike in dual-header requests from a single IP is a strong signal of active probing. Feed these logs into your SIEM and page on-call if more than 5 dual-header requests arrive from the same IP within 60 seconds.
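The alert rule above (more than 5 dual-header requests from one IP within 60 seconds) amounts to a sliding-window counter. A minimal Python sketch, assuming timestamps are supplied in seconds by the caller; the DualHeaderMonitor class is a hypothetical name, and a real deployment would more likely express this as a SIEM query.

```python
from collections import defaultdict, deque


class DualHeaderMonitor:
    """Sliding-window counter of dual-header requests per client IP."""

    def __init__(self, threshold: int = 5, window_seconds: float = 60.0):
        self.threshold = threshold
        self.window = window_seconds
        self._seen = defaultdict(deque)

    def record(self, ip: str, timestamp: float) -> bool:
        """Record one dual-header request; return True if the IP should alert."""
        events = self._seen[ip]
        events.append(timestamp)
        # Drop events that have aged out of the window.
        while events and timestamp - events[0] >= self.window:
            events.popleft()
        return len(events) > self.threshold


monitor = DualHeaderMonitor()
alerts = [monitor.record("203.0.113.7", float(t)) for t in range(6)]
print(alerts)
```

The sixth request inside the window is the first to exceed the threshold; requests from other IPs are counted separately.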

Validate Content-Length arithmetic at the application layer. As a defence-in-depth measure, have the MCP server middleware read the entire request body into a buffer and verify that Buffer.byteLength(body) matches the declared Content-Length. A mismatch — where the declared CL is larger than the actual body received — indicates the proxy terminated the body earlier than expected, which can signal a desync in progress. Return 400 and log the anomaly rather than processing a potentially corrupted request body. This check costs one buffer allocation per request but prevents partial-body processing that could confuse tool input parsers.
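The length check itself is only a few lines. Here is a language-neutral sketch in Python (in Node.js the equivalent comparison would use Buffer.byteLength as described above); body_matches_declared_length is a hypothetical helper, and it assumes the body has already been fully buffered.

```python
def body_matches_declared_length(declared: str, body: bytes) -> bool:
    """True only if Content-Length is plain ASCII digits and equals len(body)."""
    if not declared.isascii() or not declared.isdigit():
        return False  # reject "10 ", "0x0a", "+10", empty values, etc.
    return int(declared) == len(body)


print(body_matches_declared_length("6", b"ABCDEF"))
print(body_matches_declared_length("10", b"ABCDEF"))
```

Middleware would respond with 400 and log the request whenever this returns False.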

Why MCP servers are an attractive smuggling target

HTTP request smuggling is not a new class of vulnerability, but MCP servers have characteristics that make them disproportionately attractive targets compared to a typical REST API. First, MCP servers act on behalf of LLM agents: a smuggled request that reaches a privileged tool endpoint can instruct the agent to perform actions — file writes, database queries, API calls — that the attacker could not trigger directly. The agent's ambient authority (all the tools it can call) is available to the smuggled request.

Second, MCP servers are frequently deployed with a proxy that injects trust headers: X-User-Id, X-Authenticated: true, X-Roles: admin. The upstream trusts these headers unconditionally because they come from the internal proxy. A smuggled request that bypasses the proxy and arrives directly at the upstream port over the same keep-alive socket appears to carry those trust headers from the previous legitimate request's context — depending on how the upstream's HTTP parser handles connection-level versus request-level state.

Third, many MCP deployments share a single upstream process among multiple tenants, relying on the proxy to inject the correct tenant context. A smuggled request from tenant A can inject a prefix into tenant B's request body, potentially reading tenant B's tool outputs or corrupting tenant B's session state. The blast radius is not limited to the attacker's own tenant.

These factors mean that the impact of a smuggling vulnerability in an MCP server is substantially higher than in a standard web application, and the investment in mitigations — HTTP/2 enforcement, keep-alive control, header normalization — is correspondingly more justified.

TE.TE obfuscation: hiding Transfer-Encoding from normalization filters

A sophisticated variant of TE.CL uses an obfuscated Transfer-Encoding value to fool the proxy's normalization filter while the upstream still accepts it. If the proxy normalizes only the exact lowercase string chunked but the upstream accepts case-insensitive or whitespace-padded values, the attacker can send a TE header that the proxy ignores but the upstream honours:

# TE.TE variant 1: uppercase
Transfer-Encoding: Chunked

# TE.TE variant 2: trailing whitespace (some parsers strip it, some don't)
Transfer-Encoding: chunked\x20

# TE.TE variant 3: extra comma-separated token
Transfer-Encoding: identity, chunked

# TE.TE variant 4: tab-separated
Transfer-Encoding: \tchunked

Each variant causes the proxy and upstream to disagree about whether the request is chunked. One sees Content-Length; the other sees Transfer-Encoding. The exploitation path is then identical to standard CL.TE or TE.CL, depending on which side ignored the header. The mitigation is to normalize Transfer-Encoding to a canonical value at the proxy before it reaches the upstream, and to reject any Transfer-Encoding value that does not match the exact RFC-specified token set (chunked, compress, deflate, gzip; identity was removed by RFC 7230 and can be rejected as well).
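The disagreement is easy to reproduce with two toy matchers. In this Python sketch, strict_sees_chunked stands in for a filter that recognises only the exact lowercase string and lenient_sees_chunked for a parser that trims whitespace and ignores case; both are hypothetical functions, not the behaviour of any specific proxy or server.

```python
def strict_sees_chunked(value: str) -> bool:
    """Filter that recognises only the exact lowercase string."""
    return value == "chunked"


def lenient_sees_chunked(value: str) -> bool:
    """Parser that trims whitespace, ignores case, and uses the last coding."""
    last = value.split(",")[-1].strip().lower()
    return last == "chunked"


# The four obfuscated header values listed above.
VARIANTS = ["Chunked", "chunked ", "identity, chunked", "\tchunked"]

disagreements = [
    v for v in VARIANTS if strict_sees_chunked(v) != lenient_sees_chunked(v)
]
print(len(disagreements), "of", len(VARIANTS), "variants cause a disagreement")
```

Any value on which the two functions differ is a candidate desync, which is why the proxy should reject non-canonical values instead of passing them through.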

In nginx, the proxy_pass directive strips Transfer-Encoding and recalculates Content-Length by default — this is protective. Caddy in reverse-proxy mode also normalizes TE. The danger arises when the MCP server uses a passthrough proxy mode, a raw TCP proxy (HAProxy's mode tcp), or a WebSocket upgrade path that bypasses HTTP normalization entirely.

For HAProxy deployments, ensure the frontend and backend run in mode http with option http-server-close set, rather than in mode tcp. TCP mode passes bytes without any HTTP parsing, meaning any TE obfuscation the client sends reaches the upstream unmodified. Switching from TCP mode to HTTP mode is often the single configuration change that eliminates the entire TE.TE attack surface.

HTTP/2 request smuggling: h2.TE and h2.CL variants

HTTP/2 eliminates classic request smuggling because framing is binary and unambiguous. However, HTTP/2 introduces its own smuggling variant when a front-end server accepts HTTP/2 from clients but downgrades to HTTP/1.1 when forwarding to the upstream — and the downgrade translates HTTP/2 pseudo-headers into HTTP/1.1 headers in ways that reintroduce ambiguity.

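The h2.CL variant: a client sends an HTTP/2 request with a content-length header whose value disagrees with the DATA frame length. HTTP/2 technically forbids this (RFC 9113 §8.1.1 requires CL to match DATA frame length), but some front-ends do not validate and forward the declared CL value into the HTTP/1.1 Content-Length header verbatim. The upstream then uses that declared value, stopping early and leaving bytes in the buffer.

The h2.TE variant: a client includes a transfer-encoding: chunked header in an HTTP/2 request. HTTP/2 forbids connection-specific header fields such as this one (RFC 9113 §8.2.2), but a front-end that neither strips nor rejects it copies it into the downgraded HTTP/1.1 request, where the upstream honours chunked framing and stops at an attacker-placed zero-length chunk. Reject or strip transfer-encoding on every HTTP/2 request before the downgrade.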

The mitigation is the same as for HTTP/1.1 CL.TE: enforce HTTP/2 end-to-end (h2c to upstream), or strip and recalculate Content-Length at the proxy rather than forwarding the client-declared value. Never trust a client-provided content-length header on an HTTP/2 connection — always use the actual DATA frame byte count.
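The recalculation rule can be sketched in a few lines of Python. This assumes the front-end has the complete list of DATA frame payloads for the stream in hand; forwarded_content_length is a hypothetical helper name, and rejecting on mismatch (instead of silently correcting) is one reasonable policy among several.

```python
from __future__ import annotations


def forwarded_content_length(declared: str | None, data_frames: list[bytes]) -> int:
    """Return the Content-Length to send upstream, derived from DATA frames.

    Raises ValueError when the client declared a content-length that is not
    a plain decimal number or does not equal the total DATA payload size.
    """
    actual = sum(len(frame) for frame in data_frames)
    if declared is not None:
        well_formed = declared.isascii() and declared.isdigit()
        if not well_formed or int(declared) != actual:
            raise ValueError("content-length does not match DATA frame payload total")
    return actual


print(forwarded_content_length("5", [b"ab", b"cde"]))
```

The value returned always comes from the frames themselves, so a client-declared length can cause a rejection but can never reach the upstream unchanged.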

Priority remediation order for MCP deployments

If you discover your MCP server is vulnerable to request smuggling, remediate in this order based on implementation effort and impact reduction:

Immediately: Disable keep-alive between proxy and upstream (proxy_http_version 1.0 in nginx, keepalive off in the transport http block of Caddy's reverse_proxy). This eliminates the inter-request state that smuggling exploits, at the cost of slightly higher connection overhead. Zero code changes to the application, config change only.
Within one sprint: Add header normalization at the proxy — strip Content-Length when Transfer-Encoding is present, reject requests with multiple Transfer-Encoding headers, and return 400 on any TE value not in the canonical set. This can be done entirely in proxy configuration.
Within one quarter: Migrate the proxy-to-upstream transport to HTTP/2 (h2c). This requires the upstream application server to support HTTP/2, which may involve switching from Express to Fastify or enabling Node.js's http2 module. Test throughput under load before rolling out — HTTP/2 multiplexing changes connection semantics in ways that affect connection pool sizing.
Ongoing: Add smuggling probes to your staging CI pipeline using smuggler.py or equivalent. Any deployment that changes the proxy configuration or application server HTTP library should trigger a re-run of the probe battery before going to production.

SkillAudit findings for HTTP request smuggling

CRITICAL −24 Ambiguous CL+TE headers accepted by proxy and upstream without normalization — active smuggling surface confirmed by differential probe response.

HIGH −16 Keep-alive connections to upstream with no header normalization — smuggled prefix persists across requests in the connection pool.

HIGH −14 HTTP CONNECT tunneling enabled on public ingress without host allowlist — tunnel payloads bypass proxy authentication middleware entirely.

MEDIUM −8 No HTTP version pinning at ingress — HTTP/1.0 and HTTP/1.1 accepted by default, retaining framing ambiguity that HTTP/2 eliminates.

Run a SkillAudit scan to detect HTTP request smuggling surfaces in your MCP server proxy configuration. See also MCP server API gateway security.