MCP server API gateway security — protecting HTTP-transport servers in team deployments

Most community MCP servers use stdio transport and run as local child processes. Team deployments, however, increasingly use HTTP-transport servers: a single shared server that multiple developers' agent sessions can call. This architecture needs an API gateway layer, not because HTTP is inherently insecure, but because MCP itself does not enforce authentication, rate limiting, or audit logging; all three are left to the deployment. Every HTTP-transport MCP server exposed to a network should sit behind a gateway that enforces all three.

The authentication gap in HTTP-transport MCP

The MCP protocol spec (as of 2026) defines the request/response format but does not mandate an authentication scheme. An HTTP-transport MCP server that binds to 0.0.0.0:3000 and has no authentication middleware will respond to tool calls from any process on any network that can reach that port.

This is rarely intentional: authors typically assume the deployment environment provides network-level isolation. In cloud environments, team VPNs, and shared development servers, that isolation is often weaker than developers expect, because any other host or process on the same network segment can reach the port. A missing authentication layer is a HIGH finding on the SkillAudit Security axis for HTTP-transport servers.
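One way to make the isolation assumption explicit is to refuse a non-loopback bind unless authentication is configured. The following is a minimal sketch, not part of any MCP SDK: `resolveBindHost`, the `MCP_BIND_HOST` variable, and the loopback default are hypothetical choices for illustration, and `MCP_API_KEY` is assumed to be the same variable the bearer-token pattern reads.

```javascript
// Hypothetical startup guard: only bind to a non-loopback address
// when an API key is configured. All names here are illustrative.
const LOOPBACK_HOSTS = new Set(["127.0.0.1", "::1", "localhost"]);

function resolveBindHost(env) {
  const requested = env.MCP_BIND_HOST ?? "127.0.0.1"; // default: loopback only
  const hasAuth =
    typeof env.MCP_API_KEY === "string" && env.MCP_API_KEY.length > 0;

  if (!LOOPBACK_HOSTS.has(requested) && !hasAuth) {
    throw new Error(
      `refusing to bind to ${requested} without MCP_API_KEY set`
    );
  }
  return requested;
}

// Usage (assumes an Express `app` defined elsewhere):
// app.listen(3000, resolveBindHost(process.env));
```

Failing at startup turns a silent exposure into a visible configuration error, which is usually easier to catch than an open port.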

Pattern 1: bearer token middleware

The simplest authentication pattern: a shared secret in the Authorization header that all clients include. This is appropriate for internal team tools where all callers are trusted team members.

// Express middleware for bearer token auth
const crypto = require("node:crypto");

function requireBearerToken(req, res, next) {
  const auth = req.headers.authorization;
  if (!auth || !auth.startsWith("Bearer ")) {
    return res.status(401).json({ error: "authentication required" });
  }

  const token = auth.slice(7);
  // Use timingSafeEqual to prevent timing attacks on token comparison.
  // Fall back to "" if MCP_API_KEY is unset so Buffer.from never throws;
  // the zero-length check below then rejects every request.
  const expected = Buffer.from(process.env.MCP_API_KEY ?? "");
  const provided = Buffer.from(token);
  if (expected.length === 0 ||
      expected.length !== provided.length ||
      !crypto.timingSafeEqual(expected, provided)) {
    return res.status(403).json({ error: "invalid token" });
  }

  next();
}

app.use("/mcp", requireBearerToken);

Note the use of timingSafeEqual — constant-time comparison prevents timing attacks that could leak information about partial token matches. This is a standard pattern that SkillAudit checks for; a direct === comparison on tokens is flagged as a WARN.
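The middleware above has to check buffer lengths first because `crypto.timingSafeEqual` throws when its arguments differ in length. A common variation, sketched below, hashes both values so the buffers are always the same size. `safeTokenEqual` is a hypothetical helper name, not something the original code defines.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: compare two secrets in constant time.
// Hashing first gives both buffers the same length (32 bytes),
// so timingSafeEqual never throws and no length check is needed.
function safeTokenEqual(provided, expected) {
  if (typeof provided !== "string" || typeof expected !== "string") {
    return false;
  }
  const a = crypto.createHash("sha256").update(provided).digest();
  const b = crypto.createHash("sha256").update(expected).digest();
  return crypto.timingSafeEqual(a, b);
}
```

The hash here only normalizes length for the comparison; it is not a substitute for storing keys securely.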

Pattern 2: per-user API keys with caller identity propagation

For multi-user team deployments, shared tokens create audit trail problems: you can't distinguish which team member's session made a particular tool call. Per-user API keys solve this by associating each request with a specific caller identity that the tool handler can read.

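// Per-user key lookup + identity propagation
const crypto = require("node:crypto");

// MCP_USER_KEYS = {"alice": "key_abc123", "bob": "key_xyz789"}
const USER_KEYS = new Map(Object.entries(JSON.parse(process.env.MCP_USER_KEYS ?? "{}")));

// Hash both sides so timingSafeEqual always receives equal-length buffers
const sha256 = (s) => crypto.createHash("sha256").update(s).digest();

function requireUserKey(req, res, next) {
  const auth = req.headers.authorization;
  if (!auth?.startsWith("Bearer ")) {
    return res.status(401).json({ error: "authentication required" });
  }

  const provided = sha256(auth.slice(7));
  let caller = null;
  for (const [user, key] of USER_KEYS) {
    // Constant-time compare instead of ===; no early break, so response
    // time does not depend on which entry matched
    if (crypto.timingSafeEqual(sha256(key), provided)) caller = user;
  }

  if (!caller) return res.status(403).json({ error: "invalid key" });

  // Attach caller identity to request for tool handlers
  req.mcpCaller = caller;
  next();
}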

Tool handlers can then read req.mcpCaller (or however the transport layer surfaces the request context) to make per-user authorization decisions and to populate audit log entries with the caller's identity.
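As an illustration of a per-user authorization decision, the sketch below checks the caller against an allowlist before a tool call is dispatched. The `TOOL_ACL` map, the tool names, and both function names are hypothetical. The sketch also assumes the request body has already been parsed and is a JSON-RPC message whose `method` is `tools/call` and whose tool name is in `params.name`.

```javascript
// Hypothetical per-user allowlist: caller name -> set of permitted tool names.
const TOOL_ACL = new Map([
  ["alice", new Set(["read_file", "search_docs"])],
  ["bob", new Set(["search_docs"])],
]);

// Returns true only when the caller is known and the tool is on their list.
// Unknown callers and unlisted tools are denied by default.
function isToolAllowed(caller, toolName, acl = TOOL_ACL) {
  return acl.get(caller)?.has(toolName) ?? false;
}

// Express-style middleware built on the check above.
// Assumes req.mcpCaller was set by the authentication middleware.
function requireToolPermission(req, res, next) {
  // Only tool invocations are gated; other methods pass through
  if (req.body?.method !== "tools/call") return next();

  const toolName = req.body?.params?.name;
  if (!isToolAllowed(req.mcpCaller, toolName)) {
    return res.status(403).json({ error: "tool not permitted for caller" });
  }
  next();
}
```

Denying by default means a newly added tool is unavailable until someone explicitly grants it, which is usually the safer failure mode.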

Pattern 3: rate limiting per caller

Without rate limiting, a single compromised API key or a rogue LLM session can exhaust server resources or API quotas. Rate limiting should be applied per caller, not per IP, because multiple team members may be behind the same NAT.

// Per-caller rate limiting with sliding window
const callWindows = new Map(); // caller -> [timestamp, ...]

function rateLimitByCaller(limit, windowMs) {
  return (req, res, next) => {
    const caller = req.mcpCaller ?? req.ip;
    const now = Date.now();
    const window = (callWindows.get(caller) ?? [])
      .filter(t => now - t < windowMs);

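    if (window.length >= limit) {
      // The oldest call still in the window determines when a slot frees up
      const retryAfter = Math.ceil(((window[0] ?? now) + windowMs - now) / 1000);
      res.set("Retry-After", String(retryAfter));
      return res.status(429).json({ error: "rate limit exceeded", retryAfter });
    }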

    window.push(now);
    callWindows.set(caller, window);
    next();
  };
}

// 100 tool calls per minute per caller
app.use("/mcp", rateLimitByCaller(100, 60_000));
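An in-memory map like `callWindows` keeps an entry for every caller it has ever seen, so a long-running server may want periodic cleanup. The helper below is a minimal sketch of one way to do that; `pruneCallWindows` is a hypothetical name, and a multi-instance deployment would more likely need a shared store instead.

```javascript
// Hypothetical cleanup for a Map of caller -> [timestamp, ...].
// Drops timestamps older than the window and deletes callers with none left.
// Returns the number of callers removed.
function pruneCallWindows(windows, windowMs, now = Date.now()) {
  let removed = 0;
  for (const [caller, timestamps] of windows) {
    const recent = timestamps.filter((t) => now - t < windowMs);
    if (recent.length === 0) {
      windows.delete(caller); // deleting during Map iteration is safe in JS
      removed++;
    } else {
      windows.set(caller, recent);
    }
  }
  return removed;
}

// Usage: run once per window; unref() keeps the timer from
// holding the process open on shutdown.
// setInterval(() => pruneCallWindows(callWindows, 60_000), 60_000).unref();
```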

Pattern 4: tool-level audit logging

Audit logging records which caller invoked which tool and with what argument shape, enabling incident response and detection of unauthorized use. The log entry should be written before the tool handler runs (so a crash doesn't lose the record) and should include enough context to reconstruct the session without recording sensitive argument values.

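// Audit middleware that logs before forwarding to tool handler.
// Assumes a JSON body parser (e.g. express.json()) has populated req.body.
const fs = require("node:fs");

function auditLog(req, res, next) {
  const entry = {
    ts: new Date().toISOString(),
    caller: req.mcpCaller,
    // MCP tool calls are JSON-RPC requests: method is "tools/call"
    // and the tool name is carried in params.name
    method: req.body?.method,
    tool: req.body?.params?.name,
    // Log arg keys but not values: values may contain secrets
    argKeys: Object.keys(req.body?.params?.arguments ?? {}),
    ip: req.ip,
    userAgent: req.headers["user-agent"]
  };

  // Write to append-only log before processing. The ./logs directory must
  // already exist; appendFileSync blocks the event loop, so this suits
  // low-volume deployments only.
  fs.appendFileSync("./logs/mcp-audit.jsonl",
    JSON.stringify(entry) + "\n");

  next();
}

app.use("/mcp/call-tool", auditLog);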

Logging argument keys but not values is a deliberate choice: argument values often contain sensitive data (file paths with usernames, API tokens passed as args) but the argument shape (which keys were present) is sufficient for audit purposes in most cases.
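To show how such a log might be used during incident response, the sketch below tallies calls per caller from the text of a JSONL file. `countCallsByCaller` is a hypothetical helper, and it assumes each line is a JSON object with the `caller` field written by the audit middleware.

```javascript
// Hypothetical incident-response helper: tally tool calls per caller
// from the text of a JSONL audit log (one JSON object per line).
function countCallsByCaller(jsonlText) {
  const counts = new Map();
  for (const line of jsonlText.split("\n")) {
    if (line.trim() === "") continue; // skip blanks, e.g. the trailing newline
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // skip lines truncated by a crash mid-write
    }
    const caller = entry?.caller ?? "(unauthenticated)";
    counts.set(caller, (counts.get(caller) ?? 0) + 1);
  }
  return counts;
}

// Usage (path matches the audit middleware):
// const fs = require("node:fs");
// countCallsByCaller(fs.readFileSync("./logs/mcp-audit.jsonl", "utf8"));
```

A sudden spike for one caller, or any entries without a caller, is the kind of signal this tally is meant to surface.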

Combining patterns in a gateway layer

For production team deployments, these four patterns compose cleanly into an Express middleware chain:

app.use("/mcp",
  requireUserKey,             // 1. authenticate
  rateLimitByCaller(100, 60000), // 2. rate limit per caller
  auditLog,                   // 3. audit before handling
  mcpHandler                  // 4. actual tool dispatch
);

An alternative is to put a reverse proxy (nginx, Caddy, Kong) in front of the MCP server and handle authentication and rate limiting at the proxy layer. This keeps the MCP server's code simpler but requires the proxy to propagate caller identity to the server via a header, and the server must accept that header only from the proxy; otherwise any client that can reach the server directly can spoof it.
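On the server side, the reverse-proxy variant might look like the sketch below: the caller identity is read from a header, but only when the connection comes from a known proxy address. The `x-mcp-caller` header name, the `callerFromProxy` function, and the address list are all assumptions for illustration; the proxy would also need to strip any client-supplied copy of that header.

```javascript
// Hypothetical middleware for the reverse-proxy setup: accept the caller
// identity from a header, but only when the request comes from the proxy.
function callerFromProxy(trustedProxyIps, headerName = "x-mcp-caller") {
  const trusted = new Set(trustedProxyIps);
  return (req, res, next) => {
    // Use the socket's peer address rather than req.ip, because req.ip can
    // reflect X-Forwarded-For when Express "trust proxy" is enabled.
    const peer = req.socket?.remoteAddress;
    const caller = req.headers[headerName]; // Node lowercases header names

    if (!trusted.has(peer) || typeof caller !== "string" || caller === "") {
      return res.status(401).json({ error: "authentication required" });
    }
    req.mcpCaller = caller;
    next();
  };
}

// Usage (address is a placeholder for the proxy's real IP):
// app.use("/mcp", callerFromProxy(["10.0.0.5"]));
```

Depending on the socket configuration, the peer address may appear in IPv4-mapped form such as `::ffff:10.0.0.5`, so the trusted list may need both spellings.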

What SkillAudit checks for API gateway security

For HTTP-transport servers, the Security and Permissions axes check:

Binding to 0.0.0.0 with no authentication middleware (HIGH)
Authentication using direct string comparison rather than constant-time compare (WARN)
No rate limiting on tool endpoints (WARN)
No audit logging (INFO — not scored, but surfaced)
API keys or shared secrets stored in source files rather than environment variables (HIGH on Credentials axis)

For stdio-transport servers (which are not exposed to the network), these checks are skipped — they're only relevant when there's an inbound network surface to protect.

For the per-tool authorization checks that complement the gateway layer, see MCP server access control. For the rate limiting context in terms of resource consumption protection, see MCP server rate limiting.

Check your HTTP-transport server's gateway configuration

SkillAudit detects HTTP-transport binding and checks for authentication, rate limiting, and audit logging in your server code and deployment configuration.
