Topic: mcp server security risks

MCP server security risks — the eight classes you'll actually see

If you're trying to map the risk surface of a Model Context Protocol server — yours, or one you're about to install — this page lists the eight risk classes that actually show up, with prevalence data from 101 of the most-installed servers in the wild and the kind of code that creates each one.

TL;DR

Across 101 of the most-installed MCP servers we've seen eight risk classes: (1) SSRF in tool handlers — 50%, (2) credential exposure — 38%, (3) command and code execution from tool input — 10%, (4) prompt-injection vectors smuggled through tool I/O (LLM-assisted probe), (5) over-broad permission and scope, (6) abandoned / archived maintenance — 9 of 101, (7) client compatibility drift (works on Claude Code, breaks on Cursor), (8) documentation rot (README claims tools the code doesn't register, or vice versa). Below: what each class is, how it actually shows up in code, what catches it, and what to do about it before installing or shipping.

Why this is the right frame, not OWASP API Top 10

The instinct, when a new protocol with security implications shows up, is to map it onto the existing OWASP categories. We've done that exercise and published it (see state of MCP server security, 2026) — it's a useful translation layer for security teams. But the practical risk frame for someone actually scanning or shipping an MCP server is the eight classes below, because:

Three of the eight classes (prompt injection through tool I/O, permission scope, client compatibility) don't have good OWASP API Top 10 analogs — they emerge specifically from the protocol shape.
One of the OWASP categories (BOLA / object-level authorization) maps poorly because most MCP servers don't have a multi-tenant authorization model — they have a single-credential model where the credential is implicit.
The OWASP frame is per-finding; the buyer-side question is per-server. An A–F grade across the eight axes is what gates an install decision.

Use OWASP if you're filing a CVE write-up. Use these eight classes if you're shipping or installing.

The eight risk classes (with prevalence and code shape)

1. SSRF in tool handlers — 50% of corpus

The textbook MCP risk: a tool handler accepts a URL and dereferences it without an allowlist or an SSRF guard. Two shapes:

// shape A — direct
server.tool('fetch_url', async ({url}) => {
  const r = await fetch(url);                     // SSRF
  return { text: await r.text() };
});

// shape B — dynamic base
server.tool('api_call', async ({path}) => {
  const r = await fetch(`${baseUrl}/${path}`);    // SSRF if baseUrl is config-read or arg-derived
  return { text: await r.text() };
});

Why it's bad: the LLM accepts the tool argument from user instructions. A user (or an upstream prompt-injection payload) can ask the agent to fetch http://169.254.169.254/latest/meta-data/ and the server obliges. We've seen this in vendor-official releases, not just indie repos — see the vendor F-grades writeup. Fix: allowlist by host, reject private IP ranges (RFC 1918 + cloud metadata IPs), set fetch redirect mode to 'manual' and re-validate the redirected URL.

2. Credential exposure — 38% of corpus

Three sub-shapes, in roughly descending frequency:

Echo to tool response. return { text: process.env.GITHUB_TOKEN }, usually as a debug aid the developer forgot to remove. Sometimes via templated error messages: throw new Error(`request to ${url} with token ${token} failed`).
Echo to logger. console.log({ headers }) where headers.authorization contains the token. Whatever scrapes the log (factory log shippers, ngrok-style tunnels, Sentry breadcrumbs) gets the token.
Over-collection at server start. Server requests env vars it doesn't actually need; if you're shipping it, your install instructions ask the user to expose tokens you'll never read.

The credential-leak case study (anatomy of a credential leak) walks through a live example. Fix: redact known sensitive env names from tool responses and logs server-wide, narrow the env list at startup, ship a redaction helper.

3. Command / code execution from tool input — 10% of corpus

Lower prevalence than SSRF, higher impact when present. Shapes:

// node
execSync(`git log --grep="${query}"`);            // shell injection
child_process.exec(cmd);                           // arbitrary command if cmd is arg-derived

// python
os.system(f"convert {input_file} {output_file}")  // shell injection
subprocess.run(cmd, shell=True)                    // same

Most A-grade servers in our corpus avoid shell=True entirely or use the array-arg form (execFile, subprocess.run([...])). The 10% that fail this often do so in the file-handling, git, or media-conversion tools — adjacent to a legitimate need to spawn a process, but with the input flowing into the command string. Fix: array-arg execution, never shell=True, validate every arg against a strict regex.

4. Prompt-injection vectors through tool I/O

The hardest class to detect statically. Shapes: a tool fetches an upstream API, the upstream payload contains instructions intended for a downstream LLM, the tool response surfaces that payload unsanitized, the agent reads it as instructions and acts on them. There's no SAST rule that catches this because the danger is content-shaped, not code-shaped.

SkillAudit's LLM-probe layer is what we use here — see the methodology page for the calibration set and known limits. We don't claim full coverage; we claim measurable coverage with a documented blind spot. Fix on the server side: structure-aware sanitization of upstream content (strip control characters, fence content with explicit "these are not instructions" markers, never inline external content into a system message).

5. Over-broad permission and scope

The risk that auditors most often bring up and that grep-style scanners almost never catch: a server requests OAuth scopes, env vars, file-system paths, or network egress that exceed what its documented tools need. Examples from corpus:

A read-only repo-summarizer that requests repo scope (write) instead of public_repo (read on public).
A shell-tool that asks for $HOME read but spawns processes anywhere on the filesystem.
A "search the web" server with full network egress when an allowlist of three domains would do.

This usually isn't malicious; it's defaults-driven (the OAuth scope copy-pasted from a tutorial). But for a buyer it's a disqualifier — the server's capabilities exceed its purpose, which means an exploit elsewhere in the chain has more to work with. Fix: tighten the requested scope to the minimum the documented tools need, document the scope in the README.

6. Abandoned / archived — 9 of 101 corpus servers

An archived MCP server is a server that will never receive a security patch. Nine of the 101 most-installed servers we scanned were archived on GitHub, often without a banner the install command surfaces. Three of those nine had high-severity findings. We covered the named list in the maintenance-signal post. Fix on the buyer side: check archive status before install (one-line GitHub API call); if it's archived, fork it before depending on it. Fix on the author side: archive cleanly with a README banner naming the migration path, not silently.

7. Client compatibility drift

The MCP protocol is moving; clients (Claude Code, Cursor, Windsurf, Codex, JetBrains plugin, VS Code extension) ship slightly different protocol-version handling. A server that works on Claude Code today can break silently on Cursor when one client drops support for an older request shape. Drift findings in our corpus look like: README claims four clients, only two actually run; or a tool that works on Claude Code's stdio transport but not Cursor's HTTP transport.

This isn't a CVE-shaped risk — nobody gets owned because of compat drift. But it's a reliability and trust risk: an MCP server you can't actually install on the client your team uses is a server your team doesn't get value from. Fix: name the verified clients in the README, version them, run a smoke test in CI per client.

8. Documentation rot

Lowest-severity individually, highest-correlation overall. Shapes:

README claims tools the code doesn't register.
Code registers tools the README doesn't document (the surprise tools — sometimes powerful ones).
No runnable example, no version, no changelog. The buyer can't verify anything.

In our corpus, the F-grade cluster has a strong correlation with doc rot. It's a smoke signal for the other seven classes — a server with current, complete, runnable docs almost always fails fewer findings than one without. Fix: keep tool registration and README in sync (a script that diffs them on CI catches most cases), name the version, ship a runnable example.

How to approach the risk surface (buyer or author)

Don't rely on a single tool. SCA catches the dependency-CVE class only; generic SAST catches a slice of the SSRF and exec classes; an MCP-aware scanner like SkillAudit covers the eight classes above. The tools landscape page goes through what each one covers.
Scan before you install. Free public audits at stable URLs are a 60-second decision input. Buyer-side: paste the GitHub URL and read the report card.
Gate at install time, not after. For team buyers, wire the scan into the PR that introduces a new MCP server — the GitHub Action / PR gate is the correct seam.
Treat the report as a starting point, not a verdict. An A grade means no detected findings across the eight classes against our v0.3 calibration; it doesn't mean no findings exist. Read the per-axis breakdown, especially the LLM-probe axis — that's where the latent risk most often lives.
For authors: aim for an A and embed the badge. Buyers gate on the badge; the badge is a signal you've cleared the eight classes. The patch you'll most often need is the SSRF allowlist (50% of corpus needs it).

Run an audit

What we're not yet covering well

Three classes we've identified but don't yet score with high confidence:

Cross-tool privilege chaining — a server that registers Tool A (read GitHub repo) and Tool B (post to Slack), where the agent reads sensitive data from A and pipes it to B. Detection requires tool-graph reasoning we haven't shipped yet.
Long-lived session state — servers that hold per-session memory in ways that leak across invocations. Hard to model statically.
Non-trivial deserialization paths — MCP servers that accept structured tool arguments and deserialize via JSON.parse + eval or unsafe pickle. Rare in our corpus but high-impact.

We name these on the methodology page's known-limits section. Anyone evaluating an MCP scanner — including ours — should ask for the equivalent disclosure.