Topic: mcp server security scan

How to run an MCP server security scan

If you're about to run claude plugin install on a community Model Context Protocol server, or you maintain one and want a green badge before publishing, here's the actual workflow: what to submit, what comes back, and what to do with it.

TL;DR

Paste a GitHub URL into the SkillAudit hero form, wait ~60 seconds, get a report card with a single A–F grade plus pass/warn/fail across six axes (security, permissions, credentials, maintenance, client compatibility, documentation). For buyers: A or B = install with confidence, C = install pinned to a reviewed commit, D or F = block or fix-then-revisit. For authors: re-scan after each release; the grade rebuilds from the latest commit. Across our 101-server corpus, only 19% earned an A; the median was a C. Don't install blind.

Why scan, and why now

2026 is the year MCP went mainstream and the year MCP-specific exploits stopped being theoretical. The public scan results are blunt: 50% of community MCP servers ship SSRF in tool handlers, 38% have credential-handling findings, 10% have command-exec sinks. None of these leave a CVE — they're first-party code, written this week, by indie authors who haven't shipped to a public marketplace before. Conventional dependency scanners pass them clean. The trust signal a buyer or marketplace reviewer needs has to come from somewhere else.

For three buyer profiles, scanning is a clear win:

Step-by-step: how to run a scan

  1. Find the canonical source. This is the thing your client will actually install: usually a public GitHub repo, sometimes an npm package whose published source has drifted from the GitHub repo linked in its README. If the two disagree, scan the npm tarball; that's what installs. SkillAudit accepts three input types: a GitHub URL, an npm package name, or a ZIP upload.
  2. Submit it. Paste the URL into the hero form on the homepage. The scan starts immediately on a worker; no signup is required for public-repo audits on the free tier.
  3. Wait ~60 seconds. The static layer (tree-sitter pattern matching tuned to MCP idioms) finishes first, in under 10 seconds. The LLM-assisted layer (prompt-injection probing of extracted tool handlers via Claude Haiku 4.5) takes longer — proportional to tool count. Each axis renders into the report card as it lands; you don't have to wait for everything.
  4. Read the grade with the per-axis breakdown. The A–F grade is a single-glance signal. The six pass/warn/fail axes are how you'd defend the grade in a code review. Findings have file paths and line numbers; you can verify each one against the source.
  5. Cite the report URL. If the scan helps an install decision or a publishing decision, link the stable /audits/owner-repo/ URL in your PR description, your commit message, your README, or your fleet-policy doc. The scan is reproducible from public source; readers can click through.
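
The drift check in step 1 can be done by hand before you submit anything. The following is a minimal sketch, not part of SkillAudit: it assumes you have already cloned the GitHub repo and unpacked the npm tarball into two local directories, and the function names (tree_digest, source_drift) are hypothetical.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def source_drift(repo_dir: Path, package_dir: Path) -> dict[str, list[str]]:
    """Report files that differ between a repo checkout and an unpacked package."""
    repo = tree_digest(repo_dir)
    pkg = tree_digest(package_dir)
    shared = repo.keys() & pkg.keys()
    return {
        "only_in_repo": sorted(repo.keys() - pkg.keys()),
        "only_in_package": sorted(pkg.keys() - repo.keys()),
        "changed": sorted(p for p in shared if repo[p] != pkg[p]),
    }
```

Files listed under only_in_package are often just build output, and a repo checkout will show its .git directory under only_in_repo, so the "changed" list is usually the one worth reading first. Any entry there means the published package and the linked repo disagree, which is the case where scanning the tarball matters.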


What comes back: reading the six-axis report card

The report card has the same shape for every server. The score on each axis is independently produced and shown alongside the overall letter grade so you can read where the grade came from:

What to do with each grade

| Grade | Meaning | Recommended action (buyer) | Recommended action (author) |
|---|---|---|---|
| A | Clean across all axes; LLM probe found no exploitable injection vectors | Install with confidence. Re-check on major-version bumps. | Embed the badge. You're done. |
| B | Minor warnings, no high-severity findings | Install. Read the warnings; some are documentation-only. | Address the warnings if you have a 30-minute budget; A is reachable. |
| C | One mid-severity finding or multiple warnings; security axis is borderline | Install pinned to a specific commit you've reviewed. Don't auto-update. | Fix the one finding; you'll move to a B or A. The remediation hint will name the file. |
| D | One high-severity finding (e.g. SSRF in a registered tool) or multiple mid-severity | Block in fleet policy. If you must install, fork and patch. | Fix before publishing. The marketplace will reject this; the listing review uses the same axes. |
| F | Multiple high-severity findings, archived, or LLM probe successfully extracted credentials | Do not install. Note in your team's deny-list with the report URL as the citation. | The fixes are usually mechanical: re-read the A-grade patterns; most are absent. |
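
If you enforce these recommendations in a fleet policy, the buyer column reduces to a small lookup. The sketch below is one way to encode it; the Decision type, the action labels ("install", "install_pinned", "block"), and the function name are hypothetical, and only the grade-to-action mapping comes from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    action: str  # hypothetical labels: "install", "install_pinned", or "block"
    note: str    # the follow-up the table recommends for buyers


# Buyer-side policy, transcribed from the grade table.
BUYER_POLICY = {
    "A": Decision("install", "Re-check on major-version bumps."),
    "B": Decision("install", "Read the warnings; some are documentation-only."),
    "C": Decision("install_pinned", "Pin to a reviewed commit; don't auto-update."),
    "D": Decision("block", "If you must install, fork and patch."),
    "F": Decision("block", "Add to the deny-list, citing the report URL."),
}


def buyer_decision(grade: str) -> Decision:
    """Return the recommended buyer action for an A-F grade."""
    key = grade.strip().upper()
    if key not in BUYER_POLICY:
        raise ValueError(f"unknown grade: {grade!r}")
    return BUYER_POLICY[key]
```

Raising on an unrecognized grade, rather than defaulting to "install", keeps a malformed or missing report from silently passing policy.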

When to re-scan

Related questions

Are the scan results public?

Yes, for public-repo audits. Each scan publishes to a stable URL at /audits/owner-repo/, and authors can embed the badge. Private-repo scans (Pro tier) generate a private report; the URL is unguessable and access is limited to the requesting account. The privacy page has the details.
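
To cite a public report from a script (for example, when generating a PR description), you can derive the path from the repo URL. This is a sketch under assumptions: it takes the /audits/owner-repo/ pattern literally, joining owner and repo with a hyphen, and it does not guess at any further normalization the site may apply (case, dots, and so on). The function name is hypothetical, and the site's hostname is deliberately left out.

```python
from urllib.parse import urlparse


def audit_path(github_url: str) -> str:
    """Build the /audits/owner-repo/ path for a public GitHub repo URL.

    Assumes the slug is simply "<owner>-<repo>"; the real site may
    normalize names differently.
    """
    parsed = urlparse(github_url)
    if parsed.netloc.lower() not in {"github.com", "www.github.com"}:
        raise ValueError(f"not a GitHub URL: {github_url!r}")
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"expected an owner/repo path in {github_url!r}")
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]  # accept clone URLs as well
    return f"/audits/{owner}-{repo}/"
```

For a hypothetical repo at https://github.com/acme/weather-mcp, this yields /audits/acme-weather-mcp/.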

Does the scanner store the source code?

No. Source is fetched, analyzed in-memory, and the scan result is persisted; the source itself is not retained after the scan completes. The privacy page documents this; the Team plan adds an audit log of scan events.

How accurate is the LLM-assisted prompt-injection probe?

It catches a meaningful class of vectors that pure static analysis misses, and it has known limits, which are covered in a calibration writeup. The probe is reported as a separate axis so you can see what the static layer caught versus what the LLM probe added; we don't bury the static-only findings in a single number.

Can I scan a Claude skill (not just an MCP server)?

Yes — Claude skills published to a public repo or to the Anthropic Skills Directory are scannable on the same six-axis engine. Skills have a slightly different tool-registration shape; the static layer covers both.

What if my repo has a .skillaudit.yml config file?

The scanner respects per-repo configuration: an allowlist of known-false-positive patterns, a list of additional clients to test against, and an LLM-probe budget. The config file is optional; a repo with no config is scanned with the defaults.
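
The "optional config, otherwise defaults" behavior can be pictured as an overlay. The sketch below is illustrative only: this page does not give the actual key names or default values used in .skillaudit.yml, so every field name here is a hypothetical stand-in for the three options described above, and the input is assumed to be an already-parsed mapping.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScanConfig:
    # Hypothetical field names; the real .skillaudit.yml keys are not shown here.
    false_positive_allowlist: list[str] = field(default_factory=list)
    extra_clients: list[str] = field(default_factory=list)
    probe_budget: Optional[int] = None  # None stands for "use the scanner's default"


def resolve_config(repo_config: Optional[dict]) -> ScanConfig:
    """Overlay a repo's parsed config (or None, if absent) on the defaults."""
    if not repo_config:
        return ScanConfig()
    known = set(ScanConfig.__dataclass_fields__)
    unknown = set(repo_config) - known
    if unknown:
        raise ValueError(f"unknown config keys: {sorted(unknown)}")
    return ScanConfig(**repo_config)
```

Rejecting unknown keys is a design choice in this sketch, not documented scanner behavior; it makes a misspelled option fail loudly instead of being ignored.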
