
MCP server supply chain security — dependency risk in the Model Context Protocol ecosystem

When the security community talks about "AI supply chain risk," they usually mean poisoned training data or compromised model weights. That's not the supply chain risk that bites MCP server users — the real exposure is mundane: npm packages and PyPI distributions carrying open CVEs, floating semver ranges that resolve to different code at every install, and no lockfile to tell you what you actually ran. In the 101-server SkillAudit corpus, 8% of servers had at least one open CVE in their dependency tree at the time of audit — a number that grows daily as new advisories land against packages that haven't been touched in months.

TL;DR

Supply chain risk in MCP servers comes from npm and PyPI dependencies, not the LLM training pipeline. Standard SCA tools (Dependabot, Snyk, OSV-Scanner) catch known CVEs in the dependency tree but miss the MCP-specific surface: the tool-handler code path that actually executes when an agent calls a tool. 8% of our corpus had active CVEs at audit time. The minimum safe posture is: lock your deps with a committed lockfile, run npm audit or pip-audit in CI as a gate, ship an SBOM with each release, and archive or deprecate the package when you stop maintaining it. SkillAudit's maintenance axis scores all four of these and flags servers where CVE-carrying deps sit on the execution path for a tool handler.

What "supply chain" means for MCP servers

An MCP server is typically a Node.js or Python process that registers tool handlers and exposes them to an agent over stdio or SSE. Like any such process, it depends on a package graph — often dozens of transitive dependencies that the author never directly reviewed. Supply chain risk in this context means: a vulnerability in one of those packages that can be triggered by the code paths the MCP server actually exercises.

The OWASP Top 10 for LLM Applications maps this to LLM05: Supply Chain Vulnerabilities (renumbered LLM03 in the 2025 revision), which covers compromised pre-trained components and training data as well as third-party packages. For MCP servers, the third-party package vector is the immediate and practical one: a JSON parsing library with a prototype pollution CVE, an HTTP client with a request-smuggling advisory, a markdown renderer with an XSS-class finding that becomes a prompt-injection path when the output is fed back to a model.

The distinction between the AI-hype version of "supply chain" and the mundane version matters because the mitigations are completely different. Protecting against a poisoned training corpus requires process controls at the model provider, which an MCP author cannot influence. Protecting against a CVE in axios@0.21.1 requires running npm audit in your CI pipeline and committing a lockfile. The second one is something every MCP author can do today.

What SCA tools catch — and what they miss

Software composition analysis (SCA) tools like Dependabot, Snyk, and OSV-Scanner are good at one thing: matching packages in your dependency tree against advisory databases (GitHub Advisory Database, OSV, NVD). If your package.json or requirements.txt pulls in a package with a known CVE, a properly configured SCA tool will file a PR or fail a CI check.
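To make the matching step concrete, here is a minimal sketch that turns a parsed package-lock.json (lockfileVersion 2 or 3) into a batch query in the shape the OSV API accepts. The lockfile excerpt and the function name are made up for illustration; real SCA tools do considerably more (ecosystem detection, range evaluation, deduplication).

```python
import json


def osv_queries_from_lock(lock: dict) -> dict:
    """Build an OSV batch-query payload from a parsed package-lock.json (v2/v3)."""
    queries = []
    for path, meta in lock.get("packages", {}).items():
        # The "" key is the root project; link entries carry no version.
        if not path or "version" not in meta:
            continue
        # "node_modules/a/node_modules/@scope/b" -> "@scope/b"
        name = path.rsplit("node_modules/", 1)[-1]
        queries.append({
            "package": {"name": name, "ecosystem": "npm"},
            "version": meta["version"],
        })
    return {"queries": queries}


# Hand-written lockfile excerpt, for illustration only.
example_lock = {
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "example-mcp-server", "version": "1.0.0"},
        "node_modules/axios": {"version": "0.21.1"},
        "node_modules/axios/node_modules/follow-redirects": {"version": "1.14.0"},
    },
}

payload = osv_queries_from_lock(example_lock)
print(json.dumps(payload, indent=2))
```

The resulting payload is what you would POST to https://api.osv.dev/v1/querybatch; the network call is left out here so the sketch stays runnable offline.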

What they don't do is understand the MCP tool-handler surface. A CVE in a transitive dependency that's only reachable via a code path that your MCP server never executes is technically present in the SBOM but not practically exploitable through the server's tool API. Conversely, a CVE in a package that sits directly on the path from server.tool('fetch', handler) to network I/O is a much higher priority than its CVSS score might suggest — because the agent will call that handler automatically, without a human reviewing the argument before execution.
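The prioritization described above can be sketched as a graph walk: start from the modules that register tool handlers, follow imports, and see which advisory-carrying packages are reached. This is a deliberately coarse, module-level illustration and not SkillAudit's actual analysis; the graph, module names, and labels are hypothetical.

```python
from collections import deque


def reachable_from_handlers(import_graph, handler_modules):
    """Breadth-first walk over a module/package import graph."""
    seen = set(handler_modules)
    queue = deque(handler_modules)
    while queue:
        node = queue.popleft()
        for dep in import_graph.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


def triage(advisory_packages, import_graph, handler_modules):
    """Label each advisory-carrying package by whether a tool handler can reach it."""
    on_path = reachable_from_handlers(import_graph, handler_modules)
    return {
        pkg: "handler-path" if pkg in on_path else "not-reached"
        for pkg in advisory_packages
    }


# Hypothetical server layout: one tool handler module, one docs build script.
import_graph = {
    "tools/fetch.js": ["axios"],
    "axios": ["follow-redirects"],
    "scripts/build-docs.js": ["marked"],
}

result = triage(["follow-redirects", "marked"], import_graph, ["tools/fetch.js"])
print(result)
```

In this toy layout the follow-redirects advisory sits behind the fetch handler and deserves attention first, while the marked advisory is only pulled in by a build script the agent never triggers.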

SCA tools also can't detect the MCP-specific amplification factors: a server that runs with broad filesystem permissions, that processes LLM-provided arguments without validation, or that returns externally-fetched content verbatim to the model. These are the axes where SkillAudit's maintenance and security scoring adds signal beyond what a generic SCA pass provides.
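As an illustration of the second amplification factor, here is a minimal sketch of validating an LLM-provided URL before a fetch-style handler touches the network. The allowlist and the function name are assumptions for the example; a real server would pick its own policy.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real server would load this from its configuration.
ALLOWED_HOSTS = {"api.example.com"}


def validate_fetch_url(raw: str) -> str:
    """Reject URLs an agent should not be able to make the server fetch."""
    parts = urlsplit(raw)
    if parts.scheme != "https":
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    host = (parts.hostname or "").lower()
    if host not in ALLOWED_HOSTS:
        raise ValueError(f"host not allowed: {host!r}")
    return parts.geturl()


print(validate_fetch_url("https://api.example.com/v1/items"))
```

A check like this does not remove a CVE from the HTTP client underneath, but it narrows which model-supplied inputs can ever reach the vulnerable code.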

Corpus findings: 8% open CVEs at audit time

Of the 101 servers in the SkillAudit corpus, 8 had at least one dependency with an open advisory on the day they were scanned. That's a lower rate than we expected — but it reflects selection bias: the corpus skews toward widely-installed servers that receive more maintenance attention than the median community server. The more representative statistic is the maintenance-axis score distribution: a significant share of corpus servers have not cut a release in over six months, meaning any CVE disclosed since the last release is silently present with no remediation in progress.

The CVE-carrying packages we found fell into a few categories: HTTP client libraries (older versions of axios, requests, httpx), templating engines, and one XML parser. None of them were zero-days; all had fixes available for weeks or months before the audit. The blocker in every case was that the server hadn't cut a new release — the author may not even have been aware the advisory existed.

This is why lockfile discipline and automated SCA are maintenance hygiene, not security theater. A Dependabot alert that fires within 48 hours of an advisory is the difference between "we patched in 3 days" and "we had a month-old CVE at audit time."
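The difference between those two outcomes is simple date arithmetic. A small sketch, with hypothetical dates and a made-up function name, of how long an advisory has been shipping unpatched as of an audit:

```python
from datetime import date
from typing import Optional


def exposure_days(advisory_published: date,
                  fixed_release: Optional[date],
                  audit_day: date) -> int:
    """Days between an advisory's publication and either the server release
    that picked up the fix or, if there is none yet, the audit date."""
    patched = fixed_release is not None and fixed_release <= audit_day
    end = fixed_release if patched else audit_day
    return max((end - advisory_published).days, 0)


published = date(2024, 3, 1)   # hypothetical advisory date
audit = date(2024, 4, 1)       # hypothetical audit date

print(exposure_days(published, date(2024, 3, 4), audit))  # fix released three days later
print(exposure_days(published, None, audit))              # no fixed release by audit day
```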

Pinned lockfiles, SBOM, and the maintenance axis

The SkillAudit maintenance axis grades four supply-chain-specific signals:

Lockfile committed. A package-lock.json, yarn.lock, pnpm-lock.yaml, or poetry.lock committed to the repo and kept up to date. Without a lockfile, every fresh install re-resolves the dependency tree at install time, so the same repo can install different code on different days. In an agent context where a server may be installed into an isolated container on every run, floating resolution means non-deterministic behavior that's nearly impossible to audit.

No open advisories on the tool-handler path. Not just "no CVEs in the SBOM" — specifically, no open advisories in packages that are direct dependencies of the modules containing tool handler registrations. We use static reachability analysis to distinguish these from inert transitive deps.

SCA gate in CI. Evidence of npm audit --audit-level=moderate, pip-audit, or Snyk in the CI workflow configuration. A repo that runs SCA in CI gets credit even if a new advisory has landed since the last run; a repo with no CI SCA at all is graded lower regardless of current advisory status.

SBOM artifact. A machine-readable SBOM (CycloneDX or SPDX format) shipped with each release. This is the least commonly present signal in our corpus — fewer than 15% of servers include one — but it's increasingly expected by enterprise buyers doing their own vendor security review before allowing a server to connect to their agent infrastructure.
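Three of these signals can be approximated with a quick look at a cloned repository. The sketch below is not how SkillAudit scores them; the file names, the GitHub Actions workflow location, and the SBOM naming patterns are assumptions. SBOMs are usually attached to releases rather than committed, so a miss on that check proves little.

```python
from pathlib import Path

LOCKFILES = ("package-lock.json", "yarn.lock", "pnpm-lock.yaml", "poetry.lock")
SCA_COMMANDS = ("npm audit", "pip-audit", "snyk")
# Naming patterns are assumptions; projects name SBOM files in many ways.
SBOM_SUFFIXES = ("bom.json", ".cdx.json", ".spdx.json")


def supply_chain_signals(repo: Path) -> dict:
    """Rough presence checks for a lockfile, an SCA step in CI, and an SBOM file."""
    workflows = repo / ".github" / "workflows"
    ci_text = ""
    if workflows.is_dir():
        ci_text = "\n".join(
            p.read_text(errors="ignore")
            for p in sorted(workflows.iterdir())
            if p.suffix in (".yml", ".yaml")
        )
    return {
        "lockfile": any((repo / name).is_file() for name in LOCKFILES),
        "sca_in_ci": any(cmd in ci_text for cmd in SCA_COMMANDS),
        "sbom": any(
            p.is_file() and p.name.endswith(SBOM_SUFFIXES) for p in repo.iterdir()
        ),
    }


if __name__ == "__main__":
    # Run from the root of a cloned server repository.
    print(supply_chain_signals(Path(".")))
```

The fourth signal, advisories on the tool-handler path, needs an advisory lookup plus reachability analysis and cannot be read off the file listing.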

How to check for supply chain risk before installing a community server

If you're evaluating a community MCP server and want a quick supply chain signal before you connect it to your agent, here are four checks that take under five minutes:

1. Clone and run npm audit or pip-audit. This gives you the current advisory picture for the installed tree. Any HIGH or CRITICAL finding is a red flag; check whether the server's last release predates the advisory — if so, the author almost certainly doesn't know about it.

2. Check for a committed lockfile. For a Node.js server, ls package-lock.json yarn.lock pnpm-lock.yaml; for a Python server, look for poetry.lock or an equivalent pinned file. No lockfile = non-deterministic installs = you cannot reproduce exactly what the author tested against.

3. Check the last commit date on key dependency files. A package.json last modified 18 months ago with a floating "axios": "^0.21" range and no lockfile means a fresh install takes the newest matching 0.21.x release rather than the one the author tested, and every transitive dependency floats the same way. The installed tree might be carrying advisories the author has never seen.

4. Search the repository's Issues and PRs for "CVE" or "security". Author responsiveness to disclosed CVEs is the best leading indicator of how they'll handle future ones. A repo with no security-related issues ever filed, and no SECURITY.md, is a repo with no disclosure channel.
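To see why the floating range in check 3 matters, here is a simplified sketch of npm-style caret resolution. It ignores prerelease tags and other range operators, and the version lists are illustrative rather than a registry snapshot.

```python
def parse(version: str) -> tuple:
    return tuple(int(part) for part in version.split("."))


def caret_bounds(spec: str):
    """Return (inclusive lower, exclusive upper) for a caret range like ^0.21."""
    parts = [int(p) for p in spec.lstrip("^").split(".")]
    lower = tuple(parts + [0] * (3 - len(parts)))
    # Caret allows changes to the right of the leftmost non-zero component;
    # if every given component is zero, the last given one is the boundary.
    idx = next((i for i, p in enumerate(parts) if p != 0), len(parts) - 1)
    upper = list(lower[: idx + 1]) + [0] * (2 - idx)
    upper[idx] += 1
    return lower, tuple(upper)


def resolve(spec: str, published: list):
    """Pick the highest published version satisfying the caret range."""
    lower, upper = caret_bounds(spec)
    candidates = [v for v in published if lower <= parse(v) < upper]
    return max(candidates, key=parse) if candidates else None


# Illustrative version lists standing in for the registry on two different days.
earlier = ["0.20.0", "0.21.0", "0.21.1"]
later = earlier + ["0.21.4", "0.22.0", "1.6.0"]

print(resolve("^0.21", earlier))
print(resolve("^0.21", later))
```

The same range yields a different installed version depending on the day you install, which is exactly the drift a committed lockfile removes.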

For a comprehensive picture — including the tool-handler reachability analysis that generic SCA misses — run a SkillAudit scan on the repo URL. The maintenance axis report breaks down each of the four signals above alongside the full six-axis grade.
