Playbook · 2026-04-30

Block 52 of 101 community MCP servers with one CI gate — the 2026 team policy template

If your team adopts a one-line policy ("no MCP server installs below grade C"), you block 52 of the 101 most-installed Model Context Protocol servers in 2026, including 29 vendor-official releases your developers would have waved through on brand alone. This is the policy in one paragraph, the GitHub Action template in about 40 lines, and the 12-week rollout calendar your VP-Engineering can hand a security engineer next Monday.

The data behind the headline

Across the 101 most-installed Model Context Protocol servers we could find (vendor-official releases from Cloudflare, Stripe, Heroku, MongoDB, GitHub, AWS, Azure, Auth0, Sentry, and Anthropic itself, plus the indie corpus most teams adopt by name: FastMCP, mcp-installer, Klavis, the LastMile agent stack), SkillAudit v0.2.1 reports the following grade distribution: 19 A, 0 B, 30 C, 10 D, 42 F. The full methodology and aggregate findings are in the state-of-MCP-security post; the per-vendor F-grade breakdown names every file path the engine flagged for the 29 vendor-official F's. Both posts are open. Both link directly to the per-repo report cards. Nothing in this playbook is paywalled; your security engineer can verify every grade by clicking the audit links.

Three things to read off the distribution before we get to the policy.

  1. A min-grade-A gate is not viable in 2026. Only 19 of 101 community MCPs would clear it, and most of the A's cover a narrow surface (Pinecone, Redis, Snowflake, Couchbase, Microsoft Playwright, ElevenLabs, Qdrant, Vectara, Meilisearch, Zilliz Milvus, the LangChain adapters, FireCrawl, Exa, FastAPI-MCP, fetch-mcp, DuckDuckGo). Forcing your developers to pick only from these 19 turns "we have a policy" into "we have a freeze", which turns into shadow installs.
  2. There are zero B grades. That is not a bug in the engine; it is a property of the corpus: the rubric currently lands a repo on A (clean across all six axes), on C (one-axis warning, others clean), or worse. A v0.3 calibration update will likely produce some B's; until then, B-and-up is functionally equivalent to A-and-up.
  3. A min-grade-C gate clears 49 of 101 repos for install (19 A + 30 C), a working choice set that includes most of the database, vector-search, and developer-tool MCPs your team actually wants. Start there.
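
The arithmetic behind those three observations can be checked in a few lines of shell. This is a minimal sketch that hard-codes the counts reported in this post; the function name cleared_at is illustrative only and is not part of SkillAudit.

```shell
#!/bin/sh
# Grade counts as reported in this post: 19 A, 0 B, 30 C, 10 D, 42 F (101 total).
# cleared_at MIN prints how many repos a min-grade-MIN gate lets through.
# (cleared_at is a hypothetical helper name used only for this sketch.)
cleared_at() {
  a=19; b=0; c=30; d=10
  case "$1" in
    A) echo "$a" ;;
    B) echo $((a + b)) ;;
    C) echo $((a + b + c)) ;;
    D) echo $((a + b + c + d)) ;;
    *) echo "unknown grade: $1" >&2; return 1 ;;
  esac
}

# Print cleared and blocked counts for each candidate threshold.
for g in A B C D; do
  n=$(cleared_at "$g")
  echo "min-grade-$g: $n cleared, $((101 - n)) blocked"
done
```

The min-grade-C line reads 49 cleared and 52 blocked, which is the headline number. The A and B lines are identical because the corpus currently has no B grades.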

The policy in one paragraph

That is the whole policy. Six sentences. Read it, paste it into your security wiki, and move to the rollout.

Step 1 — Inventory what your team already has installed

Before you can gate new installs you need to know which MCP servers are already in production agents on your team's laptops. The single source of truth is .claude/plugins.lock in the developer's home directory or the team-managed agent profile, with the equivalent in ~/.cursor/extensions/, ~/.windsurf/plugins/, and ~/.config/codex/plugins.json. Most teams in 2026 do not yet inventory these — there is no equivalent of npm ls for MCP plugins. Three approaches that work in week 1:

  1. Developer self-attestation — a 5-minute Slack survey: "list every MCP server you have installed across every agent, with the GitHub URL or npm name". Goes 70% of the way; misses the ones developers forgot they installed for a hackathon.
  2. Filesystem walk — a one-line script across all team laptops via your MDM (find ~ -name 'plugins.lock' -o -name 'mcp_settings.json' 2>/dev/null covers Claude Code, Cursor, and Windsurf). Combine the lockfiles. This catches the forgotten installs.
  3. Outbound DNS log review — pull a week of egress connections from your endpoint-protection vendor (CrowdStrike, SentinelOne, Defender) and grep for npmjs.org, github.com, raw.githubusercontent.com, plus the per-vendor MCP installer endpoints (Cloudflare, Anthropic, etc.). MCPs running stdio that fetch their dependencies show up here. This catches the install-by-curl shadow paths.

Run all three. Cross-reference. Land at a single Google Sheet or Notion table with columns repo URL, installed by, used in agent (which one), last update, SkillAudit grade. The grade column is filled by pasting the GitHub URL into the audit form for any repo not already on the public board.

Step 2 — Pick the threshold (and why C is right for week 1)

The grade distribution above gives you four candidate thresholds.

Recommended: ship at minimum C in week 1. Tighten to B once the v0.3 calibration update produces a real B-grade band. Allow named exceptions per the policy paragraph above for the D-grade installs that have a fix-or-replace deadline on file.

Step 3 — Wire the CI gate

This is the GitHub Action, about 40 lines. Drop it into .github/workflows/mcp-gate.yml in any repo whose .claude/plugins.lock is committed (any team-managed agent profile typically commits the lockfile to a private team repo for tracking). It runs on every PR that touches the lockfile, looks up each new entry in the SkillAudit public grade index, and fails the check if any new entry is below the threshold. There is no API key, no auth, and no billing path; public grades are open.

# .github/workflows/mcp-gate.yml
name: mcp-install-gate
on:
  pull_request:
    paths:
      - '.claude/plugins.lock'
jobs:
  gate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 2 }  # PR merge commit plus its parents, so HEAD^1 is the base branch
      - name: Diff added MCP entries
        id: diff
        run: |
          # A depth-2 checkout of the PR merge ref has no origin/main, so diff against HEAD^1.
          # The sed anchors on the first quoted string so a trailing "version" value is not captured.
          git diff HEAD^1 HEAD -- .claude/plugins.lock \
            | grep -E '^\+[[:space:]]+"[A-Za-z0-9_-]+/[A-Za-z0-9_.-]+"' \
            | sed -E 's/^[^"]*"([^"]+)".*/\1/' > added.txt
          echo "count=$(wc -l < added.txt)" >> "$GITHUB_OUTPUT"
      - name: Check each entry against SkillAudit
        if: steps.diff.outputs.count != '0'
        env:
          MIN_GRADE: 'C'  # one of: A, B, C, D
        run: |
          rank() { case "$1" in A) echo 4;; B) echo 3;; C) echo 2;; D) echo 1;; *) echo 0;; esac; }
          MIN=$(rank "$MIN_GRADE"); FAIL=0
          # Fetch the index once; a failed download aborts the step instead of reading as "not audited".
          curl -fsSL "https://skillaudit.dev/audit-index.json" -o audit-index.json
          while read -r repo; do
            slug=$(echo "$repo" | tr '/' '-')
            grade=$(jq -r --arg k "$repo" '.[$k].grade // empty' audit-index.json)
            [ -z "$grade" ] && { echo "::error::$repo not yet audited: submit at https://skillaudit.dev/#audit-req"; FAIL=1; continue; }
            if [ "$(rank "$grade")" -lt "$MIN" ]; then
              echo "::error::$repo grade=$grade below MIN_GRADE=$MIN_GRADE, see https://skillaudit.dev/audits/$slug/"
              FAIL=1
            else
              echo "$repo grade=$grade OK"
            fi
          done < added.txt
          exit "$FAIL"

Two things worth saying about this gate before you ship it. First, it talks to https://skillaudit.dev/audit-index.json directly — that endpoint is a static JSON file refreshed on every corpus re-scan and served from CDN; there is no rate limit and no auth. Second, the ::error:: output uses GitHub's standard annotation syntax so the failure shows up inline on the PR, not just in the run log — the developer who tried to add an F-grade MCP sees the link to the audit page they should read before they argue with their security engineer about an exception.
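
Because the gate depends on a grep pattern matching your lockfile's entry format, it is worth dry-running the extraction and the grade comparison locally before trusting the check. The sketch below does that against a made-up diff. The lockfile layout shown (a quoted owner/repo key followed by a version string) is an assumption, and so are the helper names extract_added, passes_gate, and sample_diff. The pattern here accepts uppercase letters and takes the first quoted string on the line.

```shell
#!/bin/sh
# extract_added: read a unified diff on stdin, print each added "owner/repo" key.
extract_added() {
  grep -E '^\+[[:space:]]+"[A-Za-z0-9_-]+/[A-Za-z0-9_.-]+"' \
    | sed -E 's/^[^"]*"([^"]+)".*/\1/'
}

# Same ranking as the gate: unknown grades (including F) rank 0.
rank() { case "$1" in A) echo 4;; B) echo 3;; C) echo 2;; D) echo 1;; *) echo 0;; esac; }

# passes_gate GRADE MIN_GRADE: exit 0 if GRADE is at or above MIN_GRADE.
passes_gate() { [ "$(rank "$1")" -ge "$(rank "$2")" ]; }

# Hypothetical diff; swap in real `git diff` output from your own lockfile.
sample_diff() {
  cat <<'EOF'
--- a/.claude/plugins.lock
+++ b/.claude/plugins.lock
@@ -1,4 +1,5 @@
   "acme/alpha-mcp": "1.2.0",
+  "Acme/beta-mcp": "0.3.1",
-  "acme/old-mcp": "0.1.0",
+  "version": 2
EOF
}

sample_diff | extract_added
for g in A C D F; do
  if passes_gate "$g" C; then echo "$g passes"; else echo "$g blocked"; fi
done
```

On the sample this prints Acme/beta-mcp, then reports A and C as passing and D and F as blocked. If your real lockfile yields an empty list for a PR that clearly adds an entry, adjust the pattern before relying on the gate, since an empty list makes the check step skip.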

Step 4 — Set the re-scan cadence

Grades age. A maintainer who fixes the one SSRF that pulled them down to C lands at A on the re-scan. A maintainer who pushes a bad commit that breaks the SSRF allow-list moves the same repo from A back to F. The policy paragraph specifies 30 days or on plugins.lock change, whichever comes first; here is what that looks like operationally.

  1. A weekly cron in your team-policy repo runs the same audit lookup against every entry in plugins.lock and writes a Slack notification for any grade change. (One CI workflow with schedule: cron: '0 9 * * 1'; same script as the gate above, different trigger.)
  2. Any grade drop below C triggers a 14-day window during which the engineer who installed that MCP must (a) re-scan against a newer release if one exists, (b) replace it with a higher-graded MCP that solves the same problem, or (c) document a written exception with re-scan deadline.
  3. Grade improvements (e.g. C → A on a re-scan after a maintainer fix) are logged but not noisy — the inventory table updates, no Slack ping needed.
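
The three rules above can be sketched as a comparison between last week's grades and this week's. The snapshot format (one "owner/repo GRADE" pair per line), the function name grade_changes, and the REMEDIATE / NOTIFY / LOG tags are all assumptions made for illustration; posting to Slack is left out.

```shell
#!/bin/sh
# grade_changes OLD_SNAPSHOT NEW_SNAPSHOT [MIN_GRADE]
# Snapshots are text files with one "owner/repo GRADE" pair per line
# (a hypothetical format; produce it however your lookup script prefers).
# Output tags:
#   REMEDIATE  grade dropped and is now below MIN_GRADE (opens the 14-day window)
#   NOTIFY     grade dropped but still clears MIN_GRADE
#   LOG        grade improved (update the inventory table, no ping)
grade_changes() {
  awk -v min="${3:-C}" '
    function rank(g) {
      if (g == "A") return 4
      if (g == "B") return 3
      if (g == "C") return 2
      if (g == "D") return 1
      return 0
    }
    FILENAME == ARGV[1] { old[$1] = $2; next }   # first file: remember old grades
    !($1 in old) || old[$1] == $2 { next }       # new or unchanged entries: nothing to report
    {
      was = rank(old[$1]); now = rank($2)
      if (now < was && now < rank(min)) tag = "REMEDIATE"
      else if (now < was)               tag = "NOTIFY"
      else                              tag = "LOG"
      print tag, $1, old[$1] "->" $2
    }
  ' "$1" "$2"
}

# Usage (file names are placeholders):
#   grade_changes grades-last-week.txt grades-this-week.txt C
```

Entries that appear only in the new snapshot are skipped on purpose, on the assumption that the PR gate already judged them when they were added.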

Step 5 — Communicate the policy and exception process

The hard part of this is not the technical gate; it is the social agreement. Three things to land before the policy goes live in CI:

The 12-week rollout calendar

For the security engineer the VP-Engineering hands this to next Monday, here is the calendar.

Common gotchas

Vendor-official is not a security signal

The single biggest mistake teams make in week 1 is to allow-list vendor-official releases past the gate. Twenty-nine of the forty-two F's in our corpus are vendor-official; we name every one with the file path. The dev-rel team that wrote the demo MCP for the conference is not the security team that audits the SaaS API. Brand is not the signal you are looking for.

Repo-wide token scopes hide single-tool blast-radius

Several F-grade MCPs (Heroku, Auth0, MongoDB) attach a bearer token to every outbound fetch() call by construction in the API client layer. The blast-radius of a single SSRF in a single tool handler is therefore "the entire token scope" — not just the one record that tool was supposed to read. The policy paragraph's "named owner + written remediation" requirement should trigger a token-scope review for any exception involving an MCP that uses a single shared token; the easiest mitigation is to scope the token down before granting the exception.

Examples and scripts do count when developers copy-paste from them

The honest calibration note in the per-vendor F-grade post distinguishes runtime-tool-surface F's from F's partially driven by scripts/, benchmarks/, samples/, or examples/ findings. For an audit-the-engine-itself view this distinction matters; for a team adopting the MCP, less so. If a developer reads the README, follows the example, and the example contains the SSRF — that's a real install-time path, not a calibration artifact. The v0.3 engine update will weight these less aggressively, but the policy should treat "F driven by examples" the same as "F driven by runtime tool surface" until the maintainer fixes both.

Community MCPs ship without a CHANGELOG more often than not

The maintenance-axis check on the SkillAudit rubric explicitly looks for a CHANGELOG / RELEASE-NOTES / Releases-tab presence; over half of the F-grade community MCPs in our corpus have none. If you cannot tell what changed between the version your team installed and the version the upstream tagged this morning, you cannot run the re-scan-on-version-change cadence Step 4 calls for. Prefer MCPs with explicit version tags and visible release notes over MCPs that ship from main with no release process.
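
A quick local approximation of that preference, for a repo you have already cloned, might look like the sketch below. It is not how SkillAudit implements its maintenance-axis check; the function name has_release_notes and the exact file-name patterns are assumptions, and it cannot see a GitHub Releases tab, only files at the top of the checkout.

```shell
#!/bin/sh
# has_release_notes DIR
# Exit 0 if the top level of DIR contains a changelog-style file.
# The name patterns are a guess at common conventions, not an official list.
has_release_notes() {
  for f in "$1"/CHANGELOG* "$1"/Changelog* "$1"/changelog* \
           "$1"/RELEASE-NOTES* "$1"/RELEASE_NOTES* "$1"/release-notes*; do
    # An unmatched glob stays literal and fails the -f test, so it is skipped.
    [ -f "$f" ] && return 0
  done
  return 1
}

# Usage (path is a placeholder):
#   has_release_notes ./some-mcp-checkout || echo "no changelog found at repo root"
```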
