Engineering · 2026-06-03

MCP server security CI/CD pipeline: a complete build pipeline audit checklist

Most MCP servers ship straight from a developer's laptop. A security review that only fires when you submit to a directory catches the problem after code is merged, after staging has been running it for a week, and potentially after early adopters have already installed it. This post builds the complete pipeline that catches issues before any of that: pre-commit static analysis, a SkillAudit grade gate on every PR, lockfile enforcement, permission manifest validation, and branch protection settings that make all of it non-bypassable.

Why CI gates are necessary — not just a post-deploy audit

The typical MCP server publication flow looks like this: a developer writes the server locally, tests it against their own Claude installation, then either pushes to a GitHub repo and submits the URL to a directory, or publishes directly to npm. There is no formal security review step. A directory audit — including SkillAudit's — happens asynchronously after submission. The vulnerability is already in a tagged release before it surfaces.

Consider the timeline when there is no CI gate. A developer writes a tool handler that calls fetch(args.url) without validating the URL scheme or host, a classic SSRF. The code merges to main. It deploys to staging, where it runs for a week. An early adopter installs the server from the main branch. The directory submission triggers an audit. The audit flags the SSRF. Now the fix requires a new release, a re-scan, updated install instructions for early adopters, and a notice to every user who installed the vulnerable version. The cost of the finding multiplies with every step it travels past the commit that introduced it.

A CI gate collapses that timeline. The SSRF surfaces on the pull request that introduced it — before merge, before staging, before any user installs it. The cost of the finding is a code review comment and a one-line fix. This is the economic argument for shifting security left: the same finding is dramatically cheaper to fix at the PR stage than at the post-deploy audit stage.

The five steps below build a pipeline that enforces this shift. They are ordered by where they fire in the developer workflow, from earliest (pre-commit) to latest (branch protection). Each step is independently valuable; they stack.

Step 1: Pre-commit hooks for static analysis

Pre-commit hooks run on the developer's machine before a commit is created. They are the earliest possible catch point. The tradeoff is that they must be fast: a hook that adds more than about 5 seconds to every commit tends to get disabled or bypassed. That means lightweight rules only: obvious antipatterns that a cheap regex or a fast linter can catch.

Set up husky with two checks: ESLint with a focused security config, and a semgrep invocation against a narrow pattern set.

# Install husky and set up the hook
npm install --save-dev husky
npx husky init
# Creates .husky/pre-commit; replace its contents with the script below

#!/usr/bin/env sh
# .husky/pre-commit
set -e

echo "→ ESLint security check..."
npx eslint --ext .ts src/ --rule 'no-eval: error' \
  --rule 'no-new-func: error' \
  --rule 'no-implied-eval: error'

echo "→ SSRF pattern check (fetch with unvalidated args)..."
# Fail if source directly interpolates tool args into fetch() calls
if grep -rn 'fetch(args\.' src/; then
  echo ""
  echo "ERROR: Possible SSRF — fetch() called with raw tool argument."
  echo "Validate and allowlist the URL before passing to fetch()."
  echo "See: https://skillaudit.dev/blog/mcp-server-permissions-checklist/"
  exit 1
fi

echo "→ semgrep credential-echo patterns..."
semgrep --config auto \
  --include='*.ts' \
  --error \
  --quiet \
  src/

echo "✓ Pre-commit checks passed."

Keep it fast. The ESLint call above runs in about 1 second on a typical MCP server source tree, and the semgrep call with --config auto adds 2–3 seconds, for a total under 5 seconds. Note that --config auto fetches its rules from the Semgrep Registry, so the hook needs network access and its runtime depends on the connection. If semgrep is too slow on your codebase, replace --config auto with a narrower ruleset: --config p/secrets (credential detection only) runs in under a second.

Pre-commit hooks are advisory on the local machine — a developer can bypass them with git commit --no-verify. This is expected. The CI gate in Step 2 is the enforcement layer; the pre-commit hook is the fast-feedback layer that catches the obvious issue before it becomes a PR discussion.
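
When the SSRF check fires, the error message asks for the URL to be validated and allowlisted before it reaches fetch(). A minimal sketch of what that could look like in a tool handler follows. The ALLOWED_HOSTS list and the validateToolUrl name are illustrative assumptions, not part of any MCP SDK or of SkillAudit.

```typescript
// Hypothetical allowlist: replace with the hosts your server legitimately calls.
const ALLOWED_HOSTS: string[] = ["api.example.com"];

/**
 * Parse and validate a URL supplied as a tool argument.
 * Throws if the value is malformed, is not https, or targets a host
 * that is not on the allowlist.
 */
function validateToolUrl(raw: string): URL {
  let url: URL;
  try {
    url = new URL(raw);
  } catch (err) {
    throw new Error("Invalid URL in tool argument");
  }
  if (url.protocol !== "https:") {
    throw new Error("Blocked URL scheme: " + url.protocol);
  }
  if (ALLOWED_HOSTS.indexOf(url.hostname) === -1) {
    throw new Error("Host not on allowlist: " + url.hostname);
  }
  return url;
}

// In a tool handler, validate first and fetch the parsed URL:
//   const res = await fetch(validateToolUrl(args.url));
```

Because the handler now passes the return value of a function to fetch() instead of args.url directly, the grep pattern in the hook no longer matches. An allowlist is used here rather than a blocklist of private address ranges because it fails closed when a new internal hostname appears.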

Step 2: SkillAudit CLI check in the GitHub Actions PR gate

The pre-commit hook catches obvious antipatterns. The SkillAudit CI gate runs the full six-axis audit — security, credentials, permissions, maintenance, compatibility, documentation — and fails the PR if the grade drops below the team's threshold or if any individual axis regresses versus the main branch. This is the core enforcement step.

This feature (CI webhook integration with --fail-on-drop) is available on the Pro and Team plans. The --fail-on-drop flag compares the current branch's grade on each axis against the main branch's last recorded grade. It fails if any axis regresses, even when the overall letter grade stays the same. This catches the common case where a PR adds a WARN-level finding on the permissions axis but the overall grade stays at B — without --fail-on-drop, that regression silently merges.

# .github/workflows/skillaudit-gate.yml
name: skillaudit-grade-gate
on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  skillaudit:
    name: SkillAudit grade gate (min B)
    runs-on: ubuntu-latest
    permissions:
      contents: read         # needed by actions/checkout in private repos once permissions are set explicitly
      pull-requests: write   # needed to post the grade comment

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0     # full history needed for --fail-on-drop comparison

      - name: Run SkillAudit grade check
        env:
          SKILLAUDIT_API_TOKEN: ${{ secrets.SKILLAUDIT_API_TOKEN }}
        run: |
          npx skillaudit check \
            --min-grade B \
            --fail-on-drop \
            --token "$SKILLAUDIT_API_TOKEN"
        # --min-grade B  → fail if overall grade is below B
        # --fail-on-drop → also fail if any axis regresses vs main branch,
        #                  even when the overall grade stays at B or above

      - name: Post grade badge as PR comment
        if: always()    # post comment even if the grade check failed
        env:
          SKILLAUDIT_API_TOKEN: ${{ secrets.SKILLAUDIT_API_TOKEN }}
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          npx skillaudit comment \
            --pr ${{ github.event.pull_request.number }} \
            --token "$SKILLAUDIT_API_TOKEN"
        # Posts a comment with the grade badge, axis scores, and a link to
        # the full report at skillaudit.dev — so reviewers can see the audit
        # detail without leaving GitHub

The grade comment includes a summary table of all six axis scores and a direct link to the full report on skillaudit.dev. Reviewers can click through to see the exact finding, the file path, the line number, and the remediation suggestion — without needing to run the audit themselves. This is load-bearing for developer experience: a PR blocked with "grade dropped to C" and no further context produces frustration; a PR blocked with a direct link to the specific finding produces a fix.

To establish the initial grade baseline on main before wiring in the gate: run npx skillaudit check --token <token> on main and commit a .skillaudit-baseline.json file (generated automatically by the CLI) to the repo root. The gate uses this file for the --fail-on-drop comparison.
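
The two conditions the gate enforces (an absolute minimum grade and no per-axis regression against the baseline) can be sketched as a small pure function. This is an illustration of the logic described above, not the CLI's actual implementation: the AuditResult shape, the A to F letter scale per axis, and the gateFailures name are all assumptions, and the real format of .skillaudit-baseline.json is not shown in this post.

```typescript
// Hypothetical shapes, for illustration only.
type Grade = "A" | "B" | "C" | "D" | "F";

interface AuditResult {
  overall: Grade;
  axes: { [axis: string]: Grade | undefined };
}

// Lower rank is better: A = 0 ... F = 4.
const RANK: { [G in Grade]: number } = { A: 0, B: 1, C: 2, D: 3, F: 4 };

/** Return the reasons a PR should be blocked; an empty array means the gate passes. */
function gateFailures(current: AuditResult, baseline: AuditResult, minGrade: Grade): string[] {
  const failures: string[] = [];

  // Absolute threshold, analogous to --min-grade.
  if (RANK[current.overall] > RANK[minGrade]) {
    failures.push("overall grade " + current.overall + " is below minimum " + minGrade);
  }

  // Per-axis regression check, analogous to --fail-on-drop.
  for (const axis of Object.keys(baseline.axes)) {
    const before = baseline.axes[axis];
    const after = current.axes[axis];
    if (before === undefined) continue;
    if (after === undefined) {
      failures.push(axis + " is missing from the current audit");
    } else if (RANK[after] > RANK[before]) {
      failures.push(axis + " regressed from " + before + " to " + after);
    }
  }
  return failures;
}
```

With a baseline of overall B and permissions A, a PR that leaves the overall grade at B but drops permissions to B yields one failure from the second loop. That is the case a threshold-only gate would let through.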

Step 3: Dependency lockfile enforcement

Use npm ci — not npm install — in CI. The difference matters for security: npm install updates the lockfile if it is out of sync with package.json, silently pulling a newer version of a dependency. npm ci fails hard if the lockfile is out of sync. In CI, you want the hard failure — it means a developer's local change to package.json (perhaps adding a caret range to a new dependency) cannot silently flow through to production without being caught.

Why this matters for supply-chain security: a compromised transitive dependency does not need to touch your package.json at all. It only needs the lockfile to be absent or stale so that npm resolves the dependency tree fresh at install time. With a missing or stale lockfile, npm install fetches the newest version of every transitive dependency that satisfies the declared ranges, including a version that was just poisoned by an attacker. With npm ci and a committed, up-to-date lockfile, the build installs exactly the versions recorded in the lockfile, so the newly published poisoned version is never resolved. The lockfile's SHA-512 integrity hashes add a second guard: if the tarball for a locked version is swapped or tampered with, the hash check fails and the build stops. For a deeper walkthrough of the supply-chain attack timeline this prevents, see Dependency pinning for MCP servers.

      - name: Verify lockfile is in sync
        run: |
          # npm ci fails if package-lock.json is not in sync with package.json.
          # This catches developers who ran `npm install` locally without committing
          # the updated lockfile, or who manually edited package.json.
          npm ci

      - name: Audit for known CVEs
        run: |
          npm audit --audit-level=high
          # Fails the build if any HIGH or CRITICAL advisory is present
          # in the installed dependency tree.
          # Use --audit-level=moderate to catch more (recommended for new projects).
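
The integrity guarantee described above only holds for lockfile entries that actually carry a SHA-512 hash. As a rough sketch, the function below scans a parsed package-lock.json (lockfileVersion 2 or 3, which keep entries under a packages map) and lists the entries that are downloaded from somewhere but have no sha512 integrity value. The entriesWithoutSha512 name is illustrative, and the interfaces model only the fields used here.

```typescript
// Minimal view of package-lock.json; only the fields this check reads.
interface LockEntry {
  version?: string;
  resolved?: string;
  integrity?: string;
  link?: boolean;
}

interface Lockfile {
  packages?: { [path: string]: LockEntry | undefined };
}

/** List lockfile entries that have a download location but no sha512 integrity hash. */
function entriesWithoutSha512(lock: Lockfile): string[] {
  const packages = lock.packages || {};
  return Object.keys(packages).filter((path) => {
    const entry = packages[path];
    // Skip the root project ("") and workspace symlinks.
    if (path === "" || entry === undefined || entry.link === true) return false;
    // Entries with no "resolved" field (for example bundled deps) are not downloaded.
    if (entry.resolved === undefined) return false;
    const integrity = entry.integrity;
    return integrity === undefined || integrity.indexOf("sha512-") !== 0;
  });
}
```

In practice this tends to flag git or tarball URL dependencies, which npm records without a registry integrity hash. Whether those should fail the build or only warn is a policy choice for the team.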

What npm audit catches that SkillAudit does not. The SkillAudit maintenance axis checks for floating ranges, missing lockfiles, and advisory presence at scan time. npm audit in CI is a live check against the current npm advisory database — it will catch a CVE that was published after the last SkillAudit scan. Run both. They are complementary: SkillAudit covers the full security surface of your server code; npm audit covers the live advisory state of your dependency tree.
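
The --audit-level flag is a threshold, not an exact match: high fails on high and critical advisories, and moderate adds moderate ones. The sketch below mirrors that rule for teams that post-process audit results themselves. It assumes the per-severity counts come from the metadata.vulnerabilities object of npm audit --json; treat that shape as an assumption to verify against your npm version. The failsAuditLevel name is illustrative.

```typescript
// Severity order, lowest to highest.
const SEVERITIES = ["info", "low", "moderate", "high", "critical"] as const;
type Severity = (typeof SEVERITIES)[number];
type SeverityCounts = { [K in Severity]?: number };

/** True when at least one advisory at or above `level` is present. */
function failsAuditLevel(counts: SeverityCounts, level: Severity): boolean {
  const threshold = SEVERITIES.indexOf(level);
  return SEVERITIES.some((sev, i) => i >= threshold && (counts[sev] || 0) > 0);
}
```

For a tree with three moderate advisories and nothing higher, failsAuditLevel(counts, "high") is false and failsAuditLevel(counts, "moderate") is true, which is the difference between the two settings mentioned in the workflow comment.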

Step 4: Permission manifest validation

The most common permissions hygiene issue in the corpus is over-permissioned manifests: an mcp-manifest.json that declares fs:write because the author wasn't sure if they needed it, when the actual source code only ever calls fs.readFile. The declared permissions are what users and directories see. Unnecessary permissions erode trust — and they widen blast radius if the server is ever compromised.

The shell script below automates this check. It greps the source for actual filesystem and network API calls, then compares them against what the manifest declares. If the manifest declares a permission that no source call requires, the check warns. If the source makes a call that the manifest doesn't declare, the check fails hard — that's an undeclared capability, which is a security finding. See Permission scope patterns from the corpus for the full taxonomy of permission declarations the check covers.

#!/usr/bin/env bash
# scripts/validate-permissions.sh
# Compares declared permissions in mcp-manifest.json against actual API calls in src/.
set -euo pipefail

MANIFEST="mcp-manifest.json"
SRC="src"
FAIL=0

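# Fail loudly if the manifest is missing instead of treating it as "nothing declared".
[ -f "$MANIFEST" ] || { echo "FAIL  $MANIFEST not found"; exit 1; }

declared_perms() { jq -r '.permissions[]? // empty' "$MANIFEST" 2>/dev/null; }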

check() {
  local perm="$1"; local pattern="$2"; local label="$3"
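  # grep exits 1 when nothing matches; with `set -e -o pipefail` that would abort
  # the whole script, so neutralize the exit status before counting.
  local found; found=$( { grep -rl "$pattern" "$SRC" 2>/dev/null || true; } | wc -l | tr -d ' ')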
  local declared; declared=$(declared_perms | grep -c "^${perm}$" || true)

  if [ "$found" -gt 0 ] && [ "$declared" -eq 0 ]; then
    echo "FAIL  undeclared capability: $label ($pattern) used in src/ but '$perm' not in manifest"
    FAIL=1
  elif [ "$found" -eq 0 ] && [ "$declared" -gt 0 ]; then
    echo "WARN  over-declared: '$perm' in manifest but no $label call found in src/"
  else
    echo "PASS  $perm"
  fi
}

echo "Validating permissions manifest: $MANIFEST"
check "fs:read"    "fs.readFile\|fs.readdir\|fs.stat"       "filesystem read"
check "fs:write"   "fs.writeFile\|fs.appendFile\|fs.unlink" "filesystem write"
check "fs:exec"    "exec\|spawn\|execFile\|execSync"         "process execution"
check "network"    "fetch(\|https\.get\|http\.get\|axios"    "network call"
check "env"        "process\.env"                            "environment variable access"

if [ "$FAIL" -gt 0 ]; then
  echo ""
  echo "Permission manifest validation failed. Update mcp-manifest.json to match actual usage."
  echo "See: https://skillaudit.dev/blog/mcp-server-permissions-checklist/"
  exit 1
fi
echo "Permission manifest validation passed."

The script is intentionally broad in its grep patterns: it matches on substring, so fs.readFile catches both the callback and promise variants. The same breadth means the exec pattern also matches unrelated identifiers such as execute, so tighten the patterns if you see false positives, and adjust them if you use a wrapper library. The key principle is automation: the check runs on every PR, so a developer who adds a new fetch() call without updating the manifest sees the failure immediately, rather than when a user reads the manifest and notices the missing network permission.
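
For teams that would rather run this comparison inside a TypeScript test suite than a shell script, the same declared-versus-used diff can be sketched as a pure function. This is an illustrative port, not a SkillAudit API: the rule list mirrors the five permissions from the shell script, the regular expressions are slightly stricter than the grep patterns (for example, exec must be followed by an opening parenthesis), and the caller is assumed to have already read the manifest's permissions array and the source file contents.

```typescript
interface PermissionRule {
  perm: string;
  label: string;
  pattern: RegExp;
}

interface Finding {
  level: "FAIL" | "WARN" | "PASS";
  perm: string;
  message: string;
}

// Mirrors the five checks in the shell script.
const RULES: PermissionRule[] = [
  { perm: "fs:read", label: "filesystem read", pattern: /fs\.(readFile|readdir|stat)/ },
  { perm: "fs:write", label: "filesystem write", pattern: /fs\.(writeFile|appendFile|unlink)/ },
  { perm: "fs:exec", label: "process execution", pattern: /\b(exec|spawn|execFile|execSync)\(/ },
  { perm: "network", label: "network call", pattern: /fetch\(|https?\.get|axios/ },
  { perm: "env", label: "environment variable access", pattern: /process\.env/ },
];

/** Compare declared permissions against API usage found in the given source texts. */
function comparePermissions(declared: string[], sources: string[]): Finding[] {
  return RULES.map((rule): Finding => {
    const used = sources.some((text) => rule.pattern.test(text));
    const isDeclared = declared.indexOf(rule.perm) !== -1;
    if (used && !isDeclared) {
      // Undeclared capability: the hard failure.
      return { level: "FAIL", perm: rule.perm, message: rule.label + " used but not declared" };
    }
    if (!used && isDeclared) {
      // Over-declared permission: a warning only.
      return { level: "WARN", perm: rule.perm, message: "declared but no " + rule.label + " found" };
    }
    return { level: "PASS", perm: rule.perm, message: "declaration matches usage" };
  });
}
```

A CI wrapper would exit non-zero when any finding has level FAIL and print WARN findings without failing, matching the behaviour of the shell version.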

For teams that use a typed manifest schema (TypeScript or JSON Schema), the script can be extended to validate the manifest structure itself before running the permission comparison. The core value is the diff between declared and actual. That is what catches the ambient token problem in miniature: permissions that exist but shouldn't, creating unnecessary attack surface.

Step 5: Branch protection settings

Steps 1–4 produce security signals. Step 5 makes them non-bypassable. GitHub branch protection on your main branch ensures that no PR can merge unless all required checks pass — including by org admins who might otherwise merge directly to unblock a deadline.

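Configure the following required status checks via Settings → Branches → Branch protection rules on your repository's main branch: SkillAudit grade gate (min B), Dependency lockfile + CVE audit, and Validate permission manifest. These are the job names from the combined workflow file later in this post; GitHub reports one status check per job, not per step.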

Also enable Require branches to be up to date before merging. This matters more for security pipelines than it does for typical feature branches. Without it, a race is possible: PR A fixes a HIGH finding on the security axis, and PR B, opened from the same older main, changes overlapping code. Both PRs pass their checks against that older main. PR B merges first. PR A's green checks were computed before PR B landed, so PR A merges without ever being audited against the combined result, and the merged main can contain a regression (for example, the fix partially undone by the overlapping change) that neither PR showed on its own. Requiring branches to be up to date forces PR A to re-run its checks on top of PR B, which closes this race.

To configure via the GitHub API (useful for automating this across multiple repos in an org):

# Set branch protection via GitHub CLI (gh); run once per repo.
# The body is sent as JSON because --field would pass nested objects as plain strings.
# "contexts" must match the workflow's job names, not its step names.
gh api repos/{owner}/{repo}/branches/main/protection \
  --method PUT \
  --input - <<'JSON'
{
  "required_status_checks": {
    "strict": true,
    "contexts": ["SkillAudit grade gate (min B)", "Dependency lockfile + CVE audit", "Validate permission manifest"]
  },
  "enforce_admins": true,
  "required_pull_request_reviews": {"required_approving_review_count": 1},
  "restrictions": null
}
JSON

The enforce_admins: true flag is load-bearing. Without it, org admins can merge PRs that bypass required checks by using the GitHub web UI's "Merge without waiting for requirements to be met" button. In practice, this button gets clicked by senior engineers under deadline pressure — exactly the scenarios where a security regression is most likely to slip through. Enable enforce_admins from day one and treat exception requests as the signal to fix the pipeline friction, not as a reason to weaken the protection.

Putting it all together: the complete GitHub Actions workflow

The following workflow file combines Steps 2, 3, and 4 into a single .github/workflows/mcp-security.yml. It runs on every pull request, produces three named status checks (one per job, and these are the checks that branch protection in Step 5 should require), and posts a grade comment to the PR.

# .github/workflows/mcp-security.yml
name: mcp-security
on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:

  # ── Job 1: SkillAudit grade gate ──────────────────────────────────────────
  skillaudit:
    name: SkillAudit grade gate (min B)
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Run SkillAudit grade check
        env:
          SKILLAUDIT_API_TOKEN: ${{ secrets.SKILLAUDIT_API_TOKEN }}
        run: |
          npx skillaudit check \
            --min-grade B \
            --fail-on-drop \
            --token "$SKILLAUDIT_API_TOKEN"

      - name: Post grade badge as PR comment
        if: always()
        env:
          SKILLAUDIT_API_TOKEN: ${{ secrets.SKILLAUDIT_API_TOKEN }}
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          npx skillaudit comment \
            --pr ${{ github.event.pull_request.number }} \
            --token "$SKILLAUDIT_API_TOKEN"

  # ── Job 2: Dependency lockfile + CVE audit ────────────────────────────────
  dependency-audit:
    name: Dependency lockfile + CVE audit
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Verify lockfile is in sync
        run: npm ci

      - name: Audit for known CVEs
        run: npm audit --audit-level=high

  # ── Job 3: Permission manifest validation ─────────────────────────────────
  permission-manifest:
    name: Validate permission manifest
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Validate permission manifest
        run: |
          chmod +x scripts/validate-permissions.sh
          bash scripts/validate-permissions.sh

Each job maps directly to a named status check. When you configure branch protection (Step 5), the job name becomes the required check name — so the names shown here match the required checks list exactly. Keep them in sync if you rename a job.
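
Keeping the names in sync is easy to get wrong, because a required check that no job reports under that exact name stays pending and blocks every PR, while a renamed job quietly stops being required. The sketch below compares the two lists. The checkNameDrift name is illustrative, and the caller is assumed to supply the required contexts (from the branch protection settings) and the job names (from the workflow file) as plain string arrays.

```typescript
interface Drift {
  missing: string[];    // required by branch protection, but no job has this name
  unrequired: string[]; // job exists, but branch protection does not require it
}

/** Compare required status check names against workflow job names. */
function checkNameDrift(requiredContexts: string[], jobNames: string[]): Drift {
  return {
    missing: requiredContexts.filter((ctx) => jobNames.indexOf(ctx) === -1),
    unrequired: jobNames.filter((job) => requiredContexts.indexOf(job) === -1),
  };
}
```

Running a check like this in a scheduled job, rather than on PRs, avoids the circular case where the drift check itself is blocked by the drift it is meant to report.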

The three jobs run in parallel. On a typical MCP server, the full pipeline completes in under 3 minutes: SkillAudit is the longest step at roughly 90 seconds for the full six-axis audit; npm ci is fast because setup-node's cache: 'npm' option restores npm's download cache (it does not cache node_modules itself); the permission manifest script runs in under a second. Because the jobs run in parallel, wall-clock time is set by the slowest job: 90–120 seconds before a PR is unblocked. That is an acceptable overhead for a security gate that catches real issues.

Minimum viable pipeline vs. full pipeline

If 30 minutes is your budget, wire up the minimum viable version first and layer in the full pipeline over the following sprint. The minimum viable pipeline catches supply-chain CVEs and enforces basic branch hygiene — the two highest-leverage checks for most MCP server teams starting from zero.

Check | Minimum viable | Full pipeline
Pre-commit hooks (husky + ESLint) | Optional | ✓ husky + ESLint + semgrep
SkillAudit CI grade gate | Not included | ✓ grade ≥ B, --fail-on-drop
npm ci (lockfile enforcement) | ✓ | ✓
npm audit --audit-level=high | ✓ | ✓
Permission manifest validation | Not included | ✓ shell script
Branch protection (required checks) | ✓ npm ci + npm audit | ✓ all checks + up-to-date

The minimum viable pipeline is two GitHub Actions steps plus one branch protection rule. It adds no secrets, no external service accounts, and no scripts. Any team with a GitHub repo can enable it in under 30 minutes. Start there, run it for a week, then layer in the SkillAudit gate once the team is accustomed to seeing security checks on PRs.

The full pipeline adds meaningful coverage that the minimum viable version misses. The SkillAudit gate catches the full surface area: SSRF patterns, credential echoing, the ambient token problem, input validation gaps, and compatibility issues — not just CVEs in dependencies. The permission manifest check catches the hygiene issues that a pure code scan misses: declared permissions that don't match actual usage. The pre-commit hooks catch obvious issues before they even leave the developer's machine. The combination of all five steps creates a security posture that no single check can replicate alone.

For teams that ship to the Anthropic Skills Directory: the directory audits use the same six-axis methodology as the SkillAudit CI gate. Wiring the gate first means your directory submission grade is predictable — you know your grade before you submit because the CI gate has been enforcing a minimum grade on every PR that built the release.

Closing: establish a grade baseline first

Before wiring in the SkillAudit CI gate, run an initial audit to establish the baseline grade that the --fail-on-drop flag compares against. If you wire the gate before establishing a baseline, the gate has no reference point for regressions — it can only check the absolute --min-grade threshold, not the delta.

Run the free audit on skillaudit.dev against your main branch, review the findings, and commit the baseline file to the repo root. Then wire in the CI gate. From that point forward, every PR that regresses any axis will be caught automatically — even if the regression is subtle enough not to drop the overall letter grade. That combination of absolute threshold and regression detection is what makes the gate genuinely useful rather than just a checkbox.

Related reading: the GitHub Action gate for MCP security grades post covers the install-gate use case (gating which MCP servers your team can install) — a complementary policy to the build pipeline gate covered here. The MCP server permissions checklist covers the full permissions surface area that the manifest validation script checks against.


Related posts: GitHub Action gate for MCP security grades · The MCP server permissions checklist · Dependency pinning for MCP servers · Permission scope patterns from the corpus
