DevSecOps Guide
MCP server security for DevSecOps: integrating SkillAudit into CI/CD pipelines
Shift security left: wire grade-gates into GitHub Actions, GitLab CI, and pre-commit hooks so a server that fails the SSRF or prompt-injection check never reaches the registry.
2026-06-10 · DevSecOps Guide · 12 min read
Why CI/CD is the right choke point
MCP security vulnerabilities are discovered late. The typical sequence: a developer ships a server, a security team runs a post-hoc scan, findings pile into a backlog, and by the time the fix lands the server has been installed by dozens of Claude Code users or approved in an enterprise directory.
The DevSecOps answer is to move the audit into the pipeline — treat a failing grade the same way you'd treat a failing unit test or a failing SAST scan. The build is red. The merge is blocked. The author sees the issue in the same PR review where they'd fix any other bug.
SkillAudit's CI webhook makes this mechanical. You add a GitHub Action or GitLab CI job that calls the audit API, parses the returned grade, and sets the exit code accordingly. If the grade is below your policy threshold, the job fails and the merge is blocked. No humans required in the normal path; the security team's job becomes setting policy, not triaging individual findings.
What the CI webhook returns
A SkillAudit audit API call returns a JSON report card. The structure you need for grade-gating:
{
  "audit_id": "au_8f3b4c2d",
  "repo": "https://github.com/myorg/mcp-filesystem-server",
  "grade": "B",
  "score": 74,
  "axes": {
    "security": { "grade": "B", "score": 71, "findings": 2 },
    "permissions": { "grade": "A", "score": 95, "findings": 0 },
    "credentials": { "grade": "A", "score": 100, "findings": 0 },
    "maintenance": { "grade": "B", "score": 68, "findings": 1 },
    "compatibility": { "grade": "C", "score": 52, "findings": 3 },
    "documentation": { "grade": "B", "score": 80, "findings": 0 }
  },
  "critical_findings": [
    {
      "axis": "security",
      "code": "SSRF-001",
      "severity": "HIGH",
      "file": "src/tools/fetch.ts",
      "line": 47,
      "message": "fetch() called with unvalidated URL: SSRF risk"
    }
  ],
  "badge_url": "https://skillaudit.dev/badge/au_8f3b4c2d.svg",
  "report_url": "https://skillaudit.dev/reports/au_8f3b4c2d"
}
The fields you'll use most in policy scripts are grade (the overall letter), score (a 0–100 number for threshold comparisons), axes.security.grade (for a stricter gate on security specifically), and critical_findings (any HIGH-severity finding can be a hard block regardless of the overall grade).
Strategy 1: fail-closed (recommended for production)
Fail-closed: block merges below grade threshold
The pipeline job fails (non-zero exit) if the audit grade falls below the configured minimum. PR merge is blocked until the author either fixes the findings or gets a security-team exemption override.
Pros
- Zero insecure servers reach the registry
- Author fixes issues in context
- Audit trail on every merge
- Security policy is code, not a checklist
Cons
- Can block urgent fixes if scanner is down
- Requires an exemption workflow
- Grade drops on dependency update PRs need policy
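The first con, a scanner outage blocking urgent fixes, can be softened by retrying the audit call a few times before the job gives up. The following is a minimal sketch, not part of SkillAudit: `retry_with_backoff` is a hypothetical helper, and the attempt count and delays are assumptions to tune for your pipeline.

```bash
#!/usr/bin/env bash
# retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or MAX_ATTEMPTS is reached,
# doubling the delay after each failure. It returns the last exit status, so a
# gate built on it still fails closed if the scanner stays down.
retry_with_backoff() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1 status=0
  while true; do
    "$@" && return 0
    status=$?
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempt(s)" >&2
      return "$status"
    fi
    echo "attempt $attempt failed (exit $status); retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))
  done
}
```

In a pipeline step this would wrap the existing call, for example `RESPONSE=$(retry_with_backoff 3 5 curl -sf ... https://skillaudit.dev/api/v1/audits)`. Because the final failure is still propagated, the job stays fail-closed; the retries only absorb short outages.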
Strategy 2: fail-open (recommended for onboarding)
Fail-open: annotate and warn, never block
The pipeline job always exits 0, but posts the audit report as a PR comment and annotates findings with the inline annotation API. Developers see the security feedback but aren't blocked. Graduate to fail-closed after the team has adapted to the findings cadence.
Pros
- No friction during adoption period
- Developers learn the scoring system
- No exemption process needed
Cons
- Insecure servers still merge
- Findings can be ignored indefinitely
- Requires discipline to enforce later
Recommendation: Start fail-open for 2–3 sprints to calibrate grade baselines across your existing servers. Then set the fail-closed threshold at the 25th percentile of current grades, which blocks new regressions without retroactively blocking every open PR.
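To turn the 25th-percentile recommendation into a number, you can compute it from the numeric scores collected during the fail-open period. This is a sketch under stated assumptions: it uses the nearest-rank percentile definition, works on integer scores, and `percentile_nearest_rank` is a hypothetical helper name. The scores in the usage line are illustrative, not real data.

```bash
#!/usr/bin/env bash
# percentile_nearest_rank P SCORE...
# Prints the P-th percentile of the given integer scores using the nearest-rank
# method: sort ascending, then take the value at 1-based rank ceil(P/100 * N).
percentile_nearest_rank() {
  local p="$1"
  shift
  local n="$#"
  [ "$n" -gt 0 ] || return 1
  # Integer ceiling of p*n/100
  local rank="$(( (p * n + 99) / 100 ))"
  [ "$rank" -lt 1 ] && rank=1
  printf '%s\n' "$@" | sort -n | sed -n "${rank}p"
}

# Illustrative scores from seven servers (made-up values)
percentile_nearest_rank 25 74 52 95 68 100 80 71
```

With seven scores the rank is ceil(1.75) = 2, so the second-lowest score becomes the candidate MIN_SCORE. Map it back to a letter grade using whatever grade bands your SkillAudit reports show.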
GitHub Actions: complete workflow
Save this as .github/workflows/mcp-security.yml in your MCP server repository:
name: MCP Security Audit

on:
  pull_request:
    branches: [ main ]
  push:
    branches: [ main ]

# Least-privilege token: read the code, write the PR comment
permissions:
  contents: read
  pull-requests: write

jobs:
  skillaudit:
    name: SkillAudit grade gate
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4

      - name: Run SkillAudit
        id: audit
        env:
          SKILLAUDIT_API_KEY: ${{ secrets.SKILLAUDIT_API_KEY }}
        run: |
          # Trigger audit: pass the public or private repo URL
          REPO_URL="https://github.com/${{ github.repository }}"
          RESPONSE=$(curl -sf \
            -H "Authorization: Bearer $SKILLAUDIT_API_KEY" \
            -H "Content-Type: application/json" \
            -d "{\"repo\": \"$REPO_URL\", \"ref\": \"${{ github.sha }}\"}" \
            https://skillaudit.dev/api/v1/audits)
          echo "$RESPONSE" > audit.json

          GRADE=$(jq -r '.grade' audit.json)
          SCORE=$(jq -r '.score' audit.json)
          REPORT=$(jq -r '.report_url' audit.json)
          SECURITY_GRADE=$(jq -r '.axes.security.grade' audit.json)
          CRITICAL=$(jq '.critical_findings | length' audit.json)

          # Expose the parsed values to later steps
          {
            echo "grade=$GRADE"
            echo "score=$SCORE"
            echo "report_url=$REPORT"
            echo "security_grade=$SECURITY_GRADE"
            echo "critical_count=$CRITICAL"
          } >> "$GITHUB_OUTPUT"

          # Render a results table on the workflow run page
          {
            echo "### SkillAudit Results"
            echo ""
            echo "| Metric | Value |"
            echo "|--------|-------|"
            echo "| Overall grade | **$GRADE** ($SCORE/100) |"
            echo "| Security axis | **$SECURITY_GRADE** |"
            echo "| Critical findings | $CRITICAL |"
            echo "| Full report | [$REPORT]($REPORT) |"
          } >> "$GITHUB_STEP_SUMMARY"
      - name: Post PR comment
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            const grade = '${{ steps.audit.outputs.grade }}';
            const score = '${{ steps.audit.outputs.score }}';
            const report = '${{ steps.audit.outputs.report_url }}';
            const secGrade = '${{ steps.audit.outputs.security_grade }}';
            const critical = parseInt('${{ steps.audit.outputs.critical_count }}', 10);
            const emoji = { A: '🟢', B: '🔵', C: '🟡', D: '🟠', F: '🔴' };
            const icon = emoji[grade] || '⚪';
            const body = [
              `## ${icon} SkillAudit: Grade **${grade}** (${score}/100)`,
              '',
              `| Axis | Grade |`,
              `|------|-------|`,
              `| Security | **${secGrade}** |`,
              '',
              critical > 0
                ? `⚠️ **${critical} critical finding(s) detected.** See the full report.`
                : '✅ No critical findings.',
              '',
              // Absolute URL: a relative link would resolve against github.com in a PR comment
              `[View full report](${report}) · [SkillAudit methodology](https://skillaudit.dev/methodology)`
            ].join('\n');
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.payload.pull_request.number,
              body
            });
      - name: Enforce grade policy
        env:
          MIN_GRADE: B              # Set to A, B, C, or D
          MIN_SCORE: 70             # Numeric floor applied when the grade equals MIN_GRADE
          BLOCK_ON_CRITICAL: true
          # Pass step outputs through env instead of interpolating them into the script
          GRADE: ${{ steps.audit.outputs.grade }}
          SCORE: ${{ steps.audit.outputs.score }}
          CRITICAL: ${{ steps.audit.outputs.critical_count }}
        run: |
          # Map a letter grade to a rank (A=1 ... F=5); 0 means unrecognised
          rank() {
            case "$1" in
              A) echo 1 ;; B) echo 2 ;; C) echo 3 ;; D) echo 4 ;; F) echo 5 ;;
              *) echo 0 ;;
            esac
          }
          min_index=$(rank "$MIN_GRADE")
          actual_index=$(rank "$GRADE")
          FAILED=false

          # Fail closed if the API returned no usable grade (for example "null")
          if [ "$actual_index" -eq 0 ] || [ "$min_index" -eq 0 ]; then
            echo "::error::Unrecognised grade '$GRADE' or MIN_GRADE '$MIN_GRADE'"
            exit 1
          fi

          if [ "$actual_index" -gt "$min_index" ]; then
            echo "::error::Grade $GRADE is below minimum $MIN_GRADE"
            FAILED=true
          elif [ "$actual_index" -eq "$min_index" ] && [ "$SCORE" -lt "$MIN_SCORE" ]; then
            echo "::error::Score $SCORE is below minimum $MIN_SCORE for grade $MIN_GRADE"
            FAILED=true
          fi

          if [ "$BLOCK_ON_CRITICAL" = "true" ] && [ "$CRITICAL" -gt 0 ]; then
            echo "::error::$CRITICAL critical finding(s) detected; blocking merge"
            FAILED=true
          fi

          if [ "$FAILED" = "true" ]; then
            echo ""
            echo "Fix the findings, push a new commit, and re-run the audit."
            echo "For emergency exemptions, add the 'security-exemption-requested' label and ask the security team to review."
            exit 1
          fi
          echo "Grade policy passed: $GRADE ($SCORE/100)"
Secret setup: Add SKILLAUDIT_API_KEY to GitHub repository secrets (Settings → Secrets and variables → Actions). On the SkillAudit Team plan, private repo audits are enabled and the API key is per-workspace, not per-repository.
GitLab CI: equivalent pipeline
For GitLab CI/CD, add this to your .gitlab-ci.yml:
skillaudit:
  stage: test
  image: alpine:3.20
  before_script:
    - apk add --no-cache curl jq
  script:
    - |
      REPO_URL="https://gitlab.com/$CI_PROJECT_PATH"
      RESPONSE=$(curl -sf \
        -H "Authorization: Bearer $SKILLAUDIT_API_KEY" \
        -H "Content-Type: application/json" \
        -d "{\"repo\": \"$REPO_URL\", \"ref\": \"$CI_COMMIT_SHA\"}" \
        https://skillaudit.dev/api/v1/audits)
      echo "$RESPONSE" > audit.json
      GRADE=$(jq -r '.grade' audit.json)
      SCORE=$(jq -r '.score' audit.json)
      CRITICAL=$(jq '.critical_findings | length' audit.json)
      REPORT=$(jq -r '.report_url' audit.json)
      echo "Grade: $GRADE ($SCORE/100)"
      echo "Critical findings: $CRITICAL"
      echo "Report: $REPORT"
      # Emit GitLab metrics for the pipeline dashboard
      cat > audit_metrics.txt <<EOF
      skillaudit_score $SCORE
      skillaudit_critical_findings $CRITICAL
      EOF
      # Grade gate
      GRADE_VALUES="A:1 B:2 C:3 D:4 F:5"
      MIN_GRADE="${SKILLAUDIT_MIN_GRADE:-B}"
      get_value() { echo "$GRADE_VALUES" | tr ' ' '\n' | grep "^$1:" | cut -d: -f2; }
      min_val=$(get_value "$MIN_GRADE")
      actual_val=$(get_value "$GRADE")
      # Fail closed if either grade is missing or unrecognised
      if [ -z "$actual_val" ] || [ -z "$min_val" ]; then
        echo "POLICY FAILED: unrecognised grade '$GRADE' or minimum '$MIN_GRADE'"
        exit 1
      fi
      if [ "$actual_val" -gt "$min_val" ]; then
        echo "POLICY FAILED: Grade $GRADE below minimum $MIN_GRADE"
        exit 1
      fi
      if [ "$CRITICAL" -gt 0 ]; then
        echo "POLICY FAILED: $CRITICAL critical finding(s)"
        exit 1
      fi
      echo "Policy passed."
  artifacts:
    reports:
      metrics: audit_metrics.txt
    paths:
      - audit.json
    expire_in: 30 days
  variables:
    SKILLAUDIT_MIN_GRADE: B
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'
Set SKILLAUDIT_API_KEY in GitLab CI/CD variables (Settings → CI/CD → Variables). Mark it as masked but not protected, so that it is also available on feature branches.
Pre-commit hook: scan before push
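For a second line of defence, or for teams where PRs are too infrequent for real-time feedback, add a pre-push hook that requests an audit before allowing a push to reach CI. As written, the hook sends only the origin URL, so the audit covers what the service can fetch from that remote, not commits that exist only in the local working copy: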
#!/usr/bin/env bash
# .git/hooks/pre-push
# Install: cp pre-push .git/hooks/pre-push && chmod +x .git/hooks/pre-push
# Or use pre-commit framework: see .pre-commit-config.yaml below
set -euo pipefail

API_KEY="${SKILLAUDIT_API_KEY:-}"
if [ -z "$API_KEY" ]; then
  echo "SkillAudit: SKILLAUDIT_API_KEY not set; skipping pre-push audit"
  exit 0
fi

REPO_URL=$(git remote get-url origin)
echo "SkillAudit: auditing $REPO_URL ..."

RESPONSE=$(curl -sf \
  --max-time 60 \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"repo\": \"$REPO_URL\"}" \
  https://skillaudit.dev/api/v1/audits) || {
  echo "SkillAudit: audit API unreachable; failing open (push allowed)"
  exit 0
}

GRADE=$(echo "$RESPONSE" | jq -r '.grade')
SCORE=$(echo "$RESPONSE" | jq -r '.score')
CRITICAL=$(echo "$RESPONSE" | jq '.critical_findings | length')
REPORT=$(echo "$RESPONSE" | jq -r '.report_url')

echo "SkillAudit: Grade $GRADE ($SCORE/100) | Critical findings: $CRITICAL"
echo "Full report: $REPORT"

if [ "$CRITICAL" -gt 0 ]; then
  echo ""
  echo "Push blocked: $CRITICAL critical security finding(s)."
  echo "Fix them before pushing. Run locally: skillaudit scan ."
  exit 1
fi

# Warn on grade below B, but don't block
if [[ "$GRADE" == "C" || "$GRADE" == "D" || "$GRADE" == "F" ]]; then
  echo ""
  echo "WARNING: Grade $GRADE is below the recommended minimum B."
  echo "This won't block your push, but CI will."
fi
exit 0
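One practical wrinkle with the hook above: `git remote get-url origin` often returns an SSH remote such as `git@github.com:myorg/repo.git`, while the sample report shows an HTTPS repo URL. Whether the audit API accepts SSH-style URLs is not stated in this guide, so the sketch below assumes it wants HTTPS. `normalize_remote_url` is a hypothetical helper, and it does not handle SSH URLs that carry an explicit port.

```bash
#!/usr/bin/env bash
# normalize_remote_url URL
# Hypothetical helper: converts common SSH remote forms to https://host/path
# and strips a trailing ".git". HTTPS URLs pass through unchanged apart from
# the ".git" suffix. SSH URLs with an explicit port are not handled.
normalize_remote_url() {
  local url="$1"
  case "$url" in
    git@*:*)
      # scp-like form: git@github.com:org/repo.git
      url="${url#git@}"
      url="https://${url%%:*}/${url#*:}"
      ;;
    ssh://git@*)
      # URL form: ssh://git@github.com/org/repo.git
      url="https://${url#ssh://git@}"
      ;;
  esac
  printf '%s\n' "${url%.git}"
}
```

In the hook this would replace the plain assignment, for example `REPO_URL=$(normalize_remote_url "$(git remote get-url origin)")`.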
For teams using the pre-commit framework, add this to .pre-commit-config.yaml:
repos:
  - repo: https://github.com/skillaudit-dev/pre-commit-hooks
    rev: v1.0.0
    hooks:
      - id: skillaudit-scan
        args: [--min-grade, B, --block-on-critical]
        always_run: false
        stages: [push]
Policy thresholds: what grade to require
| Context | Recommended threshold | Rationale |
|---|---|---|
| Internal tooling, single-team, low sensitivity | C + no criticals | Blocks only the worst offenders; workable during adoption |
| Enterprise internal deployment, multi-team | B + no criticals | Catches SSRF, command-exec, credential leakage; standard DevSecOps bar |
| Community marketplace / public directory listing | B on overall, A on security axis | Anthropic's directory requires security review; A on security axis approximates that bar |
| Regulated environment (HIPAA, SOC 2, FedRAMP) | A overall | Compliance audit evidence requires documented risk assessment; A-grade is the defensible artifact |
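The table above can be kept as code so that each repository only declares its context. The following is one possible encoding, not a SkillAudit feature: the context labels (`internal`, `enterprise`, `marketplace`, `regulated`) and both function names are hypothetical. It checks only what each row states, so the last two rows do not add a separate critical-findings check.

```bash
#!/usr/bin/env bash
# grade_rank LETTER -> prints 1 (A) .. 5 (F); fails on anything else
grade_rank() {
  case "$1" in
    A) echo 1 ;; B) echo 2 ;; C) echo 3 ;; D) echo 4 ;; F) echo 5 ;;
    *) return 1 ;;
  esac
}

# meets_policy CONTEXT OVERALL SECURITY CRITICAL_COUNT
# Returns 0 if the audit satisfies the row for CONTEXT, 1 if it violates it,
# and 2 if the context or a grade is unrecognised (treat that as a failure too).
meets_policy() {
  local context="$1" overall="$2" security="$3" critical="$4"
  local min_overall min_security="F" block_critical=false
  case "$context" in
    internal)    min_overall=C; block_critical=true ;;
    enterprise)  min_overall=B; block_critical=true ;;
    marketplace) min_overall=B; min_security=A ;;
    regulated)   min_overall=A ;;
    *) echo "unknown context: $context" >&2; return 2 ;;
  esac
  local o s
  o=$(grade_rank "$overall") || return 2
  s=$(grade_rank "$security") || return 2
  [ "$o" -le "$(grade_rank "$min_overall")" ] || return 1
  [ "$s" -le "$(grade_rank "$min_security")" ] || return 1
  if [ "$block_critical" = true ] && [ "$critical" -gt 0 ]; then
    return 1
  fi
  return 0
}
```

For the sample report card earlier in this guide (overall B, security B, one critical finding), `meets_policy enterprise B B 1` returns 1, so an enterprise gate would block the merge.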
Axis-level gating
Overall grade is a weighted average. If your threat model cares most about a specific axis, gate on that axis independently. The most common split in enterprise deployments:
# Strict gate: overall B AND security A AND no credentials exposure
enforce_policy() {
  local overall="$1"
  local security="$2"
  local credentials="$3"
  local critical="$4"
  local failed=false

  # Overall grade: minimum B
  [[ "$overall" == "C" || "$overall" == "D" || "$overall" == "F" ]] && {
    echo "::error::Overall grade $overall below B"
    failed=true
  }

  # Security axis: minimum A (no SSRF, command-exec, prompt-injection)
  [[ "$security" != "A" ]] && {
    echo "::error::Security axis grade $security; must be A for this repo"
    failed=true
  }

  # Credentials: any exposure = block regardless of score
  [[ "$credentials" != "A" ]] && {
    echo "::error::Credentials axis grade $credentials; credential exposure blocks merge"
    failed=true
  }

  # Critical findings: hard block always
  [[ "$critical" -gt 0 ]] && {
    echo "::error::$critical critical finding(s); hard block"
    failed=true
  }

  $failed && exit 1
  echo "All axis policies passed"
}
SBOM export and audit log for compliance
SkillAudit's Team plan exports a Software Bill of Materials alongside each audit report. In a regulated environment, capture this as a build artifact tied to the commit SHA:
- name: Export SBOM
  run: |
    AUDIT_ID=$(jq -r '.audit_id' audit.json)
    curl -sf \
      -H "Authorization: Bearer ${{ secrets.SKILLAUDIT_API_KEY }}" \
      "https://skillaudit.dev/api/v1/audits/$AUDIT_ID/sbom" \
      -o sbom.json

- name: Upload SBOM artifact
  uses: actions/upload-artifact@v4
  with:
    name: skillaudit-sbom-${{ github.sha }}
    path: sbom.json
    retention-days: 365
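This creates a per-commit SBOM artifact retained for up to one year, which is sufficient for SOC 2 Type II audit evidence and most HIPAA security risk assessments. Note that retention-days cannot exceed the artifact retention limit configured for the repository or organization (GitHub's default is 90 days), so raise that limit first if you need the full year.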
Exemption workflow
Grade-gates occasionally block legitimate work — a dependency upgrade that drops maintenance score, or a network tool with unavoidable external calls. You need an exemption path that doesn't become a rubber-stamp:
Author requests exemption
Adds the security-exemption-requested label on the PR. CI job detects the label and switches to fail-open mode — posts the audit results as a comment but exits 0.
Security team reviews
A required CODEOWNERS reviewer on the .github/security-exemptions/ path must approve before merge. They add the specific exemption to a YAML file: repo, commit SHA, findings exempted, expiry date, and justification.
Exemption expires
A weekly scheduled workflow re-runs audits on all exempted servers. If a finding is still open at expiry, the exemption file is auto-deleted and the next PR to that repo is blocked again. Exemptions don't accumulate silently.
# .github/security-exemptions/mcp-filesystem-server.yml
exemptions:
  - finding: "SSRF-001"
    file: "src/tools/fetch.ts"
    justification: "fetch target is a fixed internal registry URL; runtime config prevents external calls"
    exempted_by: "security-team@myorg.com"
    expires: "2026-09-10"
    commit: "a3f7c9d"
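The label and expiry rules described above reduce to two small checks that a CI job or the weekly scheduled workflow could run. This is a sketch only: `has_label` and `exemption_expired` are hypothetical helpers, the labels are assumed to arrive as a comma-separated string, and the exemption is assumed to remain valid through its expiry date.

```bash
#!/usr/bin/env bash
# has_label LABEL CSV_LABELS
# Succeeds if LABEL is one of the comma-separated labels on the PR.
has_label() {
  local wanted="$1" labels="$2"
  case ",$labels," in
    *",$wanted,"*) return 0 ;;
  esac
  return 1
}

# exemption_expired EXPIRES TODAY
# Both dates are YYYY-MM-DD. Succeeds only when TODAY is strictly after EXPIRES
# (assumption: the exemption still applies on the expiry date itself).
exemption_expired() {
  local expires today
  expires=$(printf '%s' "$1" | tr -d '-')
  today=$(printf '%s' "$2" | tr -d '-')
  [ "$today" -gt "$expires" ]
}
```

A gate could then exit 0 only when `has_label security-exemption-requested "$PR_LABELS"` succeeds and `exemption_expired "$EXPIRES" "$(date +%F)"` fails, where `PR_LABELS` and `EXPIRES` are placeholders for values your workflow reads from the PR and the exemption file.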
Monitoring: pipeline grade trends
Grade-gating prevents regressions, but monitoring catches slow drift — a server that held an A grade for months slowly accumulating C-grade maintenance findings as its dependencies go unmaintained.
Wire the numeric score output into your observability stack:
# In your scheduled re-scan workflow (runs weekly):
- name: Emit grade metric to Datadog
  run: |
    SCORE=$(jq -r '.score' audit.json)
    GRADE=$(jq -r '.grade' audit.json)
    REPO="${{ github.repository }}"
    curl -sf \
      -H "Content-Type: application/json" \
      -H "DD-API-KEY: ${{ secrets.DD_API_KEY }}" \
      "https://api.datadoghq.com/api/v1/series" \
      -d "{
        \"series\": [{
          \"metric\": \"skillaudit.score\",
          \"points\": [[$(date +%s), $SCORE]],
          \"tags\": [\"repo:${REPO/\//:}\", \"grade:$GRADE\"],
          \"type\": \"gauge\"
        }]
      }"
Alert on: score dropping below threshold over a 4-week window; any new critical finding on a server that was previously finding-free; any server going more than 30 days without a rescan.
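If you would rather evaluate two of these alert conditions inside the scheduled workflow than in a monitoring tool, they can be sketched as shell checks. The interpretation here is an assumption: "dropping below threshold" is read as the newest weekly score being under the threshold while at least one earlier score in the window was at or above it. Both function names are hypothetical.

```bash
#!/usr/bin/env bash
# dropped_below_threshold THRESHOLD WINDOW SCORE...
# Scores are weekly integer values, oldest first. Succeeds when the newest score
# is below THRESHOLD and an earlier score within the last WINDOW entries was at
# or above it, meaning the drop happened inside the window.
dropped_below_threshold() {
  local threshold="$1" window="$2"
  shift 2
  local n="$#"
  [ "$n" -ge 2 ] || return 1
  # 1-based position of the first score that falls inside the window
  local start="$(( n - window + 1 ))"
  [ "$start" -lt 1 ] && start=1
  local i=0 was_ok=1 newest="" score
  for score in "$@"; do
    i=$((i + 1))
    newest="$score"
    if [ "$i" -ge "$start" ] && [ "$i" -lt "$n" ] && [ "$score" -ge "$threshold" ]; then
      was_ok=0
    fi
  done
  [ "$newest" -lt "$threshold" ] && [ "$was_ok" -eq 0 ]
}

# rescan_overdue LAST_SCAN_EPOCH NOW_EPOCH MAX_DAYS
# Succeeds when more than MAX_DAYS whole days have passed since the last scan.
rescan_overdue() {
  local last="$1" now="$2" max_days="$3"
  [ "$(( (now - last) / 86400 ))" -gt "$max_days" ]
}
```

For example, `dropped_below_threshold 70 4 82 80 74 68` succeeds because the score fell from the 80s to 68 within four weeks, while a server that has sat at 65 for months does not retrigger the drift alert.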
Connecting CI/CD to SkillAudit's broader audit surface
CI/CD integration catches what can be caught statically at author time — SSRF patterns, command-exec paths, credential exposure, schema validation. The axes that improve after CI/CD adoption are security (finding-based) and credentials (static scan).
The axes that require runtime behaviour to score well are maintenance (last commit date, open CVEs in dependencies) and compatibility (tested against live Claude Code, Cursor, Windsurf clients). Those are scored by SkillAudit's scheduled weekly re-scan, not the CI webhook. See managing security debt over time for how to track maintenance grade drift and set up Dependabot to keep it green.
For prompt-injection checks specifically (LLM-assisted red-teaming of tool inputs, scored under the security axis), the CI webhook runs only a fast static pass. Deep prompt-injection testing, which makes adversarial LLM calls against live tool handlers, requires the full audit report rather than the CI webhook. Run those on the main branch weekly, not on every PR. See how SkillAudit red-teams for prompt injection for the methodology.
Quick start: Copy the GitHub Actions workflow above, add your API key as a repo secret, set MIN_GRADE: C so that only critical findings and grades below C block merges, and push a commit. The first audit report will appear in your PR comment thread within 60 seconds. Graduate to MIN_GRADE: B after one sprint. That's the entire onboarding path for most teams.
Related resources
- Input validation patterns — what the CI scan checks for under the security axis
- Rate limiting deep dive — missing rate limiting is a C-grade trigger
- Security policy template — SECURITY.md spec that feeds the documentation axis
- Security debt over time — tracking maintenance grade drift after CI/CD is wired up
- Permissions hygiene checklist — the permissions axis in the CI report card
- SSRF attack patterns reference — the most common critical finding the CI gate will block