Blog
Skill and MCP server security, reported in public.
Methodology posts, public scan data, and field notes on the supply-chain surface of LLM agents. No fluff, no recycled vendor marketing.
-
2026-06-07 · Developer Guide
How to write a SkillAudit-ready SECURITY.md for your MCP server
A SECURITY.md file is one of the highest-leverage, lowest-effort improvements you can make before publishing an MCP server. This guide explains section by section what the SkillAudit scanner checks — Scope, Credentials, Vulnerability Reporting, Logging, Known Limitations, and Audit History — with the point values, the common mistakes that drop your grade, a full copy-paste template, and a minimum viable SECURITY.md for when you need to pass the gate quickly.
Read the post → -
2026-06-06 · Security Leadership
MCP server security for CISOs: executive briefing on LLM tool chain risk
MCP servers are the new shadow IT: they run on engineer laptops, hold production credentials, and execute code, yet most organizations approve them with no more scrutiny than a browser extension. This briefing covers the full threat model (supply chain, prompt injection, misconfiguration), a realistic incident walkthrough, how SkillAudit grades map to board-level risk tiers, the five controls that define a mature MCP security program, and a three-sentence board summary you can drop into your next risk review.
Read the post → -
2026-06-07 · Product Management
MCP server security for product managers: translating SkillAudit grades into business risk
Your team wants to ship faster by adding new MCP servers. Your security team is worried. This guide gives product managers the vocabulary to translate A–F SkillAudit sub-scores into business risk language, communicate findings to leadership and legal, and build a lightweight four-step intake process that evaluates new servers without creating a security bottleneck. Covers the sub-score override rule, common PM mistakes, and when to involve legal.
Read the post → -
2026-06-06 · Procurement & Governance
MCP server vendor security questionnaire: what to ask before approving internal adoption
Community MCP servers are typically evaluated with a GitHub star count and a README skim. This post gives procurement teams a 15-question security questionnaire covering SSRF, credential scope, prompt injection, dependency practices, and incident response — with the answers that should block adoption, how SkillAudit grades map to procurement risk tiers, a CI gate configuration for minimum grade enforcement, and a sign-off template for internal audit trails.
Read the post → -
2026-06-05 · Remediation
From C to A grade: a week-by-week MCP server remediation plan
A C grade is not a rejection — it is a prioritized work order. This post turns the SkillAudit report into a concrete four-week remediation plan: week 1 fixes Critical and High security findings (SSRF, injection), week 2 reduces permission scopes to minimum, week 3 removes credential logging and hardcoded fallbacks, week 4 commits the lock file and adds a SECURITY.md. Includes expected sub-score progression and the 30-day arc from C (60) to A (88).
Read the post → -
2026-06-05 · Security Architecture
The MCP server supply chain: trust boundaries from tool call to upstream API
Every MCP server sits at the center of a five-layer dependency graph. The attack surface that gets exploited spans all five: the LLM prompt boundary where adversarial context steers tool argument generation, the argument trust boundary where server code must validate LLM-supplied input as untrusted, the server code internals covering credential handling and logging, the npm dependency layer with ambient access to process.env and the network, and the upstream API credential scope and response content that can carry secondary prompt injection. Maps what SkillAudit checks at each layer and what a complete supply chain audit actually requires.
Read the post → -
2026-06-05 · Policy & Governance
How to write an MCP server security policy for your organization
A template security policy for organizations adopting MCP servers internally. Covers grade thresholds by environment (A for production PII, B for internal data, C for dev), a GitHub Action CI gate that enforces minimum grades and blocks F sub-scores, the vendor assessment checklist for new adoptions, the exception workflow (60-day expiry, manager + security sign-off, stored artifact required), re-audit triggers beyond the 90-day cadence, and a condensed plain-text policy block you can drop into your internal wiki.
Read the post → -
2026-06-05 · Compliance
MCP server audit trail design for SOC 2 and GDPR compliance
MCP servers that touch personal data or operate under SOC 2 scope need structured, tamper-evident audit logs. This post covers the minimum viable audit log schema (event_id, timestamp, session_id, user_id, tool_name, input_summary, outcome, data_classes, credential_ref), how to sanitize input_summary to avoid logging credentials or PII in plaintext, how to implement a SHA-256 hash chain for tamper-evidence, the SOC 2 criteria mapped to specific fields (CC6.1 logical access, CC7.2 monitoring, CC7.4 incident response), GDPR requirements (Article 30 RoPA, Article 15/20 Subject Access Requests, Article 17 right to erasure with the two-table architecture), a tiered retention scheme (90-day hot with PII, 12-month warm without PII, 7-year incident archive), and a checklist for PR review before deploying any MCP server that touches production data.
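To make the hash-chain idea concrete, here is a minimal Python sketch of a tamper-evident log. It is an illustration under assumptions, not the post's implementation: the all-zero genesis value, the JSON canonicalization, and the helper names (entry_hash, append_event, verify_chain) are hypothetical, and the sample events use only a few of the schema fields listed above.

```python
import hashlib
import json

# Assumed placeholder "previous hash" for the first entry in the chain.
GENESIS = "0" * 64


def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(log: list, event: dict) -> None:
    """Append an event, linking it to the hash of the entry before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev_hash": prev, "hash": entry_hash(prev, event)})


def verify_chain(log: list) -> bool:
    """Recompute every link; any edited, removed, or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True


log = []
append_event(log, {"event_id": "e1", "tool_name": "read_file", "outcome": "ok"})
append_event(log, {"event_id": "e2", "tool_name": "send_email", "outcome": "denied"})
```

A chain like this only shows tampering if the most recent hash is also recorded somewhere the log writer cannot rewrite; otherwise an attacker can recompute the whole chain.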
Read the post → -
2026-06-05 · Resilience Engineering
Designing for resiliency: how to build MCP servers that fail securely
Failure in an MCP server is not just an uptime problem — it is a security event. Without deliberate failure design, partial state, ambient credentials, and exploitable retry paths become the attack surface. This post covers eight patterns that turn failure into a security property: per-call timeout budgets with AbortController, circuit breakers that fast-fail on open upstreams, exponential backoff with full jitter to prevent retry storms, fail-closed authorization checks, idempotency keys for mutating tools, error message hygiene to prevent information disclosure via stack traces, graceful degradation with Promise.allSettled for multi-source tools, and connection pool limits with concurrent-call gates.
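As a rough illustration of one of these patterns, exponential backoff with full jitter, the Python sketch below picks each delay uniformly between zero and a capped exponential ceiling. The base, cap, and attempt limit are assumed values chosen for the example, and the function names are hypothetical; the post itself targets Node.js.

```python
import random
import time


def full_jitter_delay(attempt, base=0.1, cap=10.0, rng=random):
    """Return a delay in seconds drawn uniformly from [0, min(cap, base * 2**attempt)]."""
    ceiling = min(cap, base * 2 ** attempt)
    return rng.uniform(0.0, ceiling)


def retry(fn, max_attempts=5, sleep=time.sleep, rng=random):
    """Call fn until it succeeds or max_attempts is reached, sleeping with full jitter between tries.

    Catching every Exception keeps the sketch short; real code should retry
    only on errors known to be transient.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # retries exhausted: surface the last error to the caller
            sleep(full_jitter_delay(attempt, rng=rng))
```

Randomizing over the whole interval, rather than adding a small jitter to a fixed delay, spreads retries from many clients across time so they do not arrive at the upstream in synchronized waves.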
Read the post → -
2026-06-04 · Security Methodology
The SkillAudit scorecard explained: how we grade MCP server security across six axes
A complete breakdown of every check we run, how findings map to severity levels, how axis scores are combined into a letter grade, and what you need to fix to move up. Covers the weights for all six axes (Security 35%, Credentials 25%, Permissions 15%, Maintenance 12%, Compatibility 8%, Documentation 5%), the HIGH/MEDIUM/LOW/INFO severity tiers with point deductions, the A–F letter grade thresholds, a worked example showing how a single path traversal finding drops a server from A to B, and what an A grade actually requires in practice.
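The axis weights quoted above can be combined in a few lines of Python. This is a sketch under a loud assumption: it treats the overall score as a plain weighted sum of per-axis scores on a 0 to 100 scale. The actual engine may apply additional rules (another post in this index mentions a sub-score override rule), and the function name and example scores are hypothetical.

```python
# Axis weights as quoted in the summary above; they sum to 1.0.
WEIGHTS = {
    "security": 0.35,
    "credentials": 0.25,
    "permissions": 0.15,
    "maintenance": 0.12,
    "compatibility": 0.08,
    "documentation": 0.05,
}


def weighted_score(axis_scores):
    """Combine per-axis scores (0-100) into one number using a plain weighted sum (assumed rule)."""
    missing = set(WEIGHTS) - set(axis_scores)
    if missing:
        raise ValueError(f"missing axis scores: {sorted(missing)}")
    return sum(WEIGHTS[axis] * axis_scores[axis] for axis in WEIGHTS)


# Hypothetical server: perfect on every axis except Security.
example = {axis: 100 for axis in WEIGHTS}
example["security"] = 80
overall = weighted_score(example)
```

Under this assumption, a 20-point drop on the Security axis alone costs 7 points overall, which is why a single security finding can move a letter grade.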
Read the post → -
2026-06-04 · Incident Response
MCP server security incident response playbook — what to do in the first 60 minutes
When a deployed MCP server has a confirmed security incident, the first sixty minutes determine whether you contain the breach or let it compound. This step-by-step playbook covers all six phases: confirming the incident and scoping the credential set (0–5 min), isolating the server and fencing credentials (5–15 min), rotating every long-lived credential without causing a cascading outage (15–30 min), preserving the audit log with a SHA-256 hash before log rotation (30–45 min), triaging what data was accessed and identifying the entry point (45–55 min), and communicating to vendors, stakeholders, and users (55–60 min). Also covers return-to-service validation and how SkillAudit's audit log history view supports both incident triage and post-incident verification.
Read the post → -
2026-06-04 · Security Analytics
SkillAudit grade drift: how MCP server security scores change over time
A SkillAudit score is not a permanent seal of approval. Analysis of 30-day rescan data shows that Maintenance and Dependency axes regress in 87% and 74% of servers respectively — driven by time and the external advisory ecosystem, not code changes. Security and Permissions hygiene axes improve after authors read their first report. This post covers drift velocity by axis, the badge staleness problem, rescan cadence recommendations, and how to set a CI minimum-grade gate that accounts for drift rather than treating the point-in-time scan as a permanent approval.
Read the post → -
2026-06-04 · Security Architecture
The minimal-footprint MCP server: building for security from the ground up
Most MCP security problems are architectural, not code-level — they stem from decisions made in the first hour. This guide covers six principles that collectively produce A-grade security from the first commit: stdio-only transport (no network attack surface), zero external dependencies (no supply-chain CVE exposure), per-tool credential isolation (blast radius bounded at the tool boundary), no shell invocation (no eval() or exec() on arguments), a declarative permission manifest (machine-readable blast-radius declaration), and an immutable append-only audit log. Includes a 60-line template you can drop into any new MCP server project.
Read the post → -
2026-06-04 · Security Methodology
Why MCP server security scanning is different from code scanning
Semgrep finds exec(userInput). It does not find the tool description that makes an LLM call exec() on untrusted input. This post maps the five gaps between traditional SAST and MCP-specific security scanning — prompt injection via tool descriptions, capability pairing amplification, trust boundary violations, context-window exfiltration, and maintenance drift — with a side-by-side coverage table showing what SAST catches, what MCP-specific scanning catches, and what falls through both.
Read the post → -
2026-06-03 · Security Architecture
Threat modeling an MCP server from scratch: the STRIDE approach
Most MCP servers ship without a threat model. STRIDE — Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege — gives you a structured framework to enumerate attack classes before writing a line of code. This post applies each STRIDE letter to the three MCP trust boundaries (caller→server, server→backend, tool description→model), with a concrete attack scenario, real mitigation code, and SkillAudit check name for each. Ends with a 30-minute threat model exercise you can run on any MCP server today.
Read the post → -
2026-06-03 · Security Testing
Property-based fuzzing for MCP server security testing
Unit tests verify the cases you thought of. Property-based fuzzing generates thousands of adversarial inputs automatically — path traversal strings, null bytes, Unicode bypass variants, boundary integers — and breaks your tool handlers before attackers do. Uses fast-check (Node.js) and Hypothesis (Python) to express four security properties: path containment, no exception leakage, numeric bounds, and Unicode blocklist bypass. Includes CI integration with fixed seeds, the shrinking advantage for minimal failing cases, and a ready-to-drop 30-line template for any MCP server project.
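The path containment property can be stated in plain Python before any fuzzing library is involved. The sketch below is an assumption-laden illustration: resolve_inside, escaping_inputs, and the short ADVERSARIAL list are hypothetical stand-ins, and a real property-based test would let Hypothesis or fast-check generate the inputs instead of using a fixed list.

```python
from pathlib import Path


def resolve_inside(root, user_path):
    """Resolve user_path under root, raising ValueError if the result escapes root."""
    if "\x00" in user_path:
        raise ValueError("null byte in path")
    root_resolved = Path(root).resolve()
    # Joining with an absolute user_path discards root, so the check below still catches it.
    candidate = (root_resolved / user_path).resolve()
    if candidate != root_resolved and root_resolved not in candidate.parents:
        raise ValueError(f"path escapes root: {user_path!r}")
    return candidate


# A few hand-picked adversarial inputs; a fuzzer would generate thousands.
ADVERSARIAL = ["../secret", "a/../../b", "/etc/passwd", "a\x00b", "./../..", "x/../../../y"]


def escaping_inputs(root, samples=ADVERSARIAL):
    """Return the samples that resolved to a location outside root (should be empty)."""
    escaped = []
    root_resolved = Path(root).resolve()
    for sample in samples:
        try:
            resolved = resolve_inside(root, sample)
        except ValueError:
            continue  # rejected, which is the safe outcome
        if resolved != root_resolved and root_resolved not in resolved.parents:
            escaped.append(sample)
    return escaped
```

The property under test is that every input is either rejected or resolves to a location inside the root; escaping_inputs returning an empty list is the passing case.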
Read the post → -
2026-06-03 · Security Research
Anatomy of a prompt injection attack on an MCP server — kill chain trace
A step-by-step kill chain trace from planted document to exfiltrated secrets: how an attacker embeds a malicious instruction in a workspace file, waits for the user to ask the LLM to summarize it, and uses the LLM's own tool calls to read credentials and exfiltrate them via a webhook — without the user seeing any indication. Covers why correct path checks and HTTPS validation don't stop it, six defenses that actually work (domain allowlists, sensitive-file exclusions, trust tagging, capability separation, user confirmation gates, audit logging), and what SkillAudit's red-team check looks for.
Read the post → -
2026-06-03 · Security Research
The ten most common SkillAudit C grades — and what they share
After auditing hundreds of MCP servers, ten finding patterns account for most C grades: argument logging leaking credentials, over-declared permissions, unvalidated URL schemes, stack trace disclosure, implicit env var reads, shell:true with user-controlled args, unchecked numeric bounds, unsandboxed file paths, unaddressed medium advisories, and stale maintenance signals. All ten share one root assumption: that the caller is always trustworthy. Includes pre-screen checklist and the B-grade path for each finding.
Read the post → -
2026-06-03 · Security Architecture
Building a multi-agent MCP pipeline that doesn't trust itself: security isolation between agents
Identity verification stops impersonation; it does nothing against a compromised orchestrator that sends verified instructions to its workers. This post covers the three-layer model for multi-agent MCP pipeline security: privilege isolation by agent role (orchestrator vs. worker), command allowlists on cross-agent tools (no run_command surfaces), and confirmation gates on irreversible operations. Includes the threat model table showing what each compromised agent can do with and without the three layers applied.
Read the post → -
2026-06-03 · DevOps
MCP server security CI/CD pipeline: a complete build pipeline audit checklist
Most MCP servers ship without a single automated security check in CI. This post builds the complete pipeline: pre-commit hooks with ESLint and semgrep, a SkillAudit grade gate in your GitHub Actions PR workflow, lockfile enforcement with npm ci, a shell script that validates declared permissions against actual API usage in source, and branch protection settings that make a security regression impossible to merge. Includes the full combined GitHub Actions YAML and a minimum-viable vs. full-pipeline comparison table.
Read the post → -
2026-06-02 · Deep Dive
The ambient token problem: how LLM-controlled credential selection enables silent exfiltration
When multiple credentials are in scope for an MCP server, the model — not the code — selects which one to use. A prompt injection instruction can direct that selection to an attacker endpoint. Three real scenarios (credential-name argument, credential-store tool, credentials in descriptions), the complete exfiltration kill chain, why static analysis misses it, and the per-tool credential isolation fix that eliminates the attack surface entirely.
Read the post → -
2026-06-02 · Forensics
Five MCP servers that nearly earned an A — and what they fixed to get there
A forensic look at five near-miss MCP servers that scored B or C on first SkillAudit scan. Each got four or five of six axes right — but one specific pattern tripped them. Covers a database bridge leaking connection strings on error, a filesystem tool with over-declared write permissions, an HTTP proxy with unchecked SSRF, a sandbox with a misleading manifest, and a notification dispatcher vulnerable to tool-description injection. The exact finding and the fix for each.
Read the post → -
2026-06-02 · Buyer Guide
MCP server security for non-technical buyers: a 10-minute field guide
A plain-language guide for team leads evaluating MCP servers for internal adoption. What each SkillAudit axis measures — Security, Credentials, Permissions, Maintenance, Compatibility, Documentation — what A through F grades mean in practice, and a three-outcome decision framework (install / review first / block) for turning an audit report into an adoption decision. No code required.
Read the post → -
2026-06-02 · Research
30-day re-scan delta: what moved on a fresh recrawl of the 101-server MCP corpus
Thirty days after publishing the initial 101-server MCP security scan, we re-ran the full corpus. 7 of 67 F-graded servers improved. 6 were archived or deleted. 88 were unchanged. The vendor-official tier: still zero improvements. Here's the full breakdown — what changed, which disclosure mechanisms drove improvement, and how we project the numbers to move through the rest of 2026.
Read the post → -
2026-06-02 · Engineering
How to write a zero-finding MCP server: a step-by-step construction guide
Only two of the 101 servers in our corpus have zero HIGH findings across all six audit axes. Both share the same construction pattern: schema-first Zod tool design, an allowlist-based SSRF firewall, env-only credential isolation, a command execution allowlist (no shell: true), and a maintenance posture with exact pinning and Dependabot. This guide walks through each step with code you can copy, explains the WHY behind each decision, and closes with a complete reference skeleton that passes all six axes out of the box.
Read the post → -
2026-06-01 · Research
MCP server security in 2026: state of the ecosystem mid-year update
Six weeks after publishing the 101-server corpus scan: 36.7% SSRF, 43% unsafe command-exec, 79% vendor-official F grades. Not a single vendor-official server improved after six weeks of public disclosure. Here's what changed, what didn't, where the attack surface is growing in H2 (multi-agent pipelines, event-driven architectures, GraphQL/gRPC proxies), and what a healthy MCP security posture looks like heading into the second half of 2026.
Read the post → -
2026-06-01 · Engineering
The MCP server permissions checklist: 5 questions before you request org scope
The permissions axis is where the blast radius math is most unforgiving: a server that requests admin:org and has an SSRF vulnerability doesn't have two problems; it has one full-account-takeover problem. This checklist gives you five questions to answer before submitting to the Anthropic Skills Directory: GitHub Apps vs PATs, minimal scope declaration, permanent vs ephemeral tokens, credential-in-URL patterns, and LLM-controlled credential selection. Code examples for every fix and the exact SkillAudit finding for each.
Read the post → -
2026-06-01 · Engineering
MCP server dependency pinning: a supply chain incident walkthrough
A floating ^ in your package.json is a standing invitation for npm to fetch whatever the upstream maintainer publishes next. This post walks through a concrete incident timeline (a compromised transitive dependency silently exfiltrating SSH keys and AWS credentials from 400 installers), then shows the three changes that prevent it: exact version pinning, lockfile integrity enforcement with npm ci, and a patch-only Dependabot configuration that delivers security updates within 24 hours of advisory publication.
Read the post → -
2026-06-01 · Research
Vendor-official vs community MCP servers: updated security grade breakdown
Six weeks after our April 2026 scan, the grade distribution has not materially shifted: 79% of vendor-official MCP servers in our corpus earn F, versus 26% of community-maintained servers. Only one vendor-official server (Microsoft Playwright) earns A. The counterintuitive finding: indie devs who understand the SSRF and credential threat model outperform large corporate teams who ship MCP servers as afterthoughts. Grade distribution tables, three structural reasons behind the gap, what changed after public disclosure, and what it means for Anthropic's directory certification program.
Read the post → -
2026-06-01 · Research
MCP server permission scope patterns — what the corpus shows
68% of corpus MCP servers request org-wide API scopes when only repo-level access is needed. This research post shows the three code patterns behind that number — OAuth app scope over-declaration, permanent tokens where ephemeral tokens would work, and credentials in URL parameters — explains the blast-radius math (SSRF × org-write = full account takeover), and gives the scope-down implementation that eliminates each finding without breaking functionality.
Read the post → -
2026-05-31 · Engineering
GitHub Action gate: enforcing MCP security grades in CI/CD — the complete setup guide
The full CI/CD implementation guide for teams wiring a SkillAudit grade gate into GitHub Actions: lockfile diffing, observe-only mode, multi-agent-client support, org-level reusable workflows, the exception path with expiry tokens, a weekly re-scan cron with Slack alerts on grade regressions, branch protection integration, and the Team-plan version-pinned grade endpoint. Three complete workflow files, copy-paste ready.
Read the post → -
2026-05-31 · Buyer Guide
How to Read a SkillAudit Report — Understanding Every Section of an MCP Security Audit
A complete field guide to every section of a SkillAudit report: what the six axis grades mean (Security, Credentials, Permissions, Maintenance, Compatibility, Documentation), when HIGH vs WARN vs PASS vs INFO applies, the install-gate decision framework for team leads (install / review first / block), and a prioritized fix order for authors. Includes the grade scale (A–F definitions with corpus context), how the badge and public permalink work, and the three gaps the current engine doesn't cover.
Read the post → -
2026-05-31 · Engineering
MCP Server Security Testing: What Static Analysis Catches and What It Doesn't
An honest accounting of the SkillAudit v0.3 engine after 101 servers: what static AST/taint analysis reliably catches (SSRF, command injection, hardcoded secrets, credential echoes), what the LLM-probe layer adds (prompt-injection susceptibility scoring, scope vs. handler drift confirmation), and three finding classes neither layer handles well (cross-tool privilege chaining, long-lived session state, unsafe deserialization). Includes a 9-row coverage table mapping each finding class to each detection layer.
Read the post → -
2026-05-31 · Security Research
MCP Server OWASP Top 10: What the Threat Map Actually Looks Like After 101 Servers
A field guide from running SkillAudit across 101 community and vendor-official MCP servers — which OWASP categories map cleanly onto Model Context Protocol, which stretch, and which three MCP-specific threats (credential echo into tool response, permission scope vs. handler drift, client compatibility drift) have no clean home in either the API Top 10 or the LLM Top 10. With corpus rates, priority order, and specific code examples for each class.
Read the post → -
2026-05-31 · Hardening Guide
MCP Server Security Checklist: 12 Items Before You Publish
Across 101 MCP servers in the public corpus, 42 earned an F — and most of those failures trace back to the same dozen patterns. A 12-item hardening checklist covering every check the SkillAudit engine runs: SSRF prevention, command-exec gates, prompt-injection walls, credential handling, permission scope, dependency pinning, CHANGELOG discipline, client compatibility testing, and runnable documentation. One item per axis, with concrete grep commands for each and code patterns showing the before and after.
Read the post → -
2026-04-30 · Engineering
Anatomy of a credential leak — four patterns across 38 of 101 MCP servers
Of the 101 community Model Context Protocol servers we audited, 38 emit findings on the credentials axis. The leaks group into four named patterns: 64 hardcoded secrets in source across 18 repos (30 OpenAI/Anthropic-style keys, 10 GitHub PATs, 8 Stripe test secrets, 6 AWS access keys, plus Slack and GitHub OAuth), 13 echoes of process.env/os.environ to stdout across 7 repos including Klavis-AI, PipedreamHQ, Honeycomb, mcp-use, punkpeye/fastmcp, Stripe agent-toolkit, and Pydantic AI, 1 error-message env-var echo at JetBrains, and 44 .env files committed to the repo tree across 28 repos. Why credentials are the one axis whose blast radius is unbounded by the host process: every leaked value travels with the LLM conversation. Bad-vs-good code shapes per pattern, install-gate rule, author + buyer checklists.
Read the post → -
2026-04-30 · Research
Nine of 101 most-installed MCP servers are archived — what the maintenance signal looks like in 2026
Of the 101 community Model Context Protocol servers we audited, nine have been declared dead by their maintainers — eight of them vendor-official, including Azure (1,213 stars), Gmail Server (1,098 stars), Mem0, E2B, Pydantic Logfire, PostHog, Honeycomb, and two of Anthropic's own scaffolders. Four more haven't been pushed to in 365+ days; sixty lack a SECURITY.md disclosure file. The maintenance axis is the only one of the six that gets worse over time without anyone touching the code. Full list with grades, days-since-push distribution, and a four-signal install-gate framework. Calendar-axis counterpart to the anatomy-of-an-A post.
Read the post → -
2026-04-30 · Engineering
Anatomy of an A-grade MCP server — five code patterns shared by the 19 that passed our 101-repo audit
Of the 101 community Model Context Protocol servers we audited, 19 earned an A and two earned a perfect 100. Reading their source side by side, the patterns that separate them from the 82 that didn't are five boring habits: no fetch(url) from a tool argument without an allowlist, no exec with template-string argv, env-vars read once at startup never echoed, narrow verb-shaped tool surface, and current maintenance signal. The single missing piece across 17 of 19 is the same: no SECURITY.md. The two perfect 100s (LangChain MCP adapters, Vectara) are explained, plus the Microsoft Playwright case study (A despite execSync in tests). Author and buyer checklists at the bottom.
Read the post → -
2026-04-30 · Engineering
Engine v0.3 calibration delta — 22 grades moved when surface tiering shipped
SkillAudit engine v0.3 introduces surface tiering — findings in examples, benchmarks, top-level scripts, and installer code now deduct at lower weight than runtime tool surface. Across the 101-repo audit corpus, 22 grades moved: 9 letter promotions (Stripe F→C, Anthropic's MCP TypeScript SDK F→B as the corpus's first B, plus GitHub MCP, Grafana, Pydantic AI, MCP Go SDK, MCP Python SDK, MCP Quickstart, lastmile-ai), 8 within-band lifts (including Vectara A 90→100 — the corpus's second perfect score), and 5 honest cap-fix drops (punkpeye/fastmcp D→F, glips/figma-context-mcp D→F, plus three F-band internal moves) where v0.2's shared-cap bug had been silencing real production-source SSRFs. Every move named, every audit page linked.
Read the post → -
2026-04-30 · Install guide
The MCP install shortlist — 19 community servers that earn an A in our 101-repo audit (April 2026)
Of the 101 most-installed Model Context Protocol servers we audited, 19 earned an A grade. The full shortlist grouped by use case (vector DBs, operational DBs, web fetch, backend platforms, voice, frameworks) — Pinecone, Qdrant, Milvus, Vectara, Meilisearch, Redis, ClickHouse, Couchbase, Snowflake, Exa, FireCrawl, DuckDuckGo, fetch-mcp, Microsoft Playwright, Appwrite, Box, FastAPI-MCP, ElevenLabs, and the LangChain MCP adapters at a perfect 100/100. The buyer-side counterpart to the vendor-official F-grades post.
Read the post → -
2026-04-30 · Playbook
Block 52 of 101 community MCP servers with one CI gate — the 2026 team policy template
A min-grade-C team policy blocks 52 of the 101 most-installed Model Context Protocol servers in 2026, including 29 vendor-official releases your developers would have waved through on brand alone. The full policy paragraph, the 30-line GitHub Action that gates new installs on the SkillAudit grade, the re-scan cadence, the 12-week rollout calendar, and the four week-1 gotchas your security engineer will hit.
Read the post → -
2026-04-29 · Research post
29 vendor-official MCP servers earned an F — every name, every file path
A line-by-line walk through every vendor-official MCP server in our 101-repo corpus that ended at an F grade — Cloudflare, Stripe, Heroku, MongoDB, GitHub, AWS, Azure, Auth0, Sentry, PostHog, Anthropic's own SDKs, and 18 more. Real findings, real file paths, honest calibration notes on where the engine is unambiguous and where it is still learning to subdivide runtime tool surface from examples and scripts.
Read the post → -
2026-04-24 · Research post
We scanned 52 MCP servers — 56% had SSRF, 44% leaked credentials
SkillAudit's first 52-repo scan — vendor-official releases from AWS, Cloudflare, Stripe, MongoDB, Heroku, Redis, and indie frameworks — with every report public. Real grades, ranked F offenders, A-grade counterfactuals. (Corpus since expanded to 71; see the UPDATE banner inside the post.)
Read the post → -
2026-04-23 · Launch post
Why 36.7% of community MCP servers fail a basic SSRF check
A public 2026 scan of community Model Context Protocol servers found SSRF in more than a third of them and unsafe command execution in 43%. Here is what that actually looks like in code, why existing dependency scanners miss it, and what we built about it.
Read the post →