The MCP Server Security Review Checklist: 50 Questions for Teams
Deploying or approving a community MCP server? Use this 50-question checklist to systematically audit authentication, authorization, network trust, secrets handling, supply chain integrity, observability, and incident response posture — before a tool call reaches production.
Why a checklist?
A SkillAudit scan gives you an automated grade across six axes — but automated analysis has limits. It can't tell you whether your team has an incident-response contact for a server you're about to adopt, or whether a third-party MCP server's privacy policy covers the data you'll pass through it. That's human-judgment territory.
This checklist bridges the gap. Run it alongside (or instead of) an automated scan for any MCP server before it becomes load-bearing in your agent workflows. It takes about 30 minutes for a typical server; a thorough review with test execution takes 2–4 hours.
How to use severity tiers: Items marked Blocker mean the server should not be deployed until the issue is resolved. Major items create meaningful risk that should be remediated before production workloads. Minor items are hygiene improvements you can schedule in the next sprint.
1. Authentication (6 questions)
Authentication questions focus on how the MCP server verifies caller identity — token validation, algorithm pinning, and session lifecycle. Weaknesses here map directly to the JWT algorithm confusion class of vulnerabilities (CVSSv3 9.8).
- 1. Does the server explicitly pin the JWT algorithm (e.g. `algorithms: ['RS256']`) and reject `alg:none`? (Blocker)
  Accepting algorithm negotiation from the client lets an attacker downgrade to HMAC or `alg:none` and forge tokens. Check the source for `jwt.verify(token, secret)` without an `algorithms` option, or `jwt.decode()` used in auth paths.
- 2. Is the HMAC secret long enough and randomly generated (≥256-bit entropy)? (Blocker)
  Short, predictable, or hardcoded HMAC secrets can be cracked offline in minutes using hashcat. Run `grep -r 'secret' --include="*.js"` and look for quoted string literals. Any guessable JWT secret, such as an environment variable set to `SECRET=password`, is a blocker.
- 3. Are `iss` (issuer) and `aud` (audience) claims validated on every token? (Blocker)
  Without issuer and audience checks, a token from one service can be replayed against another. Many MCP servers only check the signature. Look for `issuer` and `audience` options in the `jwt.verify()` call.
- 4. Is token expiry (`exp` claim) enforced and is the validity window ≤ 1 hour? (Major)
  Tokens without expiry or with multi-day validity windows remain exploitable long after a breach. Most JWT libraries enforce `exp` by default, but check that no `ignoreExpiration: true` option exists in the verify call.
- 5. Is there a token revocation mechanism for compromised credentials (deny-list or short-lived + refresh)? (Major)
  Stateless JWTs cannot be revoked once issued. For long-running MCP servers handling sensitive operations, the server should maintain a deny-list or use short-lived access tokens (≤15 min) with refresh token rotation.
- 6. Are authentication failures rate-limited to prevent brute-force? (Minor)
  Without rate limiting on auth endpoints, attackers can attempt credential stuffing at high speed. Check for middleware like `express-rate-limit` or equivalent applied to token endpoint routes.
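To make questions 1, 3, and 4 concrete, here is a minimal sketch of the checks a correctly configured verifier performs, written against Node's built-in `crypto` module so it runs without dependencies. The issuer, audience, and function names are hypothetical. In a real server you would normally get the same behavior from your JWT library's `algorithms`, `issuer`, and `audience` options rather than hand-rolling verification.

```javascript
const crypto = require('node:crypto');

// Hypothetical values, for illustration only.
const EXPECTED_ISSUER = 'https://auth.example.com';
const EXPECTED_AUDIENCE = 'mcp-server';
const MAX_LIFETIME_SECONDS = 3600; // question 4: validity window of at most 1 hour

const encode = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url');
const decode = (segment) => JSON.parse(Buffer.from(segment, 'base64url').toString('utf8'));

// Demo helper: signs whatever header and payload it is given with an RSA key.
function signToken(header, payload, privateKey) {
  const signingInput = `${encode(header)}.${encode(payload)}`;
  const signature = crypto.sign('sha256', Buffer.from(signingInput), privateKey);
  return `${signingInput}.${signature.toString('base64url')}`;
}

function verifyToken(token, publicKey, nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = token.split('.');
  if (parts.length !== 3) throw new Error('malformed token');
  const [headerPart, payloadPart, signaturePart] = parts;

  // Question 1: the server, not the token header, decides the algorithm.
  if (decode(headerPart).alg !== 'RS256') throw new Error('algorithm not allowed');

  const signatureOk = crypto.verify(
    'sha256',
    Buffer.from(`${headerPart}.${payloadPart}`),
    publicKey,
    Buffer.from(signaturePart, 'base64url'),
  );
  if (!signatureOk) throw new Error('bad signature');

  const claims = decode(payloadPart);
  // Question 3: issuer and audience must match this service.
  if (claims.iss !== EXPECTED_ISSUER) throw new Error('wrong issuer');
  if (claims.aud !== EXPECTED_AUDIENCE) throw new Error('wrong audience');
  // Question 4: expiry is mandatory and the validity window is bounded.
  if (typeof claims.exp !== 'number' || claims.exp <= nowSeconds) throw new Error('expired');
  if (typeof claims.iat !== 'number' || claims.exp - claims.iat > MAX_LIFETIME_SECONDS) {
    throw new Error('validity window too long');
  }
  return claims;
}
```

When reviewing a server, the thing to look for is the same ordering: the algorithm is rejected before any signature work happens, and claims are only trusted after the signature passes.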
2. Authorization & Permissions Hygiene (7 questions)
Authorization questions cover whether the server enforces least-privilege at the tool, resource, and API level. See the permissions hygiene deep-dive for patterns.
- 7. Does the server request only the permissions it actually uses, with no over-broad scopes? (Blocker)
  An MCP server that requests `fs:read *` when it only needs `fs:read /tmp/uploads` creates unnecessary blast radius. Review the manifest / permissions declaration against actual tool implementations.
- 8. Is authorization checked at the tool invocation level, not just at connect time? (Blocker)
  Some servers authenticate the session once at connect, then trust all subsequent tool calls. A compromised client can then call any tool. Each tool handler should re-validate that the caller has permission for that specific operation.
- 9. Are admin-only tools gated behind a separate privilege check, not just a different route? (Major)
  Check whether `admin_*` or `internal_*` tools have an explicit permission guard inside the handler function, not just a router-level middleware that could be bypassed by direct invocation.
- 10. Does path traversal protection cover all file-reading tools (deny `../` and absolute paths outside the allowed root)? (Major)
  File-reading tools are a common path traversal target. The fix is `path.resolve()` plus a check that the result still starts with the allowed root, not a simple `.replace('../', '')`, which can be bypassed with nested sequences such as `....//` or with URL encoding.
- 11. Is object-level authorization enforced? (Can user A access user B's data by changing an ID in tool arguments?) (Major)
  Horizontal privilege escalation via direct object reference is ranked #1 in the OWASP API Security Top 10. Any tool that accepts a user or resource ID must verify the calling user is authorized for that specific ID, not just that they're authenticated.
- 12. Are tool descriptions and parameter names non-misleading? (No prompt injection via tool metadata?) (Minor)
  Tool names and descriptions are injected into the LLM's context. A malicious server can include hidden instructions in tool descriptions to manipulate the agent's behavior. Review `name`, `description`, and `inputSchema.description` fields for embedded instructions.
- 13. Is there a documented permission model (what scopes exist, what each grants, how they're assigned)? (Minor)
  Without documentation, reviewers can't verify the permission model is complete, and team leads can't write a policy. Check `SECURITY.md`, `README.md`, and API docs.
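The path check from question 10 is short enough to show in full. This is a sketch under stated assumptions: the root `/tmp/uploads` and the function name `resolveInsideRoot` are hypothetical, the input is assumed to be already URL-decoded, and symlinks are not handled (a real server would likely also compare against `fs.realpath` output).

```javascript
const path = require('node:path');

// Hypothetical allowed root; a real server would read this from configuration.
const ALLOWED_ROOT = path.resolve('/tmp/uploads');

function resolveInsideRoot(userPath, root = ALLOWED_ROOT) {
  if (typeof userPath !== 'string' || userPath.includes('\0')) {
    throw new Error('invalid path');
  }
  // path.resolve collapses "..", ".", and duplicate separators,
  // and lets an absolute userPath override the root entirely.
  const resolved = path.resolve(root, userPath);
  // Compare against root plus a separator so that a sibling such as
  // "/tmp/uploads-evil" does not pass as being inside "/tmp/uploads".
  if (resolved !== root && !resolved.startsWith(root + path.sep)) {
    throw new Error('path escapes allowed root');
  }
  return resolved;
}
```

The separator in the comparison is the detail most often missed in review: a bare `startsWith(root)` accepts any sibling directory whose name begins with the root's name.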
3. Input Validation & Injection Prevention (7 questions)
Input handling questions cover prompt injection, SSRF, command injection, and SQL injection — the four most common active attack paths against MCP tools in 2026.
- 14. Does every tool parameter have a JSON Schema type constraint, and is it enforced server-side? (Blocker)
  Type constraints are the first defense layer. A `url` parameter typed as `string` with no format validation accepts `file:///etc/passwd`. At minimum, check that `inputSchema` is present and that server-side validation rejects malformed payloads before handler code runs.
- 15. Are network-fetch tools protected against SSRF? (Block RFC1918, metadata endpoints, and `file://` URLs.) (Blocker)
  SSRF was found in 36.7% of scanned community MCP servers. The fix requires resolving the hostname to an IP and checking it against a blocklist before the request is made, not just checking the URL string, which can be bypassed with DNS rebinding or redirects.
- 16. Are shell-executing tools free of command injection? (No `exec()` or `shell: true` with unsanitized input.) (Blocker)
  Command injection via shell interpolation is an instant RCE. Grep for `exec(`, `spawn(`, `shell: true`, and `execSync(`. Any user-controlled variable interpolated into a shell command string is a blocker. Use `execFile` with argument arrays instead.
- 17. Are database-accessing tools using parameterized queries (no string interpolation into SQL)? (Blocker)
  SQL injection in an MCP server is particularly dangerous because the LLM can be manipulated into crafting injection payloads via prompt injection. Check for `` `SELECT ... ${userInput}` `` patterns. All database queries must use prepared statements or an ORM that enforces parameterization.
- 18. Does the server sanitize or strip content returned from external sources before returning it to the LLM? (Major)
  External content (web pages, files, database rows) can contain prompt injection payloads: text designed to hijack the LLM's behavior. A server that fetches and returns raw HTML gives an attacker control over what the agent does next. See the prompt injection anatomy post for detection patterns.
- 19. Are file upload tools restricted to expected MIME types and file sizes, with server-side validation? (Major)
  Client-supplied `Content-Type` headers cannot be trusted. Use `file-type` or equivalent magic-byte detection server-side. Enforce size limits before reading the stream to prevent memory exhaustion.
- 20. Are string length limits enforced on all tool inputs to prevent DoS via large payloads? (Minor)
  Unbounded string inputs can exhaust memory, CPU (via regex backtracking), or downstream service quotas. Add `maxLength` constraints in JSON Schema and reject early in validation middleware.
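The resolve-then-check approach from question 15 might look roughly like the sketch below, using Node's built-in `net.BlockList` and `dns.promises.lookup`. The function names are hypothetical and the blocklist is illustrative rather than exhaustive, so treat it as a starting point for review, not a complete SSRF defense.

```javascript
const net = require('node:net');
const dns = require('node:dns').promises;

// Addresses a fetch tool should never reach. Illustrative, not exhaustive.
const blocked = new net.BlockList();
blocked.addSubnet('10.0.0.0', 8);        // RFC 1918
blocked.addSubnet('172.16.0.0', 12);     // RFC 1918
blocked.addSubnet('192.168.0.0', 16);    // RFC 1918
blocked.addSubnet('127.0.0.0', 8);       // loopback
blocked.addSubnet('169.254.0.0', 16);    // link-local, includes metadata at 169.254.169.254
blocked.addAddress('::1', 'ipv6');       // IPv6 loopback
blocked.addSubnet('fc00::', 7, 'ipv6');  // IPv6 unique local
blocked.addSubnet('fe80::', 10, 'ipv6'); // IPv6 link-local

function isBlockedAddress(ip) {
  const family = net.isIP(ip); // 4, 6, or 0 when the input is not an IP literal
  if (family === 0) return true; // fail closed on anything unparseable
  return blocked.check(ip, family === 6 ? 'ipv6' : 'ipv4');
}

function parseAllowedUrl(rawUrl) {
  const url = new URL(rawUrl); // throws on malformed input
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error(`scheme not allowed: ${url.protocol}`);
  }
  return url;
}

// Resolves the hostname and rejects the request if any resolved address is blocked.
async function assertSafeTarget(rawUrl) {
  const url = parseAllowedUrl(rawUrl);
  const host = url.hostname.replace(/^\[|\]$/g, ''); // strip brackets from IPv6 literals
  const records = net.isIP(host) ? [{ address: host }] : await dns.lookup(host, { all: true });
  for (const { address } of records) {
    if (isBlockedAddress(address)) throw new Error(`blocked address: ${address}`);
  }
  return { url, addresses: records.map((record) => record.address) };
}
```

A check like this only closes the DNS rebinding gap if the subsequent request connects to one of the addresses that was just validated, and if every redirect hop is run through the same check. When reviewing a server, confirm both.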
4. Network Security & Transport (5 questions)
Network questions cover TLS configuration, port exposure, and trust boundary assumptions between the MCP server and downstream services.
- 21. Is all inter-service communication encrypted (TLS 1.2+ with valid certificates, no `rejectUnauthorized: false`)? (Blocker)
  Disabled TLS verification is one of the most common findings in MCP server source audits. Grep for `rejectUnauthorized: false`, `NODE_TLS_REJECT_UNAUTHORIZED=0`, and `verify: false`. These patterns in non-test code are always a blocker.
- 22. Does the server bind only to necessary network interfaces? (Not `0.0.0.0` unless a public listener is intentional.) (Major)
  An MCP server deployed as an internal service should bind to `127.0.0.1` or a private interface. Binding to `0.0.0.0` in a container or VM exposes the server on every network interface the host has, including public ones.
- 23. Are CORS origins restricted to known, specific domains (no wildcard `*` on credentialed endpoints)? (Major)
  A wildcard CORS policy on endpoints that accept credentials allows any website to make authenticated requests from a logged-in user's browser. Check the CORS configuration for `origin: '*'` combined with `credentials: true`. That combination is forbidden by the spec, but some frameworks silently accept it.
- 24. Are timeouts configured on all outbound HTTP calls to prevent resource exhaustion via slow-response attacks? (Major)
  Without timeouts, a slow or unresponsive downstream server holds connections open indefinitely. Check that all `fetch()`, `axios`, and `got` calls have explicit `timeout` or `signal: AbortSignal.timeout(ms)` options.
- 25. Is the server's network topology documented? (What it connects to, what connects to it, what trust boundaries exist.) (Minor)
  Without a network diagram or architecture doc, reviewers can't assess the lateral movement potential if the server is compromised. This is a hygiene item for production deployments, not a show-stopper for initial review.
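For question 23, the safe pattern is an explicit allow-list that echoes back only known origins. The sketch below is framework-agnostic; the origins and the `corsHeadersFor` name are hypothetical, and most CORS middleware exposes an equivalent allow-list option.

```javascript
// Hypothetical allow-list; real values come from deployment configuration.
const ALLOWED_ORIGINS = new Set([
  'https://app.example.com',
  'https://admin.example.com',
]);

// Returns the CORS response headers for a request's Origin header.
// Unknown origins get no CORS headers at all, so the browser blocks the response.
function corsHeadersFor(requestOrigin) {
  if (!ALLOWED_ORIGINS.has(requestOrigin)) return {};
  return {
    'Access-Control-Allow-Origin': requestOrigin, // exact origin, never "*"
    'Access-Control-Allow-Credentials': 'true',
    Vary: 'Origin', // stop caches from serving one origin's response to another
  };
}
```

Exact string matching against a set is deliberate. Prefix or substring matching is a common review finding because it lets `https://app.example.com.attacker.test` through.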
5. Secrets Management (6 questions)
Secrets management questions look for hardcoded credentials, insecure environment variable handling, and key rotation practices. See the secrets management deep-dive for full treatment.
- 26. Are secrets absent from the source code and git history (no API keys, passwords, or tokens in committed files)? (Blocker)
  Run `git log --all --full-history -- '*.env' | head` and `trufflehog git file://.` to check history. A secret committed and then deleted is still in the git object store. The only remediation is secret rotation, because the historical version is permanently compromised.
- 27. Are secret values never logged, even partially? (No `console.log(config)` or error serialization that dumps env vars.) (Blocker)
  Grep for `console.log(process.env`, `JSON.stringify(config`, and `console.log(...config`. Error objects that include `req.headers` can also leak authorization headers in structured logs. Check error handler middleware for full request/response serialization.
- 28. Is there a documented secret rotation procedure and has it been tested? (Major)
  An undocumented rotation procedure means secrets stay compromised for hours or days after detection while someone figures out the process under pressure. The procedure should be written, tested in staging, and reviewed in this security check.
- 29. Are secrets scoped to minimum necessary permissions? (DB credentials with only SELECT, not superuser.) (Major)
  Overly privileged credentials turn a credential leak into a full compromise. Database credentials should use a role with only the SQL operations the server actually needs. API keys should use the minimal OAuth scope.
- 30. Is a secrets manager (AWS Secrets Manager, Vault, Doppler) used rather than environment variables for sensitive credentials? (Minor)
  Environment variables are visible in process listings, crash dumps, and some monitoring tools. For credentials that access sensitive data, use a secrets manager with audit logging and automatic rotation. This is a hygiene item for most contexts, a major item for PII-handling servers.
- 31. Is a `.env.example` file maintained with placeholder values (not real ones) to document required secrets? (Minor)
  Developers sometimes commit real `.env` files because they don't know what's expected to stay local. A maintained `.env.example` removes the ambiguity and should be the documented onboarding reference.
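One way to reduce the logging risk in question 27 is to pass every object through a redaction step before it reaches the logger. The sketch below is an assumption-laden example: the key pattern and the `redact` name are hypothetical, and key-name matching will not catch secrets embedded inside string values such as connection URLs.

```javascript
// Key names treated as sensitive. This pattern is an assumption; extend it for your config.
const SENSITIVE_KEY = /(secret|token|password|passwd|api[-_]?key|authorization|cookie)/i;

// Returns a deep copy of value with sensitive fields replaced by a placeholder.
function redact(value, seen = new WeakSet()) {
  if (value === null || typeof value !== 'object') return value;
  // Guards against cycles; an object referenced twice is also shown only once.
  if (seen.has(value)) return '[Circular]';
  seen.add(value);
  if (Array.isArray(value)) return value.map((item) => redact(item, seen));
  const out = {};
  for (const [key, inner] of Object.entries(value)) {
    out[key] = SENSITIVE_KEY.test(key) ? '[REDACTED]' : redact(inner, seen);
  }
  return out;
}
```

A call such as `console.log(JSON.stringify(redact(config)))` then replaces the `console.log(config)` pattern the grep above is looking for. Many structured loggers ship a built-in redaction option that serves the same purpose.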
6. Supply Chain Integrity (7 questions)
Supply chain questions focus on dependency risk, provenance, and the risk that a compromised upstream package introduces malicious behavior into the MCP server. See the supply chain attestation page for SLSA and Sigstore patterns.
- 32. Are all dependencies pinned to exact versions in `package.json` (no `^` or `~` ranges for production deps)? (Blocker)
  Unpinned ranges allow a malicious or buggy package update to silently enter your dependency tree on the next install. Pin production dependencies to exact versions and use a lockfile (`package-lock.json` / `yarn.lock`). CI should run `npm ci`, not `npm install`.
- 33. Are there no known high/critical CVEs in the dependency tree (`npm audit` returns 0 high+)? (Major)
  Run `npm audit --audit-level=high`. Any high or critical finding is a Major item that should be resolved before production deployment. Check whether the CVE is actually in the code path used by this server: a critical CVE in a test-only dependency is lower risk than one in a runtime path.
- 34. Do `postinstall` scripts execute, and if so, have they been reviewed? (Major)
  Malicious packages often use `postinstall` hooks to exfiltrate environment variables at install time. Run `npm install --ignore-scripts` and check what breaks. Anything that breaks was running code at install time. Review those scripts explicitly.
- 35. Is the published artifact (npm package or Docker image) built from a verified, auditable CI pipeline? (Major)
  A developer's local machine may be compromised. Packages should be built and published from CI with SLSA provenance attestations. Check whether the package's published version matches the GitHub release SHA.
- 36. Is the dependency count reasonable (fewer than ~50 direct+transitive for a focused tool server)? (Minor)
  Every dependency is an attack surface. An MCP server with 300 transitive dependencies for simple REST proxying is carrying disproportionate supply chain risk. Audit direct dependencies with `npm ls --depth=0`, use `npm ls --all` to see the full transitive tree, and question any heavyweight framework dependencies.
- 37. Is a Software Bill of Materials (SBOM) generated and attached to releases? (Minor)
  An SBOM lets downstream users run their own vulnerability scans and satisfy regulatory SBOM requirements (EO 14028 in the US, CRA in the EU). Generate with `cdxgen` or `cyclonedx-npm` and attach the JSON to GitHub releases.
- 38. Is there a `SECURITY.md` file with a responsible disclosure policy and contact address? (Minor)
  Without a disclosed vulnerability reporting path, researchers finding security issues have no clear channel and may disclose publicly. A `SECURITY.md` at the repo root with a contact email and expected response time is a minimal standard.
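The pinning check in question 32 is easy to script. Here is a small sketch that flags any production dependency whose version spec is not an exact version. The function name and regex are illustrative assumptions; the regex treats git URLs, tags, and wildcards as unpinned too, which may be stricter than your policy needs.

```javascript
// Matches exact versions such as 1.2.3 or 1.2.3-beta.1; anything else counts as unpinned.
const EXACT_VERSION = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/;

// Takes a parsed package.json object and lists production deps that are not pinned.
function findUnpinned(packageJson) {
  const deps = packageJson.dependencies ?? {};
  return Object.entries(deps)
    .filter(([, spec]) => !EXACT_VERSION.test(spec))
    .map(([name, spec]) => `${name}@${spec}`);
}
```

To run it against a real project, pass in `JSON.parse(fs.readFileSync('package.json', 'utf8'))` and fail the CI job when the returned array is non-empty. The lockfile still matters, since pinning direct dependencies says nothing about transitive ones.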
7. Observability & Auditability (6 questions)
Observability questions cover whether the server generates the structured log data needed to detect, investigate, and respond to incidents. A server you can't observe is a server you can't secure in production.
- 39. Does every tool invocation produce a structured log entry (tool name, caller identity, timestamp, outcome)? (Blocker)
  Without per-invocation logs, you cannot reconstruct what an agent did in an incident. This is the minimum audit trail required for any production deployment. Check the handler code for a logging call that captures at least these four fields.
- 40. Are authentication events (success, failure, token refresh) logged with enough context to detect brute-force? (Major)
  Auth failure logs need: timestamp, source IP, user identifier (not password), and outcome. Without IP logging, distributed brute-force attacks are invisible. Without a user identifier, you can't see which accounts are being targeted.
- 41. Are logs written to an external sink (not just stdout/stderr), with structured JSON format? (Major)
  Container stdout logs are ephemeral: they disappear when the container restarts. For incident investigation, logs must be forwarded to a persistent sink. Structured JSON (not free-form text) is required for reliable parsing and alerting.
- 42. Is there an alert or metric for anomalous tool call volume that would surface an agent-loop or abuse incident? (Major)
  Runaway agent loops and abuse both generate abnormal call volumes. A simple rate-of-invocation metric with a threshold alert catches both. Without this, an agent loop can consume quota or downstream API credits undetected.
- 43. Can a specific tool invocation be correlated to the originating conversation via a trace ID? (Minor)
  Distributed tracing across the LLM → MCP server → downstream service boundary lets you reconstruct the full causal chain for a security incident. This requires passing a correlation ID from the tool call through to all downstream calls.
- 44. Are log retention policies defined and do they meet your compliance requirements? (Minor)
  GDPR requires deleting PII-containing logs once the retention period ends. SOC 2 typically calls for at least 90 days. Healthcare contexts under HIPAA typically call for 6 years. Know what's in your logs and how long you're keeping it.
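The per-invocation audit entry from question 39 can be added once, as a wrapper around each tool handler, rather than repeated inside every handler. This is a minimal sketch: the `callerId` and `traceId` context fields and both function names are hypothetical, and the default sink writes to stdout only for demonstration (question 41 still applies).

```javascript
// Builds the four fields named in question 39, plus an optional trace ID (question 43).
function buildAuditEntry(toolName, context, outcome, now = new Date()) {
  return {
    timestamp: now.toISOString(),
    tool: toolName,
    caller: context?.callerId ?? 'unknown',
    traceId: context?.traceId ?? null,
    outcome,
  };
}

// Wraps a tool handler so every call emits exactly one JSON line, on success or failure.
function withAuditLog(toolName, handler, sink = (line) => process.stdout.write(`${line}\n`)) {
  return async function auditedHandler(args, context) {
    let outcome = 'success';
    try {
      return await handler(args, context);
    } catch (err) {
      outcome = 'error';
      throw err; // the caller still sees the original failure
    } finally {
      sink(JSON.stringify(buildAuditEntry(toolName, context, outcome)));
    }
  };
}
```

Note that the entry deliberately omits the tool arguments. Whether to log them depends on what data flows through the tool, which ties back to the secrets and retention questions.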
8. Incident Response Readiness (6 questions)
Incident response questions measure whether your team is prepared to respond when something goes wrong — not if, when. These are organizational, not just code-level, questions.
- 45. Is there a named owner for this MCP server with an on-call escalation path? (Major)
  Ownerless infrastructure gets ignored in incidents. Record in your service catalog: team owner, primary on-call contact, escalation path. Community MCP servers without a named owner for your deployment create a response gap.
- 46. Can the server be disabled or have its credentials rotated in under 15 minutes during an active incident? (Major)
  The mean time to contain matters as much as detection. Test your kill switch: can you revoke the API key this server uses, rotate its JWT signing secret, or disable it at the load balancer, and have that take effect within 15 minutes? If not, document the gap.
- 47. Is there a written runbook for the most likely security incidents (credential compromise, prompt injection, data exfiltration)? (Major)
  During an active incident, you do not want to be designing the response process. A runbook with decision trees for each scenario reduces MTTR and prevents ad-hoc decisions under pressure. See the MCP incident response playbook post for a template.
- 48. Have you run a tabletop exercise simulating an MCP server compromise in the last 6 months? (Minor)
  A tabletop exercise surfaces gaps in your runbook and incident coordination before a real incident. For MCP servers handling sensitive data or agentic workflows with write access, a 2-hour tabletop per year is a reasonable baseline.
- 49. Does the server's privacy policy or data processing agreement cover the data types being passed through tool calls? (Minor)
  Agentic workflows often pass PII (names, emails, documents) through MCP tools without the data ever being shown to a human reviewer. Confirm that the server's privacy policy and your DPA cover this processing.
- 50. Is there a cadence for re-reviewing this MCP server as its codebase and dependency tree evolve? (Minor)
  A server that passed review six months ago may have introduced new dependencies, new tool handlers, or new configuration since. For actively developed servers, schedule a re-review on every major version bump. For stable servers, set a calendar reminder for annual re-review.
Scoring summary
Tally your findings by severity. A deployable server has zero Blockers, ideally zero Majors (or a remediation plan with deadlines for any remaining), and Minor items tracked as tech debt.
| Category | Blocker | Major | Minor | Total |
|---|---|---|---|---|
| 1. Authentication | 3 | 2 | 1 | 6 |
| 2. Authorization & Permissions | 2 | 3 | 2 | 7 |
| 3. Input Validation & Injection | 4 | 2 | 1 | 7 |
| 4. Network Security | 1 | 3 | 1 | 5 |
| 5. Secrets Management | 2 | 2 | 2 | 6 |
| 6. Supply Chain | 1 | 3 | 3 | 7 |
| 7. Observability | 1 | 3 | 2 | 6 |
| 8. Incident Response | 0 | 3 | 3 | 6 |
| Total | 14 | 21 | 15 | 50 |
Automate the automatable parts. About 30 of these 50 questions can be partially answered by a static analysis tool or automated scanner. Run a SkillAudit scan first to flag the obvious issues, then use this checklist for the questions that require reading context, testing behavior, and evaluating organizational readiness. The two approaches are complementary, not redundant.
How SkillAudit maps to this checklist
When you run an automated audit on SkillAudit, the findings map to this checklist as follows:
- Security axis → questions 14–20 (input validation) and 21–25 (network security)
- Permissions hygiene axis → questions 7–13 (authorization)
- Credential exposure axis → questions 26–31 (secrets management)
- Maintenance axis → questions 32–38 (supply chain) and 50 (re-review cadence)
- Client compatibility axis → not covered in this checklist (separate evaluation)
- Documentation axis → questions 13, 37, 38
Apart from the re-review cadence in question 50, which the Maintenance axis touches, the automated scan cannot assess questions 39–50 (observability and incident response), because these require organizational context. Treat those sections as primarily the human reviewer's domain.