Jun 13, 2026 · 17 min read · Security

The MCP Server Security Review Checklist: 50 Questions for Teams

Deploying or approving a community MCP server? Use this 50-question checklist to systematically audit authentication, authorization, input validation, network trust, secrets handling, supply chain integrity, observability, and incident response posture before a tool call reaches production.

50 review questions · 8 security domains · 14 blocker-class items · 21 major-class items

Why a checklist?

A SkillAudit scan gives you an automated grade across six axes — but automated analysis has limits. It can't tell you whether your team has an incident-response contact for a server you're about to adopt, or whether a third-party MCP server's privacy policy covers the data you'll pass through it. That's human-judgment territory.

This checklist bridges the gap. Run it alongside (or instead of) an automated scan for any MCP server before it becomes load-bearing in your agent workflows. It takes about 30 minutes for a typical server; a thorough review with test execution takes 2–4 hours.

How to use severity tiers: A Blocker means the server should not be deployed until the issue is resolved. A Major item creates meaningful risk that should be remediated before production workloads. Minor items are hygiene improvements you can schedule for the next sprint.

Blocker: do not deploy · Major: remediate before production · Minor: schedule for next sprint

1. Authentication 6 questions

Authentication questions focus on how the MCP server verifies caller identity — token validation, algorithm pinning, and session lifecycle. Weaknesses here map directly to the JWT algorithm confusion class of vulnerabilities (CVSSv3 9.8).

1

Does the server explicitly pin the JWT algorithm (e.g. algorithms: ['RS256']) and reject alg:none? Blocker

Letting the token's own header choose the algorithm lets an attacker downgrade to HMAC (signing with the public key as the shared secret) or to alg:none and forge tokens. Check the source for jwt.verify(token, secret) without an algorithms option, or for jwt.decode(), which does not verify the signature, used in auth paths.
2

Is the HMAC secret long enough and randomly generated (≥256-bit entropy)? Blocker

Short or predictable HMAC secrets can be cracked offline in minutes using hashcat, and hardcoded ones can simply be read from the source. Run grep -rn 'secret' --include="*.js" . and look for quoted string literals. Any JWT secret set to a guessable value, such as SECRET=password, is a blocker.
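As a minimal sketch of what a secret meeting the 256-bit bar looks like, Node's built-in crypto module can generate one. The function name generateHmacSecret is illustrative and not taken from any particular server.

```javascript
const crypto = require('node:crypto');

// Returns a 32-byte (256-bit) secret drawn from the OS CSPRNG, encoded as
// base64url so it can be stored in a secrets manager or injected at deploy time.
function generateHmacSecret() {
  return crypto.randomBytes(32).toString('base64url');
}
```

Generate the value once at provisioning time and store it outside the repository; generating it at process start would invalidate every token on restart.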
3

Are iss (issuer) and aud (audience) claims validated on every token? Blocker

Without issuer and audience checks, a token from one service can be replayed against another. Many MCP servers only check the signature. Look for issuer and audience options in the jwt.verify() call.
4

Is token expiry (exp claim) enforced and is the validity window ≤ 1 hour? Major

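Tokens without expiry or with multi-day validity windows remain exploitable long after a breach. Most JWT libraries enforce exp by default when the claim is present, but check that no ignoreExpiration: true option exists in the verify call and that a token with no exp claim cannot pass. jsonwebtoken, for example, only checks exp when it is present; its maxAge option bounds token age from iat.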
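The checks from questions 1, 3, and 4 can be sketched together using only Node's crypto module. This is an illustration of the order of checks under stated assumptions, not a replacement for a maintained JWT library: verifyToken, its options object, and the error messages are hypothetical names, and the sketch assumes RS256 tokens with a numeric exp claim.

```javascript
const crypto = require('node:crypto');

const ALLOWED_ALG = 'RS256'; // pinned server-side, never read from the client

function decodePart(part) {
  return JSON.parse(Buffer.from(part, 'base64url').toString('utf8'));
}

// Hypothetical verifier: throws on any failed check, returns the claims otherwise.
function verifyToken(token, publicKey, { issuer, audience, now = Date.now() }) {
  const parts = token.split('.');
  if (parts.length !== 3) throw new Error('malformed token');
  const [headerB64, payloadB64, sigB64] = parts;

  // 1. Reject anything but the pinned algorithm (this also rejects alg:none).
  if (decodePart(headerB64).alg !== ALLOWED_ALG) throw new Error('algorithm not allowed');

  // 2. Verify the signature before trusting any claim.
  const signed = Buffer.from(`${headerB64}.${payloadB64}`);
  const signature = Buffer.from(sigB64, 'base64url');
  if (!crypto.verify('RSA-SHA256', signed, publicKey, signature)) {
    throw new Error('bad signature');
  }

  // 3. Validate issuer, audience, and expiry on every token.
  const claims = decodePart(payloadB64);
  if (claims.iss !== issuer) throw new Error('wrong issuer');
  const audiences = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
  if (!audiences.includes(audience)) throw new Error('wrong audience');
  if (typeof claims.exp !== 'number' || claims.exp * 1000 <= now) {
    throw new Error('token expired or missing exp');
  }
  return claims;
}
```

When reviewing a server that uses a library instead, the equivalent is a verify call that passes an explicit algorithms list plus issuer and audience options.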
5

Is there a token revocation mechanism for compromised credentials (deny-list or short-lived + refresh)? Major

Stateless JWTs cannot be revoked once issued. For long-running MCP servers handling sensitive operations, the server should maintain a deny-list or use short-lived access tokens (≤15 min) with refresh token rotation.
6

Are authentication failures rate-limited to prevent brute-force? Minor

Without rate limiting on auth endpoints, attackers can attempt credential stuffing at high speed. Check for middleware like express-rate-limit or equivalent applied to token endpoint routes.

2. Authorization & Permissions Hygiene 7 questions

Authorization questions cover whether the server enforces least-privilege at the tool, resource, and API level. See the permissions hygiene deep-dive for patterns.

7

Does the server request only the permissions it actually uses — no over-broad scopes? Blocker

An MCP server that requests fs:read * when it only needs fs:read /tmp/uploads creates unnecessary blast radius. Review the manifest / permissions declaration against actual tool implementations.
8

Is authorization checked at the tool invocation level, not just at connect time? Blocker

Some servers authenticate the session once at connect, then trust all subsequent tool calls. A compromised client can call any tool. Each tool handler should re-validate that the caller has permission for that specific operation.
9

Are admin-only tools gated behind a separate privilege check, not just a different route? Major

Check whether admin_* or internal_* tools have an explicit permission guard inside the handler function, not just a router-level middleware that could be bypassed by direct invocation.
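One way to picture a handler-level guard is a small wrapper that re-checks the caller's scopes on every invocation. The sketch below assumes a hypothetical context object carrying a scopes array; requireScope and the scope names are illustrative, not part of any MCP SDK.

```javascript
// Wraps a tool handler so the permission check runs on every call,
// regardless of how the handler was reached (router, direct invocation, etc.).
function requireScope(scope, handler) {
  return (args, ctx) => {
    if (!ctx || !Array.isArray(ctx.scopes) || !ctx.scopes.includes(scope)) {
      throw new Error(`missing scope: ${scope}`);
    }
    return handler(args, ctx);
  };
}

// Usage: the guard travels with the handler itself.
const adminResetCache = requireScope('admin:reset', (args) => `reset ${args.target}`);
```

Because the check lives inside the exported handler, removing or bypassing a router-level middleware does not remove the check.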
10

Does path traversal protection cover all file-reading tools (deny ../ and absolute paths outside the allowed root)? Major

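File-reading tools are a common path traversal target. The fix is to call path.resolve() and check that the result is the allowed root itself or starts with the root plus a path separator; a bare prefix check lets /tmp/uploads-evil pass for a root of /tmp/uploads. A simple .replace('../', '') is not a fix: it removes only the first match, so inputs such as ....// or URL-encoded sequences slip through.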
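A minimal sketch of the resolve-then-compare check, using Node's path module. The function name resolveInsideRoot is illustrative, and the sketch does not follow symlinks, so a server that allows symlinks inside the root would also need fs.realpath before the comparison.

```javascript
const path = require('node:path');

// Resolves userPath against root and throws if the result leaves the root.
function resolveInsideRoot(root, userPath) {
  const absRoot = path.resolve(root);
  const target = path.resolve(absRoot, userPath);
  // Compare against root + separator so "/tmp/uploads-evil" does not match "/tmp/uploads".
  if (target !== absRoot && !target.startsWith(absRoot + path.sep)) {
    throw new Error('path escapes allowed root');
  }
  return target;
}
```

Absolute inputs are handled by the same check: path.resolve() returns an absolute second argument unchanged, which then fails the prefix comparison.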
11

Is object-level authorization enforced? (Can user A access user B's data by changing an ID in tool arguments?) Major

Broken object-level authorization (horizontal privilege escalation via direct object reference) is API1, the top entry in the OWASP API Security Top 10. Any tool that accepts a user or resource ID must verify the calling user is authorized for that specific ID, not just that they're authenticated.
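In handler code, the object-level check is usually a single comparison that is easy to forget. The sketch below assumes a hypothetical in-memory store keyed by document ID with an ownerId field on each record; getDocument and the field names are illustrative.

```javascript
// Returns the document only if the caller owns it.
// Missing and forbidden produce the same error so callers cannot probe for valid IDs.
function getDocument(store, callerId, docId) {
  const doc = store.get(docId);
  if (!doc || doc.ownerId !== callerId) {
    throw new Error('document not found');
  }
  return doc;
}
```

When reviewing a server, look for the equivalent comparison (or an owner filter in the database query) in every tool that takes an ID argument.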
12

Are tool descriptions and parameter names non-misleading? (No prompt injection via tool metadata?) Minor

Tool names and descriptions are injected into the LLM's context. A malicious server can include hidden instructions in tool descriptions to manipulate the agent's behavior. Review name, description, and inputSchema.description fields for embedded instructions.
13

Is there a documented permission model (what scopes exist, what each grants, how they're assigned)? Minor

Without documentation, reviewers can't verify the permission model is complete. Team leads can't write a policy. Check SECURITY.md, README.md, and API docs.

3. Input Validation & Injection Prevention 7 questions

Input handling questions cover prompt injection, SSRF, command injection, and SQL injection — the four most common active attack paths against MCP tools in 2026.

14

Does every tool parameter have a JSON Schema type constraint, and is it enforced server-side? Blocker

Type constraints are the first defense layer. A url parameter typed as string with no format validation accepts file:///etc/passwd. At minimum, check that inputSchema is present and that server-side validation rejects malformed payloads before handler code runs.
15

Are network-fetch tools protected against SSRF? (Block RFC1918, metadata endpoints, and file:// URLs.) Blocker

SSRF was found in 36.7% of scanned community MCP servers. The fix requires resolving the hostname to an IP, checking that IP against a blocklist, and then connecting to the validated IP. Checking only the URL string can be bypassed with DNS rebinding, and redirects must be re-validated at every hop.
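The address check itself can be sketched with Node's built-in net.BlockList. The range list below is a starting point rather than a complete policy, and isBlockedAddress and isFetchableUrl are illustrative names. The sketch covers only the blocklist step: the caller is still assumed to resolve the hostname, pass each resolved IP through the check, and connect to that same IP.

```javascript
const net = require('node:net');

const blocked = new net.BlockList();
blocked.addSubnet('0.0.0.0', 8);       // "this network"
blocked.addSubnet('10.0.0.0', 8);      // RFC 1918
blocked.addSubnet('172.16.0.0', 12);   // RFC 1918
blocked.addSubnet('192.168.0.0', 16);  // RFC 1918
blocked.addSubnet('127.0.0.0', 8);     // loopback
blocked.addSubnet('169.254.0.0', 16);  // link-local, includes 169.254.169.254 metadata
blocked.addAddress('::1', 'ipv6');     // IPv6 loopback
blocked.addSubnet('fc00::', 7, 'ipv6');  // IPv6 unique local
blocked.addSubnet('fe80::', 10, 'ipv6'); // IPv6 link-local

// Takes an already-resolved IP literal. Anything that is not an IP is refused.
function isBlockedAddress(ip) {
  const family = net.isIP(ip); // 4, 6, or 0 when not an IP literal
  if (family === 0) return true;
  return blocked.check(ip, family === 6 ? 'ipv6' : 'ipv4');
}

// Scheme allow-list: rejects file://, gopher://, and anything else non-HTTP.
function isFetchableUrl(rawUrl) {
  const { protocol } = new URL(rawUrl);
  return protocol === 'http:' || protocol === 'https:';
}
```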
16

Are shell-executing tools free of command injection? (No exec() or shell: true with unsanitized input.) Blocker

Command injection via shell interpolation is an instant RCE. Grep for exec(, spawn(, shell: true, and execSync(. Any user-controlled variable interpolated into a shell command string is a blocker. Use execFile with argument arrays instead.
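For comparison, here is what the execFile pattern looks like in a directory-listing tool. The tool itself is hypothetical; the point is that the user-supplied value travels as one argument in an array and never passes through a shell.

```javascript
const { execFileSync } = require('node:child_process');

// Lists a directory without invoking a shell. Metacharacters such as ';' or '$()'
// in userDir are passed to ls as literal characters of a single argument.
function listDirectory(userDir) {
  return execFileSync('ls', ['-1', '--', userDir], {
    encoding: 'utf8',
    timeout: 5000,                       // do not let a hung child block the server
    stdio: ['ignore', 'pipe', 'pipe'],
  });
}
```

The '--' argument ends option parsing, so a value beginning with '-' is treated as a path rather than as a flag to ls.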
17

Are database-accessing tools using parameterized queries (no string interpolation into SQL)? Blocker

SQL injection in an MCP server is particularly dangerous because the LLM can be manipulated into crafting injection payloads via prompt injection. Check for `SELECT ... ${userInput}` patterns. All database queries must use prepared statements or an ORM that enforces parameterization.
18

Does the server sanitize or strip content returned from external sources before returning it to the LLM? Major

External content (web pages, files, database rows) can contain prompt injection payloads — text designed to hijack the LLM's behavior. A server that fetches and returns raw HTML gives an attacker control over what the agent does next. See the prompt injection anatomy post for detection patterns.
19

Are file upload tools restricted to expected MIME types and file sizes, with server-side validation? Major

Client-supplied Content-Type headers cannot be trusted. Use file-type or equivalent magic-byte detection server-side. Enforce size limits before reading the stream to prevent memory exhaustion.
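To show what magic-byte detection means in practice, the sketch below checks the leading bytes of an upload against two well-known signatures (PNG and PDF). It stands in for a library such as file-type and covers only these two formats; sniffMimeType is an illustrative name.

```javascript
// Leading-byte signatures for the formats this hypothetical tool accepts.
const SIGNATURES = {
  'image/png': Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]),
  'application/pdf': Buffer.from('%PDF-', 'latin1'),
};

// Returns the detected MIME type, or null if the bytes match no allowed format.
// The client-supplied Content-Type header is deliberately not consulted.
function sniffMimeType(buf) {
  for (const [mime, sig] of Object.entries(SIGNATURES)) {
    if (buf.length >= sig.length && buf.subarray(0, sig.length).equals(sig)) {
      return mime;
    }
  }
  return null;
}
```

Only the first few bytes are needed, so this check can run on the first chunk of the stream, before the rest of the upload is read.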
20

Are string length limits enforced on all tool inputs to prevent DoS via large payloads? Minor

Unbounded string inputs can exhaust memory, CPU (via regex backtracking), or downstream service quotas. Add maxLength constraints in JSON Schema and reject early in validation middleware.

4. Network Security & Transport 5 questions

Network questions cover TLS configuration, port exposure, and trust boundary assumptions between the MCP server and downstream services.

21

Is all inter-service communication encrypted (TLS 1.2+ with valid certificates — no rejectUnauthorized: false)? Blocker

Disabled TLS verification is one of the most common findings in MCP server source audits. Grep for rejectUnauthorized: false, NODE_TLS_REJECT_UNAUTHORIZED=0, and verify: false. These patterns in non-test code are always a blocker.
22

Does the server bind only to necessary network interfaces? (Not 0.0.0.0 unless a public listener is intentional.) Major

An MCP server deployed as an internal service should bind to 127.0.0.1 or a private interface. Binding to 0.0.0.0 makes the server listen on every interface the container or VM has, including public ones; in a container, also check which ports are published to the host.
23

Are CORS origins restricted to known, specific domains (no wildcard * on credentialed endpoints)? Major

A permissive CORS policy on endpoints that accept credentials allows any website to make authenticated requests from a logged-in user's browser. Browsers refuse a literal Access-Control-Allow-Origin: * on credentialed requests, so check for origin: '*' combined with credentials: true and, more importantly, for configurations that reflect the request's Origin header (such as origin: true in the cors middleware), which have the same effect as a wildcard.
24

Are timeouts configured on all outbound HTTP calls to prevent resource exhaustion via slow-response attacks? Major

Without timeouts, a slow or unresponsive downstream server holds connections open indefinitely. Check that axios and got calls set an explicit timeout, and that fetch() calls, which have no timeout option of their own, pass signal: AbortSignal.timeout(ms).
25

Is the server's network topology documented? (What it connects to, what connects to it, what trust boundaries exist.) Minor

Without a network diagram or architecture doc, reviewers can't assess the lateral movement potential if the server is compromised. This is a hygiene item for production deployments, not a show-stopper for initial review.

5. Secrets Management 6 questions

Secrets management questions look for hardcoded credentials, insecure environment variable handling, and key rotation practices. See the secrets management deep-dive for full treatment.

26

Are secrets absent from the source code and git history (no API keys, passwords, or tokens in committed files)? Blocker

Run git log --all --full-history -- '*.env' | head and trufflehog git file://. to check history. A secret committed and then deleted is still in the git object store. The only remediation is secret rotation — the historical version is permanently compromised.
27

Are secret values never logged, even partially? (No console.log(config) or error serialization that dumps env vars.) Blocker

Grep for console.log(process.env, JSON.stringify(config, and console.log(...config. Error objects that include req.headers can also leak authorization headers in structured logs. Check error handler middleware for full request/response serialization.
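A common mitigation is to pass anything bound for the logger through a redaction step first. The sketch below is one possible shape; the key list and the redact function name are assumptions, and a real deployment would extend the list to match its own config and header names.

```javascript
// Key names whose values must never reach a log line (case-insensitive).
const SENSITIVE_KEY = /^(authorization|cookie|set-cookie|password|secret|token|api[-_]?key)$/i;

// Returns a deep copy of value with sensitive fields replaced by a marker.
function redact(value) {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, inner]) =>
        [key, SENSITIVE_KEY.test(key) ? '[REDACTED]' : redact(inner)])
    );
  }
  return value;
}
```

Logging redact(req.headers) instead of req.headers keeps the useful fields while dropping the Authorization value.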
28

Is there a documented secret rotation procedure and has it been tested? Major

An undocumented rotation procedure means secrets stay compromised for hours or days after detection while someone figures out the process under pressure. The procedure should be written, tested in staging, and reviewed in this security check.
29

Are secrets scoped to minimum necessary permissions? (DB credentials with only SELECT, not superuser.) Major

Overly privileged credentials turn a credential leak into a full compromise. Database credentials should use a role with only the SQL operations the server actually needs. API keys should use the minimal OAuth scope.
30

Is a secrets manager (AWS Secrets Manager, Vault, Doppler) used rather than environment variables for sensitive credentials? Minor

Environment variables are visible in process listings, crash dumps, and some monitoring tools. For credentials that access sensitive data, use a secrets manager with audit logging and automatic rotation. This is a hygiene item for most contexts, a major item for PII-handling servers.
31

Is a .env.example file maintained with placeholder values (not real ones) to document required secrets? Minor

Developers sometimes commit real .env files because they don't know what's expected to stay local. A maintained .env.example removes the ambiguity and should be the documented onboarding reference.

6. Supply Chain Integrity 7 questions

Supply chain questions focus on dependency risk, provenance, and the risk that a compromised upstream package introduces malicious behavior into the MCP server. See the supply chain attestation page for SLSA and Sigstore patterns.

32

Are all dependencies pinned to exact versions in package.json (no ^ or ~ ranges for production deps)? Blocker

Unpinned ranges allow a malicious or buggy package update to silently enter your dependency tree on the next install. Pin production dependencies to exact versions and commit a lockfile (package-lock.json / yarn.lock), since the lockfile is what fixes transitive versions. CI should run npm ci, not npm install.
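The pinning check is easy to script. The sketch below takes a parsed package.json object and reports every production dependency whose version is not an exact semver string; findUnpinned is an illustrative name, and the regex is a simplified semver pattern rather than a full parser.

```javascript
// Matches exact versions such as 1.2.3 or 1.0.0-beta.1, but not ^1.2.3, ~1.2.3, *, or tags.
const EXACT_VERSION = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/;

// Returns "name@range" for each production dependency that is not pinned exactly.
function findUnpinned(pkg) {
  return Object.entries(pkg.dependencies ?? {})
    .filter(([, range]) => !EXACT_VERSION.test(range))
    .map(([name, range]) => `${name}@${range}`);
}
```

In a review, feed it JSON.parse(fs.readFileSync('package.json', 'utf8')) and treat any non-empty result as a finding for question 32.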
33

Are there no known high/critical CVEs in the dependency tree (npm audit returns 0 high+)? Major

Run npm audit --audit-level=high. Any high or critical finding is a Major item for production deployment. Check whether the vulnerable code is actually reachable from this server: a critical CVE in a test-only dependency is lower risk than one in a runtime path.
34

Do postinstall scripts execute, and if so, have they been reviewed? Major

Malicious packages often use postinstall hooks to exfiltrate environment variables at install time. Run npm install --ignore-scripts and check what breaks — anything that breaks was running code at install time. Review those scripts explicitly.
35

Is the published artifact (npm package or Docker image) built from a verified, auditable CI pipeline? Major

A developer's local machine may be compromised. Packages should be built and published from CI with SLSA provenance attestations. Check whether the package's published version matches the GitHub release SHA.
36

Is the dependency count reasonable (fewer than ~50 direct+transitive for a focused tool server)? Minor

Every dependency is an attack surface. An MCP server with 300 transitive dependencies for simple REST proxying is carrying disproportionate supply chain risk. Audit with npm ls --all to see the full tree (npm ls --depth=0 lists only direct dependencies) and question any heavyweight framework dependencies.
37

Is a Software Bill of Materials (SBOM) generated and attached to releases? Minor

An SBOM lets downstream users run their own vulnerability scans and satisfy regulatory SBOM requirements (EO 14028 in the US, CRA in the EU). Generate with cdxgen or cyclonedx-npm and attach the JSON to GitHub releases.
38

Is there a SECURITY.md file with a responsible disclosure policy and contact address? Minor

Without a disclosed vulnerability reporting path, researchers finding security issues have no clear channel and may disclose publicly. A SECURITY.md at the repo root with a contact email and expected response time is a minimal standard.

7. Observability & Auditability 6 questions

Observability questions cover whether the server generates the structured log data needed to detect, investigate, and respond to incidents. A server you can't observe is a server you can't secure in production.

39

Does every tool invocation produce a structured log entry (tool name, caller identity, timestamp, outcome)? Blocker

Without per-invocation logs, you cannot reconstruct what an agent did in an incident. This is the minimum audit trail required for any production deployment. Check the handler code for a logging call that captures at least these four fields.
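As a reference point for what to look for in handler code, the sketch below wraps a tool handler so every call emits one JSON line with the four fields. It assumes synchronous handlers and a hypothetical ctx.callerId; an async handler would need the same logic around an awaited call. withAuditLog and the write parameter are illustrative names.

```javascript
// Wraps a handler so each invocation produces exactly one structured log line
// containing tool name, caller identity, timestamp, and outcome.
function withAuditLog(toolName, handler, write = (line) => process.stdout.write(line + '\n')) {
  return (args, ctx) => {
    const entry = {
      tool: toolName,
      caller: ctx.callerId,
      timestamp: new Date().toISOString(),
    };
    try {
      const result = handler(args, ctx);
      write(JSON.stringify({ ...entry, outcome: 'success' }));
      return result;
    } catch (err) {
      write(JSON.stringify({ ...entry, outcome: 'error', error: err.message }));
      throw err; // logging must not swallow the failure
    }
  };
}
```

Tool arguments are left out of the entry on purpose, since they may carry secrets or PII.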
40

Are authentication events (success, failure, token refresh) logged with enough context to detect brute-force? Major

Auth failure logs need: timestamp, source IP, user identifier (not password), and outcome. Without IP logging, distributed brute-force attacks are invisible. Without user identifier, you can't see which accounts are being targeted.
41

Are logs written to an external sink (not just stdout/stderr), with structured JSON format? Major

Container stdout logs are ephemeral: they are lost when the container is removed or its log files rotate, unless a collector forwards them. For incident investigation, logs must be forwarded to a persistent sink. Structured JSON (not free-form text) is required for reliable parsing and alerting.
42

Is there an alert or metric for anomalous tool call volume that would surface an agent-loop or abuse incident? Major

Runaway agent loops and abuse both generate abnormal call volumes. A simple rate-of-invocation metric with a threshold alert catches both. Without this, an agent loop can consume quota or downstream API credits undetected.
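A rate-of-invocation metric can be as simple as a sliding-window counter. The sketch below is an in-process illustration with made-up parameter names (windowMs, threshold); in production the same idea usually lives in the metrics or alerting system rather than in the server itself.

```javascript
// Counts tool calls inside a sliding time window and flags when the
// count exceeds a threshold.
class CallRateMonitor {
  constructor({ windowMs, threshold }) {
    this.windowMs = windowMs;
    this.threshold = threshold;
    this.timestamps = [];
  }

  // Records one call at time `now` (ms). Returns true when the alert should fire.
  record(now = Date.now()) {
    this.timestamps.push(now);
    const cutoff = now - this.windowMs;
    while (this.timestamps.length > 0 && this.timestamps[0] <= cutoff) {
      this.timestamps.shift(); // drop calls that fell out of the window
    }
    return this.timestamps.length > this.threshold;
  }
}
```

Keeping one monitor per caller or per tool makes the alert more specific than a single global counter.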
43

Can a specific tool invocation be correlated to the originating conversation via a trace ID? Minor

Distributed tracing across the LLM → MCP server → downstream service boundary lets you reconstruct the full causal chain for a security incident. This requires passing a correlation ID from the tool call through to all downstream calls.
44

Are log retention policies defined and do they meet your compliance requirements? Minor

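GDPR's storage-limitation principle means PII-containing logs must be deleted once the retention period ends. SOC 2 does not prescribe a fixed period, though auditors commonly expect at least 90 days. In healthcare contexts, HIPAA's six-year documentation retention requirement is often applied to audit logs. Know what's in your logs and how long you're keeping it.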

8. Incident Response Readiness 6 questions

Incident response questions measure whether your team is prepared to respond when something goes wrong — not if, when. These are organizational, not just code-level, questions.

45

Is there a named owner for this MCP server with an on-call escalation path? Major

Ownerless infrastructure gets ignored in incidents. Record in your service catalog: team owner, primary on-call contact, escalation path. Community MCP servers without a named owner for your deployment create a response gap.
46

Can the server be disabled or have its credentials rotated in under 15 minutes during an active incident? Major

The mean time to contain matters as much as detection. Test your kill switch: can you revoke the API key this server uses, rotate its JWT signing secret, or disable it at the load balancer — and have that take effect within 15 minutes? If not, document the gap.
47

Is there a written runbook for the most likely security incidents (credential compromise, prompt injection, data exfiltration)? Major

During an active incident, you do not want to be designing the response process. A runbook with decision trees for each scenario reduces MTTR and prevents ad-hoc decisions under pressure. See the MCP incident response playbook post for a template.
48

Have you run a tabletop exercise simulating an MCP server compromise in the last 6 months? Minor

A tabletop exercise surfaces gaps in your runbook and incident coordination before a real incident. For MCP servers handling sensitive data or agentic workflows with write access, a 2-hour tabletop at least once a year is a reasonable floor; the six-month window in the question is the target.
49

Does the server's privacy policy or data processing agreement cover the data types being passed through tool calls? Minor

Agentic workflows often pass PII (names, emails, documents) through MCP tools without the data ever being shown to a human reviewer. Confirm that the server's privacy policy and your DPA cover this processing.
50

Is there a cadence for re-reviewing this MCP server as its codebase and dependency tree evolve? Minor

A server that passed review six months ago may have introduced new dependencies, new tool handlers, or new configuration since. For actively developed servers, schedule a re-review on every major version bump. For stable servers, set a calendar reminder for annual re-review.

Scoring summary

Tally your findings by severity. A deployable server has zero Blockers, ideally zero Majors (or a remediation plan with deadlines for any remaining), and Minor items tracked as tech debt.

Category	Blocker	Major	Minor	Total
1. Authentication	3	2	1	6
2. Authorization & Permissions	2	3	2	7
3. Input Validation & Injection	4	2	1	7
4. Network Security	1	3	1	5
5. Secrets Management	2	2	2	6
6. Supply Chain	1	3	3	7
7. Observability	1	3	2	6
8. Incident Response	0	3	3	6
Total	14	21	15	50

Automate the automatable parts. About 30 of these 50 questions can be partially answered by a static analysis tool or automated scanner. Run a SkillAudit scan first to flag the obvious issues, then use this checklist for the questions that require reading context, testing behavior, and evaluating organizational readiness. The two approaches are complementary, not redundant.

How SkillAudit maps to this checklist

When you run an automated audit on SkillAudit, the findings map to this checklist as follows:

Security axis → questions 14–20 (input validation) and 21–25 (network security)
Permissions hygiene axis → questions 7–13 (authorization)
Credential exposure axis → questions 26–31 (secrets management)
Maintenance axis → questions 32–38 (supply chain) and 50 (re-review cadence)
Client compatibility axis → not covered in this checklist (separate evaluation)
Documentation axis → questions 13, 37, 38

The automated scan cannot assess questions 39–49 (observability and most of incident response); these require organizational context, so treat them as the human reviewer's exclusive domain. Question 50 gets only a partial signal from the Maintenance axis.