MCP server formal verification: TLA+, Alloy, and model checking for protocol security
Testing can show the presence of bugs. Formal verification can prove their absence, at least within the model being checked. For MCP server security properties such as "a tool call is never authorized without a valid session token" or "a revoked permission can never be used to access a resource", model checking can exhaustively verify the property over all reachable states of the protocol specification. That is a qualitatively different kind of assurance from what a test suite provides.
What formal verification addresses in MCP security
Formal methods are most valuable for verifying protocol properties — not implementation correctness, but the logical design of the state machine and access control model. For MCP servers, the high-value targets are:
- Authentication completeness: is there any sequence of tool calls that results in an unauthenticated call being authorized?
- Permission revocation safety: after a permission is revoked, does the model guarantee no tool call can succeed with the revoked permission?
- Session isolation: in a multi-tenant model, can session A ever access resources belonging to session B?
- Upgrade atomicity: during a rolling deploy where both V1 and V2 server instances are running, can a token issued by V1 gain V2-level permissions?
These properties are difficult or impossible to verify through testing because the relevant state spaces are large, the failure conditions depend on specific sequences of events, and the bugs are logic errors (not crashes or exceptions).
TLA+ for MCP authentication state machines
TLA+ is a specification language for concurrent and distributed systems, built on the Temporal Logic of Actions (TLA). Created by Leslie Lamport, who later continued its development at Microsoft Research, it is used by Amazon (S3, DynamoDB, EBS), Google, and Microsoft Azure to verify distributed protocol properties.
For an MCP server's authentication model, a TLA+ spec defines the system's state (active sessions, token validity, permission grants), the allowed actions (session creation, tool call, token revocation, session expiry), and the invariants that must hold across all reachable states. The TLC model checker then exhaustively explores every reachable state of a finite instance of the model (for example, a fixed small number of sessions and tokens) and verifies that the invariants hold in each one.
A typical authentication invariant for an MCP server in TLA+:
    AuthorizedCallRequiresValidSession ==
        \A call \in ActiveToolCalls :
            \E session \in Sessions :
                /\ session.id = call.session_id
                /\ session.status = "active"
                /\ session.token_expiry > CurrentTime
TLC verifies this invariant holds across all reachable system states. If it finds a counterexample (a sequence of actions that reaches a state where the invariant is violated), it outputs the exact state sequence — a minimal bug reproduction.
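To make the idea of exhaustive exploration concrete, here is a minimal sketch in Python of what an explicit-state model checker does: a breadth-first search over reachable states that checks an invariant in each one and reconstructs the shortest action sequence to a violation. It is not TLC and not a translation of the spec above. The two-session scope, the action names, and the `revoke_cancels_calls` switch are all assumptions made for illustration, and token expiry is left out to keep the state space tiny.

```python
from __future__ import annotations

from collections import deque
from typing import Iterator, NamedTuple, Optional

SESSION_IDS = ("s1", "s2")  # hypothetical small scope: two sessions


class State(NamedTuple):
    active: frozenset  # session ids whose status is "active"
    calls: frozenset   # session ids with an authorized call in flight


INITIAL = State(frozenset(), frozenset())


def next_states(state: State, revoke_cancels_calls: bool) -> Iterator[tuple[str, State]]:
    """Yield (action label, successor state) for every enabled action."""
    for sid in SESSION_IDS:
        if sid not in state.active:
            yield f"CreateSession({sid})", state._replace(active=state.active | {sid})
        else:
            # A tool call can only start while its session is active.
            yield f"ToolCall({sid})", state._replace(calls=state.calls | {sid})
            # Design choice under test: does revocation cancel in-flight calls?
            calls = state.calls - {sid} if revoke_cancels_calls else state.calls
            yield f"Revoke({sid})", State(state.active - {sid}, calls)
        if sid in state.calls:
            yield f"CompleteCall({sid})", state._replace(calls=state.calls - {sid})


def invariant(state: State) -> bool:
    """Every in-flight call belongs to an active session."""
    return state.calls <= state.active


def check(revoke_cancels_calls: bool) -> Optional[list[str]]:
    """Return None if the invariant holds in every reachable state,
    otherwise the shortest action sequence that reaches a violation."""
    parent = {INITIAL: None}  # state -> (action, previous state)
    queue = deque([INITIAL])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            trace = []
            while parent[state] is not None:
                action, state = parent[state]
                trace.append(action)
            return trace[::-1]
        for action, succ in next_states(state, revoke_cancels_calls):
            if succ not in parent:
                parent[succ] = (action, state)
                queue.append(succ)
    return None


print(check(revoke_cancels_calls=True))
print(check(revoke_cancels_calls=False))
```

With `revoke_cancels_calls=True` the search visits every reachable state and finds no violation. With `False` it reports a three-step trace: create a session, start a call, revoke the session while the call is still in flight. That is the kind of sequence-dependent logic error a hand-written test is unlikely to target.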
Alloy for access control models
Alloy is a relational modeling language that excels at verifying access control invariants. Where TLA+ is oriented toward temporal properties in concurrent systems, Alloy is oriented toward structural properties of data models — "who can access what" relationships.
For MCP server permission models — tool-level permissions, resource-level scoping, role-based access — Alloy's Analyzer can check: is there any assignment of users, roles, and permissions that would allow a user to access a resource they should not access? This is particularly useful for complex MCP permission models where roles have implicit inheritance or where tool permissions compose.
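The kind of question the Alloy Analyzer answers can be approximated by brute force at a very small scope. The Python sketch below is not Alloy and does not use its solver. It simply enumerates every possible role-inheritance relation over three hypothetical roles and looks for one that lets `guest` reach a permission it should never have. The role names, permission names, and the ranking rule are all assumptions for illustration.

```python
from itertools import chain, combinations, product

ROLES = ("guest", "editor", "admin")          # hypothetical roles
RANK = {"guest": 0, "editor": 1, "admin": 2}  # hypothetical privilege ranking
DIRECT = {                                    # direct grants, fixed in this sketch
    "guest": {"read_resource"},
    "editor": {"write_resource"},
    "admin": {"delete_resource"},
}
# Candidate (child, parent) inheritance edges between distinct roles.
EDGES = [(a, b) for a, b in product(ROLES, ROLES) if a != b]


def effective_permissions(role, inherits):
    """Direct permissions of `role` plus everything reachable via inheritance."""
    seen, stack = set(), [role]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parent for child, parent in inherits if child == current)
    return set().union(*(DIRECT[r] for r in seen))


def all_inheritance_relations():
    """Every subset of the candidate edges: 2**6 = 64 relations at this scope."""
    return chain.from_iterable(combinations(EDGES, n) for n in range(len(EDGES) + 1))


def find_counterexample(allowed):
    """Return a relation that satisfies `allowed` yet lets guest delete, or None."""
    for inherits in all_inheritance_relations():
        if allowed(inherits) and "delete_resource" in effective_permissions("guest", inherits):
            return inherits
    return None


def unconstrained(inherits):
    """No design rule: any inheritance relation is permitted."""
    return True


def only_inherit_downward(inherits):
    """Design rule: a role may inherit only from strictly lower-ranked roles."""
    return all(RANK[parent] < RANK[child] for child, parent in inherits)


print(find_counterexample(unconstrained))
print(find_counterexample(only_inherit_downward))
```

Without any rule on inheritance, the search immediately finds the relation in which `guest` inherits from `admin`. Once inheritance is restricted to lower-ranked roles, no relation at this scope violates the assertion. Alloy performs the same style of bounded search, but with a declarative language and a SAT-based solver that scale to far larger models than explicit enumeration can.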
Property-based testing as a practical bridge
Full formal verification has a steep learning curve. For MCP server teams not ready to invest in TLA+ or Alloy, property-based testing with fast-check (TypeScript/JavaScript) or Hypothesis (Python) provides much of the benefit for a fraction of the investment.
Property-based testing generates hundreds or thousands of random inputs and checks that the specified properties hold for every one of them. Unlike model checking, it samples the input space rather than exhausting it. For MCP security properties:
- For any randomly generated tool call without a valid token, authorize() always returns an error (verified across thousands of random call shapes)
- For any randomly generated permission revocation followed by a tool call using the revoked permission, the call always fails (verified across thousands of random revocation/call sequences)
- For any randomly generated pair of session IDs, session A's resources are never accessible from session B's context (verified across thousands of random session pairs)
Property-based testing finds a meaningful fraction of the bugs that model checking would find, at much lower cost — and it integrates directly into existing CI test suites.
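The revocation and isolation properties can be sketched with nothing more than Python's standard library. This is a hand-rolled stand-in, not Hypothesis or fast-check, which add smarter input generation and shrinking of failing cases. `ToyAuthorizer`, `LeakyAuthorizer`, and the tool and session names are hypothetical; a real test would drive the server's own authorization layer instead.

```python
import random

TOOLS = ("read_file", "write_file", "run_query")  # hypothetical tool names
SESSIONS = ("s1", "s2", "s3")                     # hypothetical session ids


class ToyAuthorizer:
    """Hypothetical stand-in for the server's authorization layer."""

    def __init__(self):
        self.grants = set()  # (session_id, tool) pairs currently granted

    def grant(self, session_id, tool):
        self.grants.add((session_id, tool))

    def revoke(self, session_id, tool):
        self.grants.discard((session_id, tool))

    def authorize(self, session_id, tool):
        return (session_id, tool) in self.grants


class LeakyAuthorizer(ToyAuthorizer):
    """Deliberately buggy variant: revocation is silently ignored."""

    def revoke(self, session_id, tool):
        pass


def random_ops(rng, length):
    """A random sequence of (operation, session, tool) triples."""
    return [
        (rng.choice(("grant", "revoke")), rng.choice(SESSIONS), rng.choice(TOOLS))
        for _ in range(length)
    ]


def check_revocation_property(factory=ToyAuthorizer, runs=2000, seed=0):
    """Property: authorize() succeeds exactly when the latest operation on that
    (session, tool) pair was a grant. A revoked permission never works, and a
    grant to one session never leaks to another."""
    rng = random.Random(seed)  # seeded so a failure is reproducible
    for _ in range(runs):
        auth = factory()
        expected = {}  # reference model: last operation per (session, tool)
        for op, session_id, tool in random_ops(rng, rng.randint(0, 20)):
            getattr(auth, op)(session_id, tool)
            expected[(session_id, tool)] = op
            # Check every pair after each step, not only the one just touched.
            for s in SESSIONS:
                for t in TOOLS:
                    want = expected.get((s, t)) == "grant"
                    if auth.authorize(s, t) != want:
                        return False
    return True


print(check_revocation_property(ToyAuthorizer))
print(check_revocation_property(LeakyAuthorizer))
```

Comparing the implementation against a tiny reference model after every step is what lets one property cover both revocation safety and session isolation. The correct authorizer passes all runs, while the leaky one is caught as soon as a random sequence revokes a previously granted permission.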
SkillAudit and formal verification signals
SkillAudit's static analysis is not formal verification — it cannot prove the absence of bugs. But the security axis findings (missing authentication checks, incorrect permission logic, scope validation gaps) are exactly the types of errors that formal verification would catch. An MCP server with a clean SkillAudit report is a better candidate for formal verification investment than one with unresolved findings — fix the known issues first, then use formal methods to prove the absence of the unknown ones.
Start with a SkillAudit scan to identify the known authentication and permission logic issues before investing in TLA+ or Alloy specification work.