MCP server API contract testing security: schema validation, fuzzing, and Schemathesis

API contract testing verifies that your MCP server behaves consistently with its declared schema — and that it handles violations safely. Combined with property-based fuzzing, it catches entire classes of input validation vulnerabilities that unit tests miss.

Why contract testing finds security bugs

Many MCP server input-validation vulnerabilities come from a gap between the declared input schema and the validation the handler actually performs. A tool may declare a parameter as type: string, maxLength: 256 in its inputSchema, but if the server doesn't enforce that constraint in handler code, attackers can pass 10 MB strings, and without further pattern or format checks, null bytes or SQL injection payloads.

Contract testing systematically probes that gap by generating thousands of inputs of three kinds: inputs that conform to the schema (to test happy paths), inputs that violate it (to test error handling), and inputs at the boundary of valid values (to test edge cases).
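The three input classes can be sketched for a single string constraint. This is a minimal illustration, not a fuzzer: stringCases and conforms are hypothetical helpers written for this example, and they only understand type: string with maxLength.

```javascript
// Sketch: derive three classes of test inputs from a string constraint.
// `stringCases` and `conforms` are hypothetical helpers, not library APIs.
function stringCases({ maxLength }) {
  return {
    // Conform to the schema: should be accepted.
    valid: ['a', 'a'.repeat(Math.floor(maxLength / 2))],
    // Sit on the boundary: accepted at maxLength, rejected one past it.
    boundary: [
      { value: 'a'.repeat(maxLength), expectValid: true },
      { value: 'a'.repeat(maxLength + 1), expectValid: false },
    ],
    // Violate the schema: wrong types and oversized values should be rejected.
    invalid: [null, 42, ['a'], { nested: 'a' }, 'a'.repeat(maxLength * 100)],
  };
}

// Reference check for { type: 'string', maxLength } only.
// JSON Schema counts Unicode code points, so spread the string instead of
// using .length (which counts UTF-16 code units).
function conforms(value, { maxLength }) {
  return typeof value === 'string' && [...value].length <= maxLength;
}

const schema = { type: 'string', maxLength: 256 };
for (const { value, expectValid } of stringCases(schema).boundary) {
  const verdict = conforms(value, schema) === expectValid ? 'ok' : 'MISMATCH';
  console.log(value.length, verdict);
}
```

In a real suite, each generated value would be sent to the tool handler and the handler's accept or reject decision compared with the expectation.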

Pattern 1: JSON Schema enforcement with Ajv

Before writing fuzz tests, enforce the contract in your MCP tool handler. Use Ajv (Another JSON Validator) with strict mode:

import Ajv from 'ajv';
import addFormats from 'ajv-formats';
import { CallToolRequestSchema } from '@modelcontextprotocol/sdk/types.js';

const ajv = new Ajv({
  strict: true,        // disallow unknown keywords
  allErrors: true,     // collect all errors, not just the first
  coerceTypes: false,  // NEVER coerce — coercion hides injection payloads
});
addFormats(ajv);

const fetchUrlSchema = {
  type: 'object',
  required: ['url'],
  additionalProperties: false,
  properties: {
    url: {
      type: 'string',
      format: 'uri',
      maxLength: 2048,
      pattern: '^https://',  // enforce HTTPS, block file:// and http://
    },
    timeout_ms: {
      type: 'integer',
      minimum: 100,
      maximum: 10000,
    },
  },
};

const validate = ajv.compile(fetchUrlSchema);

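// `server` is assumed to be an MCP SDK Server instance created elsewhere.
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  // Reject before any handler logic touches the arguments.
  if (!validate(request.params.arguments)) {
    return {
      content: [{ type: 'text', text: 'Invalid arguments' }],
      isError: true,
      _meta: { errors: validate.errors },  // validation details for the caller
    };
  }
  // safe to proceed
});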

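Key settings: coerceTypes: false prevents Ajv from accepting the string "true" for a boolean parameter, a coercion that can mask malformed or malicious input. This is also Ajv's default, but setting it explicitly guards against someone enabling coercion later. additionalProperties: false rejects extra fields that the schema does not declare and that would otherwise reach the handler unvalidated.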
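The handler above checks every call against a single schema. With more than one tool, one option is a lookup table of validators keyed by tool name that also rejects unknown tools. The sketch below is dependency-free for illustration: validators and validateToolCall are hypothetical names, and the hand-written check stands in for a function compiled with ajv.compile.

```javascript
// Sketch: one validator per tool, looked up by name. In a real server each
// entry would be an Ajv-compiled function; the hand-written stand-in below
// only keeps this example free of dependencies.
const validators = new Map([
  [
    'fetch_url',
    (args) =>
      typeof args === 'object' &&
      args !== null &&
      typeof args.url === 'string' &&
      args.url.startsWith('https://'),
  ],
]);

function validateToolCall(name, args) {
  const validate = validators.get(name);
  // A tool without a registered validator is rejected, never passed through.
  if (!validate) return { ok: false, reason: 'Unknown tool' };
  if (!validate(args)) return { ok: false, reason: 'Invalid arguments' };
  return { ok: true };
}

console.log(validateToolCall('fetch_url', { url: 'https://example.com/' }));
console.log(validateToolCall('fetch_url', { url: 'file:///etc/passwd' }));
console.log(validateToolCall('delete_everything', {}));
```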

Pattern 2: Schemathesis property-based fuzzing

Schemathesis generates thousands of test cases from your OpenAPI definition, including values that are structurally valid but semantically adversarial. For an MCP server with an HTTP transport layer:

# Install
pip install schemathesis

# Run against the running MCP server (adjust port; flags as in Schemathesis 3.x)
st run http://localhost:3000/openapi.json \
  --checks all \
  --hypothesis-max-examples 500 \
  -H "Authorization: Bearer $TEST_TOKEN" \
  --stateful=links \
  --experimental=openapi-3.1

For testing MCP-specific tool endpoints, generate a thin OpenAPI wrapper that maps tool calls to HTTP POST requests, then fuzz that. Key Schemathesis checks to enable:

not_a_server_error — no 5xx responses (tool handlers must not panic on bad input)
response_schema_conformance — response body must match declared schema
content_type_conformance — response Content-Type matches declaration
use_after_free — a resource must not remain accessible after a successful DELETE (a stateful check that relies on API links)

# In CI (GitHub Actions)
- name: Fuzz MCP server tool API
  run: |
    st run http://localhost:3000/openapi.json \
      --checks not_a_server_error,response_schema_conformance \
      --hypothesis-max-examples 200 \
      -H "Authorization: Bearer ${{ secrets.TEST_TOKEN }}" \
      --junit-xml=schemathesis-results.xml
  continue-on-error: false
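The thin OpenAPI wrapper mentioned above can be generated from the server's tool list. A minimal sketch, assuming each tool is exposed as POST /mcp/tools/<name> with a JSON body of the form { arguments: ... }; that routing is an assumption about your HTTP layer, not something MCP defines, and toolsToOpenApi is a hypothetical helper.

```javascript
// Sketch: build a minimal OpenAPI 3.1 document from MCP tool definitions so
// Schemathesis has something to fuzz. The /mcp/tools/{name} path and the
// { arguments: ... } body shape are assumptions about the HTTP wrapper.
function toolsToOpenApi(tools, title = 'MCP tool wrapper') {
  const paths = {};
  for (const tool of tools) {
    paths[`/mcp/tools/${tool.name}`] = {
      post: {
        operationId: tool.name,
        requestBody: {
          required: true,
          content: {
            'application/json': {
              schema: {
                type: 'object',
                required: ['arguments'],
                additionalProperties: false,
                // Reuse the tool's declared inputSchema unchanged.
                properties: { arguments: tool.inputSchema },
              },
            },
          },
        },
        responses: {
          '200': { description: 'Tool result' },
          '422': { description: 'Arguments rejected by validation' },
        },
      },
    };
  }
  return { openapi: '3.1.0', info: { title, version: '0.0.0' }, paths };
}

// Example tool definition, mirroring the fetch_url schema shown earlier.
const exampleTools = [
  {
    name: 'fetch_url',
    inputSchema: {
      type: 'object',
      required: ['url'],
      additionalProperties: false,
      properties: { url: { type: 'string', format: 'uri', maxLength: 2048 } },
    },
  },
];

console.log(JSON.stringify(toolsToOpenApi(exampleTools), null, 2));
```

Serving this document at /openapi.json gives the st run commands above a target. For response_schema_conformance to be meaningful, the responses would also need declared body schemas, which this sketch leaves out.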

Pattern 3: Consumer-driven contracts with Pact

Pact tests verify that your MCP server's responses remain compatible with what the LLM client (or another consumer) expects. This catches security regressions where a "refactoring" removes a validation that a downstream consumer depended on:

// Consumer test — what the LLM agent expects from the fetch_url tool
const { Pact } = require('@pact-foundation/pact');

const provider = new Pact({
  consumer: 'agent-gateway',
  provider: 'mcp-fetch-server',
  port: 8080,
});

describe('MCP fetch tool security contract', () => {
  it('rejects SSRF target URLs with 422', async () => {
    await provider.addInteraction({
      state: 'server is running',
      uponReceiving: 'a tool call with an internal IP URL',
      withRequest: {
        method: 'POST',
        path: '/mcp/tools/fetch_url',
        body: { arguments: { url: 'http://169.254.169.254/latest/meta-data/' } },
      },
      willRespondWith: {
        status: 422,
        body: { isError: true },
      },
    });
    // verify the interaction...
  });
});

The Pact contract is published to a Pact Broker and verified against the real MCP server in CI. If the server stops returning 422 for SSRF URLs (e.g., someone removed the allowlist check during a refactor), the provider verification fails and the PR is blocked.
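For context, the provider-side check that this contract protects might look like the following. It is a sketch under assumptions: ALLOWED_HOSTS and checkFetchTarget are hypothetical names, and the allowlist contents are placeholders.

```javascript
// Sketch of a provider-side allowlist check that the Pact interaction
// would exercise. ALLOWED_HOSTS and checkFetchTarget are hypothetical.
const ALLOWED_HOSTS = new Set(['example.com', 'api.example.com']);

function checkFetchTarget(rawUrl, allowedHosts = ALLOWED_HOSTS) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    // Not a parseable URL at all.
    return { status: 422, body: { isError: true } };
  }
  // Only HTTPS to an explicitly allowlisted hostname is permitted, so
  // 169.254.169.254, localhost, and private ranges are rejected by default.
  const allowed = url.protocol === 'https:' && allowedHosts.has(url.hostname);
  return allowed
    ? { status: 200, body: { isError: false } }
    : { status: 422, body: { isError: true } };
}

console.log(checkFetchTarget('http://169.254.169.254/latest/meta-data/').status); // 422
console.log(checkFetchTarget('https://example.com/page').status); // 200
```

A hostname allowlist alone does not cover an allowed name that resolves to an internal address; a production check would also need to inspect the resolved IP.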

Pattern 4: Security-focused schema diff in CI

Gate PRs that change tool inputSchema with a schema diff check that flags security-relevant changes — widened types, removed constraints, added fields without validation:

# .github/workflows/schema-security-diff.yml
- name: Check schema security regression
  run: |
    node scripts/schema-security-diff.js \
      --base origin/main \
      --head HEAD \
      --fail-on-widened-types \
      --fail-on-removed-maxlength \
      --fail-on-added-additionalproperties-true

This catches the most common "refactoring" regression: a developer loosens a maxLength constraint or widens a type (for example, by dropping type: string so that any value is accepted) and doesn't realize the security implication.
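The core of such a diff script can be sketched as a recursive comparison of two schema versions. The logic below is a hypothetical illustration of the three checks named in the flags above; it only walks properties and ignores $ref, anyOf, and other composition keywords.

```javascript
// Sketch: compare two versions of a tool's inputSchema and report
// security-relevant loosening. Hypothetical logic, not an existing tool.
function typeSet(schema) {
  if (schema.type === undefined) return null; // null means unconstrained
  return new Set([].concat(schema.type));
}

function isWidened(baseTypes, headTypes) {
  if (baseTypes === null) return false; // base was already unconstrained
  if (headTypes === null) return true; // type constraint removed entirely
  // A new type counts as widening unless it narrows number to integer.
  return [...headTypes].some(
    (t) => !baseTypes.has(t) && !(t === 'integer' && baseTypes.has('number'))
  );
}

function diffSchemaSecurity(base, head, path = '#') {
  const findings = [];
  if (isWidened(typeSet(base), typeSet(head))) {
    findings.push(`${path}: type widened`);
  }
  if (base.maxLength !== undefined) {
    if (head.maxLength === undefined) {
      findings.push(`${path}: maxLength removed`);
    } else if (head.maxLength > base.maxLength) {
      findings.push(`${path}: maxLength raised`);
    }
  }
  if (base.additionalProperties === false && head.additionalProperties !== false) {
    findings.push(`${path}: additionalProperties no longer false`);
  }
  const baseProps = base.properties || {};
  const headProps = head.properties || {};
  for (const [name, headProp] of Object.entries(headProps)) {
    // Only compare properties that exist in both versions.
    if (Object.prototype.hasOwnProperty.call(baseProps, name)) {
      const propPath = `${path}/properties/${name}`;
      findings.push(...diffSchemaSecurity(baseProps[name], headProp, propPath));
    }
  }
  return findings;
}

const baseSchema = {
  type: 'object',
  additionalProperties: false,
  properties: { url: { type: 'string', maxLength: 2048 } },
};
const headSchema = {
  type: 'object',
  additionalProperties: true,
  properties: { url: { type: 'string' } },
};
console.log(diffSchemaSecurity(baseSchema, headSchema));
```

A CI wrapper would load the two schema versions from the base and head commits and exit non-zero when the findings list is non-empty.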

SkillAudit and contract testing

SkillAudit's static analysis checks whether your MCP tool inputSchema declarations have maxLength, pattern, and format constraints — and flags tools that declare type: string for URL parameters without format restrictions. The contract testing patterns here are the dynamic complement: run them in CI to verify the enforcement logic matches the declarations. Run a free audit to see how your server's input schemas score.