MCP server input fuzzing security — black-box fuzzing tool handlers, mutation testing, and crash triage for MCP APIs

MCP tool handlers differ from conventional REST APIs in one critical way: their inputs are generated by an LLM that has been processing arbitrary user text, including content from external documents, web pages, and tool outputs that may contain prompt injection payloads. The LLM's argument generation is non-deterministic: it can produce structurally valid JSON that is semantically absurd, and a prompt injection payload can nudge that generation toward edge cases that expose path traversals, type confusion bugs, and crashes. Black-box fuzzing helps you find these before attackers do.

Why MCP tool inputs are a fuzzing target

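Conventional API input validation assumes a human or a well-typed SDK is generating the request. MCP tool arguments come from an LLM, which means they can be structurally valid yet semantically absurd, and they can be steered toward hostile edge cases by prompt injection in whatever content the model has read.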
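
To make the point concrete, here is a minimal sketch of an argument that passes a type-level check yet is still hostile. The read_file argument shape, the /srv/data root, and both helper names are assumptions for illustration, not part of any MCP SDK.

```typescript
import * as path from 'node:path';

// Hypothetical argument shape for a read_file tool: one required string field.
type ReadFileArgs = { path: string };

// Type-level check only: roughly what a bare JSON Schema `type: "string"` buys you.
function isSchemaValid(args: unknown): args is ReadFileArgs {
  return typeof args === 'object' && args !== null &&
    typeof (args as Record<string, unknown>).path === 'string';
}

// Semantic check: does the requested path stay inside the allowed root?
// Assumes `root` is absolute and has no trailing slash. POSIX path rules are
// used so the result is the same on every platform.
function staysInsideRoot(root: string, requested: string): boolean {
  const resolved = path.posix.resolve(root, requested);
  return resolved === root || resolved.startsWith(root + '/');
}

// An argument object an LLM could plausibly emit after reading injected content.
const llmArgs: unknown = { path: '../../etc/passwd' };
const typeOk = isSchemaValid(llmArgs);
const semanticOk = isSchemaValid(llmArgs) && staysInsideRoot('/srv/data', llmArgs.path);
console.log({ typeOk, semanticOk });
```

The argument is a perfectly good string, so a type check alone accepts it; only the semantic check notices that it escapes the root. Inputs of this kind are what a fuzzer should generate in bulk.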

A minimal fuzzing harness for MCP tool handlers

You don't need AFL++ or a full fuzzing infrastructure to usefully fuzz an MCP server. A type-aware mutation fuzzer seeded from your tool's JSON Schema is often enough to surface a large share of handler crashes:

// fuzz-runner.ts — minimal schema-aware MCP tool fuzzer
import { myServer } from './server';

// Mutation strategies — each transforms a seed value
const mutators = {
  string: (v: string) => [
    '', // empty string
    v + '\x00', // null byte
    v.repeat(10_000), // oversized
    '../../../etc/passwd', // path traversal
    '\u202E' + v, // right-to-left override (U+202E), escaped so it stays visible in source
    "'; DROP TABLE users; --", // SQL injection attempt
    '{{' + v + '}}', // template injection
    null, // null where string expected
    42, // wrong type
  ],
  number: (v: number) => [
    -1, 0, -2147483648, 2147483647, 9999999999,
    0.1, NaN, Infinity, -Infinity,
    '42', // string coercion
    null, undefined,
  ],
  array: (v: unknown[]) => [
    [], // empty
    new Array(10_000).fill(v[0]), // oversized
    [...v, null], // null element
    null, // null where array expected
  ],
};

async function fuzzTool(toolName: string, seedArgs: Record<string, unknown>) {
  const results: Array<{ args: unknown; error: string }> = [];

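  for (const [key, seedValue] of Object.entries(seedArgs)) {
    // Pick mutations by the seed's runtime type. Booleans, objects, and other
    // types get a small generic set: passing them to the string mutators would
    // make the fuzzer itself throw on v.repeat().
    const mutations: unknown[] =
      typeof seedValue === 'string' ? mutators.string(seedValue)
      : typeof seedValue === 'number' ? mutators.number(seedValue)
      : Array.isArray(seedValue) ? mutators.array(seedValue)
      : [null, '', 0];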

    for (const mutant of mutations) {
      const args = { ...seedArgs, [key]: mutant };
      try {
        await myServer.callTool(toolName, args);
      } catch (err) {
        const msg = err instanceof Error ? err.message : String(err);
        // Expected validation errors are OK; unexpected crashes are findings
        if (!msg.includes('Invalid') && !msg.includes('required')) {
          results.push({ args, error: msg });
        }
      }
    }
  }

  return results;
}

// Example: fuzz read_file tool
const crashes = await fuzzTool('read_file', { path: 'readme.md' });
console.log('Unexpected crashes:', crashes.length);
crashes.forEach(c => console.log(JSON.stringify(c)));
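
The harness above infers mutation types from the seed values you pass in. To tie it to the tool's declared schema instead, you could derive the seeds from the JSON Schema itself. This is a minimal sketch, assuming a flat schema whose properties each declare a single type; the SimpleSchema type and seedFromSchema helper are hypothetical and ignore $ref, oneOf, nested objects, and other schema features.

```typescript
// Simplified subset of JSON Schema, assumed for illustration only.
type SimpleProperty = {
  type: 'string' | 'number' | 'integer' | 'boolean' | 'array';
  default?: unknown;
};
type SimpleSchema = { type: 'object'; properties: Record<string, SimpleProperty> };

// Build one benign seed value per declared property.
function seedFromSchema(schema: SimpleSchema): Record<string, unknown> {
  const seed: Record<string, unknown> = {};
  for (const [name, prop] of Object.entries(schema.properties)) {
    if (prop.default !== undefined) {
      seed[name] = prop.default; // prefer the schema's own default when present
      continue;
    }
    switch (prop.type) {
      case 'string': seed[name] = 'seed'; break;
      case 'number':
      case 'integer': seed[name] = 1; break;
      case 'boolean': seed[name] = true; break;
      case 'array': seed[name] = ['seed']; break;
    }
  }
  return seed;
}

// Hypothetical schema for the read_file tool used in the example above.
const readFileSchema: SimpleSchema = {
  type: 'object',
  properties: {
    path: { type: 'string', default: 'readme.md' },
    maxBytes: { type: 'integer' },
  },
};

const derivedSeed = seedFromSchema(readFileSchema);
console.log(derivedSeed);
```

The resulting object can be passed as the seedArgs argument of fuzzTool, so every declared field gets mutated even if you forgot to list it by hand.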

What to look for in fuzzing output

Not all thrown errors are findings. Classify crash outputs into three categories:
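
One way to automate this triage is to bucket each error message by pattern. The sketch below is an illustration under stated assumptions: the three bucket names and the regular expressions are hypothetical choices based on the checks used elsewhere in this article (validation wording, leaked paths or stack frames, and runtime exceptions), not a fixed taxonomy.

```typescript
type Triage = 'expected-validation' | 'information-leak' | 'unhandled-crash';

// Order matters: a leak is reported even if the message also looks like validation.
function triage(message: string): Triage {
  const leaksInternals =
    /\n\s+at\s.+:\d+:\d+/.test(message) ||      // stack frame such as "at fn (file.js:10:5)"
    /(ENOENT|EACCES).*\.\.\//.test(message);     // filesystem error echoing a traversal path
  if (leaksInternals) return 'information-leak';

  // Same wording the harness treats as an expected rejection.
  if (/\b(Invalid|required)\b/.test(message)) return 'expected-validation';

  // Everything else (TypeError, RangeError, timeouts, ...) needs a human look.
  return 'unhandled-crash';
}

const samples = [
  'Invalid arguments: path must be a string',
  "ENOENT: no such file or directory, open '/srv/data/../../etc/passwd'",
  "Cannot read properties of null (reading 'length')",
];
for (const s of samples) console.log(triage(s), '<-', s);
```

Pattern matching on messages is brittle, so treat the buckets as a first-pass filter and review anything that lands outside the expected-validation bucket by hand.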

Automated fuzzing in CI with fast-check

Property-based testing via fast-check integrates fuzzing into your normal test suite without a separate fuzzing infrastructure:

import fc from 'fast-check';
import { callTool } from './server';

describe('read_file fuzzing', () => {
  it('never crashes on arbitrary path inputs', async () => {
    await fc.assert(
      fc.asyncProperty(fc.string(), async (path) => {
        try {
          await callTool('read_file', { path });
        } catch (err) {
          const msg = String(err);
          // Assertion: errors must be safe validation messages
          expect(msg).not.toMatch(/ENOENT.*\.\.\//); // no path traversal in error
          expect(msg).not.toMatch(/Cannot read prop/); // no null reference crash
          expect(msg).not.toMatch(/at Object\./); // no stack trace leak
        }
      }),
      { numRuns: 1_000 }
    );
  });

  it('integer fields reject non-integers without crashing', async () => {
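    const arbitraryNonInteger = fc.oneof(
      fc.string(),
      // fc.double() can also yield whole numbers such as 0, so filter those out
      fc.double().filter((n) => !Number.isInteger(n)),
      fc.constant(null),
      fc.constant(undefined), fc.constant(Infinity)
    );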
    await fc.assert(
      fc.asyncProperty(arbitraryNonInteger, async (count) => {
        try {
          await callTool('list_items', { count });
        } catch (err) {
          // Must throw a clean validation error — not a process crash
          expect(err).toBeInstanceOf(Error);
        }
      }),
      { numRuns: 500 }
    );
  });
});

SkillAudit detection

Static analysis catches known-bad patterns in source code; fuzzing catches unknown-bad behavior at runtime. See the MCP scanner vs. SAST comparison for why both are needed, or run a SkillAudit scan to get static findings on your server's input validation coverage.