Mutation testing for MCP server security

Line coverage tells you which code was executed. Mutation testing tells you whether your assertions actually catch failures. For security-critical MCP tool handlers — input validation, authentication checks, rate limiting guards — a surviving mutant is evidence of a missing security assertion, not just a test quality gap.

What mutation testing is

A mutation testing tool (Stryker for JavaScript/TypeScript) makes small, targeted changes to your production code — flipping a > to >=, changing a string comparison, removing a conditional — and then runs your test suite. If a test fails when the code is mutated, that mutant is "killed." If all tests pass with the mutant in place, the mutant "survives."

A surviving mutant means: the change to your code went undetected. For security code, this is direct evidence of a gap. If you flip the condition in your authentication check and all tests still pass, you don't have a test for the authentication failure path.
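
The killed/survived distinction can be sketched without any tooling. The example below is a simplified model, not how Stryker works internally: the guard, the hand-written mutant, and the `isKilled` helper are all hypothetical names introduced for illustration.

```typescript
// Hypothetical guard and a hand-written mutant of it, for illustration only.
type Guard = (token: string | undefined) => boolean

const original: Guard = (token) => token === 'valid-token'
// Mutant: the comparison is replaced with `true`, so every caller is accepted.
const mutant: Guard = (_token) => true

// A "test" here is just a function that throws when an expectation fails.
type Test = (guard: Guard) => void

const acceptsValidToken: Test = (guard) => {
  if (guard('valid-token') !== true) throw new Error('valid token rejected')
}
const rejectsMissingToken: Test = (guard) => {
  if (guard(undefined) !== false) throw new Error('missing token accepted')
}

// A mutant is killed when at least one test in the suite fails against it.
function isKilled(suite: Test[], candidate: Guard): boolean {
  return suite.some((test) => {
    try {
      test(candidate)
      return false
    } catch {
      return true
    }
  })
}

const happyPathOnly = [acceptsValidToken]
const withDeniedPath = [acceptsValidToken, rejectsMissingToken]

console.log(isKilled(happyPathOnly, mutant))  // false: the mutant survives
console.log(isKilled(withDeniedPath, mutant)) // true: the denied-path test kills it
```

Both suites pass against the original guard and both execute it, so line coverage cannot tell them apart. Only the suite with the denied-path test notices the mutant.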

Setting up Stryker for an MCP server

npm install --save-dev @stryker-mutator/core @stryker-mutator/jest-runner @stryker-mutator/typescript-checker

// stryker.config.mjs
export default {
  packageManager: 'npm',
  reporters: ['html', 'clear-text'],
  testRunner: 'jest', // or 'vitest', 'mocha'; each needs its own @stryker-mutator/*-runner package
  coverageAnalysis: 'perTest',
  timeoutMS: 10000,
  checkers: ['typescript'],
  tsconfigFile: 'tsconfig.json',

  // focus mutation testing on security-critical files
  mutate: [
    'src/tools/**/*.ts',
    'src/auth/**/*.ts',
    'src/validation/**/*.ts',
    '!src/**/*.test.ts',
    '!src/**/*.spec.ts'
  ],

  // exclude mutations that produce noise without security signal
  mutator: {
    excludedMutations: [
      'StringLiteral',   // string constant changes rarely reveal security gaps
      'ArrowFunction',   // changes to arrow fn bodies are usually noise
    ]
  }
}

npx stryker run

Stryker produces an HTML report at reports/mutation/html/index.html. Open it and filter to "Survived" mutants in your security-critical files.
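
If you would rather list survivors in a script than click through the HTML report, a filter like the one below may help. It is a sketch under assumptions: it expects a report object with a `files` map whose entries carry a `mutants` array with `mutatorName`, `status`, and `location` fields, which is how the JSON mutation report format is understood here. It also assumes you have added a JSON reporter to `reporters`. Verify both against the report your Stryker version actually writes. The function name and sample data are hypothetical.

```typescript
// Minimal subset of the report shape this sketch relies on (assumed, see above).
interface MutantResult {
  mutatorName: string
  status: string
  location: { start: { line: number } }
}
interface MutationReport {
  files: Record<string, { mutants: MutantResult[] }>
}

// Returns "file:line mutatorName" for each surviving mutant under the given prefixes.
function survivedIn(report: MutationReport, prefixes: string[]): string[] {
  const hits: string[] = []
  for (const [file, { mutants }] of Object.entries(report.files)) {
    if (!prefixes.some((p) => file.startsWith(p))) continue
    for (const m of mutants) {
      if (m.status === 'Survived') {
        hits.push(`${file}:${m.location.start.line} ${m.mutatorName}`)
      }
    }
  }
  return hits
}

// Hand-written sample data, not real Stryker output.
const sample: MutationReport = {
  files: {
    'src/auth/session.ts': {
      mutants: [
        { mutatorName: 'ConditionalExpression', status: 'Survived', location: { start: { line: 12 } } },
        { mutatorName: 'EqualityOperator', status: 'Killed', location: { start: { line: 20 } } },
      ],
    },
    'src/util/format.ts': {
      mutants: [
        { mutatorName: 'StringLiteral', status: 'Survived', location: { start: { line: 3 } } },
      ],
    },
  },
}

console.log(survivedIn(sample, ['src/auth/', 'src/tools/']))
// [ 'src/auth/session.ts:12 ConditionalExpression' ]
```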

Reading surviving mutants as security signals

The mutation categories most revealing for security are:

ConditionalExpression mutations: Stryker replaces a condition with true and with false, so if (isAuthenticated) becomes if (true) and if (false). If the always-allow variant survives, no test exercises the denied path. This is a missing authorization test. In the example below the guard is negated, so the always-allow variant is if (false).

// Original code
async function handleTool(args, { sessionId }) {
  if (!isAuthenticated(sessionId)) {
    throw new Error('Unauthorized')
  }
  return runScan(args)
}

// Stryker mutant: ConditionalExpression — changes `!isAuthenticated(sessionId)` to `false`
async function handleTool(args, { sessionId }) {
  if (false) { // mutation: never throws
    throw new Error('Unauthorized')
  }
  return runScan(args)
}

// If your tests don't include a test with an unauthenticated session that
// verifies the error is thrown, this mutant survives — revealing the gap.
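
A denied-path check that kills this mutant could look like the sketch below. It is written without a test framework so that it runs on its own. `isAuthenticated` and `runScan` are stand-ins with assumed behavior, and `rejectionMessage` is a hypothetical helper.

```typescript
// Stand-ins for the helpers used above; names and behavior are assumptions.
const activeSessions = new Set(['session-ok'])
const isAuthenticated = (sessionId: string): boolean => activeSessions.has(sessionId)
const runScan = async (args: { target: string }) => ({ scanned: args.target })

async function handleTool(args: { target: string }, ctx: { sessionId: string }) {
  if (!isAuthenticated(ctx.sessionId)) {
    throw new Error('Unauthorized')
  }
  return runScan(args)
}

// Denied-path check: resolves to the error message, or null if the call succeeded.
async function rejectionMessage(sessionId: string): Promise<string | null> {
  try {
    await handleTool({ target: 'example' }, { sessionId })
    return null
  } catch (err) {
    return err instanceof Error ? err.message : String(err)
  }
}

async function main(): Promise<void> {
  // Under the `if (false)` mutant, the first value would be null instead.
  console.log(await rejectionMessage('unknown-session')) // Unauthorized
  console.log(await rejectionMessage('session-ok'))      // null
}

main()
```

In a Jest suite the same check is usually written as `await expect(handleTool(args, { sessionId: 'unknown-session' })).rejects.toThrow('Unauthorized')`.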

EqualityOperator mutations: flipping >= to > in rate limiting logic. If the off-by-one change survives, your rate limit test doesn't verify the exact boundary.

// Original
if (callCount >= this.maxCalls) { // limit at exactly maxCalls

// Stryker mutant: EqualityOperator — changes >= to >
if (callCount > this.maxCalls) { // allows one extra call before blocking

// Surviving mutant reveals: no test checks that exactly maxCalls calls are
// accepted and the (maxCalls+1)th call is rejected.

LogicalOperator mutations: changing && to || in compound security conditions. If your path-traversal check is resolved.startsWith(base) && !resolved.includes('\0'), the || mutant survives unless some test supplies an input for which exactly one of the two conditions holds. Covering both such inputs shows that each condition can reject a path on its own.
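
The path-traversal expression above can be exercised with two inputs that each make exactly one operand true. This is a sketch: the function names and the base directory are hypothetical, and the mutant is written by hand rather than generated by Stryker.

```typescript
// Hypothetical check built from the expression quoted above.
function isPathAllowed(resolved: string, base: string): boolean {
  return resolved.startsWith(base) && !resolved.includes('\0')
}

// Hand-written LogicalOperator mutant: && replaced with ||.
function isPathAllowedMutant(resolved: string, base: string): boolean {
  return resolved.startsWith(base) || !resolved.includes('\0')
}

const base = '/srv/data/'

// Each case makes exactly one operand true, so && and || disagree on it.
const cases = [
  { name: 'outside base, no null byte', path: '/etc/passwd' },
  { name: 'inside base, null byte', path: '/srv/data/report\0.txt' },
]

for (const { name, path } of cases) {
  console.log(name, isPathAllowed(path, base), isPathAllowedMutant(path, base))
}
// Both cases print `false true`: the original rejects, the mutant accepts.
```

Either case alone is enough to kill the || mutant. Keeping both means a later change that drops one of the two operands would also be caught.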

Writing tests to kill surviving security mutants

For each surviving mutant in security-critical code, write a test that is specifically designed to kill that mutant — a test that passes with the original code and fails with the mutation:

describe('rate limiter boundary', () => {
  test('allows exactly maxCalls calls within window', async () => {
    const limiter = new SlidingWindowLimiter({ maxCalls: 5, windowMs: 60_000 })
    for (let i = 0; i < 5; i++) {
      expect(limiter.check('session-1').allowed).toBe(true)
    }
  })

  test('rejects the (maxCalls + 1)th call', async () => {
    const limiter = new SlidingWindowLimiter({ maxCalls: 5, windowMs: 60_000 })
    for (let i = 0; i < 5; i++) limiter.check('session-1')
    expect(limiter.check('session-1').allowed).toBe(false)
    // this test specifically kills the > vs >= mutant
  })
})
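
The tests above assume a limiter with a `check(key).allowed` interface. The article does not show its implementation, so the class below is only one possible sketch of that shape. The injectable `now` parameter is an addition for testability, not something the original specifies.

```typescript
// Hypothetical limiter matching the shape the tests above assume:
// new SlidingWindowLimiter({ maxCalls, windowMs }) and check(key).allowed.
interface LimiterOptions {
  maxCalls: number
  windowMs: number
}

class SlidingWindowLimiter {
  private readonly calls = new Map<string, number[]>()
  private readonly maxCalls: number
  private readonly windowMs: number
  private readonly now: () => number

  // `now` is injectable so tests can control time; defaults to the wall clock.
  constructor(options: LimiterOptions, now: () => number = () => Date.now()) {
    this.maxCalls = options.maxCalls
    this.windowMs = options.windowMs
    this.now = now
  }

  check(key: string): { allowed: boolean } {
    const cutoff = this.now() - this.windowMs
    // Keep only timestamps still inside the window.
    const recent = (this.calls.get(key) ?? []).filter((t) => t > cutoff)
    this.calls.set(key, recent)
    // The boundary under test: block once maxCalls calls are already recorded.
    if (recent.length >= this.maxCalls) {
      return { allowed: false }
    }
    recent.push(this.now())
    return { allowed: true }
  }
}

const limiter = new SlidingWindowLimiter({ maxCalls: 5, windowMs: 60_000 })
const results = Array.from({ length: 6 }, () => limiter.check('session-1').allowed)
console.log(results) // five true values followed by one false
```

With `>=` mutated to `>`, the sixth call would be allowed. That is exactly the difference the second test above detects.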

Stryker configuration for large codebases

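Mutation testing is slow: in the worst case it runs your entire test suite once per mutant, so a codebase with 200 mutants means up to 200 test runs. With coverageAnalysis: 'perTest', Stryker runs only the tests that cover each mutant, but the cost still grows with the number of mutants. Keep it practical: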

// stryker.config.mjs — performance-optimized for CI
export default {
  // ...
  concurrency: 4,           // parallel test runners
  timeoutMS: 5000,          // kill slow mutants early
  dryRunTimeoutMinutes: 1,  // abort if baseline is slow

  // incremental mode: only re-run mutants for changed files
  incremental: true,
  incrementalFile: '.stryker-incremental.json',

  // focus on security-critical directories only in CI; never mutate test files
  mutate: process.env.CI
    ? ['src/auth/**/*.ts', 'src/tools/**/*.ts', '!src/**/*.test.ts', '!src/**/*.spec.ts']
    : ['src/**/*.ts', '!src/**/*.test.ts', '!src/**/*.spec.ts']
}

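In CI, restrict mutation testing to security-critical directories and run full mutation testing locally before significant releases. Incremental mode reuses results for mutants whose code and covering tests are unchanged, so an initial run that takes 10 minutes can drop to around 2–3 minutes on later runs, depending on how much has changed. In CI this only helps if the incremental file is restored between runs, for example through a build cache.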

What SkillAudit looks for

SkillAudit checks for Stryker configuration in the repository root and for a mutation report in the CI artifacts. Repositories with mutation testing configured score higher on the Documentation Completeness axis (evidence of thorough security test coverage) and the Maintenance axis (active quality improvement practice). A mutation score below 50% on security-critical files is noted in the audit comments as a maintenance finding, not a security finding, but it signals that the security tests may execute the code without asserting on its failure cases.
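
The text does not say how SkillAudit derives the score. The sketch below assumes the definition commonly shown in Stryker's reports: detected mutants (killed plus timed out) divided by valid mutants (detected plus survived plus not covered). The field names and counts are made up for illustration.

```typescript
// Counts per mutant state, as they might be tallied from a mutation report.
// The field names are illustrative, not a report schema.
interface MutantCounts {
  killed: number
  timeout: number
  survived: number
  noCoverage: number
}

// Mutation score as a percentage: detected mutants over all valid mutants.
function mutationScore(c: MutantCounts): number {
  const detected = c.killed + c.timeout
  const undetected = c.survived + c.noCoverage
  const valid = detected + undetected
  return valid === 0 ? NaN : (detected / valid) * 100
}

// Made-up counts for a set of security-critical files.
const authFiles: MutantCounts = { killed: 18, timeout: 2, survived: 15, noCoverage: 5 }
console.log(mutationScore(authFiles)) // 50
```

Under this definition, mutants with no test coverage lower the score just as survivors do, so untested security code pulls the number down even when every existing test is sharp.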

See how thoroughly your security tests are exercised

SkillAudit's audit includes a test quality check for MCP servers with test suites. Free for public repos.
