Developer Guide · 2026-06-11

How to harden an existing MCP server without breaking integrations

You have an MCP server that's already deployed. Clients are using it. You just ran a SkillAudit scan and got a C grade: unsafe shell: true calls, no input validation, no rate limiting, credentials echoed back in error messages. The obvious fix — rewrite the tool handlers with proper guards — would break every integration you have if you do it wrong. This guide walks you through a six-phase additive hardening path that brings your server to B+ or higher without a single surprise break for existing callers.

In this post

  1. Why additive hardening is different from rewriting
  2. Phase 1 — Observe (zero behavior change)
  3. Phase 2 — Warn-only input validation
  4. Phase 3 — Soft rate limiting
  5. Phase 4 — Shadow authentication
  6. Phase 5 — Command injection hardening
  7. Phase 6 — Permission scoping (last, highest risk)
  8. Testing strategy: shadow mode and contract tests
  9. Realistic timeline and prioritization
  10. How SkillAudit scores each phase

Why additive hardening is different from rewriting

When you write a new MCP server from scratch, every security control is designed in from the start. A caller that sends a path traversal attempt hits your validation layer and gets a clean error. A caller that hits rate limits gets a 429 response that it was built to expect from the beginning. No client was ever built against the insecure version.

When you harden an existing server, existing callers were built against the insecure behavior. If they're sending what your new validation layer rejects, they break. If they're relying on the verbose error message that leaked your database structure, they break when you sanitize it. The hardening is correct, but you've introduced a breaking change.

The key insight: security controls don't have to be binary on their first day. Most controls can be introduced in an observe → warn → enforce lifecycle that lets you verify no legitimate caller triggers the control before you flip the switch to enforcement.

The additive principle: Before enforcing any security control, run it in warn-only mode long enough to see whether legitimate callers trigger it. If zero legitimate triggers appear after 7–14 days, enforce. If legitimate callers trigger it, fix their input or add an exemption before enforcing.
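
The observe → warn → enforce lifecycle can be sketched as a small decision helper. This is a minimal illustration, not part of any SDK: the names MODES, resolveMode, and decide are hypothetical, and it assumes each control exposes its rollout mode as a string.

```javascript
// Hypothetical helper: maps a control's rollout mode to what the server does
// when a request trips that control.
const MODES = ['observe', 'warn', 'enforce'];

function resolveMode(raw) {
  // Unknown or missing values fall back to the non-enforcing default.
  return MODES.includes(raw) ? raw : 'observe';
}

function decide(mode, violated) {
  if (!violated) return { log: false, reject: false };
  switch (resolveMode(mode)) {
    case 'enforce':
      return { log: true, reject: true };   // log and block
    case 'warn':
      return { log: true, reject: false };  // log a would-reject event, let it through
    default:
      return { log: false, reject: false }; // observe: baseline logging only
  }
}
```

Falling back to observe on an unrecognized value means a typo in configuration can never switch enforcement on by accident.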

Phase 1 — Observe (zero behavior change)

Phase 1 · Observe: structured logging + alerting · Zero breakage risk

Add structured logging to every tool handler before you change any behavior. You need a baseline of what legitimate callers actually send before you can know which security controls are safe to enforce.

What to log: the tool name, caller identity (if any), the incoming parameter shapes (not values if sensitive), response isError flag, response size, and duration. Write to a structured log (JSON lines) that you can query.

// Phase 1: wrap every handler with an observation middleware
// No behavior change — just structured visibility

function observeHandler(toolName, handler) {
  return async (params) => {
    const start = Date.now();
    let result;
    let threw = false;
    try {
      result = await handler(params);
    } catch (err) {
      threw = true;
      throw err;
    } finally {
      const entry = {
        ts: new Date().toISOString(),
        tool: toolName,
        durationMs: Date.now() - start,
        paramKeys: Object.keys(params || {}),
        // Log param shapes (not values) for sensitive fields
        paramTypes: Object.fromEntries(
          Object.entries(params || {}).map(([k, v]) => [k, typeof v])
        ),
        threw,
        isError: result?.isError ?? false,
        responseSize: JSON.stringify(result ?? '').length,
      };
      process.stderr.write(JSON.stringify(entry) + '\n');
    }
    return result;
  };
}

// Wrap each handler as the call is dispatched
// (assumes `handlers` maps tool names to your existing handler functions)
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name, arguments: params } = request.params;
  const handler = observeHandler(name, handlers[name]);
  return handler(params);
});

Run this for 7 days minimum before Phase 2. You'll discover which tools get called most often, what parameter shapes are common, and whether any callers are already hitting what you'd consider invalid inputs. This baseline is your safety net for every subsequent phase.

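What to look for in Phase 1 logs: Tools called more than 100×/hour (rate limit candidates), params containing / or .. (path traversal candidates), params containing shell metacharacters (command injection candidates), error responses with stack traces or database errors (information leakage candidates). The middleware above records parameter shapes rather than values, so the content-based signals need derived flags on each log entry (for example, a boolean for "contains .."), and the leakage signal needs a similar flag computed from the response text. Neither requires logging raw values.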
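
One way to surface those content signals without writing parameter values to the log is to record a few booleans per string parameter. The sketch below is an assumption about how you might extend the Phase 1 entry; paramFlags and its field names are hypothetical.

```javascript
// Same metacharacter set the guide uses for command injection detection.
const SHELL_META_RE = /[;&|`$<>]/;

// Returns per-parameter flags that describe a value without revealing it.
function paramFlags(params) {
  const flags = {};
  for (const [key, val] of Object.entries(params || {})) {
    if (typeof val !== 'string') continue; // only strings can carry these patterns
    flags[key] = {
      length: val.length,
      hasSlash: val.includes('/'),
      hasDotDot: val.includes('..'),
      hasShellMeta: SHELL_META_RE.test(val),
    };
  }
  return flags;
}
```

Adding paramFlags: paramFlags(params) to the log entry makes the path traversal and injection candidates queryable while keeping secrets and file contents out of the log.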

Phase 2 — Warn-only input validation

Phase 2 · Warn-only input validation · Low breakage risk

Implement your full input validation logic — path traversal guards, SSRF blocklists, command injection detection, string length limits — but don't reject yet. Log a VALIDATION_WOULD_REJECT event instead. Let the request proceed as before.

After 7–14 days of warn-only mode with no legitimate triggers, flip ENFORCE_VALIDATION=true to enable hard rejection. This flip is your first real security control change.

// Phase 2: warn-only validation — no enforcement yet
const ENFORCE_VALIDATION = process.env.ENFORCE_VALIDATION === 'true';

function validateInput(toolName, params) {
  const violations = [];

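  // Path traversal ('..') and NUL-byte checks on every string param
  for (const [key, val] of Object.entries(params || {})) {
    if (typeof val === 'string') {
      if (val.includes('..') || val.includes('\0')) {
        // Record which pattern actually matched
        violations.push({ rule: 'PATH_TRAVERSAL', key, pattern: val.includes('..') ? '..' : '\\0' });
      }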
      // SSRF: URL params pointing to internal addresses
      if (key.toLowerCase().includes('url')) {
        try {
          const u = new URL(val);
          if (['169.254.169.254', '127.0.0.1', 'localhost', '0.0.0.0'].includes(u.hostname)) {
            violations.push({ rule: 'SSRF_INTERNAL', key, hostname: u.hostname });
          }
        } catch { /* not a URL */ }
      }
      // Command injection metacharacters in fields used in shell contexts
      if (/[;&|`$<>]/.test(val) && (key === 'command' || key === 'args' || key === 'query')) {
        violations.push({ rule: 'CMD_INJECTION_CHARS', key });
      }
    }
  }

  if (violations.length > 0) {
    process.stderr.write(JSON.stringify({
      event: 'VALIDATION_WOULD_REJECT',
      tool: toolName,
      violations,
      ts: new Date().toISOString(),
    }) + '\n');

    if (ENFORCE_VALIDATION) {
      // Return MCP error response (not throw — per MCP error handling spec)
      return {
        content: [{ type: 'text', text: 'Invalid input: ' + violations.map(v => v.rule).join(', ') }],
        isError: true,
      };
    }
  }

  return null; // proceed normally
}

// In handler:
async function readFileTool(params) {
  const rejection = validateInput('readFile', params);
  if (rejection) return rejection;

  // ... existing handler code unchanged ...
}

The critical rule: during warn-only mode, validateInput always returns null, which means the existing handler code runs exactly as before. The only thing that changes is a log line. Existing callers cannot tell the difference.

Before flipping ENFORCE_VALIDATION: Query your logs for VALIDATION_WOULD_REJECT events. If any are from callers you don't control, audit whether they're genuine security violations (attacker-like inputs) or legitimate use cases you need to allow. Legitimate use cases need an explicit allow-list entry; attacker-like inputs confirm the control is needed and safe to enforce.
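
The log query described above can be done with a few lines of Node. This is a sketch under the assumption that your log is JSON lines in the shape the Phase 2 code writes; summarizeWouldReject is a hypothetical name.

```javascript
// Counts VALIDATION_WOULD_REJECT events per tool, rule, and parameter key.
function summarizeWouldReject(logText) {
  const counts = new Map();
  for (const line of logText.split('\n')) {
    if (!line.trim()) continue;
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // skip lines that are not JSON (other stderr output)
    }
    if (entry.event !== 'VALIDATION_WOULD_REJECT') continue;
    for (const v of entry.violations || []) {
      const key = `${entry.tool}:${v.rule}:${v.key}`;
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  return Object.fromEntries(counts);
}
```

Feed it the file contents, for example summarizeWouldReject(require('node:fs').readFileSync('access.log', 'utf8')). An empty object at the end of the observation window is the signal that enforcement is safe; any entry is a pattern to audit first.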

Phase 3 — Soft rate limiting

Phase 3 · Soft rate limiting · Low breakage risk

Add rate limit tracking before enforcement. Count requests per caller per minute. Log when a caller exceeds the threshold. Run for 7 days to see whether your proposed limit would have throttled any legitimate caller. Only then set ENFORCE_RATE_LIMITS=true.

Start with conservative (that is, generous) limits: 3× your observed 95th-percentile burst rate. If your Phase 1 logs show the busiest caller's 95th-percentile rate was 40 requests/minute, set your initial limit at 120/minute. You can tighten it later; a limit that fires on legitimate traffic on day 1 is a breaking change.
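
The arithmetic is simple enough to script against the per-minute request counts you extract from Phase 1 logs. The sketch below uses the nearest-rank percentile method; percentile and suggestLimit are hypothetical helper names, and the input is assumed to be an array of requests-per-minute samples for one caller.

```javascript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(values, p) {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

// Initial limit = multiplier × 95th-percentile per-minute rate.
function suggestLimit(perMinuteCounts, multiplier = 3) {
  return Math.ceil(percentile(perMinuteCounts, 95) * multiplier);
}
```

With samples of 10, 20, 30, and 40 requests/minute, the 95th percentile is 40 and the suggested limit is 120, matching the example above.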

// Phase 3: in-process sliding window rate limiter with soft enforcement
const ENFORCE_RATE_LIMITS = process.env.ENFORCE_RATE_LIMITS === 'true';
const RATE_LIMIT_RPM = parseInt(process.env.RATE_LIMIT_RPM || '120', 10);

// Per-caller sliding window using a Map of timestamp arrays
const callerWindows = new Map();

function checkRateLimit(callerId, toolName) {
  const now = Date.now();
  const windowMs = 60_000;
  const key = `${callerId}:${toolName}`;

  if (!callerWindows.has(key)) callerWindows.set(key, []);
  const window = callerWindows.get(key);

  // Evict expired entries
  while (window.length > 0 && window[0] < now - windowMs) window.shift();

  window.push(now);
  const count = window.length;

  if (count > RATE_LIMIT_RPM) {
    process.stderr.write(JSON.stringify({
      event: ENFORCE_RATE_LIMITS ? 'RATE_LIMIT_ENFORCED' : 'RATE_LIMIT_WOULD_THROTTLE',
      tool: toolName,
      caller: callerId,
      countInWindow: count,
      limitRpm: RATE_LIMIT_RPM,
      ts: new Date().toISOString(),
    }) + '\n');

    if (ENFORCE_RATE_LIMITS) {
      return {
        content: [{ type: 'text', text: `Rate limit exceeded: ${RATE_LIMIT_RPM} requests/minute` }],
        isError: true,
      };
    }
  }

  return null;
}

// Clean up old windows every 5 minutes to avoid memory growth
setInterval(() => {
  const cutoff = Date.now() - 70_000;
  for (const [key, ts] of callerWindows) {
    const filtered = ts.filter(t => t > cutoff);
    if (filtered.length === 0) callerWindows.delete(key);
    else callerWindows.set(key, filtered);
  }
}, 5 * 60_000).unref(); // unref so the cleanup timer alone never keeps the process alive

Caller identity in MCP: Without auth, "caller identity" is often just the transport session or origin. With stdio transport, every call comes from the same process — rate limiting is less critical but still useful to catch runaway tool loops from a misbehaving LLM. For HTTP/SSE transports, use IP + session token composite keys.
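
A composite key along those lines might look like the following. This is a sketch: callerKey and the shape of its argument are assumptions, and how you obtain the IP and session token depends on your HTTP framework.

```javascript
const crypto = require('node:crypto');

// Builds the rate-limit key for a request. Field names are hypothetical.
function callerKey({ transport, ip, sessionToken }) {
  // stdio has exactly one caller: the parent process.
  if (transport === 'stdio') return 'stdio:local';
  // Hash the token so the raw credential never ends up in logs or map keys.
  const tokenPart = sessionToken
    ? crypto.createHash('sha256').update(sessionToken).digest('hex').slice(0, 16)
    : 'anon';
  return `${ip || 'unknown'}:${tokenPart}`;
}
```

The result can be passed as callerId to checkRateLimit. Keying on IP alone would lump every client behind one NAT gateway into a single bucket, which is why the token is part of the key.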

Phase 4 — Shadow authentication

Phase 4 · Shadow authentication · Medium breakage risk (requires client coordination)

Adding authentication to a previously unauthenticated server is the change most likely to break existing clients, because clients must actively add credentials to their requests. Shadow auth lets you add the verification side first, in warn-only mode, before any client is required to send credentials.

Shadow auth works in three sub-phases: (1) implement token verification logic but log failures instead of rejecting, (2) ask for tokens but keep accepting unauthenticated requests during a grace period, marking those responses with a deprecation warning, (3) require tokens with no fallback.

// Phase 4a: shadow auth — verify but don't enforce
const AUTH_MODE = process.env.AUTH_MODE || 'shadow'; // 'shadow' | 'soft' | 'enforce'

function checkAuth(request) {
  const authHeader = request.headers?.authorization;

  if (!authHeader) {
    if (AUTH_MODE === 'enforce') {
      return { allowed: false, reason: 'MISSING_TOKEN' };
    }
    // shadow or soft: allow through, log the miss
    process.stderr.write(JSON.stringify({
      event: 'AUTH_WOULD_REJECT',
      reason: 'MISSING_TOKEN',
      ts: new Date().toISOString(),
    }) + '\n');
    return { allowed: true, unauthenticated: true };
  }

  const token = authHeader.replace(/^Bearer /, '');
  const valid = verifyToken(token); // your token verification logic

  if (!valid) {
    if (AUTH_MODE === 'enforce') {
      return { allowed: false, reason: 'INVALID_TOKEN' };
    }
    process.stderr.write(JSON.stringify({
      event: 'AUTH_WOULD_REJECT',
      reason: 'INVALID_TOKEN',
      ts: new Date().toISOString(),
    }) + '\n');
    return { allowed: true, unauthenticated: true };
  }

  return { allowed: true, unauthenticated: false };
}

// Phase 4b (soft mode): add deprecation header to unauthenticated responses
// so clients can see they need to update, without being broken yet
function addDeprecationHint(response, authResult) {
  if (authResult.unauthenticated && AUTH_MODE === 'soft') {
    // MCP doesn't have standard response headers, but HTTP transport can add them
    // Or add a meta field in the response content for stdio
    response._deprecationWarning = 'Authentication will be required in 30 days';
  }
  return response;
}
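
The verifyToken call above is left to you. One possible shape, assuming static pre-shared tokens rather than JWTs or OAuth, is a constant-time comparison against a known list. makeVerifier is a hypothetical name, and where the token list comes from (an environment variable, a secrets manager) is up to your deployment.

```javascript
const crypto = require('node:crypto');

// Builds a verifyToken function for a fixed list of valid tokens.
function makeVerifier(validTokens) {
  // Store digests so comparisons always run over equal-length buffers.
  const digests = validTokens.map((t) => crypto.createHash('sha256').update(t).digest());
  return function verifyToken(token) {
    if (typeof token !== 'string' || token.length === 0) return false;
    const candidate = crypto.createHash('sha256').update(token).digest();
    let ok = false;
    for (const d of digests) {
      // timingSafeEqual requires equal lengths; both sides are 32-byte digests.
      if (crypto.timingSafeEqual(candidate, d)) ok = true; // no early exit
    }
    return ok;
  };
}
```

Usage: const verifyToken = makeVerifier(['token-for-client-a', 'token-for-client-b']); after which checkAuth works as written.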

The coordination step: before moving from soft to enforce, contact every known client author (or update your own client code) and confirm they are sending valid tokens. Only then flip to enforce. This is the one phase where a pure code change is not enough — you need communication alongside it.

For community-distributed MCP servers: If you don't control the callers, ship the move from soft to enforce in a major version bump (under semantic versioning, a change that breaks existing callers is a major change) and document the breaking change in CHANGELOG.md. Give callers at least 30 days on soft mode before the enforcement release.

Phase 5 — Command injection hardening

Phase 5 · Command injection hardening · Medium breakage risk (depends on how params are used)

If your server uses shell: true to build command strings, the fix is to switch to execFile (or spawn with an array of arguments). This is a behavioral change for callers who are relying on shell expansion features (~, glob patterns, pipes), but most legitimate callers aren't doing that.

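The safe migration: route all command execution through a new execFile(binary, [args]) path, and put a shim in front of it that detects arguments containing shell features. The version below logs each detection loudly and rejects the call rather than falling back to exec(command), because a fallback would run exactly the input this control exists to stop. If your earlier warn-only logs show legitimate callers relying on shell features, migrate those callers before this phase ships.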

// Phase 5: safe command execution with shell-feature detection
const { execFile } = require('child_process'); // exec (the shell path) is deliberately not imported
const { promisify } = require('util');
const execFileP = promisify(execFile);

// Detect shell features that would break if we remove shell: true
const SHELL_FEATURE_RE = /[|&;`$(){}[\]*?~<>!]/;

async function runCommand(binary, args, opts = {}) {
  // args must be an array; never a pre-built shell string
  if (!Array.isArray(args)) {
    throw new Error('runCommand: args must be an array');
  }

  const hasShellFeatures = args.some(a => SHELL_FEATURE_RE.test(a));

  if (hasShellFeatures) {
    // Log the shell feature use — do NOT execute
    process.stderr.write(JSON.stringify({
      event: 'CMD_SHELL_FEATURE_DETECTED',
      binary,
      // Don't log arg values — they may contain secrets or paths
      argCount: args.length,
      ts: new Date().toISOString(),
    }) + '\n');

    // Return an error to the caller, with a non-revealing message
    return {
      content: [{ type: 'text', text: 'Command contains characters not permitted by security policy.' }],
      isError: true,
    };
  }

  // Safe path: no shell, arguments passed as array
  try {
    const { stdout, stderr } = await execFileP(binary, args, {
      timeout: opts.timeout ?? 30_000,
      maxBuffer: opts.maxBuffer ?? 1_048_576,
      env: { PATH: '/usr/local/bin:/usr/bin:/bin' }, // minimal env
    });
    return { content: [{ type: 'text', text: stdout }] };
  } catch (err) {
    // Don't return err.message — may contain path info or stderr
    return {
      content: [{ type: 'text', text: `Command failed with exit code ${err.code ?? 'unknown'}` }],
      isError: true,
    };
  }
}

Most real-world MCP server tools that use exec are wrapping a single binary with user-supplied parameters (a git command, a linter, a converter). These are safe to migrate to execFile immediately. The cases that genuinely need a shell (piped commands) should be replaced with a native implementation that doesn't invoke a shell at all.
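
As an illustration of replacing a pipeline with native code, consider a tool that used to run something like cat file | grep TODO | wc -l. The sketch below does the same count in-process; countMatchingLines is a hypothetical name, and it matches a literal substring rather than a regular expression, which is a deliberate narrowing of what grep accepts.

```javascript
const fs = require('node:fs');

// Counts lines in a file that contain a literal substring. No shell involved.
function countMatchingLines(filePath, needle) {
  const text = fs.readFileSync(filePath, 'utf8');
  return text.split('\n').filter((line) => line.includes(needle)).length;
}
```

Because nothing is ever handed to a shell, metacharacters in needle are just characters to search for. The path still needs the Phase 2 traversal checks.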

Error message hardening is low-risk enough to do in one step. Replacing return { isError: true, content: err.message } with a generic message doesn't change the structure of the response, only the text. Callers that pattern-match on error text (they shouldn't, but some do) will break; callers that just check isError: true won't. For most servers this can be enforced immediately without a warn-only phase.
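
A generic message is easier to live with if it carries a reference you can look up in server-side logs. The following is a minimal sketch; toSafeError and the TOOL_ERROR event name are hypothetical, and it assumes your stderr log is access-controlled, since the full message is still written there.

```javascript
const crypto = require('node:crypto');

// Logs the real error server-side and returns a generic MCP error result.
function toSafeError(err, log = (entry) => process.stderr.write(JSON.stringify(entry) + '\n')) {
  const errorId = crypto.randomUUID();
  log({
    event: 'TOOL_ERROR',
    errorId,
    message: String(err && err.message ? err.message : err),
    ts: new Date().toISOString(),
  });
  return {
    // The caller sees only the reference, never the underlying message.
    content: [{ type: 'text', text: `Internal error (reference ${errorId})` }],
    isError: true,
  };
}
```

In a handler, catch (err) { return toSafeError(err); } keeps the isError: true contract while removing the leak.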

Phase 6 — Permission scoping (last, highest risk)

Phase 6 · Permission scoping · High breakage risk (must verify complete call graph first)

SkillAudit's permissions hygiene axis grades you on whether your server declares only the permissions it actually needs. If your server declares filesystem: read-write but only reads, restricting to filesystem: read is the right move — but any hidden write path will break. Permission scope changes affect what the client can rely on the server to do, and they're the hardest to add additively.

Do this last: use Phase 1 observation logs to build a complete inventory of what every tool actually does. Only remove permissions that no log entry ever exercises. If any log entry shows a write operation on a tool you thought was read-only, that write path must be removed (not just under-declared) before you can safely narrow the scope.

// Phase 6: build a permissions-in-use inventory from Phase 1 logs
// Run this as a one-off analysis script against your structured log:

// Log query (using jq on your JSON-lines log):
// jq -s '[
//   .[] | select(.tool != null and .paramKeys != null) |
//   {tool: .tool, paramKeys: .paramKeys, isError: .isError}
// ] | group_by(.tool) | map({tool: .[0].tool, samples: length, keys: ([.[].paramKeys[]] | unique)})' access.log

// Permissions mapping built from log analysis:
const PERMISSIONS_INVENTORY = {
  readFile:    { observed: ['fs:read'],          declared: ['fs:read', 'fs:write'] },
  searchFiles: { observed: ['fs:read'],          declared: ['fs:read', 'fs:write', 'net:fetch'] },
  runLinter:   { observed: ['exec:node', 'fs:read'], declared: ['exec:*'] },
};

// For each tool where declared > observed, investigate the gap before narrowing:
for (const [tool, perms] of Object.entries(PERMISSIONS_INVENTORY)) {
  const excess = perms.declared.filter(p => !perms.observed.includes(p));
  if (excess.length > 0) {
    console.log(`${tool}: excess permissions ${excess.join(', ')} — audit source before removing`);
  }
}
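
The observed lists in the inventory above have to come from somewhere, and the Phase 1 entry as written records parameter shapes, not operations. One way to collect them is to record a permission string next to each privileged call. This is a sketch with hypothetical names (recordPermission, observedInventory), reusing the fs:read style of labels from the inventory.

```javascript
// tool name -> Set of permission strings actually exercised
const observedPermissions = new Map();

// Call this next to each privileged operation inside a tool handler.
function recordPermission(tool, permission) {
  if (!observedPermissions.has(tool)) observedPermissions.set(tool, new Set());
  observedPermissions.get(tool).add(permission);
}

// Snapshot in the same shape as the `observed` arrays in the inventory.
function observedInventory() {
  return Object.fromEntries(
    [...observedPermissions].map(([tool, perms]) => [tool, [...perms].sort()])
  );
}
```

For example, calling recordPermission('readFile', 'fs:read') just before the file read, and dumping observedInventory() to the log periodically, gives you the evidence the narrowing decision depends on.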

Testing strategy: shadow mode and contract tests

The safest pattern for any phase transition is to run the new behavior in shadow alongside the old behavior and compare outputs. For input validation and rate limiting this is straightforward: add the new code path, log what it would have done, compare against the old path.
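
For read-only tools, the shadow comparison can be a thin wrapper that runs both implementations and returns only the old result. A sketch, assuming both handlers are free of side effects (running a write twice would not be safe); shadowCompare and its event names are hypothetical.

```javascript
// Runs old and new implementations; callers only ever see the old result.
async function shadowCompare(
  toolName, oldImpl, newImpl, params,
  log = (entry) => process.stderr.write(JSON.stringify(entry) + '\n')
) {
  const oldResult = await oldImpl(params); // errors here propagate exactly as before
  try {
    const newResult = await newImpl(params);
    if (JSON.stringify(oldResult) !== JSON.stringify(newResult)) {
      log({ event: 'SHADOW_MISMATCH', tool: toolName, ts: new Date().toISOString() });
    }
  } catch (err) {
    // A failure in the new path must never affect the caller.
    log({ event: 'SHADOW_ERROR', tool: toolName, ts: new Date().toISOString() });
  }
  return oldResult;
}
```

Zero SHADOW_MISMATCH and SHADOW_ERROR events over the observation window plays the same role as zero would-reject events in the earlier phases.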

For behavioral changes (command execution, error messages), write contract tests against the live server before you make the change, capturing exact response shapes. After the change, run the same tests. Any test that breaks is a callsite that needs fixing or an exemption before the change goes out.

// Contract test: capture current behavior as a snapshot before hardening
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';
import assert from 'assert';

const client = new Client({ name: 'contract-test', version: '1.0.0' });
const transport = new StdioClientTransport({ command: 'node', args: ['./server.js'] });
await client.connect(transport);

// Test: normal path; confirm it still works after Phase 5
// callTool takes a single { name, arguments } object
const result = await client.callTool({ name: 'readFile', arguments: { path: 'test-fixture.txt' } });
assert.ok(!result.isError); // isError is usually absent on success, so don't compare against false
assert.ok(result.content[0].text.includes('expected content'));

// Test: security violation; confirm it's blocked in enforce mode
const attack = await client.callTool({ name: 'readFile', arguments: { path: '../../etc/passwd' } });
assert.equal(attack.isError, true);
// Don't assert on the exact error message text — let it be generic

await client.close();
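
To capture "exact response shapes" without pinning exact text, a contract test can compare a structural fingerprint of each response before and after the change. A minimal sketch; shapeOf is a hypothetical helper, not part of the MCP SDK.

```javascript
// Reduces a value to its structure: keys and value types, no concrete values.
function shapeOf(value) {
  if (Array.isArray(value)) return value.map(shapeOf);
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.keys(value).sort().map((k) => [k, shapeOf(value[k])])
    );
  }
  return value === null ? 'null' : typeof value;
}
```

Save JSON.stringify(shapeOf(result)) for each tool before hardening and assert equality afterward. A generic error message passes this check, while a dropped field or a changed content type fails it.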

Realistic timeline and prioritization

| Phase | Duration before enforce | SkillAudit axes improved | Breaking change risk |
|---|---|---|---|
| 1. Observe | 7 days minimum | None (baseline only) | Zero |
| 2. Input validation | 7–14 days warn, then enforce | Security (SSRF, path traversal, cmd injection) | Low if warn phase is thorough |
| 3. Rate limiting | 7 days soft, then enforce | Security (DoS resistance) | Low with conservative initial limits |
| 4. Auth | 30 days soft, then enforce | Security (credential protection) | Medium: requires client updates |
| 5. Command hardening | Immediate for new code; detect+log for legacy | Security (command injection) | Medium: breaks shell-feature callers |
| 6. Permission scoping | After full Phase 1 log analysis | Permissions hygiene | High if any hidden paths not yet logged |

A typical existing MCP server can go from a C to a B+ in 30 days following this timeline, with no surprise breaks to existing callers. The full transition to A typically takes 60–90 days — not because the code changes are complex, but because the warn-only observation windows take time and the auth coordination requires communicating with callers.

How SkillAudit scores each phase

SkillAudit doesn't grade on intention; it grades on what the scanner can observe in your published code. Each phase has a corresponding set of checks that the scanner runs.

Run SkillAudit after each phase: The scanner gives you a per-axis breakdown, so you can see exactly which checks you've cleared and what remains. A Phase 2-only deploy will show Security moving and Permissions Hygiene staying red — that's expected. The breakdown lets you communicate progress to stakeholders ("we've cleared 4 of 7 Security checks") before the full hardening is complete.

Summary

Hardening an existing MCP server is a sequencing problem as much as a coding problem. The code changes for most controls are not complex — the complexity is in running them in observation mode long enough to verify no legitimate caller trips them, then flipping the enforcement flag with confidence. The six-phase path here gives you a repeatable process: observe first, validate in warn-only, rate limit in soft mode, add shadow auth, harden commands, narrow permissions last. Each phase has a concrete enforcement gate: flip the flag only when your warn-only logs show zero legitimate violations over the observation window.

If you want a faster path, run a SkillAudit scan first — the per-axis report tells you exactly which checks you're failing, so you can skip phases that are already clean and focus time on the actual gaps.

