
Response Output Validation in MCP Servers

MCP tool responses flow directly into the LLM's context window, where they can be read, summarized, and — if the LLM is acting as an agent — included in subsequent outbound API calls or returned to the user. An unvalidated tool response is a data exfiltration path: error messages containing stack traces with file system paths, API responses that echo back authentication headers, and database query results that include fields the LLM had no business receiving. Response sanitization is the application-layer control that closes this path before data leaves the server.

Credential echo — when API responses include auth material

The most common class of credential leak in MCP tool responses is credential echo: a tool calls an external API, the API response body happens to include the API key used to authenticate (as an account metadata field, a debug header reflected in the body, or an OAuth token in the access_token field of a JSON response), and the MCP server returns the full response body to the LLM without scrubbing.

The LLM now has the API key in its context. If the conversation is logged (to a database, to an observability platform, to a user-visible chat history), the credential is in the log. If the LLM is instructed by a later prompt injection to "summarize what you know about authentication", it may include the credential in its summary.

The mitigation is a response sanitizer that identifies and redacts credential-shaped patterns before the tool response is returned:

// Patterns that match common credential formats
// Each pattern includes a label for the redaction placeholder
const CREDENTIAL_PATTERNS: Array<{ pattern: RegExp; label: string }> = [
  // AWS access key IDs (AKIA, ASIA) and IAM principal IDs (AROA, AIDA)
  { pattern: /\b(AKIA|ASIA|AROA|AIDA)[A-Z0-9]{16}\b/g, label: 'AWS_KEY_ID' },
  // AWS secret access keys (40 base64 chars after known context words)
  { pattern: /(?<=[Ss]ecret[_\s]?[Aa]ccess[_\s]?[Kk]ey["']?\s*[:=]\s*["']?)[A-Za-z0-9+/]{40}/g, label: 'AWS_SECRET' },
  // GitHub tokens (classic PAT, app installation token, fine-grained PAT)
  { pattern: /\bghp_[A-Za-z0-9]{36}\b/g, label: 'GITHUB_TOKEN' },
  { pattern: /\bghs_[A-Za-z0-9]{36}\b/g, label: 'GITHUB_APP_TOKEN' },
  { pattern: /\bgithub_pat_[A-Za-z0-9_]{82}\b/g, label: 'GITHUB_PAT' },
  // Slack tokens
  { pattern: /\bxox[bprs]-[A-Za-z0-9-]{24,}\b/g, label: 'SLACK_TOKEN' },
  // Bearer tokens in JSON field values
  { pattern: /(?<="(?:access_token|bearer_token|api_key|apikey|secret_key)":\s*")[A-Za-z0-9\-._~+/]{20,}/g, label: 'API_TOKEN' },
  // Generic high-entropy strings in key-like positions (conservative — false positive risk)
  // Only flag if next to a recognizable key field name
  { pattern: /(?<=[Aa][Pp][Ii][-_]?[Kk][Ee][Yy]["']?\s*[:=]\s*["'])[A-Za-z0-9+/]{32,}/g, label: 'API_KEY' },
];

// Stack trace and path patterns that reveal internal file structure.
// Order matters: stack frames are matched first, because the path patterns
// would otherwise consume part of a frame and stop it from matching.
const PATH_PATTERNS: Array<{ pattern: RegExp; label: string }> = [
  // Node.js stack frames, e.g. "at Object.fn (/app/src/index.js:10:5)"
  { pattern: /at (?:Object\.|Function\.)?[\w.]+\s+\([^)]+:\d+:\d+\)/g, label: '[STACK_FRAME]' },
  // Absolute Unix paths (a closing parenthesis ends the match)
  { pattern: /\/(?:home|root|var|usr|opt|app|srv)\/[^\s"')]+/g, label: '[PATH]' },
  // Windows paths
  { pattern: /[A-Z]:\\(?:[^\\/:*?"<>|\r\n]+\\)*[^\\/:*?"<>|\r\n]*/g, label: '[WIN_PATH]' },
];

function sanitizeToolResponse(text: string): string {
  let sanitized = text;

  for (const { pattern, label } of CREDENTIAL_PATTERNS) {
    sanitized = sanitized.replace(pattern, `[REDACTED:${label}]`);
  }

  for (const { pattern, label } of PATH_PATTERNS) {
    sanitized = sanitized.replace(pattern, label);
  }

  return sanitized;
}

// Apply in a wrapper around all tool handlers
function wrapToolResponse(
  content: Array<{ type: string; text: string }>
): Array<{ type: string; text: string }> {
  return content.map((item) =>
    item.type === 'text'
      ? { ...item, text: sanitizeToolResponse(item.text) }
      : item
  );
}

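// `dispatch` stands for the server's own routing from tool name to handler
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const rawResult = await dispatch(request);
  return {
    content: wrapToolResponse(rawResult.content),
    // Keep the error flag so a failed call is not reported as a success
    isError: rawResult.isError,
  };
});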

The regex patterns above are intentionally conservative — they target high-confidence credential formats with low false-positive risk. A more aggressive approach using entropy analysis (Shannon entropy of token-like strings) catches unknown credential formats but requires tuning to avoid redacting legitimate hex IDs, UUIDs, and base64-encoded content.
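
The entropy-based approach mentioned above could be sketched roughly as follows. This is a minimal illustration, not a tuned detector: the names shannonEntropy and redactHighEntropy, the 24-character minimum token length, and the 4.0 bits-per-character threshold are all assumptions made for the example, and the threshold in particular would need tuning against real responses.

```typescript
// Shannon entropy of a string, in bits per character.
function shannonEntropy(s: string): number {
  if (s.length === 0) return 0;
  const counts: Record<string, number> = {};
  for (let i = 0; i < s.length; i++) {
    const ch = s.charAt(i);
    counts[ch] = (counts[ch] || 0) + 1;
  }
  let bits = 0;
  for (const ch of Object.keys(counts)) {
    const p = counts[ch] / s.length;
    bits -= p * (Math.log(p) / Math.LN2);
  }
  return bits;
}

// Assumed tuning values, chosen only for illustration.
const TOKEN_RE = /[A-Za-z0-9+/_-]{24,}/g; // long token-like runs
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
const HEX_RE = /^[0-9a-f]+$/i;
const ENTROPY_THRESHOLD = 4.0;

function redactHighEntropy(text: string, threshold: number = ENTROPY_THRESHOLD): string {
  return text.replace(TOKEN_RE, (token) => {
    // Skip formats that are usually identifiers rather than secrets
    if (UUID_RE.test(token) || HEX_RE.test(token)) return token;
    return shannonEntropy(token) >= threshold ? '[REDACTED:HIGH_ENTROPY]' : token;
  });
}
```

Skipping UUIDs and pure hex strings trades some coverage for fewer false positives: a hex-encoded secret passes through this sketch untouched, so it would complement the format-specific patterns rather than replace them.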

PII scrubbing — keeping user data out of LLM context

Beyond credentials, tool responses from CRM systems, user databases, and email APIs frequently return PII fields that the tool's intended use case does not require. A "get customer account status" tool may return a JSON object with name, email, phone, billing address, and payment method last four — when the caller only needed the account status string.

The server-side defense is response schema projection: define the exact fields the tool response should contain and strip everything else before returning. This is distinct from input validation — it is output allowlisting:

import { z } from 'zod';

// Define the response schema — only the fields the LLM should see
const AccountStatusResponseSchema = z.object({
  accountId: z.string(),
  status: z.enum(['active', 'suspended', 'closed']),
  plan: z.string(),
  // Note: no email, no name, no payment info
}).strip();  // .strip() drops any extra fields not in the schema

async function getAccountStatus(accountId: string): Promise<string> {
  // Raw API response may contain many PII fields
  const raw = await accountsApi.getAccount(accountId);

  // Parse through the schema — strips PII fields silently
  const parsed = AccountStatusResponseSchema.safeParse(raw);
  if (!parsed.success) {
    // Return a structured error that does not include raw API response
    return JSON.stringify({ error: 'Unable to retrieve account status' });
  }

  return JSON.stringify(parsed.data);
}

The Zod .strip() mode (the default) drops unknown keys. An explicit .strict() schema rejects unknown keys instead: parse() throws and safeParse() returns success: false. That is useful during development to catch when an API adds new fields, but too loud in production. Use .strip() in production tool response parsing.
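
One possible middle ground between silent stripping and strict rejection is to drop unknown fields but record their names, so that a newly added upstream field shows up in logs without breaking the tool. The sketch below is dependency-free and does not use Zod. The projectFields helper and the sample payload are hypothetical, and unlike a schema it performs no type validation on the fields it keeps.

```typescript
type Projection = {
  data: Record<string, unknown>; // allowlisted fields only
  droppedKeys: string[];         // names (never values) of removed fields
};

// Keep only allowlisted top-level keys and report which keys were removed.
function projectFields(
  raw: Record<string, unknown>,
  allowed: readonly string[],
): Projection {
  const data: Record<string, unknown> = {};
  const droppedKeys: string[] = [];
  for (const key of Object.keys(raw)) {
    if (allowed.indexOf(key) !== -1) {
      data[key] = raw[key];
    } else {
      droppedKeys.push(key);
    }
  }
  return { data, droppedKeys };
}

// Hypothetical upstream payload with PII the tool does not need
const rawAccount = {
  accountId: 'acct_123',
  status: 'active',
  plan: 'pro',
  email: 'user@example.com',
  phone: '+1-555-0100',
};

const projection = projectFields(rawAccount, ['accountId', 'status', 'plan']);

// Log field names only, so the log itself does not become a PII store
if (projection.droppedKeys.length > 0) {
  console.warn('Dropped unexpected fields:', projection.droppedKeys.join(', '));
}
```

Logging only the key names keeps the signal a strict schema would give during development while the returned data stays limited to the allowlisted fields.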

Error handling that does not leak internal state

Unhandled promise rejections in tool handlers often bubble up as error responses that include the full exception message and stack trace. A Node.js error message like "ENOENT: no such file or directory, open '/var/secrets/db-password'" tells an attacker exactly where your secrets file is expected to be. A database error like 'column "admin_override" of relation "users" does not exist' reveals your database schema.

Wrap all tool handlers in a try/catch that logs the full error internally (to your observability platform) and returns a sanitized message to the MCP protocol layer:

import { McpError, ErrorCode } from '@modelcontextprotocol/sdk/types.js';
import { randomUUID } from 'crypto';

async function safeToolHandler<T>(
  name: string,
  fn: () => Promise<T>
): Promise<T> {
  try {
    return await fn();
  } catch (err) {
    // Generate a correlation ID for internal log lookup
    const errorId = randomUUID().slice(0, 8).toUpperCase();

    // Log full error internally — include stack trace for debugging
    console.error(`[${errorId}] Tool '${name}' error:`, err);

    // Check if this is already a well-typed MCP error (e.g. from validation)
    if (err instanceof McpError) throw err;

    // Surface a minimal, non-leaking error to the LLM
    // The correlation ID lets engineers look up the full error in logs
    throw new McpError(
      ErrorCode.InternalError,
      `Tool '${name}' encountered an error (ref: ${errorId}). No additional details available.`
    );
  }
}
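
MCP tool results also carry an optional isError flag, so a variant of the same idea is to return the sanitized message as a tool result rather than throwing a protocol-level error. The sketch below illustrates that variant under stated assumptions: the ToolResult type is a simplified local stand-in for the SDK's result type, and toolResultOrError is a hypothetical helper name.

```typescript
import { randomUUID } from 'node:crypto';

// Simplified stand-in for an MCP tool result (assumption for this sketch)
type ToolResult = {
  content: Array<{ type: 'text'; text: string }>;
  isError?: boolean;
};

async function toolResultOrError(
  name: string,
  fn: () => Promise<ToolResult>,
): Promise<ToolResult> {
  try {
    return await fn();
  } catch (err) {
    // Correlation ID ties the sanitized message to the full internal log entry
    const errorId = randomUUID().slice(0, 8).toUpperCase();

    // Full error, including stack trace, stays in server-side logs
    console.error(`[${errorId}] Tool '${name}' error:`, err);

    // The LLM sees only the tool name and the reference ID
    return {
      isError: true,
      content: [{ type: 'text', text: `Tool '${name}' failed (ref: ${errorId}).` }],
    };
  }
}
```

Which form fits better depends on the client: a result flagged with isError stays visible to the model, which can then retry or report the failure, while a thrown error is handled at the protocol layer.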

What SkillAudit checks

The credential exposure and privacy axes check for response sanitization gaps.
