MCP server error handling security — sanitized error responses, stack trace leakage, structured error codes
In a conventional web application, an error message is displayed in a browser. In an MCP server, that same message is returned to an AI model that may relay it verbatim to the user, retain it in its context, or, if a prompt injection attack is in progress, use it as reconnaissance. Stack traces expose file paths and function names. PostgreSQL errors expose table and column names. Distinguishing "user not found" from "wrong password" is enough to enumerate valid email addresses at scale. This page covers the main error-handling security patterns an MCP server needs, with Node.js (TypeScript) code showing each unsafe pattern and its safe replacement.
Stack trace leakage in tool responses
The fastest way to expose server internals is a catch block that returns err.stack, or even err.message, directly. Each frame of a Node.js stack trace includes the function name and the absolute file path with line and column, for example /app/src/handlers/user.ts:47:12. A prompt injection payload that induces specific code paths to throw can map the server's directory structure in a handful of calls.
// DANGEROUS: returning err.stack directly to the LLM
// Exposes: /app/src/handlers/search.ts:23:14, function names, node_modules paths
server.tool('search', searchSchema, async ({ query }) => {
  try {
    const results = await searchIndex.query(query);
    return { content: [{ type: 'text', text: JSON.stringify(results) }] };
  } catch (err) {
    // NEVER return err.stack or err.message from an internal operation
    return {
      content: [{ type: 'text', text: `Search failed: ${err.stack}` }]
    };
  }
});
// SAFE: sanitized error response. Only a correlation ID is returned; full detail goes to the internal log.
import crypto from 'node:crypto';

function internalError(toolName: string, err: unknown): string {
  const id = crypto.randomUUID();
  // Full stack trace goes to the structured internal log only, never to the LLM
  console.error(JSON.stringify({
    event: 'tool_error',
    tool: toolName,
    errorName: err instanceof Error ? err.name : 'Unknown',
    errorMessage: err instanceof Error ? err.message : String(err),
    stack: err instanceof Error ? err.stack : undefined,
    correlationId: id,
    ts: new Date().toISOString(),
  }));
  return id;
}

server.tool('search', searchSchema, async ({ query }) => {
  try {
    const results = await searchIndex.query(query);
    return { content: [{ type: 'text', text: JSON.stringify(results) }] };
  } catch (err) {
    const correlationId = internalError('search', err);
    return {
      isError: true,
      content: [{ type: 'text', text: `Search temporarily unavailable (ref: ${correlationId})` }]
    };
  }
});
The correlation ID is the only thing that crosses the trust boundary to the LLM. An operator can look up the full stack trace internally without exposing any structural information to the conversation context. When a user reports an error, the correlation ID is sufficient for incident triage.
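One way to keep this pattern from depending on every handler remembering its own try/catch is to wrap handlers once. The following is a minimal, framework-free sketch of that idea; `withSafeErrors`, `ToolResult`, and the injectable `log` parameter are hypothetical names introduced for illustration, not part of any MCP SDK.

```typescript
import { randomUUID } from 'node:crypto';

// Hypothetical minimal shape of a tool result, mirroring the examples above.
interface ToolResult {
  isError?: boolean;
  content: { type: 'text'; text: string }[];
}

type Handler<A> = (args: A) => Promise<ToolResult>;

// Wraps a handler so that any thrown error is logged internally and replaced
// by a generic message that carries only a correlation ID.
function withSafeErrors<A>(
  toolName: string,
  publicMessage: string,
  handler: Handler<A>,
  log: (line: string) => void = console.error,
): Handler<A> {
  return async (args: A): Promise<ToolResult> => {
    try {
      return await handler(args);
    } catch (err) {
      const correlationId = randomUUID();
      // Full detail stays on the server side of the trust boundary.
      log(JSON.stringify({
        event: 'tool_error',
        tool: toolName,
        errorName: err instanceof Error ? err.name : 'Unknown',
        errorMessage: err instanceof Error ? err.message : String(err),
        stack: err instanceof Error ? err.stack : undefined,
        correlationId,
        ts: new Date().toISOString(),
      }));
      return {
        isError: true,
        content: [{ type: 'text', text: `${publicMessage} (ref: ${correlationId})` }],
      };
    }
  };
}
```

A handler could then be registered as `withSafeErrors('search', 'Search temporarily unavailable', async ({ query }) => { ... })`, assuming the framework accepts a function of this shape. The trade-off is that the wrapper only covers thrown errors; business-level failures still need explicit safe messages.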
Database error information disclosure
PostgreSQL driver errors are notoriously information-rich. A unique constraint violation on the users.email column produces: duplicate key value violates unique constraint "users_email_key". That single error reveals the table name, column name, and constraint naming convention. Returning it to the LLM is equivalent to providing a partial schema dump on every failed insert.
// DANGEROUS: passing pg error message directly to tool response
// Leaks: table name ("users"), column name ("email"), constraint name ("users_email_key")
import { Pool } from 'pg';
import bcrypt from 'bcrypt'; // assumes the bcrypt npm package

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

server.tool('registerUser', registerSchema, async ({ email, password }) => {
  try {
    const hashedPw = await bcrypt.hash(password, 12);
    await pool.query('INSERT INTO users (email, password_hash) VALUES ($1, $2)', [email, hashedPw]);
    return { content: [{ type: 'text', text: 'User registered.' }] };
  } catch (err: any) {
    // DANGEROUS: "duplicate key value violates unique constraint users_email_key"
    return { content: [{ type: 'text', text: `Registration failed: ${err.message}` }] };
  }
});
// SAFE: extract only the PostgreSQL SQLSTATE error code; map to a safe business message
// pg SQLSTATE codes: https://www.postgresql.org/docs/current/errcodes-appendix.html
const PG_UNIQUE_VIOLATION = '23505';
const PG_FOREIGN_KEY_VIOLATION = '23503';
const PG_NOT_NULL_VIOLATION = '23502';
const PG_CHECK_VIOLATION = '23514';
const PG_SERIALIZATION_FAILURE = '40001';

function pgErrorToSafeMessage(err: any): string | null {
  switch (err?.code) {
    case PG_UNIQUE_VIOLATION:
      // Do NOT include the constraint name; it reveals column/table structure
      return 'An account with this email address already exists.';
    case PG_FOREIGN_KEY_VIOLATION:
      return 'Related record not found.';
    case PG_NOT_NULL_VIOLATION:
      return 'A required field was missing.';
    case PG_CHECK_VIOLATION:
      return 'A field value failed validation.';
    case PG_SERIALIZATION_FAILURE:
      return 'Concurrent update conflict; please retry.';
    default:
      return null; // Unknown: treat as internal error
  }
}

server.tool('registerUser', registerSchema, async ({ email, password }) => {
  try {
    const hashedPw = await bcrypt.hash(password, 12);
    await pool.query('INSERT INTO users (email, password_hash) VALUES ($1, $2)', [email, hashedPw]);
    return { content: [{ type: 'text', text: 'User registered.' }] };
  } catch (err: any) {
    const safeMsg = pgErrorToSafeMessage(err);
    if (safeMsg) {
      return { isError: true, content: [{ type: 'text', text: safeMsg }] };
    }
    const correlationId = internalError('registerUser', err);
    return {
      isError: true,
      content: [{ type: 'text', text: `Registration failed (ref: ${correlationId})` }]
    };
  }
});
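As a defense-in-depth measure on top of the mapping above, outgoing text can be scanned for shapes that should never reach the model. The sketch below is a rough heuristic under stated assumptions: `LEAK_PATTERNS`, `findLeaks`, and `guardOutgoingText` are hypothetical names, and the regular expressions only approximate V8 stack frames, source file paths, and the quoted identifiers that PostgreSQL messages contain. It will produce some false positives and is not a substitute for sanitizing at the source.

```typescript
// Hypothetical last-line guard: patterns that should not appear in text sent to the model.
const LEAK_PATTERNS: { name: string; pattern: RegExp }[] = [
  // A stack frame line such as "    at fn (/app/src/x.ts:23:14)"
  { name: 'stack_frame', pattern: /^\s*at .+:\d+:\d+\)?$/m },
  // An absolute path ending in a JavaScript/TypeScript source extension
  { name: 'source_path', pattern: /(?:\/[\w.-]+)+\.(?:ts|js|mjs|cjs)\b/ },
  // PostgreSQL messages quote identifiers after these keywords
  { name: 'pg_identifier', pattern: /\b(?:constraint|relation|column) "[^"]+"/i },
];

// Returns the names of every pattern that matches the text.
function findLeaks(text: string): string[] {
  return LEAK_PATTERNS
    .filter(({ pattern }) => pattern.test(text))
    .map(({ name }) => name);
}

// Replaces the whole message with a fallback if anything suspicious is found.
function guardOutgoingText(text: string, fallback: string): string {
  return findLeaks(text).length === 0 ? text : fallback;
}
```

Replacing the whole message, rather than redacting the matched fragment, is a deliberate choice in this sketch: partial redaction tends to leave neighbouring detail behind.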
Error-based enumeration via distinguishing error messages
A login handler that returns different messages for "user doesn't exist" vs "wrong password" allows a caller to enumerate valid email addresses with a binary probe: "user not found" means the email is unregistered, "invalid password" means it is registered. An automated attacker (or a prompt injection that calls this tool in a loop) can map an entire user database without ever logging in.
// DANGEROUS: distinguishing "user not found" from "wrong password"
// Binary probe: "user not found" → email unregistered; "wrong password" → email registered
server.tool('login', loginSchema, async ({ email, password }) => {
  const result = await pool.query('SELECT * FROM users WHERE email = $1', [email]);
  if (!result.rows[0]) {
    return { content: [{ type: 'text', text: 'User not found.' }] }; // LEAKS existence
  }
  const valid = await bcrypt.compare(password, result.rows[0].password_hash);
  if (!valid) {
    return { content: [{ type: 'text', text: 'Wrong password.' }] }; // LEAKS existence
  }
  return { content: [{ type: 'text', text: 'Logged in.' }] };
});
// SAFE: uniform response. Always run bcrypt to equalize timing, and
// always return the same message regardless of which check failed.
// Pre-computed dummy hash: comparing against it when the user doesn't exist
// avoids the "no user row → instant return" timing leak.
// (Top-level await requires an ES module.)
const DUMMY_HASH = await bcrypt.hash('__sentinel_value_never_used__', 12);

server.tool('login', loginSchema, async ({ email, password }) => {
  try {
    const result = await pool.query(
      'SELECT id, password_hash, is_active FROM users WHERE email = $1',
      [email]
    );
    const user = result.rows[0];
    // Always run bcrypt.compare to prevent timing-based enumeration.
    // If there is no user row, compare against the dummy hash so both paths do the same work.
    const hashToCompare = user?.password_hash ?? DUMMY_HASH;
    const valid = await bcrypt.compare(password, hashToCompare);
    // Single unified rejection message for all failure cases
    if (!user || !valid || !user.is_active) {
      return {
        isError: true,
        content: [{ type: 'text', text: 'Invalid email or password.' }]
      };
    }
    const sessionToken = await createSession(user.id);
    return { content: [{ type: 'text', text: `Authenticated. Session: ${sessionToken}` }] };
  } catch (err) {
    const correlationId = internalError('login', err);
    return {
      isError: true,
      content: [{ type: 'text', text: `Login failed (ref: ${correlationId})` }]
    };
  }
});
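The dummy-hash idea can be exercised without a database or the bcrypt dependency. The sketch below swaps in Node's built-in scrypt (`scryptSync`) in place of bcrypt and uses a hypothetical in-memory `users` map instead of a table; `addUser` and `checkLogin` are illustrative names. It shows the same two properties: key derivation runs on every call, and every failure returns an identical message.

```typescript
import { randomBytes, scryptSync, timingSafeEqual } from 'node:crypto';

interface StoredUser {
  salt: Buffer;
  hash: Buffer;
  isActive: boolean;
}

// Hypothetical in-memory store standing in for the users table.
const users = new Map<string, StoredUser>();

function hashPassword(password: string, salt: Buffer): Buffer {
  return scryptSync(password, salt, 32); // 32-byte derived key
}

function addUser(email: string, password: string, isActive = true): void {
  const salt = randomBytes(16);
  users.set(email, { salt, hash: hashPassword(password, salt), isActive });
}

// Dummy credentials computed once at startup; no real account uses them.
const DUMMY_SALT = randomBytes(16);
const DUMMY_HASH = hashPassword(randomBytes(32).toString('hex'), DUMMY_SALT);

const REJECTION = 'Invalid email or password.';

function checkLogin(email: string, password: string): { ok: boolean; message: string } {
  const user = users.get(email);
  // Derive and compare on every call, even when the email is unknown,
  // so both paths perform the same key-derivation work.
  const candidate = hashPassword(password, user?.salt ?? DUMMY_SALT);
  const matches = timingSafeEqual(candidate, user?.hash ?? DUMMY_HASH);
  if (!user || !matches || !user.isActive) {
    return { ok: false, message: REJECTION };
  }
  return { ok: true, message: 'Authenticated.' };
}
```

Unknown email, wrong password, and deactivated account all yield the same `REJECTION` string, so the response text alone cannot be used as a probe.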
Structured error codes for LLM consumption
A generic Error('Something went wrong') gives the LLM no actionable information. The model cannot distinguish a rate-limit error (back off and retry) from a permission error (surface to user) or a not-found error (handle gracefully) without parsing free-text messages. Text parsing is fragile, locale-sensitive, and breaks across version changes. Structured error codes let the LLM branch on a code field rather than guessing from a human string.
// DANGEROUS: unstructured errors. The LLM must parse message text to decide how to respond.
// Fragile, locale-dependent, breaks with any message wording change
server.tool('fetchDocument', docSchema, async ({ docId, callerId }) => {
  const doc = await db.documents.findById(docId);
  if (!doc) throw new Error('Document not found'); // LLM must match text
  if (doc.ownerId !== callerId) throw new Error('Access denied');
  if (rateLimiter.isExceeded(callerId)) throw new Error('Too many requests, wait 60s');
  return { content: [{ type: 'text', text: doc.content }] };
});
// SAFE: structured error objects with machine-readable code + optional retryAfterMs
// The LLM branches on error.code, which stays stable across refactors, localizations, and versions.
type ErrorCode =
  | 'NOT_FOUND'
  | 'PERMISSION_DENIED'
  | 'RATE_LIMITED'
  | 'VALIDATION_ERROR'
  | 'UPSTREAM_ERROR'
  | 'INTERNAL_ERROR';

function toolError(
  code: ErrorCode,
  message: string,
  extras: { retryAfterMs?: number; correlationId?: string } = {}
) {
  return {
    isError: true,
    content: [{
      // "as const" keeps the literal type 'text' instead of widening it to string
      type: 'text' as const,
      text: JSON.stringify({ error: { code, message, ...extras } })
    }]
  };
}
server.tool('fetchDocument', docSchema, async ({ docId, callerId }) => {
  try {
    const doc = await db.documents.findById(docId);
    if (!doc) {
      return toolError('NOT_FOUND', 'Document not found.');
    }
    if (doc.ownerId !== callerId) {
      return toolError('PERMISSION_DENIED', 'You do not have access to this document.');
    }
    const rateStatus = rateLimiter.check(callerId);
    if (rateStatus.exceeded) {
      return toolError('RATE_LIMITED', 'Rate limit exceeded.', {
        retryAfterMs: rateStatus.resetInMs
      });
    }
    return { content: [{ type: 'text', text: doc.content }] };
  } catch (err) {
    const correlationId = internalError('fetchDocument', err);
    return toolError('INTERNAL_ERROR', `Fetch failed (ref: ${correlationId})`, { correlationId });
  }
});
// The LLM's system prompt or tool description instructs it:
// - NOT_FOUND → tell user the document doesn't exist
// - PERMISSION_DENIED → surface to user, do not retry
// - RATE_LIMITED → wait error.retryAfterMs ms then retry
// - INTERNAL_ERROR → tell user to report correlationId to support
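The same branching can also live in client or agent-loop code rather than only in prompt instructions. The sketch below assumes the error envelope produced by `toolError` above (`{ error: { code, message, retryAfterMs?, correlationId? } }` serialized into the first text item); `decideAction`, `Action`, and the 1000 ms default delay are hypothetical choices made for illustration.

```typescript
type ErrorCode =
  | 'NOT_FOUND' | 'PERMISSION_DENIED' | 'RATE_LIMITED'
  | 'VALIDATION_ERROR' | 'UPSTREAM_ERROR' | 'INTERNAL_ERROR';

interface ToolResult {
  isError?: boolean;
  content: { type: string; text: string }[];
}

interface ErrorPayload {
  code?: string;
  message?: string;
  retryAfterMs?: number;
  correlationId?: string;
}

// Hypothetical set of actions a client or agent loop might take.
type Action =
  | { kind: 'ok' }
  | { kind: 'retry'; delayMs: number }
  | { kind: 'tell_user'; message: string }
  | { kind: 'escalate'; correlationId?: string };

function decideAction(result: ToolResult): Action {
  if (!result.isError) return { kind: 'ok' };

  let payload: ErrorPayload | undefined;
  try {
    const parsed = JSON.parse(result.content[0]?.text ?? '') as { error?: ErrorPayload } | null;
    payload = parsed?.error;
  } catch {
    // Not JSON: an unstructured error, so there is no code to branch on.
    return { kind: 'escalate' };
  }

  switch (payload?.code as ErrorCode | undefined) {
    case 'RATE_LIMITED':
      return { kind: 'retry', delayMs: payload?.retryAfterMs ?? 1000 };
    case 'UPSTREAM_ERROR':
      // Retry only when the server supplied an explicit delay.
      return typeof payload?.retryAfterMs === 'number'
        ? { kind: 'retry', delayMs: payload.retryAfterMs }
        : { kind: 'tell_user', message: 'An external service is unavailable.' };
    case 'NOT_FOUND':
    case 'PERMISSION_DENIED':
    case 'VALIDATION_ERROR':
      return { kind: 'tell_user', message: payload?.message ?? 'The request could not be completed.' };
    default:
      // INTERNAL_ERROR and anything unrecognized: hand the reference to a human.
      return { kind: 'escalate', correlationId: payload?.correlationId };
  }
}
```

Unknown codes fall through to `escalate`, so adding a new code on the server degrades safely instead of triggering an unintended retry.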
Circuit breaker for upstream API errors
Without a circuit breaker, a downed upstream API causes every tool invocation to attempt a request, fail, log the full internal URL (which may contain hostnames, paths, and tokens), and return an error. Ten failed calls from a single LLM session produce ten log lines containing that URL, and those logs often feed monitoring dashboards with a wider audience than the people cleared to see the credential. A circuit breaker stops calling the upstream once a failure threshold is reached and returns a predictable error instead, so the failure path, and any sensitive detail it logs, is exercised far less often.
// DANGEROUS: no circuit breaker. Each retry exposes the internal URL in logs and in the error response.
// After 10 attempts: 10 log lines with "https://internal.corp/api/v2?token=abc123"
server.tool('fetchExternalData', schema, async ({ resourceId }) => {
  const url = `${process.env.INTERNAL_API_URL}/resources/${resourceId}`;
  const res = await fetch(url, { headers: { Authorization: `Bearer ${process.env.API_TOKEN}` } });
  if (!res.ok) {
    // Internal URL and status reach the LLM directly
    throw new Error(`HTTP ${res.status} from ${url}`);
  }
  return { content: [{ type: 'text', text: await res.text() }] };
});
// SAFE: CircuitBreaker with CLOSED / OPEN / HALF_OPEN states
// Stops retrying after failureThreshold; opens for recoveryTimeMs before probing again.
type CircuitState = 'CLOSED' | 'OPEN' | 'HALF_OPEN';

class CircuitBreaker {
  private state: CircuitState = 'CLOSED';
  private failures = 0;
  private openedAt = 0;

  constructor(
    private readonly failureThreshold = 5,
    private readonly recoveryTimeMs = 30_000,
  ) {}

  isOpen(): boolean {
    if (this.state !== 'OPEN') return false;
    if (Date.now() - this.openedAt >= this.recoveryTimeMs) {
      // Let probe traffic through. Note: requests are not limited to a single probe;
      // the next recorded success closes the circuit, the next failure re-opens it.
      this.state = 'HALF_OPEN';
      return false;
    }
    return true;
  }

  recordSuccess() {
    this.failures = 0;
    this.state = 'CLOSED';
  }

  recordFailure() {
    this.failures++;
    if (this.failures >= this.failureThreshold) {
      this.state = 'OPEN';
      this.openedAt = Date.now();
    }
  }

  get retryAfterMs(): number {
    return Math.max(0, this.recoveryTimeMs - (Date.now() - this.openedAt));
  }
}
const breaker = new CircuitBreaker(5, 30_000);

server.tool('fetchExternalData', schema, async ({ resourceId }) => {
  if (breaker.isOpen()) {
    return toolError('UPSTREAM_ERROR', 'External service temporarily unavailable.', {
      retryAfterMs: breaker.retryAfterMs
    });
  }
  // Use only a logical resource reference in logs, never the full URL with credentials
  const resourceRef = `/resources/${encodeURIComponent(resourceId)}`;
  try {
    const res = await fetch(`${process.env.EXTERNAL_API_BASE}${resourceRef}`, {
      headers: { Authorization: `Bearer ${await getApiToken()}` },
      signal: AbortSignal.timeout(5_000),
    });
    if (!res.ok) {
      breaker.recordFailure();
      // Log status and logical path only: no full URL, no token
      console.error(JSON.stringify({
        event: 'upstream_http_error',
        status: res.status,
        resourceRef,
      }));
      return toolError('UPSTREAM_ERROR', 'External service returned an error.', {
        retryAfterMs: res.status === 429 ? 60_000 : undefined
      });
    }
    breaker.recordSuccess();
    return { content: [{ type: 'text', text: JSON.stringify(await res.json()) }] };
  } catch (err) {
    breaker.recordFailure();
    // Log the error class only, never err.message, which may contain internal URL fragments
    console.error(JSON.stringify({
      event: 'upstream_fetch_failed',
      errorName: err instanceof Error ? err.name : 'Unknown',
      resourceRef,
    }));
    return toolError('UPSTREAM_ERROR', 'External service unreachable.', {
      retryAfterMs: breaker.retryAfterMs
    });
  }
});
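The state transitions are easier to follow, and to unit test, when time is injected rather than read from `Date.now()` directly. The sketch below is a variant of the breaker above with a hypothetical `now` parameter and a `currentState` getter added for observation; `ClockedBreaker` is an illustrative name, and the small threshold and recovery window are chosen only to keep the walkthrough short.

```typescript
type CircuitState = 'CLOSED' | 'OPEN' | 'HALF_OPEN';

class ClockedBreaker {
  private state: CircuitState = 'CLOSED';
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly recoveryTimeMs: number;
  private readonly now: () => number;

  // `now` is injectable so tests can control the clock (an assumption of this sketch).
  constructor(failureThreshold: number, recoveryTimeMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.recoveryTimeMs = recoveryTimeMs;
    this.now = now;
  }

  get currentState(): CircuitState {
    return this.state;
  }

  isOpen(): boolean {
    if (this.state !== 'OPEN') return false;
    if (this.now() - this.openedAt >= this.recoveryTimeMs) {
      this.state = 'HALF_OPEN'; // recovery window elapsed: let probe traffic through
      return false;
    }
    return true;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = 'CLOSED';
  }

  recordFailure(): void {
    this.failures++;
    if (this.failures >= this.failureThreshold) {
      this.state = 'OPEN';
      this.openedAt = this.now();
    }
  }

  get retryAfterMs(): number {
    if (this.state !== 'OPEN') return 0;
    return Math.max(0, this.recoveryTimeMs - (this.now() - this.openedAt));
  }
}

// Walkthrough with a fake clock: threshold of 2 failures, 1000 ms recovery window.
let fakeNow = 0;
const demo = new ClockedBreaker(2, 1000, () => fakeNow);
const trace: CircuitState[] = [];

demo.recordFailure(); trace.push(demo.currentState); // first failure, below threshold
demo.recordFailure(); trace.push(demo.currentState); // threshold reached
fakeNow = 1000; demo.isOpen(); trace.push(demo.currentState); // recovery window elapsed
demo.recordFailure(); trace.push(demo.currentState); // probe failed
fakeNow = 2000; demo.isOpen(); trace.push(demo.currentState); // second window elapsed
demo.recordSuccess(); trace.push(demo.currentState); // probe succeeded

console.log(trace.join(' -> '));
// CLOSED -> OPEN -> HALF_OPEN -> OPEN -> HALF_OPEN -> CLOSED
```

Because the failure counter is not reset on entering HALF_OPEN, a single failed probe is enough to re-open the circuit, which matches the behavior of the class above.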
What SkillAudit checks in this area
- err.stack in return value: AST taint analysis tracing the error binding in catch blocks to tool response content strings. Covers any path where err.stack, or a template literal including it, flows into returned content. Flagged HIGH.
- err.message from database errors in return value: taint analysis from pg, mysql2, prisma, and sequelize catch blocks where err.message is interpolated into the response. Flagged HIGH; database messages carry schema-level information.
- throw err re-throw in tool handlers: catch blocks that re-throw the original error without wrapping it in a sanitized type. The MCP framework's default serializer determines whether the raw message reaches the caller. Flagged WARN.
- No error code structure: tool handlers whose error responses contain only a message field with no code field. The LLM must parse text to decide how to handle the error. Flagged INFO with a structured-codes recommendation.
See also
- MCP server error message information disclosure — deeper dive on stack trace and file path leakage patterns
- MCP server audit logging — where correlation IDs should go and how to query them post-incident
- MCP server rate limiting — the RATE_LIMITED error code pattern and backoff strategies in context
- MCP server prompt injection — how error disclosure becomes an active attack vector via crafted inputs
- MCP server security checklist — comprehensive pre-publication hardening checklist