MCP server error boundary security — try/catch placement, error classification, structured error response
Error handling in MCP servers is not just an operational concern; it is a security boundary. A misplaced try/catch lets exceptions escape to the MCP runtime and expose internal state in ways the caller can observe. Re-throwing an upstream service error verbatim leaks internal hostnames, database schema details, and dependency versions. An authorization check whose exception is caught and swallowed can accidentally allow access to a resource that should be denied. The five patterns below treat error boundaries as a first-class security layer.
1. Where to place try/catch in a tool handler
The most common error handling mistake in MCP tool handlers is wrapping only the database or API call in try/catch while leaving the rest of the handler unprotected. This pattern looks reasonable — the database call is where errors "happen" — but it misses every other failure mode. Argument validation can throw if the schema library receives an unexpected type. Authorization checks can throw if the auth service is unavailable. Response serialization can throw if the result object contains a circular reference or a value that JSON.stringify rejects. Any of these unprotected exceptions will escape to the MCP runtime as an unhandled rejection.
What the MCP runtime does with an unhandled rejection depends on the implementation, but the possibilities include: terminating the server process (crashing all concurrent requests), sending the raw error object as the tool response (leaking internal details to the client), or silently hanging the request. None of these outcomes are acceptable. The outer try/catch at the tool handler boundary must catch everything, log the internal details, and return a sanitized error response.
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js'
import { z } from 'zod'
// Project-local helpers; the module paths are illustrative
import { db } from './db.js'
import { requirePermission } from './auth.js'
import { handleToolError } from './error-types.js'

const server = new McpServer({ name: 'example', version: '1.0.0' })

server.tool(
  'get_user_record',
  { userId: z.string().uuid() },
  async ({ userId }, extra) => {
    // WRONG: only the DB call is wrapped; auth and serialization errors escape
    // await requirePermission(token, 'users:read')
    // let row
    // try {
    //   row = await db.query('SELECT * FROM users WHERE id = $1', [userId])
    // } catch (err) {
    //   return handleToolError(err)
    // }
    // return { content: [{ type: 'text', text: JSON.stringify(row) }] }

    // CORRECT: the entire handler body is the try block
    try {
      // 1. Authorization: can throw if the auth service is down.
      //    Assumes the transport's auth layer populates extra.authInfo;
      //    a missing token is passed as '' and therefore denied.
      await requirePermission(extra.authInfo?.token ?? '', 'users:read')

      // 2. Data access: can throw on DB error
      const row = await db.query(
        'SELECT id, email, created_at FROM users WHERE id = $1',
        [userId]
      )
      if (!row) {
        return { content: [{ type: 'text', text: 'User not found' }], isError: true }
      }

      // 3. Serialization: can throw on circular refs / non-serializable values
      return { content: [{ type: 'text', text: JSON.stringify(row) }] }
    } catch (err) {
      // Single catch handles ALL failure modes; nothing escapes
      return handleToolError(err)
    }
  }
)
The outer boundary should also handle the case where the handler itself returns a rejected promise — for instance, if an async function has an await that is not inside the try block. Write a wrapper utility that enforces the boundary pattern for every registered tool rather than relying on each developer to remember to add the outer try/catch manually.
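One way to build such a wrapper is sketched below. The names `withErrorBoundary`, `ToolHandler`, and the local `ToolResult` type are hypothetical stand-ins chosen for this illustration, not part of the MCP SDK; `onError` plays the role of `handleToolError`.

```typescript
// Minimal stand-in for the SDK's tool result type (an assumption, so the sketch is self-contained).
type ToolResult = {
  content: { type: 'text'; text: string }[]
  isError?: boolean
}

type ToolHandler<A> = (args: A) => Promise<ToolResult> | ToolResult

// Hypothetical wrapper: no exception or rejected promise from `handler`
// can reach the runtime, because the call itself happens inside the try block.
function withErrorBoundary<A>(
  handler: ToolHandler<A>,
  onError: (err: unknown) => ToolResult,
): (args: A) => Promise<ToolResult> {
  return async (args: A) => {
    try {
      // `await` covers both synchronous throws and rejected promises.
      return await handler(args)
    } catch (err) {
      try {
        return onError(err)
      } catch {
        // Last resort if the error handler itself throws.
        const fallback: ToolResult = {
          content: [{
            type: 'text',
            text: JSON.stringify({
              code: 'INTERNAL_ERROR',
              message: 'An internal error occurred.',
              retryable: true,
            }),
          }],
          isError: true,
        }
        return fallback
      }
    }
  }
}
```

Under these assumptions, a handler would be registered as `withErrorBoundary(async ({ userId }) => { ... }, handleToolError)`, so individual handlers no longer need their own outer try/catch.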
2. Error classification: operational vs programmer errors
Not all errors are equal, and treating them equally makes both security and reliability worse. Operational errors are failures that are expected to happen in the normal course of operation: the requested resource does not exist, the caller provided invalid input, the upstream service returned a rate limit response, or a network timeout occurred. These errors are not bugs. They should produce clear, informative messages that tell the caller what went wrong and what they can do about it. They should be logged at INFO or WARN level, not ERROR.
Programmer errors are unexpected failures that indicate a bug in the server code: a TypeError because a variable that should have been a string was undefined, a ReferenceError because a module import failed, or an assertion failure because the code reached a state that was believed to be impossible. These errors should produce a generic "internal error" message to the caller — the specific details are not actionable by the caller and may leak sensitive information. Internally, they should be logged at ERROR level with the full stack trace and, in production, trigger an alert.
// error-types.ts: operational error class with caller-visible properties
import type { CallToolResult } from '@modelcontextprotocol/sdk/types.js'
import { logger } from './logger.js' // illustrative path; a pino-style structured logger is assumed

export class OperationalError extends Error {
  constructor(
    public readonly code: string,
    message: string,
    public readonly retryable: boolean = false,
    public readonly httpStatus: number = 400
  ) {
    super(message)
    this.name = 'OperationalError'
  }
}

// Common operational error factory functions
export const Errors = {
  notFound: (resource: string) =>
    new OperationalError('NOT_FOUND', `${resource} not found`, false, 404),
  invalidInput: (detail: string) =>
    new OperationalError('INVALID_INPUT', detail, false, 400),
  rateLimited: () =>
    new OperationalError('RATE_LIMITED', 'Too many requests, please retry later', true, 429),
  unauthorized: () =>
    new OperationalError('UNAUTHORIZED', 'Access denied', false, 403),
}

// handleToolError: classify and respond appropriately
export function handleToolError(err: unknown): CallToolResult {
  if (err instanceof OperationalError) {
    // Operational: safe to return details to the caller
    logger.warn({ code: err.code, message: err.message }, 'operational error')
    return {
      content: [{ type: 'text', text: JSON.stringify({
        code: err.code,
        message: err.message,
        retryable: err.retryable,
      })}],
      isError: true,
    }
  }

  // Programmer error: log everything internally, return nothing useful externally
  logger.error({ err }, 'unexpected tool handler error')
  return {
    content: [{ type: 'text', text: JSON.stringify({
      code: 'INTERNAL_ERROR',
      message: 'An internal error occurred. Please try again later.',
      retryable: true,
    })}],
    isError: true,
  }
}
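To make the difference between the two classes concrete, the sketch below reduces the classification rule to the payload a caller would see. `toCallerPayload` and the compact `OperationalError` are stand-ins written for this illustration so that it runs on its own; they are not the article's full implementation.

```typescript
// Compact stand-in for the OperationalError class above.
class OperationalError extends Error {
  constructor(
    public readonly code: string,
    message: string,
    public readonly retryable: boolean = false,
  ) {
    super(message)
    this.name = 'OperationalError'
    // Keeps instanceof working even when compiling to an ES5 target.
    Object.setPrototypeOf(this, new.target.prototype)
  }
}

type ErrorPayload = { code: string; message: string; retryable: boolean }

// Same rule as handleToolError: only operational errors expose their own message.
function toCallerPayload(err: unknown): ErrorPayload {
  if (err instanceof OperationalError) {
    return { code: err.code, message: err.message, retryable: err.retryable }
  }
  // Anything else, including thrown strings or plain objects, is treated as a bug.
  return {
    code: 'INTERNAL_ERROR',
    message: 'An internal error occurred. Please try again later.',
    retryable: true,
  }
}

const operational = toCallerPayload(new OperationalError('NOT_FOUND', 'User not found'))
const programmer = toCallerPayload(
  new TypeError("Cannot read properties of undefined (reading 'email')"),
)

console.log(operational.code)   // NOT_FOUND
console.log(programmer.code)    // INTERNAL_ERROR
console.log(programmer.message) // generic text; the TypeError message is not exposed
```

The important property is the default branch: an error is considered unsafe to show unless it was deliberately constructed as operational.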
3. Never re-throw upstream errors verbatim
When an MCP tool calls an external service (a database, a REST API, an internal microservice) and that service returns an error, the error object from the service client typically contains far more information than the caller of the MCP tool should ever see. A PostgreSQL error thrown by the pg driver can include the schema, table, column, and constraint names, a detail string that quotes the offending values, and the name of the server-side routine that raised it. An OpenAI API error message can include the organization ID, the model name, and rate limit details. A failed HTTP request often surfaces the full URL, which may contain internal hostnames or path parameters that reveal your service topology.
A sanitizeError() helper should transform any upstream error into a safe internal representation before it is classified and returned. The helper strips the original message, preserves only a safe human-readable summary, and optionally maps specific error codes to OperationalError instances. The full original error is logged internally where it is useful for debugging, but the caller receives only what they need to understand the outcome.
// sanitize-error.ts: prevent upstream error details from reaching callers
import { OperationalError, Errors } from './error-types.js'
import { logger } from './logger.js' // illustrative path

interface PgError extends Error {
  code?: string     // PostgreSQL error code (e.g. '23505' for unique violation)
  routine?: string  // Internal PG routine name; never expose this
  detail?: string   // May include column values from a constraint violation
}

export function sanitizeDbError(err: unknown): OperationalError | Error {
  if (!isError(err)) return new Error('Unknown database error')
  const pg = err as PgError

  // Map known PG error codes to safe operational errors
  switch (pg.code) {
    case '23505': return new OperationalError('CONFLICT', 'Resource already exists', false, 409)
    case '23503': return new OperationalError('INVALID_INPUT', 'Referenced resource not found', false, 400)
    case '57014': return new OperationalError('TIMEOUT', 'Database query timed out', true, 503)
    default:
      // Log the full error internally; the caller gets only a generic message
      logger.error({ err: pg, pgCode: pg.code }, 'database error')
      // Do NOT copy pg.detail or pg.routine into the returned error:
      // they contain schema info and row values
      return new Error('Database error') // Handled as programmer error upstream
  }
}

export function sanitizeFetchError(err: unknown): OperationalError {
  // Do not include the URL: it may contain internal hostnames or auth tokens.
  // Assumes an axios-style error that exposes response.status; adapt for other HTTP clients.
  const status = (err as { response?: { status?: number } } | null)?.response?.status ?? 0
  if (status === 429) return Errors.rateLimited()
  if (status === 404) return Errors.notFound('upstream resource')
  if (status >= 500) return new OperationalError('UPSTREAM_ERROR', 'Upstream service unavailable', true, 502)
  return new OperationalError('UPSTREAM_ERROR', 'Upstream request failed', false, 502)
}

function isError(v: unknown): v is Error {
  return v instanceof Error
}
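The sanitizers are most effective when they are applied at the data-access layer, so that raw driver errors never travel further up the call stack. The sketch below shows that placement with a reduced `sanitizeDbError`; `insertUser` and its injected `runQuery` parameter are hypothetical names used only for this example.

```typescript
// Compact stand-in for OperationalError so the sketch runs without other modules.
class OperationalError extends Error {
  constructor(public readonly code: string, message: string) {
    super(message)
    this.name = 'OperationalError'
    Object.setPrototypeOf(this, new.target.prototype)
  }
}

// Reduced version of sanitizeDbError: one known code, everything else generic.
function sanitizeDbError(err: unknown): Error {
  const code = (err as { code?: string } | null)?.code
  if (code === '23505') return new OperationalError('CONFLICT', 'Resource already exists')
  return new Error('Database error')
}

// Hypothetical data-access helper: the only place that sees raw driver errors.
async function insertUser(
  runQuery: (sql: string, params: unknown[]) => Promise<void>,
  email: string,
): Promise<void> {
  try {
    await runQuery('INSERT INTO users (email) VALUES ($1)', [email])
  } catch (err) {
    // Translate at the boundary; the raw error object is not re-thrown.
    throw sanitizeDbError(err)
  }
}
```

With this arrangement the tool handler's outer catch only ever receives either an `OperationalError` or a generic `Error`, so the classification step cannot accidentally forward constraint names or row values.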
4. Structured error response schema
An MCP tool response that returns an error as a plain string — "Something went wrong" or, worse, the raw err.message — is not actionable by the caller. Claude cannot determine whether to retry the operation, ask the user for different input, or report a system failure. A structured error schema gives the caller the information it needs to handle the error programmatically without exposing internal implementation details.
The minimal structure is three fields: code, a machine-readable identifier that the caller can switch on (e.g. "RATE_LIMITED", "NOT_FOUND", "INVALID_INPUT"); message, a human-readable string suitable for display; and retryable, a boolean indicating whether retrying the same request without modification is likely to succeed. Stack traces, internal error codes from upstream services, SQL error details, and file paths are never included in the response — they belong in internal logs only.
// error-response.ts: typed error response schema and builder
import { z } from 'zod'
import type { CallToolResult } from '@modelcontextprotocol/sdk/types.js'
import { logger } from './logger.js' // illustrative path

// The schema is enforced both when building responses (TypeScript types)
// and in tests (Zod validation of actual tool output)
export const ErrorResponseSchema = z.object({
  code: z.string().min(1).max(64),
  message: z.string().min(1).max(512),
  retryable: z.boolean(),
}).strict() // .strict() rejects extra keys, so stack, detail, etc. cannot sneak in

export type ErrorResponse = z.infer<typeof ErrorResponseSchema>

export function buildErrorResponse(
  code: string,
  message: string,
  retryable: boolean,
): CallToolResult {
  const payload: ErrorResponse = { code, message, retryable }

  // Validate our own output; this catches mistakes during development
  const check = ErrorResponseSchema.safeParse(payload)
  if (!check.success) {
    // Construction logic is broken: fall back to the safest possible response
    logger.error({ issues: check.error.issues }, 'buildErrorResponse validation failed')
    return {
      content: [{ type: 'text', text: JSON.stringify({
        code: 'INTERNAL_ERROR',
        message: 'An internal error occurred.',
        retryable: true,
      })}],
      isError: true,
    }
  }

  return {
    content: [{ type: 'text', text: JSON.stringify(payload) }],
    isError: true,
  }
}

// What the response NEVER contains:
// - stack:    err.stack            (file paths, line numbers, variable names)
// - detail:   pg.detail            (column values from constraint violations)
// - upstream: fetchRes.body        (raw upstream service response)
// - env:      process.env.DB_HOST  (infrastructure details)
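On the consuming side, the three fields are enough to decide what to do next. The following sketch shows one possible client-side policy; `parseErrorResponse`, `nextAction`, and the `NextAction` values are hypothetical names for this illustration, and the mapping from codes to actions is an assumption rather than something the schema prescribes.

```typescript
type ErrorResponse = { code: string; message: string; retryable: boolean }
type NextAction = 'retry' | 'ask_user' | 'report_failure'

// Parse the text content of an error result; returns null for anything
// that is not a well-formed { code, message, retryable } object.
function parseErrorResponse(text: string): ErrorResponse | null {
  try {
    const value: unknown = JSON.parse(text)
    if (typeof value !== 'object' || value === null) return null
    const record = value as Record<string, unknown>
    const code = record['code']
    const message = record['message']
    const retryable = record['retryable']
    if (typeof code !== 'string' || typeof message !== 'string' || typeof retryable !== 'boolean') {
      return null
    }
    return { code, message, retryable }
  } catch {
    return null // plain strings such as "Something went wrong" end up here
  }
}

// Illustrative policy: retry when told to, ask for new input on caller
// mistakes, and report everything else as a failure.
function nextAction(text: string): NextAction {
  const err = parseErrorResponse(text)
  if (err === null) return 'report_failure'
  if (err.retryable) return 'retry'
  if (err.code === 'INVALID_INPUT' || err.code === 'NOT_FOUND') return 'ask_user'
  return 'report_failure'
}

console.log(nextAction('{"code":"RATE_LIMITED","message":"Too many requests, please retry later","retryable":true}')) // retry
console.log(nextAction('{"code":"INVALID_INPUT","message":"userId must be a UUID","retryable":false}')) // ask_user
console.log(nextAction('Something went wrong')) // report_failure
```

The unstructured string falls through to the least useful outcome, which is the practical cost of returning plain-text errors.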
5. Fail-closed authorization: deny on error
Authorization checks in MCP tool handlers are often asynchronous — they call a token validation service, check a permissions database, or verify a JWT signature. All of these can fail. The JWT library can throw if the secret is malformed. The permissions database can be unreachable. The token validation service can return a 503. What should happen when the authorization check itself fails?
The only secure answer is to deny the request. An authorization check that throws an exception should never be interpreted as an authorization success. Yet this failure mode is easy to introduce accidentally: if the auth check is wrapped in a try/catch whose catch block swallows the error without returning a denial, execution falls through to the tool logic. That is a privilege escalation bug. The pattern below ensures that the authorization check fails closed: any exception from the auth check results in an explicit denial, regardless of what the exception was.
import jwt from 'jsonwebtoken'
import { config } from './config.js'
import { Errors, OperationalError } from './error-types.js'
// server, z, db, logger, and handleToolError are set up as in the earlier examples

// WRONG: catches the error and falls through, accidentally allowing access
async function checkPermissionUnsafe(token: string, permission: string): Promise<void> {
  try {
    const payload = jwt.verify(token, config.auth.jwtSecret) as { perms: string[] }
    if (!payload.perms.includes(permission)) throw Errors.unauthorized()
  } catch (err) {
    // BUG: if jwt.verify throws (invalid or expired token, malformed secret,
    // library bug) or payload.perms is missing, this catch swallows the error
    // and the function returns normally, allowing the caller to proceed
    // as if authorization succeeded
    if (err instanceof OperationalError) throw err
    // every other error falls through: silent allow!
  }
}

// CORRECT: any exception from the auth check becomes an explicit denial
export async function requirePermission(token: string, permission: string): Promise<void> {
  let authorized = false
  try {
    const payload = jwt.verify(token, config.auth.jwtSecret) as { perms?: string[] }
    authorized = Array.isArray(payload.perms) && payload.perms.includes(permission)
  } catch (err) {
    // Log the auth check failure for investigation; do NOT allow
    logger.warn({ permission, errName: (err as Error).name }, 'auth check threw, denying')
    authorized = false // Explicit: fail closed
  }
  // Single exit point: authorized must be true or we throw
  if (!authorized) {
    throw Errors.unauthorized()
  }
}

// Usage in tool handler: safe because requirePermission always throws on failure.
// Assumes the transport's auth layer populates extra.authInfo; a missing token
// is passed as '' and therefore denied.
server.tool('delete_record', { id: z.string().uuid() }, async ({ id }, extra) => {
  try {
    await requirePermission(extra.authInfo?.token ?? '', 'records:delete') // throws on deny OR on error
    await db.deleteRecord(id)
    return { content: [{ type: 'text', text: 'Deleted' }] }
  } catch (err) {
    return handleToolError(err)
  }
})
The same fail-closed principle applies to any check that gates access: rate limit checks, tenant isolation checks, IP allowlist checks, and capability checks. In every case, an exception thrown by the check itself must result in denial. A useful mental model is that the authorization system has a single boolean — authorized — that starts as false and can only be set to true by an explicit successful authorization step. Exceptions can never set it to true.
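That mental model can be captured once in a small helper and reused for every gate. The sketch below is one possible shape; `failClosed`, `DeniedError`, and the `onCheckError` callback are hypothetical names for this illustration, with `DeniedError` standing in for `Errors.unauthorized()`.

```typescript
// Stand-in for the denial raised by Errors.unauthorized().
class DeniedError extends Error {
  constructor() {
    super('Access denied')
    this.name = 'DeniedError'
    Object.setPrototypeOf(this, new.target.prototype)
  }
}

// Hypothetical helper: runs a gating check and treats anything other than an
// explicit `true` as a denial, including exceptions and rejected promises.
async function failClosed(
  name: string,
  check: () => boolean | Promise<boolean>,
  onCheckError: (name: string, err: unknown) => void = () => {},
): Promise<void> {
  let allowed = false // starts closed
  try {
    // Only the literal value true opens the gate.
    allowed = (await check()) === true
  } catch (err) {
    // Report internally; the decision stays "deny".
    onCheckError(name, err)
  }
  if (!allowed) throw new DeniedError()
}
```

A handler could then call, for example, `await failClosed('rate_limit', () => limiter.allow(clientId))` inside its outer try block, where `limiter` is whatever rate limiter the server already uses. If the limiter's backing store is unreachable, the request is denied rather than waved through.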