MCP server stack overflow security — recursive processing depth limits for tool arguments

Stack overflow is primarily a denial-of-service vulnerability in MCP servers: a tool argument crafted to trigger unbounded recursive processing exhausts the call stack and crashes the server process. In Node.js this surfaces as RangeError: Maximum call stack size exceeded, which kills the process if it goes uncaught; in native addons it is typically a segfault when recursion hits the stack guard page, and WASM modules fail with a comparable stack-exhaustion trap. Any of these outcomes terminates the MCP session and drops any agent work in progress, with no recovery path until the server is restarted.

Recursive processing of user-controlled input depth

MCP servers that perform structural analysis of tool arguments are the primary risk surface. JSON schema validators, code parsers (tree-sitter native bindings), Markdown renderers, XML processors, and custom expression evaluators all recurse on nested structures. The processing depth is determined by the input's nesting depth — and the nesting depth is controlled by whoever generates the tool argument. In an agent workflow, that is the model, which may itself be influenced by prompt injection.

A deeply nested JSON argument is the most portable attack vector because every MCP server accepts JSON-encoded tool arguments by definition. An argument like {"a":{"a":{"a":{"a":...}}}} repeated to 10,000 levels is syntactically valid JSON. Node.js's built-in JSON.parse handles this without issue on current V8 versions (its parser is iterative internally), but any tool handler that recursively walks the resulting object tree, whether for validation, transformation, or analysis, can hit the call stack limit.
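One way to act on this is to measure nesting depth before any recursive handler sees the parsed value. The sketch below is a minimal illustration, not part of any MCP SDK: the helper name exceedsDepth and the limit of 64 are assumptions chosen for the example. It walks the value with an explicit heap-allocated stack, so the check itself cannot overflow the call stack.

```typescript
// Hypothetical helper: returns true if `value` nests deeper than `limit` container levels.
// The traversal uses an explicit array as its stack, so it never grows the call stack.
function exceedsDepth(value: unknown, limit: number): boolean {
    const stack: Array<{ value: unknown; depth: number }> = [{ value, depth: 1 }];
    while (stack.length > 0) {
        const { value: current, depth } = stack.pop()!;
        // Primitives and null do not add a nesting level.
        if (current === null || typeof current !== 'object') continue;
        if (depth > limit) return true;
        // Object.values covers both arrays and plain objects.
        for (const child of Object.values(current as object)) {
            stack.push({ value: child, depth: depth + 1 });
        }
    }
    return false;
}

// Build {"a":{"a":...}} 10,000 levels deep as text, then parse it.
const deepJson = '{"a":'.repeat(10_000) + '1' + '}'.repeat(10_000);
const parsed: unknown = JSON.parse(deepJson);

console.log(exceedsDepth(parsed, 64));                 // true
console.log(exceedsDepth({ a: { b: [1, 2] } }, 64));   // false
```

A handler could call a check like this first and reject the argument with a structured error before passing it to a validator or analyzer that recurses.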

Node.js can sustain roughly 10,000–15,000 frames of a trivial recursive function under V8's default stack limit of about 984 KB; the exact count depends on the platform and on the frame size of each recursive call. Heavier frames lower that number quickly: a recursive function with five local variables and a few call arguments uses approximately 200–400 bytes of stack per frame, which exhausts the same 984 KB in roughly 2,500–5,000 frames, well within the nesting depth a single JSON argument can contain. For native code in a libuv worker thread (typically 8 MB stack on Linux), a C recursive descent parser with 100 bytes of local state exhausts the stack in around 80,000 frames, which is reachable with a crafted XML document or Markdown AST.
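The frame counts above follow from dividing the stack size by the per-frame cost. The short calculation below reproduces them; the function name is illustrative, and the byte figures are the estimates quoted in the text rather than measurements.

```typescript
// Rough estimate of how many recursive frames fit in a stack of a given size.
function estimateMaxFrames(stackBytes: number, bytesPerFrame: number): number {
    return Math.floor(stackBytes / bytesPerFrame);
}

const V8_DEFAULT_STACK_BYTES = 984 * 1024;        // ~984 KB, the figure quoted above
const NATIVE_THREAD_STACK_BYTES = 8 * 1024 * 1024; // 8 MB worker-thread stack

console.log(estimateMaxFrames(V8_DEFAULT_STACK_BYTES, 200));    // 5038
console.log(estimateMaxFrames(V8_DEFAULT_STACK_BYTES, 400));    // 2519
console.log(estimateMaxFrames(NATIVE_THREAD_STACK_BYTES, 100)); // 83886
```

Real frame sizes depend on the engine, optimization tier, and platform, so these numbers are order-of-magnitude guides only.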

The agent-level consequence is significant: unlike a web server that can restart a crashed request handler in isolation, an MCP server process crash ends the entire MCP session. The agent loses access to all tools simultaneously, any in-flight tool results are discarded, and the orchestrator receives a connection error rather than a structured tool response. Recovery requires restarting the server and replaying the agent's session from a checkpoint — if such checkpointing exists.

Depth-limiting recursive descent parsers

The most direct fix is to add an explicit depth limit to every recursive function that processes user-controlled input. The depth parameter propagates through recursive calls and throws a structured error — not an unhandled exception — when the limit is reached:

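// VULNERABLE: unbounded recursion on nested expression trees
type ExprNode = { op: string; children: ExprNode[] } | { value: number };

// applyOp is assumed to be defined elsewhere in the server.
declare function applyOp(op: string, args: number[]): number;

// Named evaluateUnsafe so it can sit in the same file as the safe version below.
function evaluateUnsafe(node: ExprNode): number {
    if ('value' in node) return node.value;
    const args = node.children.map(child => evaluateUnsafe(child));  // recurses without depth tracking
    return applyOp(node.op, args);
}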

// SAFE — depth parameter limits recursion; structured error on violation
const MAX_EXPR_DEPTH = 50;

function evaluate(node: ExprNode, depth = 0): number {
    if (depth > MAX_EXPR_DEPTH) {
        throw Object.assign(
            new RangeError(`Expression nesting exceeds maximum depth of ${MAX_EXPR_DEPTH}`),
            { code: 'EXPR_TOO_DEEP' }
        );
    }
    if ('value' in node) return node.value;
    const args = node.children.map(child => evaluate(child, depth + 1));
    return applyOp(node.op, args);
}

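// In the MCP tool handler: catch the depth error and convert it to an McpError.
// Assumes the official TypeScript SDK, which exports McpError and ErrorCode from its types module.
import { McpError, ErrorCode } from '@modelcontextprotocol/sdk/types.js';

server.tool('evaluate_expression', schema, async (args) => {
    try {
        const tree = parseExpression(args.expression);
        // Tool results are returned as content blocks rather than bare objects.
        return { content: [{ type: 'text', text: String(evaluate(tree)) }] };
    } catch (err) {
        // RangeError has no declared `code` field, so widen the type before reading it.
        if (err instanceof RangeError && (err as RangeError & { code?: string }).code === 'EXPR_TOO_DEEP') {
            throw new McpError(ErrorCode.InvalidParams, err.message);
        }
        throw err;
    }
});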

For tree-sitter native bindings, set a parse timeout to bound the worst-case execution time in addition to structural depth limits. The tree-sitter Node.js binding exposes parser.setTimeoutMicros(5_000_000) (5 seconds) — any parse that exceeds this returns null rather than hanging or crashing. Combine this with a visitor pattern that tracks traversal depth and aborts when the limit is reached.
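The depth-tracking visitor mentioned above could look roughly like the following. This is a sketch under stated assumptions: SyntaxNodeLike is a minimal stand-in shape (a real binding's node type carries many more fields), and walkWithDepthLimit, MAX_AST_DEPTH, and the AST_TOO_DEEP code are names invented for the example.

```typescript
// Assumed minimal node shape; not the binding's real type.
interface SyntaxNodeLike {
    type: string;
    children: SyntaxNodeLike[];
}

const MAX_AST_DEPTH = 200; // illustrative value, tune per tool

// Visit nodes in pre-order and abort with a tagged error once depth exceeds the limit.
function walkWithDepthLimit(
    root: SyntaxNodeLike,
    visit: (node: SyntaxNodeLike, depth: number) => void,
    maxDepth: number = MAX_AST_DEPTH,
): void {
    // Each entry pairs a node with its depth, so depth survives without recursion.
    const stack: Array<[SyntaxNodeLike, number]> = [[root, 0]];
    while (stack.length > 0) {
        const [node, depth] = stack.pop()!;
        if (depth > maxDepth) {
            throw Object.assign(
                new RangeError(`AST nesting exceeds maximum depth of ${maxDepth}`),
                { code: 'AST_TOO_DEEP' },
            );
        }
        visit(node, depth);
        // Push children in reverse so the left-most child is visited first.
        for (let i = node.children.length - 1; i >= 0; i--) {
            stack.push([node.children[i], depth + 1]);
        }
    }
}

// Usage: count nodes by type in a small hand-built tree.
const counts = new Map<string, number>();
const sampleTree: SyntaxNodeLike = {
    type: 'document',
    children: [
        { type: 'paragraph', children: [] },
        { type: 'paragraph', children: [] },
    ],
};
walkWithDepthLimit(sampleTree, (node) => {
    counts.set(node.type, (counts.get(node.type) ?? 0) + 1);
});
console.log(counts.get('paragraph')); // 2
```

The tagged error can then be converted to a structured tool error in the handler, in the same way as the expression example.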

The depth limit value (50 in the example above) should be chosen based on the realistic maximum nesting for valid inputs to that tool, with headroom for legitimate edge cases but well below the stack exhaustion threshold. As a rule of thumb, a depth of 20–30 is enough for JSON schema validation of typical schemas, and 10–15 is enough for Markdown AST traversal of typical documents; measure your own inputs before committing to a number. Document the chosen limit in the tool's description so the model can avoid generating inputs that will be rejected.

Iterative reformulation of recursive algorithms

Depth limits are an effective and quick fix, but for algorithms where correctness at any depth is important, an iterative reformulation eliminates the stack exhaustion risk entirely. An iterative implementation maintains an explicit stack on the heap (a JavaScript Array or C++ std::vector) whose size is limited only by available memory, and a visited.size > MAX_NODES check gives it a controllable ceiling:

// Recursive DFS — vulnerable to stack overflow on deep graphs
function dfsRecursive(root: TreeNode): TreeNode[] {
    const result: TreeNode[] = [];
    function visit(node: TreeNode) {
        result.push(node);
        for (const child of node.children) {
            visit(child);  // call stack grows with depth
        }
    }
    visit(root);
    return result;
}

// Iterative DFS — explicit stack on the heap, no call stack growth
const MAX_NODES = 10_000;

function dfsIterative(root: TreeNode): TreeNode[] {
    const result: TreeNode[] = [];
    const stack: TreeNode[] = [root];
    const visited = new Set<TreeNode>();

    while (stack.length > 0) {
        const node = stack.pop()!;
        if (visited.has(node)) continue;
        visited.add(node);

        if (visited.size > MAX_NODES) {
            throw Object.assign(
                new RangeError(`Tree traversal exceeds ${MAX_NODES} node limit`),
                { code: 'TREE_TOO_LARGE' }
            );
        }

        result.push(node);
        // Push children in reverse so left-most is processed first
        for (let i = node.children.length - 1; i >= 0; i--) {
            stack.push(node.children[i]);
        }
    }
    return result;
}

The iterative formulation handles inputs of any nesting depth without exhausting the call stack. Its memory consumption is dominated by the result array and the visited set, both capped by the MAX_NODES check, plus the explicit stack, which holds the not-yet-visited children of nodes already processed (roughly depth times branching factor for a tree). Unlike a depth limit, the node-count ceiling also rejects structures that are wide rather than deep. For MCP servers processing ASTs or JSON object trees from untrusted sources, the iterative formulation is the recommended long-term pattern.
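Traversal is the easy case; an evaluator is harder to convert because each operator needs its children's results before it can run. The sketch below shows one possible iterative, post-order version of the expression evaluator from earlier. The applyOp table, the Frame type, MAX_EXPR_NODES, and the EXPR_TOO_LARGE code are assumptions made for the example, not the original server's definitions.

```typescript
type OpNode = { op: string; children: ExprNode[] };
type ExprNode = OpNode | { value: number };

// Hypothetical operator table standing in for the applyOp used earlier.
function applyOp(op: string, args: number[]): number {
    switch (op) {
        case 'add': return args.reduce((a, b) => a + b, 0);
        case 'mul': return args.reduce((a, b) => a * b, 1);
        default: throw new Error(`Unknown operator: ${op}`);
    }
}

const MAX_EXPR_NODES = 10_000; // illustrative ceiling

// One frame per operator whose children are still being evaluated.
type Frame = { node: OpNode; args: number[] };

function evaluateIterative(root: ExprNode, maxNodes: number = MAX_EXPR_NODES): number {
    const frames: Frame[] = [];
    let visited = 0;
    let current: ExprNode = root;

    for (;;) {
        if (++visited > maxNodes) {
            throw Object.assign(
                new RangeError(`Expression exceeds ${maxNodes} node limit`),
                { code: 'EXPR_TOO_LARGE' },
            );
        }
        let value: number;
        if ('value' in current) {
            value = current.value;
        } else if (current.children.length === 0) {
            value = applyOp(current.op, []);
        } else {
            // Descend into the first child; the operator waits on the heap.
            frames.push({ node: current, args: [] });
            current = current.children[0];
            continue;
        }
        // Hand the finished value to waiting operators, completing any that are now full.
        for (;;) {
            const top = frames[frames.length - 1];
            if (top === undefined) return value; // nothing is waiting: this is the root result
            top.args.push(value);
            if (top.args.length < top.node.children.length) {
                current = top.node.children[top.args.length]; // next sibling
                break;
            }
            frames.pop();
            value = applyOp(top.node.op, top.args);
        }
    }
}

// add(1, mul(2, 3))
const expr: ExprNode = {
    op: 'add',
    children: [{ value: 1 }, { op: 'mul', children: [{ value: 2 }, { value: 3 }] }],
};
console.log(evaluateIterative(expr)); // 7
```

The pending operators live in the frames array rather than on the call stack, so nesting depth is bounded only by the node ceiling and available memory.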

Testing for stack depth with fuzz input

Depth limit logic must be tested explicitly — normal test fixtures do not exercise it. Add a dedicated test that generates a pathologically deep input and asserts the handler returns a structured error rather than crashing:

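// Jest test: verify the tool handler survives a 10,000-level deep JSON argument
// McpError and ErrorCode are assumed to come from the official TypeScript SDK.
import { McpError, ErrorCode } from '@modelcontextprotocol/sdk/types.js';
import { callTool } from '../src/server';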

test('handles deeply nested JSON argument without crashing', async () => {
    // Build a 10,000-deep nested object: {"a":{"a":{"a":...}}}
    let nested: Record<string, unknown> = {};
    let cursor = nested;
    for (let i = 0; i < 10_000; i++) {
        cursor['a'] = {};
        cursor = cursor['a'] as Record<string, unknown>;
    }

    // The tool must either return a ToolError or succeed — never crash the process
    let result: unknown;
    try {
        result = await callTool('analyze_schema', { schema: nested });
    } catch (err) {
        // Structured ToolError is acceptable — uncaught RangeError is not
        expect(err).toBeInstanceOf(McpError);
        expect((err as McpError).code).toBe(ErrorCode.InvalidParams);
        return;
    }

    // If it succeeded, the result should be valid
    expect(result).toBeDefined();
}, 10_000 /* 10s timeout — not expected to be slow, but guards against hangs */);

// Also test that RangeError does NOT propagate uncaught:
test('wraps RangeError as ToolError', async () => {
    const deepArray = JSON.parse('['.repeat(5000) + '1' + ']'.repeat(5000));
    await expect(callTool('process_array', { data: deepArray }))
        .rejects.toBeInstanceOf(McpError);
});
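To cover more than one input shape, the deep fixtures can come from a small generator. The following is a minimal sketch: makeNested, the Shape type, and the FUZZ_DEPTHS values (picked around an assumed handler limit of 50) are all illustrative names, not part of Jest or the MCP SDK.

```typescript
type Shape = 'object' | 'array' | 'mixed';

// Build a value nested `depth` container levels deep, using a loop rather than recursion.
function makeNested(depth: number, shape: Shape): unknown {
    let value: unknown = 1; // innermost leaf
    for (let level = 0; level < depth; level++) {
        // 'mixed' alternates arrays and objects from the inside out.
        const useArray = shape === 'array' || (shape === 'mixed' && level % 2 === 0);
        value = useArray ? [value] : { a: value };
    }
    return value;
}

// Depths just below, at, and far above an assumed handler limit of 50.
const FUZZ_DEPTHS = [49, 50, 51, 1_000, 10_000];

for (const depth of FUZZ_DEPTHS) {
    const fixture = makeNested(depth, 'mixed');
    console.log(depth, Array.isArray(fixture) ? 'array' : 'object');
}
```

In a Jest suite, the depths and shapes could drive a table test through test.each, with each case asserting that the handler either succeeds or rejects with a structured error.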

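Note that node --stack-size=65536 (raising V8's stack limit to 64 MB; the flag takes a value in KB) is not a fix: it only moves the crash to a deeper input. An adversary who knows the increased stack size can simply craft a deeper argument. The flag also changes only the limit V8 enforces, not the stack the operating system actually allocated, so a value above the OS thread stack size can turn a catchable RangeError into a segfault. The fix is always a depth limit or an iterative reformulation that does not use the call stack for recursion.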
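A quick way to see that the limit is finite under any setting is to measure it. The probe below is a rough sketch (the function name is made up for this example) that recurses until V8 throws and reports how many frames fit; running it with different --stack-size values changes the number but never removes the ceiling.

```typescript
// Recurse until V8 raises a stack overflow, then report how many frames fit.
function probeMaxRecursionDepth(): number {
    let depth = 0;
    const recurse = (): void => {
        depth++;
        recurse();
    };
    try {
        recurse();
    } catch (err) {
        // Only the stack overflow is expected here; rethrow anything else.
        if (!(err instanceof RangeError)) throw err;
    }
    return depth;
}

console.log(`Stack overflowed after ${probeMaxRecursionDepth()} frames`);
```

The reported count varies by platform, Node.js version, and frame size, so it should not be used as a safe depth budget.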

SkillAudit detection

SkillAudit's Security axis checks for three patterns that indicate stack overflow risk in MCP tool handlers. First, it identifies JavaScript and TypeScript functions that call themselves recursively, directly or mutually, without a numeric depth parameter, and that appear in the call graph of any registered tool handler. Second, it scans for calls to third-party parser libraries (tree-sitter, marked, xml2js, ajv, fast-xml-parser) without accompanying timeout, depth, or size configuration applied before the parse call; each library has its own configuration API and SkillAudit checks for its presence. Third, it flags recursive call sites that are not wrapped in try/catch blocks capable of catching RangeError, meaning an uncaught stack overflow would propagate to the MCP server's top-level error handler as an unhandled exception rather than being returned as a structured tool error. The LLM probe sends a 10,000-level deeply nested JSON object to each tool that accepts object- or array-typed parameters and checks whether the server process survives: any response counts as a pass, a dropped connection as a failure. Run a free audit at skillaudit.dev to verify your MCP server handles recursive depth attacks without crashing.