MCP Server Runtime Sandbox Design: Isolating Tool Execution with vm.runInContext, Worker Threads, and Seccomp

2026-06-14 · SkillAudit

MCP tool handlers are unusual software: they run code paths shaped by LLM-generated inputs, they perform privileged operations (file reads, HTTP fetches, database queries), and they run inside a server process shared across all callers. A single prompt injection or malformed argument can redirect a tool handler into code the author never intended to execute. Isolation is the architectural response: contain the blast radius so that a compromised tool call cannot escalate into process-level access. This post covers three isolation layers (V8 context sandboxing with vm.runInContext, Worker threads with their own V8 isolate and heap, and kernel-level syscall filtering with seccomp), with worked examples and a production trade-off table.

Why MCP tool execution needs a sandbox

In a normal web server, the attacker controls the request payload. In an MCP server, the attacker also controls the prompts that produce the tool arguments — and those prompts can be injected at the system level, the user level, or through tool results from other MCP servers. This gives attackers a second control plane that doesn't exist in traditional APIs.

Consider a code-execution MCP server that lets an LLM agent run JavaScript snippets to transform data. The developer validates that the input is a string, calls eval(), and returns the result. This is fine until a prompt-injected system message tells the agent to pass require('child_process').execSync('curl ...') as the code argument. The eval runs in the server process context with full access to the filesystem, environment variables, and network.

Even without explicit code-execution tools, tool handlers that:

Build file paths from tool arguments (path traversal)
Make outbound HTTP requests to user-controlled URLs (SSRF)
Interpolate arguments into shell commands (command injection)
Pass arguments to template engines (template injection)

…all benefit from execution isolation. The question isn't whether to sandbox but which layer of isolation maps to your threat model and performance budget.

The shared-process problem: Node.js runs all tool handlers in the same event loop, in the same process, with the same global scope and the same open file descriptors. A single tool call that reaches process.exit(), writes to process.env, or exhausts the heap affects every concurrent session. Without isolation, the blast radius of one bad tool call is the entire server.

Layer 1: V8 context sandbox with vm.runInContext

vm.runInContext — JavaScript-level isolation

Node's built-in vm module creates a separate V8 context: a fresh global object with its own set of built-ins (Object, Math, JSON, Promise and so on). Code evaluated inside the context cannot see the outer require, process, or global unless you explicitly pass them in. Anything you do pass in remains a host object, with a prototype chain that leads back to the host context.

This is the lightest isolation available and is appropriate for sandboxing eval-style operations where you need the JavaScript runtime but want to restrict global access.

import vm from 'node:vm';

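// Create a minimal sandbox context: only expose what the tool legitimately needs
function createSandboxContext(allowedGlobals = {}) {
  const sandbox = Object.create(null); // no prototype chain back to the host

  // A new context already has its own Math, JSON, Object, etc. Do not copy the
  // host's versions in: every host object carries a constructor chain that
  // leads back to the host's Function constructor.

  // Merge caller-supplied allowlist. Prefer primitives and JSON strings;
  // host objects and functions placed here are reachable escape routes.
  Object.assign(sandbox, allowedGlobals);

  const context = vm.createContext(sandbox);

  // Define the no-op logger inside the context so it is not a host function
  // and cannot leak server logs
  vm.runInContext('globalThis.console = { log() {} };', context);

  return context;
}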

// Execute untrusted code in the sandbox with a CPU timeout
function runSandboxed(code, context, timeoutMs = 100) {
  try {
    return vm.runInContext(code, context, {
      timeout: timeoutMs,          // terminates infinite loops
      breakOnSigint: true,         // kill on Ctrl-C during dev
      filename: 'sandbox.vm',
    });
  } catch (err) {
    if (err.message.includes('Script execution timed out')) {
      throw new Error('SANDBOX_TIMEOUT: code exceeded ' + timeoutMs + 'ms');
    }
    throw err;
  }
}

// MCP tool handler
server.tool('run_transform', { code: z.string().max(4096) }, async ({ code }) => {
  const ctx = createSandboxContext();
  const result = runSandboxed(code, ctx, 150);
  return { content: [{ type: 'text', text: String(result) }] };
});

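vm.runInContext is not a security boundary, and the Node.js documentation says so directly: the vm module is not a security mechanism and should not be used to run untrusted code. The best-known escape needs little creativity. Any host object or function placed in the sandbox (a host Math, a logging callback, or the sandbox object itself if it is an ordinary {}) carries a constructor chain back to the host's Function constructor, so an expression such as x.constructor.constructor('return process')() evaluates in the host context. If you are running fully untrusted code (user-authored scripts, not just LLM-generated arguments), use Worker threads or a subprocess instead. vm is appropriate for limiting accidental damage, not defeating intentional exploit attempts.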
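
To make the constructor-chain escape concrete, here is a minimal sketch. It is written as a CommonJS script so it runs as a single file, and the variable names are illustrative. It compares a context that exposes a host object with one that exposes none. It demonstrates one known path only; it does not show that the second context is safe against other escapes.

```javascript
const vm = require('node:vm');

// Probe: walk from Math to its realm's Function constructor, then ask that
// realm whether `process` is visible.
const probe = "Math.constructor.constructor('return typeof process')()";

// Leaky: the host's Math is copied in, so the chain ends at the host Function.
const leaky = vm.createContext({ Math });
const leakyResult = vm.runInContext(probe, leaky);

// Tighter: null-prototype sandbox with no host objects. Math here is the
// context's own built-in, so the chain stays inside the context.
const tight = vm.createContext(Object.create(null));
const tightResult = vm.runInContext(probe, tight);

console.log({ leakyResult, tightResult });
```

The only difference between the two contexts is whether a host object was handed over, which is why the allowlist passed into a sandbox deserves the same scrutiny as the code it runs.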

The timeout parameter is critical and often omitted. Without it, an infinite loop in the sandboxed code blocks the Node.js event loop — not just the sandboxed context — starving all concurrent requests. Set timeout to the maximum acceptable CPU time for a single tool invocation, typically 100–500ms for data transformation.
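
A minimal sketch of the timeout behaviour described above. The helper name runWithTimeout is illustrative. It checks the error's code property, which Node sets to ERR_SCRIPT_EXECUTION_TIMEOUT, and falls back to the message text in case that differs in your version.

```javascript
const vm = require('node:vm');

const ctx = vm.createContext(Object.create(null));

// Run code with a CPU budget and report a timeout as data instead of throwing
function runWithTimeout(code, timeoutMs) {
  try {
    return { ok: true, value: vm.runInContext(code, ctx, { timeout: timeoutMs }) };
  } catch (err) {
    const timedOut =
      err?.code === 'ERR_SCRIPT_EXECUTION_TIMEOUT' ||
      /timed out/i.test(String(err?.message));
    return { ok: false, reason: timedOut ? 'timeout' : 'error' };
  }
}

console.log(runWithTimeout('1 + 2', 50));
console.log(runWithTimeout('while (true) {}', 50));
```

Matching on the error code rather than the message text keeps the check stable if the wording of the message changes.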

What vm.runInContext does and does not isolate

Isolation target	Protected?	Notes
Access to `require()` / `import()`	Yes	Not in the sandbox context unless explicitly passed
Access to `process` / `global`	Yes	Same — not present unless added to sandbox
Access to the host context's variables	Yes	Separate V8 context
CPU exhaustion (infinite loops)	Yes (with timeout)	timeout option terminates execution
Memory exhaustion	No	Shares the V8 heap — large allocations in sandbox affect host process
Prototype chain escape	Partial	Known bypass via cross-context constructor chain
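Async operations (timers, fetch, promise callbacks)	Partial	setTimeout and fetch are absent unless added, but Promise is a built-in of every context, and callbacks that run after the script returns are not covered by timeout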
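
The async row deserves a concrete case. Promise is a built-in of every context, and by default a promise callback runs after vm.runInContext has returned, outside the timeout. The sketch below uses the microtaskMode: 'afterEvaluate' option of vm.createContext, which the Node documentation describes as running the context's microtasks right after the script and including them in the timeout. Treat it as a partial mitigation: it covers promise callbacks created inside the context, not host async functions you expose.

```javascript
const vm = require('node:vm');

// Give the context its own microtask queue, drained right after the script
// finishes and while the timeout is still armed
const ctx = vm.createContext(Object.create(null), {
  microtaskMode: 'afterEvaluate',
});

// The script itself returns at once; the infinite loop sits in a promise callback
const code = 'Promise.resolve().then(() => { while (true) {} });';

let outcome;
try {
  vm.runInContext(code, ctx, { timeout: 100 });
  outcome = 'returned';
} catch (err) {
  outcome = err?.code ?? 'error';
}

console.log(outcome);
```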

Layer 2: Worker thread isolation

Worker threads — event loop and memory boundary

Node.js Worker threads run in a separate V8 isolate with a separate heap. Shared memory requires explicit SharedArrayBuffer transfer — it is not accidental. A Worker that crashes or exhausts memory does not kill the parent process. The communication channel is the postMessage structured clone protocol, which serializes all data across the boundary.

This is the right default for MCP tool handlers that need to run external code or perform heavy computation. The overhead is higher than vm (thread startup of roughly 5–15ms, message serialization that scales with payload size) but the isolation guarantees are substantially stronger.

import { Worker, isMainThread, parentPort, workerData } from 'node:worker_threads';
import { fileURLToPath } from 'node:url';
import path from 'node:path';

const WORKER_PATH = path.join(path.dirname(fileURLToPath(import.meta.url)), 'tool-worker.js');

// Spawn a Worker for each tool call, kill it after timeout
function runInWorker(code, inputData, timeoutMs = 5000) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(WORKER_PATH, {
      workerData: { code, inputData },
      // Cap the Worker's V8 heap. resourceLimits bounds memory only; it does not
      // restrict filesystem, network, or child-process access.
      resourceLimits: {
        maxOldGenerationSizeMb: 64,
        maxYoungGenerationSizeMb: 16,
        maxCodeGenerationSizeMb: 8,
      },
    });

    const timer = setTimeout(() => {
      worker.terminate(); // hard kill
      reject(new Error('WORKER_TIMEOUT'));
    }, timeoutMs);

    worker.on('message', (result) => {
      clearTimeout(timer);
      worker.terminate();
      resolve(result);
    });

    worker.on('error', (err) => {
      clearTimeout(timer);
      reject(err);
    });

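    worker.on('exit', (code) => {
      clearTimeout(timer);
      // No-op if a message already resolved the promise; otherwise the Worker
      // ended without reporting a result, so do not leave the caller waiting
      reject(new Error('WORKER_EXIT_' + code));
    });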
  });
}

// tool-worker.js: runs inside the Worker thread, no access to main thread scope
import { parentPort, workerData } from 'node:worker_threads';
import vm from 'node:vm';

const { code, inputData } = workerData;

// Even inside the Worker, use vm to restrict global access further.
// The new context has its own Math and JSON, so none are copied in.
const sandbox = Object.create(null);
// Pass input as a JSON string (a primitive) and parse it inside the context,
// so the sandboxed code never holds an object from the Worker's own realm
sandbox.inputJson = JSON.stringify(inputData ?? null);

const ctx = vm.createContext(sandbox);

try {
  vm.runInContext('globalThis.input = JSON.parse(inputJson);', ctx);
  const result = vm.runInContext(code, ctx, { timeout: 500 });
  parentPort.postMessage({ ok: true, result });
} catch (err) {
  // Sandboxed code can throw non-Error values, so do not assume err.message
  parentPort.postMessage({ ok: false, error: String(err?.message ?? err) });
}

The key advantage over vm alone: if the Worker thread crashes — OOM, unhandled exception, or explicit process.exit() — the parent event loop is not affected. The exit event fires, the promise rejects, and the MCP server continues handling other requests.
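
A minimal sketch of that containment, using inline Worker source (eval: true) so it fits in one file. The helper name runInline is illustrative. One Worker throws, another calls process.exit(), and the parent observes both through the exit event.

```javascript
const { Worker } = require('node:worker_threads');

// Run a source string in a Worker and report how the thread ended.
// `eval: true` treats the first argument as code rather than a file path.
function runInline(source) {
  return new Promise((resolve) => {
    const worker = new Worker(source, { eval: true });
    let errorMessage = null;
    worker.on('error', (err) => { errorMessage = err.message; });
    worker.on('exit', (code) => resolve({ code, errorMessage }));
  });
}

async function demo() {
  const crashed = await runInline("throw new Error('boom')");
  const exited = await runInline('process.exit(3)');
  // Reaching this line shows the parent survived both
  return { crashed, exited };
}

demo().then((results) => console.log(results));
```

Inside a Worker, process.exit() ends the thread rather than the process, which is why the parent sees an exit code instead of going down with it.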

Resource limits in Worker threads

The resourceLimits option added in Node 12.16 lets you cap the Worker's V8 heap. This is the most practical per-call memory bound available without running a subprocess:

const worker = new Worker(WORKER_PATH, {
  workerData: { ... },
  resourceLimits: {
    maxOldGenerationSizeMb: 64,    // old-gen heap cap: limits accumulation of long-lived JS objects
    maxYoungGenerationSizeMb: 16,  // young-gen (allocation heavy) cap
    maxCodeGenerationSizeMb: 8,    // JIT code cache cap
    stackSizeMb: 4,                // call stack cap — limits deep recursion
  },
});

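When the Worker exceeds the old-gen limit, V8 reports an out-of-memory condition inside the Worker and Node terminates it with an ERR_WORKER_OUT_OF_MEMORY error; the parent process is not affected. These limits apply to the JavaScript heap only. Memory held outside it, including ArrayBuffer and Buffer contents, is not counted, so a Worker that allocates a 2 GB buffer can still exhaust process memory even with resourceLimits set. Bound input and buffer sizes separately, or use a subprocess or container memory limit where that matters.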
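
A minimal sketch of the heap cap in action. The helper name runCapped and the 16 MB figure are illustrative. The Worker allocates JavaScript objects in a loop until it reaches the cap, and the parent reads the resulting error code.

```javascript
const { Worker } = require('node:worker_threads');

// Worker source that keeps allocating JS objects until the heap cap is hit
const hungry = `
  const junk = [];
  while (true) junk.push([junk]);
`;

// Run source in a Worker with a capped old-generation heap and report the outcome
function runCapped(source, maxOldGenerationSizeMb) {
  return new Promise((resolve) => {
    const worker = new Worker(source, {
      eval: true,
      resourceLimits: { maxOldGenerationSizeMb },
    });
    let errorCode = null;
    worker.on('error', (err) => { errorCode = err.code; });
    worker.on('exit', (exitCode) => resolve({ exitCode, errorCode }));
  });
}

runCapped(hungry, 16).then((result) => console.log(result));
```

The loop allocates ordinary arrays on purpose. Swapping in large Buffer allocations would not trip the cap, because that memory lives outside the V8 heap.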

Worker pool for latency-sensitive servers

Spawning a fresh Worker for every tool call adds 5–15ms of startup latency. For tools called frequently, maintain a pool:

class WorkerPool {
  #workers = [];
  #queue = [];
  #size;

  constructor(poolSize = 4) {
    this.#size = poolSize;
    for (let i = 0; i < poolSize; i++) {
      this.#workers.push({ worker: this.#spawn(), busy: false });
    }
  }

  #spawn() {
    const w = new Worker(WORKER_PATH, {
      resourceLimits: { maxOldGenerationSizeMb: 64, stackSizeMb: 4 },
    });
    w.on('error', () => { /* respawn */ });
    return w;
  }

  run(data, timeoutMs = 5000) {
    return new Promise((resolve, reject) => {
      const slot = this.#workers.find(s => !s.busy);
      if (slot) {
        this.#dispatch(slot, data, timeoutMs, resolve, reject);
      } else {
        this.#queue.push({ data, timeoutMs, resolve, reject });
      }
    });
  }

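  // Note: pooled workers receive jobs via postMessage, so the worker script must
  // listen with parentPort.on('message') instead of reading workerData once.
  #dispatch(slot, data, timeoutMs, resolve, reject) {
    slot.busy = true;
    const worker = slot.worker;

    // Settle exactly once, free the slot, then hand it to the next queued job.
    // A worker that timed out or errored is replaced, never reused.
    const finish = (settle, value, replace) => {
      clearTimeout(timer);
      worker.off('message', onMessage);
      worker.off('error', onError);
      if (replace) {
        worker.terminate();
        slot.worker = this.#spawn();
      }
      slot.busy = false;
      this.#drain();
      settle(value);
    };
    const onMessage = (msg) => finish(resolve, msg, false);
    const onError = (err) => finish(reject, err, true);
    const timer = setTimeout(
      () => finish(reject, new Error('WORKER_TIMEOUT'), true),
      timeoutMs,
    );

    worker.once('message', onMessage);
    worker.once('error', onError);
    worker.postMessage(data);
  }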

  #drain() {
    if (this.#queue.length > 0) {
      const next = this.#queue.shift();
      const slot = this.#workers.find(s => !s.busy);
      if (slot) this.#dispatch(slot, next.data, next.timeoutMs, next.resolve, next.reject);
    }
  }
}
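
The pool sends jobs with postMessage, so the worker side has to be a long-lived message loop. The sketch below shows one possible shape, using inline Worker source so it fits in a single file; the message fields and the runJob helper are assumptions, not part of the pool above. It builds a fresh vm context per job, which keeps globals from one call out of the next. Module-level state in the worker script itself would still persist across jobs.

```javascript
const { Worker } = require('node:worker_threads');

// Worker side of a pool: stays alive, handles one message per job, and builds
// a fresh vm context each time so globals from one job are gone in the next
const pooledWorkerSource = `
  const { parentPort } = require('node:worker_threads');
  const vm = require('node:vm');
  parentPort.on('message', ({ code }) => {
    try {
      const ctx = vm.createContext(Object.create(null));
      const result = vm.runInContext(code, ctx, { timeout: 500 });
      parentPort.postMessage({ ok: true, result });
    } catch (err) {
      parentPort.postMessage({ ok: false, error: String(err && err.message) });
    }
  });
`;

// Send one job and wait for its reply (one job in flight per worker)
function runJob(worker, code) {
  return new Promise((resolve, reject) => {
    const onMessage = (msg) => { worker.off('error', onError); resolve(msg); };
    const onError = (err) => { worker.off('message', onMessage); reject(err); };
    worker.once('message', onMessage);
    worker.once('error', onError);
    worker.postMessage({ code });
  });
}

async function demo() {
  const worker = new Worker(pooledWorkerSource, { eval: true });
  const first = await runJob(worker, 'globalThis.leak = 42; leak');
  const second = await runJob(worker, 'typeof leak');
  await worker.terminate();
  return { first, second };
}

demo().then((results) => console.log(results));
```

Reusing a Worker trades some isolation for latency: a job that escapes the vm context can tamper with the Worker's own state and affect later jobs, so high-risk tools are better served by one Worker or subprocess per call.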

Layer 3: Kernel-level isolation with seccomp and Linux namespaces

seccomp + namespaces — syscall and resource containment

Worker threads share the same OS process and therefore have access to the same file descriptors, the same network stack, and the same set of system calls. A Worker can call fs.readFileSync('/etc/shadow') unless you explicitly restrict filesystem access at a higher level. For the strongest isolation, run tool handlers in a child process constrained by Linux security primitives.

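The practical choice for Node.js servers is to run the tool execution worker as a subprocess, restricted either by Node's permission model (the --experimental-permission flag in Node 22) or by a seccomp-bpf filter that a container runtime or another external process launcher applies. The permission flag does not install a seccomp filter; the two are separate mechanisms.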

Node.js 22+ Permission Model

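Node's permission model lets you restrict a Node process to specific filesystem paths and block child-process and Worker creation at runtime. The flag is --experimental-permission in the Node 22 releases used here and is named --permission in newer releases, so check the documentation for your version. The restrictions are enforced by the Node runtime, not by the kernel, and the Node documentation describes the model as a seat belt for trusted code rather than a guarantee against malicious code. Applied to a spawned subprocess running tool code, it gives you path allowlisting without a full container: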

import { spawn } from 'node:child_process';

function runToolSubprocess(toolCode, input, timeoutMs = 10000) {
  return new Promise((resolve, reject) => {
    // Spawn a child Node process with permission restrictions
    const child = spawn(process.execPath, [
      '--experimental-permission',
      '--allow-fs-read=/tmp/tool-sandbox',   // only this directory
      '--allow-fs-write=/tmp/tool-sandbox',
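      // No --allow-child-process = cannot spawn further children
      // Network: do not rely on a missing flag here. In Node 22 the permission
      // model does not cover sockets, so block egress at the OS or container level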
      '--input-type=module',
    ], {
      stdio: ['pipe', 'pipe', 'pipe'],
      env: {
        // Stripped environment — no secrets, no AWS credentials
        PATH: '/usr/local/bin:/usr/bin:/bin',
        NODE_ENV: 'production',
        SANDBOX_INPUT: JSON.stringify(input),
      },
    });

    let stdout = '';
    let stderr = '';
    child.stdout.on('data', d => stdout += d);
    child.stderr.on('data', d => stderr += d);

    const timer = setTimeout(() => {
      child.kill('SIGKILL');
      reject(new Error('SUBPROCESS_TIMEOUT'));
    }, timeoutMs);

    child.on('close', (code) => {
      clearTimeout(timer);
      if (code === 0) {
        try { resolve(JSON.parse(stdout)); }
        catch { reject(new Error('SUBPROCESS_PARSE_ERROR')); }
      } else {
        reject(new Error('SUBPROCESS_EXIT_' + code));
      }
    });

    // Write the tool code to stdin
    child.stdin.write(toolCode);
    child.stdin.end();
  });
}
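
The stripped environment is the part of this pattern that is easiest to verify in isolation. A minimal sketch, independent of the permission flags: the helper name listChildEnv and the DEMO_API_TOKEN variable are hypothetical stand-ins for real credentials.

```javascript
const { execFile } = require('node:child_process');

// Spawn a child Node process with an explicit, minimal environment and ask it
// which variable names it can see
function listChildEnv(env) {
  return new Promise((resolve, reject) => {
    execFile(
      process.execPath,
      ['-e', 'process.stdout.write(JSON.stringify(Object.keys(process.env)))'],
      { env, timeout: 10000 },
      (err, stdout) => (err ? reject(err) : resolve(JSON.parse(stdout))),
    );
  });
}

// Hypothetical secret in the parent, standing in for real credentials
process.env.DEMO_API_TOKEN = 'not-a-real-secret';

listChildEnv({ PATH: '/usr/local/bin:/usr/bin:/bin' }).then((names) => {
  console.log('child sees DEMO_API_TOKEN:', names.includes('DEMO_API_TOKEN'));
});
```

Passing an explicit env object replaces the inherited environment instead of extending it, which is why the parent's variable does not reach the child.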

seccomp-bpf via Docker

For production deployments, the cleanest approach is to run the MCP server process itself in a container with a restrictive seccomp profile. Docker ships a default seccomp profile that blocks ~44 syscalls; for MCP tool execution you can go further:

{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": ["SCMP_ARCH_X86_64"],
  "syscalls": [
    {
      "names": [
        "read", "write", "close", "fstat", "lseek", "mmap", "mprotect",
        "munmap", "brk", "pread64", "readv", "writev",
        "access", "openat", "getdents64", "stat", "lstat",
        "nanosleep", "clock_gettime", "gettimeofday",
        "getpid", "getppid", "getuid", "geteuid", "getgid", "getegid",
        "futex", "clone", "wait4", "exit", "exit_group",
        "socket", "connect", "recv", "recvfrom", "send", "sendto",
        "poll", "epoll_create1", "epoll_ctl", "epoll_wait",
        "rt_sigaction", "rt_sigprocmask", "rt_sigreturn", "sigaltstack"
      ],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}

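With a profile like this applied via docker run --security-opt seccomp=mcp-tool.json, any syscall that is not on the allowlist fails with an error: ptrace (no process inspection) and mount (no filesystem mounting) are blocked, along with everything else the list omits. Treat the list as a starting point rather than a working profile. Node.js needs more syscalls than these to start, and because the container runtime applies the filter before it executes the entrypoint, a profile that omits execve typically prevents the container from starting at all. Build the real allowlist by tracing the syscalls your server makes, and block subprocess spawning at another layer if execve has to stay allowed. With a tuned profile in place, even if a tool handler has an exploitable code path, the kernel blocks the syscalls needed to escalate.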

Comparison: three isolation layers

Isolation layer	Heap isolation	Filesystem isolation	Network isolation	Can sandboxed code spawn subprocesses?	Overhead	Complexity
vm.runInContext	No (shared heap)	No	No	Not through the context itself; yes after an escape	~0ms	Low
Worker thread	Yes (separate isolate)	No (same process FDs)	No	Yes (unless blocked)	5–15ms startup	Medium
Worker + resourceLimits	Yes (capped JS heap)	No	No	Yes	5–15ms startup	Medium
Child subprocess	Yes (separate process)	Partial (inherited FDs)	Partial (inherited sockets)	Blockable	15–50ms startup	Medium-High
Node 22 permission model	Yes	Yes (path allowlist, enforced by Node)	Not in Node 22; check your version	Blockable	15–50ms startup	Medium-High
Container + seccomp	Yes	Yes (bind mounts)	Yes (network policy)	Blockable	Container startup	High

Choosing the right layer for your threat model

The three layers are not mutually exclusive — they compose. A production configuration that handles real user-shaped LLM inputs should layer at least two:

Low-risk transformations (pure data manipulation, no I/O): vm.runInContext with a 100ms CPU timeout is sufficient. Adds ~0ms overhead. Use this for formula evaluation, template rendering, JSON transformation tools.
Medium-risk execution (I/O but well-defined scope): Worker thread with resourceLimits + vm inside the Worker. Adds 10–20ms. Use this for code linters, formatters, test runners operating on passed-in code strings.
High-risk execution (user-authored code, arbitrary filesystem/network): Child subprocess with Node 22 --permission flag, or a container with seccomp profile. Adds 20–100ms. Use this for any tool that compiles or runs user-provided scripts.

Always layer vm inside Worker. Even when you're already using Worker threads for isolation, add vm.runInContext inside the Worker for the eval path. The Worker protects the main thread from Worker crashes; vm restricts the eval code from the Worker's own globals. This is defense in depth: an attacker has to defeat both layers to reach the main thread's state.

Common sandbox misconfigurations

Passing require or process into the sandbox

// WRONG — this defeats the entire purpose of the sandbox
const sandbox = {
  require, // full module system access
  process,  // full process access including process.exit()
};
vm.runInContext(code, vm.createContext(sandbox));

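// RIGHT: start from a null-prototype object and pass only plain data
const sandbox = Object.assign(Object.create(null), {
  // A JSON string the sandboxed code can parse; the context already has
  // its own Math and JSON, so the host's are not copied in
  inputJson: JSON.stringify(sanitizedInput),
});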

Forgetting to set a timeout

// WRONG — infinite loop blocks the event loop forever
vm.runInContext('while(true) {}', ctx);

// RIGHT
vm.runInContext('while(true) {}', ctx, { timeout: 100 }); // throws after 100ms

Accepting non-string code in the tool call

// WRONG: any value type is accepted and passed straight to the vm call,
// with no length cap and no timeout
server.tool('run', { code: z.any() }, async ({ code }) => {
  vm.runInContext(code, ctx);
});

// RIGHT: require a string, cap its length, set a timeout. The pattern check is
// a cheap tripwire for logging, not a boundary: it is easy to evade and it
// rejects harmless code that merely contains one of these words.
server.tool('run', { code: z.string().max(8192) }, async ({ code }) => {
  if (/require|import|process|global|Buffer|__proto__/.test(code)) {
    throw new Error('forbidden_pattern');
  }
  vm.runInContext(code, ctx, { timeout: 150 });
});
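
The limits of pattern matching are easy to see by running the blocklist from the handler above against two strings. This sketch only evaluates the regular expression; the sample strings are illustrative and nothing here executes them.

```javascript
// The blocklist pattern from the handler above
const forbidden = /require|import|process|global|Buffer|__proto__/;

// Evasion: the word "constructor" is not on the list, and any listed word can
// be assembled from pieces at runtime
const evasive = "this['con' + 'structor']['con' + 'structor']('return pro' + 'cess')()";

// False positive: harmless code that happens to contain a listed substring
const harmless = 'const important = input.total * 2; important';

console.log({
  evasiveBlocked: forbidden.test(evasive),
  harmlessBlocked: forbidden.test(harmless),
});
```

The evasive string passes while the harmless one is rejected, so a pattern hit is better treated as a signal to log than as the control that keeps code contained.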

Not terminating the Worker on timeout

// WRONG — timer fires but Worker keeps running, consuming resources
const timer = setTimeout(() => {
  reject(new Error('timeout')); // promise rejects but Worker is still alive
}, 5000);

// RIGHT — hard kill the Worker
const timer = setTimeout(() => {
  worker.terminate(); // terminates the V8 isolate
  reject(new Error('WORKER_TIMEOUT'));
}, 5000);
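
A minimal sketch of the hard kill, using inline Worker source so it runs as one file. The helper name runWithDeadline is illustrative. The Worker spins forever, and the parent confirms through the exit event that terminate() actually stopped it.

```javascript
const { Worker } = require('node:worker_threads');

// Start a Worker that never finishes, then hard-kill it after a deadline
function runWithDeadline(source, timeoutMs) {
  return new Promise((resolve) => {
    const worker = new Worker(source, { eval: true });
    let timedOut = false;

    const timer = setTimeout(() => {
      timedOut = true;
      worker.terminate(); // stops the thread even mid-loop
    }, timeoutMs);

    worker.on('exit', (exitCode) => {
      clearTimeout(timer);
      resolve({ timedOut, exitCode });
    });
  });
}

runWithDeadline('while (true) {}', 100).then((r) => console.log(r));
```

Waiting for the exit event before resolving is what shows the thread is gone. Rejecting the promise alone says nothing about whether the Worker is still running.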

SkillAudit grade impact: sandbox findings

Critical Tool handler calls eval() or new Function() with LLM-controlled input and no sandboxing — direct process-level code execution. Score penalty: −25 points.

Critical vm.runInContext used but require or process passed into sandbox context — sandbox is fully bypassed. Score penalty: −20 points.

High No CPU timeout on vm.runInContext — event loop starvation via infinite loop. Score penalty: −12 points.

High Worker thread used for execution isolation but no resourceLimits — heap exhaustion in Worker kills parent process. Score penalty: −10 points.

High Worker thread not terminated on timeout — timed-out Workers accumulate, exhausting the thread pool. Score penalty: −8 points.

Medium Input not coerced to string before vm evaluation — prototype pollution risk. Score penalty: −5 points.

Medium Sandbox context includes fetch or XMLHttpRequest — sandboxed code can make outbound network requests. Score penalty: −5 points.

The production checklist

Identify eval surfaces. Grep your tool handlers for eval, new Function, vm.run, exec, execSync, spawn. Each is a sandbox candidate.
Choose the isolation layer. Use the threat model table above. Default to Worker + vm if unsure.
Set a CPU timeout. On vm: the timeout option. On Worker: a setTimeout with worker.terminate().
Set memory limits on Workers. resourceLimits.maxOldGenerationSizeMb — 64 MB is a reasonable default for most transform tools.
Strip the sandbox context. Start from Object.create(null). Add only what the code needs. No require, process, global, fetch.
Validate inputs before sandboxing. Reject inputs containing forbidden patterns before they reach the sandbox. The sandbox is not a substitute for input validation — it's a containment layer for when validation is incomplete.
Log sandbox violations. Timeout kills, OOM exits, and forbidden-pattern rejections should appear in your security log. A cluster of sandbox violations in a session is a prompt injection signal.
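
The last checklist item can be sketched as a small per-session counter. Everything here is hypothetical: the ViolationTracker class, the event field names, the one-minute window, and the threshold of three are illustrative choices, not SkillAudit behaviour.

```javascript
// Hypothetical in-memory tracker: counts sandbox violations per session inside
// a sliding time window and flags sessions that cross a threshold
class ViolationTracker {
  #events = new Map(); // sessionId -> array of timestamps (ms)

  constructor({ windowMs = 60_000, threshold = 3 } = {}) {
    this.windowMs = windowMs;
    this.threshold = threshold;
  }

  // Record one violation; returns true when the session looks suspicious
  record(sessionId, kind, now = Date.now()) {
    const recent = (this.#events.get(sessionId) ?? [])
      .filter((t) => now - t < this.windowMs);
    recent.push(now);
    this.#events.set(sessionId, recent);
    console.warn(JSON.stringify({ event: 'sandbox_violation', sessionId, kind, at: now }));
    return recent.length >= this.threshold;
  }
}

const tracker = new ViolationTracker({ windowMs: 60_000, threshold: 3 });
tracker.record('session-a', 'SANDBOX_TIMEOUT', 1_000);
tracker.record('session-a', 'forbidden_pattern', 2_000);
const flagged = tracker.record('session-a', 'WORKER_EXIT_1', 3_000);
console.log({ flagged });
```

A production version would write to the server's security log and evict idle sessions, but the shape is the same: one record per violation, keyed by session, with a threshold that turns a cluster into an alert.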

Run the SkillAudit scanner on your MCP server to get a grade report that flags eval surfaces, missing sandbox timeouts, and context configuration problems automatically.