MCP Server Runtime Sandbox Design: Isolating Tool Execution with vm.runInContext, Worker Threads, and Seccomp
MCP tool handlers are unusual software: they run code paths shaped by LLM-generated inputs, they perform privileged operations (file reads, HTTP fetches, database queries), and they run inside a server process shared across all callers. A single prompt injection or malformed argument can redirect a tool handler into code the author never intended to execute. Isolation is the architectural response: contain the blast radius so that a compromised tool call cannot escalate into process-level access. This post covers three isolation layers (V8 context sandboxing with vm.runInContext, Worker thread isolate boundaries, and kernel-level syscall filtering with seccomp), with worked examples and a production trade-off table.
Why MCP tool execution needs a sandbox
In a normal web server, the attacker controls the request payload. In an MCP server, the attacker also controls the prompts that produce the tool arguments — and those prompts can be injected at the system level, the user level, or through tool results from other MCP servers. This gives attackers a second control plane that doesn't exist in traditional APIs.
Consider a code-execution MCP server that lets an LLM agent run JavaScript snippets to transform data. The developer validates that the input is a string, calls eval(), and returns the result. This is fine until a prompt-injected system message tells the agent to pass require('child_process').execSync('curl ...') as the code argument. The eval runs in the server process context with full access to the filesystem, environment variables, and network.
Even without explicit code-execution tools, tool handlers that:
- Build file paths from tool arguments (path traversal)
- Make outbound HTTP requests to user-controlled URLs (SSRF)
- Interpolate arguments into shell commands (command injection)
- Pass arguments to template engines (template injection)
…all benefit from execution isolation. The question isn't whether to sandbox but which layer of isolation maps to your threat model and performance budget.
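As an illustration of the first bullet, here is a minimal sketch of a path containment check that a handler could run before touching the filesystem. The helper name resolveInside and the /srv/data base directory are hypothetical, and the check is lexical only: it does not resolve symlinks, so it complements isolation rather than replacing it.

```javascript
const path = require('node:path');

// Hypothetical helper: resolve a tool-supplied path against baseDir and
// refuse anything that lands outside it (lexical check, symlinks not resolved).
function resolveInside(baseDir, userPath) {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, userPath);
  const rel = path.relative(base, target);
  const escapes = rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error('PATH_OUTSIDE_SANDBOX');
  }
  return target;
}

// Usage: a traversal attempt is rejected before any file is opened.
let traversalBlocked = false;
try {
  resolveInside('/srv/data', '../../etc/passwd');
} catch (err) {
  traversalBlocked = err.message === 'PATH_OUTSIDE_SANDBOX';
}
```

A check like this narrows what a handler will attempt; the isolation layers below limit what happens when a check is missing or wrong.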
The shared-process problem: Node.js runs all tool handlers in the same event loop, in the same process, with the same global scope and the same open file descriptors. A single tool call that reaches process.exit(), writes to process.env, or exhausts the heap affects every concurrent session. Without isolation, the blast radius of one bad tool call is the entire server.
Layer 1: V8 context sandbox with vm.runInContext
vm.runInContext — JavaScript-level isolation
Node's built-in vm module creates a separate V8 context: a fresh global object with its own set of built-ins (Object, Math, JSON, Promise) and no access to the host's globals. Code evaluated inside the context cannot name the outer require, process, or global unless you explicitly pass them in, although any host object you do pass in brings its host prototype chain along with it.
This is the lightest isolation available and is appropriate for sandboxing eval-style operations where you need the JavaScript runtime but want to restrict global access.
import vm from 'node:vm';
// Create a minimal sandbox context — only expose what the tool legitimately needs
function createSandboxContext(allowedGlobals = {}) {
const sandbox = Object.create(null); // no prototype chain
// Explicitly grant only what the tool handler needs
sandbox.Math = Math;
sandbox.JSON = JSON;
sandbox.console = { log: () => {} }; // no-op logger — don't leak server logs
// Merge caller-supplied allowlist
Object.assign(sandbox, allowedGlobals);
return vm.createContext(sandbox);
}
// Execute untrusted code in the sandbox with a CPU timeout
function runSandboxed(code, context, timeoutMs = 100) {
try {
return vm.runInContext(code, context, {
timeout: timeoutMs, // terminates infinite loops
breakOnSigint: true, // kill on Ctrl-C during dev
filename: 'sandbox.vm',
});
} catch (err) {
// Match on the stable error code rather than the message text
if (err.code === 'ERR_SCRIPT_EXECUTION_TIMEOUT') {
throw new Error('SANDBOX_TIMEOUT: code exceeded ' + timeoutMs + 'ms');
}
throw err;
}
}
// MCP tool handler
server.tool('run_transform', { code: z.string().max(4096) }, async ({ code }) => {
const ctx = createSandboxContext();
const result = runSandboxed(code, ctx, 150);
return { content: [{ type: 'text', text: String(result) }] };
});
vm.runInContext is not a security boundary against determined attackers. Any host object placed in the sandbox, including the Math and JSON references above, carries the host realm's constructor chain, so sandboxed code can reach the host Function constructor with an expression like Math.constructor.constructor('return process')(). If you are running fully untrusted code (user-authored scripts, not just LLM-generated arguments), move it into a Worker thread or, better, a restricted subprocess or container. vm is appropriate for limiting accidental damage, not defeating intentional exploit attempts.
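The constructor-chain escape is easy to reproduce, and one way to narrow it is to pass only primitives across the boundary and rely on the context's own built-ins. The sketch below assumes that approach; createBareContext and rawInput are hypothetical names, and the result is still a damage limiter rather than a security boundary.

```javascript
const vm = require('node:vm');

// A host object in the sandbox exposes the host realm's Function constructor.
const leaky = vm.createContext(Object.assign(Object.create(null), { Math }));
const escaped = vm.runInContext(
  "Math.constructor.constructor('return typeof process')()",
  leaky,
  { timeout: 100 }
);
// escaped is 'object': the sandboxed code reached the host's process global.

// Hypothetical alternative: hand over a JSON string and parse it inside the
// context, so every object the code sees belongs to the sandbox realm.
function createBareContext(input) {
  const sandbox = Object.create(null);
  sandbox.rawInput = JSON.stringify(input ?? null); // primitives carry no host prototype
  const ctx = vm.createContext(sandbox);
  vm.runInContext('globalThis.input = JSON.parse(rawInput); delete globalThis.rawInput;', ctx);
  return ctx;
}

const bare = createBareContext({ n: 4 });
// Math and JSON here are the context's own built-ins; nothing was passed in.
const sum = vm.runInContext('Math.max(input.n, 2) + JSON.stringify(input).length', bare, { timeout: 100 });
const probe = vm.runInContext(
  "Math.constructor.constructor('return typeof process')()",
  bare,
  { timeout: 100 }
);
// probe is 'undefined': the same expression now stays inside the sandbox realm.
```

This closes the specific route shown here, not every route, which is why the stronger layers below still matter for hostile code.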
The timeout parameter is critical and often omitted. Without it, an infinite loop in the sandboxed code blocks the Node.js event loop — not just the sandboxed context — starving all concurrent requests. Set timeout to the maximum acceptable CPU time for a single tool invocation, typically 100–500ms for data transformation.
What vm.runInContext does and does not isolate
| Isolation target | Protected? | Notes |
|---|---|---|
| Access to require() / import() | Yes | Not in the sandbox context unless explicitly passed |
| Access to process / global | Yes | Same: not present unless added to sandbox |
| Access to the host context's variables | Yes | Separate V8 context |
| CPU exhaustion (infinite loops) | Yes (with timeout) | timeout option terminates execution |
| Memory exhaustion | No | Shares the V8 heap — large allocations in sandbox affect host process |
| Prototype chain escape | Partial | Known bypass via cross-context constructor chain |
| Async operations (setTimeout, fetch) | No | Timers and fetch are absent unless passed in, but Promise is a built-in of every context, and promise callbacks can run after the timeout has stopped watching |
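The last row deserves a concrete illustration. By default, a promise callback scheduled by sandboxed code runs after vm.runInContext has returned, so the timeout no longer applies to it. Node's vm module offers a microtaskMode: 'afterEvaluate' context option that drains the context's microtasks before returning, which brings them back under the timeout. A minimal sketch, with runWithMicrotaskTimeout as a hypothetical helper name:

```javascript
const vm = require('node:vm');

// Hypothetical helper: each call gets a fresh context whose microtask queue
// is drained inside runInContext, so the timeout also covers promise callbacks.
function runWithMicrotaskTimeout(code, timeoutMs = 100) {
  const ctx = vm.createContext(Object.create(null), { microtaskMode: 'afterEvaluate' });
  return vm.runInContext(code, ctx, { timeout: timeoutMs });
}

let outcome;
try {
  // The loop is scheduled as a microtask rather than run synchronously.
  runWithMicrotaskTimeout('Promise.resolve().then(() => { while (true) {} }); 1', 50);
  outcome = 'returned';
} catch (err) {
  outcome = err.code;
}
```

This only covers microtasks created inside the context. Timers or other async host APIs that you pass in remain outside the timeout's reach.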
Layer 2: Worker thread isolation
Worker threads — event loop and memory boundary
Node.js Worker threads run in a separate V8 isolate with a separate heap. Shared memory requires explicit SharedArrayBuffer transfer — it is not accidental. A Worker that crashes or exhausts memory does not kill the parent process. The communication channel is the postMessage structured clone protocol, which serializes all data across the boundary.
This is the right default for MCP tool handlers that need to run external code or perform heavy computation. The overhead is higher than vm (thread startup ~5ms, message serialization scales with payload size) but the isolation guarantees are substantially stronger.
import { Worker } from 'node:worker_threads';
import { fileURLToPath } from 'node:url';
import path from 'node:path';
const WORKER_PATH = path.join(path.dirname(fileURLToPath(import.meta.url)), 'tool-worker.js');
// Spawn a Worker for each tool call, kill it after timeout
function runInWorker(code, inputData, timeoutMs = 5000) {
return new Promise((resolve, reject) => {
const worker = new Worker(WORKER_PATH, {
workerData: { code, inputData },
// Cap the Worker's V8 heap and generated-code range. resourceLimits bounds memory only;
// it does not stop the Worker from spawning child processes or using the network.
resourceLimits: {
maxOldGenerationSizeMb: 64,
maxYoungGenerationSizeMb: 16,
codeRangeSizeMb: 8,
},
});
const timer = setTimeout(() => {
worker.terminate(); // hard kill
reject(new Error('WORKER_TIMEOUT'));
}, timeoutMs);
worker.on('message', (result) => {
clearTimeout(timer);
worker.terminate();
resolve(result);
});
worker.on('error', (err) => {
clearTimeout(timer);
reject(err);
});
worker.on('exit', (code) => {
clearTimeout(timer);
// No-op if the promise has already settled; otherwise the Worker exited without replying
reject(new Error('WORKER_EXIT_' + code));
});
});
}
// tool-worker.js — runs inside the Worker thread, no access to main process scope
import { parentPort, workerData } from 'node:worker_threads';
import vm from 'node:vm';
const { code, inputData } = workerData;
// Even inside the Worker, use vm to restrict global access further
const sandbox = Object.create(null);
sandbox.input = inputData; // only the structured-clone-safe input data
sandbox.Math = Math;
sandbox.JSON = JSON;
const ctx = vm.createContext(sandbox);
try {
const result = vm.runInContext(code, ctx, { timeout: 500 });
parentPort.postMessage({ ok: true, result });
} catch (err) {
parentPort.postMessage({ ok: false, error: err.message });
}
The key advantage over vm alone: if the Worker thread crashes — OOM, unhandled exception, or explicit process.exit() — the parent event loop is not affected. The exit event fires, the promise rejects, and the MCP server continues handling other requests.
Resource limits in Worker threads
The resourceLimits option, added in Node 12.16, lets you cap the Worker's V8 heap. This is the most practical per-call memory bound available without running a subprocess:
const worker = new Worker(WORKER_PATH, {
workerData: { ... },
resourceLimits: {
maxOldGenerationSizeMb: 64, // old-gen heap cap, limits long-lived object accumulation
maxYoungGenerationSizeMb: 16, // young-gen cap for short-lived allocations
codeRangeSizeMb: 8, // pre-allocated range for generated (JIT) code
stackSizeMb: 4, // call stack cap — limits deep recursion
},
});
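When the Worker exceeds the old-gen limit, V8 raises an out-of-memory error inside the Worker and the Worker terminates; the parent receives an 'error' event with code ERR_WORKER_OUT_OF_MEMORY and keeps running. Without a cap, a runaway allocation in one Worker competes with the whole Node process for memory. Note that these limits cover the V8-managed JavaScript heap; Buffer and ArrayBuffer backing stores are generally allocated outside it, so large binary allocations need a separate guard such as an input size cap.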
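To see the limit in action, the following sketch starts a Worker whose only job is to allocate until it hits the cap, then reports what the parent observes. It uses the eval: true Worker option so the example fits in one file; runAllocatingWorker and the 16 MB figure are illustrative assumptions, not recommendations.

```javascript
const { Worker } = require('node:worker_threads');

// Hypothetical demo: the Worker source allocates ordinary JS arrays forever,
// so the V8 heap cap is what stops it.
function runAllocatingWorker(limitMb = 16) {
  return new Promise((resolve) => {
    const worker = new Worker(
      'const hoard = []; while (true) hoard.push(new Array(1000).fill(0));',
      { eval: true, resourceLimits: { maxOldGenerationSizeMb: limitMb } }
    );
    let errorCode = null;
    worker.on('error', (err) => { errorCode = err.code; });
    // 'exit' fires after 'error', so both values are known by then.
    worker.on('exit', (exitCode) => resolve({ errorCode, exitCode }));
  });
}

runAllocatingWorker(16).then(({ errorCode, exitCode }) => {
  console.log('worker stopped:', errorCode, 'exit code', exitCode);
  console.log('parent is still running');
});
```

In a real handler you would map this error code to a tool error and log it as a sandbox violation.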
Worker pool for latency-sensitive servers
Spawning a fresh Worker for every tool call adds 5–15ms of startup latency. For tools called frequently, maintain a pool:
class WorkerPool {
#workers = [];
#queue = [];
#size;
constructor(poolSize = 4) {
this.#size = poolSize;
for (let i = 0; i < poolSize; i++) {
this.#workers.push({ worker: this.#spawn(), busy: false });
}
}
#spawn() {
const w = new Worker(WORKER_PATH, {
resourceLimits: { maxOldGenerationSizeMb: 64, stackSizeMb: 4 },
});
// Keep a listener attached so an idle Worker's 'error' event does not crash the parent
w.on('error', () => {});
return w;
}
run(data, timeoutMs = 5000) {
return new Promise((resolve, reject) => {
const slot = this.#workers.find(s => !s.busy);
if (slot) {
this.#dispatch(slot, data, timeoutMs, resolve, reject);
} else {
this.#queue.push({ data, timeoutMs, resolve, reject });
}
});
}
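#dispatch(slot, data, timeoutMs, resolve, reject) {
slot.busy = true;
const worker = slot.worker;
// On timeout or crash: discard this Worker, respawn, free the slot, and reject the caller
const fail = (err) => {
clearTimeout(timer);
worker.removeAllListeners('message');
worker.terminate();
slot.worker = this.#spawn();
slot.busy = false;
this.#drain();
reject(err);
};
const timer = setTimeout(() => fail(new Error('WORKER_TIMEOUT')), timeoutMs);
worker.once('error', fail);
worker.once('message', (msg) => {
clearTimeout(timer);
worker.off('error', fail);
slot.busy = false;
this.#drain();
resolve(msg);
});
worker.postMessage(data);
}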
#drain() {
if (this.#queue.length > 0) {
const next = this.#queue.shift();
const slot = this.#workers.find(s => !s.busy);
if (slot) this.#dispatch(slot, next.data, next.timeoutMs, next.resolve, next.reject);
}
}
}
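A pooled Worker receives jobs through postMessage rather than workerData, and it stays alive between calls, so state left behind by one job is visible to the next unless each job gets a fresh context. The sketch below shows one possible worker entry script for such a pool. The { code, inputData } message shape and the handleJob name are assumptions; the pool above does not prescribe them.

```javascript
const { parentPort, isMainThread } = require('node:worker_threads');
const vm = require('node:vm');

// Run one job in a brand-new context so nothing leaks between pooled calls.
function handleJob({ code, inputData }, timeoutMs = 500) {
  try {
    const sandbox = Object.create(null);
    // Pass the input as a string and parse it with the context's own JSON.
    sandbox.rawInput = JSON.stringify(inputData ?? null);
    const ctx = vm.createContext(sandbox);
    vm.runInContext('globalThis.input = JSON.parse(rawInput); delete globalThis.rawInput;', ctx);
    const result = vm.runInContext(code, ctx, { timeout: timeoutMs });
    return { ok: true, result };
  } catch (err) {
    return { ok: false, error: String(err && err.message) };
  }
}

// Only wire up the message loop when running as a Worker.
if (!isMainThread) {
  parentPort.on('message', (job) => {
    const reply = handleJob(job);
    try {
      parentPort.postMessage(reply);
    } catch {
      // The result could not be structured-cloned (a function, for example).
      parentPort.postMessage({ ok: false, error: 'RESULT_NOT_CLONEABLE' });
    }
  });
}
```

A fresh context per job costs a little per call but keeps the pool from becoming a shared scratchpad. Stronger separation between callers would mean recycling the Worker itself after each job.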
Layer 3: Kernel-level isolation with seccomp and Linux namespaces
seccomp + namespaces — syscall and resource containment
Worker threads share the same OS process and therefore have access to the same file descriptors, the same network stack, and the same set of system calls. A Worker can call fs.readFileSync('/etc/shadow') unless you explicitly restrict filesystem access at a higher level. For the strongest isolation, run tool handlers in a child process constrained by Linux security primitives.
The practical choice for Node.js servers is to run the tool execution worker as a subprocess, either restricted by Node's permission model (the --experimental-permission flag, exposed as --permission in newer releases) or placed in a container with a seccomp-bpf filter applied by the container runtime or an external process launcher.
Node.js 22+ Permission Model
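Node's permission model lets you restrict a Node process to specific filesystem paths at runtime and deny child-process and worker creation. It shipped behind --experimental-permission and newer releases expose it as --permission; which resources it covers (network access in particular) depends on the Node version, so check the documentation for the release you deploy. The restrictions are enforced by the Node runtime rather than the kernel, which gives a spawned subprocess path allowlisting without a full container but does not replace OS-level sandboxing for hostile code: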
import { spawn } from 'node:child_process';
function runToolSubprocess(toolCode, input, timeoutMs = 10000) {
return new Promise((resolve, reject) => {
// Spawn a child Node process with permission restrictions
const child = spawn(process.execPath, [
'--experimental-permission',
'--allow-fs-read=/tmp/tool-sandbox', // only this directory
'--allow-fs-write=/tmp/tool-sandbox',
// No network grant is passed; whether the permission model gates network access depends on the Node version
// No --allow-child-process = cannot spawn further children
'--input-type=module',
], {
stdio: ['pipe', 'pipe', 'pipe'],
env: {
// Stripped environment — no secrets, no AWS credentials
PATH: '/usr/local/bin:/usr/bin:/bin',
NODE_ENV: 'production',
SANDBOX_INPUT: JSON.stringify(input),
},
});
let stdout = '';
let stderr = '';
child.stdout.on('data', d => stdout += d);
child.stderr.on('data', d => stderr += d);
const timer = setTimeout(() => {
child.kill('SIGKILL');
reject(new Error('SUBPROCESS_TIMEOUT'));
}, timeoutMs);
child.on('close', (code) => {
clearTimeout(timer);
if (code === 0) {
try { resolve(JSON.parse(stdout)); }
catch { reject(new Error('SUBPROCESS_PARSE_ERROR')); }
} else {
reject(new Error('SUBPROCESS_EXIT_' + code));
}
});
// Write the tool code to stdin
child.stdin.write(toolCode);
child.stdin.end();
});
}
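The stripped environment is the part of this design that is easiest to verify on its own. The sketch below is a simplified, synchronous variant that leaves out the permission flags so it runs on any recent Node version. It pipes a small script to a child over stdin and checks that a variable set in the parent is not visible to it. The names runWithCleanEnv and FAKE_API_KEY are hypothetical, and the PATH value assumes a Unix-like host.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical helper: run `source` in a child Node process that inherits
// nothing from the parent's environment except what is listed here.
function runWithCleanEnv(source, input, timeoutMs = 5000) {
  const child = spawnSync(process.execPath, ['-'], {
    input: source,            // '-' tells node to read the script from stdin
    encoding: 'utf8',
    timeout: timeoutMs,
    killSignal: 'SIGKILL',
    env: {
      PATH: '/usr/local/bin:/usr/bin:/bin',
      SANDBOX_INPUT: JSON.stringify(input),
    },
  });
  if (child.error) throw child.error;
  if (child.status !== 0) throw new Error('SUBPROCESS_EXIT_' + child.status);
  return JSON.parse(child.stdout);
}

// Tool code reads its input from SANDBOX_INPUT and replies with JSON on stdout.
const toolSource = `
const input = JSON.parse(process.env.SANDBOX_INPUT);
const doubled = input.values.map((v) => v * 2);
process.stdout.write(JSON.stringify({ doubled, sawSecret: 'FAKE_API_KEY' in process.env }));
`;

process.env.FAKE_API_KEY = 'not-a-real-secret'; // stands in for a real credential
const result = runWithCleanEnv(toolSource, { values: [1, 2, 3] });
```

Passing input through an environment variable keeps the example short. For large payloads a file inside the sandbox directory is a better fit, since environment size is limited.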
seccomp-bpf via Docker
For production deployments, the cleanest approach is to run the MCP server process itself in a container with a restrictive seccomp profile. Docker ships a default seccomp profile that blocks ~44 syscalls; for MCP tool execution you can go further:
{
"defaultAction": "SCMP_ACT_ERRNO",
"architectures": ["SCMP_ARCH_X86_64"],
"syscalls": [
{
"names": [
"read", "write", "close", "fstat", "lseek", "mmap", "mprotect",
"munmap", "brk", "pread64", "readv", "writev",
"access", "openat", "getdents64", "stat", "lstat",
"nanosleep", "clock_gettime", "gettimeofday",
"getpid", "getppid", "getuid", "geteuid", "getgid", "getegid",
"futex", "clone", "wait4", "exit", "exit_group",
"socket", "connect", "recv", "recvfrom", "send", "sendto",
"poll", "epoll_create1", "epoll_ctl", "epoll_wait",
"rt_sigaction", "rt_sigprocmask", "rt_sigreturn", "sigaltstack"
],
"action": "SCMP_ACT_ALLOW"
}
]
}
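With this profile applied via docker run --security-opt seccomp=mcp-tool.json, the intent is that tool code cannot call ptrace (no process inspection) or mount (no filesystem mounting), and that launching new programs through execve is denied. Even if a tool handler has an exploitable code path, the kernel blocks the syscalls needed to escalate. Treat the list as a starting sketch rather than a drop-in profile: a real Node.js process uses more syscalls than are listed here (ioctl, madvise, and getrandom, for example), and the container runtime may itself need execve allowed in order to start the entrypoint. Build the final allowlist by tracing your workload and testing the profile before deploying it.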
Comparison: three isolation layers
| Isolation layer | Heap isolation | Filesystem isolation | Network isolation | Can spawn subprocesses | Overhead | Complexity |
|---|---|---|---|---|---|---|
| vm.runInContext | No — shared heap | No | No | No | ~0ms | Low |
| Worker thread | Yes — separate isolate | No (same process FDs) | No | Yes (unless blocked) | 5–15ms startup | Medium |
| Worker + resourceLimits | Yes — capped heap | No | No | Yes | 5–15ms startup | Medium |
| Child subprocess | Yes — separate process | Partial (inherited FDs) | Partial (inherited sockets) | Blockable | 15–50ms startup | Medium-High |
| Node 22 --permission | Yes | Yes (allowlist) | Depends on Node version | Blockable | 15–50ms startup | Medium-High |
| Container + seccomp | Yes | Yes — bind mounts | Yes — network policy | Blocked | Container startup | High |
Choosing the right layer for your threat model
The three layers are not mutually exclusive — they compose. A production configuration that handles real user-shaped LLM inputs should layer at least two:
- Low-risk transformations (pure data manipulation, no I/O): vm.runInContext with a 100ms CPU timeout is sufficient. Adds ~0ms overhead. Use this for formula evaluation, template rendering, JSON transformation tools.
- Medium-risk execution (I/O but well-defined scope): Worker thread with resourceLimits + vm inside the Worker. Adds 10–20ms. Use this for code linters, formatters, test runners operating on passed-in code strings.
- High-risk execution (user-authored code, arbitrary filesystem/network): Child subprocess with Node 22 --permission flag, or a container with seccomp profile. Adds 20–100ms. Use this for any tool that compiles or runs user-provided scripts.
Always layer vm inside Worker. Even when you're already using Worker threads to separate the heap and event loop, add vm.runInContext inside the Worker for the eval path. The Worker protects the main thread from crashes and runaway allocation in tool code; vm keeps the evaluated code away from the Worker's own globals. This is defense in depth: an attacker has to defeat both layers to escape.
Common sandbox misconfigurations
Passing require or process into the sandbox
// WRONG — this defeats the entire purpose of the sandbox
const sandbox = {
require, // full module system access
process, // full process access including process.exit()
};
vm.runInContext(code, vm.createContext(sandbox));
// RIGHT — pass only the specific values the sandboxed code needs
const sandbox = {
input: sanitizedInput,
Math,
JSON,
};
Forgetting to set a timeout
// WRONG — infinite loop blocks the event loop forever
vm.runInContext('while(true) {}', ctx);
// RIGHT
vm.runInContext('while(true) {}', ctx, { timeout: 100 }); // throws after 100ms
Accepting non-string code in the tool call
// WRONG: z.any() lets non-string values through unvalidated; vm.runInContext expects a string
server.tool('run', { code: z.any() }, async ({ code }) => {
vm.runInContext(code, ctx);
});
// RIGHT: require a string, cap length, and use the pattern check as a tripwire (it is easy to evade, so it is not a boundary)
server.tool('run', { code: z.string().max(8192) }, async ({ code }) => {
if (/require|import|process|global|Buffer|__proto__/.test(code)) {
throw new Error('forbidden_pattern');
}
vm.runInContext(String(code), ctx, { timeout: 150 });
});
Not terminating the Worker on timeout
// WRONG — timer fires but Worker keeps running, consuming resources
const timer = setTimeout(() => {
reject(new Error('timeout')); // promise rejects but Worker is still alive
}, 5000);
// RIGHT — hard kill the Worker
const timer = setTimeout(() => {
worker.terminate(); // terminates the V8 isolate
reject(new Error('WORKER_TIMEOUT'));
}, 5000);
SkillAudit grade impact: sandbox findings
- eval() or new Function() with LLM-controlled input and no sandboxing: direct process-level code execution. Score penalty: −25 points.
- require or process passed into sandbox context: sandbox is fully bypassed. Score penalty: −20 points.
- Worker threads without resourceLimits: unbounded heap growth in a Worker can take down the parent process. Score penalty: −10 points.
- fetch or XMLHttpRequest exposed in the sandbox context: sandboxed code can make outbound network requests. Score penalty: −5 points.
The production checklist
- Identify eval surfaces. Grep your tool handlers for eval, new Function, vm.run, exec, execSync, spawn. Each is a sandbox candidate.
- Choose the isolation layer. Use the threat model table above. Default to Worker + vm if unsure.
- Set a CPU timeout. On vm: the timeout option. On Worker: a setTimeout with worker.terminate().
- Set memory limits on Workers. resourceLimits.maxOldGenerationSizeMb: 64 MB is a reasonable default for most transform tools.
- Strip the sandbox context. Start from Object.create(null). Add only what the code needs. No require, process, global, fetch.
- Validate inputs before sandboxing. Reject inputs containing forbidden patterns before they reach the sandbox. The sandbox is not a substitute for input validation; it is a containment layer for when validation is incomplete.
- Log sandbox violations. Timeout kills, OOM exits, and forbidden-pattern rejections should appear in your security log. A cluster of sandbox violations in a session is a prompt injection signal.
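The first checklist step can be approximated with a few lines of code when grep is not convenient, for example inside a CI script. The sketch below is a rough heuristic under stated assumptions: the pattern list mirrors the names in the checklist, findEvalSurfaces is a hypothetical helper, and a regex scan will produce false positives (RegExp.prototype.exec calls, for instance) and miss dynamically constructed calls.

```javascript
// Heuristic pattern for the call sites named in the checklist.
const EVAL_SURFACE = /\b(eval|execSync|exec|spawn)\s*\(|\bnew\s+Function\b|\bvm\.run\w*\s*\(/g;

// Return one entry per match, with a 1-based line number for reporting.
function findEvalSurfaces(source) {
  const hits = [];
  source.split('\n').forEach((line, index) => {
    for (const match of line.matchAll(EVAL_SURFACE)) {
      hits.push({ line: index + 1, text: match[0].trim() });
    }
  });
  return hits;
}

// Usage on an in-memory sample; a real scan would read handler files from disk.
const sample = [
  'const out = eval(code);',
  "const fn = new Function('x', body);",
  'vm.runInContext(code, ctx);',
  'const total = a + b;',
].join('\n');

const hits = findEvalSurfaces(sample);
```

Each hit is a candidate for review, not a confirmed finding; the point is to make sure no eval surface goes unexamined.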
Run the SkillAudit scanner on your MCP server to get a grade report that flags eval surfaces, missing sandbox timeouts, and context configuration problems automatically.