Topic: mcp server regex dos
MCP server ReDoS — Regular Expression Denial of Service in tool handlers
Regular Expression Denial of Service (ReDoS) occurs when a backtracking regex engine takes exponential time on a crafted input. In MCP servers, tool handlers frequently apply regular expressions to LLM-supplied strings to validate input, parse structured data, or extract information. An LLM that has been prompt-injected can supply an input crafted to trigger catastrophic backtracking, consuming 100% of one CPU core for seconds to minutes per tool call. Because Node.js runs JavaScript on a single thread, this blocks the entire event loop, so every other request to the server stalls until the match finishes. This page covers which regex patterns are vulnerable, how to detect them, and what to use instead.
How catastrophic backtracking happens
JavaScript's regex engine uses backtracking NFA (Non-deterministic Finite Automaton) evaluation. For most regexes this is fast, but certain patterns cause exponential backtracking: the engine tries every possible combination of matches before concluding that no match exists. The vulnerable pattern family involves overlapping alternations or nested quantifiers where the engine cannot prune the search space.
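To see where the exponential comes from, consider a nested quantifier such as (a+)+ applied to a run of n 'a' characters followed by a character that forces the match to fail. A backtracking engine may try every way of dividing that run into consecutive non-empty groups before giving up. The sketch below only counts those divisions (it does not run a regex), and `countSplits` is a hypothetical helper name used for illustration.

```typescript
// Count the ways to split a run of n identical characters into consecutive
// non-empty groups. This approximates the search space a backtracking engine
// may explore for a pattern like (a+)+ when the overall match fails.
function countSplits(n: number, memo: Map<number, number> = new Map()): number {
  if (n === 0) return 1; // nothing left to split: one complete arrangement
  const cached = memo.get(n);
  if (cached !== undefined) return cached;
  let total = 0;
  // Choose the size of the first group, then split whatever remains.
  for (let first = 1; first <= n; first++) {
    total += countSplits(n - first, memo);
  }
  memo.set(n, total);
  return total;
}

for (const n of [5, 10, 20, 30]) {
  console.log(`${n} chars -> ${countSplits(n)} splits`);
}
```

The count is 2^(n-1), so each extra character doubles the work: 5 characters give 16 splits, while 30 characters give more than 500 million.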
// Classic vulnerable patterns — test how long each takes on a 30-char crafted input:
// 1. Nested quantifier — (a+)+ or (a*)*
const VULN_EMAIL = /^([a-zA-Z0-9._-]+)+@[a-zA-Z]{2,}$/
// Attack input: 'aaaaaaaaaaaaaaaaaaaaaaaaa@'
// The run of 'a's can be divided between iterations of the outer group in
// exponentially many ways, and every division is retried when the tail fails
// 2. Alternation with overlapping matches — (a|aa)+
const VULN_DOMAIN = /^(https?:\/\/)?([\w-]|[a-z0-9.])+(\/.*)?$/
// Attack input: 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!'
// Every 'a' matches both alternatives, so n characters give about 2^n paths to retry
// 3. Chained optional groups: (a?){n} followed by a{n}
const VULN_REPEAT = /^(a?){30}a{30}$/
// For an input of 30 'a's, the only match leaves every optional group empty,
// but the greedy engine can try up to 2^30 other combinations before reaching it
// Test a regex for ReDoS vulnerability:
// 1. Try a crafted input that is just inside the expected format
// but fails at the last character (e.g., 30 'a's + '@')
// 2. Time it with console.time / console.timeEnd
// 3. If > 10ms for a 50-char input, it's likely vulnerable
console.time('regex-test')
VULN_EMAIL.test('aaaaaaaaaaaaaaaaaaaaaaaaa@')
console.timeEnd('regex-test')
// Vulnerable regex: may take seconds or minutes for this input
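The manual timing test above can be turned into a small probe that grows the input gradually and stops as soon as one match exceeds a time budget, so the probe itself never hangs for long. This is a minimal sketch: `timeMatch`, `probeRegex`, the default `maxLen` of 40, and the 10ms budget are illustrative assumptions, not part of any library.

```typescript
// Time a single regex test and return the elapsed milliseconds.
function timeMatch(re: RegExp, input: string): number {
  const start = performance.now();
  re.test(input);
  return performance.now() - start;
}

// Grow a near-miss input one character at a time. Stop at the first length
// whose match exceeds the budget, which bounds the probe's own running time.
function probeRegex(
  re: RegExp,
  makeInput: (n: number) => string,
  maxLen = 40,
  budgetMs = 10,
): { suspicious: boolean; length: number; elapsedMs: number } {
  let elapsedMs = 0;
  for (let n = 1; n <= maxLen; n++) {
    elapsedMs = timeMatch(re, makeInput(n));
    if (elapsedMs > budgetMs) return { suspicious: true, length: n, elapsedMs };
  }
  return { suspicious: false, length: maxLen, elapsedMs };
}

// Near-miss input: valid characters followed by one that forces failure.
const nearMiss = (n: number) => 'a'.repeat(n) + '!';
console.log(probeRegex(/^(a+)+$/, nearMiss));       // nested quantifier
console.log(probeRegex(/^[a-z]{1,64}$/, nearMiss)); // single bounded class
```

Timings depend on the machine and the Node.js version, so treat the result as a hint rather than proof. A pattern that passes with one input shape may still be vulnerable to another.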
Safe alternatives in MCP tool handlers
// Option 1: Use RE2 (linear-time engine, no backtracking)
// npm install re2
import RE2 from 're2'
// RE2 drop-in for the safe validation patterns below
const emailRe2 = new RE2('^[a-zA-Z0-9._%+\\-]{1,64}@[a-zA-Z0-9.\\-]{1,253}\\.[a-zA-Z]{2,}$')
server.tool('validate_email', 'Validate an email address format', {
email: z.string().max(320), // RFC 5321 max email length
}, async ({ email }) => {
const valid = emailRe2.test(email)
return { content: [{ type: 'text', text: JSON.stringify({ valid, email }) }] }
})
// Option 2: Use patterns that cannot backtrack catastrophically
// These patterns avoid nested quantifiers and overlapping alternation
// SAFE email format check (not full RFC 5321 — safe for input validation)
const SAFE_EMAIL_RE = /^[a-zA-Z0-9._%+\-]{1,64}@[a-zA-Z0-9.\-]{1,253}\.[a-zA-Z]{2,}$/
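// Why safe: no nested quantifiers, no overlapping alternation
// Worst case: the domain class may give back characters to find the final '.',
// but that backtracking is bounded by the {1,253} cap and is never exponential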
// SAFE URL format check
const SAFE_URL_RE = /^https?:\/\/[a-zA-Z0-9\-\.]{1,253}(:\d{1,5})?(\/[^\s]*)?$/
// Why safe: no quantified group sits inside another quantifier, and the host
// class [a-zA-Z0-9\-\.] excludes ':' and '/', so the host, port, and path can
// each match in only one way. There is no catastrophic backtracking path
// SAFE slug/identifier check
const SAFE_SLUG_RE = /^[a-z0-9][a-z0-9\-]{0,253}[a-z0-9]$/
// Option 3: Validate structure first, then apply simple regex to each component
function safeParseEmail(email: string): { local: string; domain: string } | null {
// Split at the last '@' first — no regex needed for structural check
const atIndex = email.lastIndexOf('@')
if (atIndex < 1 || atIndex > 64) return null
const local = email.slice(0, atIndex)
const domain = email.slice(atIndex + 1)
if (!local || !domain) return null
if (domain.length > 253) return null
// Now apply simple, safe regexes to each component separately
if (!/^[a-zA-Z0-9._%+\-]+$/.test(local)) return null
if (!/^[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/.test(domain)) return null
return { local, domain }
}
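The same structure-first idea applies to URLs. The sketch below lets the built-in WHATWG `URL` parser handle the structure and then applies one bounded character class to the hostname. The helper name `safeParseHttpUrl` and the 2048-character cap are assumptions made for illustration.

```typescript
// Hypothetical helper: validate an http(s) URL by parsing it first, then
// checking individual components with simple, bounded tests.
function safeParseHttpUrl(input: string): { host: string; path: string } | null {
  if (input.length > 2048) return null; // assumed cap; pick one that suits the tool
  let url: URL;
  try {
    url = new URL(input); // built-in parser; throws on malformed input
  } catch {
    return null;
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return null;
  // One bounded character class: no groups, no alternation.
  if (!/^[a-z0-9.-]{1,253}$/i.test(url.hostname)) return null;
  return { host: url.hostname, path: url.pathname };
}

console.log(safeParseHttpUrl('https://example.com/a/b'));
console.log(safeParseHttpUrl('ftp://example.com/'));
console.log(safeParseHttpUrl('not a url'));
```

Note that this sketch rejects IPv6 literal hosts such as [::1], because the brackets are not in the allowed class. Widen the hostname check if the tool needs them.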
Worker thread timeout wrapper for unavoidable complex regexes
When a library you depend on uses complex regexes internally (e.g., a markdown parser, URL normalizer, or schema validator), you can't always replace the regex with RE2. Wrap the call in a Worker thread with a timeout to prevent a slow match from blocking the event loop.
import { Worker } from 'worker_threads'
// Run a potentially slow regex match in a worker thread with a timeout.
// Each call starts a new worker, so timeoutMs must also cover worker startup.
function regexMatchWithTimeout(pattern: RegExp, input: string, timeoutMs: number): Promise<RegExpExecArray | null> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(`
      const { parentPort, workerData } = require('worker_threads')
      const re = new RegExp(workerData.source, workerData.flags)
      const result = re.exec(workerData.input)
      parentPort.postMessage(result)
    `, {
      eval: true,
      // Pass the pattern as plain strings and rebuild the RegExp in the worker
      workerData: { source: pattern.source, flags: pattern.flags, input },
    })
    const timer = setTimeout(() => {
      worker.terminate()
      reject(new Error(`Regex match timed out after ${timeoutMs}ms: possible ReDoS`))
    }, timeoutMs)
    worker.on('message', (result) => {
      clearTimeout(timer)
      resolve(result)
    })
    worker.on('error', (err) => {
      clearTimeout(timer)
      reject(err)
    })
  })
}
// Usage:
server.tool('parse_log_line', 'Extract fields from a log line', {
logLine: z.string().max(10_000),
}, async ({ logLine }) => {
const COMPLEX_LOG_RE = /^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?)\s+\[([^\]]+)\]\s+(.+)$/
try {
const match = await regexMatchWithTimeout(COMPLEX_LOG_RE, logLine, 100) // 100ms timeout
if (!match) return { content: [{ type: 'text', text: 'No match' }] }
return { content: [{ type: 'text', text: JSON.stringify({ time: match[1], level: match[2], msg: match[3] }) }] }
} catch (err) {
throw new Error('Log line parsing failed: input may have triggered ReDoS protection')
}
})
What SkillAudit checks
- Nested quantifiers in regexes applied to LLM-supplied input — HIGH; catastrophic backtracking can block the Node.js event loop for seconds per call
- Overlapping alternation in regexes applied to LLM-supplied input — HIGH; similar exponential complexity on crafted inputs
- No input length cap in Zod schema before regex application — WARN; longer inputs amplify backtracking time exponentially
- Complex third-party parser (markdown, URL) applied to unsanitized LLM input without timeout — WARN; library may internally use vulnerable regex patterns
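A check for the first item in this list can be approximated with a small scanner over the regex source. The sketch below is an illustrative heuristic only and is not SkillAudit's actual implementation. The names `isUnboundedQuantifierAt` and `hasNestedQuantifier` are hypothetical, and the scanner does not detect overlapping alternation or chained optional groups.

```typescript
// True if an unbounded quantifier (+, *, or {n,}) starts at index i.
function isUnboundedQuantifierAt(source: string, i: number): boolean {
  const ch = source[i];
  if (ch === '+' || ch === '*') return true;
  return ch === '{' && /^\{\d+,\}/.test(source.slice(i, i + 16));
}

// Heuristic: report a group that contains an unbounded quantifier and is
// itself followed by one. This is a shallow scan, not a full regex parser.
function hasNestedQuantifier(source: string): boolean {
  const groupHasQuantifier: boolean[] = []; // one flag per currently open group
  let inClass = false;
  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    if (ch === '\\') { i++; continue; } // skip the escaped character
    if (inClass) { if (ch === ']') inClass = false; continue; }
    if (ch === '[') { inClass = true; continue; }
    if (ch === '(') { groupHasQuantifier.push(false); continue; }
    if (ch === ')') {
      const inner = groupHasQuantifier.pop() === true;
      if (inner && isUnboundedQuantifierAt(source, i + 1)) return true;
      // A quantifier inside a child group also counts for the parent group.
      if (inner && groupHasQuantifier.length > 0) {
        groupHasQuantifier[groupHasQuantifier.length - 1] = true;
      }
      continue;
    }
    if (isUnboundedQuantifierAt(source, i) && groupHasQuantifier.length > 0) {
      groupHasQuantifier[groupHasQuantifier.length - 1] = true;
    }
  }
  return false;
}

console.log(hasNestedQuantifier(/^(a+)+$/.source)); // true
console.log(hasNestedQuantifier(/^[a-z0-9._%+-]{1,64}@[a-z0-9.-]{1,253}$/.source)); // false
```

The heuristic is deliberately conservative. It also flags delimiter-separated repetitions such as (?:[a-z]+\.)*, which are usually safe because the delimiter leaves only one way to divide the input.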
Detecting vulnerable regexes in your codebase
Three tools to scan your MCP server's source for ReDoS-vulnerable regex patterns:
- safe-regex npm package: npx safe-regex '/your/regex/here'; returns true if the regex passes its star-height heuristic and false if it is potentially vulnerable. Because the check is a heuristic, expect some false positives and false negatives.
- vuln-regex-detector: more comprehensive analysis; can be run in CI as a lint step
- Manual crafting: for any regex matching repeated groups, test with a string of 30+ characters that nearly matches but fails at the last character. If the test takes >10ms, the regex is likely vulnerable.
See also
- MCP server input validation — Zod schema validation including length caps before regex application
- MCP server rate limiting — rate limiting tool calls to reduce ReDoS attack frequency
- MCP server timeout security — wrapping tool handlers in wall-clock timeouts
- Limits of static analysis for MCP server security — what static analysis can and can't catch (including ReDoS)
- MCP server security checklist — comprehensive pre-submission checklist