Topic: mcp server regex dos

MCP server ReDoS — Regular Expression Denial of Service in tool handlers

Regular Expression Denial of Service (ReDoS) occurs when a backtracking regex engine takes exponential time on a crafted input. In MCP servers, tool handlers frequently apply regular expressions to LLM-supplied strings to validate input, parse structured data, or extract information. A prompt-injected LLM can supply an input crafted to trigger catastrophic backtracking, pinning one CPU core at 100% for seconds to minutes per tool call. In a single-threaded Node.js process, this blocks the entire event loop, so no other request is served until the match finishes. This page covers which regex patterns are vulnerable, how to detect them, and what to use instead.

How catastrophic backtracking happens

JavaScript's regex engine uses backtracking NFA (Non-deterministic Finite Automaton) evaluation. For most regexes this is fast, but certain patterns cause exponential backtracking: the engine tries every possible combination of matches before concluding that no match exists. The vulnerable pattern family involves overlapping alternations or nested quantifiers where the engine cannot prune the search space.
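To make the growth rate concrete, here is a minimal sketch of a toy backtracking model. It is not V8's actual algorithm, and `backtrackSteps` is a hypothetical name; it only counts how many ways a naive matcher would try to split a run of n 'a's for the pattern /^(a+)+$/ when the input ends in a character that cannot match.

```typescript
// Toy model (not V8's real implementation): count the attempts a naive
// backtracking matcher makes for /^(a+)+$/ on n 'a's plus one bad character.
function backtrackSteps(n: number): number {
  let steps = 0
  const tryFrom = (pos: number): boolean => {
    // One iteration of the group: the inner a+ consumes input up to `end`, longest first
    for (let end = n; end > pos; end--) {
      steps++
      // The outer + may repeat the group starting at `end`
      if (tryFrom(end)) return true
    }
    // Nothing left to consume and the next character is not end-of-input: fail
    return false
  }
  tryFrom(0)
  return steps
}

for (const n of [5, 10, 20]) {
  console.log(n, backtrackSteps(n)) // 31, 1023, 1048575: that is 2^n - 1
}
```

Each extra character doubles the count, which is why a 30-character input can already stall a handler.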

// Classic vulnerable patterns — test how long each takes on a 30-char crafted input:

// 1. Nested quantifier — (a+)+ or (a*)*
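const VULN_EMAIL = /^([a-zA-Z0-9._-]+)+@[a-zA-Z]{2,}$/
// Attack input: 'aaaaaaaaaaaaaaaaaaaaaaaaa@' (valid up to the '@', then no domain)
// The run of 'a's can be split between the inner '+' and the outer '+' in
// exponentially many ways, and the engine tries them all before giving up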

// 2. Alternation with overlapping matches — (a|aa)+
const VULN_DOMAIN = /^(https?:\/\/|ftp:\/\/)?([\w\-]+|[\w\-]+\.)+(\/.*)?$/
// Both branches start with [\w\-]+, so a long label followed by an invalid
// character (e.g. 30 'a's + '!') is retried in exponentially many ways

// 3. Chained optional groups — (a?){20} followed by a{20}
const VULN_REPEAT = /^(a?){30}a{30}$/
// An input of 30 'a's does match, but only after the engine has tried on the
// order of 2^30 ways of letting the optional groups consume characters

// Test a regex for ReDoS vulnerability:
// 1. Try a crafted input that is just inside the expected format
//    but fails at the last character (e.g., 30 'a's + '@')
// 2. Time it with console.time / console.timeEnd
// 3. If > 10ms for a 50-char input, it's likely vulnerable

console.time('regex-test')
VULN_EMAIL.test('aaaaaaaaaaaaaaaaaaaaaaaaa@')
console.timeEnd('regex-test')
// Vulnerable regex: may take seconds or minutes for this input
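A single timing can be noisy, so it can help to watch how the time grows as the input gets longer. The sketch below assumes a hypothetical helper named `probeGrowth` and uses the textbook pattern /^(a+)+$/ with deliberately small sizes so that it is safe to run; on a backtracking engine, each two extra characters should roughly quadruple the time, while a safe pattern stays flat.

```typescript
// Hypothetical helper: time `re.test` on inputs of increasing size
function probeGrowth(
  re: RegExp,
  makeInput: (n: number) => string,
  sizes: number[],
): Array<{ size: number; ms: number }> {
  return sizes.map((size) => {
    const input = makeInput(size)
    const start = Date.now()
    re.test(input)
    return { size, ms: Date.now() - start }
  })
}

// Nested-quantifier pattern, with inputs that fail only at the last character
const suspect = /^(a+)+$/
const timings = probeGrowth(suspect, (n) => 'a'.repeat(n) + '!', [16, 18, 20, 22])
console.table(timings)
```

Keep the sizes small when probing a pattern you already suspect: every additional character doubles the work, so a size of 40 may not finish in any reasonable time.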

Safe alternatives in MCP tool handlers

// Option 1: Use RE2 (linear-time engine, no backtracking)
// npm install re2
import RE2 from 're2'

// RE2 drop-in for the safe validation patterns below
const emailRe2 = new RE2('^[a-zA-Z0-9._%+\\-]{1,64}@[a-zA-Z0-9.\\-]{1,253}\\.[a-zA-Z]{2,}$')

server.tool('validate_email', 'Validate an email address format', {
  email: z.string().max(320),  // upper bound: 64 (local part) + 1 ('@') + 255 (domain), per RFC 5321 limits
}, async ({ email }) => {
  const valid = emailRe2.test(email)
  return { content: [{ type: 'text', text: JSON.stringify({ valid, email }) }] }
})

// Option 2: Use safe, non-backtracking regex patterns
// These patterns avoid nested quantifiers and overlapping alternation

// SAFE email format check (not full RFC 5321 — safe for input validation)
const SAFE_EMAIL_RE = /^[a-zA-Z0-9._%+\-]{1,64}@[a-zA-Z0-9.\-]{1,253}\.[a-zA-Z]{2,}$/
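// Why safe: no nested quantifiers, no overlapping alternation, and every repeat is length-bounded
// Worst case: the domain class backtracks to find the final '.', which is at most
// quadratic in a 253-char domain and never exponential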

// SAFE URL format check
const SAFE_URL_RE = /^https?:\/\/[a-zA-Z0-9\-\.]{1,253}(:\d{1,5})?(\/[^\s]*)?$/
// Why safe: the character classes [a-zA-Z0-9\-\.] and [^\s] don't overlap
// with the adjacent literal characters — no catastrophic backtracking path

// SAFE slug/identifier check
const SAFE_SLUG_RE = /^[a-z0-9][a-z0-9\-]{0,253}[a-z0-9]$/

// Option 3: Validate structure first, then apply simple regex to each component
function safeParseEmail(email: string): { local: string; domain: string } | null {
  // Split at the last '@' first — no regex needed for structural check
  const atIndex = email.lastIndexOf('@')
  if (atIndex < 1 || atIndex > 64) return null
  const local = email.slice(0, atIndex)
  const domain = email.slice(atIndex + 1)
  if (!local || !domain) return null
  if (domain.length > 253) return null
  // Now apply simple, safe regexes to each component separately
  if (!/^[a-zA-Z0-9._%+\-]+$/.test(local)) return null
  if (!/^[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/.test(domain)) return null
  return { local, domain }
}
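As a quick sanity check, the safe patterns above can be run against both ordinary inputs and the attack-shaped strings from the earlier section. This is a minimal sketch: the patterns are copied from the listing, while `checkCases` and the sample inputs are hypothetical and only illustrate the idea.

```typescript
// Patterns copied from the listing above
const SAFE_EMAIL_RE = /^[a-zA-Z0-9._%+\-]{1,64}@[a-zA-Z0-9.\-]{1,253}\.[a-zA-Z]{2,}$/
const SAFE_URL_RE = /^https?:\/\/[a-zA-Z0-9\-\.]{1,253}(:\d{1,5})?(\/[^\s]*)?$/
const SAFE_SLUG_RE = /^[a-z0-9][a-z0-9\-]{0,253}[a-z0-9]$/

// Hypothetical helper: return the inputs whose result differs from the expectation
function checkCases(re: RegExp, cases: Array<[string, boolean]>): string[] {
  return cases
    .filter(([input, expected]) => re.test(input) !== expected)
    .map(([input]) => input)
}

const hostile = 'a'.repeat(25)
const failures = [
  ...checkCases(SAFE_EMAIL_RE, [
    ['user@example.com', true],
    [hostile + '@', false],                 // attack shape: no domain
    [hostile + '@' + hostile + '@', false], // attack shape: second '@'
  ]),
  ...checkCases(SAFE_URL_RE, [
    ['https://example.com/path?q=1', true],
    ['ftp://example.com', false],           // scheme not allowed by this pattern
  ]),
  ...checkCases(SAFE_SLUG_RE, [
    ['my-tool-1', true],
    ['a', false],                           // pattern requires at least two characters
    ['-bad', false],
  ]),
]
console.log(failures.length === 0 ? 'all cases behaved as expected' : failures)
```

Note that SAFE_SLUG_RE rejects single-character slugs; if those should be valid, the pattern needs an explicit alternative for that case.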

Worker thread timeout wrapper for unavoidable complex regexes

When a library you depend on uses complex regexes internally (e.g., a markdown parser, URL normalizer, or schema validator), you can't always replace the regex with RE2. Wrap the call in a Worker thread with a timeout to prevent a slow match from blocking the event loop.

import { Worker } from 'worker_threads'

// Run a potentially slow regex match in a worker thread with a timeout
function regexMatchWithTimeout(pattern: RegExp, input: string, timeoutMs: number): Promise<RegExpExecArray | null> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(`
      const { parentPort, workerData } = require('worker_threads')
      const re = new RegExp(workerData.source, workerData.flags)
      const result = re.exec(workerData.input)
      parentPort.postMessage(result)
    `, {
      eval: true,
      // Pass the pattern as plain strings rather than relying on RegExp cloning
      workerData: { source: pattern.source, flags: pattern.flags, input },
    })

    const timer = setTimeout(() => {
      worker.terminate()
      reject(new Error(`Regex match timed out after ${timeoutMs}ms — possible ReDoS`))
    }, timeoutMs)

    worker.on('message', (result) => {
      clearTimeout(timer)
      resolve(result)
    })

    worker.on('error', (err) => {
      clearTimeout(timer)
      reject(err)
    })
  })
}

// Usage:
server.tool('parse_log_line', 'Extract fields from a log line', {
  logLine: z.string().max(10_000),
}, async ({ logLine }) => {
  const COMPLEX_LOG_RE = /^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?)\s+\[([^\]]+)\]\s+(.+)$/
  try {
    const match = await regexMatchWithTimeout(COMPLEX_LOG_RE, logLine, 100)  // 100ms timeout
    if (!match) return { content: [{ type: 'text', text: 'No match' }] }
    return { content: [{ type: 'text', text: JSON.stringify({ time: match[1], level: match[2], msg: match[3] }) }] }
  } catch (err) {
    throw new Error('Log line parsing failed: input may have triggered ReDoS protection')
  }
})

Detecting vulnerable regexes in your codebase

Three tools to scan your MCP server's source for ReDoS-vulnerable regex patterns:
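As a rough illustration of the kind of check such scanners perform, the sketch below flags regex sources in which a group containing `+` or `*` is itself followed by a quantifier. `hasNestedQuantifier` is a hypothetical, hand-rolled heuristic and not the algorithm of any particular tool: it can produce false positives, and it does not catch overlapping alternation or bounded repeats such as (a?){30}.

```typescript
// Heuristic sketch: does the regex source contain a quantified group
// that itself contains an unbounded quantifier, as in (a+)+ ?
function hasNestedQuantifier(source: string): boolean {
  // stack[k] becomes true once the k-th open group has seen '+' or '*' inside it
  const stack: boolean[] = []
  let inClass = false
  for (let i = 0; i < source.length; i++) {
    const ch = source[i]
    if (ch === '\\') { i++; continue }  // skip the escaped character
    if (inClass) { if (ch === ']') inClass = false; continue }
    if (ch === '[') { inClass = true; continue }
    if (ch === '(') { stack.push(false); continue }
    if (ch === ')') {
      const innerQuantified = stack.pop() ?? false
      const next = source[i + 1]
      const outerQuantified = next === '+' || next === '*' || next === '{'
      if (innerQuantified && outerQuantified) return true
      // A group with a quantifier inside or after it counts as quantified for its parent
      if ((innerQuantified || outerQuantified) && stack.length > 0) {
        stack[stack.length - 1] = true
      }
      continue
    }
    if ((ch === '+' || ch === '*') && stack.length > 0) {
      stack[stack.length - 1] = true
    }
  }
  return false
}

console.log(hasNestedQuantifier(/^(a+)+$/.source))                             // true
console.log(hasNestedQuantifier(/^[a-z0-9][a-z0-9\-]{0,253}[a-z0-9]$/.source)) // false
```

Treat a `true` result as a prompt for manual review or a timing test, not as proof of a vulnerability.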
