
MCP server XML security — XXE, billion laughs, and XPath injection

MCP servers that parse XML — from SOAP APIs, RSS/Atom feeds, Microsoft Office documents, SVG uploads, or LLM-supplied strings — face a distinct class of vulnerabilities that JSON-only servers don't. XML External Entity (XXE) injection lets attackers read arbitrary files from the server's filesystem via a file:// entity; the billion laughs attack exploits recursive entity expansion to exhaust CPU and memory; XPath injection allows LLM-supplied query strings to traverse unauthorized parts of an XML document; and XML signature wrapping bypasses authentication on servers that verify SAML or WS-Security tokens. This page covers the defense for each.

Attack 1: XXE — External Entity Injection

XML parsers that process DOCTYPE declarations and resolve external entities can be made to read arbitrary files. An attacker submits XML that defines an entity pointing to file:///etc/passwd (or any server file), and the parser includes the file contents in the parsed document. The MCP tool then returns those contents to the LLM — which returns them to the attacker via the tool response.
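The telltale part of such a payload is an entity declaration that carries a SYSTEM or PUBLIC identifier. As a rough illustration only, the sketch below pulls those declarations out of a raw string, for example to log what a rejected document was trying to reach. The helper name `findExternalEntities` and its return shape are hypothetical, and a regex scan like this is an aid for auditing, not a substitute for disabling DTD processing in the parser.

```typescript
interface ExternalEntity {
  name: string
  systemId: string
}

// Hypothetical helper: lists entity declarations that point at an external resource.
function findExternalEntities(xml: string): ExternalEntity[] {
  // Matches <!ENTITY name SYSTEM "uri"> and <!ENTITY name PUBLIC "id" "uri">,
  // including parameter entities (<!ENTITY % name ...>).
  const pattern =
    /<!ENTITY\s+(?:%\s+)?(\S+)\s+(?:SYSTEM|PUBLIC\s+(?:"[^"]*"|'[^']*'))\s+(?:"([^"]*)"|'([^']*)')/g
  const found: ExternalEntity[] = []
  let match: RegExpExecArray | null
  while ((match = pattern.exec(xml)) !== null) {
    found.push({ name: match[1], systemId: match[2] ?? match[3] ?? '' })
  }
  return found
}

const attack = `<?xml version="1.0"?>
<!DOCTYPE data [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>&xxe;</data>`

console.log(findExternalEntities(attack))
```

Internal entities such as `<!ENTITY lol "lol">` are deliberately not reported here, since they have no external target.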

// Attack XML that reads /etc/passwd via XXE:
// <?xml version="1.0"?>
// <!DOCTYPE data [
//   <!ENTITY xxe SYSTEM "file:///etc/passwd">
// ]>
// <data>&xxe;</data>

// WRONG: parsing with default settings (many parsers process external entities by default)
import { XMLParser } from 'fast-xml-parser'

server.tool('parse_config', 'Parse an XML configuration document', {
  xmlContent: z.string().max(100_000),
}, async ({ xmlContent }) => {
  const parser = new XMLParser()  // Default: may process entities
  const result = parser.parse(xmlContent)
  return { content: [{ type: 'text', text: JSON.stringify(result) }] }
})

// CORRECT: disable entity processing and DTD
server.tool('parse_config', 'Parse an XML configuration document', {
  xmlContent: z.string().max(100_000),
}, async ({ xmlContent }) => {
  // fast-xml-parser: disable external entities and DTD processing
  const parser = new XMLParser({
    processEntities: false,   // Don't expand &entities; at all
    ignoreDeclaration: true,  // Ignore XML declaration
    // fast-xml-parser doesn't support DTD by design — but explicitly note it:
    // No DOCTYPE processing supported
  })

  // Pre-flight: reject documents containing DOCTYPE declarations
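  if (xmlContent.includes('<!DOCTYPE')) {
    throw new Error('XML documents containing a DOCTYPE declaration are not accepted')
  }

  const result = parser.parse(xmlContent)
  return { content: [{ type: 'text', text: JSON.stringify(result) }] }
})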

Attack 2: Billion laughs — exponential entity expansion DoS

Even without external entities, a specially crafted XML document with nested entity references can cause exponential memory expansion. The "billion laughs" attack defines a chain of entities where each references the previous one multiple times, resulting in exponential growth when expanded. A 1 KB input can expand to gigabytes in memory, causing an OOM crash.
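The growth is easy to check with arithmetic. The sketch below assumes a 3-byte base string and ten references per level, which are the classic parameters of this attack; the helper name `expandedBytes` is hypothetical.

```typescript
// Hypothetical helper: size in bytes of a fully expanded entity chain.
// Each nesting level multiplies the previous level's size by fanOut.
function expandedBytes(baseBytes: number, fanOut: number, levels: number): number {
  return baseBytes * Math.pow(fanOut, levels)
}

// 3-byte "lol", 10 references per level, 9 nested levels
console.log(expandedBytes(3, 10, 9)) // 3000000000 bytes, roughly 3 GB
```

Because the exponent is the number of nesting levels, each extra level costs the attacker a few dozen bytes of input and multiplies the expanded size by ten.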

// The billion laughs XML pattern:
// <!DOCTYPE bomb [
//   <!ENTITY lol "lol">
//   <!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
//   <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
//   ... (continuing to lol10 multiplies by 10 at each level, so &lol10; expands to 10^9 "lol" strings)
// ]>

// Defense: three layers

// Layer 1: Reject DOCTYPE/ENTITY declarations before parsing
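function rejectDangerousXml(xml: string): void {
  if (/<!DOCTYPE|<!ENTITY/i.test(xml)) {
    throw new Error('XML containing DOCTYPE or ENTITY declarations is not accepted')
  }
}

// Layer 2: Enforce a byte-size limit before parsing
// Assumed value: chosen to match the schema-level cap used below
const MAX_XML_SIZE_BYTES = 500_000

function assertXmlSizeLimit(xml: string): void {
  if (Buffer.byteLength(xml, 'utf8') > MAX_XML_SIZE_BYTES) {
    throw new Error(`XML input exceeds maximum size of ${MAX_XML_SIZE_BYTES} bytes`)
  }
}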

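// Layer 3: Parse with entity expansion disabled
// (a synchronous parse cannot be interrupted by a timer on the same thread;
// an actual timeout requires running the parser in a worker thread or child process)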
function parseXmlSafely(xml: string): unknown {
  assertXmlSizeLimit(xml)
  rejectDangerousXml(xml)

  // Use fast-xml-parser with entity processing disabled
  const parser = new XMLParser({ processEntities: false })
  return parser.parse(xml)
}

server.tool('parse_xml', 'Parse an XML document', {
  xmlContent: z.string().max(500_000),  // Schema-level cap
}, async ({ xmlContent }) => {
  const result = parseXmlSafely(xmlContent)  // Throws if dangerous
  return { content: [{ type: 'text', text: JSON.stringify(result) }] }
})

Attack 3: XPath injection

MCP servers that execute XPath queries against parsed XML documents, using LLM-supplied strings as part of the XPath expression, are vulnerable to XPath injection. An attacker can craft an XPath expression that traverses outside the intended node scope, accessing parts of the document the tool was not designed to expose.

import xpath from 'xpath'
import { DOMParser } from '@xmldom/xmldom'

// WRONG: LLM-supplied string concatenated into XPath
server.tool('query_xml', 'Query an XML document by field name', {
  xmlContent: z.string().max(100_000),
  fieldName: z.string().max(200),
}, async ({ xmlContent, fieldName }) => {
  const doc = new DOMParser().parseFromString(xmlContent, 'text/xml')
  // Attack: fieldName = "' or '1'='1" makes the predicate always true (classic XPath injection)
  // Or: fieldName = "'] | //secret-data | //record[@field='" appends a union that selects nodes beyond the intended scope
  const nodes = xpath.select(`//record[@field='${fieldName}']`, doc)
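  // DOM nodes hold circular references (parentNode, ownerDocument), so serialize each to an XML string
  const matches = Array.isArray(nodes) ? nodes.map((node) => node.toString()) : []
  return { content: [{ type: 'text', text: JSON.stringify(matches) }] }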
})
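To see why concatenation is dangerous, it helps to print the expression the XPath engine actually receives. The sketch below reproduces only the string interpolation from the tool above; `buildUnsafeQuery` is a hypothetical name used for illustration, and no XML library is involved.

```typescript
// Hypothetical helper mirroring the unsafe interpolation in the tool above.
function buildUnsafeQuery(fieldName: string): string {
  return `//record[@field='${fieldName}']`
}

console.log(buildUnsafeQuery('email'))
// //record[@field='email']

console.log(buildUnsafeQuery("' or '1'='1"))
// //record[@field='' or '1'='1']

console.log(buildUnsafeQuery("'] | //secret-data | //record[@field='"))
// //record[@field=''] | //secret-data | //record[@field='']
```

In the second case the predicate is always true, so every record matches. In the third, the union operator adds a second location path that the tool never intended to expose.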

// CORRECT 1: Allowlist-validate the fieldName before interpolating
const SAFE_FIELD_NAME_RE = /^[a-zA-Z][a-zA-Z0-9_]{0,63}$/

server.tool('query_xml', 'Query an XML document by field name', {
  xmlContent: z.string().max(100_000),
  // Schema-level allowlist: only safe identifiers allowed
  fieldName: z.string().regex(SAFE_FIELD_NAME_RE, 'Field name must be an identifier (letters, digits, underscores)'),
}, async ({ xmlContent, fieldName }) => {
  const doc = new DOMParser().parseFromString(xmlContent, 'text/xml')
  // After validation, safe to interpolate — SAFE_FIELD_NAME_RE excludes
  // all XPath operators (' " [ ] / @ * . : |)
  const nodes = xpath.select(`//record[@field='${fieldName}']`, doc)
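  // DOM nodes hold circular references (parentNode, ownerDocument), so serialize each to an XML string
  const matches = Array.isArray(nodes) ? nodes.map((node) => node.toString()) : []
  return { content: [{ type: 'text', text: JSON.stringify(matches) }] }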
})
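An identifier allowlist does not fit every case: some attribute values legitimately contain spaces or quotes. XPath 1.0 string literals have no escape sequence, so one common workaround is to quote the value with whichever quote character it does not contain, and to fall back to `concat()` when it contains both. The following is a minimal sketch of that idea; `xpathStringLiteral` is a hypothetical helper, not part of the `xpath` package.

```typescript
// Hypothetical helper: renders an arbitrary string as a safe XPath 1.0 string expression.
function xpathStringLiteral(value: string): string {
  if (!value.includes("'")) return `'${value}'`
  if (!value.includes('"')) return `"${value}"`
  // Contains both quote types: split on single quotes and rejoin the pieces
  // with a double-quoted single quote inside concat().
  const parts = value.split("'").map((part) => `'${part}'`)
  return `concat(${parts.join(`, "'", `)})`
}

console.log(xpathStringLiteral('email'))
console.log(xpathStringLiteral("' or '1'='1"))
console.log(xpathStringLiteral(`a'b"c`))
```

With this helper the expression would be built as `//record[@field=${xpathStringLiteral(fieldName)}]`, so the injected quotes stay inside a string value instead of terminating it. The allowlist remains the simpler choice whenever field names really are identifiers.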

// CORRECT 2: Restrict to a closed set of allowed queries (safest)
const ALLOWED_QUERIES: Record<string, string> = {
  'all_records': '//record',
  'active_records': '//record[@status="active"]',
  'archived_records': '//record[@status="archived"]',
}

server.tool('query_xml', 'Query an XML document', {
  xmlContent: z.string().max(100_000),
  queryName: z.enum(['all_records', 'active_records', 'archived_records']),
}, async ({ xmlContent, queryName }) => {
  const doc = new DOMParser().parseFromString(xmlContent, 'text/xml')
  const xpathExpr = ALLOWED_QUERIES[queryName]  // Fixed expressions only
  const nodes = xpath.select(xpathExpr, doc)
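  // DOM nodes hold circular references (parentNode, ownerDocument), so serialize each to an XML string
  const matches = Array.isArray(nodes) ? nodes.map((node) => node.toString()) : []
  return { content: [{ type: 'text', text: JSON.stringify(matches) }] }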
})
