Topic: mcp server caching security

MCP server caching security — cache poisoning, key collision, and stale credential risks

MCP servers often cache API responses, access control decisions, or expensive computation results to reduce latency. Each caching layer introduces security risks: LLM-controlled cache key injection lets an attacker poison the cache for other users, shared in-memory caches without user scoping leak data across sessions, long-lived access control caches serve stale authorization decisions after a user's permissions change, and unbounded cache growth creates a memory exhaustion DoS vector. This page covers the defense for each.

Attack 1: LLM-controlled cache key injection

If an MCP tool uses an LLM-supplied argument directly as a cache key, an attacker who can influence the LLM's tool calls can poison the cache for other users or cause cache collisions. With a lookup like cache.get(args.query), where args.query is LLM-controlled, the model can be steered into supplying a query string that matches another user's cached entry, and then read or overwrite that entry.

import { createHash } from 'crypto'
import { LRUCache } from 'lru-cache'

// WRONG: raw LLM input as cache key — allows key injection and cross-user leakage
const cache = new Map()

server.tool('search', 'Search the knowledge base', {
  query: z.string().max(500),
}, async ({ query }, { session }) => {
  if (cache.has(query)) {  // LLM can supply any string as cache key
    return { content: [{ type: 'text', text: JSON.stringify(cache.get(query)) }] }
  }
  const result = await callSearchApi(query)
  cache.set(query, result)  // Unbounded growth; shared across all users
  return { content: [{ type: 'text', text: JSON.stringify(result) }] }
})

// CORRECT: user-scoped, hashed cache key; bounded LRU cache
const searchCache = new LRUCache({
  max: 1000,          // At most 1000 entries total
  ttl: 5 * 60 * 1000, // Entries expire after 5 minutes
})

function makeCacheKey(userId: string, query: string): string {
  // userId comes from the authenticated session, not from the LLM, so tool
  // arguments cannot select another user's entries. Hashing only normalizes key length.
  return createHash('sha256')
    .update(`user:${userId}:search:${query}`)
    .digest('hex')
}

server.tool('search', 'Search the knowledge base', {
  query: z.string().min(1).max(500),
}, async ({ query }, { session }) => {
  const key = makeCacheKey(session.userId, query)
  const cached = searchCache.get(key)
  if (cached) {
    return { content: [{ type: 'text', text: JSON.stringify(cached) }] }
  }
  const result = await callSearchApi(query)
  searchCache.set(key, result)
  return { content: [{ type: 'text', text: JSON.stringify(result) }] }
})
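
The hashed key above joins its parts with ':'. If a user ID could ever contain that delimiter, two different (userId, query) pairs could produce the same pre-hash string. Whether that is reachable depends on the ID format, so the following is only a minimal sketch of one way to rule it out: encode the parts as a JSON array before hashing. The helper names makeScopedKey and naiveKey are hypothetical and not part of any MCP SDK.

```typescript
import { createHash } from 'node:crypto'

// Hypothetical helper (illustrative only). JSON.stringify on an array quotes and
// escapes each part, so the boundaries between parts stay unambiguous even when
// a part itself contains ':'.
function makeScopedKey(userId: string, namespace: string, input: string): string {
  const encoded = JSON.stringify([userId, namespace, input])
  return createHash('sha256').update(encoded).digest('hex')
}

// Naive delimiter join, shown only for comparison.
function naiveKey(userId: string, namespace: string, input: string): string {
  return `user:${userId}:${namespace}:${input}`
}

// Two different (userId, query) pairs that the naive join cannot tell apart.
const naiveA = naiveKey('alice', 'search', 'x:search:y')
const naiveB = naiveKey('alice:search:x', 'search', 'y')

// The JSON-encoded variant keeps the same two pairs distinct.
const scopedA = makeScopedKey('alice', 'search', 'x:search:y')
const scopedB = makeScopedKey('alice:search:x', 'search', 'y')
```

If user IDs are guaranteed to come from a restricted alphabet (for example UUIDs), the simpler template-string key is already unambiguous and this extra encoding is optional.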

Attack 2: Cross-user data leakage via shared caches

In-memory caches scoped to a process (not to a user or session) share data across all concurrent users of an HTTP-transport MCP server. If user A fetches a private document and the result is cached without user scoping, user B can retrieve it from the cache by making the same tool call — without ever having access to the original document.

// WRONG: process-level cache with no user scoping
const documentCache = new LRUCache({ max: 500 })

server.tool('get_document', 'Get a document by ID', {
  documentId: z.string().regex(/^[a-zA-Z0-9_-]{1,64}$/),
}, async ({ documentId }, { session }) => {
  if (documentCache.has(documentId)) {
    // User B can get user A's private document if documentId is the same
    return { content: [{ type: 'text', text: JSON.stringify(documentCache.get(documentId)) }] }
  }
  const doc = await fetchDocumentWithAccessControl(documentId, session.userId)
  documentCache.set(documentId, doc)  // Stored without user scope
  return { content: [{ type: 'text', text: JSON.stringify(doc) }] }
})

// CORRECT: cache key includes userId — each user has their own cache space
server.tool('get_document', 'Get a document by ID', {
  documentId: z.string().regex(/^[a-zA-Z0-9_-]{1,64}$/),
}, async ({ documentId }, { session }) => {
  // Key includes userId — user B cannot read user A's cached document
  const key = `user:${session.userId}:doc:${documentId}`
  const cached = documentCache.get(key)
  if (cached) {
    return { content: [{ type: 'text', text: JSON.stringify(cached) }] }
  }
  const doc = await fetchDocumentWithAccessControl(documentId, session.userId)
  documentCache.set(key, doc)
  return { content: [{ type: 'text', text: JSON.stringify(doc) }] }
})
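
Building the scoped key by hand in every tool handler is easy to forget. One option, sketched here under assumptions, is a thin wrapper whose get and set signatures require a user ID, so an unscoped lookup cannot be written at all. The class name UserScopedCache is hypothetical, and the sketch is backed by a plain Map only for brevity; a real deployment would put a bounded store with a TTL behind it.

```typescript
// Hypothetical wrapper (illustrative only): every read and write must name a user.
class UserScopedCache<V> {
  // Plain Map for brevity; swap in a bounded cache in production.
  private readonly store = new Map<string, V>()

  // JSON-encode the pair so a userId containing ':' cannot collide with another key.
  private key(userId: string, resourceId: string): string {
    return JSON.stringify([userId, resourceId])
  }

  get(userId: string, resourceId: string): V | undefined {
    return this.store.get(this.key(userId, resourceId))
  }

  set(userId: string, resourceId: string, value: V): void {
    this.store.set(this.key(userId, resourceId), value)
  }
}

// User A caches a private document under their own scope.
const scopedDocs = new UserScopedCache<{ title: string }>()
scopedDocs.set('user-a', 'doc-42', { title: 'Private notes' })

// The same document ID under a different user is a cache miss, so user B's
// request falls through to the access-controlled fetch.
const seenByA = scopedDocs.get('user-a', 'doc-42')
const seenByB = scopedDocs.get('user-b', 'doc-42')
```

The trade-off is a lower hit rate: a document that several users may legitimately read is cached once per user rather than once per process.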

Attack 3: Stale access control decisions

MCP servers that cache access control decisions (can this user read this repo? does this user have admin scope?) for long periods serve stale authorization after permissions change. A user whose API key was revoked, who was removed from an org, or whose token's scope was narrowed continues to have tool access until the cache expires. This is an authorization bypass that scales with TTL length.

import { LRUCache } from 'lru-cache'

// Authorization decisions: short TTL, small cache
const authzCache = new LRUCache({
  max: 10_000,
  ttl: 60 * 1000,  // 60 seconds — authz decisions expire quickly
  // After 60 seconds, the next call re-checks with the actual API
})

async function canAccessRepo(userId: string, owner: string, repo: string): Promise<boolean> {
  const key = `authz:${userId}:repo:${owner}/${repo}`
  const cached = authzCache.get(key)
  if (cached !== undefined) return cached

  // Cache miss or expired entry: re-check with the live API (at most once per 60s per key)
  const allowed = await checkGitHubRepoAccess(userId, owner, repo)
  authzCache.set(key, allowed)
  return allowed
}

// For sensitive operations (delete, admin actions): skip the cache entirely
async function canAdminRepo(userId: string, owner: string, repo: string): Promise<boolean> {
  // No cache for admin checks — always call the live API
  return checkGitHubAdminAccess(userId, owner, repo)
}
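
A short TTL bounds the stale window but does not close it. If the server receives a revocation signal (for example a webhook or an admin action; which signal exists is an assumption here), it can also drop that user's cached decisions immediately. The sketch below is a self-contained illustration with a hand-rolled TTL store and an injectable clock; the class name RevocableAuthzCache and its methods are hypothetical, and the store is unbounded for brevity.

```typescript
type AuthzEntry = { allowed: boolean; expiresAt: number }

// Hypothetical sketch (illustrative only): TTL cache with per-user invalidation.
class RevocableAuthzCache {
  private readonly entries = new Map<string, AuthzEntry>()
  private readonly keysByUser = new Map<string, Set<string>>()
  private readonly ttlMs: number
  private readonly now: () => number

  // `now` is injectable so expiry can be exercised without real waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs
    this.now = now
  }

  private key(userId: string, resource: string): string {
    return JSON.stringify([userId, resource])
  }

  get(userId: string, resource: string): boolean | undefined {
    const key = this.key(userId, resource)
    const entry = this.entries.get(key)
    if (entry === undefined) return undefined
    if (entry.expiresAt <= this.now()) {
      // Expired: drop the decision so the caller re-checks the live API.
      this.entries.delete(key)
      return undefined
    }
    return entry.allowed
  }

  set(userId: string, resource: string, allowed: boolean): void {
    const key = this.key(userId, resource)
    this.entries.set(key, { allowed, expiresAt: this.now() + this.ttlMs })
    let keys = this.keysByUser.get(userId)
    if (keys === undefined) {
      keys = new Set<string>()
      this.keysByUser.set(userId, keys)
    }
    keys.add(key)
  }

  // Call when a key is revoked or a user is removed: removes every cached decision for them.
  invalidateUser(userId: string): void {
    this.keysByUser.get(userId)?.forEach((key) => this.entries.delete(key))
    this.keysByUser.delete(userId)
  }
}
```

A handler for the revocation signal would call invalidateUser(userId), after which the next tool call for that user misses the cache and goes back to the live check.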

Attack 4: Memory DoS via unbounded cache growth

A plain JavaScript Map or object used as a cache grows without bound. An attacker who controls the cache key input (directly or via prompt injection) can fill the cache with unique keys until the process exhausts its heap. Once the V8 heap limit is reached (the default depends on the Node.js version and available system memory, and can be changed with --max-old-space-size), the process crashes with an out-of-memory error.

// WRONG: unbounded Map — grows forever
const resultCache = new Map()

// CORRECT: LRU cache with entry limit and TTL
import { LRUCache } from 'lru-cache'

const resultCache = new LRUCache({
  max: 2000,           // Evict LRU entry when over 2000 entries
  maxSize: 50 * 1024 * 1024,  // Evict when over 50 MB total (if sizeCalculation provided)
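  // Assumes cached values are strings; lru-cache requires a positive integer size,
  // so an empty string is counted as 1 byte
  sizeCalculation: (value) => Buffer.byteLength(value, 'utf8') || 1,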
  ttl: 10 * 60 * 1000, // Entries expire after 10 minutes regardless of LRU
})
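
To make the eviction behavior concrete, here is a minimal dependency-free sketch of an entry-bounded LRU built on Map's insertion order. It is not how lru-cache is implemented and is not a replacement for it (there is no TTL or byte-size limit here); the class name BoundedCache is hypothetical. It only illustrates why a hard entry cap keeps a flood of unique keys from growing the cache.

```typescript
// Minimal LRU sketch (illustrative only) using Map's insertion order.
class BoundedCache<V> {
  private readonly store = new Map<string, V>()
  private readonly maxEntries: number

  constructor(maxEntries: number) {
    if (!Number.isInteger(maxEntries) || maxEntries < 1) {
      throw new RangeError('maxEntries must be a positive integer')
    }
    this.maxEntries = maxEntries
  }

  get(key: string): V | undefined {
    if (!this.store.has(key)) return undefined
    const value = this.store.get(key) as V
    // Re-insert so this key becomes the most recently used.
    this.store.delete(key)
    this.store.set(key, value)
    return value
  }

  set(key: string, value: V): void {
    this.store.delete(key)
    this.store.set(key, value)
    if (this.store.size > this.maxEntries) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.store.keys().next().value as string
      this.store.delete(oldest)
    }
  }

  get size(): number {
    return this.store.size
  }
}

// Simulated key flood: 10,000 unique keys against a cap of 3 entries.
const floodTarget = new BoundedCache<string>(3)
for (let i = 0; i < 10_000; i++) {
  floodTarget.set(`flood-${i}`, 'x')
}
```

After the loop, the cache holds only the three most recent keys. An entry cap alone does not bound memory if individual values can be arbitrarily large, which is why the lru-cache configuration above also sets maxSize.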

What SkillAudit checks

Raw LLM-supplied argument used directly as Map or cache key — WARN; cache key injection and potential cross-user collisions
Process-level cache (Map/LRUCache) with no userId/sessionId in key construction — HIGH on multi-user HTTP deployments; data leakage across sessions
Access control decision cached with TTL > 5 minutes — WARN; stale AuthZ bypass after permission revocation
Plain Map or object used as cache with no size limit — WARN; memory exhaustion DoS via unique key flooding
