MCP server caching security — cache poisoning, key collision, and stale credential risks
MCP servers often cache API responses, access control decisions, or expensive computation results to reduce latency. Each caching layer introduces security risks: LLM-controlled cache key injection lets an attacker poison the cache for other users, shared in-memory caches without user scoping leak data across sessions, long-lived access control caches serve stale authorization decisions after a user's permissions change, and unbounded cache growth creates a memory exhaustion DoS vector. This page covers the defense for each.
Attack 1: LLM-controlled cache key injection
If an MCP tool uses an LLM-supplied argument directly as a cache key, an attacker who can influence the LLM's tool calls (for example through prompt injection) can poison the cache for other users or force key collisions. With a lookup like cache.get(args.query), where args.query is LLM-controlled, the model can supply a string that matches another user's cached entry and then read or overwrite it.
import { createHash } from 'crypto'
import { LRUCache } from 'lru-cache'
// WRONG: raw LLM input as cache key — allows key injection and cross-user leakage
const cache = new Map()
server.tool('search', 'Search the knowledge base', {
query: z.string().max(500),
}, async ({ query }, { session }) => {
if (cache.has(query)) { // LLM can supply any string as cache key
return { content: [{ type: 'text', text: JSON.stringify(cache.get(query)) }] }
}
const result = await callSearchApi(query)
cache.set(query, result) // Unbounded growth; shared across all users
return { content: [{ type: 'text', text: JSON.stringify(result) }] }
})
// CORRECT: user-scoped, hashed cache key; bounded LRU cache
const searchCache = new LRUCache({
max: 1000, // At most 1000 entries total
ttl: 5 * 60 * 1000, // Entries expire after 5 minutes
})
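function makeCacheKey(userId: string, query: string): string {
// userId comes from the authenticated session, not from the LLM, so a tool
// call cannot address another user's entries. JSON-encoding the fields keeps
// their boundaries unambiguous; hashing yields a fixed-length key.
return createHash('sha256')
.update(JSON.stringify(['search', userId, query]))
.digest('hex')
}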
server.tool('search', 'Search the knowledge base', {
query: z.string().min(1).max(500),
}, async ({ query }, { session }) => {
const key = makeCacheKey(session.userId, query)
const cached = searchCache.get(key)
if (cached) {
return { content: [{ type: 'text', text: JSON.stringify(cached) }] }
}
const result = await callSearchApi(query)
searchCache.set(key, result)
return { content: [{ type: 'text', text: JSON.stringify(result) }] }
})
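One detail worth checking in any composite cache key is whether its fields can run into each other. The sketch below compares a colon-joined key with a JSON-encoded one. It assumes user IDs may contain the ':' character, which depends on your identity provider; the names naiveKey and safeKey and the sample values are hypothetical.

```typescript
import { createHash } from 'node:crypto'

// Colon-joined fields: a ':' inside one field can imitate a field boundary.
function naiveKey(userId: string, query: string): string {
  return createHash('sha256')
    .update(`user:${userId}:search:${query}`)
    .digest('hex')
}

// JSON-encoded fields: each field is quoted and escaped, so boundaries
// stay unambiguous whatever characters the inputs contain.
function safeKey(userId: string, query: string): string {
  return createHash('sha256')
    .update(JSON.stringify(['search', userId, query]))
    .digest('hex')
}

// Hypothetical pair: different users and different queries that produce
// the same colon-joined string, "user:alice:search:x:search:y".
const a = { userId: 'alice', query: 'x:search:y' }
const b = { userId: 'alice:search:x', query: 'y' }

const naiveCollides = naiveKey(a.userId, a.query) === naiveKey(b.userId, b.query)
const safeCollides = safeKey(a.userId, a.query) === safeKey(b.userId, b.query)

console.log({ naiveCollides, safeCollides })
// → { naiveCollides: true, safeCollides: false }
```

If user IDs are guaranteed to be opaque identifiers without delimiter characters (UUIDs, for instance), the colon-joined form does not collide in this way; the JSON encoding simply removes the need to rely on that guarantee.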
Attack 2: Cross-user data leakage via shared caches
In-memory caches scoped to a process (not to a user or session) share data across all concurrent users of an HTTP-transport MCP server. If user A fetches a private document and the result is cached without user scoping, user B can retrieve it from the cache by making the same tool call — without ever having access to the original document.
// WRONG: process-level cache with no user scoping
const documentCache = new LRUCache({ max: 500 })
server.tool('get_document', 'Get a document by ID', {
documentId: z.string().regex(/^[a-zA-Z0-9_-]{1,64}$/),
}, async ({ documentId }, { session }) => {
if (documentCache.has(documentId)) {
// User B can get user A's private document if documentId is the same
return { content: [{ type: 'text', text: JSON.stringify(documentCache.get(documentId)) }] }
}
const doc = await fetchDocumentWithAccessControl(documentId, session.userId)
documentCache.set(documentId, doc) // Stored without user scope
return { content: [{ type: 'text', text: JSON.stringify(doc) }] }
})
// CORRECT: cache key includes userId — each user has their own cache space
server.tool('get_document', 'Get a document by ID', {
documentId: z.string().regex(/^[a-zA-Z0-9_-]{1,64}$/),
}, async ({ documentId }, { session }) => {
// Key includes userId — user B cannot read user A's cached document
const key = `user:${session.userId}:doc:${documentId}`
const cached = documentCache.get(key)
if (cached) {
return { content: [{ type: 'text', text: JSON.stringify(cached) }] }
}
const doc = await fetchDocumentWithAccessControl(documentId, session.userId)
documentCache.set(key, doc)
return { content: [{ type: 'text', text: JSON.stringify(doc) }] }
})
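Scoping by hand works only as long as every call site remembers to do it. One possible way to make the scoping hard to forget is a thin wrapper whose get and set require a user ID. This is a sketch, not part of any library: UserScopedCache and KeyValueStore are hypothetical names, and a plain Map stands in for the backing store only to keep the example dependency-free (a real server would pass a bounded cache with compatible get and set methods).

```typescript
// Minimal shape the wrapper needs from its backing store (hypothetical).
interface KeyValueStore<V> {
  get(key: string): V | undefined
  set(key: string, value: V): unknown
}

// Every read and write requires a userId, so an unscoped key cannot be
// created by accident at a call site.
class UserScopedCache<V> {
  private readonly store: KeyValueStore<V>

  constructor(store: KeyValueStore<V>) {
    this.store = store
  }

  private key(userId: string, resource: string): string {
    // JSON encoding keeps the two fields unambiguous.
    return JSON.stringify([userId, resource])
  }

  get(userId: string, resource: string): V | undefined {
    return this.store.get(this.key(userId, resource))
  }

  set(userId: string, resource: string, value: V): void {
    this.store.set(this.key(userId, resource), value)
  }
}

// Usage: user A caches a document; user B asks for the same document ID.
// The Map is unbounded and used here for illustration only.
const docs = new UserScopedCache<string>(new Map<string, string>())
docs.set('user-a', 'doc-42', 'private contents')

const forA = docs.get('user-a', 'doc-42')
const forB = docs.get('user-b', 'doc-42')
console.log(forA, forB)
// → private contents undefined
```

User B's miss falls through to the access-controlled fetch, which is where the authorization decision belongs.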
Attack 3: Stale access control decisions
MCP servers that cache access control decisions (can this user read this repo? does this user have admin scope?) for long periods serve stale authorization after permissions change. A user whose API key was revoked, who was removed from an org, or whose token's scope was narrowed continues to have tool access until the cache expires. This is an authorization bypass that scales with TTL length.
import { LRUCache } from 'lru-cache'
// Authorization decisions: short TTL, small cache
const authzCache = new LRUCache<string, boolean>({
max: 10_000,
ttl: 60 * 1000, // 60 seconds — authz decisions expire quickly
// After 60 seconds, the next call re-checks with the actual API
})
async function canAccessRepo(userId: string, owner: string, repo: string): Promise<boolean> {
const key = `authz:${userId}:repo:${owner}/${repo}`
const cached = authzCache.get(key)
if (cached !== undefined) return cached
// Cache miss or expired entry: re-check with the live API (at most once per 60 seconds per key)
const allowed = await checkGitHubRepoAccess(userId, owner, repo)
authzCache.set(key, allowed)
return allowed
}
// For sensitive operations (delete, admin actions): skip the cache entirely
async function canAdminRepo(userId: string, owner: string, repo: string): Promise<boolean> {
// No cache for admin checks — always call the live API
return checkGitHubAdminAccess(userId, owner, repo)
}
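A short TTL bounds how long a stale decision can live, but it does not remove the window. If your system emits some signal when permissions change (a revocation webhook or an admin action, both assumed here), the cache can also drop a user's decisions immediately. The sketch below is one possible shape for that: RevocableAuthzCache is a hypothetical class, it uses a plain Map with no entry limit to stay dependency-free, and the clock is injectable only so expiry can be shown without waiting.

```typescript
interface Decision {
  allowed: boolean
  expiresAt: number // epoch milliseconds
}

class RevocableAuthzCache {
  // Outer key: userId. Inner key: resource. Grouping by user makes
  // per-user invalidation a single delete.
  private readonly byUser = new Map<string, Map<string, Decision>>()
  private readonly ttlMs: number
  private readonly now: () => number

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs
    this.now = now
  }

  get(userId: string, resource: string): boolean | undefined {
    const decision = this.byUser.get(userId)?.get(resource)
    if (decision === undefined) return undefined
    if (this.now() >= decision.expiresAt) {
      // Expired: remove it so the caller re-checks with the live API.
      this.byUser.get(userId)?.delete(resource)
      return undefined
    }
    return decision.allowed
  }

  set(userId: string, resource: string, allowed: boolean): void {
    let decisions = this.byUser.get(userId)
    if (decisions === undefined) {
      decisions = new Map<string, Decision>()
      this.byUser.set(userId, decisions)
    }
    decisions.set(resource, { allowed, expiresAt: this.now() + this.ttlMs })
  }

  // Call this when a permission change for the user is observed.
  invalidateUser(userId: string): void {
    this.byUser.delete(userId)
  }
}

// Usage with a fake clock (values are illustrative).
let fakeTime = 0
const authz = new RevocableAuthzCache(60_000, () => fakeTime)

authz.set('alice', 'acme/widgets', true)
const beforeRevoke = authz.get('alice', 'acme/widgets') // cached decision
authz.invalidateUser('alice')
const afterRevoke = authz.get('alice', 'acme/widgets') // forces a live re-check

authz.set('bob', 'acme/widgets', true)
fakeTime = 60_000
const afterTtl = authz.get('bob', 'acme/widgets') // TTL elapsed

console.log(beforeRevoke, afterRevoke, afterTtl)
// → true undefined undefined
```

Explicit invalidation complements the TTL rather than replacing it: the TTL still covers permission changes that the server never hears about.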
Attack 4: Memory DoS via unbounded cache growth
A plain JavaScript Map or object used as a cache grows without bound. An attacker who controls the cache key input (directly or via prompt injection) can fill the cache with unique keys until the process exhausts its heap. Once the heap reaches the V8 limit, Node.js aborts with an out-of-memory error. The default limit depends on the Node.js version and the memory available on the host, and can be changed with --max-old-space-size.
import { LRUCache } from 'lru-cache'
// WRONG: unbounded Map, grows until the process runs out of memory
const unboundedCache = new Map()
// CORRECT: LRU cache with an entry limit, a byte budget, and a TTL
// Values are assumed to be serialized strings so their size can be measured
const resultCache = new LRUCache<string, string>({
max: 2000, // Evict the least recently used entry beyond 2000 entries
maxSize: 50 * 1024 * 1024, // Evict when the total measured size exceeds 50 MB
sizeCalculation: (value) => Buffer.byteLength(value, 'utf8'), // Supplies the per-entry size that maxSize needs
ttl: 10 * 60 * 1000, // Entries expire after 10 minutes regardless of LRU
})
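To see why an entry limit defeats unique-key flooding, it can help to look at a stripped-down LRU. The sketch below is not how lru-cache is implemented; BoundedCache is a hypothetical class that relies on Map preserving insertion order, and it limits entry count only, not bytes.

```typescript
class BoundedCache<V> {
  // Map iterates in insertion order, so the first key is the least
  // recently used one as long as reads and writes re-insert their key.
  private readonly map = new Map<string, V>()
  private readonly max: number

  constructor(max: number) {
    this.max = max
  }

  get(key: string): V | undefined {
    if (!this.map.has(key)) return undefined
    const value = this.map.get(key) as V
    // Re-insert to mark the key as most recently used.
    this.map.delete(key)
    this.map.set(key, value)
    return value
  }

  set(key: string, value: V): void {
    this.map.delete(key)
    this.map.set(key, value)
    if (this.map.size > this.max) {
      // Evict the least recently used entry.
      const oldest = this.map.keys().next().value
      if (oldest !== undefined) this.map.delete(oldest)
    }
  }

  get size(): number {
    return this.map.size
  }
}

// Simulated flood: 10,000 unique attacker keys against a 3-entry cache,
// while a legitimate key keeps being read.
const bounded = new BoundedCache<string>(3)
bounded.set('hot', 'kept')
for (let i = 0; i < 10_000; i++) {
  bounded.get('hot')
  bounded.set(`attacker-${i}`, 'x')
}

const sizeAfterFlood = bounded.size
const hotStillCached = bounded.get('hot') === 'kept'
console.log({ sizeAfterFlood, hotStillCached })
// → { sizeAfterFlood: 3, hotStillCached: true }
```

The flood can still push out entries that are not being read, so a bounded cache turns a crash into a lower hit rate; rate limiting addresses the remaining cost.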
What SkillAudit checks
- Raw LLM-supplied argument used directly as Map or cache key — WARN; cache key injection and potential cross-user collisions
- Process-level cache (Map/LRUCache) with no userId/sessionId in key construction — HIGH on multi-user HTTP deployments; data leakage across sessions
- Access control decision cached with TTL > 5 minutes — WARN; stale AuthZ bypass after permission revocation
- Plain Map or object used as cache with no size limit — WARN; memory exhaustion DoS via unique key flooding
See also
- MCP server role-based access — access control implementation patterns
- MCP server rate limiting — rate limiting to prevent cache-flooding attacks
- MCP server memory leak security — memory management and leak prevention
- MCP server security checklist — comprehensive pre-submission checklist
- Public audit corpus — caching findings across scanned servers