MCP Server Security — Intrusion Detection & Anomaly

MCP server intrusion detection and anomaly security — behavioral baselines, UEBA for agent sessions, and alerting on tool call deviation

Traditional intrusion detection for web services focuses on network signatures and rate limits. MCP servers require a different approach because the attack often arrives through an LLM agent that has been manipulated by prompt injection: its requests are syntactically valid, properly authenticated, within rate limits, and target tools the user legitimately has access to. Detecting this class of attack requires behavioral anomaly detection at the session level: establish what a normal tool call pattern looks like for a given user and workflow, then alert when the current session deviates. This page covers how to build per-session behavioral baselines, score deviations, detect prompt-injection-driven session takeover mid-flight, and wire alerts without overwhelming the operator with false positives.

Attack 1: Prompt-injection-driven session takeover — legitimate user, manipulated agent

The most dangerous class of MCP intrusion does not involve credential theft. The attacker embeds instructions in content the agent reads — a document, a web page, a database record, a Slack message — that redirect the agent's behavior within the user's session. The agent continues to use the user's credentials, follows the user's permission grants, and calls tools the user legitimately has access to. From the server's perspective, all requests are authorized. The signal is behavioral: the agent is doing things the user would never ask it to do, calling tools in sequences that don't match any human workflow.

Session behavioral baselines detect this by modeling the expected tool call graph for a session type. A document-editing session calls read_file, search_files, and write_file. An email-processing session calls list_emails, read_email, and send_email. If a document-editing session calls list_emails, then send_email, then read_file on /etc/passwd, the session has very likely been hijacked: two of the tools fall outside the session type's expected set, and the third call targets a path no editing workflow needs.
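The expected tool call graph described above can be sketched as a small transition table per session type. This is a minimal illustration, not the page's implementation: `SESSION_GRAPHS`, `findDeviations`, the `START` node, and the specific allowed transitions are all assumptions, and it presumes each session is tagged with a session type when it is created. Only the tool names come from the paragraph above.

```javascript
// Hypothetical expected-transition graphs, one per session type.
// Keys are the previous tool (START = no call yet); values are the tools
// that may follow it. The edges are illustrative, not measured baselines.
const SESSION_GRAPHS = new Map([
  ['document-editing', {
    START: ['search_files', 'read_file'],
    search_files: ['search_files', 'read_file'],
    read_file: ['read_file', 'search_files', 'write_file'],
    write_file: ['read_file', 'search_files', 'write_file'],
  }],
  ['email-processing', {
    START: ['list_emails'],
    list_emails: ['list_emails', 'read_email'],
    read_email: ['read_email', 'list_emails', 'send_email'],
    send_email: ['list_emails', 'read_email'],
  }],
]);

// Replays a session's tool calls (oldest first) against the graph for its
// session type and returns every call that does not fit.
function findDeviations(sessionType, calls) {
  const edges = SESSION_GRAPHS.get(sessionType);
  if (!edges) return [{ index: -1, tool: null, reason: 'unknown_session_type' }];
  const graph = new Map(Object.entries(edges).map(([from, to]) => [from, new Set(to)]));

  const deviations = [];
  let prev = 'START';
  calls.forEach((tool, index) => {
    if (tool === 'START' || !graph.has(tool)) {
      // The tool is not part of this session type at all.
      deviations.push({ index, tool, reason: 'tool_not_in_profile' });
    } else if (!graph.has(prev) || !graph.get(prev).has(tool)) {
      // The tool is known, but not as a successor of the previous call.
      deviations.push({ index, tool, reason: 'unexpected_transition', from: prev });
    }
    prev = tool;
  });
  return deviations;
}
```

Each deviation could contribute to a session's anomaly score rather than blocking outright, since hand-written graphs will miss some legitimate workflows until they are tuned against real sessions.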

// Session behavioral baseline — record tool calls with sequence context
import { createClient } from 'redis';

const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect(); // node-redis v4+ needs an explicit connect() before any command is issued

// Tool call logger middleware — wraps all tool handlers
export function auditingWrapper(toolName, handler) {
  return async (args, context) => {
    const { userId, sessionId } = context;
    const seqKey = `seq:${sessionId}`;
    const seq = await redis.incr(seqKey);
    await redis.expire(seqKey, 3600);

    // Record tool call in session timeline
    await redis.lPush(`session:${sessionId}:calls`, JSON.stringify({
      seq,
      tool: toolName,
      ts: Date.now(),
      argKeys: Object.keys(args || {}),        // which arguments, not their values
      resourceType: inferResourceType(args),    // 'file', 'email', 'shell', etc.
    }));
    await redis.expire(`session:${sessionId}:calls`, 3600);

    // Anomaly check before executing
    const anomalyScore = await scoreAnomaly(sessionId, toolName, args, seq);
    if (anomalyScore >= 90) {
      // CRITICAL: known exfiltration pattern, block immediately.
      // Alert with argument names only; raw values may contain the data being exfiltrated.
      await emitAlert('CRITICAL', { userId, sessionId, toolName, anomalyScore, argKeys: Object.keys(args || {}) });
      throw new Error('Tool call blocked: anomaly score critical');
    }
    if (anomalyScore >= 70) {
      // HIGH: alert only. As written, the call still proceeds after the alert.
      await emitAlert('HIGH', { userId, sessionId, toolName, anomalyScore });
      // In production: require a TOTP/WebAuthn re-auth or human confirmation before continuing
    }

    return handler(args, context);
  };
}

function inferResourceType(args) {
  if (!args) return 'unknown';
  const str = JSON.stringify(args).toLowerCase();
  if (str.includes('/etc/') || str.includes('passwd') || str.includes('shadow')) return 'sensitive-system';
  if (str.includes('.env') || str.includes('secret') || str.includes('token')) return 'sensitive-config';
  // str is serialized JSON, so a path value ends at its closing quote, not at end of string
  if (/\.(sh|bash|py|rb|js)"/.test(str)) return 'executable';
  if (str.match(/exec|spawn|shell|cmd/)) return 'shell';
  return 'generic';
}

async function scoreAnomaly(sessionId, toolName, args, seq) {
  const callsJson = await redis.lRange(`session:${sessionId}:calls`, 0, 9); // last 10
  const recentCalls = callsJson.map(j => JSON.parse(j));

  let score = 0;

  // Privileged tool appearing for first time after non-privileged start
  const privilegedTools = new Set(['exec_shell', 'write_file', 'send_email', 'delete_resource']);
  const prevTools = new Set(recentCalls.slice(1).map(c => c.tool)); // exclude current
  if (privilegedTools.has(toolName) && prevTools.size > 0 && !prevTools.has(toolName)) {
    score += 40; // first privileged call after session established
  }

  // Sensitive resource type
  const resourceType = inferResourceType(args);
  if (resourceType === 'sensitive-system') score += 50;
  if (resourceType === 'sensitive-config') score += 35;
  if (resourceType === 'shell') score += 45;
  if (resourceType === 'executable') score += 25;

  // Argument contains known injection/exfil patterns
  const argStr = JSON.stringify(args || {});
  // pipe to curl, wget fetching a URL, or netcat with flags (\b keeps "rsync -a" from matching "nc -")
  if (argStr.match(/\|\s*curl|wget.*http|\bnc\s+-/)) score += 60;
  if (argStr.match(/base64\s+-d|base64decode/i)) score += 30;            // base64 decode in arg
  if (argStr.match(/\/dev\/tcp|\/proc\/self/)) score += 50;              // Linux special paths
  if (argStr.match(/constructor\.constructor|__proto__|process\.env/)) score += 80; // injection payload

  return Math.min(score, 100);
}
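The wrapper above calls emitAlert, which this page does not define. One way to keep a noisy session from flooding the operator is to suppress repeats of the same alert for a cooldown period. The sketch below is an assumption-laden illustration: `createAlertEmitter`, `sink`, `cooldownMs`, and the dedup key are hypothetical, state lives in process memory (a multi-process deployment would need a shared store), and old keys are never evicted.

```javascript
// Hypothetical alert emitter with per-key cooldown.
// sink: function that delivers an alert (pager, webhook, log); assumed to be supplied by the caller.
// now: injectable clock so the cooldown can be exercised in tests.
function createAlertEmitter({ sink, cooldownMs = 60_000, now = Date.now }) {
  const lastSent = new Map();   // dedup key -> timestamp of last delivered alert
  const suppressed = new Map(); // dedup key -> alerts suppressed since last delivery

  return function emitAlert(severity, details) {
    // One alert stream per severity, session, and alert type or tool.
    const subject = details.type || details.toolName || details.tool;
    const key = [severity, details.sessionId, subject].join('|');
    const t = now();

    const last = lastSent.get(key);
    if (last !== undefined && t - last < cooldownMs) {
      suppressed.set(key, (suppressed.get(key) || 0) + 1);
      return false; // suppressed: an identical alert went out recently
    }

    // Drop raw argument values before delivery; they may hold sensitive data.
    const { args, ...safeDetails } = details;
    sink({ ...safeDetails, severity, ts: t, suppressedSinceLast: suppressed.get(key) || 0 });
    lastSent.set(key, t);
    suppressed.delete(key);
    return true;
  };
}
```

Suppression here affects only notification. A blocked tool call stays blocked whether or not its alert was delivered, and the `suppressedSinceLast` count tells the operator how much activity the cooldown hid.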

Attack 2: Cross-tenant resource access via tool argument manipulation

When an MCP server serves multiple users or tenants, tool arguments that reference resource IDs are a cross-tenant injection vector. A manipulated agent may call read_document(documentId: "tenant-B-doc-123") while authenticated as a tenant-A user. If the tool handler fetches by ID without verifying ownership, the agent exfiltrates cross-tenant data. The primary fix is an ownership check in the handler; the anomaly scanner adds a detection layer on top of it. The anomaly signal: the resource ID in a tool argument doesn't belong to the authenticated user's tenant.

// Cross-tenant anomaly detection in tool argument scanner
function checkCrossTenantAccess(userId, tenantId, toolName, args) {
  const argStr = JSON.stringify(args || {});

  // Extract any IDs that look like resource references
  const idPattern = /["\']([a-f0-9\-]{8,})["\']|id["\s:]+["\']([^"\']+)["\']/gi;
  const matches = [...argStr.matchAll(idPattern)];

  const anomalies = [];
  for (const match of matches) {
    const candidateId = match[1] || match[2];
    if (!candidateId) continue;

    // Check if this ID contains another tenant's prefix
    // Assumes IDs are prefixed: "tenant-{tenantId}-{resourceId}"
    const idTenantMatch = candidateId.match(/^tenant-([^-]+)-/);
    if (idTenantMatch && idTenantMatch[1] !== tenantId) {
      anomalies.push({
        type: 'cross_tenant_id',
        candidateId,
        claimedTenant: idTenantMatch[1],
        actualTenant: tenantId,
        severity: 'CRITICAL',
      });
    }
  }

  return anomalies;
}
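The scanner above works on serialized JSON and only sees IDs that follow a key ending in "id". A variation, sketched here under the same assumed ID convention ("tenant-{tenantId}-{resourceId}"), walks every string value in the arguments instead and pairs the scan with an ownership check on the fetched record. The helpers `stringValues`, `foreignTenantIds`, and `assertOwnedBy`, and the `tenantId` field on the record, are hypothetical.

```javascript
// Yield every string value in the arguments, however deeply nested.
function* stringValues(value) {
  if (typeof value === 'string') {
    yield value;
  } else if (Array.isArray(value)) {
    for (const item of value) yield* stringValues(item);
  } else if (value && typeof value === 'object') {
    for (const item of Object.values(value)) yield* stringValues(item);
  }
}

// Heuristic pre-check: IDs whose tenant prefix differs from the caller's tenant.
// Assumes IDs are prefixed "tenant-{tenantId}-{resourceId}", as in the scanner above.
function foreignTenantIds(tenantId, args) {
  const found = [];
  for (const s of stringValues(args)) {
    const m = s.match(/^tenant-([^-]+)-/);
    if (m && m[1] !== tenantId) found.push(s);
  }
  return found;
}

// Authoritative check, run after the record is fetched. Assumes each stored
// record carries a tenantId field. Missing and foreign records produce the
// same error so the caller cannot probe which IDs exist in other tenants.
function assertOwnedBy(tenantId, record) {
  if (!record || record.tenantId !== tenantId) {
    throw new Error('Resource not found');
  }
  return record;
}
```

The prefix scan depends on the ID naming scheme and can be bypassed by any ID that does not follow it, so it is best treated as an alerting signal. The ownership check does not depend on ID format and is what actually prevents the read.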

Attack 3: Tool call velocity spike — exfiltration via high-throughput reads

A manipulated agent that has found sensitive files may call read_file hundreds of times in rapid succession to exfiltrate an entire directory tree. Traditional rate limits per minute/hour are too coarse to detect this while it's happening. Intra-session velocity tracking — comparing the current 30-second call rate to the session's baseline rate — catches exfiltration bursts within seconds.

// Sliding window velocity tracker
class SessionVelocityTracker {
  constructor(redis) {
    this.redis = redis;
    this.WINDOW_MS = 30_000;
    this.BASELINE_WINDOW = 5; // first 5 calls would establish a baseline; not used by the fixed threshold below
  }

  async record(sessionId, toolName) {
    const tsKey = `vel:${sessionId}:${toolName}`;
    const now = Date.now();
    const windowStart = now - this.WINDOW_MS;

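    // Add current timestamp, remove expired entries.
    // The member carries a random suffix: two calls in the same millisecond would
    // otherwise share one sorted-set member and be counted once.
    const member = `${now}:${Math.random().toString(36).slice(2)}`;
    await this.redis.zAdd(tsKey, { score: now, value: member });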
    await this.redis.zRemRangeByScore(tsKey, 0, windowStart);
    await this.redis.expire(tsKey, 120);

    const countInWindow = await this.redis.zCard(tsKey);
    return countInWindow;
  }

  async isVelocityAnomaly(sessionId, toolName, currentCount) {
    // Simple threshold: > 20 calls of same tool in 30s is suspicious
    const VELOCITY_THRESHOLD = 20;
    return currentCount > VELOCITY_THRESHOLD;
  }
}

// Wire into tool handlers:
const velocityTracker = new SessionVelocityTracker(redis);

server.tool('read_file', async (args, context) => {
  const count = await velocityTracker.record(context.sessionId, 'read_file');
  if (await velocityTracker.isVelocityAnomaly(context.sessionId, 'read_file', count)) {
    await emitAlert('HIGH', {
      type: 'velocity_spike',
      tool: 'read_file',
      count,
      sessionId: context.sessionId,
    });
    // Throttle rather than block — legitimate bulk operations exist
    await new Promise(resolve => setTimeout(resolve, 500));
  }
  return readFile(args.path, context);
});
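The tracker above uses a fixed threshold, while the text describes comparing the current 30-second rate to the session's own baseline. A minimal in-memory sketch of that comparison follows. Everything in it is an assumption for illustration: the class name `BaselineVelocityDetector`, the parameters `spikeFactor`, `minCount`, and `hardLimit`, and the choice to freeze the baseline after the first five calls. The 30-second window and the ceiling of 20 mirror the values used above.

```javascript
// Hypothetical baseline-relative velocity check, kept in process memory.
class BaselineVelocityDetector {
  constructor({ windowMs = 30_000, baselineCalls = 5, spikeFactor = 3, minCount = 5, hardLimit = 20 } = {}) {
    Object.assign(this, { windowMs, baselineCalls, spikeFactor, minCount, hardLimit });
    this.sessions = new Map(); // sessionId -> { timestamps, seen, firstTs, baseline }
  }

  // Record one call and report whether the current window is anomalous.
  record(sessionId, now = Date.now()) {
    let s = this.sessions.get(sessionId);
    if (!s) {
      s = { timestamps: [], seen: 0, firstTs: now, baseline: null };
      this.sessions.set(sessionId, s);
    }
    s.seen += 1;
    s.timestamps.push(now);

    // Keep only timestamps inside the sliding window.
    while (s.timestamps.length && s.timestamps[0] <= now - this.windowMs) s.timestamps.shift();

    // Freeze the baseline once: calls per window, measured over the first N calls.
    if (s.baseline === null && s.seen === this.baselineCalls) {
      const elapsed = Math.max(now - s.firstTs, 1);
      s.baseline = (this.baselineCalls * this.windowMs) / elapsed;
    }

    // Threshold: a multiple of the baseline, never below minCount, and never
    // above hardLimit, so a session that opens with a burst cannot set its own
    // baseline high enough to hide later exfiltration.
    const threshold = s.baseline === null
      ? this.hardLimit
      : Math.min(this.hardLimit, Math.max(this.minCount, s.baseline * this.spikeFactor));

    const count = s.timestamps.length;
    return { count, baseline: s.baseline, threshold, anomalous: count > threshold };
  }
}
```

With the defaults, a session that made five calls over 40 seconds has a baseline of 3.75 calls per window and a threshold of 11.25, so a burst is flagged at its twelfth call in the window rather than its twenty-first. Session entries are never evicted in this sketch; a real deployment would expire them with the session.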

SkillAudit findings

The following findings appear in SkillAudit audit reports for MCP servers lacking behavioral anomaly detection:

CRITICAL  No behavioral anomaly detection — prompt-injection session takeover undetectable. The server logs authentication events but not tool call sequences. A manipulated agent calling privileged tools in patterns no human would initiate cannot be detected or blocked in real time. Implement per-session tool call logging with anomaly scoring.

CRITICAL  No cross-tenant resource ID validation in tool arguments. Tool arguments containing resource IDs are not checked against the authenticated user's tenant before the tool handler executes. A manipulated agent can access cross-tenant data by passing IDs belonging to other tenants in tool call arguments.

HIGH  No velocity anomaly detection — bulk exfiltration undetected. The server imposes per-minute rate limits but does not track intra-session tool call velocity. A manipulated agent can read hundreds of files in a burst without triggering any alert. Add per-tool sliding-window velocity tracking with session-level baseline comparison.

HIGH  Sensitive system paths accessible via tool arguments without anomaly flag. Tool arguments referencing /etc/, /proc/, .env, or credential file patterns are not scored as anomalous and do not trigger alerts. A manipulated agent reading sensitive configuration files is indistinguishable from normal operation in the server's logs.

MEDIUM  No first-privileged-tool alert for sessions starting with non-privileged calls. Sessions that begin with read-only tool calls and then escalate to write/exec/send calls do not trigger any anomaly signal. Privilege escalation within a session is a useful prompt-injection indicator, although legitimate read-then-write workflows produce it too, so it works best combined with other signals.
