Security Engineering · June 2026 · 12 min read

MCP Server Behavioral Intrusion Detection: Building a Session Anomaly Detector from Scratch

Static analysis and input validation stop the vulnerabilities you anticipated. A behavioral intrusion detector stops the ones you didn't. This post builds a complete session anomaly detector for MCP servers — per-session call baseline, first-privileged-tool detection, sliding-window velocity tracking, and composite anomaly scoring — in a single composable middleware, in under 2ms per tool call.

Why static analysis isn't enough

When SkillAudit audits an MCP server, static analysis reliably catches a defined class of vulnerabilities: SSRF-prone URL construction, command injection via shell execution, unguarded path.join() calls, hardcoded credentials. These are structural — they exist in the code at rest, and a scanner can identify them without running the server.

But a growing class of MCP attacks isn't visible in the source code at all:

None of these appear in the static audit. They require runtime behavioral analysis: watching what the session actually does, building a per-session model of normal behavior, and flagging deviation from it. This is behavioral intrusion detection for MCP servers.

Scope note: This post builds an in-process detector that runs inside your MCP server. It doesn't replace a WAF or network-level SIEM — it's complementary. The advantage is that it has full context: the session identity, the decoded tool call, the argument values, and the result — before the response is sent.

What the detector needs to observe

Before writing code, decide what signals you'll collect. A behavioral detector that observes everything is expensive and noisy. The goal is a minimal signal set that covers the attack classes above with a low false-positive rate.

Call velocity

Calls per second and calls per minute for each tool, measured in sliding windows. Spike detection catches automated abuse and DoS attempts.

Tool sequence

Which tools follow which other tools in the session. Exfiltration chains have characteristic sequences even when individual tool calls look normal.

First-time privilege escalation

A session that calls a high-privilege tool for the first time after N minutes of low-privilege calls. Characteristic of session hijacking and prompt injection escalation.

Argument entropy

Base64-encoded or high-entropy string arguments to low-risk tools can indicate in-band exfiltration encoding. Entropy check is O(argument length), not expensive.

Error rate

Tool calls that consistently return errors are characteristic of automated probing — trying path values, testing parameter boundaries. High error rate signals enumeration.

Session age at privilege call

A privileged tool called within the first 5 seconds of a session suggests automated exploitation, not user-initiated workflow.

Data structure: the session record

The detector maintains one in-memory record per active session. Sessions are keyed by session ID (from the MCP auth token or a generated session UUID if your server is stateless). Each record holds the running behavioral state.

// session-store.js
class SessionRecord {
  constructor(sessionId, firstCallAt) {
    this.sessionId = sessionId;
    this.firstCallAt = firstCallAt;
    this.lastCallAt = firstCallAt;
    this.totalCalls = 0;
    this.totalErrors = 0;

    // Tool call counts by tool name
    this.toolCounts = new Map();          // toolName → count

    // Privilege history: set of privileged tools ever called this session
    this.privilegedToolsSeen = new Set();

    // Call sequence: last N tool names (ring buffer, N=20)
    this.recentSequence = [];
    this.SEQUENCE_MAX = 20;

    // Sliding window: timestamps of calls made in the last 60 seconds (for velocity).
    // Entries are evicted by age in recordCall(), not capped by count.
    this.callTimestamps = [];

    // Previous tool (for sequence pair tracking)
    this.previousTool = null;

    // Suspicious pair counts: "toolA→toolB" → count
    this.sequencePairs = new Map();

    // Anomaly score accumulator (resets on session clear)
    this.anomalyScore = 0;
    this.anomalyEvents = [];
  }

  recordCall(toolName, isPrivileged, isError, timestamp) {
    this.lastCallAt = timestamp;
    this.totalCalls++;
    if (isError) this.totalErrors++;

    // Tool count
    this.toolCounts.set(toolName, (this.toolCounts.get(toolName) ?? 0) + 1);

    // Privileged tool tracking
    if (isPrivileged) this.privilegedToolsSeen.add(toolName);

    // Sliding window (evict entries older than 60s)
    const windowStart = timestamp - 60_000;
    this.callTimestamps.push(timestamp);
    while (this.callTimestamps.length > 0 && this.callTimestamps[0] < windowStart) {
      this.callTimestamps.shift();
    }

    // Sequence ring buffer
    this.recentSequence.push(toolName);
    if (this.recentSequence.length > this.SEQUENCE_MAX) {
      this.recentSequence.shift();
    }

    // Sequence pair
    if (this.previousTool !== null) {
      const pair = `${this.previousTool}→${toolName}`;
      this.sequencePairs.set(pair, (this.sequencePairs.get(pair) ?? 0) + 1);
    }
    this.previousTool = toolName;
  }

  get callsLastMinute() {
    return this.callTimestamps.length;
  }

  get ageMs() {
    return this.lastCallAt - this.firstCallAt;
  }

  get errorRate() {
    return this.totalCalls === 0 ? 0 : this.totalErrors / this.totalCalls;
  }
}

const store = new Map(); // sessionId → SessionRecord

export function getOrCreateSession(sessionId) {
  if (!store.has(sessionId)) {
    store.set(sessionId, new SessionRecord(sessionId, Date.now()));
  }
  return store.get(sessionId);
}

export function clearSession(sessionId) {
  store.delete(sessionId);
}

// Evict sessions idle for more than 2 hours
setInterval(() => {
  const cutoff = Date.now() - 2 * 60 * 60 * 1000;
  for (const [id, rec] of store) {
    if (rec.lastCallAt < cutoff) store.delete(id);
  }
}, 5 * 60 * 1000);

Memory footprint: A SessionRecord for a high-volume session (60 call timestamps, 20-element sequence ring buffer, 50 tool counts, 100 sequence pairs) uses roughly 8–12 KB. At 10,000 concurrent sessions that's 80–120 MB — acceptable for most deployments, but add a hard cap on store.size if your server sees session floods.
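
One way to add that hard cap is to lean on the insertion order of a Map and evict the least recently used session when the store is full. The sketch below is a minimal illustration, not part of the store above: createBoundedStore, makeRecord, and the MAX_SESSIONS value are hypothetical names chosen for the example.

```javascript
// Hypothetical bounded session store (names are illustrative, not from session-store.js).
const MAX_SESSIONS = 10_000;

function createBoundedStore(maxSessions = MAX_SESSIONS) {
  const store = new Map(); // a Map iterates its keys in insertion order

  // Return the existing record, or create one with makeRecord(sessionId).
  function getOrCreate(sessionId, makeRecord) {
    let rec = store.get(sessionId);
    if (rec !== undefined) {
      // Re-insert so the most recently used session moves to the end
      store.delete(sessionId);
      store.set(sessionId, rec);
      return rec;
    }
    if (store.size >= maxSessions) {
      // The first key is now the least recently used session
      const oldest = store.keys().next().value;
      store.delete(oldest);
    }
    rec = makeRecord(sessionId);
    store.set(sessionId, rec);
    return rec;
  }

  return {
    getOrCreate,
    has: (sessionId) => store.has(sessionId),
    size: () => store.size,
  };
}
```

Evicting under pressure keeps memory bounded, but it also means a flood of new session IDs can push out the records of older sessions. Rejecting new sessions once the cap is reached is the stricter alternative if that matters for your deployment.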

The anomaly scoring engine

Each rule contributes an integer to the session's anomalyScore. Scores are additive. An action is taken when the total crosses a threshold. This approach avoids binary trip wires: a single slightly fast call doesn't trigger, but three marginally-anomalous signals in the same session do.

The scoring weights below are starting points. Tune them on your production traffic by collecting sessions and labeling them; then use the distribution of scores on normal traffic to set your actual thresholds.

// anomaly-scorer.js
const THRESHOLDS = {
  // Velocity: calls per 60-second sliding window
  VELOCITY_WARN:   30,   // +5 points at or above this
  VELOCITY_HIGH:   60,   // +15 points at or above this
  VELOCITY_CRITICAL: 120, // +40 points at or above this (candidate for a pre-execution block)

  // Error rate
  ERROR_RATE_WARN: 0.30, // +8 if ≥30% errors
  ERROR_RATE_HIGH: 0.60, // +20 if ≥60% errors

  // First privileged call within N ms of session start
  PRIV_ESCALATION_FAST_MS: 5_000,  // +25 if privileged call within 5s
  PRIV_ESCALATION_LATE_MS: 300_000, // +15 if first priv call after 5 min of low-priv

  // Argument entropy (Shannon entropy threshold for base64-like content)
  ENTROPY_HIGH: 4.5, // bits per char; ASCII text ≈ 3.5, base64 ≈ 5.5–6
};

// Shannon entropy of a string, in bits per character
function shannonEntropy(str) {
  const freq = new Map();
  let len = 0; // count code points so the total matches the for...of iteration
  for (const ch of str) {
    freq.set(ch, (freq.get(ch) ?? 0) + 1);
    len++;
  }
  let h = 0;
  for (const count of freq.values()) {
    const p = count / len;
    h -= p * Math.log2(p);
  }
  return h;
}

export function scoreCall(session, toolName, isPrivileged, argString, isError, now) {
  const events = [];
  let delta = 0;

  // --- 1. Velocity ---
  const cpm = session.callsLastMinute;
  if (cpm >= THRESHOLDS.VELOCITY_CRITICAL) {
    delta += 40; events.push({ rule: 'velocity_critical', cpm });
  } else if (cpm >= THRESHOLDS.VELOCITY_HIGH) {
    delta += 15; events.push({ rule: 'velocity_high', cpm });
  } else if (cpm >= THRESHOLDS.VELOCITY_WARN) {
    delta += 5;  events.push({ rule: 'velocity_warn', cpm });
  }

  // --- 2. Error rate ---
  if (session.totalCalls >= 5) {  // don't score until we have enough calls
    const er = session.errorRate;
    if (er >= THRESHOLDS.ERROR_RATE_HIGH) {
      delta += 20; events.push({ rule: 'error_rate_high', errorRate: er });
    } else if (er >= THRESHOLDS.ERROR_RATE_WARN) {
      delta += 8;  events.push({ rule: 'error_rate_warn', errorRate: er });
    }
  }

  // --- 3. First-time privileged tool ---
  // The middleware calls session.recordCall() before scoreCall(), so
  // privilegedToolsSeen already contains toolName at this point. A per-tool
  // count of exactly 1 is what identifies the first use in this session.
  if (isPrivileged && session.toolCounts.get(toolName) === 1) {
    const sessionAgeMs = now - session.firstCallAt;
    if (sessionAgeMs < THRESHOLDS.PRIV_ESCALATION_FAST_MS) {
      delta += 25; events.push({ rule: 'priv_fast', ageMs: sessionAgeMs, tool: toolName });
    } else if (session.totalCalls >= 10 && sessionAgeMs > THRESHOLDS.PRIV_ESCALATION_LATE_MS) {
      delta += 15; events.push({ rule: 'priv_late_escalation', ageMs: sessionAgeMs, tool: toolName });
    }
  }

  // --- 4. Argument entropy ---
  if (argString && argString.length > 32) {
    const entropy = shannonEntropy(argString);
    if (entropy > THRESHOLDS.ENTROPY_HIGH) {
      delta += 10; events.push({ rule: 'high_entropy_arg', entropy: entropy.toFixed(2), tool: toolName });
    }
  }

  // --- 5. Suspicious tool-chain pair ---
  const EXFIL_PAIRS = new Set([
    'readFile→callUrl', 'readFile→sendEmail',
    'readSecret→callUrl', 'execShell→callUrl',
    'listFiles→readFile', // can chain into exfil
  ]);
  // recordCall() has already appended the current tool and overwritten
  // previousTool, so the preceding tool is the second-to-last sequence entry.
  const precedingTool = session.recentSequence.at(-2) ?? null;
  const currentPair = precedingTool ? `${precedingTool}→${toolName}` : null;
  if (currentPair && EXFIL_PAIRS.has(currentPair)) {
    delta += 30; events.push({ rule: 'suspicious_sequence', pair: currentPair });
  }

  session.anomalyScore += delta;
  session.anomalyEvents.push(...events.map(e => ({ ...e, at: now, tool: toolName })));

  return { delta, total: session.anomalyScore, events };
}

Middleware: wiring it into your MCP server

The scorer runs inside a thin middleware wrapper that intercepts every tool call, updates the session record, and either passes the call through or blocks it based on the anomaly score threshold.

// behavioral-ids.middleware.js
import { getOrCreateSession, clearSession } from './session-store.js';
import { scoreCall } from './anomaly-scorer.js';
import { createLogger } from './logger.js';

const log = createLogger('behavioral-ids');

// Tools that require elevated privilege (customize per your server)
const PRIVILEGED_TOOLS = new Set([
  'execShell', 'readSecret', 'writeFile', 'callUrl',
  'deleteRecord', 'grantPermission', 'sendEmail',
]);

const SCORE_THRESHOLDS = {
  LOG:   10,   // emit structured log event
  ALERT: 40,   // emit alert + add to response header
  BLOCK: 80,   // reject the call with 403-equivalent
};

export function behavioralIDS(handler) {
  return async function(toolName, args, context) {
    const sessionId = context.sessionId ?? context.connectionId ?? 'anon';
    const now = Date.now();

    const session = getOrCreateSession(sessionId);
    const isPrivileged = PRIVILEGED_TOOLS.has(toolName);

    // Call the underlying handler first so we can track errors
    let result, isError = false;
    try {
      result = await handler(toolName, args, context);
    } catch (err) {
      isError = true;
      result = null;
      // Record the call before re-throwing
      const argString = JSON.stringify(args);
      session.recordCall(toolName, isPrivileged, true, now);
      const { total, events } = scoreCall(session, toolName, isPrivileged, argString, true, now);
      if (total >= SCORE_THRESHOLDS.LOG) {
        log.warn({ sessionId, toolName, total, events }, 'anomaly-score-on-error');
      }
      throw err;
    }

    const argString = JSON.stringify(args);
    session.recordCall(toolName, isPrivileged, isError, now);
    const { total, events } = scoreCall(session, toolName, isPrivileged, argString, isError, now);

    if (total >= SCORE_THRESHOLDS.BLOCK) {
      log.error({ sessionId, toolName, total, events }, 'behavioral-ids-block');
      clearSession(sessionId);
      throw new Error(`Session blocked: anomaly score ${total} exceeds threshold`);
    }

    if (total >= SCORE_THRESHOLDS.ALERT) {
      log.warn({ sessionId, toolName, total, events }, 'behavioral-ids-alert');
      // Optionally: emit to your SIEM / alert webhook here
    } else if (total >= SCORE_THRESHOLDS.LOG) {
      log.info({ sessionId, toolName, total, events }, 'behavioral-ids-log');
    }

    return result;
  };
}

Handler-first vs. middleware-first ordering

The middleware above calls the underlying handler before scoring. This means a malicious call gets executed before it's blocked — which sounds backwards. The alternative (score first, execute second) has a different problem: you're scoring based on the session state from the previous call, not the current one, so the first call of an attack chain always gets through.

The pragmatic approach is to score after execution for most rules, but add a pre-execution check for velocity-critical: if session.callsLastMinute >= VELOCITY_CRITICAL before the call, block immediately without executing. This is the one case where the risk of executing (DoS from volume alone) outweighs the diagnostic value of letting it proceed.
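
A minimal sketch of that pre-execution check follows. It assumes the session object exposes the callTimestamps array from the session record; callsInWindow and shouldBlockBeforeExecution are hypothetical helper names. The count is recomputed against the current time because stale entries are only evicted when recordCall() runs, so the stored array can overstate the rate after an idle period.

```javascript
const VELOCITY_CRITICAL = 120; // same value as THRESHOLDS.VELOCITY_CRITICAL
const WINDOW_MS = 60_000;

// Count timestamps inside the window ending at `now`, ignoring stale entries
// that have not been evicted yet.
function callsInWindow(callTimestamps, now, windowMs = WINDOW_MS) {
  const windowStart = now - windowMs;
  let count = 0;
  for (const t of callTimestamps) {
    if (t >= windowStart) count++;
  }
  return count;
}

// Returns true when the call should be rejected before the handler runs.
function shouldBlockBeforeExecution(session, now) {
  return callsInWindow(session.callTimestamps, now) >= VELOCITY_CRITICAL;
}
```

In the middleware this would sit right after getOrCreateSession(), throwing before handler() is invoked when it returns true.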

First-privileged-tool detection in depth

The privilege escalation signal is the most valuable single indicator in the detector. The reasoning: most legitimate MCP workflows establish what tools they need in the first few calls. An agent tasked with "summarize these files" calls readFile and listFiles from the start. It doesn't call readFile for ten minutes and then suddenly call execShell.

When that pattern appears — normal low-privilege calls, then an unexpected first use of a high-privilege tool — one of two things happened:

  1. The original user task changed mid-session (low signal, normal product behavior).
  2. A prompt injection in a tool response redirected the agent to a new objective requiring different tools (high signal, attack indicator).

Distinguishing them requires additional context you may not have in the detector. The practical approach: trigger an alert (not a block) on late privilege escalation, then review those sessions in your audit log. Over time you'll identify which tool transitions are legitimate in your product (task context can shift) and which never legitimately occur (e.g., a file-summarizer tool that calls sendEmail after reading files is always suspicious).

You can refine the signal by adding a tool classification taxonomy to your server — categorize each tool as read_only, write_local, network_egress, privilege_change, or destructive — and track which categories appear in the session. A jump from read_only-only sessions to network_egress is a sharper signal than tool-name matching alone.
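
The taxonomy idea can be sketched as a small tracker that records which categories a session has touched and reports the first appearance of a new one. The TOOL_CATEGORY assignments, the CategoryTracker class, and isCategoryJump are assumptions made for illustration; your server's classification will differ.

```javascript
// Hypothetical tool → category map using the taxonomy described above.
const TOOL_CATEGORY = new Map([
  ['listFiles', 'read_only'],
  ['readFile', 'read_only'],
  ['writeFile', 'write_local'],
  ['callUrl', 'network_egress'],
  ['sendEmail', 'network_egress'],
  ['grantPermission', 'privilege_change'],
  ['deleteRecord', 'destructive'],
]);

// Tracks which categories a session has used so far.
class CategoryTracker {
  constructor() {
    this.seen = new Set();
    this.calls = 0;
  }

  // Records one call and describes how it relates to the session so far.
  observe(toolName) {
    const category = TOOL_CATEGORY.get(toolName) ?? 'unclassified';
    const isNew = !this.seen.has(category);
    // True when every earlier call in the session was read_only
    const wasReadOnlyOnly =
      this.calls > 0 && this.seen.size === 1 && this.seen.has('read_only');
    this.seen.add(category);
    this.calls++;
    return { category, isNew, wasReadOnlyOnly };
  }
}

// A read_only-only session that reaches network_egress for the first time.
function isCategoryJump(observation) {
  return (
    observation.isNew &&
    observation.wasReadOnlyOnly &&
    observation.category === 'network_egress'
  );
}
```

A session that opens with a network_egress call is deliberately not treated as a jump here; that case is what the fast-privilege rule is for.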

Sliding-window velocity: the right implementation

A common mistake is implementing velocity as a simple counter that resets on a fixed interval (e.g., every minute). This creates a burst window: an attacker can call 59 times at the end of minute 1 and 59 times at the start of minute 2 without triggering a 60/min threshold, getting 118 calls through in 2 seconds.

The callTimestamps array in the session record above implements a proper sliding window: each call appends a timestamp, and entries older than 60 seconds are evicted. The number of entries is therefore always the number of calls in the last 60 real seconds, not in the last calendar minute. This eliminates the burst-straddling attack and gives accurate rate measurement regardless of call timing.

The tradeoff is memory: the array stores one timestamp per call made in the last 60 seconds, so its size tracks the call rate rather than a fixed cap. At 8 bytes per timestamp (a JavaScript number is a 64-bit float), a session sitting at the VELOCITY_CRITICAL rate of 120 calls per minute holds about 960 bytes. Entirely manageable, provided sessions above that rate are blocked rather than left to keep growing the array.
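
The burst-straddling problem is easy to reproduce. The sketch below replays the same 118 timestamps through a fixed-interval counter and a sliding window; both helper functions are illustrative and are not part of the session record.

```javascript
const WINDOW_MS = 60_000;

// Fixed-interval counter: the highest count seen in any calendar window.
function maxFixedWindowCount(timestamps, windowMs) {
  const counts = new Map(); // window index → calls in that window
  for (const t of timestamps) {
    const bucket = Math.floor(t / windowMs);
    counts.set(bucket, (counts.get(bucket) ?? 0) + 1);
  }
  return Math.max(0, ...counts.values());
}

// Sliding window: the highest count seen in any trailing windowMs span.
// Timestamps must be in ascending order.
function maxSlidingWindowCount(timestamps, windowMs) {
  const recent = [];
  let max = 0;
  for (const t of timestamps) {
    recent.push(t);
    while (recent[0] < t - windowMs) recent.shift();
    max = Math.max(max, recent.length);
  }
  return max;
}

// 59 calls in the last second of minute 1, then 59 in the first second of minute 2
const burst = [];
for (let i = 0; i < 59; i++) burst.push(59_000 + i * 10);
for (let i = 0; i < 59; i++) burst.push(60_000 + i * 10);

console.log(maxFixedWindowCount(burst, WINDOW_MS));   // 59: stays under a 60/min limit
console.log(maxSlidingWindowCount(burst, WINDOW_MS)); // 118: the burst is visible
```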

Per-tool velocity vs. aggregate velocity: The implementation above measures aggregate calls per minute across all tools. For a more precise detector, maintain per-tool sliding windows: a session that calls listFiles 60 times in a minute is suspicious, while a power user's session that makes 60 calls spread across many different tools may not be. The memory cost is (tool count) × (window size) per session instead of just (window size) per session.
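
A per-tool variant might look like the following sketch, which keeps one timestamp array per tool name. PerToolVelocity is a hypothetical class, not something the session record above already contains.

```javascript
// One sliding window per tool name, evicted by age like the aggregate window.
class PerToolVelocity {
  constructor(windowMs = 60_000) {
    this.windowMs = windowMs;
    this.windows = new Map(); // toolName → ascending array of timestamps
  }

  // Records a call and returns that tool's call count within the window.
  record(toolName, timestamp) {
    let timestamps = this.windows.get(toolName);
    if (timestamps === undefined) {
      timestamps = [];
      this.windows.set(toolName, timestamps);
    }
    timestamps.push(timestamp);
    const windowStart = timestamp - this.windowMs;
    while (timestamps.length > 0 && timestamps[0] < windowStart) {
      timestamps.shift();
    }
    return timestamps.length;
  }
}
```

The value returned by record() can be compared against a per-tool threshold in the same way the aggregate count is compared against VELOCITY_WARN and the other velocity limits.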

Argument entropy: detecting in-band encoding

In-band exfiltration via tool arguments works like this: an injected prompt instructs the LLM to base64-encode sensitive data and pass it as an argument to an innocuous-looking tool — perhaps as a label, description, or metadata field. The tool is legitimate; the argument content is the malicious exfiltration payload.

Shannon entropy is an efficient detector for this. Base64-encoded text has entropy around 5.5–6 bits per character (high uniformity across the 64-character alphabet). Normal human-readable text has entropy around 3.5–4 bits per character. A string argument longer than 32 characters with entropy above 4.5 is worth flagging — not necessarily blocking on its own, but adding to the composite score.

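The threshold matters. You'll get false positives on legitimately random values such as base64-encoded nonces, API tokens, and JWTs. Hex-only values (UUIDs, hex-encoded hashes) draw from 16 or 17 distinct characters, so on their own they cannot exceed about 4.1 bits per character and stay under the 4.5 threshold; they can still raise the result when mixed with other text, because the scorer measures the whole JSON-serialized argument object. Reduce false positives by excluding arguments whose type is declared as uuid or hash in your MCP schema: check the tool's inputSchema type annotation before running the entropy check.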
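
As a sketch of that exclusion, the function below scores each string argument separately and skips fields whose schema entry declares an exempt format. It assumes inputSchema is a JSON Schema object with a properties map; the EXEMPT_FORMATS set is an assumption ("uuid" is a standard JSON Schema format, "hash" would be your own convention), and highEntropyFields is a hypothetical helper.

```javascript
// Shannon entropy in bits per character (same calculation as the scorer).
function shannonEntropy(str) {
  const freq = new Map();
  let len = 0;
  for (const ch of str) {
    freq.set(ch, (freq.get(ch) ?? 0) + 1);
    len++;
  }
  let h = 0;
  for (const count of freq.values()) {
    const p = count / len;
    h -= p * Math.log2(p);
  }
  return h;
}

// Assumption: formats your schema uses for values that are random by design.
const EXEMPT_FORMATS = new Set(['uuid', 'hash']);

// Returns the string fields whose entropy exceeds the threshold,
// skipping short values and fields with an exempt format.
function highEntropyFields(args, inputSchema, threshold = 4.5, minLength = 32) {
  const flagged = [];
  const properties = inputSchema?.properties ?? {};
  for (const [field, value] of Object.entries(args)) {
    if (typeof value !== 'string' || value.length <= minLength) continue;
    if (EXEMPT_FORMATS.has(properties[field]?.format)) continue;
    const entropy = shannonEntropy(value);
    if (entropy > threshold) flagged.push({ field, entropy });
  }
  return flagged;
}
```

Scoring fields one at a time also keeps keys and JSON punctuation out of the measurement, which the whole-object approach mixes in.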

Suspicious sequence pairs: the exfiltration chain signal

The EXFIL_PAIRS set in the scorer hard-codes known-bad tool sequences. This is simple and effective for the common cases (readFile→callUrl is rarely legitimate), but it requires manual maintenance as you add new tools.

A more robust approach: classify every tool by whether it reads local state, whether it makes network egress calls, and whether it handles sensitive categories (credentials, PII, encryption keys). Flag any session that transitions from a sensitive-category read to a network egress call within N calls — regardless of specific tool names. This rule generalizes across server versions and tool renames.
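
That rule can be sketched with a traits table and a lookback over the recent sequence. The TOOL_TRAITS entries, the lookback of 5 calls, and egressAfterSensitiveRead are assumptions for illustration; priorTools stands for the tools called before the current one, such as a copy of recentSequence taken before the current call is recorded.

```javascript
// Hypothetical per-tool traits; classify your own tools the same way.
const TOOL_TRAITS = new Map([
  ['readSecret', { sensitiveRead: true, egress: false }],
  ['readFile', { sensitiveRead: false, egress: false }],
  ['listFiles', { sensitiveRead: false, egress: false }],
  ['callUrl', { sensitiveRead: false, egress: true }],
  ['sendEmail', { sensitiveRead: false, egress: true }],
]);

// True when the current tool is an egress tool and any of the previous
// `lookback` calls read a sensitive category, whatever the tools are named.
function egressAfterSensitiveRead(priorTools, toolName, lookback = 5) {
  if (!TOOL_TRAITS.get(toolName)?.egress) return false;
  const window = priorTools.slice(-lookback);
  return window.some((name) => TOOL_TRAITS.get(name)?.sensitiveRead === true);
}
```

Renaming readSecret or adding a second egress tool only requires a new row in the traits table, not a new entry for every pair.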

You can also learn the normal sequence pairs from production traffic. Collect previousTool→currentTool pairs from non-flagged sessions over a week, build a frequency distribution, and flag pairs that appear in fewer than 0.1% of sessions. This statistical baseline automatically covers normal product behavior without manual enumeration.
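
A minimal sketch of that statistical baseline is below. It assumes you can export each non-flagged session as an ordered array of tool names; buildPairBaseline and isRarePair are hypothetical helper names, and the 0.1% cutoff is passed in as a fraction.

```javascript
// sessions: array of tool-name arrays, one per session judged normal.
function buildPairBaseline(sessions) {
  const sessionsWithPair = new Map(); // pair → number of sessions containing it
  for (const tools of sessions) {
    const pairs = new Set(); // count each pair once per session
    for (let i = 1; i < tools.length; i++) {
      pairs.add(`${tools[i - 1]}→${tools[i]}`);
    }
    for (const pair of pairs) {
      sessionsWithPair.set(pair, (sessionsWithPair.get(pair) ?? 0) + 1);
    }
  }
  return { sessionsWithPair, totalSessions: sessions.length };
}

// True when the pair appears in fewer than minFraction of baseline sessions.
function isRarePair(baseline, pair, minFraction = 0.001) {
  if (baseline.totalSessions === 0) return false; // no baseline yet, so don't flag
  const seen = baseline.sessionsWithPair.get(pair) ?? 0;
  return seen / baseline.totalSessions < minFraction;
}
```

Counting sessions rather than raw occurrences stops one chatty session from making an unusual pair look common.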

Composite scoring: tuning and false-positive management

The scoring weights in this implementation are reasonable starting points, not gospel. Before deploying to production, run the detector in log-only mode (scores computed and logged but thresholds never triggered) against your production traffic for a week.

Then look at the score distribution:

Score 0–9: ~85% of sessions
Score 10–39: ~12% (review these)
Score 40–79: ~2.5% (alert)
Score 80+: ~0.5% (block)

If 15% of sessions are scoring in the alert band, your weights are too aggressive: reduce the error-rate or entropy weights first (they have the highest false-positive risk). If nothing is scoring above 10, your weights are too conservative: lower the velocity thresholds or add more sequence pairs.

The goal is a distribution where the block threshold catches sessions that also appear suspicious in your audit log on manual review. If you're blocking sessions that look legitimate in the audit log, your threshold is too low. If you're not blocking sessions that later turn out to be attacks, your weights are too low or your block threshold is too high.
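
To get that distribution out of a log-only run, something like the following sketch is enough. It assumes you have collected one final anomaly score per session; scoreDistribution and the band names are illustrative, and the band edges mirror the log, alert, and block thresholds of 10, 40, and 80.

```javascript
// finalScores: one final anomaly score per session from the log-only run.
// Returns the fraction of sessions that landed in each band.
function scoreDistribution(finalScores) {
  const counts = { normal: 0, log: 0, alert: 0, block: 0 };
  for (const score of finalScores) {
    if (score >= 80) counts.block++;
    else if (score >= 40) counts.alert++;
    else if (score >= 10) counts.log++;
    else counts.normal++;
  }
  const total = finalScores.length || 1; // avoid dividing by zero on empty input
  return Object.fromEntries(
    Object.entries(counts).map(([band, n]) => [band, n / total])
  );
}
```

Comparing the resulting fractions against the shares you expect from normal traffic shows which band is over- or under-populated before any threshold is enforced.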

Performance budget

A behavioral detector that adds 50ms per tool call is a non-starter for a production MCP server. The implementation above is designed to be fast:

Operation | Complexity | Typical latency
getOrCreateSession() | O(1) Map lookup | <0.01ms
recordCall() | O(W) window eviction (W = calls in the last 60 s) | <0.1ms
scoreCall(): velocity, error rate, privilege | O(1) | <0.05ms
scoreCall(): entropy check | O(n) string scan (n = argument length) | <0.5ms for 1KB arg
scoreCall(): sequence pair lookup | O(1) Set lookup | <0.01ms
Total overhead per call | | <1ms typical, <2ms worst-case

The only variable-cost operation is entropy scoring. For large argument strings (over 4KB), cap the entropy check at the first 512 characters: a payload that is high-entropy from the start shows its pattern within that prefix, and scanning all of a 100KB argument adds latency for little extra signal. The cap is a tradeoff, since an attacker who pads the start of the argument with ordinary text can hide a payload behind it.
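
A capped version of the entropy check could look like this sketch. cappedEntropy and its option names are hypothetical; the defaults follow the 4KB and 512-character figures above.

```javascript
// Entropy in bits per character, sampling only a prefix of very large strings.
function cappedEntropy(str, { capAbove = 4096, sampleChars = 512 } = {}) {
  // Strings at or below capAbove are scanned in full
  const sample = str.length > capAbove ? str.slice(0, sampleChars) : str;
  const freq = new Map();
  let len = 0;
  for (const ch of sample) {
    freq.set(ch, (freq.get(ch) ?? 0) + 1);
    len++;
  }
  let h = 0;
  for (const count of freq.values()) {
    const p = count / len;
    h -= p * Math.log2(p);
  }
  return h;
}
```

If prefix padding is a concern in your threat model, sampling a second slice from the end of the string is a cheap extension of the same idea.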

What SkillAudit checks in a server audit

When SkillAudit audits an MCP server for intrusion detection readiness, it checks for the structural preconditions that make a behavioral detector deployable:

CRITICAL −20 No session identity on tool calls — calls are anonymous, making per-session baseline impossible. Detector cannot function without session ID propagation.
HIGH −16 No tool call logging at all — behavioral analysis requires an event stream. If calls aren't logged, post-hoc forensic analysis of attacks is also impossible.
HIGH −14 Tool schema has no privilege classification — all tools treated as equivalent, which eliminates the privilege escalation signal (the most reliable attack indicator).
MEDIUM −10 No rate limiting at any layer — even without a behavioral detector, a hard rate limit prevents the worst velocity-based abuse.
MEDIUM −8 Session records not cleared on explicit logout — stale session records accumulate in memory, and their behavioral baseline may carry over if session IDs are reused.

You can audit your own server's intrusion detection posture at SkillAudit — the behavioral analysis checks run as part of the standard security audit alongside static SSRF, injection, and credential checks.

Complete integration example

Putting the three modules together with a minimal MCP server stub:

// server.js — minimal MCP server with behavioral IDS
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { behavioralIDS } from './behavioral-ids.middleware.js';
import { z } from 'zod';
import fs from 'fs/promises';

const server = new McpServer({ name: 'example-server', version: '1.0.0' });

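// Raw tool handler (unwrapped). behavioralIDS invokes it as
// handler(toolName, args, context), so args is the second parameter.
async function rawReadFileTool(_toolName, args) {
  const filePath = args.path;
  // ... safe path validation (see mcp-server-path-normalization-security) ...
  const text = await fs.readFile(filePath, 'utf8');
  // MCP tool results carry an array of typed content items
  return { content: [{ type: 'text', text }] };
}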

// Wrap with behavioral IDS
const secureReadFileTool = behavioralIDS(rawReadFileTool);

server.tool(
  'readFile',
  'Read a file from the allowed directory',
  { path: z.string().describe('Path within /data/') },
  async (args, context) => {
    return secureReadFileTool('readFile', args, context);
  }
);

const transport = new StdioServerTransport();
await server.connect(transport);

Next step: Once the in-process detector is running, export the anomaly events to your logging infrastructure (Pino structured logs → CloudWatch Logs Insights, or Loki if self-hosted). Build a query that shows session IDs with score > 40 over the last 24 hours — that's your daily security review queue. See the audit logging pipeline post for the structured log format that makes these queries efficient.

Summary

A behavioral intrusion detector for MCP servers is four components working together:

  1. Session record — per-session in-memory state: call timestamps (sliding window), tool counts, privileged-tool set, sequence ring buffer, anomaly score accumulator.
  2. Anomaly scorer — five rules (velocity, error rate, privilege escalation, argument entropy, sequence pairs) each contributing weighted points to a composite score.
  3. Threshold handler — log at 10, alert at 40, block at 80. Tunable per traffic pattern.
  4. Middleware wrapper — intercepts every tool call, updates session state, applies scorer, enforces thresholds — in under 2ms.

The most valuable single signal is first-time privilege escalation: a session that establishes a low-privilege pattern for minutes and then suddenly calls a high-privilege tool for the first time. This pattern catches prompt-injection-driven tool abuse that is invisible to static analysis.

For further reading, see the companion reference pages: MCP server intrusion detection and anomaly security, session fixation and hijacking, and the zero-trust architecture deep-dive for how behavioral detection fits into a layered security model.