
MCP server audit logging — what to log, what to redact, and how to route it

Audit logging in an MCP server has a constraint that doesn't exist in conventional API servers: on the stdio transport, stdout is the protocol channel, and the agent host reads everything written to it. Anything the server writes to stdout becomes part of the MCP protocol stream. At best it corrupts the JSON-RPC framing; at worst it is echoed back to the LLM, appears in the agent's conversation context, or triggers downstream tool calls. A console.log(`Searching with query: ${query}`) in a tool handler is not just a noisy log line; it is a potential credential echo and a prompt-injection surface. Getting logging right requires understanding this constraint before picking a log format.

What to log

Every MCP tool invocation should produce one structured log event with these fields:

{
  "ts": "2026-05-31T14:22:01.432Z",    // ISO 8601, millisecond precision
  "level": "info",
  "event": "tool_call",
  "tool": "search",                    // tool name — safe to log
  "arg_keys": ["query", "limit"],      // argument NAMES only, never values
  "caller": "claude-code/3.2.1",       // agent user agent if available
  "duration_ms": 234,
  "status": "ok",                      // ok | error | timeout | rate_limited
  "error_code": null                   // error type if status != ok
}

The arg_keys field is the critical design choice. Log argument names, not values. Argument values may contain user PII, injected instructions, or the value that triggered an SSRF, and none of these belong in a log store, where they become a secondary exfiltration surface. The argument names alone are sufficient for forensic analysis of which tool was called and with what parameter shape.
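The event shape above can be sketched as a type plus a small builder that only ever reads the argument names. The names AuditEvent, ToolStatus, and buildAuditEvent are illustrative and do not come from any SDK; the nowMs option is an assumption added so the example can be checked deterministically.

```typescript
// Illustrative names only: AuditEvent, ToolStatus, buildAuditEvent are not SDK types.
type ToolStatus = 'ok' | 'error' | 'timeout' | 'rate_limited';

interface AuditEvent {
  ts: string;                 // ISO 8601, millisecond precision
  level: 'info';              // every tool call is logged at info, as in the schema above
  event: 'tool_call';
  tool: string;
  arg_keys: string[];         // argument NAMES only
  caller: string | null;
  duration_ms: number;
  status: ToolStatus;
  error_code: string | null;
}

function buildAuditEvent(
  tool: string,
  args: Record<string, unknown>,
  startMs: number,
  status: ToolStatus,
  opts: { caller?: string; errorCode?: string; nowMs?: number } = {},
): AuditEvent {
  // nowMs is injectable (an assumption of this sketch) so tests are deterministic.
  const now = opts.nowMs ?? Date.now();
  return {
    ts: new Date(now).toISOString(),
    level: 'info',
    event: 'tool_call',
    tool,
    // Object.keys reads top-level names and never touches the values.
    arg_keys: Object.keys(args).sort(),
    caller: opts.caller ?? null,
    duration_ms: now - startMs,
    status,
    error_code: opts.errorCode ?? null,
  };
}
```

Sorting the keys is a design choice of this sketch: it makes two calls with the same parameter shape produce identical arg_keys arrays, which simplifies grouping in a log store.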

What not to log

Argument values — includes query strings, URLs, file paths, user-provided text. If you need full argument logging for debugging, gate it behind an explicit DEBUG_LOG_ARGS=true env var that is never set in production, and redact the log output before it reaches any persistent store.
API response bodies — downstream API responses may contain PII, financial data, or credentials the vendor echoed back in the error payload. Log the response status code and size, not the body.
Credentials of any form — env-var names that look like secrets (VENDOR_API_KEY), token values in error messages, Authorization header contents. Use a structured logger with a redact list and test it: send a request that includes a fake token string and verify it does not appear in the log output.
Stack traces to stdout — on stdio-transport, an unhandled exception that writes a stack trace to stdout corrupts the JSON-RPC framing and may expose internal paths. Catch all unhandled rejections and write structured error events to stderr instead.
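A minimal sketch of two of the rules above: gating raw-argument logging behind DEBUG_LOG_ARGS, and turning unhandled errors and stray console.log calls into structured events on a sink other than stdout. The helper names (fatalEvent, mayLogArgs, installGuards) are hypothetical, the NODE_ENV check is an extra assumption not stated in the text, and the sink is injected so the logic can be exercised without touching real process streams.

```typescript
// Hypothetical helpers; the sink is injected so nothing here writes to real streams.
type Sink = (line: string) => void;

// Structured error event carrying only the error type. The message and stack
// are omitted because both can contain paths, argument values, or tokens.
function fatalEvent(kind: 'uncaught_exception' | 'unhandled_rejection', reason: unknown): string {
  const errorType = reason instanceof Error ? reason.name : typeof reason;
  return JSON.stringify({
    ts: new Date().toISOString(),
    level: 'error',
    event: kind,
    error_type: errorType,
  });
}

// Raw arguments may be logged only when DEBUG_LOG_ARGS is exactly "true"
// and (assumption of this sketch) NODE_ENV is not "production".
function mayLogArgs(env: Record<string, string | undefined>): boolean {
  return env.DEBUG_LOG_ARGS === 'true' && env.NODE_ENV !== 'production';
}

function installGuards(
  proc: { on(event: string, listener: (reason: unknown) => void): unknown },
  cons: { log: (...args: unknown[]) => void },
  sink: Sink,
): void {
  proc.on('uncaughtException', (err) => sink(fatalEvent('uncaught_exception', err)));
  proc.on('unhandledRejection', (reason) => sink(fatalEvent('unhandled_rejection', reason)));
  // Stray console.log calls would land on stdout; record only that one happened.
  cons.log = (...args: unknown[]) =>
    sink(JSON.stringify({ level: 'debug', event: 'console_log', arg_count: args.length }));
}

// Possible wiring on a real Node.js server:
// installGuards(process, console, (line) => process.stderr.write(line + '\n'));
```

A real server would probably also exit after logging an uncaught exception, since the process state is no longer trustworthy at that point; the sketch leaves that decision to the caller.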

Layer 1 — structured logger with redaction

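import pino from 'pino';

// All log output goes to stderr, never stdout, on stdio-transport servers
const log = pino({
  level: process.env.LOG_LEVEL ?? 'info',
  // pino's defaults are an epoch-ms "time" field and a numeric "level".
  // These two options emit "ts" (ISO 8601) and a string level instead,
  // so the output matches the event schema shown earlier.
  timestamp: () => `,"ts":"${new Date().toISOString()}"`,
  formatters: { level: (label) => ({ level: label }) },
  redact: {
    paths: [
      'token', 'apiKey', 'api_key', 'secret', 'password', 'authorization',
      'credentials', 'access_token', 'refresh_token',
      '*.token', '*.api_key', '*.secret'
    ],
    remove: true   // remove the field entirely rather than replacing with [REDACTED]
  }
}, pino.destination(2));   // fd 2 = stderr

export function logToolCall(
  tool: string,
  argKeys: string[],
  startMs: number,
  status: string,
  errorCode?: string,
  caller?: string          // agent user agent, if the host provides one
) {
  log.info({
    event: 'tool_call',
    tool,
    arg_keys: argKeys,
    caller: caller ?? null,
    duration_ms: Date.now() - startMs,
    status,
    error_code: errorCode ?? null
  });
}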

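pino's redact option is path-based: it removes matching fields before serialization, so the value never enters the string encoding step. Because matching is by key path, it does not catch a secret embedded inside a string value such as an error message, and a wildcard path like '*.token' matches only one level of nesting. Test it by passing a known-pattern token as an argument in your integration tests and asserting it does not appear in the parsed log output.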
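The canary test described above could look roughly like the following. The helper names (findLeaks, safeParse, containsValue) are hypothetical. The check looks at both the raw line and the parsed values, so a secret that was JSON-escaped or nested inside another field still counts as a leak.

```typescript
// Hypothetical helper: return the indexes of captured log lines that contain the canary.
function findLeaks(logLines: string[], canary: string): number[] {
  const leaking: number[] = [];
  logLines.forEach((line, index) => {
    if (line.includes(canary) || containsValue(safeParse(line), canary)) {
      leaking.push(index);
    }
  });
  return leaking;
}

// Parse a log line, treating anything that is not valid JSON as empty.
function safeParse(line: string): unknown {
  try {
    return JSON.parse(line);
  } catch {
    return null;
  }
}

// Walk every key and string value in the parsed event looking for the canary.
function containsValue(node: unknown, canary: string): boolean {
  if (typeof node === 'string') return node.includes(canary);
  if (Array.isArray(node)) return node.some((item) => containsValue(item, canary));
  if (node !== null && typeof node === 'object') {
    return Object.entries(node as Record<string, unknown>).some(
      ([key, value]) => key.includes(canary) || containsValue(value, canary),
    );
  }
  return false;
}

// In an integration test, the lines could be captured by giving pino an
// in-memory destination (any object with a write method), for example:
//   const lines: string[] = [];
//   const testLog = pino(options, { write: (msg: string) => { lines.push(msg); } });
// then calling the tool with the canary as an argument and asserting
// findLeaks(lines, canary) is empty.
```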

Layer 2 — tool handler instrumentation

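import { z } from 'zod';
import { McpError, ErrorCode, type CallToolResult } from '@modelcontextprotocol/sdk/types.js';
import { logToolCall } from './logger.js';   // illustrative path to the Layer 1 module

// Wrap all tool handlers with the same instrumentation decorator
function withAuditLog<T extends Record<string, unknown>>(
  toolName: string,
  handler: (args: T) => Promise<CallToolResult>
) {
  return async (args: T): Promise<CallToolResult> => {
    const start = Date.now();
    try {
      const result = await handler(args);
      logToolCall(toolName, Object.keys(args), start, 'ok');
      return result;
    } catch (err) {
      const code = err instanceof McpError ? err.code.toString() : 'internal_error';
      logToolCall(toolName, Object.keys(args), start, 'error', code);
      // Re-throw as a clean MCP error; never expose the raw error to the agent
      throw new McpError(ErrorCode.InternalError, `Tool ${toolName} failed.`);
    }
  };
}

// Usage (`server` is an McpServer instance created elsewhere).
// The input schema must be passed: without one, the SDK registers a
// zero-argument tool and the callback does not receive the arguments.
server.tool(
  'search',
  { query: z.string(), limit: z.number().optional() },
  withAuditLog<{ query: string; limit?: number }>('search', async ({ query, limit }) => {
    // handler logic
    return { content: [] };   // placeholder result
  })
);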
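The event schema lists four status values, while the wrapper above only ever emits ok and error. One possible way to fill in timeout and rate_limited is sketched below. ToolTimeoutError, RateLimitedError, classifyFailure, and withDeadline are hypothetical names; none of them come from the MCP SDK, and the mapping is an assumption about how a handler might signal these conditions.

```typescript
type ToolStatus = 'ok' | 'error' | 'timeout' | 'rate_limited';

// Hypothetical error classes a handler might throw; not part of the MCP SDK.
class ToolTimeoutError extends Error {}
class RateLimitedError extends Error {}

// Map a thrown value to the status and error_code fields of the audit event.
function classifyFailure(err: unknown): { status: ToolStatus; errorCode: string } {
  if (err instanceof ToolTimeoutError) return { status: 'timeout', errorCode: 'timeout' };
  if (err instanceof RateLimitedError) return { status: 'rate_limited', errorCode: 'rate_limited' };
  // Also treat any error object named "TimeoutError" as a timeout.
  if (typeof err === 'object' && err !== null && (err as { name?: unknown }).name === 'TimeoutError') {
    return { status: 'timeout', errorCode: 'timeout' };
  }
  return { status: 'error', errorCode: 'internal_error' };
}

// Race a handler's promise against a deadline so that slow calls surface as
// ToolTimeoutError instead of hanging indefinitely.
async function withDeadline<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new ToolTimeoutError(`exceeded ${ms} ms`)), ms);
  });
  try {
    return await Promise.race([work, deadline]);
  } finally {
    // Clear the timer so a finished call does not keep the process alive.
    clearTimeout(timer);
  }
}
```

Inside the catch block of a wrapper like withAuditLog, the result of classifyFailure(err) could be passed to logToolCall in place of the fixed 'error' status. Note that withDeadline only stops waiting; it does not cancel the underlying work.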

Layer 3 — stderr log routing in production

On stdio-transport, the MCP host (Claude Code, Cursor, Windsurf) typically captures stderr separately from the JSON-RPC stdout stream. But "captured separately" doesn't mean "shipped to your SIEM": most agent clients discard stderr or surface it only as debug output visible to the developer. For a production audit trail, you need to route stderr explicitly:

Systemd unit: add StandardError=journal to the [Service] section. Stderr goes to journald, queryable with journalctl -u your-mcp-server.
Docker: the log driver captures both stdout (the protocol stream) and stderr (your audit log). Use --log-driver=awslogs or --log-driver=fluentd for SIEM routing. Note: the default json-file driver writes both streams to the same file, tagging each entry with a "stream" field of "stdout" or "stderr". Filter on that field, and keep pino's output as structured JSON, so your SIEM ingests audit events and not protocol frames.
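PM2 / Node process manager: PM2 writes stdout and stderr to separate files by default (an -out.log and an -error.log file per app under ~/.pm2/logs). To set explicit paths, use pm2 start server.js --output out.log --error error.log. The --merge-logs flag does not control this separation; it only determines whether multiple instances of the same app share one log file. Ship error.log to your log store; out.log is the raw protocol stream and should never be shipped.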
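For the Docker case, separating audit events from protocol frames can be sketched as a filter over json-file entries, each of which is a JSON object with log, stream, and time fields. The function names extractAuditEvents and parseObject are hypothetical, and the sketch assumes the audit events carry event: "tool_call" as in the schema above.

```typescript
// Parse text as a JSON object; return null for invalid JSON, arrays, or primitives.
function parseObject(text: string): Record<string, unknown> | null {
  try {
    const value: unknown = JSON.parse(text);
    return typeof value === 'object' && value !== null && !Array.isArray(value)
      ? (value as Record<string, unknown>)
      : null;
  } catch {
    return null;
  }
}

// Keep only stderr entries whose payload is a structured tool_call event.
// stdout entries are the raw JSON-RPC stream and are dropped.
function extractAuditEvents(jsonFileLines: string[]): Record<string, unknown>[] {
  const events: Record<string, unknown>[] = [];
  for (const raw of jsonFileLines) {
    const entry = parseObject(raw);
    if (entry === null || entry.stream !== 'stderr' || typeof entry.log !== 'string') continue;
    const event = parseObject(entry.log);
    if (event !== null && event.event === 'tool_call') events.push(event);
  }
  return events;
}
```

In practice this filtering would more likely live in the log shipper's configuration than in application code; the sketch only shows which field distinguishes the two streams.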

Layer 4 — SIEM integration and alerting

The structured log events from Layer 1 are designed to be queryable. Useful SIEM rules for MCP audit logs:

High-frequency caller alert — >100 tool calls from the same caller field within 60 seconds. May indicate a runaway agent loop or an automated prompt-injection attack driving the server.
Error rate spike — status == "error" rate exceeds 20% of calls in a 5-minute window. Indicates either an upstream API problem or a probing pattern (attacker trying different argument shapes to find an SSRF path).
Unknown argument key — an argument key appears in arg_keys that is not in the tool's declared input schema. Should be impossible if the MCP SDK is validating against the schema; if it appears, the schema validation is bypassed.
Long duration outlier — duration_ms > 10000 on a tool that normally completes in <500ms. May indicate a DNS rebinding or SSRF where the server is waiting on a slow attacker-controlled endpoint.
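The four rules above can be expressed as plain functions over parsed audit events, which is a convenient way to prototype thresholds before translating them into a SIEM's own query language. All names here are hypothetical, the thresholds are the ones quoted in the rules, and the sketch assumes the caller has already selected the events in the relevant time window for the error-rate check.

```typescript
interface AuditEvent {
  ts: string;
  tool: string;
  arg_keys: string[];
  caller: string | null;
  duration_ms: number;
  status: string;
}

// True if more than `threshold` timestamps fall inside any window of `windowMs`.
function exceedsRate(times: number[], windowMs: number, threshold: number): boolean {
  const sorted = [...times].sort((a, b) => a - b);
  let left = 0;
  for (let right = 0; right < sorted.length; right++) {
    while (sorted[right] - sorted[left] >= windowMs) left++;
    if (right - left + 1 > threshold) return true;
  }
  return false;
}

// Rule 1: callers with more than 100 calls inside 60 seconds.
function highFrequencyCallers(events: AuditEvent[], windowMs = 60000, threshold = 100): string[] {
  const byCaller: Record<string, number[]> = {};
  for (const e of events) {
    const key = e.caller ?? 'unknown';
    (byCaller[key] = byCaller[key] ?? []).push(Date.parse(e.ts));
  }
  return Object.keys(byCaller).filter((c) => exceedsRate(byCaller[c], windowMs, threshold));
}

// Rule 2: fraction of events with status "error"; alert when it exceeds 0.2.
function errorRate(events: AuditEvent[]): number {
  if (events.length === 0) return 0;
  return events.filter((e) => e.status === 'error').length / events.length;
}

// Rule 3: argument names that the tool's declared schema does not contain.
function unknownArgKeys(event: AuditEvent, declared: Record<string, string[]>): string[] {
  const allowed = declared[event.tool] ?? [];
  return event.arg_keys.filter((k) => allowed.indexOf(k) === -1);
}

// Rule 4: a call that ran longer than the outlier limit.
function isDurationOutlier(event: AuditEvent, limitMs = 10000): boolean {
  return event.duration_ms > limitMs;
}
```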

Run a SkillAudit to check your logging posture

The SkillAudit engine's credentials axis includes a check for process.env and os.environ reads that flow into tool handler return values or log statements — the stdout-capture vector. The security axis checks for unhandled exception handlers that write to stdout. Paste your GitHub URL at skillaudit.dev. Results in 60 seconds.