MCP Server Security — Distributed Tracing

MCP server distributed tracing security — OpenTelemetry span leakage, trace ID correlation oracle, baggage header injection, and sampling bypass

Adding OpenTelemetry distributed tracing to an MCP server is standard practice for production observability, but it introduces security risks that are easy to miss. Tool call arguments (file paths, user queries, API parameters) flow into span attributes and end up in your trace backend, readable by anyone with access to that backend. Trace IDs propagated to clients become cross-user correlation oracles. The W3C baggage header carries arbitrary key-value pairs that downstream microservices may act on as trusted context. And sampling configuration that discards high-volume traces can silence exactly the security events you most need to see. This page covers each class of risk with Node.js patterns to contain it.

Attack 1: Span attribute leakage — tool arguments in trace backends

OpenTelemetry auto-instrumentation for Node.js HTTP and Express does not record request bodies on its own, but a requestHook configured to copy them into span attributes will. MCP tool handlers that call span.setAttribute('tool.args', JSON.stringify(args)) for debugging push sensitive data (file paths, user queries, API keys in tool arguments, extracted document content) into the trace backend. Jaeger, Grafana Tempo, and Honeycomb are typically shared across engineering teams, which makes trace data a high-value secondary data store that often lacks the access controls of the primary database.

// DANGEROUS: raw tool arguments as span attributes
import { readFile } from 'node:fs/promises';
import { trace } from '@opentelemetry/api';

async function handleReadFileTool(args) {
  const span = trace.getActiveSpan();
  const content = await readFile(args.path, 'utf8');

  span?.setAttribute('tool.args', JSON.stringify(args));       // WRONG: leaks file paths
  span?.setAttribute('tool.result', JSON.stringify(content));  // WRONG: leaks file contents
  span?.setAttribute('user.query', args.query);                 // WRONG: leaks user PII

  return content;
}

// -----------------------------------------------------------------------

// SAFE: attribute allowlist with scrubbing

import path from 'node:path';

// The only attribute keys this module is permitted to write
const SAFE_SPAN_ATTRIBUTES = new Set([
  'tool.name',
  'tool.success',
  'tool.duration_ms',
  'tool.arg_schema',    // argument key names only, never values
  'tool.file_ext',
  'mcp.session_id',     // opaque session ID, not user-identifying
  'mcp.request_id',
]);

function setSafeSpanAttributes(span, toolName, args, success) {
  span?.setAttribute('tool.name', toolName);
  span?.setAttribute('tool.success', success);

  // Log arg key names (schema) but never values
  const argKeys = Object.keys(args || {}).sort().join(',');
  span?.setAttribute('tool.arg_schema', argKeys);

  // Explicit safe attributes for specific tools (no free-form values)
  if (toolName === 'read_file') {
    // Log only the file extension, not the full path. path.extname returns ''
    // for extensionless paths, so a name like /etc/passwd is never recorded.
    const ext = path.extname(String(args?.path ?? '')).slice(1, 11) || 'unknown';
    span?.setAttribute('tool.file_ext', ext);
  }
}
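
An allowlist like the one above only documents intent unless something enforces it. A minimal sketch of enforcement at a single choke point follows, assuming a hypothetical guardSpan wrapper (not part of the OpenTelemetry API) that forwards setAttribute calls only for allowlisted keys with primitive values and counts what it drops. The stand-in span object exists only to make the example runnable.

```javascript
// Hypothetical helper: wraps any object exposing setAttribute(key, value).
// Not part of @opentelemetry/api; the names here are illustrative.
function guardSpan(span, allowedKeys) {
  let dropped = 0;
  return {
    setAttribute(key, value) {
      const isPrimitive = ['string', 'number', 'boolean'].includes(typeof value);
      if (allowedKeys.has(key) && isPrimitive) {
        span?.setAttribute(key, value);
      } else {
        dropped += 1; // count rejections so stray debug attributes show up in review
      }
      return this;
    },
    droppedCount() {
      return dropped;
    },
  };
}

// Usage with a stand-in span that records what it receives.
const recorded = {};
const fakeSpan = { setAttribute(key, value) { recorded[key] = value; } };
const allowed = new Set(['tool.name', 'tool.success', 'tool.duration_ms']);

const guarded = guardSpan(fakeSpan, allowed);
guarded.setAttribute('tool.name', 'read_file');
guarded.setAttribute('tool.args', '{"path":"/home/alice/notes.txt"}'); // dropped: key not allowlisted
guarded.setAttribute('tool.success', true);
```

An allowlisted key can still carry a sensitive value, so the wrapper complements the per-tool scrubbing rather than replacing it.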

Attack 2: Trace ID as cross-user correlation oracle

When an MCP server returns the traceparent or a custom X-Trace-Id header in tool call responses, those trace IDs leave the server and can end up in client logs, bug reports, and support tickets. Anyone who holds such an ID and can query a shared Jaeger or Tempo instance can pull up the full trace for that call, including traces from other users' sessions, unless the trace backend enforces per-user access control (most don't by default). Even without direct trace backend access, trace IDs allow correlation: an attacker who observes the IDs returned for two calls can tell whether they belong to the same trace (same trace ID, different span IDs), which leaks that the two API calls are part of the same request flow or user session.

// DANGEROUS: returning traceparent in tool response
import { context, propagation } from '@opentelemetry/api';

function buildToolResponse(result) {
  const carrier = {};
  propagation.inject(context.active(), carrier);
  return {
    result,
    _trace: carrier.traceparent,  // WRONG: exposes trace ID to clients
  };
}

// -----------------------------------------------------------------------

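// SAFE: internal-only trace IDs, no propagation to clients
// (logger and executeTool below are assumed application helpers)

import { trace } from '@opentelemetry/api';

function buildToolResponse(result) {
  // Return only the tool result, with no tracing metadata
  return { result };
}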

// For debugging: log trace IDs server-side only, never in response bodies
async function handleToolCall(toolName, args, sessionId) {
  const span = trace.getActiveSpan();
  const traceId = span?.spanContext()?.traceId;

  // Log correlation data server-side
  logger.info('tool_call', { traceId, toolName, sessionId });

  const result = await executeTool(toolName, args);
  return { result }; // traceId stays server-side
}
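
Tracing metadata can also leak through middleware or framework defaults rather than through the handler itself. As a last line of defense, a response can be passed through a filter before it is sent. The sketch below assumes a hypothetical stripTraceMetadata helper and illustrative field names (traceparent and tracestate are the W3C header names; x-trace-id and _trace mirror the examples above). It only checks top-level body fields.

```javascript
// Names treated as tracing metadata. Extend these sets for your own conventions.
const TRACE_HEADER_NAMES = new Set(['traceparent', 'tracestate', 'x-trace-id']);
const TRACE_BODY_FIELDS = new Set(['_trace', 'traceId', 'trace_id']);

// Hypothetical helper: returns copies of headers and body without tracing metadata.
function stripTraceMetadata(headers, body) {
  const safeHeaders = Object.fromEntries(
    Object.entries(headers).filter(
      ([name]) => !TRACE_HEADER_NAMES.has(name.toLowerCase())
    )
  );
  const safeBody = Object.fromEntries(
    Object.entries(body).filter(([field]) => !TRACE_BODY_FIELDS.has(field))
  );
  return { headers: safeHeaders, body: safeBody };
}

// Usage: header matching is case-insensitive, body matching is exact.
const out = stripTraceMetadata(
  { 'Content-Type': 'application/json', 'X-Trace-Id': '4bf92f3577b34da6a3ce929d0e0e4736' },
  { result: 'ok', _trace: '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01' }
);
// out.headers keeps only Content-Type; out.body keeps only result
```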

Attack 3: Baggage header injection into downstream services

The W3C baggage header (for example, baggage: userId=alice,region=us-east) is defined by the W3C Baggage specification and is one of OpenTelemetry's default context propagation formats. It carries arbitrary key-value pairs across service boundaries. If your MCP server propagates inbound baggage headers from tool call requests into downstream HTTP calls (to an internal API, a database proxy, or another microservice), and those downstream services read baggage values and act on them for routing, authorization, feature flags, or audit logging, an attacker who controls the inbound baggage header can inject values that change downstream behavior. Some service mesh setups route requests based on baggage values; others use them for A/B testing or canary rollouts.

// DANGEROUS: propagating inbound baggage into downstream calls
import { propagation, context } from '@opentelemetry/api';

async function callDownstreamApi(endpoint, toolArgs, inboundHeaders) {
  // Extract OTel context from inbound request — including attacker-controlled baggage
  const inboundCtx = propagation.extract(context.active(), inboundHeaders);

  const outboundHeaders = {};
  // Inject the full context (including inbound baggage) into downstream call — WRONG
  propagation.inject(inboundCtx, outboundHeaders);

  const response = await fetch(endpoint, {
    method: 'POST',
    headers: { ...outboundHeaders, 'Content-Type': 'application/json' },
    body: JSON.stringify(toolArgs),
  });
  return response.json();
}

// -----------------------------------------------------------------------

// SAFE: strip inbound baggage, add only server-generated values

import { context, propagation } from '@opentelemetry/api';

// hashUserId is assumed to be defined elsewhere (for example, a keyed HMAC of the user ID)

async function callDownstreamApiSafe(endpoint, toolArgs, sessionId, userId) {
  // Start from the current context so trace and span IDs still propagate.
  // It may already carry inbound baggage; setBaggage below replaces it.
  let outboundCtx = context.active();

  // Add only server-controlled baggage values
  const baggage = propagation.createBaggage({
    'mcp.session_id': { value: sessionId },
    'mcp.service': { value: 'skillaudit-mcp' },
    // userId goes as a hashed identifier only, not raw
    'mcp.user_hash': { value: hashUserId(userId) },
  });

  // setBaggage overwrites whatever baggage the context held before
  outboundCtx = propagation.setBaggage(outboundCtx, baggage);

  const outboundHeaders = {};
  propagation.inject(outboundCtx, outboundHeaders);

  return fetch(endpoint, {
    method: 'POST',
    headers: { ...outboundHeaders, 'Content-Type': 'application/json' },
    body: JSON.stringify(toolArgs),
  }).then(r => r.json());
}
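
If your code extracts context from inbound headers itself, as the dangerous example does, another option is to drop the baggage header before extraction so that trace context is kept and client-supplied baggage never enters the context. The sketch below assumes a hypothetical withoutInboundBaggage helper and a plain object of headers. If auto-instrumentation has already extracted the context before your handler runs, this filter comes too late and overwriting the baggage, as shown above, is what removes it.

```javascript
// Hypothetical helper: copy inbound headers without the W3C `baggage` header.
// Header names are compared case-insensitively.
function withoutInboundBaggage(headers) {
  const cleaned = {};
  for (const [name, value] of Object.entries(headers)) {
    if (name.toLowerCase() !== 'baggage') {
      cleaned[name] = value;
    }
  }
  return cleaned;
}

// Usage: the result is what you would hand to propagation.extract().
const inbound = {
  traceparent: '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01',
  Baggage: 'userId=admin,canary=true', // attacker-controlled
  'content-type': 'application/json',
};
const cleanedHeaders = withoutInboundBaggage(inbound);
// cleanedHeaders still has traceparent and content-type, but no baggage
```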

Attack 4: Sampling bypass disabling security observability

Production MCP servers often use a fractional sampler (for example, sampling 10% of traces) to control trace backend storage costs. The risk: security-critical events (authentication failures, tool call rejections, anomaly detections) are sampled at the same 10% rate, so roughly 90% of them never reach the trace backend. An attacker who generates many failed auth attempts will have most of them silently discarded, which weakens or hides the pattern in security monitoring. One fix is a custom head sampler, wrapped in ParentBasedSampler, that keeps spans marked as security events at 100% regardless of the base rate. A head sampler decides when the span starts, so it sees only the span name and the attributes passed at creation; outcomes recorded later, such as an HTTP status code or tool.success, influence the decision only if they are known up front.
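
To make the cost concrete, here is a small worked calculation. It assumes each failed attempt starts its own trace with a random trace ID, so a 10% ratio sampler behaves roughly like an independent 10% chance per attempt. The function names are illustrative.

```javascript
// Expected number of attempts that survive sampling.
function expectedSampled(attempts, rate) {
  return attempts * rate;
}

// Probability that not a single attempt is sampled.
function probabilityNoneSampled(attempts, rate) {
  return Math.pow(1 - rate, attempts);
}

const rate = 0.1;

// A 200-attempt burst keeps about 20 traces on average.
console.log(expectedSampled(200, rate));

// A quiet 5-attempt probe leaves no trace at all about 59% of the time.
console.log(probabilityNoneSampled(5, rate).toFixed(2)); // "0.59"
```

Loud attacks lose most of their evidence, and quiet ones often lose all of it.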

// DANGEROUS: flat fractional sampler — security events sampled at same rate as normal traffic
import { TraceIdRatioBasedSampler } from '@opentelemetry/sdk-trace-base';

// 10% sampling loses 90% of auth failures, anomalies, and 4xx tool calls
const sampler = new TraceIdRatioBasedSampler(0.1);

// -----------------------------------------------------------------------

// SAFE: composite sampler — 100% for security events, base rate for normal traffic

import { NodeSDK } from '@opentelemetry/sdk-node';
import {
  ParentBasedSampler,
  TraceIdRatioBasedSampler,
  SamplingDecision,
} from '@opentelemetry/sdk-trace-base';

class SecurityAwareSampler {
  constructor(baseRate = 0.1) {
    this._base = new TraceIdRatioBasedSampler(baseRate);
  }

  shouldSample(context, traceId, name, spanKind, attributes = {}, links) {
    // Always sample security-critical spans. A head sampler sees only the span
    // name and the attributes passed at span creation.
    const isSecurityEvent =
      attributes['security.event'] === true ||
      attributes['http.status_code'] >= 400 ||
      name.includes('auth') ||
      name.includes('reject') ||
      name.includes('anomaly') ||
      attributes['tool.success'] === false;

    if (isSecurityEvent) {
      return {
        decision: SamplingDecision.RECORD_AND_SAMPLED,
        attributes: { 'sampling.reason': 'security_event' },
      };
    }

    return this._base.shouldSample(context, traceId, name, spanKind, attributes, links);
  }

  toString() { return 'SecurityAwareSampler'; }
}

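// Register in SDK setup:
const securitySampler = new SecurityAwareSampler(0.1);
const sdk = new NodeSDK({
  sampler: new ParentBasedSampler({
    root: securitySampler,
    // By default, spans under an unsampled parent are dropped without
    // consulting the root sampler. Route them through the security check too;
    // spans kept this way arrive in the backend without their parents.
    localParentNotSampled: securitySampler,
    remoteParentNotSampled: securitySampler,
  }),
});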
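
Because a head sampler only sees attributes supplied when the span is created, the code that detects a security event has to pass the marker at span start rather than calling setAttribute afterwards. A minimal sketch, assuming a hypothetical startSecuritySpan helper and an illustrative security.event_type key; the tracer argument is anything with the OpenTelemetry Tracer startSpan(name, options) shape, and the stand-in tracer below only mimics a sampler reading creation-time attributes.

```javascript
// Hypothetical helper: attributes passed in the options object are visible
// to the sampler; attributes set later on the span are not.
function startSecuritySpan(tracer, name, eventType) {
  return tracer.startSpan(name, {
    attributes: {
      'security.event': true,
      'security.event_type': eventType, // illustrative key, e.g. 'auth_failure'
    },
  });
}

// Stand-in tracer: marks the span as sampled only if the security marker
// is present at creation, like the sampler above would.
const fakeTracer = {
  startSpan(name, options = {}) {
    const attrs = options.attributes ?? {};
    return { name, sampled: attrs['security.event'] === true };
  },
};

const authSpan = startSecuritySpan(fakeTracer, 'token.verify', 'auth_failure');
const plainSpan = fakeTracer.startSpan('token.verify');
// authSpan.sampled is true; plainSpan.sampled is false
```

For outcomes that are only known after the span ends, such as a response status, a head sampler cannot help; that case calls for tail-based sampling in the collector.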

SkillAudit findings for distributed tracing

CRITICAL (−20 pts): Raw tool arguments including file paths and user queries set as span attributes. Trace backend readable by internal team members exposes all tool call inputs.

HIGH (−16 pts): Inbound baggage header from tool call requests propagated unmodified into downstream HTTP calls. Attacker controls baggage values seen by downstream services.

HIGH (−14 pts): traceparent or trace ID returned in tool call response bodies or headers. Cross-user correlation via shared trace backend is possible.

MEDIUM (−8 pts): Fractional trace sampler discards security events at the same rate as normal traffic. Auth failures and anomaly spans silently disappear from observability backend.

Run a free SkillAudit scan to check whether your MCP server's tracing configuration leaks sensitive data in span attributes, returns trace IDs to clients, or propagates inbound baggage into downstream calls. The scanner instruments a sample trace context and observes what reaches your OTLP exporter endpoint. See also: secrets rotation patterns for managing OTel collector authentication credentials, and audit log integrity for ensuring tracing data is tamper-evident.