
MCP server Server-Timing header security — CDN edge metadata exfiltration, database latency leakage, GraphQL resolver timing, and cache tier disclosure

The Server-Timing HTTP response header (defined by the W3C Server Timing specification) is a performance telemetry channel from server to browser, surfaced as PerformanceResourceTiming.serverTiming in client-side JavaScript. Every metric has a name and, optionally, a description and a duration. CDN edges inject routing metadata (Cloudflare PoP identifier, Fastly cache tier, Akamai edge node); backend frameworks inject SQL query counts and wall-clock durations; GraphQL servers inject per-resolver timing with full resolver call trees. An MCP tool running in the same browsing context can read these metrics from every same-origin response, and from cross-origin responses that opt in, without any permission prompt. From them it can build a detailed map of backend infrastructure topology, query complexity, and real-time system load.
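To make the name, description, and duration structure concrete, here is a minimal sketch of a parser for a raw Server-Timing header value. The function name parseServerTiming is hypothetical, and the parsing is deliberately simplified: it assumes descriptions do not contain semicolons and ignores parameters other than dur and desc. The defaults (empty description, zero duration) mirror what browsers report for omitted parameters.

```javascript
// Simplified Server-Timing parser (illustrative sketch, not a full spec implementation).
function parseServerTiming(headerValue) {
  // Split on commas that are not inside double quotes.
  const entries = headerValue.match(/(?:[^,"]|"[^"]*")+/g) || [];
  return entries.map(entry => {
    // Assumption: no semicolons inside quoted descriptions.
    const [name, ...params] = entry.trim().split(';');
    const metric = { name: name.trim(), description: '', duration: 0 };
    for (const param of params) {
      const eq = param.indexOf('=');
      if (eq === -1) continue;
      const key = param.slice(0, eq).trim().toLowerCase();
      const value = param.slice(eq + 1).trim().replace(/^"|"$/g, '');
      if (key === 'dur') metric.duration = Number(value) || 0;
      if (key === 'desc') metric.description = value;
    }
    return metric;
  });
}

const parsed = parseServerTiming('sql;dur=45.2;desc="12 queries", cfCacheStatus;desc=HIT');
console.log(parsed);
```

Each returned object has the same three fields a script sees on a PerformanceServerTiming entry.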

Server-Timing metric sources in the wild

| Source | Typical metric names | What it discloses |
|---|---|---|
| Cloudflare | cfRequestDuration, cfCacheStatus, cfRay | Edge request duration, cache HIT/MISS/BYPASS/REVALIDATED, Ray ID (encodes PoP and data center) |
| Fastly | miss, hit, pass, fetch, synth | Cache tier traversal path, origin fetch duration, synthetic response indicator |
| Akamai | edge_time, origin_time, cache_status | Edge vs. origin latency split, cache tier, Akamai PoP identifier in description |
| Django Debug Toolbar (prod leak) | sql, cache, template | SQL query count and total duration, Django cache backend hit/miss count, template render time |
| Rails rack-mini-profiler | total, sql, redis | Total request duration, SQL query count and duration, Redis operation count |
| Apollo GraphQL Server | graphql-parsing, graphql-validation, graphql-execution, resolver:User.profile | Per-resolver execution times with full resolver path (TypeName.fieldName), parsing/validation overhead |
| Hasura | execute, plan | SQL planning and execution time, distinguishes cached vs. planned queries |
| Express / Node.js custom | auth, db, cache, render | Application-defined middleware timing; developers choose these names freely, often leaking operation semantics |

Reading Server-Timing metrics from all responses

The serverTiming property on PerformanceResourceTiming entries is available for all same-origin responses, and for cross-origin responses that opt in with a Timing-Allow-Origin header naming the page's origin or *. Subresources (API calls, images, scripts) are reported as resource entries; the document's own request is reported as a navigation entry, so a single observer registered for both types captures everything:

const metrics = [];

const observer = new PerformanceObserver(list => {
  const batch = [];
  for (const entry of list.getEntries()) {
    if (!entry.serverTiming?.length) continue;
    batch.push({
      url: entry.name,         // request URL
      metrics: entry.serverTiming.map(st => ({
        name: st.name,           // e.g. "cfCacheStatus", "sql", "resolver:User.orders"
        desc: st.description,    // e.g. "HIT", "4 queries", "12ms"
        dur:  st.duration        // numeric duration in ms
      }))
    });
  }
  if (!batch.length) return;     // nothing new in this callback
  metrics.push(...batch);
  // Send only the new batch, not the whole accumulated array
  navigator.sendBeacon('/c', JSON.stringify(batch));
});
observer.observe({ type: 'resource', buffered: true });
observer.observe({ type: 'navigation', buffered: true });

The buffered: true flag delivers entries accumulated since page load, subject to the browser's resource timing buffer size. The tool therefore retroactively reads Server-Timing metrics from the page's initial HTML request, the login API call, and every background fetch that occurred before the tool was injected.
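Site owners can use the same entries to check their own exposure. The sketch below groups the metric names visible to page scripts by origin; the function name summarizeServerTimingExposure is hypothetical, and it takes the entries as an argument so it can be run against performance.getEntriesByType() output in a browser console or against recorded data elsewhere.

```javascript
// Group exposed Server-Timing metric names by origin (illustrative sketch).
function summarizeServerTimingExposure(entries) {
  const byOrigin = new Map();
  for (const entry of entries) {
    const timings = entry.serverTiming || [];
    if (timings.length === 0) continue;
    // entry.name is the absolute URL of the resource or document.
    const origin = new URL(entry.name).origin;
    if (!byOrigin.has(origin)) byOrigin.set(origin, new Set());
    for (const timing of timings) byOrigin.get(origin).add(timing.name);
  }
  return byOrigin;
}

// In a browser console, one might call:
// summarizeServerTimingExposure([
//   ...performance.getEntriesByType('navigation'),
//   ...performance.getEntriesByType('resource'),
// ]);
```

Any origin that appears in the result is disclosing those metric names to every script running on the page.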

Attack 1: CDN infrastructure mapping via Cloudflare and Fastly headers

Cloudflare injects Server-Timing metrics into every response once the Server-Timing header is enabled in the Speed settings. The cfRay metric description carries the Cloudflare Ray ID, whose suffix names the data center (PoP, Point of Presence) that served the request. By reading these metrics across multiple API requests, an MCP tool builds a map of the CDN routing topology:

// From Cloudflare:
// Server-Timing: cfRequestDuration;dur=127, cfCacheStatus;desc=HIT, cfRay;desc=7f3a2b4c8d9e1234-IAD
// IAD = Cloudflare's Ashburn, Virginia data center (Washington Dulles airport code)
// HIT = cached at edge, no origin request
// 127ms = total edge request duration

// From Fastly:
// Server-Timing: miss;dur=0, fetch;dur=234
// miss: cache tier missed (new resource or evicted)
// fetch: 234ms round-trip to origin

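// Interpretation (each point is an inference, not a certainty):
// - The site is fronted by Cloudflare, and this client was served from the IAD data center
// - Endpoints that report BYPASS (for example, authentication APIs) are not served from cache
// - Static assets report HIT; API responses that miss or bypass the cache go to origin
// - Origin location can only be estimated from uncached responses: a HIT never contacts
//   the origin, so the 127 ms above says nothing about where the origin is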

Cache status reveals content sensitivity. A cfCacheStatus of BYPASS on an API endpoint indicates that the endpoint is not cached, typically because it returns user-specific or authentication-gated content. An attacker can infer which endpoints are likely to carry sensitive user data from the cache status alone, without reading the response body.
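As a rough illustration of how little processing this inference needs, the sketch below groups request paths by reported cache status. It assumes records shaped like those collected by the observer earlier ({ url, metrics: [{ name, desc }] }); the function name groupPathsByCacheStatus and the default metric name are assumptions for the example. Running it against your own traffic shows what a third-party script could learn.

```javascript
// Group URL paths by the cache status disclosed in Server-Timing (illustrative sketch).
function groupPathsByCacheStatus(records, metricName = 'cfCacheStatus') {
  const groups = {};
  for (const record of records) {
    const metric = record.metrics.find(m => m.name === metricName);
    if (!metric || !metric.desc) continue;
    const path = new URL(record.url).pathname;
    if (!groups[metric.desc]) groups[metric.desc] = new Set();
    groups[metric.desc].add(path);
  }
  // Convert each Set to a sorted array for stable output.
  return Object.fromEntries(
    Object.entries(groups).map(([status, paths]) => [status, [...paths].sort()])
  );
}
```

Paths that only ever appear under BYPASS are the ones an observer would mark as candidates for user-specific content.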

Attack 2: Database query profiling via framework timing headers

Backend frameworks that enable timing middleware in production inadvertently expose their database query behavior to client-side scripts. Django Debug Toolbar, when accidentally left active in production, and rack-mini-profiler in Rails with misconfigured authorization, emit SQL metrics directly to the browser:

// Django (Debug Toolbar enabled in production):
// Server-Timing: sql;dur=45.2;desc="12 queries", cache;dur=3.1;desc="4 hits 1 miss"
// Discloses: 12 SQL queries executed, 45.2ms total, 4/5 cache operations hit

// Rails rack-mini-profiler:
// Server-Timing: total;dur=312, sql;dur=89;desc="7 queries", redis;dur=12;desc="3 ops"
// Discloses: 312ms request, 7 SQL queries in 89ms, 3 Redis operations in 12ms

// Express custom timing middleware (common pattern):
// Server-Timing: db;dur=234;desc="users collection scan"
// Discloses: 234ms database operation described as a collection scan → likely a missing index

A high SQL query count (more than 10) combined with a high total duration (more than 500 ms) on a single request suggests an N+1 query pattern. This is architectural intelligence that helps an attacker time load-based attacks or identify which endpoints are expensive enough to cause availability issues under load.
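The heuristic above can be sketched in a few lines. The function name flagPossibleNPlusOne is hypothetical, the thresholds are the ones stated in the text, and the code assumes the query count appears in the description in the form "12 queries", as in the examples shown earlier; other frameworks may format it differently.

```javascript
// Flag a metric whose description and duration match the N+1 heuristic (illustrative sketch).
function flagPossibleNPlusOne(metric, { maxQueries = 10, maxDurationMs = 500 } = {}) {
  // Assumption: the description contains text like "12 queries" or "1 query".
  const match = /(\d+)\s+quer(?:y|ies)/i.exec(metric.desc || '');
  if (!match) return false;
  const queryCount = Number(match[1]);
  return queryCount > maxQueries && metric.dur > maxDurationMs;
}

console.log(flagPossibleNPlusOne({ name: 'sql', desc: '37 queries', dur: 640 }));
```

Teams can run the same check in their own monitoring to find the endpoints that are both expensive and self-advertising.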

Attack 3: GraphQL resolver timing as data structure inference

Apollo Server deployments that add a resolver-timing plugin (custom, or layered on the field-level trace data that plugins such as ApolloServerPluginUsageReporting collect) can inject per-resolver timing metrics into Server-Timing. Each resolver is named with the pattern TypeName.fieldName, exposing every type and field that the query touched:

// Apollo Server with timing plugin:
// Server-Timing: graphql-parsing;dur=2.1, graphql-validation;dur=1.8,
//   resolver:Query.currentUser;dur=45.2,
//   resolver:User.orders;dur=234.5,
//   resolver:Order.lineItems;dur=89.3,
//   resolver:LineItem.product;dur=12.1

// Inferences:
// 1. Schema has types: Query, User, Order, LineItem (and probably Product, inferred from the field name)
// 2. User.orders took 234ms → likely a database query, not cached
// 3. Order.lineItems took 89ms → secondary database query (N+1 risk)
// 4. The query fetched: currentUser → orders → lineItems → product
//    → this is a shopping/e-commerce application with a specific data model

// Combined with the query URL (/graphql), this exposes the full data
// access pattern without reading the response body.

GraphQL schema inference from resolver timing is particularly valuable because many GraphQL APIs disable introspection in production but still emit resolver names in Server-Timing.
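A minimal sketch of that inference, assuming metric names follow the resolver:TypeName.fieldName pattern used in the example above. The function name inferSchemaFragments is hypothetical. It recovers only the types and fields that observed queries touched, not the full schema.

```javascript
// Rebuild a partial type → fields map from resolver metric names (illustrative sketch).
function inferSchemaFragments(metricNames) {
  const schema = {};
  for (const name of metricNames) {
    const match = /^resolver:([A-Za-z_]\w*)\.([A-Za-z_]\w*)$/.exec(name);
    if (!match) continue; // skip graphql-parsing, graphql-validation, etc.
    const [, typeName, fieldName] = match;
    if (!schema[typeName]) schema[typeName] = [];
    if (!schema[typeName].includes(fieldName)) schema[typeName].push(fieldName);
  }
  return schema;
}

console.log(inferSchemaFragments([
  'graphql-parsing',
  'resolver:Query.currentUser',
  'resolver:User.orders',
  'resolver:Order.lineItems',
  'resolver:LineItem.product',
]));
```

Running this over the metric names your own API emits is a quick way to see how much of the schema is visible even with introspection disabled.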

Attack 4: Real-time system load inference

By sampling Server-Timing duration metrics across multiple requests over time, an MCP tool builds a time-series of backend performance that reveals system load patterns, maintenance windows, and scaling events:

// Sampling auth API over 60 seconds:
// t=0s:   db;dur=12ms  → system at normal load
// t=10s:  db;dur=287ms → database under load (backup job? batch process?)
// t=20s:  db;dur=891ms → high load or connection pool saturation
// t=30s:  db;dur=45ms  → load reduced (batch job completed?)
// t=40s:  db;dur=18ms  → normal

// Inference: the spike between t=10s and t=20s is consistent with a batch job or backup;
// sampling over a longer period would show whether it recurs on a schedule
// Attack opportunity: submit resource-intensive operations during the high-load window,
// when the database has the least headroom
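
The sampling above reduces to a simple baseline comparison. The sketch below flags samples whose duration exceeds a multiple of the median; the function names and the default factor of 5 are assumptions chosen for illustration, and the sample values are the ones from the example.

```javascript
// Median of a list of numbers.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Return the samples whose duration exceeds factor × the median duration.
function findLoadSpikes(samples, factor = 5) {
  const baseline = median(samples.map(s => s.dur));
  return samples.filter(s => s.dur > baseline * factor);
}

const samples = [
  { t: 0, dur: 12 },
  { t: 10, dur: 287 },
  { t: 20, dur: 891 },
  { t: 30, dur: 45 },
  { t: 40, dur: 18 },
];
console.log(findLoadSpikes(samples)); // the t=10 and t=20 samples
```

With these five samples the median is 45 ms, so the threshold is 225 ms and only the two loaded samples are flagged. Operators can apply the same check to their own Server-Timing output to see what load information it reveals.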

SkillAudit findings for Server-Timing header access

HIGH
Server-Timing metric harvesting via PerformanceResourceTiming.serverTiming — Any access to entry.serverTiming in a PerformanceObserver or getEntriesByType() call that sends the metric values over the network. Exfiltrates CDN routing metadata, backend query counts, and infrastructure topology without reading response bodies.
HIGH
GraphQL resolver timing exfiltration — Reading serverTiming metric names matching the resolver:TypeName.fieldName pattern to infer GraphQL schema structure and data access patterns from Apollo or Hasura responses.
MEDIUM
CDN cache status enumeration for data sensitivity mapping — Reading cfCacheStatus or Fastly cache tier metrics to identify which API endpoints serve user-specific vs. public content, mapping the authenticated-data surface without reading response bodies.
MEDIUM
Database query count/duration monitoring for load pattern analysis — Sampling SQL duration metrics over time to identify batch job schedules, connection pool saturation thresholds, and high-load windows suitable for timing attacks.
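
To illustrate the kind of pattern the first finding describes, here is a hypothetical static check that flags source text which both reads serverTiming and references a network sink. It is a sketch for explanation only and is not SkillAudit's actual implementation; the function name, the list of sinks, and the REVIEW label are assumptions. A regex check like this cannot tell whether the metric values actually reach the sink.

```javascript
// Hypothetical source scan: serverTiming access plus a network sink (illustrative sketch).
const READS_SERVER_TIMING = /\.serverTiming\b/;
const NETWORK_SINKS = /\b(?:sendBeacon|fetch|XMLHttpRequest|WebSocket)\b/;

function auditSource(source) {
  const readsServerTiming = READS_SERVER_TIMING.test(source);
  const hasNetworkSink = NETWORK_SINKS.test(source);
  let severity = 'NONE';
  if (readsServerTiming && hasNetworkSink) severity = 'HIGH';
  else if (readsServerTiming) severity = 'REVIEW'; // reads metrics, no sink found
  return { readsServerTiming, hasNetworkSink, severity };
}

console.log(auditSource("navigator.sendBeacon('/c', JSON.stringify(entry.serverTiming))"));
```

A production scanner would need data-flow analysis rather than co-occurrence, but the two signals are the ones the finding is built on.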

Defense
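
One mitigation that follows directly from the attacks above is to reduce what the header says before it leaves production. The sketch below keeps only allow-listed metric names, drops every description, and rounds durations to whole milliseconds. The function name filterServerTiming is hypothetical, and the parsing is simplified (it assumes descriptions contain no semicolons).

```javascript
// Keep only allow-listed metrics, strip descriptions, round durations (illustrative sketch).
function filterServerTiming(headerValue, allowedNames) {
  const allowed = new Set(allowedNames);
  // Split on commas that are not inside double quotes.
  const entries = headerValue.match(/(?:[^,"]|"[^"]*")+/g) || [];
  const kept = [];
  for (const entry of entries) {
    const [name, ...params] = entry.split(';').map(part => part.trim());
    if (!allowed.has(name)) continue;
    const durParam = params.find(p => /^dur=/i.test(p));
    const dur = durParam ? Number.parseFloat(durParam.slice(4)) : NaN;
    kept.push(Number.isFinite(dur) ? `${name};dur=${Math.round(dur)}` : name);
  }
  return kept.join(', ');
}

const raw = 'total;dur=312.4, sql;dur=89;desc="7 queries", redis;dur=12;desc="3 ops"';
console.log(filterServerTiming(raw, ['total'])); // "total;dur=312"
```

In a Node.js server this could be applied just before headers are sent, using response.getHeader and response.setHeader, with response.removeHeader when the filtered value is empty. Removing the header entirely in production, and avoiding a wildcard Timing-Allow-Origin, are the simpler options when no client needs the data.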

