MCP server Server-Timing header security — CDN edge metadata exfiltration, database latency leakage, GraphQL resolver timing, and cache tier disclosure
The Server-Timing HTTP response header, defined in the W3C Server Timing specification, is a performance telemetry channel from server to browser. It is surfaced to client-side JavaScript as the serverTiming array on PerformanceResourceTiming and PerformanceNavigationTiming entries. Each metric has a name and may carry a description and a duration. CDN edges inject routing metadata (Cloudflare PoP identifier, Fastly cache tier, Akamai edge node); backend frameworks inject SQL query counts and wall-clock durations; GraphQL servers can inject per-resolver timing that exposes the resolver call tree. An MCP tool running in the same browsing context can read these metrics from every same-origin response without any permission gate, and from them construct a detailed map of backend infrastructure topology, query complexity, and real-time system load.
Server-Timing metric sources in the wild
| Source | Typical metric names | What it discloses |
|---|---|---|
| Cloudflare | cfRequestDuration, cfCacheStatus, cfRay | Edge request duration, cache HIT/MISS/BYPASS/REVALIDATED, Ray ID (encodes PoP and data center) |
| Fastly | miss, hit, pass, fetch, synth | Cache tier traversal path, origin fetch duration, synthetic response indicator |
| Akamai | edge_time, origin_time, cache_status | Edge vs. origin latency split, cache tier, Akamai PoP identifier in description |
| Django Debug Toolbar (prod leak) | sql, cache, template | SQL query count and total duration, Django cache backend hit/miss count, template render time |
| Rails rack-mini-profiler | total, sql, redis | Total request duration, SQL query count and duration, Redis operation count |
| Apollo GraphQL Server | graphql-parsing, graphql-validation, graphql-execution, resolver:User.profile | Per-resolver execution times with full resolver path (TypeName.fieldName), parsing/validation overhead |
| Hasura | execute, plan | SQL planning and execution time, distinguishes cached vs. planned queries |
| Express / Node.js custom | auth, db, cache, render | Application-defined middleware timing — developers use these names freely, often leaking operation semantics |
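The table above can double as a fingerprinting aid when auditing what your own responses disclose. The following is a minimal sketch, assuming the metric names listed in the table are representative; `SOURCE_SIGNATURES` and `guessSources` are hypothetical names introduced here, and real deployments may use different metric names.

```javascript
// Hypothetical signature table built from the metric names in the table above.
// Real deployments may use different names; treat matches as hints only.
const SOURCE_SIGNATURES = {
  Cloudflare: ['cfRequestDuration', 'cfCacheStatus', 'cfRay'],
  Fastly: ['miss', 'hit', 'pass', 'fetch', 'synth'],
  Akamai: ['edge_time', 'origin_time', 'cache_status'],
  'GraphQL resolver timing': ['graphql-parsing', 'graphql-validation', 'graphql-execution'],
};

// Returns the sources whose known metric names appear in `names`,
// ordered by the number of matching names (most matches first).
function guessSources(names) {
  const seen = new Set(names);
  const hits = [];
  for (const [source, signature] of Object.entries(SOURCE_SIGNATURES)) {
    let matched = signature.filter(n => seen.has(n)).length;
    // Resolver metrics carry a "resolver:" prefix rather than a fixed name.
    if (source === 'GraphQL resolver timing') {
      matched += names.filter(n => n.startsWith('resolver:')).length;
    }
    if (matched > 0) hits.push({ source, matched });
  }
  return hits.sort((a, b) => b.matched - a.matched);
}

console.log(guessSources(['cfCacheStatus', 'cfRay', 'sql']));
// [ { source: 'Cloudflare', matched: 2 } ]
```

Generic names such as sql or db are deliberately left out of the signature table because several frameworks share them.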
Reading Server-Timing metrics from all responses
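The serverTiming property is exposed on PerformanceResourceTiming entries and on the PerformanceNavigationTiming entry, which inherits from it. It is populated for all same-origin responses and for cross-origin responses that opt in via Timing-Allow-Origin (for example Timing-Allow-Origin: *), typically only in secure (HTTPS) contexts. A single observer registered for both the resource and navigation entry types captures metrics from every API call, image, and script load, as well as the page navigation itself:

```javascript
const observer = new PerformanceObserver(list => {
  // Collect only the entries delivered in this callback, so each beacon
  // carries new data instead of re-sending the whole history.
  const batch = [];
  for (const entry of list.getEntries()) {
    if (!entry.serverTiming?.length) continue;
    batch.push({
      url: entry.name,            // request URL
      metrics: entry.serverTiming.map(st => ({
        name: st.name,            // e.g. "cfCacheStatus", "sql", "resolver:User.orders"
        desc: st.description,     // e.g. "HIT", "4 queries", "12ms"
        dur: st.duration          // numeric duration in ms
      }))
    });
  }
  if (batch.length) navigator.sendBeacon('/c', JSON.stringify(batch));
});
// With the `type` option, observe() can be called once per entry type.
observer.observe({ type: 'resource', buffered: true });
observer.observe({ type: 'navigation', buffered: true });
```

The buffered: true flag delivers entries that were recorded before the observer was created, up to the browser's resource timing buffer limit (the Resource Timing specification recommends a minimum of 250 entries). The tool therefore retroactively reads Server-Timing metrics from the page's initial HTML request, the login API call, and every background fetch still held in the buffer, even though they occurred before the tool was injected.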
Attack 1: CDN infrastructure mapping via Cloudflare and Fastly headers
Cloudflare automatically injects Server-Timing metrics into every response when the Server-Timing header is enabled in the Speed settings. The cfRay metric description encodes the Cloudflare Ray ID, which identifies the specific Cloudflare PoP (Point of Presence) and data center that served the request. By reading these metrics across multiple API requests, an MCP tool builds a map of the CDN routing topology:
```javascript
// From Cloudflare:
// Server-Timing: cfRequestDuration;dur=127, cfCacheStatus;desc=HIT, cfRay;desc=7f3a2b4c8d9e1234-IAD
//   IAD   = Cloudflare Dulles (Washington DC) data center
//   HIT   = cached at edge, no origin request
//   127ms = total edge request duration

// From Fastly:
// Server-Timing: miss;dur=0, fetch;dur=234
//   miss:  cache tier missed (new resource or evicted)
//   fetch: 234ms round-trip to origin

// Interpretation:
// - The site uses Cloudflare and routes US-East traffic through IAD
// - Authentication API calls are not cached (cache status: BYPASS)
// - Static assets are cached (HIT) but API responses hit origin
// → A HIT says nothing about the origin, because no origin request was made.
//   On uncached (BYPASS/MISS) responses the edge duration approximates the origin
//   round-trip, so consistently low values seen from IAD suggest a US-East origin.
```
Cache status reveals content sensitivity. cfCacheStatus: BYPASS on an API endpoint confirms that the endpoint is not cached — typically because it contains user-specific or authentication-gated content. An attacker infers which endpoints carry sensitive user data from the cache status alone, without reading the response body.
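The cache-status inference can be sketched as a grouping step over collected metrics. This is an illustrative sketch only: it assumes entries shaped like `{ url, metrics: [{ name, desc, dur }] }` and a cfCacheStatus metric whose description holds the status; `groupByCacheStatus` is a hypothetical helper name.

```javascript
// Groups request paths by the cache status reported in a cfCacheStatus metric.
// Query strings are dropped so that /api/me?x=1 and /api/me count as one endpoint.
function groupByCacheStatus(entries) {
  const groups = {};
  for (const { url, metrics } of entries) {
    const status = metrics.find(m => m.name === 'cfCacheStatus')?.desc;
    if (!status) continue; // no cache metric on this response
    const path = new URL(url).pathname;
    if (!groups[status]) groups[status] = new Set();
    groups[status].add(path);
  }
  // Convert the sets to sorted arrays for stable output.
  return Object.fromEntries(
    Object.entries(groups).map(([status, paths]) => [status, [...paths].sort()])
  );
}

// Made-up sample data on a placeholder host.
const sample = [
  { url: 'https://example.test/static/app.js', metrics: [{ name: 'cfCacheStatus', desc: 'HIT', dur: 0 }] },
  { url: 'https://example.test/api/me', metrics: [{ name: 'cfCacheStatus', desc: 'BYPASS', dur: 0 }] },
  { url: 'https://example.test/api/me?x=1', metrics: [{ name: 'cfCacheStatus', desc: 'BYPASS', dur: 0 }] },
];
console.log(groupByCacheStatus(sample));
// { HIT: [ '/static/app.js' ], BYPASS: [ '/api/me' ] }
```

Running the same grouping against your own site shows which endpoints an observer would classify as user-specific.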
Attack 2: Database query profiling via framework timing headers
Backend frameworks that enable timing middleware in production inadvertently expose their database query behavior to client-side scripts. Django Debug Toolbar, when accidentally left active in production, and rack-mini-profiler in Rails with misconfigured authorization, emit SQL metrics directly to the browser:
```javascript
// Django (Debug Toolbar enabled in production):
// Server-Timing: sql;dur=45.2;desc="12 queries", cache;dur=3.1;desc="4 hits 1 miss"
// Discloses: 12 SQL queries executed, 45.2ms total, 4/5 cache operations hit

// Rails rack-mini-profiler:
// Server-Timing: total;dur=312, sql;dur=89;desc="7 queries", redis;dur=12;desc="3 ops"
// Discloses: 312ms request, 7 SQL queries in 89ms, 3 Redis operations in 12ms

// Express custom timing middleware (common pattern):
// Server-Timing: db;dur=234;desc="users collection scan"
// Discloses: 234ms database operation described as a collection scan
//   → potential N+1 or missing index
```
A high SQL query count (>10) combined with high duration (>500ms) on a page suggests an N+1 query pattern. This is architectural intelligence that helps an attacker time load-based attacks or identify which endpoints are expensive enough to cause availability issues under load.
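The rule of thumb above can be written as a small check. This is a sketch under assumptions: the metric is named sql, its description starts with a count such as "12 queries", and the thresholds are the ones quoted in the paragraph above. `queryCount` and `looksLikeNPlusOne` are hypothetical helper names.

```javascript
// Thresholds taken from the paragraph above; they are rules of thumb, not limits.
const QUERY_COUNT_THRESHOLD = 10;
const DURATION_THRESHOLD_MS = 500;

// Extracts the leading integer from a description such as "12 queries".
// The description format is an assumption; returns null when it does not match.
function queryCount(desc) {
  const match = /^(\d+)\s+quer(?:y|ies)\b/.exec(desc || '');
  return match ? Number(match[1]) : null;
}

// True when the sql metric shows both a high query count and a high duration.
function looksLikeNPlusOne(metrics) {
  const sql = metrics.find(m => m.name === 'sql');
  if (!sql) return false;
  const count = queryCount(sql.desc);
  return count !== null
    && count > QUERY_COUNT_THRESHOLD
    && sql.dur > DURATION_THRESHOLD_MS;
}

console.log(looksLikeNPlusOne([{ name: 'sql', desc: '12 queries', dur: 45.2 }])); // false
console.log(looksLikeNPlusOne([{ name: 'sql', desc: '42 queries', dur: 812 }]));  // true
```

The first sample has many queries but a short total duration, so it does not meet both conditions.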
Attack 3: GraphQL resolver timing as data structure inference
Apollo Server deployments that add a resolver-timing plugin can inject per-resolver timing metrics into Server-Timing. Each resolver is named with the pattern TypeName.fieldName, exposing the types and fields that each query touches:
```javascript
// Apollo Server with timing plugin:
// Server-Timing: graphql-parsing;dur=2.1, graphql-validation;dur=1.8,
//   resolver:Query.currentUser;dur=45.2,
//   resolver:User.orders;dur=234.5,
//   resolver:Order.lineItems;dur=89.3,
//   resolver:LineItem.product;dur=12.1

// Inferences:
// 1. Schema has types: Query, User, Order, LineItem, Product
// 2. User.orders took 234ms → likely a database query, not cached
// 3. Order.lineItems took 89ms → secondary database query (N+1 risk)
// 4. The query fetched: currentUser → orders → lineItems → product
//    → this is a shopping/e-commerce application with a specific data model
// Combined with the query URL (/graphql), this exposes the full data
// access pattern without reading the response body.
```
GraphQL schema inference from resolver timing is particularly valuable because many GraphQL APIs don't expose introspection in production but still emit resolver names in Server-Timing.
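The schema inference can be sketched as a parse over observed metric names. This is a minimal sketch, assuming resolver metrics follow the resolver:TypeName.fieldName pattern shown above; `inferSchema` is a hypothetical helper name.

```javascript
// Builds a map of type name -> sorted field names from metric names of the
// form "resolver:TypeName.fieldName". Other metric names are ignored.
function inferSchema(metricNames) {
  const schema = new Map();
  for (const name of metricNames) {
    const match = /^resolver:([A-Za-z_]\w*)\.([A-Za-z_]\w*)$/.exec(name);
    if (!match) continue;
    const [, type, field] = match;
    if (!schema.has(type)) schema.set(type, new Set());
    schema.get(type).add(field); // the Set removes repeats across requests
  }
  const result = {};
  for (const [type, fields] of schema) result[type] = [...fields].sort();
  return result;
}

const observed = [
  'graphql-parsing',
  'resolver:Query.currentUser',
  'resolver:User.orders',
  'resolver:Order.lineItems',
  'resolver:LineItem.product',
  'resolver:User.orders', // repeated across requests
];
console.log(inferSchema(observed));
// { Query: [ 'currentUser' ], User: [ 'orders' ], Order: [ 'lineItems' ], LineItem: [ 'product' ] }
```

Only parent types and field names are recovered this way. A return type such as Product is a guess from the field name, and fields that no observed query touched stay invisible.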
Attack 4: Real-time system load inference
By sampling Server-Timing duration metrics across multiple requests over time, an MCP tool builds a time-series of backend performance that reveals system load patterns, maintenance windows, and scaling events:
```javascript
// Sampling auth API over 60 seconds (db dur values in ms):
//   t=0s:  db;dur=12   → system at normal load
//   t=10s: db;dur=287  → database under load (backup job? batch process?)
//   t=20s: db;dur=891  → high load or connection pool saturation
//   t=30s: db;dur=45   → load reduced (batch job completed?)
//   t=40s: db;dur=18   → normal

// Inference: database load peaks between t=10s and t=20s, consistent with a
//   periodic batch or backup job; longer sampling would reveal its interval
// Attack opportunity: time requests to coincide with db load for cache-timing
//   amplification, or submit resource-intensive operations during the high-load window
```
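The load inference can be sketched as a comparison against a median baseline. This is an illustrative sketch: the samples are the values from the trace above, and `factor` is an arbitrary threshold chosen for the example, not a recommended value. `median` and `findLoadSpikes` are hypothetical helper names.

```javascript
// Median of a non-empty numeric array.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Returns the samples whose duration exceeds `factor` times the median.
// `factor` is an arbitrary illustrative threshold.
function findLoadSpikes(samples, factor = 3) {
  if (samples.length === 0) return [];
  const baseline = median(samples.map(s => s.dur));
  return samples.filter(s => s.dur > baseline * factor);
}

// t is seconds since sampling began, dur is the db metric duration in ms.
const samples = [
  { t: 0, dur: 12 },
  { t: 10, dur: 287 },
  { t: 20, dur: 891 },
  { t: 30, dur: 45 },
  { t: 40, dur: 18 },
];
console.log(findLoadSpikes(samples).map(s => s.t));
// [ 10, 20 ]
```

The median is used as the baseline because a mean would be pulled upward by the spikes it is meant to detect.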
SkillAudit findings for Server-Timing header access
- Reads entry.serverTiming in a PerformanceObserver or getEntriesByType() call that sends the metric values over the network. Exfiltrates CDN routing metadata, backend query counts, and infrastructure topology without reading response bodies.
- Matches the resolver:TypeName.fieldName pattern to infer GraphQL schema structure and data access patterns from Apollo or Hasura responses.
- Reads cfCacheStatus or Fastly cache tier metrics to identify which API endpoints serve user-specific vs. public content, mapping the authenticated-data surface without reading response bodies.

Defense
- Restrict Server-Timing to development only. Remove Django Debug Toolbar, rack-mini-profiler, and Apollo resolver timing plugins from production configurations. Use feature flags to disable these middlewares outside of local development environments.
- Omit descriptive Server-Timing names. If backend timing is needed for operational monitoring (e.g., Cloudflare telemetry for your own Grafana dashboards), use opaque metric names (t1, t2) without description fields that encode semantic information.
- Scope Timing-Allow-Origin carefully. Only set Timing-Allow-Origin: * on responses that contain no sensitive routing or infrastructure information. Do not set it globally at the CDN or application level.
- SkillAudit detection. The SkillAudit scanner flags any entry.serverTiming access that sends data over the network, generating a HIGH finding that blocks installation in security-gated pipelines.
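The opaque-name defense can be sketched as a rewrite of the header value before it leaves the server. This is a minimal sketch, not a complete Server-Timing parser: it assumes comma-separated metrics with semicolon-separated parameters and double-quoted descriptions, and it would be called from whatever response hook your framework provides. `splitOutsideQuotes` and `sanitizeServerTiming` are hypothetical helper names.

```javascript
// Splits `text` on `separator`, ignoring separators inside double quotes.
function splitOutsideQuotes(text, separator) {
  const parts = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes && ch === '\\' && i + 1 < text.length) {
      current += ch + text[++i]; // keep an escaped character intact
    } else if (ch === '"') {
      inQuotes = !inQuotes;
      current += ch;
    } else if (ch === separator && !inQuotes) {
      parts.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  parts.push(current);
  return parts;
}

// Rewrites a Server-Timing value so each metric keeps only its duration under
// an opaque name (t1, t2, ...). Metrics without a numeric dur parameter are
// dropped, since nothing non-descriptive would remain.
function sanitizeServerTiming(headerValue) {
  const out = [];
  for (const metric of splitOutsideQuotes(headerValue, ',')) {
    const params = splitOutsideQuotes(metric, ';').slice(1); // skip the name
    for (const param of params) {
      const [key, value] = param.split('=').map(s => s.trim());
      const isNumber = value !== undefined && value !== '' && !Number.isNaN(Number(value));
      if (key.toLowerCase() === 'dur' && isNumber) {
        out.push(`t${out.length + 1};dur=${Number(value)}`);
        break;
      }
    }
  }
  return out.join(', ');
}

console.log(sanitizeServerTiming(
  'sql;dur=45.2;desc="12 queries, 3 slow", cache;desc="4 hits";dur=3.1, cfCacheStatus;desc=HIT'
));
// t1;dur=45.2, t2;dur=3.1
```

Positional names stay meaningful to your own dashboards only if the server emits metrics in a stable order, so a fixed mapping table may be preferable in practice.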