Security Guide

MCP server WebGPU API security — GPU timing side channels, adapter fingerprinting, compute shader processing for data exfiltration

WebGPU gives browser JavaScript direct access to GPU compute shaders, GPU memory buffers, and GPU timing queries. For MCP server tool output that can inject JavaScript into a same-origin context, WebGPU expands the attack surface in three directions. First, GPU adapter information (GPUAdapterInfo) reveals the GPU vendor and architecture, and in some configurations the device model and driver description, without any user permission, which sharpens device fingerprinting. Second, GPU timer queries (the timestamp-query feature) report GPU-side timestamps in nanoseconds through a path separate from performance.now(), which opens new classes of side-channel attacks. Third, compute shaders can process megabytes of stolen data in milliseconds, so exfiltration payloads can be prepared far faster on the GPU than with CPU-based JavaScript.

What WebGPU exposes and where MCP servers encounter it

WebGPU is the successor to WebGL, providing a lower-level GPU API suitable for 3D rendering, scientific computation, and machine learning inference in the browser. Browser-based MCP clients that support in-browser model inference — local LLM execution, embedding generation, or image analysis — use WebGPU to offload computation to the GPU. The threat surface arises when MCP tool output can inject JavaScript that calls navigator.gpu.requestAdapter().

The WebGPU API requires no user permission. navigator.gpu.requestAdapter() returns a GPUAdapter object, and the adapter's GPUAdapterInfo (read from adapter.info in current browsers, or from the older adapter.requestAdapterInfo() method, which needs no permission in Chrome 121+) contains the vendor, architecture, device, and description fields in plain text. Browsers may coarsen these strings or leave device and description empty, but whatever is populated is available to any same-origin JavaScript with no user-visible prompt.

// GPU adapter fingerprinting: no permission prompt is involved
const adapter = await navigator.gpu?.requestAdapter();
if (adapter) {
  // Current browsers expose a synchronous adapter.info attribute; older Chrome
  // builds only offered the since-removed adapter.requestAdapterInfo() method.
  const info = adapter.info ?? await adapter.requestAdapterInfo?.();

  const fingerprint = {
    vendor: info?.vendor,              // e.g. "nvidia" | "amd" | "intel" | "apple"
    architecture: info?.architecture,  // e.g. "ampere" | "rdna-3" | "xe" | "apple-gpu"
    device: info?.device,              // model identifier; may be an empty string
    description: info?.description,    // driver description; may be an empty string

    // Additional precision: supported features narrow the device model further
    features: [...adapter.features].sort(),

    // GPU limits hint at memory size and shader capabilities
    maxBufferSize: adapter.limits.maxBufferSize,
    maxComputeWorkgroupStorageSize: adapter.limits.maxComputeWorkgroupStorageSize
  };

  navigator.sendBeacon('/track', JSON.stringify(fingerprint));
  // A rare GPU such as an NVIDIA GeForce RTX 4090 narrows the user to ~1% of Chrome users
  // Combined with User-Agent, timezone, and screen resolution: near-unique device ID
}

GPU adapter info can be more precise than canvas fingerprinting. Canvas fingerprinting, the standard browser fingerprinting technique, exploits subtle differences in GPU rendering to produce a semi-unique hash. WebGPU GPUAdapterInfo instead returns the GPU vendor and architecture, and where the browser populates them the device model and driver description, as plain strings. No rendering is required. Because these values describe hardware rather than browser state, the fingerprint is stable across private browsing sessions, storage clears, and VPN usage.
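To make the "near-unique" claim in the snippet's closing comments concrete, fingerprint strength is often estimated in bits of surprisal: a value shared by a fraction p of users contributes -log2(p) bits, and bits from independent attributes add up. The sketch below is illustrative only. The function names are hypothetical, the population shares other than the ~1% GPU figure quoted above are invented for the example, and real attributes are correlated, so the sum should be read as an upper bound.

```javascript
// Bits of identifying information carried by a value seen in `share` of users.
function surprisalBits(share) {
  if (!(share > 0 && share <= 1)) {
    throw new RangeError('share must be in (0, 1]');
  }
  return -Math.log2(share);
}

// Total bits across attributes, assuming independence (an upper bound in practice).
function combinedBits(shares) {
  return shares.reduce((total, share) => total + surprisalBits(share), 0);
}

// Hypothetical population shares; only the ~1% GPU figure comes from the text above.
const shares = {
  gpuModel: 0.01,
  timezone: 0.05,
  screenResolution: 0.10
};

const bits = combinedBits(Object.values(shares));
console.log(`GPU alone: ${surprisalBits(shares.gpuModel).toFixed(2)} bits`);
console.log(`Combined:  ${bits.toFixed(2)} bits, about 1 in ${Math.round(2 ** bits)} users`);
```

Each additional attribute multiplies the size of the crowd the user no longer blends into, which is why a single precise GPU string matters more than its own share suggests.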

GPU timer queries: nanosecond timing side channels

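WebGPU's timestamp-query feature (available when adapter.features.has('timestamp-query') is true and the device is requested with it in requiredFeatures) provides GPU-side timestamps expressed in nanoseconds. While the browser quantizes performance.now() to prevent timing attacks, GPU timestamp queries measure time on the GPU itself and reach script through a separate path. Browsers can coarsen these values too (Chrome quantizes them to 100 microseconds by default), so the effective resolution depends on the browser and its flags rather than on the nanosecond unit.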

GPU timing side channels work by measuring how long a specific compute operation takes — which depends on whether the GPU's caches, shader cores, or memory bandwidth are contended by another context. In a same-GPU-device scenario (where two browser contexts share the same physical GPU), contention in one context produces measurable timing variation in the other. This enables a limited form of cross-context inference that bypasses the browser's JavaScript timer mitigations.
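As a rough illustration of how such timing variation becomes a signal, the sketch below compares probe timings against a quiet baseline using medians and a fixed ratio threshold. The function names, the 1.5x ratio, and the sample values are assumptions made up for the example, not measurements from a real GPU. Defenders can use the same logic to check how much signal survives a browser's timer coarsening.

```javascript
// Median is used instead of the mean so a single scheduling hiccup does not dominate.
function median(values) {
  if (values.length === 0) throw new RangeError('need at least one sample');
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Flags contention when the probe median exceeds the baseline median by `ratio`.
function looksContended(baselineNs, probeNs, ratio = 1.5) {
  return median(probeNs) > ratio * median(baselineNs);
}

// Made-up timings in nanoseconds for a fixed-cost compute pass.
const baseline = [410000, 400000, 420000, 405000, 1900000]; // includes one outlier
const quiet = [415000, 398000, 407000];
const busy = [880000, 910000, 402000, 950000, 870000];

console.log('quiet window contended?', looksContended(baseline, quiet));
console.log('busy window contended? ', looksContended(baseline, busy));
```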

// GPU timer query side channel: measures compute shader execution time
// to detect GPU contention from other contexts.
// The device must be created with requiredFeatures: ['timestamp-query'];
// probeComputePipeline is any fixed-cost compute pipeline without bind groups.

async function gpuTimingProbe(device, probeComputePipeline) {
  const querySet = device.createQuerySet({ type: 'timestamp', count: 2 });
  const resolveBuffer = device.createBuffer({
    size: 16,  // two 64-bit timestamps
    usage: GPUBufferUsage.QUERY_RESOLVE | GPUBufferUsage.COPY_SRC
  });
  const resultBuffer = device.createBuffer({
    size: 16,
    usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.MAP_READ
  });

  // Run a fixed-cost compute shader and measure GPU execution time.
  // Timestamps are written at the start and end of the pass via timestampWrites;
  // the older encoder.writeTimestamp() call is no longer part of the API.
  const encoder = device.createCommandEncoder();
  const pass = encoder.beginComputePass({
    timestampWrites: {
      querySet,
      beginningOfPassWriteIndex: 0,
      endOfPassWriteIndex: 1
    }
  });
  pass.setPipeline(probeComputePipeline);
  pass.dispatchWorkgroups(64, 64);
  pass.end();

  encoder.resolveQuerySet(querySet, 0, 2, resolveBuffer, 0);
  encoder.copyBufferToBuffer(resolveBuffer, 0, resultBuffer, 0, 16);
  device.queue.submit([encoder.finish()]);

  await resultBuffer.mapAsync(GPUMapMode.READ);
  // Timestamps are unsigned 64-bit nanosecond values
  const timestamps = new BigUint64Array(resultBuffer.getMappedRange());
  const elapsedNs = Number(timestamps[1] - timestamps[0]);
  resultBuffer.unmap();

  return elapsedNs;  // Execution time in nanoseconds (possibly quantized by the browser)
  // Elevated times indicate GPU contention from another context
}
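Browsers can blunt this probe by coarsening timestamps before script sees them. The sketch below shows the effect of rounding nanosecond timestamps down to a fixed step. The 100 microsecond step mirrors what Chrome is understood to apply to timestamp queries by default, and quantizeNs and quantizedElapsedNs are hypothetical helpers for illustration, not WebGPU APIs or any browser's actual implementation.

```javascript
// Assumed quantization step: 100 microseconds, expressed in nanoseconds.
const STEP_NS = 100_000n;

// Round a nanosecond timestamp down to the nearest multiple of the step.
function quantizeNs(timestampNs, stepNs = STEP_NS) {
  return (timestampNs / stepNs) * stepNs; // BigInt division truncates
}

// Elapsed time as script would observe it after both endpoints are quantized.
function quantizedElapsedNs(startNs, endNs, stepNs = STEP_NS) {
  return Number(quantizeNs(endNs, stepNs) - quantizeNs(startNs, stepNs));
}

// Two passes whose true durations differ by 30 microseconds (210 vs 240).
const fast = quantizedElapsedNs(1_000_020_000n, 1_000_230_000n);
const slow = quantizedElapsedNs(1_000_020_000n, 1_000_260_000n);

console.log({ fast, slow, distinguishable: fast !== slow });
```

With these inputs both passes report the same elapsed time, so a 30 microsecond contention effect disappears. Coarsening raises the cost of the attack rather than eliminating it: longer probes and more samples can still recover larger effects.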

Compute shader data processing for high-throughput exfiltration

GPU compute shaders can process data at orders of magnitude higher throughput than CPU-based JavaScript. For exfiltration scenarios where large amounts of data need to be processed before transmission — compression, encryption, encoding — a compute shader can prepare megabytes of data in milliseconds, reducing the time window during which exfiltration processing is visible in CPU profiling.

In practice, an attacker who wants to exfiltrate a large IndexedDB database, encrypted so that content inspection cannot recognize it, can use a WebGPU compute shader to encrypt the data on the GPU in 50–200ms and then transmit the ciphertext over a standard fetch or WebTransport connection. The CPU-side JavaScript is trivially small; the compute shader does the heavy lifting and leaves little CPU footprint for profilers to catch.

Attack	API surface	What it enables
Device fingerprinting	`GPUAdapterInfo` (vendor, architecture, device, description)	GPU vendor and architecture, plus model and driver description where exposed; stable cross-session identifier
GPU timing side channel	`timestamp-query` feature + GPU command encoder	Contention measurement — infer computation in other GPU contexts
High-throughput data processing	GPU compute shaders	Compress/encrypt large exfiltration payloads in milliseconds
Feature detection fingerprinting	`adapter.features` and `adapter.limits`	GPU generation, VRAM size, shader model — narrow device to model variant

Permissions-Policy and defenses

As of mid-2026, WebGPU does not have a standardized Permissions-Policy feature name in the Permissions Policy specification. The primary architectural defense is cross-origin iframe isolation for MCP tool output rendering: if injected JavaScript runs in a cross-origin sandboxed iframe at a distinct registrable domain, it has access to that iframe's GPU context (if any) but cannot read the application's storage or access the parent frame's data.

For MCP client deployments that use WebGPU for legitimate in-browser model inference, the risk cannot be eliminated by policy controls alone. The most effective defense is rigorous tool output sanitization (DOMPurify with strict configuration) combined with cross-origin iframe rendering that limits what any injected code can access even if it successfully calls WebGPU APIs.
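The iframe isolation described above might look like the following sketch. It assumes tool output is served from a dedicated origin on a separate registrable domain; TOOL_OUTPUT_ORIGIN and createToolOutputFrame are hypothetical names, and the document object is passed in as a parameter so the helper can be exercised outside a browser. Sanitization of the tool output itself (for example with DOMPurify) would happen in addition to this, not instead of it.

```javascript
// Hypothetical dedicated origin for rendering tool output, on its own registrable domain.
const TOOL_OUTPUT_ORIGIN = 'https://mcp-tool-output.example';

// Builds a sandboxed iframe that renders tool output away from the application's origin.
function createToolOutputFrame(doc, renderPath) {
  const src = new URL(renderPath, TOOL_OUTPUT_ORIGIN);
  if (src.origin !== TOOL_OUTPUT_ORIGIN) {
    throw new Error(`Refusing to render tool output from ${src.origin}`);
  }

  const frame = doc.createElement('iframe');
  // allow-scripts without allow-same-origin: the framed document gets an opaque
  // origin, so it cannot read the parent's storage, cookies, or DOM.
  frame.setAttribute('sandbox', 'allow-scripts');
  frame.setAttribute('referrerpolicy', 'no-referrer');
  frame.setAttribute('src', src.href);
  return frame;
}
```

In a page this would be used as `document.body.append(createToolOutputFrame(document, '/render?id=123'))`. Code injected into that frame may still be able to call WebGPU and fingerprint the GPU, but it is cut off from the application's IndexedDB data and from the parent frame.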

SkillAudit findings for WebGPU API misuse

High: Tool output rendered same-origin; injected code can read GPUAdapterInfo without permission. WebGPU adapter information (GPU vendor, architecture, and where exposed the model and driver description) is available to any same-origin JavaScript without user permission. MCP prompt injection can exfiltrate this as a stable, cross-session device fingerprint. Grade impact: −16.

High: No DOMPurify or CSP script-src nonce; WebGPU accessible via injected script tags. Without script injection prevention, tool output can contain JavaScript that immediately calls navigator.gpu.requestAdapter() and sendBeacon() to fingerprint the user's device without any visible interaction. Grade impact: −16.

Medium: MCP client uses WebGPU for inference; tool output rendered in same GPU context. When the MCP client runs LLM inference via WebGPU in the same browser context as tool output rendering, injected compute shaders can interfere with or observe the inference workload via GPU timing side channels. Grade impact: −12.

Medium: connect-src CSP absent or allows wildcard. GPU-accelerated data processing enables fast preparation of large exfiltration payloads. Without a restrictive connect-src, these payloads can be transmitted to attacker endpoints immediately after GPU processing completes. Grade impact: −10.

Low: GPU adapter info logged in tool call telemetry without redaction. A tool that reads GPUAdapterInfo and returns the result in the MCP tool response passes GPU fingerprinting data through the LLM context, where it may be stored in model provider logs or conversation exports. Grade impact: −6.
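The script-src and connect-src findings above can be addressed with a policy along the lines of the sketch below. buildCsp and the example origins are hypothetical, the directive set is a starting point rather than a complete policy, and the nonce is assumed to be generated freshly from a cryptographically secure source for every response.

```javascript
// Builds a Content-Security-Policy header value with a nonce-based script-src
// and an explicit, wildcard-free connect-src allowlist.
function buildCsp({ nonce, connectOrigins = [], frameOrigins = [] }) {
  if (!/^[A-Za-z0-9+/_-]+={0,2}$/.test(nonce)) {
    throw new TypeError('nonce must be a base64 or base64url string');
  }
  for (const origin of [...connectOrigins, ...frameOrigins]) {
    if (origin.includes('*')) {
      throw new Error(`Wildcard source not allowed: ${origin}`);
    }
  }

  const directives = [
    "default-src 'self'",
    // Only scripts carrying this response's nonce run; injected <script> tags do not.
    `script-src 'nonce-${nonce}'`,
    // Governs fetch(), sendBeacon(), and similar outbound connections.
    `connect-src ${["'self'", ...connectOrigins].join(' ')}`,
    `frame-src ${frameOrigins.length > 0 ? frameOrigins.join(' ') : "'none'"}`,
    "object-src 'none'",
    "base-uri 'none'"
  ];
  return directives.join('; ');
}

// Example with hypothetical origins and a placeholder nonce.
const header = buildCsp({
  nonce: 'q83vEjRWeJCrze8S',
  connectOrigins: ['https://api.mcp-client.example'],
  frameOrigins: ['https://mcp-tool-output.example']
});
console.log(`Content-Security-Policy: ${header}`);
```

Rejecting wildcard sources in code keeps a later configuration change from silently reopening the exfiltration path that the connect-src finding describes.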
