Security Guide
MCP server Web Neural Network API security — NPU/GPU backend fingerprinting, ML inference timing side channels, hardware identification
The Web Neural Network API (navigator.ml) gives web pages access to the device's neural processing hardware: NPU, GPU, or CPU fallback. navigator.ml.createContext() resolves to an MLContext for the backend the device provides, and NPU backend availability is a high-precision hardware identifier: only Apple Silicon (M1+), Qualcomm Snapdragon X Elite and X Plus, and Intel Core Ultra devices currently provide NPU backends to WebNN. ML inference timing side channels can reveal LLM inference activity in adjacent browser contexts that share the same hardware. No Permissions-Policy directive controls WebNN access.
WebNN hardware backend enumeration: NPU as device fingerprint
The WebNN API allows JavaScript to request a specific hardware backend via the deviceType option: 'npu', 'gpu', or 'cpu'. The promise returned by createContext() resolves to an MLContext if the requested backend is available, or rejects with NotSupportedError if it is not. Probing all three backends reveals whether the device has an NPU, whether a GPU backend is available, and whether the browser falls back to the CPU path.
This enumeration is particularly valuable as a fingerprinting primitive because NPU availability is extremely hardware-specific. In 2026, the devices that provide a browser-accessible NPU backend are: Apple M1/M2/M3/M4 family (including MacBook, Mac Mini, iPad Pro, iPhone via WKWebView), Qualcomm Snapdragon X Elite and X Plus Copilot+ PCs, and Intel Core Ultra (Meteor Lake) CPUs. All other devices return NotSupportedError for deviceType: 'npu'. This is a three-way hardware split that narrows device identity to a specific platform family.
// WebNN hardware backend fingerprint: no permission prompt required
async function webnnFingerprint() {
  const results = {};
  const backends = ['npu', 'gpu', 'cpu'];
  for (const deviceType of backends) {
    try {
      const context = await navigator.ml.createContext({ deviceType });
      results[deviceType] = {
        available: true,
        // Constructor name of the returned object (normally 'MLContext');
        // on its own it carries no extra backend detail
        type: context.constructor.name
      };
    } catch (e) {
      results[deviceType] = {
        available: false,
        // 'NotSupportedError' | 'SecurityError', or 'TypeError' when
        // navigator.ml is undefined (browser without WebNN)
        error: e.name
      };
    }
  }
  // Interpretation:
  // npu: true → Apple Silicon, Snapdragon X Elite/Plus, or Intel Core Ultra
  // npu: false + gpu: true → device without a WebNN-visible NPU, e.g. older Intel/AMD desktop or laptop
  // npu: false + gpu: false + cpu: true → server, VM, or old device
  // npu: true + navigator.platform === 'MacIntel' → Apple Silicon Mac
  //   (browsers report 'MacIntel' on Apple Silicon as well as on Intel Macs)
  navigator.sendBeacon('/track', JSON.stringify({
    webnn: results,
    platform: navigator.platform,
    timestamp: Date.now()
  }));
}
NPU availability is one of the most precise device-type fingerprints available without a permission prompt. Fewer than 15% of browser-capable devices in 2026 have an NPU accessible to WebNN. Within that 15%, the NPU generation (M1 vs M4 vs Snapdragon X vs Intel Core Ultra) can be discriminated further by ML inference throughput measurements. Because the fingerprint reflects hardware rather than stored state, it is stable, is not reset by clearing cookies or site data, and persists across private browsing sessions.
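The interpretation comments in the probe can be expressed as a small classifier. The following is a minimal sketch, assuming the `results` object shape produced by `webnnFingerprint()`. The function name `classifyBackends` and the bucket labels are hypothetical, chosen only for illustration, and the mapping is only as reliable as the availability assumptions stated in this guide.

```javascript
// Hypothetical helper (not part of WebNN): maps probe results to a coarse
// platform bucket. Assumed input shape:
//   { npu: { available }, gpu: { available }, cpu: { available } }
function classifyBackends(results) {
  // Treat a missing entry the same as an unavailable backend.
  const has = (type) => Boolean(results[type] && results[type].available);

  if (has('npu')) return 'npu-class';   // Apple Silicon, Snapdragon X, Intel Core Ultra
  if (has('gpu')) return 'gpu-only';    // no WebNN-visible NPU, GPU backend present
  if (has('cpu')) return 'cpu-only';    // server, VM, or old device
  return 'webnn-unavailable';           // every probe failed
}
```

Even this coarse bucket adds identifying bits when combined with `navigator.platform`, which is why the probe matters although each individual answer is only a boolean.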
ML inference timing side channels
When injected JavaScript runs an ML graph on the WebNN GPU or NPU backend, the execution time depends on whether the same hardware accelerator is being used concurrently by another browser context. In an MCP client that runs on-device LLM inference (a common pattern for privacy-preserving local AI assistants in 2026), the MCP client itself continuously uses the NPU or GPU for token generation. Injected WebNN inference in a tool output context that shares the same hardware will experience contention, producing timing variations that can detect whether LLM inference is actively running.
This is a novel attack vector specific to AI-native MCP deployments: the very hardware infrastructure that makes local LLM inference possible (dedicated NPU) creates a timing oracle that tool output injections can exploit to infer the application's internal computational state.
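The timing oracle described here comes down to comparing two latency distributions: one collected while the accelerator is idle and one collected later. The following is a minimal sketch of that comparison only. It assumes the caller has already timed repeated executions of a small graph (for example with `performance.now()`), which is omitted. The names `median` and `detectContention` and the default ratio of 1.5 are illustrative assumptions, not measured values.

```javascript
// Median of a non-empty numeric array (does not mutate the input).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical detector: reports contention when the median latency of the
// new samples exceeds the idle baseline median by at least `ratioThreshold`.
// The 1.5 default is an arbitrary assumption; a real threshold would have to
// be calibrated per device.
function detectContention(baselineMs, sampleMs, ratioThreshold = 1.5) {
  const ratio = median(sampleMs) / median(baselineMs);
  return { ratio, contended: ratio >= ratioThreshold };
}

// Usage with made-up timings (milliseconds per graph execution):
const idle = [10, 11, 9];
const duringInference = [20, 22, 19];
console.log(detectContention(idle, duringInference).contended); // true
```

Medians are used rather than means so that a single scheduler hiccup does not flip the result. The short length of the sketch shows how little logic the attack needs once injected script can time its own inference runs.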
| Attack | WebNN surface | What it reveals |
|---|---|---|
| Hardware backend fingerprint | createContext({deviceType}) probe | NPU/GPU/CPU availability: device platform family (Apple, Qualcomm, Intel, other) |
| NPU generation timing | ML graph execution throughput | Discriminates M1 from M4, Snapdragon X from Intel Core Ultra by TOPS measurement |
| LLM inference detection | NPU contention timing side channel | Detects whether the MCP client is actively running LLM inference on the same hardware |
| Cross-context ML workload inference | GPU backend contention measurement | Detects other tabs running image generation, embedding computation, or video ML |
Permissions-Policy gap and defenses
As of mid-2026, the WebNN specification does not define a Permissions-Policy feature name. The API is available to any same-origin JavaScript that calls navigator.ml.createContext(). The architectural defense for MCP deployments is cross-origin sandboxed iframe rendering: tool output JavaScript running in a cross-origin context cannot access the application's data to exfiltrate, even if WebNN hardware enumeration remains possible within the sandboxed context. For MCP clients that run on-device LLM inference via WebNN, additional monitoring of ML context creation in the same origin is advisable.
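Both defenses can be sketched in a few lines. This is a rough illustration under stated assumptions, not a hardened implementation. `buildSandboxedFrame` and `monitorContextCreation` are hypothetical helper names. The frame relies on the standard iframe `sandbox` attribute, which gives the content an opaque origin when `allow-same-origin` is omitted. The monitor wraps `createContext` on whatever `ml` object it is given, so it can be exercised with a stand-in object.

```javascript
// Escape a string for use inside a double-quoted HTML attribute.
function escapeAttr(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: render tool output in an iframe whose sandbox allows
// scripts but omits allow-same-origin, so the content runs in an opaque
// origin and cannot read the embedding application's data.
function buildSandboxedFrame(toolOutputHtml) {
  return `<iframe sandbox="allow-scripts" srcdoc="${escapeAttr(toolOutputHtml)}"></iframe>`;
}

// Hypothetical monitor: wraps ml.createContext so every call is reported to
// onCall before being forwarded unchanged. Returns a function that restores
// the original method.
function monitorContextCreation(ml, onCall) {
  const original = ml.createContext;
  ml.createContext = function (...args) {
    onCall({ args, time: Date.now() });
    return original.apply(ml, args);
  };
  return () => { ml.createContext = original; };
}
```

In a browser the monitor would be installed as `monitorContextCreation(navigator.ml, report)`, where `report` is the application's own logging callback. It is a detection aid rather than a security boundary: same-origin script that obtains another reference to the API is not covered, which is why the sandboxed frame remains the primary control.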
SkillAudit findings for WebNN API exposure