19 min read · Compliance · Healthcare

MCP Server Security for Healthcare: HIPAA, PHI Protection, and Agentic AI Tool Access

Healthcare organizations are deploying AI agents that call MCP servers to query EHR APIs, retrieve FHIR resources, and pull clinical decision-support data. Every one of those MCP servers that handles Protected Health Information is now HIPAA Business Associate Agreement territory — with breach notification obligations, Technical Safeguard requirements, and audit logging mandates that most MCP server authors never anticipated.

63% of MCP servers with database or API integration tools log request arguments in plaintext
$10.9M average HIPAA breach cost (IBM 2025)
6 years minimum HIPAA audit log retention
0 MCP servers in the SkillAudit corpus with HIPAA-explicit logging

Why MCP servers are a HIPAA problem

HIPAA's Security Rule applies to Electronic Protected Health Information (ePHI) — any individually identifiable health information stored or transmitted electronically. The rule was written in an era of database servers and web applications. It predates large language models, tool-calling agents, and the Model Context Protocol entirely.

But the law does not care about the vintage of the technology. If a system stores, transmits, or handles ePHI on behalf of a Covered Entity, it is a Business Associate under 45 CFR 160.103, and HIPAA's Technical Safeguards at 45 CFR 164.312 apply to it in full.

An MCP server that exposes a tool like get_patient_record(mrn: string), query_ehr(patient_id: string, date_range: object), or search_clinical_notes(patient: string, keywords: string[]) is — by definition — handling ePHI. The tool arguments contain PHI. The tool responses contain PHI. The server's logs, if they capture arguments, contain PHI. Its in-memory state during a call contains PHI.

The BAA gap. Most healthcare organizations have BAAs with their EHR vendors, their cloud providers, and their SaaS tools. Very few have evaluated whether their AI infrastructure — including the MCP server layer — needs its own BAA treatment. If an MCP server is a third-party component deployed by a vendor, that vendor is almost certainly a Business Associate that needs a signed BAA before going live with any PHI-adjacent tool.

The PHI-in-tool-arguments problem

Standard MCP server implementations inherit whatever logging framework is standard for their language ecosystem. In Node.js, that typically means console.log or a structured logger like Winston or Pino. None of these filter what they are given: they serialize whatever object the handler passes in, and the common debugging habit is to pass the entire request context, including tool call arguments.

The result is that a tool call like:

    // Typical MCP tool handler in Node.js
    // Assumes `server`, `z` (zod), `logger`, and `ehrClient` are already in scope
    server.tool(
      "query_patient",
      {
        patient_id: z.string(),
        date_range: z.object({ start: z.string(), end: z.string() }),
      },
      async (args) => {
        logger.info("Tool called", args); // <-- THIS logs PHI to stdout
        const result = await ehrClient.getRecords(args.patient_id, args.date_range);
        // result may also contain PHI
        return { content: [{ type: "text", text: JSON.stringify(result) }] };
      }
    );

...will write patient identifiers (MRNs, names, DOBs, procedure codes) to whatever the default log sink is — often a container stdout that flows to a central logging service with no PHI handling controls, no encryption at rest, and no 6-year retention management.

This is not a theoretical risk. In our scan of the SkillAudit corpus, 63% of MCP servers with database or API integration tools include request argument logging at info level or above. For servers in the healthcare domain specifically, that is a per-call PHI disclosure event.
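One way to close this gap is to log from an allowlist, so that only fields known not to carry PHI are ever written. The sketch below is a minimal illustration under assumptions: the `safeLogFields` helper and the allowlisted field names are hypothetical and are not part of any MCP SDK or logging library.

```javascript
// Hypothetical allowlist of fields known not to carry PHI.
const SAFE_FIELDS = new Set(["tool", "outcome", "duration_ms"]);

// Return a copy of `fields` containing only allowlisted keys.
// Everything else (patient_id, date_range, free text) is dropped rather than
// masked, so a newly added PHI-bearing argument cannot leak by default.
function safeLogFields(fields) {
  const out = {};
  for (const [key, value] of Object.entries(fields)) {
    if (SAFE_FIELDS.has(key)) out[key] = value;
  }
  return out;
}

// Usage inside a tool handler: log metadata about the call, never raw `args`.
const args = {
  patient_id: "MRN-0012345",
  date_range: { start: "2026-01-01", end: "2026-06-01" },
};
const line = JSON.stringify(
  safeLogFields({ tool: "query_patient", outcome: "success", duration_ms: 234, ...args })
);
console.log(line);
```

An allowlist fails closed: a denylist of known PHI field names would miss the next field someone adds to the tool schema.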

HIPAA Technical Safeguards: what applies to MCP servers

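45 CFR 164.312 specifies five Technical Safeguard standards: access control, audit controls, integrity, person or entity authentication, and transmission security. Here is how the provisions that bear most directly on MCP server architecture map: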

164.312(a)(1) Access Control

Unique user identification, emergency access procedure, automatic logoff, encryption and decryption. For MCP servers: every tool call must be authenticated to a specific user (not a shared service account). Sessions must time out. PHI returned from tools must be encrypted in transit. The common pattern of a single shared API key for all agent tool calls fails this requirement if multiple human users are represented by those agents.

164.312(a)(2)(i) Unique User Identification

Each user must have a unique identifier tracked through access. An AI agent acting on behalf of multiple users — the common architecture for multi-patient workflows — must pass per-call user context to the MCP server, not use a single agent identity. The MCP server must propagate this identity to the underlying EHR API and include it in all audit records.

164.312(b) Audit Controls

Hardware, software, and procedural mechanisms that record and examine activity in information systems that contain or use ePHI. This is the most operationally intensive requirement for MCP servers. Every tool call that reads or writes PHI must be logged with: timestamp, user identifier, tool name, resource identifier accessed, outcome (success/failure). Logs must be protected from modification and retained for a minimum of 6 years (the HIPAA records retention standard). Plain-text argument logging that captures PHI is not compliant — it creates an uncontrolled PHI store.
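To illustrate what "protected from modification" can mean in code, here is a minimal sketch of a hash-chained audit log, where each entry commits to the hash of the one before it. The `AuditChain` class is a hypothetical name for this example. A hash chain only makes tampering detectable; it is an assumption of this sketch that write-once storage and retention are handled elsewhere.

```javascript
const { createHash } = require("node:crypto");

const GENESIS = "0".repeat(64); // placeholder "previous hash" for the first entry

// Append-only audit log in which each entry commits to the previous entry's
// hash. Editing, deleting, or reordering an entry breaks every later hash.
class AuditChain {
  constructor() {
    this.entries = [];
  }

  append(event) {
    const prevHash = this.entries.length
      ? this.entries[this.entries.length - 1].hash
      : GENESIS;
    const payload = JSON.stringify(event);
    const hash = createHash("sha256").update(prevHash + payload).digest("hex");
    this.entries.push({ prevHash, payload, hash });
    return hash;
  }

  // Recompute every hash from the start; false means the log was altered.
  verify() {
    let prevHash = GENESIS;
    for (const entry of this.entries) {
      const expected = createHash("sha256")
        .update(prevHash + entry.payload)
        .digest("hex");
      if (entry.prevHash !== prevHash || entry.hash !== expected) return false;
      prevHash = entry.hash;
    }
    return true;
  }
}

const chain = new AuditChain();
chain.append({ user_id: "user_8a7f3c2d", tool_name: "get_latest_lab_result", outcome: "success" });
chain.append({ user_id: "user_8a7f3c2d", tool_name: "get_latest_lab_result", outcome: "failure" });
const intact = chain.verify();
console.log("audit chain intact:", intact);
```

In practice the latest hash would also be anchored somewhere the MCP server cannot rewrite, since an attacker with full write access could otherwise rebuild the whole chain.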

164.312(c)(1) Integrity

Protect ePHI from improper alteration or destruction. MCP tool responses containing PHI must be tamper-evident. If a tool call returns clinical data and that data is cached anywhere in the agent pipeline, the cache must enforce integrity. This is especially relevant for multi-step agents that may pass PHI through several tool calls — each hop is a potential integrity failure point.
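A cache can be made tamper-evident by storing an HMAC alongside each entry and checking it on every read. The following is a sketch under assumptions: `seal`, `unseal`, and the hard-coded demo key are hypothetical, and a real deployment would fetch the key from a secrets manager.

```javascript
const { createHmac, timingSafeEqual } = require("node:crypto");

// Assumption: demo key only. A real key comes from a secrets manager or KMS.
const CACHE_KEY = Buffer.from("demo-key-do-not-use-in-production");

// Serialize a tool response and attach an HMAC-SHA256 tag over the bytes.
function seal(value) {
  const body = JSON.stringify(value);
  const tag = createHmac("sha256", CACHE_KEY).update(body).digest("hex");
  return { body, tag };
}

// Verify the tag before trusting the cached body; throw on any mismatch.
function unseal(entry) {
  const expected = createHmac("sha256", CACHE_KEY).update(entry.body).digest();
  const actual = Buffer.from(entry.tag, "hex");
  // timingSafeEqual throws on unequal lengths, so check length first.
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) {
    throw new Error("cache entry failed integrity check");
  }
  return JSON.parse(entry.body);
}

const sealed = seal({ loinc: "4548-4", value: 6.9, unit: "%" });
const roundTrip = unseal(sealed);
console.log(roundTrip);
```

An HMAC provides integrity only. The cached body is still readable PHI and still needs encryption at rest.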

164.312(e)(1) Transmission Security

Guard against unauthorized access to ePHI transmitted over an electronic communications network. MCP servers must use TLS 1.2 minimum (TLS 1.3 preferred) for all connections. Certificate validation must not be disabled. Connections to upstream EHR APIs must also enforce TLS with certificate pinning or at minimum hostname verification. A sandboxed MCP server that allows outbound HTTP connections to non-TLS endpoints fails this requirement immediately.
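In Node.js, these transport rules can be enforced at the one place where upstream requests are built. The sketch below assumes a hypothetical `ehrRequestOptions` helper that every tool handler uses, and `ehr.example.org` is a placeholder host.

```javascript
const https = require("node:https");

// Agent for upstream EHR calls: TLS 1.2 or newer, with certificate and
// hostname verification left on. rejectUnauthorized defaults to true; it is
// spelled out here so that any change to it shows up in code review.
const ehrAgent = new https.Agent({ minVersion: "TLSv1.2", rejectUnauthorized: true });

// Reject any upstream URL that is not https before a request is attempted.
function assertTlsUrl(rawUrl) {
  const url = new URL(rawUrl);
  if (url.protocol !== "https:") {
    throw new Error(`refusing non-TLS upstream: ${url.protocol}//${url.host}`);
  }
  return url;
}

// Hypothetical helper: build options for https.request() using the agent above.
function ehrRequestOptions(rawUrl) {
  const url = assertTlsUrl(rawUrl);
  return { hostname: url.hostname, path: url.pathname + url.search, agent: ehrAgent };
}

const options = ehrRequestOptions("https://ehr.example.org/fhir/Observation?code=4548-4");
console.log(options.hostname, options.path);
```

Certificate pinning is not shown. It would be layered on top of this check, not substituted for hostname verification.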

The minimum necessary standard and MCP tool design

45 CFR 164.514(d) — the "minimum necessary" standard — requires that covered entities limit uses and disclosures of PHI to the minimum amount necessary to accomplish the intended purpose. This creates a direct design constraint on MCP tool interfaces.

A tool like get_patient_record(patient_id) that returns the entire patient record — demographics, visit history, medications, billing data, social history — when the calling agent only needs the most recent HbA1c result is not minimum-necessary compliant, even if the agent discards the rest of the response before sending it to the LLM context.

The reason: PHI was accessed beyond the minimum necessary scope. The EHR system logged a full-record access. The MCP server received a full-record response. The agentic system had an ephemeral window where PHI was in memory. Each of those events is a disclosure.

Minimum-necessary-compliant MCP tool design looks like this:
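As a minimal sketch (not a real EHR integration), a narrow tool takes a patient reference and a single LOINC code and returns only the latest matching result. The `fakeEhr` object, the `tok_41c9` patient reference, and the returned field names are assumptions made for illustration.

```javascript
// Hypothetical in-memory stand-in for an EHR client. A real client would push
// the filter to the server, for example with a FHIR search such as
// Observation?patient=...&code=...&_sort=-date&_count=1.
const fakeEhr = {
  observations: [
    { patient: "tok_41c9", code: "4548-4", value: 7.4, unit: "%", date: "2025-11-02" },
    { patient: "tok_41c9", code: "4548-4", value: 6.9, unit: "%", date: "2026-05-20" },
    { patient: "tok_41c9", code: "2345-7", value: 101, unit: "mg/dL", date: "2026-05-20" },
  ],
  async searchObservations(patient, code) {
    return this.observations.filter((o) => o.patient === patient && o.code === code);
  },
};

// Narrow tool: one patient, one LOINC code, one result, three fields.
// No demographics, medications, billing data, or visit history are fetched.
async function getLatestLabResult({ patient_ref, loinc_code }) {
  const matches = await fakeEhr.searchObservations(patient_ref, loinc_code);
  if (matches.length === 0) return null;
  // ISO dates compare correctly as strings, so keep the later of each pair.
  const latest = matches.reduce((a, b) => (a.date >= b.date ? a : b));
  return { value: latest.value, unit: latest.unit, date: latest.date };
}

getLatestLabResult({ patient_ref: "tok_41c9", loinc_code: "4548-4" }).then((result) =>
  console.log(result)
);
```

The important property is that the restriction is applied in the query to the EHR, not by discarding fields after a full-record fetch.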

Audit logging requirements in practice

164.312(b) is the requirement teams most often underbuild. The OCR (Office for Civil Rights) in its audit guidance specifies that compliant audit logs must support answering: who accessed what PHI, when, from where, and with what result?

For MCP servers, this means every tool call must emit a structured audit event — separate from the application log — containing:

    // HIPAA-compliant audit event structure for an MCP tool call
    {
      "event_type": "phi_access",
      "timestamp": "2026-06-14T14:23:11.456Z",   // ISO 8601, UTC
      "user_id": "user_8a7f3c2d",                // unique, durable user identifier
      "agent_session_id": "sess_b4e9f1a0",       // the LLM session that initiated this call
      "tool_name": "get_latest_lab_result",
      "resource_type": "Observation",            // FHIR resource type or equivalent
      "resource_id": "obs-mrn-ref:encrypted",    // patient ref, tokenized, NOT raw MRN
      "access_scope": ["loinc:4548-4"],          // what was requested
      "outcome": "success",
      "response_record_count": 1,
      "duration_ms": 234,
      "source_ip": "10.0.4.22",
      "mcp_server_version": "1.4.2"
    }

Critical: the resource_id field must not contain the raw patient MRN or SSN. Audit logs are long-retained (minimum 6 years) and the PHI inside them creates their own compliance surface. Use a tokenized patient reference that can be resolved back to the MRN only by an authorized administrator with access to the token map.
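One possible way to produce such a reference is a keyed hash of the MRN. The sketch below is an assumption, not a prescribed scheme: `tokenizeMrn`, `buildAuditEvent`, the `ptok_` prefix, and the demo key are all hypothetical. Because an HMAC is one-way, the token map that lets an administrator resolve a token back to an MRN would be stored separately under its own access controls.

```javascript
const { createHmac } = require("node:crypto");

// Assumption: demo key only. A real key lives in a KMS or secrets manager.
const TOKEN_KEY = "demo-tokenization-key";

// Deterministic keyed token: the same MRN always yields the same token, so
// audit queries can group by patient without the log ever holding the MRN.
function tokenizeMrn(mrn) {
  const digest = createHmac("sha256", TOKEN_KEY).update(mrn).digest("hex");
  return `ptok_${digest.slice(0, 32)}`;
}

// Build an audit event that carries the token instead of the raw identifier.
function buildAuditEvent({ userId, sessionId, toolName, mrn, outcome }) {
  return {
    event_type: "phi_access",
    timestamp: new Date().toISOString(), // ISO 8601, UTC
    user_id: userId,
    agent_session_id: sessionId,
    tool_name: toolName,
    resource_id: tokenizeMrn(mrn),
    outcome,
  };
}

const event = buildAuditEvent({
  userId: "user_8a7f3c2d",
  sessionId: "sess_b4e9f1a0",
  toolName: "get_latest_lab_result",
  mrn: "MRN-0012345",
  outcome: "success",
});
console.log(JSON.stringify(event));
```

A plain unkeyed hash would not be enough here, since MRNs are short and structured and could be recovered by brute force.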

Application logs — where the MCP framework logs tool call arguments for debugging — must be either scrubbed of PHI before writing or restricted to a separate, PHI-classified log pipeline with its own access controls, encryption, and retention management. Running the same logging pipeline for debug logs and HIPAA audit logs is a compliance gap.

Access control: per-user authorization at the MCP layer

HIPAA's access control standard requires that each user have only the access they need for their role. In healthcare AI deployments, this is complicated by the agentic model: the LLM makes the tool call, not the human user — yet the access must still be scoped to the human user's authorization.

The correct implementation pattern is:

  1. The human user authenticates to the agent application using their SSO/IdP credentials (SAML, OIDC).
  2. The agent application exchanges that assertion for a short-lived, scoped access token bound to the specific user and their clinical role.
  3. Every MCP tool call from that agent session includes the user-bound token in the Authorization header.
  4. The MCP server validates the token on every call (not just at session start — see checklist item #1), extracts the user and role claims, and checks those claims against the tool being called.
  5. The EHR API call is made with the user-specific token, not a service account credential.

This ensures the EHR's own access controls (RBAC, break-the-glass policies, care team restrictions) remain in effect even when access comes via an AI agent. A hospitalist agent should not be able to access psychiatry notes any more than the hospitalist human user can.
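The per-call check in step 4 can be sketched as a small pure function. This is an illustration under assumptions: the `TOOL_POLICY` table, the role names, and the claim names (`sub`, `role`, `exp`) are hypothetical, and `claims` is assumed to be the payload of a token whose signature has already been verified against the IdP's keys.

```javascript
// Hypothetical role-to-tool policy; a real deployment would load this from
// configuration managed alongside the EHR's own RBAC rules.
const TOOL_POLICY = new Map([
  ["get_latest_lab_result", ["hospitalist", "psychiatrist"]],
  ["search_psychiatry_notes", ["psychiatrist"]],
]);

// Decide whether the user behind this token may call this tool right now.
// Runs on every tool call, not once per session.
function authorizeToolCall(claims, toolName, nowSeconds = Math.floor(Date.now() / 1000)) {
  if (!claims || typeof claims.sub !== "string") {
    return { allowed: false, reason: "missing user identity" };
  }
  if (typeof claims.exp !== "number" || claims.exp <= nowSeconds) {
    return { allowed: false, reason: "token expired" };
  }
  const allowedRoles = TOOL_POLICY.get(toolName) ?? []; // unknown tools are denied
  if (!allowedRoles.includes(claims.role)) {
    return { allowed: false, reason: `role ${claims.role} may not call ${toolName}` };
  }
  return { allowed: true, userId: claims.sub };
}

const hospitalist = { sub: "user_8a7f3c2d", role: "hospitalist", exp: 2000 };
const decision = authorizeToolCall(hospitalist, "search_psychiatry_notes", 1000);
console.log(decision);
```

This check is a first gate only. Passing the same user-bound token through to the EHR API keeps the EHR's own access rules as the final authority.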

Many community MCP servers for EHR integration use a static API key for all calls — a single credential stored in environment variables that grants broad access regardless of who the calling user is. This is not HIPAA-compliant for multi-user deployments. It is also exactly what our secrets management audit flags as a credential hygiene failure.

Encryption requirements

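HIPAA's Security Rule lists encryption of ePHI at rest (164.312(a)(2)(iv)) and in transit (164.312(e)(2)(ii)) as addressable implementation specifications, which in practice means implementing them or documenting an equivalent alternative. For MCP servers, treat both as required: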

Component | Encryption requirement | Common failure mode | Severity
MCP server transport | TLS 1.2+ for all network connections (including calls from locally launched stdio servers to HTTP-based upstream APIs) | Development-mode HTTP used in production; TLS verification disabled for internal hosts | Critical
Tool argument caching | Any cache that stores PHI-containing arguments must be encrypted at rest | Redis cache without encryption-at-rest; memcached without TLS | Critical
Audit log storage | Audit logs containing tokenized PHI references must be encrypted at rest | Log files on unencrypted filesystem; CloudWatch without KMS key | High
Temporary files | Tool handlers that write temp files (e.g., for processing large responses) must write to encrypted volumes | /tmp writes on unencrypted instances | High
Container image | Credentials in image layers must be removed; base images must be from trusted registries | Hardcoded API keys in Dockerfile ENV instructions | Critical
Backup/snapshot | EBS snapshots, database backups containing PHI must be encrypted | RDS automated backups without encryption; manual EBS snapshots without KMS | Medium

Breach notification and MCP tool calls

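HIPAA's Breach Notification Rule (45 CFR 164.400–414) requires Covered Entities to notify affected individuals, HHS, and, for breaches affecting more than 500 residents of a state or jurisdiction, the media. Notice to individuals must go out without unreasonable delay and no later than 60 calendar days after a breach of unsecured PHI is discovered. Business Associates must in turn notify the Covered Entity, subject to the same 60-day outer limit.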

For MCP servers, the most likely breach scenarios are:

Prompt injection leading to PHI disclosure. An attacker-controlled document in the agent's context contains an injection payload that causes the LLM to call an MCP tool with a different patient's identifier than intended, or to exfiltrate a PHI-containing tool response to an external URL. This is a HIPAA breach. It is also the scenario our prompt injection post covers in depth.

Log exfiltration. Application logs containing PHI in tool call arguments are exfiltrated (through misconfigured S3 bucket ACLs, exposed Elasticsearch clusters, or compromised log aggregation credentials). If the logs contain PHI and they reach an unauthorized party, it is a breach — regardless of whether the MCP server itself was compromised.

Credential compromise leading to bulk PHI access. If the static API key used by an MCP server for EHR access is leaked (through a public git commit, an exposed environment variable, a misconfigured secret manager), an attacker can call the EHR API directly without going through the MCP server. The MCP server's access controls are irrelevant if the underlying credentials are exposed.

Unauthorized agent access. Multi-tenant agent deployments where user A's agent can call tools on user B's behalf — a scope validation failure — constitute a breach of user B's PHI even if both users are patients at the same institution.

The 60-day clock. When the Business Associate acts as the Covered Entity's agent, the breach notification clock starts when the Business Associate discovers the breach, not when the Covered Entity learns of it. If an MCP server vendor discovers a breach on day 1, they must notify the Covered Entity in time for the CE to make its 60-day notification. Best practice: BAAs should require Business Associates to notify within 10 business days of discovery. MCP server vendors should have incident response procedures that meet this timeline.
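The two deadlines are easy to compute and worth putting into incident tooling. The sketch below assumes that days are counted from the day after discovery and that "business day" means Monday to Friday with no holiday calendar; a specific BAA may define either differently. The helper names are hypothetical.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Add calendar days to a "YYYY-MM-DD" date, working in UTC throughout.
function addCalendarDays(isoDate, days) {
  const t = Date.parse(`${isoDate}T00:00:00Z`) + days * DAY_MS;
  return new Date(t).toISOString().slice(0, 10);
}

// Add business days, skipping Saturdays and Sundays. Public holidays are
// ignored in this sketch.
function addBusinessDays(isoDate, days) {
  let t = Date.parse(`${isoDate}T00:00:00Z`);
  let remaining = days;
  while (remaining > 0) {
    t += DAY_MS;
    const dayOfWeek = new Date(t).getUTCDay(); // 0 = Sunday, 6 = Saturday
    if (dayOfWeek !== 0 && dayOfWeek !== 6) remaining -= 1;
  }
  return new Date(t).toISOString().slice(0, 10);
}

const discovered = "2026-06-14"; // date the Business Associate discovered the breach
const outerDeadline = addCalendarDays(discovered, 60); // regulatory outer limit
const baaDeadline = addBusinessDays(discovered, 10); // contractual BAA deadline
console.log({ discovered, baaDeadline, outerDeadline });
```

For a breach discovered on 2026-06-14, this gives a BAA deadline of 2026-06-26 and an outer limit of 2026-08-13.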

What a HIPAA-compliant MCP server looks like

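Combining the above requirements, a HIPAA-compliant MCP server for healthcare must implement: per-call validation of a user-bound token, with the same identity propagated to the EHR API; narrowly scoped tools that return only the minimum necessary data; a structured audit event for every PHI-touching tool call, with tokenized patient references, tamper protection, and 6-year retention; application logs that are scrubbed of PHI or routed to a PHI-classified pipeline; TLS 1.2+ on every network hop and encryption at rest for caches, logs, temp files, and backups; and a signed BAA backed by incident response procedures that meet the breach notification timeline.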

What SkillAudit flags in healthcare MCP server audits

SkillAudit's automated scan cannot verify organizational safeguards (workforce training, BAA existence, incident response procedures) — those require a manual assessment. But the automated scan detects the technical signals that indicate HIPAA-relevant risk:

An MCP server with F or D grades in any of these axes is not ready for healthcare deployment. C-grade servers require manual review to determine whether the flagged patterns actually handle PHI or are confined to non-PHI data paths. A-grade servers are the starting point for a healthcare HIPAA review — they still require the organizational safeguards layer, but they clear the basic technical hurdles.

Healthcare organizations evaluating AI vendor MCP servers should request a SkillAudit report as part of the BAA due-diligence process. The automated report does not replace a full HIPAA risk analysis, but it surfaces the issues that would turn a clean vendor assessment into a compliance finding before the server reaches production.
