MCP server secrets management security — Vault integration, dynamic secrets, secret zero bootstrap

Static credentials in MCP server environment variables are a standing liability: they never rotate, they leak into shell history and CI logs, and one compromised deployment yields a credential that stays valid indefinitely. The harder problem is secret zero: how does the MCP server get its first credential to fetch all the others? This page covers the full stack: AppRole auth for the bootstrap, dynamic database credentials with a 1-hour TTL, Kubernetes ServiceAccount token projection for container workloads, and a TTL-based secret cache class that re-fetches a credential before it expires in flight.

What SkillAudit's secrets axis checks

Static VAULT_TOKEN in env — a VAULT_TOKEN environment variable that is a long-lived root or orphan token rather than a short-TTL AppRole or workload identity token. Flagged HIGH because a leaked static root token grants full Vault access with no expiry.
Module-level credential const without TTL — const DB_PASS = await fetchSecret() at module initialization time, never re-fetched. Flagged WARN because the in-memory credential outlives the Vault lease, leaving the server using an invalid credential until the next restart.
DATABASE_URL with inline password in env — DATABASE_URL=postgres://user:password@host/db in any tracked .env or config file. Flagged HIGH; the password never rotates and is visible in any environment variable dump.
No rotation path in handler code — tool handlers that catch database authentication errors (28P01 in PostgreSQL) but do not trigger a secret re-fetch. Flagged WARN; when Vault rotates the database password the server has no recovery path without a restart.

The secret zero bootstrap problem

Every secrets management architecture bottoms out at a credential the process must have before it can fetch any other credential — the "secret zero." For MCP servers, this is usually a Vault token or cloud IAM credential. The most common (and most dangerous) pattern is a long-lived static VAULT_TOKEN passed in via environment variable:

// DANGEROUS: static root Vault token in environment variable
// This token never expires, has full Vault access, and is visible
// in process listings, CI logs, and any environment dump.

import vault from 'node-vault';

const client = vault({
  apiVersion: 'v1',
  endpoint: process.env.VAULT_ADDR,
  token: process.env.VAULT_TOKEN,   // long-lived root/orphan token — NEVER DO THIS
});

// Later in a tool handler:
export async function getDbPassword() {
  const { data } = await client.read('secret/data/db-password');
  return data.data.password;
}

The failure modes: the root token appears in ps auxe output, in any debugger that dumps environment variables, in CI pipeline logs when the deployment command is printed, and in any crash report. If Vault audit logging is enabled, every secret read is attributed to a root token that cannot be tied to a specific deployment or revoked selectively.

Solution: AppRole authentication for secret zero

AppRole splits the initial credential into two parts: a roleId (non-secret, embeddable in the container image or config) and a secretId (secret, delivered out-of-band via a protected environment variable or a one-time delivery mechanism). Neither half alone grants Vault access. Both together issue a short-TTL token that can be renewed or replaced without touching the other half.

// SAFE: AppRole authentication — secret zero via roleId + secretId split
// roleId can be in config (non-secret); secretId is protected env var with short TTL

import vault from 'node-vault';

const vaultBaseClient = vault({
  apiVersion: 'v1',
  endpoint: process.env.VAULT_ADDR,
  // No token here — we authenticate via AppRole below
});

let vaultToken: string | null = null;
let tokenExpiresAt = 0;

export async function getAuthenticatedVaultClient() {
  const now = Date.now();
  // Reuse the cached token until 60 seconds before it expires; after that,
  // fall through and log in again (this re-authenticates rather than renewing)
  if (vaultToken && now < tokenExpiresAt - 60_000) {
    return vault({ apiVersion: 'v1', endpoint: process.env.VAULT_ADDR, token: vaultToken });
  }

  // Authenticate via AppRole: roleId (non-secret) + secretId (protected env, short TTL)
  const roleId = process.env.VAULT_ROLE_ID;       // non-sensitive: can be in image or config
  // sensitive: injected once, short TTL. A later re-login only succeeds while
  // this secretId is still valid, so its TTL bounds how long the process can run.
  const secretId = process.env.VAULT_SECRET_ID;

  if (!roleId || !secretId) {
    throw new Error('VAULT_ROLE_ID and VAULT_SECRET_ID must be set');
  }

  const result = await vaultBaseClient.approleLogin({ role_id: roleId, secret_id: secretId });
  vaultToken = result.auth.client_token;
  // Vault returns TTL in seconds; store expiry as epoch ms
  tokenExpiresAt = now + result.auth.lease_duration * 1000;

  return vault({ apiVersion: 'v1', endpoint: process.env.VAULT_ADDR, token: vaultToken });
}

export async function readSecret(path: string): Promise<Record<string, string>> {
  const client = await getAuthenticatedVaultClient();
  const { data } = await client.read(path);
  return data.data;
}

For cloud-native workloads, Vault Agent auto-auth is even better: the Vault Agent sidecar uses the EC2 instance identity document (AWS), the GCE metadata service (GCP), or a Kubernetes ServiceAccount JWT to authenticate without any application-level credential at all. The Agent writes the resulting token to a file sink (it can also expose a local listener, such as a Unix socket, that proxies requests to Vault). The MCP server reads the token or rendered secrets from disk, so its code never handles the initial bootstrap credential.
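
As a rough illustration of the application side of that arrangement, the sketch below reads a token from an Agent file sink and caches it briefly in memory. It is an assumption-laden example rather than a Vault or node-vault API: `createAgentTokenSource`, the `/run/vault/token` path, and the 30-second cache window are all hypothetical, and the file reader is passed in so the logic can be exercised without a running Agent.

```typescript
// Illustrative sketch: createAgentTokenSource and its parameters are assumptions,
// not part of node-vault or Vault Agent.
type ReadFile = (path: string) => string;

function createAgentTokenSource(
  readFile: ReadFile,                    // e.g. (p) => readFileSync(p, 'utf8')
  sinkPath: string = '/run/vault/token', // assumed file sink path
  maxAgeMs: number = 30_000,             // how long to trust the in-memory copy
  now: () => number = () => Date.now(),
) {
  let cached: string | null = null;
  let readAt = 0;

  return {
    // Return the current token, re-reading the sink file once the cached copy is stale
    getToken(): string {
      const t = now();
      if (cached !== null && t - readAt < maxAgeMs) return cached;
      const token = readFile(sinkPath).trim();
      if (token.length === 0) {
        throw new Error(`Vault Agent sink ${sinkPath} is empty`);
      }
      cached = token;
      readAt = t;
      return token;
    },
  };
}
```

Re-reading the file on a short interval, instead of once at startup, is what lets the process pick up a token the Agent has replaced. A long cache window would reintroduce the stale-credential problem this page is about.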

Static env vars vs dynamic secrets with TTL rotation

Database credentials passed through environment variables are a single shared value that never rotates and cannot be changed without restarting the process. Vault's database secrets engine instead issues short-TTL credentials that are unique per MCP server instance and are revoked automatically when the lease expires. When a server is decommissioned, its lease can also be revoked explicitly.

// DANGEROUS: static DATABASE_URL with embedded password in env
// Password is shared across all instances, never rotates, visible in env dumps

import { Pool } from 'pg';

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  // e.g. postgres://appuser:SuperSecretPass123@db.internal:5432/appdb
  // This password: never rotates, same on all instances, visible in ps output
});

export async function queryDb(sql: string, params: unknown[]) {
  const client = await pool.connect();
  try {
    return await client.query(sql, params);
  } finally {
    client.release();
  }
}

// SAFE: Vault dynamic database credentials with 1-hour TTL and lease renewal
// Each MCP server instance gets unique credentials; Vault revokes them when the lease expires

import { Pool } from 'pg';
// Assumes vault-client.js exports getAuthenticatedVaultClient (the AppRole helper above)
import { getAuthenticatedVaultClient } from './vault-client.js';

interface DbCreds {
  username: string;
  password: string;
  leaseId: string;
  leaseExpiresAt: number;
}

let pool: Pool | null = null;
let currentCreds: DbCreds | null = null;

async function fetchDynamicDbCreds(): Promise<DbCreds> {
  // Vault database secrets engine: each call issues a unique username + password
  // with a configurable TTL (here configured as 1 hour on the Vault role)
  const client = await getAuthenticatedVaultClient();
  const result = await client.read('database/creds/mcp-app-role');

  return {
    username: result.data.username,
    password: result.data.password,
    leaseId: result.lease_id,
    leaseExpiresAt: Date.now() + result.lease_duration * 1000,
  };
}

async function renewLease(leaseId: string): Promise<number> {
  // Renew a Vault lease before it expires — keeps the DB credential valid
  const client = await getAuthenticatedVaultClient();
  const result = await client.write('sys/leases/renew', {
    lease_id: leaseId,
    increment: 3600,  // request another 1-hour extension
  });
  return result.lease_duration;
}

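export async function getDbPool(): Promise<Pool> {
  const now = Date.now();
  // Re-fetch credentials 5 minutes before they expire
  const needsRefresh = !currentCreds || now > currentCreds.leaseExpiresAt - 300_000;

  if (needsRefresh) {
    const creds = await fetchDynamicDbCreds();
    currentCreds = creds;

    // Swap in the new pool first, then drain the old one in the background,
    // so callers are never handed a pool that has already been ended
    const oldPool = pool;
    pool = new Pool({
      host: process.env.DB_HOST,
      port: 5432,
      database: process.env.DB_NAME,
      user: creds.username,
      password: creds.password,
      ssl: { rejectUnauthorized: true },
    });
    if (oldPool) void oldPool.end().catch(() => {});

    // Schedule a lease renewal 10 minutes before expiry (the 50-minute mark for a 1-hour lease)
    const renewIn = creds.leaseExpiresAt - now - 600_000;
    const timer = setTimeout(async () => {
      try {
        const ttlSeconds = await renewLease(creds.leaseId);
        // Record the extended expiry; otherwise the renewal has no effect and
        // the pool is rebuilt 5 minutes before the original expiry anyway
        if (currentCreds === creds) creds.leaseExpiresAt = Date.now() + ttlSeconds * 1000;
      } catch {
        // Renewal failed: pool will be replaced at next getDbPool() call
      }
    }, Math.max(renewIn, 0));
    timer.unref(); // do not keep the process alive just for the renewal timer
  }

  return pool!;
}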

// In tool handler:
// const pool = await getDbPool();
// const result = await pool.query('SELECT ...', [params]);
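
The timing in the block above is easier to check as arithmetic. The following sketch computes the renewal and re-fetch points for a lease, using the same 10-minute and 5-minute lead times. `planLease` and `LeasePlan` are hypothetical names introduced only for this illustration.

```typescript
// Hypothetical helper: planLease is not part of Vault, node-vault, or pg.
interface LeasePlan {
  expiresAt: number; // epoch ms when the lease ends if it is never renewed
  renewAt: number;   // epoch ms at which to call sys/leases/renew
  refetchAt: number; // epoch ms after which new credentials are fetched
}

function planLease(
  issuedAtMs: number,
  leaseDurationSec: number,    // Vault reports lease_duration in seconds
  renewLeadMs: number = 600_000,   // renew 10 minutes before expiry
  refetchLeadMs: number = 300_000, // re-fetch 5 minutes before expiry
): LeasePlan {
  const expiresAt = issuedAtMs + leaseDurationSec * 1000;
  return {
    expiresAt,
    // Clamp so a lease shorter than the lead time never schedules into the past
    renewAt: Math.max(expiresAt - renewLeadMs, issuedAtMs),
    refetchAt: Math.max(expiresAt - refetchLeadMs, issuedAtMs),
  };
}

const plan = planLease(0, 3600);
// plan.renewAt   === 3_000_000 ms (50 minutes)
// plan.refetchAt === 3_300_000 ms (55 minutes)
// plan.expiresAt === 3_600_000 ms (60 minutes)
```

Fixed lead times only make sense when the lease is much longer than the lead. With a 2-minute lease both points clamp to the issue time, which means a re-fetch on every call, so a proportional lead (for example a fraction of the lease duration) may be the better choice for short TTLs.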

Secret zero in containers — Kubernetes ServiceAccount token projection

Container deployments that pass VAULT_TOKEN via docker run -e or Kubernetes env.value fields expose the token in every deployment manifest, Helm value, and CI pipeline, and to anyone with kubectl describe pod access. The Kubernetes-native pattern is ServiceAccount token projection with the Vault Agent sidecar injector, which removes application-level token handling entirely.

// DANGEROUS: passing VAULT_TOKEN via container environment variable
// Visible in: kubectl describe pod, CI pipeline output, Helm values, audit logs

// In Docker Compose or Kubernetes manifest (NEVER DO THIS):
// environment:
//   - VAULT_TOKEN=hvs.CAESIKN9...  ← static root token in plaintext manifest

// In Node.js code:
const client = vault({ token: process.env.VAULT_TOKEN }); // DANGEROUS

// SAFE: Vault Agent sidecar injector — Node.js app reads a token file,
// never handles bootstrap credentials directly.

// Kubernetes pod spec with Vault Agent annotations:
// metadata:
//   annotations:
//     vault.hashicorp.com/agent-inject: "true"
//     vault.hashicorp.com/role: "mcp-server"
//     vault.hashicorp.com/agent-inject-secret-db: "database/creds/mcp-app-role"
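//     vault.hashicorp.com/agent-inject-template-db: |
//       {{- with secret "database/creds/mcp-app-role" -}}
//       DB_USER={{ .Data.username }}
//       DB_PASS={{ .Data.password }}
//       {{- end }}
// (Dynamic database secrets expose fields directly under .Data;
//  the .Data.data form applies to KV v2 secrets.)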

// The Vault Agent sidecar uses the Kubernetes ServiceAccount JWT at:
//   /var/run/secrets/kubernetes.io/serviceaccount/token
// to authenticate to Vault and write rendered secrets to:
//   /vault/secrets/db

// In Node.js — read the file the Agent wrote, no Vault SDK needed:
import { readFileSync, watchFile } from 'fs';
import { parse as parseDotenv } from 'dotenv';

function readAgentSecret(name: string): Record<string, string> {
  const content = readFileSync(`/vault/secrets/${name}`, 'utf8');
  return parseDotenv(content);
}

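// Watch for secret rotation (Vault Agent rewrites the file on renewal):
let dbCreds = readAgentSecret('db');
watchFile('/vault/secrets/db', { interval: 15_000 }, () => {
  dbCreds = readAgentSecret('db');
  // Trigger pool reconnect with new credentials.
  // invalidateDbPool() is not defined on this page: it stands for whatever
  // helper discards the current pg Pool so the next query builds a fresh one.
  invalidateDbPool();
});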

export function getDbCredentials() {
  return { user: dbCreds.DB_USER, password: dbCreds.DB_PASS };
}

The ServiceAccount token at /var/run/secrets/kubernetes.io/serviceaccount/token is projected automatically by Kubernetes. On current clusters it is a time-bound token (about one hour by default) that the kubelet refreshes before it expires. The Vault Kubernetes auth role is bound to that ServiceAccount and namespace and grants only the minimum permissions the MCP server needs. No static token appears in any manifest, CI pipeline, or environment variable.
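
The file-watching code above calls a pool invalidation helper without showing one. A minimal sketch of what such a helper could look like follows. `createPoolHolder` and `Closable` are hypothetical names, and the holder is written against a small interface instead of importing pg, on the assumption that the real pool exposes an `end()` method returning a promise (as pg's `Pool` does).

```typescript
// Hypothetical helper: createPoolHolder and Closable are illustration-only names.
interface Closable {
  end(): Promise<void>;
}

function createPoolHolder<P extends Closable>(build: () => P) {
  let current: P | null = null;

  return {
    // Return the live pool, building one on first use or after invalidation
    get(): P {
      if (current === null) current = build();
      return current;
    },
    // Drop the current pool so the next get() rebuilds it with fresh credentials.
    // The old pool is closed in the background; close errors are ignored here.
    invalidate(): void {
      const old = current;
      current = null;
      if (old !== null) void old.end().catch(() => {});
    },
  };
}
```

With pg, `build` would be something like `() => new Pool({ ...getDbCredentials(), host, database })`, and the `watchFile` callback would call `holder.invalidate()`. Building lazily in `get()` means a rotation that happens while the server is idle costs nothing until the next query.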

Module-level secret caching that outlives rotation

A common pattern that defeats rotation: fetching the secret once at module initialization time and caching it in a module-level const. When Vault rotates the credential, the in-process copy is stale but the code never re-fetches it — the server silently uses an expired credential until the next restart.

// DANGEROUS: module-level credential const — never re-fetches after rotation
// When Vault rotates the DB password at 01:00, this value goes stale.
// The server throws authentication errors until manually restarted.

const DB_PASS = await fetchSecret('database/creds/mcp-app-role');
// ^ evaluated ONCE at import time, cached forever in module scope

const pool = new Pool({ password: DB_PASS });  // uses stale credential post-rotation

export async function queryDb(sql: string, params: unknown[]) {
  // Will fail after Vault rotates the password — no recovery path
  return pool.query(sql, params);
}

// SAFE: SecretCache class with TTL-based invalidation and proactive re-fetch

interface CachedSecret {
  value: string;
  fetchedAt: number;
  ttlMs: number;
}

class SecretCache {
  private cache = new Map<string, CachedSecret>();
  private pending = new Map<string, Promise<string>>();

  constructor(
    private readonly fetcher: (name: string) => Promise<string>,
    private readonly defaultTtlMs = 45 * 60 * 1000,  // 45 min default
  ) {}

  async get(name: string, ttlMs = this.defaultTtlMs): Promise<string> {
    const cached = this.cache.get(name);
    const now = Date.now();

    // Return cached value if it has more than 10% TTL remaining
    if (cached && now < cached.fetchedAt + cached.ttlMs * 0.9) {
      return cached.value;
    }

    // Deduplicate concurrent fetches for the same key
    const existing = this.pending.get(name);
    if (existing) return existing;

    const promise = this.fetcher(name).then(value => {
      this.cache.set(name, { value, fetchedAt: Date.now(), ttlMs });
      this.pending.delete(name);
      return value;
    }).catch(err => {
      this.pending.delete(name);
      // On re-fetch failure, serve the stale value if we have one rather than crashing
      if (cached) {
        console.warn(`Secret re-fetch failed for ${name}, using stale value:`, err.message);
        return cached.value;
      }
      throw err;
    });

    this.pending.set(name, promise);
    return promise;
  }

  invalidate(name: string) {
    this.cache.delete(name);
  }
}

// One instance per process — not per request
const secrets = new SecretCache(
  async (name) => {
    const result = await readSecret(`secret/data/${name}`);
    return result.value;
  },
  45 * 60 * 1000,  // 45 min TTL against a 1-hour Vault lease; with the 0.9 factor in get(), re-fetch starts at about 40.5 min
);

export async function getDbPassword(): Promise<string> {
  return secrets.get('db-password');
}

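// On DB auth failure (Vault may have rotated mid-lease), invalidate and retry:
export async function queryWithRotationRecovery(sql: string, params: unknown[]) {
  try {
    const pool = await getDbPool();
    return await pool.query(sql, params);
  } catch (err: any) {
    // PostgreSQL auth failure codes: 28P01 (invalid_password),
    // 28000 (invalid_authorization_specification)
    if (err.code === '28P01' || err.code === '28000') {
      secrets.invalidate('db-password');
      // Also drop the cached pool: getDbPool() only rebuilds when its own cached
      // credentials look expired, so without this the retry reuses the rejected pool.
      // invalidateDbPool() is an assumed helper that clears that cached state.
      invalidateDbPool();
      // Rebuild pool with fresh credentials, retry once
      const pool = await getDbPool();
      return pool.query(sql, params);
    }
    throw err;
  }
}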

Vault Agent auto-auth with cloud workload identity (AWS / GCP)

For EC2 or Cloud Run deployments that don't use Kubernetes, Vault Agent can authenticate using the platform's workload identity (the instance's IAM role or identity document on AWS, the metadata service on GCP), with no static bootstrap credential of any kind. The Agent runs alongside the MCP server and writes the token and rendered secrets to local files that the MCP server reads.

// vault-agent.hcl: runs alongside the MCP server process (no Node.js code needed
// for the bootstrap). With type = "iam" the Agent logs in using the instance's IAM
// role credentials; type = "ec2" would use the EC2 instance identity document instead.

// auto_auth {
//   method "aws" {
//     mount_path = "auth/aws"
//     config = {
//       type = "iam"
//       role = "mcp-server-role"
//     }
//   }
//   sink "file" {
//     config = {
//       path = "/run/vault/token"
//       mode = 0640
//     }
//   }
// }
//
// template {
//   source      = "/etc/vault-templates/db-creds.tpl"
//   destination = "/run/vault/db-creds"
//   perms       = "0640"
//   command     = "pkill -HUP node || true"  // signal MCP server to reload creds
// }

// In Node.js: read from the file the Agent wrote, no Vault SDK calls needed
import { readFileSync } from 'fs';
import process from 'process';

let credentials = loadCredentials();

function loadCredentials() {
  try {
    const raw = readFileSync('/run/vault/db-creds', 'utf8');
    const lines = Object.fromEntries(
      raw.trim().split('\n').map(l => {
        // Split on the first '=' only, so values containing '=' stay intact
        const i = l.indexOf('=');
        return [l.slice(0, i), l.slice(i + 1)] as [string, string];
      })
    );
    return { user: lines.DB_USER, password: lines.DB_PASS };
  } catch {
    throw new Error('Vault Agent credentials not available; is the agent running?');
  }
}

// Reload on SIGHUP (sent by vault-agent template command above)
process.on('SIGHUP', () => {
  try {
    credentials = loadCredentials();
    console.log('Credentials reloaded from Vault Agent');
  } catch (err) {
    // Keep the previous credentials rather than crashing on a failed reload
    console.error('Credential reload failed, keeping previous values:', err);
  }
});

export function getCurrentDbCredentials() {
  return credentials;
}

What SkillAudit checks in this area

Static VAULT_TOKEN in env — searches for VAULT_TOKEN in .env, Compose files, Kubernetes manifests, and CI configs. Cross-references with whether the value appears to be a long-lived root/orphan token vs a short-TTL token. The hvs. prefix identifies a value as a Vault service token but does not by itself reveal its TTL. Flagged HIGH.
Module-level credential const without TTL — AST analysis looking for top-level const or let declarations initialized from await fetchSecret(), await getSecret(), or similar patterns outside of a class or function body. Flagged WARN; the value never re-fetches.
DATABASE_URL with embedded password in env — regex match on DATABASE_URL, POSTGRES_URL, MYSQL_URL values containing ://user:password@ pattern in tracked files. Flagged HIGH.
No rotation recovery path in handler code — tool handlers that perform database operations but have no catch block that handles PostgreSQL error code 28P01 (invalid password) or equivalent. Flagged WARN; the server has no path to recover from a mid-rotation failure without a restart.
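
To make the DATABASE_URL check concrete, here is a rough sketch of a regex-based scan for connection URLs with an inline password. It is a hypothetical illustration, not SkillAudit's actual rule: `findInlinePasswords`, both patterns, and the decision to skip `$`-prefixed placeholders are assumptions made for this example.

```typescript
// Hypothetical check: these patterns are illustrative, not SkillAudit's real rules.
// Matches lines such as DATABASE_URL=..., export POSTGRES_URL="...", MYSQL_URL: ...
const URL_VAR =
  /^\s*(?:export\s+)?([A-Z0-9_]*(?:DATABASE|POSTGRES|MYSQL)_URL)\s*[=:]\s*["']?([^\s"']+)/;
// Matches scheme://user:password@ and captures the password part
const INLINE_PASSWORD = /^[a-z][a-z0-9+.-]*:\/\/[^\s:@\/]+:([^\s@\/]+)@/i;

interface Finding {
  line: number;     // 1-based line number in the scanned file
  variable: string; // name of the offending variable
}

function findInlinePasswords(fileText: string): Finding[] {
  const findings: Finding[] = [];
  fileText.split(/\r?\n/).forEach((text, index) => {
    const m = URL_VAR.exec(text);
    if (!m) return;
    const [, variable, value] = m;
    const pw = INLINE_PASSWORD.exec(value);
    // Skip ${VAR} or $VAR placeholders, which defer the password to runtime
    if (pw && !pw[1].startsWith('$')) {
      findings.push({ line: index + 1, variable });
    }
  });
  return findings;
}
```

A line-based scan like this misses values split across lines, such as a Kubernetes `name:` / `value:` pair, so a real scanner would likely parse the manifest format instead of relying on a single regex.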
