Topic: mcp server business logic security

MCP server business logic security — state machine and flow abuse in AI tool handlers

Business logic vulnerabilities are flaws in the intended flow of an application — not in the implementation of a specific operation, but in the sequence, preconditions, and invariants that govern how operations relate to each other. MCP servers introduce a new dimension to this class: the AI model can invoke tools in any order it chooses, at any frequency, potentially in parallel. A flow that a human user would follow linearly — create cart, add items, pay, fulfill — an AI model might execute non-linearly, repetitively, or in reverse. If the tools don't enforce the intended state machine server-side, the model (or a prompt-injection attack shaping its behavior) can exploit the gaps.

Quick reference

Enforce state transitions server-side, not just in the tool description: If fulfillOrder can only be called after chargePayment, the server must verify payment status in the database before executing fulfillment — the tool description alone is not an enforcement mechanism.
Use idempotency keys for multi-step workflows: If the AI retries a tool call due to a timeout or ambiguity, the server should produce the same result rather than double-executing. Idempotency keys prevent double-charge and double-fulfillment bugs.
Rate-limit tools that have financial or quota impact: A tool that charges a credit card should be callable at most once per order, not an unlimited number of times per session. Enforce the limit in the server, with the counter held in durable storage, not just in the tool description.
Design tools with explicit precondition checks: Every tool that depends on prior state should query that state from the database, not trust session-level variables that the AI or a previous tool call might have set incorrectly.
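
As a rough illustration of the first and last points, the intended flow can be written down as an explicit transition table that every handler consults. The sketch below is a minimal TypeScript example, not a definitive implementation: the state names (cart, priced, paid, fulfilled) and the helper names are assumptions made for illustration, and a real server would read the current state from the database rather than receive it as an argument.

```typescript
// Hypothetical order lifecycle; the state names are assumptions for illustration.
type OrderState = "cart" | "priced" | "paid" | "fulfilled";

// Each state lists the only states it may move to. Anything not listed is rejected.
const ALLOWED_TRANSITIONS: Record<OrderState, OrderState[]> = {
  cart: ["priced"],
  priced: ["cart", "paid"], // reopening the cart discards the locked price
  paid: ["fulfilled"],
  fulfilled: [],
};

function canTransition(from: OrderState, to: OrderState): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}

// Returns the resulting state, or the unchanged state plus an error message
// that the tool handler can hand back to the model.
function applyTransition(
  current: OrderState,
  next: OrderState,
): { ok: boolean; state: OrderState; error: string } {
  if (!canTransition(current, next)) {
    return { ok: false, state: current, error: `Illegal transition ${current} -> ${next}` };
  }
  return { ok: true, state: next, error: "" };
}
```

A handler would load the order's stored state, call applyTransition, and return the error to the model whenever ok is false, so that skipped or reversed steps fail regardless of the order in which tools are invoked.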

Why AI-driven tool invocation creates new business logic risks

In a traditional web application, business logic flow is enforced partly by the UI: the user sees a form, fills it out, submits it, and is shown the next step. The server enforces the rules, but the UI also constrains what calls are even possible at each step. The AI model has no such constraint: it sees a list of available tools and can invoke any of them at any point in a session.

This creates several categories of flow abuse that don't exist (or are much harder) in human-driven applications:

Step skipping: Invoking a tool that should only be available after a prior tool has run. For example, calling fulfillOrder without having called chargePayment.
State machine bypass: Invoking a tool multiple times to drive a state machine into an inconsistent state. For example, calling approveExpense from two concurrent sessions to get an expense approved twice.
Price manipulation via sequencing: Adding items to a cart after price has been locked, or applying a discount code multiple times by interleaving calls with the checkout flow.
Quota exhaustion by a different path: Using an administrative tool to reset a usage counter and then running the rate-limited tool again, which is possible when the counter lives in a session variable rather than in the database.
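
The price-manipulation case can be sketched as a cart that refuses changes once its price is locked and that records which discount codes it has already accepted. This is a minimal illustration under stated assumptions: the Cart shape, the CATALOG and DISCOUNTS tables, and all function names are hypothetical, and an in-memory object stands in for what would be database rows in a real server.

```typescript
// Hypothetical catalog and discount tables; a real server would read these from its database.
const CATALOG: Record<string, number> = { "sku-1": 1000, "sku-2": 2500 };
const DISCOUNTS: Record<string, number> = { SAVE5: 500 };

interface Cart {
  items: { sku: string; priceCents: number }[];
  appliedCodes: string[]; // discount codes already used on this cart
  discountCents: number;
  priceLocked: boolean; // set when checkout begins
}

interface ToolResult { ok: boolean; error: string }

function hasKey(table: Record<string, number>, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(table, key);
}

function newCart(): Cart {
  return { items: [], appliedCodes: [], discountCents: 0, priceLocked: false };
}

// Prices come from the catalog, never from the tool arguments.
function addItem(cart: Cart, sku: string): ToolResult {
  if (cart.priceLocked) return { ok: false, error: "Cart is locked for checkout" };
  if (!hasKey(CATALOG, sku)) return { ok: false, error: `Unknown SKU ${sku}` };
  cart.items.push({ sku: sku, priceCents: CATALOG[sku] });
  return { ok: true, error: "" };
}

// Each code is accepted at most once per cart, and never after the price is locked.
function applyDiscount(cart: Cart, code: string): ToolResult {
  if (cart.priceLocked) return { ok: false, error: "Cart is locked for checkout" };
  if (!hasKey(DISCOUNTS, code)) return { ok: false, error: `Unknown code ${code}` };
  if (cart.appliedCodes.indexOf(code) !== -1) {
    return { ok: false, error: `Code ${code} already applied` };
  }
  cart.appliedCodes.push(code);
  cart.discountCents += DISCOUNTS[code];
  return { ok: true, error: "" };
}

// Locks the cart and returns the total that the payment step must charge.
function lockPrice(cart: Cart): number {
  cart.priceLocked = true;
  const subtotal = cart.items.reduce((sum, item) => sum + item.priceCents, 0);
  return Math.max(0, subtotal - cart.discountCents);
}
```

Because the lock and the list of applied codes live with the cart itself, interleaving addItem or applyDiscount calls with the checkout flow cannot change the amount that was locked.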

Three business logic enforcement patterns

1. Server-side state verification before every dependent operation

// VULNERABLE: fulfillment tool trusts that payment has occurred
// because the AI "should" have called chargePayment first
server.tool('fulfillOrder', z.object({
  orderId: z.string(),
}), async ({ orderId }) => {
  // No verification that payment exists — trusts call sequence
  await fulfillOrder(orderId)
  return { fulfilled: true }
})

// SAFE: verify payment status from the database before every fulfillment
server.tool('fulfillOrder', z.object({
  orderId: z.string(),
}), async ({ orderId }) => {
  const order = await db.orders.findUnique({
    where: { id: orderId },
    select: { status: true, paymentId: true, paymentStatus: true },
  })

  if (!order) return { error: 'Order not found' }
  if (order.paymentStatus !== 'captured') {
    return { error: `Cannot fulfill order: payment status is ${order.paymentStatus}` }
  }
  if (order.status !== 'paid') {
    return { error: `Cannot fulfill order: order status is ${order.status}` }
  }

  await fulfillOrder(orderId)
  return { fulfilled: true }
})
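
The SAFE handler above reads the order status and then fulfills in a separate step, so two concurrent calls could both pass the check before either one records the result. One common way to close that window is to make the state change itself conditional, so that only one caller can claim the order. The sketch below illustrates the idea with an in-memory record standing in for the orders table; the names transitionIf and fulfillOnce and the intermediate "fulfilling" state are assumptions. With a SQL database the same idea is usually a conditional UPDATE (for example, setting the status where the id matches and the status is still 'paid') followed by a check of the affected row count.

```typescript
// In-memory stand-in for an orders table, used only so the sketch runs.
// A real server would issue the equivalent conditional UPDATE against its database.
type FulfillState = "paid" | "fulfilling" | "fulfilled";
const orderStatus: Record<string, FulfillState> = { "order-1": "paid" };

// Moves an order from `expected` to `next` only if it is currently in `expected`.
// Returns true for the single caller that wins the transition.
function transitionIf(orderId: string, expected: FulfillState, next: FulfillState): boolean {
  if (orderStatus[orderId] !== expected) return false;
  orderStatus[orderId] = next;
  return true;
}

let shipments = 0; // counts how many times the side effect actually ran

function fulfillOnce(orderId: string): { fulfilled: boolean; error: string } {
  // Claim the order first; a concurrent or repeated call finds it no longer "paid".
  if (!transitionIf(orderId, "paid", "fulfilling")) {
    return { fulfilled: false, error: "Order is not in the paid state" };
  }
  shipments += 1; // placeholder for the real fulfillment side effect
  transitionIf(orderId, "fulfilling", "fulfilled");
  return { fulfilled: true, error: "" };
}
```

The second call for the same order fails at the claim step, so the fulfillment side effect runs once even if the model invokes the tool repeatedly or from parallel sessions.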

2. Idempotency keys for financial and quota-consuming operations

// Pattern: idempotency key prevents double-execution on retry
server.tool('chargePayment', z.object({
  orderId: z.string(),
  amountCents: z.number().int().positive(),
  // caller-supplied key: unique per logical charge, reused unchanged on every retry
  idempotencyKey: z.string().uuid(),
}), async ({ orderId, amountCents, idempotencyKey }) => {
  // Check if this key was already used
  const existing = await db.payments.findUnique({
    where: { idempotencyKey },
  })

  if (existing) {
    // Return the previous result without charging again
    return {
      chargeId: existing.chargeId,
      status: existing.status,
      idempotent: true,
    }
  }

  // First invocation: execute the charge
  const charge = await stripe.charges.create({
    amount: amountCents,
    currency: 'usd',
    metadata: { orderId, idempotencyKey },
  }, { idempotencyKey })

  // Record the result so retries return it
  await db.payments.create({
    data: { orderId, idempotencyKey, chargeId: charge.id, status: charge.status },
  })

  return { chargeId: charge.id, status: charge.status, idempotent: false }
})
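
Two gaps remain in the handler above: an idempotency key is not tied to the order it was first used with, and the amount is taken from the tool arguments, which the model controls. The following sketch shows one way both could be tightened. It is illustrative only: in-memory records stand in for the orders and payments tables, fakeCharge is a hypothetical placeholder for the payment provider call, and all names are assumptions.

```typescript
// In-memory stand-ins for the orders and payments tables (assumptions, so the sketch runs).
const orderTotals: Record<string, number> = { "order-1": 4200, "order-2": 900 };

interface PaymentRecord { orderId: string; chargeId: string; amountCents: number }
const paymentsByKey: Record<string, PaymentRecord> = {};

function hasOwn(table: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(table, key);
}

let chargesMade = 0;
// Hypothetical placeholder for the payment provider call.
function fakeCharge(amountCents: number): string {
  chargesMade += 1;
  return `ch_${chargesMade}_${amountCents}`;
}

interface ChargeResult { ok: boolean; chargeId: string; replayed: boolean; error: string }

function chargeOrder(orderId: string, idempotencyKey: string): ChargeResult {
  if (hasOwn(paymentsByKey, idempotencyKey)) {
    const existing = paymentsByKey[idempotencyKey];
    // A key may only replay the request it was first used for.
    if (existing.orderId !== orderId) {
      return { ok: false, chargeId: "", replayed: false, error: "Idempotency key belongs to a different order" };
    }
    return { ok: true, chargeId: existing.chargeId, replayed: true, error: "" };
  }
  if (!hasOwn(orderTotals, orderId)) {
    return { ok: false, chargeId: "", replayed: false, error: "Order not found" };
  }
  // The amount comes from the stored order, never from the tool arguments.
  const amountCents = orderTotals[orderId];
  const chargeId = fakeCharge(amountCents);
  paymentsByKey[idempotencyKey] = { orderId: orderId, chargeId: chargeId, amountCents: amountCents };
  return { ok: true, chargeId: chargeId, replayed: false, error: "" };
}
```

In a database-backed version, a unique constraint on the key column would play the role of the lookup here, so that two concurrent first attempts cannot both insert a payment record.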

3. Session-level tool call rate limiting

// Pattern: rate limit financially impactful tools per session
// Stored in Redis or database — not in-memory session state
// (in-memory state can be bypassed by reconnecting)

async function checkSessionRateLimit(
  sessionId: string,
  toolName: string,
  maxPerSession: number,
): Promise<{ allowed: boolean; count: number }> {
  const key = `session:${sessionId}:tool:${toolName}`
  const count = await redis.incr(key)

  if (count === 1) {
    // First call — set expiry for session lifetime
    await redis.expire(key, 3600)  // 1 hour
  }

  return { allowed: count <= maxPerSession, count }
}

server.tool('chargePayment', chargeSchema, async ({ orderId, ...rest }, ctx) => {
  // A retry that reuses an idempotencyKey should be answered from the stored
  // payment record (pattern 2) before this counter runs; otherwise a limit
  // of 1 would also reject legitimate retries.
  const { allowed, count } = await checkSessionRateLimit(
    ctx.sessionId,
    'chargePayment',
    1,  // at most 1 new charge per session
  )

  if (!allowed) {
    return { error: `chargePayment has been called ${count} times this session; retry with the original idempotencyKey` }
  }

  // proceed with charge
})
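
To make the counting behaviour of checkSessionRateLimit easier to reason about, here is a small fixed-window counter with the same semantics (the first call starts the window, later calls increment until the window expires). It is a sketch for illustration only: the in-memory record stands in for Redis so that the example runs on its own, the clock is passed in as a parameter to keep it testable, and the names checkLimit and counters are assumptions. As the comments above note, a real server should keep the counter in Redis or a database.

```typescript
// In-memory stand-in for the Redis counter, used only so the sketch runs.
interface CounterEntry { count: number; expiresAtMs: number }
const counters: Record<string, CounterEntry> = {};

function checkLimit(
  sessionId: string,
  toolName: string,
  maxCalls: number,
  windowMs: number,
  nowMs: number, // passed in so the behaviour is easy to test
): { allowed: boolean; count: number } {
  const key = `session:${sessionId}:tool:${toolName}`;
  const entry = counters[key];
  if (!entry || nowMs >= entry.expiresAtMs) {
    // First call in a new window: start counting and set the expiry.
    counters[key] = { count: 1, expiresAtMs: nowMs + windowMs };
    return { allowed: 1 <= maxCalls, count: 1 };
  }
  // Later calls in the same window only increment; the expiry is not extended.
  entry.count += 1;
  return { allowed: entry.count <= maxCalls, count: entry.count };
}
```

Rejected calls still increment the counter, which matches the Redis version above and means a looping caller cannot regain access before the window ends.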

What SkillAudit checks

Financial or quota-consuming tools with no database-level precondition check — WARN; if fulfillment, charge, or send tools can be invoked without a server-side state assertion, step-skipping is possible
Absence of idempotency handling in charge/debit/send tools — WARN; AI models retry on ambiguity; without idempotency, retries cause double-execution of irreversible operations
State stored in session variables or tool arguments rather than database — INFO; session state is not a reliable enforcement mechanism because the model (or attacker) can manipulate what the model reports; database is authoritative
No rate limiting on tools that have financial, quota, or permission-escalation effects — WARN; prompt injection attacks that loop tool calls can exhaust quotas or repeatedly trigger financial operations