Human-in-the-Loop Agent Approvals: A Mastra Pattern

Agents that can send emails, finalize invoices, or publish content need a gate between “the model decided to act” and “the action happened.” The standard approach is a system prompt instruction: “always confirm with the user before sending email.” It reads like a rule. It runs like a suggestion.

Most of the time the model confirms. Some of the time it doesn’t. The failure rate isn’t high enough to catch in testing, but it’s high enough to matter in production, where a skipped confirmation means an email sent to the wrong person or an invoice finalized with the wrong amount.

The problem is structural. When you put the approval decision in the prompt, the model is doing two jobs at once: deciding what to do, and deciding whether to ask first. Both decisions are negotiable. The model can reason its way past either one.

The version that holds moves the second decision out of the model entirely, enforcing human-in-the-loop approval at the framework level instead of the prompt level.

Framework-Enforced vs. Prompt-Instructed Approval in Mastra

Mastra is a TypeScript framework for building AI agents and applications. Its approval primitive, requireApproval, operates at the tool definition level. When a tool has requireApproval: true, the framework intercepts the tool call before execution. The model doesn’t get a choice about whether to pause. The runtime halts, surfaces the pending action, and waits for a human signal before proceeding.

Here’s what a gated tool looks like:

import { createTool } from "@mastra/core/tools";
import { z } from "zod";

const sendEmail = createTool({
  id: "sendEmail",
  description: "Send an email to a recipient",
  inputSchema: z.object({
    to: z.string().email(),
    subject: z.string(),
    body: z.string(),
  }),
  requireApproval: true,
  execute: async ({ to, subject, body }) => {
    // This only runs after human approval.
    // emailService stands in for your application's own mail client; it is not part of Mastra.
    const result = await emailService.send({ to, subject, body });
    return { messageId: result.id, status: "sent" };
  },
});

The execute function never fires until a human approves it. The model can’t reason past requireApproval: true because it never sees the flag. It’s framework infrastructure, not a prompt constraint.

Compare this to the prompt-based version:

You are a helpful assistant. Before sending any email,
always ask the user to confirm the recipient, subject,
and body. Do not send without explicit confirmation.

This works until the model decides the context is clear enough to skip the confirmation. Or until a prompt injection overrides it. Or until you update the system prompt and accidentally weaken the constraint. The prompt version is a convention. The framework version is a gate.

How the Approval Flow Works

When an agent calls a tool with requireApproval: true, Mastra’s runtime emits a tool-call-approval chunk on the response stream. This chunk contains everything the approver needs to make a decision:

// The chunk surfaced to the approval handler
// runId is accessed separately via stream.runId, not inside this chunk
{
  type: "tool-call-approval",
  payload: {
    toolCallId: "call_abc123",
    toolName: "sendEmail",
    args: {
      to: "sarah@example.com", // placeholder address
      subject: "Q2 Invoice - Project Atlas",
      body: "Hi Sarah, attached is the invoice for..."
    }
  }
}

The agent’s execution suspends at this point. It doesn’t continue generating tokens, calling other tools, or falling back to “I’ll assume that’s approved.” It waits.

On the other side, your application receives the pending approval and collects a decision. The resume call is straightforward:

// Approve — tool executes, agent continues
await agent.approveToolCall({
  runId: stream.runId,
  toolCallId: "call_abc123",
});

// Decline: the tool doesn't execute, and the agent is told the call was rejected
await agent.declineToolCall({
  runId: stream.runId,
  toolCallId: "call_abc123",
});

After approval, the tool’s execute function runs and the agent continues with the result. After decline, the agent receives a signal that the action was rejected and can adjust its approach.
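The pause-and-resume loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Mastra’s own code: ApprovalChunk, OtherChunk, and ApprovalAgent are hypothetical stand-ins that mirror only the chunk shape and the two calls shown in this article, and decide is whatever your application uses to collect a human decision (a Slack button, a dashboard, a CLI prompt). How the resumed output is consumed after approval is left out.

```typescript
// Hypothetical types mirroring only what this article shows; not Mastra's exported types.
type ApprovalChunk = {
  type: "tool-call-approval";
  payload: { toolCallId: string; toolName: string; args: Record<string, unknown> };
};
// Placeholder for every other chunk kind on the stream.
type OtherChunk = { type: "text-delta"; payload: { text: string } };
type Chunk = ApprovalChunk | OtherChunk;

interface ApprovalAgent {
  approveToolCall(opts: { runId: string; toolCallId: string }): Promise<unknown>;
  declineToolCall(opts: { runId: string; toolCallId: string }): Promise<unknown>;
}

// Your application's way of asking a human; resolves to true for "approve".
type Decide = (pending: ApprovalChunk["payload"]) => Promise<boolean>;

// Watch a stream, hand every pending approval to `decide`, and resume accordingly.
async function handleApprovals(
  runId: string,
  chunks: AsyncIterable<Chunk>,
  agent: ApprovalAgent,
  decide: Decide,
): Promise<string[]> {
  const log: string[] = [];
  for await (const chunk of chunks) {
    if (chunk.type !== "tool-call-approval") continue;
    const { toolCallId, toolName } = chunk.payload;
    const approved = await decide(chunk.payload);
    if (approved) {
      await agent.approveToolCall({ runId, toolCallId });
    } else {
      await agent.declineToolCall({ runId, toolCallId });
    }
    // Keep an audit line for every decision, approved or not.
    log.push(`${toolName}:${approved ? "approved" : "declined"}`);
  }
  return log;
}
```

The returned log is a small design choice: recording every decision at the point where it is made gives you the approval history for free.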

What the Approval Event Contains

The approval chunk carries the tool name, tool call ID, and full args object. For the sendEmail example, that means the pending approval includes the recipient, subject, and body because those are the tool arguments.

This is where the framework boundary matters. Mastra gives you the pause, the pending tool call, and the approve or decline mechanism. It does not enrich the payload, inspect your business objects, or decide how a human should review the action. The primitive is intentionally small: execution stops before the tool runs, and your application can approve or decline the exact tool call.

That makes tool design part of using approval gates. If a gated tool takes explicit arguments, the approval event is easy to inspect:

{
  toolName: "sendEmail",
  args: {
    to: "sarah@example.com", // placeholder address
    subject: "Q2 Invoice - Project Atlas",
    body: "Hi Sarah, attached is the invoice for..."
  }
}

If a gated tool hides the real action behind a generic payload, the framework can’t fix that. A tool named stripeApiExecute with a raw JSON body is harder to reason about than a tool named finalizeInvoice with narrow, typed arguments. Mastra will still pause either call, but the second one gives you a clearer approval event.

The model proposes the action. The framework pauses execution. The application decides whether to approve or decline that exact tool call.
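To make the difference concrete, here is a rough sketch of how the two tool shapes look once rendered for a reviewer. PendingApproval is a hypothetical type holding only the fields the approval event is described as carrying, and the argument names for finalizeInvoice and the request body for stripeApiExecute are invented for illustration.

```typescript
// Hypothetical shape: only the fields this article says the approval event carries.
type PendingApproval = {
  toolName: string;
  toolCallId: string;
  args: Record<string, unknown>;
};

// Render a pending tool call as plain text for a human reviewer.
function describeApproval(pending: PendingApproval): string {
  const lines = Object.entries(pending.args).map(([key, value]) => {
    // Strings are shown as-is; everything else is serialized so nothing is hidden.
    const shown = typeof value === "string" ? value : JSON.stringify(value);
    return `  ${key}: ${shown}`;
  });
  return [`${pending.toolName} (${pending.toolCallId})`, ...lines].join("\n");
}

// Narrow tool: each argument is a field the reviewer can check at a glance.
const narrow = describeApproval({
  toolName: "finalizeInvoice",
  toolCallId: "call_1",
  args: { invoiceId: "inv_42", amountCents: 125000, currency: "USD" },
});

// Generic tool: the real action is buried inside one opaque argument.
const generic = describeApproval({
  toolName: "stripeApiExecute",
  toolCallId: "call_2",
  args: { body: { path: "/v1/invoices/inv_42/finalize", method: "POST" } },
});

console.log(narrow);
console.log(generic);
```

Both calls pause the same way. Only the first gives the reviewer one line per decision-relevant field.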

Anthropic’s data from the Claude Code auto mode launch found that users approve 93% of permission prompts. That rate isn’t inherently a problem, but it becomes one when approvals are too vague to catch the actions that should be rejected. Framework enforcement gives you the hard stop. Clear tool boundaries make the stop useful.

Deciding What to Gate

Not every tool needs an approval gate. The question I ask for each tool is simple: if the agent gets this wrong, can I undo it?

I sort tools into three tiers based on that answer.

Tier	Risk level	Approval	Examples
Read-only	Low	None	Database queries, calendar reads, file reads
Reversible writes	Medium	Optional	Creating drafts, updating internal state, labeling emails
Irreversible actions	High	Required	Sending emails, finalizing invoices, publishing content, external API mutations

This isn’t about making the agent less capable. It’s about matching the cost of a mistake to the cost of a review. Low-tier tools run at machine speed. High-tier tools run at human-decision speed, which is the appropriate speed for actions you can’t take back.
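One way to keep the tier table from drifting away from the code is to make it data. The sketch below is an assumption-laden illustration: the tool names in toolTiers are a hypothetical inventory, and defaulting unknown tools to the strictest tier is my choice, not something the framework does for you.

```typescript
type Tier = "read-only" | "reversible-write" | "irreversible";
type ApprovalPolicy = "none" | "optional" | "required";

// The tier table, expressed as a lookup.
const policyByTier: Record<Tier, ApprovalPolicy> = {
  "read-only": "none",
  "reversible-write": "optional",
  irreversible: "required",
};

// Hypothetical tool inventory; replace with your own tools.
const toolTiers: Record<string, Tier> = {
  queryDatabase: "read-only",
  createDraft: "reversible-write",
  sendEmail: "irreversible",
  finalizeInvoice: "irreversible",
};

// Tools missing from the inventory fall back to the strictest tier.
function approvalPolicyFor(toolName: string): ApprovalPolicy {
  const tier = toolTiers[toolName] ?? "irreversible";
  return policyByTier[tier];
}

// The boolean you would pass as requireApproval when defining the tool.
// gateOptional is a team-level switch for the "Optional" row of the table.
function shouldRequireApproval(toolName: string, gateOptional = false): boolean {
  const policy = approvalPolicyFor(toolName);
  return policy === "required" || (policy === "optional" && gateOptional);
}
```

With this in place, a tool definition can set requireApproval from shouldRequireApproval("sendEmail") instead of a hand-typed literal, so reclassifying a tool is a one-line change to the inventory.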

For conditional gating, where the same tool needs approval in some contexts but not others, Mastra’s suspend() primitive inside the execute function gives you that control at runtime:

const deployService = createTool({
  id: "deployService",
  // ...
  execute: async ({ service, environment, isDryRun }, { suspend }) => {
    if (!isDryRun) {
      // Pause and wait for human approval before a real deploy
      await suspend({ service, environment });
    }
    // Dry runs proceed without pausing
    // ...
  },
});

This lets you gate the same tool differently based on context. Skip the pause for dry runs, always suspend for real mutations, or gate based on the sensitivity of the specific action.
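When the condition grows beyond a single flag, it can help to pull the decision into a pure function that the execute body consults before calling suspend(). The sketch below shows the sensitivity-based variant mentioned above, so it deliberately differs from the deployService example: real deploys pause only for environments listed in sensitiveEnvironments. That set, the GateDecision type, and the decideGate name are all hypothetical.

```typescript
type DeployRequest = { service: string; environment: string; isDryRun: boolean };
type GateDecision = { pause: false } | { pause: true; reason: string };

// Hypothetical policy input: environments where a real deploy always needs sign-off.
const sensitiveEnvironments = new Set(["production"]);

// Pure decision function: no framework calls, so it can be unit tested directly.
function decideGate(req: DeployRequest): GateDecision {
  // Dry runs never pause.
  if (req.isDryRun) return { pause: false };
  // Real deploys pause only when the target environment is sensitive.
  if (sensitiveEnvironments.has(req.environment)) {
    return { pause: true, reason: `Real deploy of ${req.service} to ${req.environment}` };
  }
  return { pause: false };
}
```

Inside the tool, the execute body would check decideGate(...) and suspend only when pause is true. Keeping the rule outside the tool means the policy can be tested without running an agent.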

The Invisible Break: Supervisors Routing Around Gates

There’s a failure mode that doesn’t show up in unit tests.

In a multi-agent architecture where a supervisor delegates to domain-specific agents, the supervisor learns over time which sub-agents are “fast” and which are “slow.” An agent with requireApproval on its tools is an agent that sometimes takes minutes instead of milliseconds from the supervisor’s perspective. The supervisor isn’t reasoning about governance. It’s pattern-matching on latency.

The result is predictable. The supervisor starts routing around gated agents. Instead of delegating to the invoicing agent (which has approval gates), it tries to handle the task itself using ungated tools, or it delegates to a different agent that can accomplish something similar without the approval step. Your governance is still in place. It’s just not being exercised.

This is hard to catch because the system still works. Tasks still get completed. The logs show successful runs. The only signal is what’s absent: the approval events you expected to see aren’t firing. The gate is still there. The traffic is going around it.

Three defenses:

Monitor gate exercise rates. Track how often each gated tool’s approval event fires. A tool that goes from 20 approvals per week to zero didn’t become unnecessary. Something changed in the routing.

Constrain the supervisor’s tool access. If the supervisor shouldn’t be able to send emails directly, don’t give it email tools. Force the delegation path through the agent that has the gates. This is the least-privilege principle from tool scoping applied to the routing layer.

Audit traces for governance bypasses. Periodically review agent traces to verify that high-risk actions are flowing through the gated paths. Observability tools like Arize Phoenix or Mastra’s built-in tracing make this practical. You’re looking for the action being performed without the expected approval event preceding it.
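The first and third defenses reduce to two small queries over trace data. The sketch below assumes a hypothetical TraceEvent shape that you would map your own tracing export onto; none of these names come from Mastra or Arize Phoenix. Note the limitation: findBypasses matches on tool name within a run, so it only catches bypasses through tools you have listed as high-risk.

```typescript
// Hypothetical trace event shape; map your own tracing export onto it.
type TraceEvent = {
  runId: string;
  kind: "approval-requested" | "tool-executed";
  toolName: string;
  timestamp: number; // milliseconds since epoch
};

// Count approval events per gated tool in the window [from, to).
function gateExerciseCounts(
  events: TraceEvent[],
  gatedTools: string[],
  from: number,
  to: number,
): Map<string, number> {
  const counts = new Map<string, number>();
  for (const tool of gatedTools) counts.set(tool, 0);
  for (const e of events) {
    if (e.kind !== "approval-requested") continue;
    if (e.timestamp < from || e.timestamp >= to) continue;
    if (!counts.has(e.toolName)) continue;
    counts.set(e.toolName, (counts.get(e.toolName) ?? 0) + 1);
  }
  return counts;
}

// Gates that were exercised in the previous window and fell silent in the current one.
function silentGates(previous: Map<string, number>, current: Map<string, number>): string[] {
  return Array.from(previous.entries())
    .filter(([tool, count]) => count > 0 && (current.get(tool) ?? 0) === 0)
    .map(([tool]) => tool);
}

// High-risk executions with no earlier approval request for the same tool in the same run.
function findBypasses(events: TraceEvent[], highRiskTools: string[]): TraceEvent[] {
  const risky = new Set(highRiskTools);
  const requested = new Set<string>();
  const bypasses: TraceEvent[] = [];
  const ordered = [...events].sort((a, b) => a.timestamp - b.timestamp);
  for (const e of ordered) {
    const key = `${e.runId}:${e.toolName}`;
    if (e.kind === "approval-requested") requested.add(key);
    else if (risky.has(e.toolName) && !requested.has(key)) bypasses.push(e);
  }
  return bypasses;
}
```

Run silentGates on consecutive weekly windows to get the "20 approvals to zero" alert, and findBypasses as the periodic audit.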

Layering Approval Gates with Guardrail Processors

Approval gates are one layer of defense. Mastra’s processor system adds another.

A processor is middleware that runs at defined points in an agent’s execution loop. Where an approval gate asks “does a human approve this specific action?”, a processor asks “is this action structurally allowed at all?” Processors catch policy violations that no human should need to evaluate. Combining the two creates a defense-in-depth stack.

import type { Processor, ProcessOutputStepArgs } from "@mastra/core/processors";

const emailGuardrail: Processor = {
  id: "emailGuardrail",
  processOutputStep: async ({ toolCalls, abort }) => {
    const emailCall = toolCalls?.find(tc => tc.toolName === "sendEmail");
    if (emailCall) {
      // Validate the email isn't going to a restricted domain
      const restricted = ["competitor.com", "internal-only.com"];
      const args = emailCall.args as { to: string };
      // Take the part after the last "@" and compare case-insensitively
      const domain = args.to.split("@").pop()?.toLowerCase() ?? "";
      if (restricted.includes(domain)) {
        abort(`Emails to ${domain} are restricted`);
      }
    }
  },
};

The processor fires after each LLM step but before tool execution. If it calls abort(), the tool call never reaches the approval gate. This handles the class of actions that should never happen regardless of human approval: sending emails to blocked domains, invoicing amounts above a threshold without secondary review, publishing content during a freeze window. These are policy violations, not judgment calls.

The stack in order:

Processor (pre-execution): Is this action structurally allowed?
Approval gate: Does a human approve this specific action?
Tool execution: Perform the action.
Processor (post-execution): Validate the result.

Each layer catches a different class of failure. The processor catches policy violations. The approval gate catches judgment calls. Neither replaces the other.
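The ordering of the four layers can be shown as a framework-free sketch. Nothing here is Mastra’s API: Layers, Outcome, and runGuarded are hypothetical names, and in a real agent the framework drives this sequence for you. The point is only the control flow, in which a policy violation short-circuits before a human is ever asked.

```typescript
type ToolCall = { toolName: string; args: Record<string, unknown> };

type Outcome =
  | { status: "blocked"; reason: string }
  | { status: "declined" }
  | { status: "invalid-result"; reason: string }
  | { status: "done"; result: unknown };

// Each layer is injected, so the ordering can be tested without a model or a human.
type Layers = {
  policyCheck: (call: ToolCall) => string | null; // reason to block, or null if allowed
  askHuman: (call: ToolCall) => Promise<boolean>; // true means approved
  execute: (call: ToolCall) => Promise<unknown>;
  validateResult: (result: unknown) => string | null; // problem description, or null if fine
};

async function runGuarded(call: ToolCall, layers: Layers): Promise<Outcome> {
  // 1. Processor (pre-execution): is this action structurally allowed?
  const violation = layers.policyCheck(call);
  if (violation !== null) return { status: "blocked", reason: violation };

  // 2. Approval gate: does a human approve this specific action?
  if (!(await layers.askHuman(call))) return { status: "declined" };

  // 3. Tool execution.
  const result = await layers.execute(call);

  // 4. Processor (post-execution): validate the result.
  const problem = layers.validateResult(result);
  if (problem !== null) return { status: "invalid-result", reason: problem };

  return { status: "done", result };
}
```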

Human-in-the-Loop Is a Spectrum

Blocking approval gates are the right pattern for irreversible, high-stakes actions. They’re the wrong pattern for everything else.

There are three modes of human-in-the-loop, and each fits different situations:

Blocking (in-the-loop). The agent pauses mid-execution and waits for human input. Use this for the irreversible tier from above. The cost of a wrong action is higher than the cost of making the agent wait.

Post-processing (draft review). The agent generates output and a human reviews it before it’s finalized. Use this for content creation, report generation, or any workflow where the agent produces a draft. The agent doesn’t pause; the human reviews asynchronously.

Deferred (async feedback). The agent completes the task and collects feedback later. Use this for low-stakes actions where speed matters more than pre-approval: opening a PR, suggesting a refactor, triaging support tickets.

The guiding question for each tool: what’s the cost of the agent getting this wrong, and what’s the cost of making a human wait? The ratio picks the mode.

Agents don’t sleep. Humans do. In any HITL architecture, humans become the throughput bottleneck. Putting a blocking gate on every action turns your agent into an interactive script that needs constant babysitting. The tier table tells you where each tool belongs.

What Approval Data Tells You

Every approval event is a data point. Over time, the pattern of approvals, rejections, and modifications tells you where the agent’s judgment is calibrated and where it isn’t.

A tool with a 99% approval rate and no modifications is a candidate for removing the gate. The human review isn’t adding signal. A tool with a 70% approval rate and frequent modifications to the arguments is telling you the agent’s reasoning about that action needs work. Better instructions, more constrained input schemas, or additional context in the tool description are all ways to close that gap.
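A rough sketch of that review as code, under a few assumptions: Decision is a hypothetical log record, the "modified" outcome presumes your review UI lets approvers edit arguments before approving (an application feature, not a framework one), approval rate is counted as approved-as-is over total, and the 0.99 and 0.7 cutoffs simply reuse the figures above rather than any recommended standard. minSamples is an arbitrary guard against drawing conclusions from a handful of decisions.

```typescript
// Hypothetical approval log record.
type Decision = { toolName: string; outcome: "approved" | "rejected" | "modified" };

type GateReview = {
  toolName: string;
  total: number;
  approvalRate: number; // approved as-is / total
  modificationRate: number; // modified / total
  suggestion: "consider-relaxing" | "improve-agent" | "keep";
};

function reviewGates(decisions: Decision[], minSamples = 20): GateReview[] {
  // Group decisions by tool.
  const byTool = new Map<string, Decision[]>();
  for (const d of decisions) {
    const list = byTool.get(d.toolName) ?? [];
    list.push(d);
    byTool.set(d.toolName, list);
  }

  const reviews: GateReview[] = [];
  for (const [toolName, list] of Array.from(byTool)) {
    const total = list.length;
    const approvalRate = list.filter((d) => d.outcome === "approved").length / total;
    const modificationRate = list.filter((d) => d.outcome === "modified").length / total;

    // Illustrative cutoffs taken from the examples in the text.
    let suggestion: GateReview["suggestion"] = "keep";
    if (total >= minSamples) {
      if (approvalRate >= 0.99 && modificationRate === 0) suggestion = "consider-relaxing";
      else if (approvalRate <= 0.7 && modificationRate > 0) suggestion = "improve-agent";
    }
    reviews.push({ toolName, total, approvalRate, modificationRate, suggestion });
  }
  return reviews;
}
```

The output is a prompt for a human conversation about each gate, not an automatic switch.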

This reframes the cost of human-in-the-loop. The approval step isn’t friction. It’s an annotation workflow. Each human decision (approve, reject, or modify) is ground truth data about what the agent should have done. Over time, this builds the evidence base for either relaxing the gate or tightening the agent’s behavior.

The path from human-gated to autonomous isn’t a single switch. It’s a gradual transition backed by data. The agent earns trust through a track record of correct decisions, verified by the approval log.

Getting Started

If you’re adding approval gates to a Mastra agent, the sequence that matters is less about the code and more about the decisions around it:

Start with your most expensive mistake. Which tool, if the agent got it wrong tomorrow, would cost you the most to fix? Gate that one first.
Keep gated tools explicit. Mastra surfaces the tool name and arguments in the approval event. A narrow tool with typed arguments is easier to approve than a generic mutation tool with an opaque payload.
Set up gate exercise monitoring before you need it. By the time you notice a supervisor routing around your gates, the bypass pattern is already established.
Schedule a monthly approval data review. The approval rates and modification patterns tell you which gates to relax and which agent instructions to tighten.

The primitives are straightforward. The discipline is in deciding which actions earn autonomy and which ones require a human in the loop.
