Human-in-the-Loop Agent Approvals: A Mastra Pattern
Agents that can send emails, finalize invoices, or publish content need a gate between “the model decided to act” and “the action happened.” The standard approach is a system prompt instruction: “always confirm with the user before sending email.” It reads like a rule. It runs like a suggestion.
Most of the time the model confirms. Some of the time it doesn’t. The failure rate isn’t high enough to catch in testing, but it’s high enough to matter in production, where a skipped confirmation means an email sent to the wrong person or an invoice finalized with the wrong amount.
The problem is structural. When you put the approval decision in the prompt, the model is doing two jobs at once: deciding what to do, and deciding whether to ask first. Both decisions are negotiable. The model can reason its way past either one.
The version that holds moves the second decision out of the model entirely, enforcing human-in-the-loop approval at the framework level instead of the prompt level.
Framework-Enforced vs. Prompt-Instructed Approval in Mastra
Mastra is a TypeScript framework for building AI agents and applications. Its approval primitive, requireApproval, operates at the tool definition level. When a tool has requireApproval: true, the framework intercepts the tool call before execution. The model doesn’t get a choice about whether to pause. The runtime halts, surfaces the pending action, and waits for a human signal before proceeding.
Here’s what a gated tool looks like:
```typescript
import { createTool } from "@mastra/core/tools";
import { z } from "zod";

// `emailService` is assumed to be your application's own email client.
const sendEmail = createTool({
  id: "sendEmail",
  description: "Send an email to a recipient",
  inputSchema: z.object({
    to: z.string().email(),
    subject: z.string(),
    body: z.string(),
  }),
  requireApproval: true,
  execute: async ({ to, subject, body }) => {
    // This only runs after human approval
    const result = await emailService.send({ to, subject, body });
    return { messageId: result.id, status: "sent" };
  },
});
```
The execute function never fires until a human approves it. The model can’t reason past requireApproval: true because it never sees the flag. It’s framework infrastructure, not a prompt constraint.
Compare this to the prompt-based version:
You are a helpful assistant. Before sending any email,
always ask the user to confirm the recipient, subject,
and body. Do not send without explicit confirmation.
This works until the model decides the context is clear enough to skip the confirmation. Or until a prompt injection overrides it. Or until you update the system prompt and accidentally weaken the constraint. The prompt version is a convention. The framework version is a gate.
How the Approval Flow Works
When an agent calls a tool with requireApproval: true, Mastra’s runtime emits a tool-call-approval chunk on the response stream. This chunk contains everything the approver needs to make a decision:
```typescript
// The chunk surfaced to the approval UI
// runId is accessed separately via stream.runId, not inside this chunk
{
  type: "tool-call-approval",
  payload: {
    toolCallId: "call_abc123",
    toolName: "sendEmail",
    args: {
      to: "sarah@example.com", // placeholder address
      subject: "Q2 Invoice - Project Atlas",
      body: "Hi Sarah, attached is the invoice for..."
    }
  }
}
```
The agent’s execution suspends at this point. It doesn’t continue generating tokens, calling other tools, or falling back to “I’ll assume that’s approved.” It waits.
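On the consuming side, the approval surface has to pick these chunks out of everything else on the stream. Here is a minimal sketch of that filtering step, assuming the chunk shape shown above. The type names and `collectPendingApprovals` are hypothetical, not Mastra exports, and the array of chunks stands in for a real stream.

```typescript
// Hypothetical types modeled on the chunk shape shown above; not Mastra exports.
type ApprovalChunk = {
  type: "tool-call-approval";
  payload: { toolCallId: string; toolName: string; args: Record<string, unknown> };
};
type OtherChunk = { type: "text-delta"; payload: { text: string } };
type StreamChunk = ApprovalChunk | OtherChunk;

type PendingApproval = {
  runId: string;
  toolCallId: string;
  toolName: string;
  args: Record<string, unknown>;
};

// Keep only approval chunks and attach the run ID, which travels outside the chunk.
function collectPendingApprovals(runId: string, chunks: StreamChunk[]): PendingApproval[] {
  const pending: PendingApproval[] = [];
  for (const chunk of chunks) {
    if (chunk.type === "tool-call-approval") {
      pending.push({ runId, ...chunk.payload });
    }
  }
  return pending;
}

const pendingApprovals = collectPendingApprovals("run_1", [
  { type: "text-delta", payload: { text: "Drafting the email..." } },
  {
    type: "tool-call-approval",
    payload: { toolCallId: "call_abc123", toolName: "sendEmail", args: { subject: "Q2 Invoice" } },
  },
]);
```

Storing the run ID next to the tool call ID means the approval surface has both values it needs for the resume call, even if the decision arrives long after the stream has closed.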
On the other side, whatever surface handles approvals (a dashboard, a Slack integration, a CLI) presents the pending action and collects a decision. The resume call is straightforward:
```typescript
// Approve: the tool executes and the agent continues
await agent.approveToolCall({
  runId: stream.runId,
  toolCallId: "call_abc123",
});

// Decline: the tool doesn't execute and the agent is told the call was rejected
await agent.declineToolCall({
  runId: stream.runId,
  toolCallId: "call_abc123",
});
```
After approval, the tool’s execute function runs and the agent continues with the result. After decline, the agent receives a signal that the action was rejected and can adjust its approach.
What the Approver Needs to See
The approval chunk carries the full args object. A confirmation that says “Agent wants to send email” is useless. The approver needs the actual recipient, subject, and body to make a real decision.
One pattern that strengthens this is adding a confirmationSummary field to the tool’s input schema:
```typescript
// `stripe` is assumed to be an initialized Stripe client.
const finalizeInvoice = createTool({
  id: "finalizeInvoice",
  description: "Finalize a draft invoice and send it to the client",
  inputSchema: z.object({
    invoiceId: z.string(),
    autoAdvance: z.boolean().optional().default(true),
    // Not used by execute; it exists so the approver sees a readable summary
    confirmationSummary: z
      .string()
      .min(20)
      .describe(
        "A human-readable summary of the invoice being finalized, " +
          "including client name, amount, and line items"
      ),
  }),
  requireApproval: true,
  execute: async ({ invoiceId, autoAdvance }) => {
    const invoice = await stripe.invoices.finalizeInvoice(invoiceId, {
      auto_advance: autoAdvance,
    });
    return { id: invoice.id, status: invoice.status };
  },
});
```
The confirmationSummary forces the model to produce a human-readable description of what it’s about to do. The approver sees “Finalize invoice #INV-2847 for Acme Corp, $12,500, covering API integration work (April)” instead of just an invoice ID. The model does the summarization work. The human makes the judgment call.
This is the difference between an approval gate that adds signal and one that gets rubber-stamped. Anthropic’s data from the Claude Code auto mode launch found that users approve 93% of permission prompts. That rate isn’t inherently a problem, but it becomes one when the approver lacks context to catch the actions that should be rejected. Full context with a structured summary is what makes the gate function as a real checkpoint.
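To make that concrete, here is a minimal sketch of how an approval surface might render a pending call. `renderApprovalPrompt` and `ApprovalRequest` are hypothetical names, and the layout is only one option. The sketch prints the raw arguments alongside the summary, on the assumption that a model-written summary should be checkable against what will actually be executed.

```typescript
// Hypothetical helper: formats a pending tool call for whatever surface shows approvals.
type ApprovalRequest = { toolName: string; args: Record<string, unknown> };

function renderApprovalPrompt(request: ApprovalRequest): string {
  const { confirmationSummary, ...rest } = request.args;
  const summary =
    typeof confirmationSummary === "string" && confirmationSummary.trim().length > 0
      ? confirmationSummary.trim()
      : "(no summary provided; review the raw arguments carefully)";
  // Always list the raw arguments too, so the summary can be checked against them.
  const details = Object.entries(rest)
    .map(([key, value]) => `  ${key}: ${JSON.stringify(value)}`)
    .join("\n");
  return `${request.toolName}\n${summary}\n${details}`;
}

const approvalPrompt = renderApprovalPrompt({
  toolName: "finalizeInvoice",
  args: {
    invoiceId: "in_123",
    autoAdvance: true,
    confirmationSummary: "Finalize invoice #INV-2847 for Acme Corp, $12,500",
  },
});
```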
Deciding What to Gate
Not every tool needs an approval gate. The question I ask for each tool is simple: if the agent gets this wrong, can I undo it?
I sort tools into three tiers based on that answer.
| Tier | Risk level | Approval | Examples |
|---|---|---|---|
| Read-only | Low | None | Database queries, calendar reads, file reads |
| Reversible writes | Medium | Optional | Creating drafts, updating internal state, labeling emails |
| Irreversible actions | High | Required | Sending emails, finalizing invoices, publishing content, external API mutations |
This isn’t about making the agent less capable. It’s about matching the cost of a mistake to the cost of a review. Low-tier tools run at machine speed. High-tier tools run at human-decision speed, which is the appropriate speed for actions you can’t take back.
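The tier table can also live in code, where a check can flag irreversible tools that were registered without a gate. The following is a sketch under assumptions: the `ToolEntry` registry shape and `findUngatedTools` are hypothetical, and only the `requireApproval` name mirrors the flag on the tool definition.

```typescript
type Tier = "read-only" | "reversible-write" | "irreversible";
type ApprovalPolicy = "none" | "optional" | "required";

// Mirrors the tier table above.
const approvalByTier: Record<Tier, ApprovalPolicy> = {
  "read-only": "none",
  "reversible-write": "optional",
  irreversible: "required",
};

// Hypothetical registry entry; `requireApproval` matches the flag on the tool definition.
type ToolEntry = { id: string; tier: Tier; requireApproval: boolean };

// Return the IDs of tools whose tier demands a gate that is not configured.
function findUngatedTools(tools: ToolEntry[]): string[] {
  return tools
    .filter((tool) => approvalByTier[tool.tier] === "required" && !tool.requireApproval)
    .map((tool) => tool.id);
}

const ungated = findUngatedTools([
  { id: "queryDatabase", tier: "read-only", requireApproval: false },
  { id: "createDraft", tier: "reversible-write", requireApproval: false },
  { id: "sendEmail", tier: "irreversible", requireApproval: true },
  { id: "publishPost", tier: "irreversible", requireApproval: false },
]);
// ungated contains only "publishPost"
```

Run as part of a test suite, a check like this turns the tier decision into something that fails loudly when someone adds a high-risk tool and forgets the flag.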
For conditional gating, where the same tool needs approval in some contexts but not others, Mastra’s suspend() primitive inside the execute function gives you that control at runtime:
```typescript
const deployService = createTool({
  id: "deployService",
  // ...
  execute: async ({ service, environment, isDryRun }, { suspend }) => {
    if (!isDryRun) {
      // Pause and wait for human approval before a real deploy
      await suspend({ service, environment });
    }
    // Dry runs proceed without pausing
    // ...
  },
});
```
This lets you gate the same tool differently based on context. Skip the pause for dry runs, always suspend for real mutations, or gate based on the sensitivity of the specific action.
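Once the condition grows beyond a single flag, it can help to pull it out of `execute` into a pure function that is unit-tested on its own. A minimal sketch follows. `needsApproval`, `DeployRequest`, and the list of gated environments are all assumptions made for illustration.

```typescript
// Hypothetical input shape, modeled on the deployService arguments above.
type DeployRequest = { service: string; environment: string; isDryRun: boolean };

// Assumption for illustration: only these environments are considered sensitive.
const gatedEnvironments = new Set(["production", "staging"]);

function needsApproval(request: DeployRequest): boolean {
  if (request.isDryRun) return false; // dry runs are never gated
  return gatedEnvironments.has(request.environment);
}

const realProdDeploy = needsApproval({ service: "api", environment: "production", isDryRun: false });
const dryProdDeploy = needsApproval({ service: "api", environment: "production", isDryRun: true });
```

Inside the tool, the check from the example above would then read `if (needsApproval(input))` before the `suspend()` call, which keeps the gating policy in one place.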
The Invisible Break: Supervisors Routing Around Gates
There’s a failure mode that doesn’t show up in unit tests.
In a multi-agent architecture where a supervisor delegates to domain-specific agents, the supervisor learns over time which sub-agents are “fast” and which are “slow.” An agent with requireApproval on its tools is an agent that sometimes takes minutes instead of milliseconds from the supervisor’s perspective. The supervisor isn’t reasoning about governance. It’s pattern-matching on latency.
The result is predictable. The supervisor starts routing around gated agents. Instead of delegating to the invoicing agent (which has approval gates), it tries to handle the task itself using ungated tools, or it delegates to a different agent that can accomplish something similar without the approval step. Your governance is still in place. It’s just not being exercised.
This is hard to catch because the system still works. Tasks still get completed. The logs show successful runs. The only signal is what’s absent: the approval events you expected to see aren’t firing. The gate is still there. The traffic is going around it.
Three defenses:
- Monitor gate exercise rates. Track how often each gated tool’s approval event fires. A tool that goes from 20 approvals per week to zero didn’t become unnecessary. Something changed in the routing.
- Constrain the supervisor’s tool access. If the supervisor shouldn’t be able to send emails directly, don’t give it email tools. Force the delegation path through the agent that has the gates. This is the least-privilege principle from tool scoping applied to the routing layer.
- Audit traces for governance bypasses. Periodically review agent traces to verify that high-risk actions are flowing through the gated paths. Observability tools like Arize Phoenix or Mastra’s built-in tracing make this practical. You’re looking for the action being performed without the expected approval event preceding it.
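The monitoring and audit checks can be sketched as two small functions over exported trace data. Everything here is an assumption for illustration: the `TraceEvent` shape, the `action` labels, and the weekly count maps are hypothetical, and real tracing tools expose richer spans.

```typescript
// Hypothetical trace event shape; `action` labels the side effect, whichever tool caused it.
type TraceEvent =
  | { kind: "approval"; toolCallId: string }
  | { kind: "execution"; toolCallId: string; toolName: string; action: string };

// Tool calls that performed a high-risk action with no earlier approval event.
function findBypasses(events: TraceEvent[], highRiskActions: Set<string>): string[] {
  const approved = new Set<string>();
  const bypasses: string[] = [];
  for (const event of events) {
    if (event.kind === "approval") {
      approved.add(event.toolCallId);
    } else if (highRiskActions.has(event.action) && !approved.has(event.toolCallId)) {
      bypasses.push(event.toolCallId);
    }
  }
  return bypasses;
}

// Tools whose weekly approval count fell to zero after a non-zero week.
function findSilentGates(
  lastWeek: Record<string, number>,
  thisWeek: Record<string, number>,
): string[] {
  return Object.keys(lastWeek).filter(
    (tool) => (lastWeek[tool] ?? 0) > 0 && (thisWeek[tool] ?? 0) === 0,
  );
}

const bypasses = findBypasses(
  [
    { kind: "approval", toolCallId: "call_1" },
    { kind: "execution", toolCallId: "call_1", toolName: "sendEmail", action: "email.send" },
    { kind: "execution", toolCallId: "call_2", toolName: "httpPost", action: "email.send" },
  ],
  new Set(["email.send"]),
);
const silent = findSilentGates({ sendEmail: 20, finalizeInvoice: 5 }, { finalizeInvoice: 4 });
```

Keying the audit on the action instead of the tool name is the point of the sketch: a supervisor that sends email through a generic HTTP tool never touches the gated `sendEmail` tool, so a check that only watches gated tools would miss it.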
Layering Approval Gates with Guardrail Processors
Approval gates are one layer of defense. Mastra’s processor system adds another.
A processor is middleware that runs at defined points in an agent’s execution loop. Where an approval gate asks “does a human approve this specific action?”, a processor asks “is this action structurally allowed at all?” Processors catch policy violations that no human should need to evaluate. Combining the two creates a defense-in-depth stack.
```typescript
import type { Processor } from "@mastra/core/processors";

const emailGuardrail: Processor = {
  id: "emailGuardrail",
  processOutputStep: async ({ toolCalls, abort }) => {
    const emailCall = toolCalls?.find(tc => tc.toolName === "sendEmail");
    if (emailCall) {
      // Validate the email isn't going to a restricted domain
      const restricted = ["competitor.com", "internal-only.com"];
      const args = emailCall.args as { to: string };
      // Lowercase the domain so "Competitor.com" can't slip past the list
      const domain = args.to.split("@")[1]?.toLowerCase();
      if (domain && restricted.includes(domain)) {
        abort(`Emails to ${domain} are restricted`);
      }
    }
  },
};
```
The processor fires after each LLM step but before tool execution. If it calls abort(), the tool call never reaches the approval gate. This handles the class of actions that should never happen regardless of human approval: sending emails to blocked domains, invoicing amounts above a threshold without secondary review, or publishing content during a freeze window. These are policy violations, not judgment calls.
The stack in order:
- Processor (pre-execution): Is this action structurally allowed?
- Approval gate: Does a human approve this specific action?
- Tool execution: Perform the action.
- Processor (post-execution): Validate the result.
Each layer catches a different class of failure. The processor catches policy violations. The approval gate catches judgment calls. Neither replaces the other.
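The ordering of the layers is easier to see as plain control flow. This is a framework-agnostic sketch, not Mastra code: the `Layers` hooks are hypothetical stand-ins for the processor, the approval gate, and the tool, and they are synchronous here only to keep the example short.

```typescript
type ToolCall = { toolName: string; args: Record<string, unknown> };
type Outcome =
  | { status: "blocked"; reason: string }
  | { status: "declined" }
  | { status: "flagged"; reason: string; result: unknown }
  | { status: "executed"; result: unknown };

// Hypothetical hooks standing in for the four layers; none of these are Mastra APIs.
type Layers = {
  policyCheck: (call: ToolCall) => string | null; // violation reason, or null if allowed
  humanApproves: (call: ToolCall) => boolean;
  execute: (call: ToolCall) => unknown;
  validateResult: (result: unknown) => string | null; // problem description, or null
};

function runGuardedCall(call: ToolCall, layers: Layers): Outcome {
  const violation = layers.policyCheck(call);
  if (violation !== null) return { status: "blocked", reason: violation }; // never reaches a human
  if (!layers.humanApproves(call)) return { status: "declined" };
  const result = layers.execute(call);
  const problem = layers.validateResult(result);
  // The action already happened, so a bad result is flagged for follow-up, not blocked.
  if (problem !== null) return { status: "flagged", reason: problem, result };
  return { status: "executed", result };
}

let approvalsRequested = 0;
const demoLayers: Layers = {
  policyCheck: (call) =>
    String(call.args.to).endsWith("@competitor.com") ? "restricted domain" : null,
  humanApproves: () => {
    approvalsRequested += 1;
    return true;
  },
  execute: () => ({ status: "sent" }),
  validateResult: () => null,
};

const blockedOutcome = runGuardedCall(
  { toolName: "sendEmail", args: { to: "ceo@competitor.com" } },
  demoLayers,
);
const approvalsAfterBlocked = approvalsRequested;
const allowedOutcome = runGuardedCall(
  { toolName: "sendEmail", args: { to: "sarah@example.com" } },
  demoLayers,
);
```

The early return on a policy violation is what keeps reviewers from seeing requests they should never be asked to approve.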
Human-in-the-Loop Is a Spectrum
Blocking approval gates are the right pattern for irreversible, high-stakes actions. They’re the wrong pattern for everything else.
There are three modes of human-in-the-loop, and each fits different situations:
- Blocking (in-the-loop). The agent pauses mid-execution and waits for human input. Use this for the irreversible tier from above. The cost of a wrong action is higher than the cost of making the agent wait.
- Post-processing (draft review). The agent generates output and a human reviews it before it’s finalized. Use this for content creation, report generation, or any workflow where the agent produces a draft. The agent doesn’t pause; the human reviews asynchronously.
- Deferred (async feedback). The agent completes the task and collects feedback later. Use this for low-stakes actions where speed matters more than pre-approval: opening a PR, suggesting a refactor, triaging support tickets.
The guiding question for each tool: what’s the cost of the agent getting this wrong, and what’s the cost of making a human wait? The ratio picks the mode.
Agents don’t sleep. Humans do. In any HITL architecture, humans become the throughput bottleneck. Putting a blocking gate on every action turns your agent into an interactive script that needs constant babysitting. The tier table tells you where each tool belongs.
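As a rough sketch, the mode decision can be written as a function of those two costs. The `ToolProfile` shape and the simple comparison are assumptions for illustration. Real cost estimates are fuzzy, so treat this as a way to make the reasoning explicit, not as a formula.

```typescript
type Mode = "blocking" | "post-processing" | "deferred";

// Hypothetical inputs: both costs are rough estimates in the same unit (for example, dollars).
type ToolProfile = { reversible: boolean; costOfError: number; costOfWaiting: number };

function pickMode(profile: ToolProfile): Mode {
  // Irreversible actions always block, matching the top tier of the table.
  if (!profile.reversible) return "blocking";
  // Assumption for illustration: review first when a mistake costs more than the wait.
  return profile.costOfError > profile.costOfWaiting ? "post-processing" : "deferred";
}

const emailMode = pickMode({ reversible: false, costOfError: 1000, costOfWaiting: 50 });
const reportMode = pickMode({ reversible: true, costOfError: 500, costOfWaiting: 50 });
const triageMode = pickMode({ reversible: true, costOfError: 5, costOfWaiting: 50 });
```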
What Approval Data Tells You
Every approval event is a data point. Over time, the pattern of approvals, rejections, and modifications tells you where the agent’s judgment is calibrated and where it isn’t.
A tool with a 99% approval rate and no modifications is a candidate for removing the gate. The human review isn’t adding signal. A tool with a 70% approval rate and frequent modifications to the arguments is telling you the agent’s reasoning about that action needs work. Better instructions, more constrained input schemas, or additional context in the tool description are all ways to close that gap.
This reframes the cost of human-in-the-loop. The approval step isn’t friction. It’s an annotation workflow. Each human decision (approve, reject, or modify) is ground truth data about what the agent should have done. Over time, this builds the evidence base for either relaxing the gate or tightening the agent’s behavior.
The path from human-gated to autonomous isn’t a single switch. It’s a gradual transition backed by data. The agent earns trust through a track record of correct decisions, verified by the approval log.
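A minimal sketch of that review, assuming approval decisions are logged as simple records. The `ApprovalRecord` shape, the recommendation labels, and the minimum sample size are hypothetical. The 99% and 70% thresholds echo the examples above and are starting points, not rules.

```typescript
type Decision = "approved" | "rejected" | "modified";
type ApprovalRecord = { toolName: string; decision: Decision };
type Recommendation = "consider-relaxing" | "tighten-agent" | "keep";

function recommend(
  records: ApprovalRecord[],
  toolName: string,
  minSamples = 50, // assumption: below this, there is too little evidence to act on
): Recommendation {
  const forTool = records.filter((record) => record.toolName === toolName);
  if (forTool.length < minSamples) return "keep";
  const approved = forTool.filter((record) => record.decision === "approved").length;
  const modified = forTool.filter((record) => record.decision === "modified").length;
  const approvalRate = approved / forTool.length;
  // Approved unchanged nearly every time: the review is not adding signal.
  if (approvalRate >= 0.99 && modified === 0) return "consider-relaxing";
  // Frequently rejected or edited: the agent's reasoning about this action needs work.
  if (approvalRate <= 0.7) return "tighten-agent";
  return "keep";
}

// Build `count` identical records for a tool (helper for the example only).
const makeRecords = (toolName: string, decision: Decision, count: number): ApprovalRecord[] =>
  Array.from({ length: count }, () => ({ toolName, decision }));

const approvalLog = [
  ...makeRecords("labelEmail", "approved", 100),
  ...makeRecords("sendEmail", "approved", 70),
  ...makeRecords("sendEmail", "modified", 30),
  ...makeRecords("publishPost", "approved", 3),
];
const labelEmailAdvice = recommend(approvalLog, "labelEmail");
const sendEmailAdvice = recommend(approvalLog, "sendEmail");
const publishPostAdvice = recommend(approvalLog, "publishPost");
```

The sample-size guard matters more than the exact thresholds: three approvals in a row is not a track record.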
Getting Started
If you’re adding approval gates to a Mastra agent, the sequence that matters is less about the code and more about the decisions around it:
- Start with your most expensive mistake. Which tool, if the agent got it wrong tomorrow, would cost you the most to fix? Gate that one first.
- Add a confirmationSummary to every gated tool. The approval surface is only as useful as the context it shows. Invest in the summary schema upfront.
- Set up gate exercise monitoring before you need it. By the time you notice a supervisor routing around your gates, the bypass pattern is already established.
- Schedule a monthly approval data review. The approval rates and modification patterns tell you which gates to relax and which agent instructions to tighten.
The primitives are straightforward. The discipline is in deciding which actions earn autonomy and which ones require a human in the loop.
If you’re building agents and want a quick read on where your governance stands, I put together a free Agent Governance Scorecard. It’s 30 yes/no questions across four dimensions: tool access and blast radius, observability, human-in-the-loop checkpoints, and defense in depth. Takes about ten minutes, gives you an instant score, and tells you which layer to fix first. Grab the scorecard here.
Further Reading
- Governing AI Agents Without Killing Them covers the broader governance patterns including tool sprawl, decision logging, and rubber-stamp checkpoints.
- How I Built a Personal AI Assistant with Mastra walks through building a Mastra agent end-to-end, including tool design and multi-agent delegation.
- Mastra Agent Approval Documentation is the official reference for requireApproval and the approval flow API.
- Human-in-the-Loop Agent Approvals (YouTube) is the companion video walkthrough of the approval pattern covered in this post.