Your AI Assistant Doesn't Need a Bigger Model. It Needs Colleagues

May 5, 2026 · 14:05

agent-architecture multi-agent mastra supervisor-pattern local-llm

Your AI assistant doesn’t need a bigger model — it needs colleagues. One agent doing CRM lookups, code review, meeting prep, and sales drafting at the same time will be mediocre at all four. The fix isn’t a larger context window or a more capable LLM. It’s specialization.

The supervisor + specialists pattern

This video walks through the multi-agent supervisor pattern in Mastra using the production system that runs my consulting business — my AI assistant, Emma. Same model across the whole fleet (Qwen3.6-27B running locally on Ollama), nine distinct agent identities, and a supervisor agent that delegates work via tool calls. No orchestration framework. No external service. Just Mastra’s agents: property and a routing table.
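To make the shape concrete, here is a minimal, dependency-free sketch of a supervisor delegating to specialists through a routing table. It is not Mastra's API and not the actual Emma configuration: every name in it (Specialist, makeSpecialist, routingTable, delegate, supervise, and the model id string) is hypothetical. In a real system the supervisor's LLM would choose which specialist tool to call, so the regex match below is only a stand-in for that decision.

```typescript
// Hypothetical sketch: none of these names come from Mastra.
type Specialist = {
  name: string;
  description: string;
  // Stand-in for a real LLM call; it only tags the task with the agent name.
  run: (task: string) => string;
};

// One shared model for the whole fleet. Placeholder id, not a real model tag.
const MODEL = "shared-local-model";

function makeSpecialist(name: string, description: string): Specialist {
  return {
    name,
    description,
    run: (task: string) => `[${name} via ${MODEL}] ${task}`,
  };
}

// Distinct identities, same model.
const specialists: Record<string, Specialist> = {
  crm: makeSpecialist("crm", "CRM lookups"),
  codeReview: makeSpecialist("codeReview", "Code review"),
  meetingPrep: makeSpecialist("meetingPrep", "Meeting prep"),
};

// Routing table: first matching pattern wins.
// Assumption: a real supervisor would let the model pick the specialist.
const routingTable: Array<{ pattern: RegExp; agent: string }> = [
  { pattern: /\b(contact|deal|crm)\b/i, agent: "crm" },
  { pattern: /\b(diff|pull request|review)\b/i, agent: "codeReview" },
  { pattern: /\b(meeting|agenda)\b/i, agent: "meetingPrep" },
];

// Each specialist is exposed to the supervisor like a tool call:
// a name plus a task string in, a result string out.
function delegate(agentKey: string, task: string): string {
  const agent = specialists[agentKey];
  if (!agent) {
    throw new Error(`Unknown specialist: ${agentKey}`);
  }
  return agent.run(task);
}

// The supervisor handles the request itself only when no specialist matches.
function supervise(request: string): string {
  const route = routingTable.find((r) => r.pattern.test(request));
  if (!route) {
    return `[supervisor via ${MODEL}] ${request}`;
  }
  return delegate(route.agent, request);
}
```

Calling supervise("Review this diff") hands the request to the codeReview specialist, while a request that matches nothing stays with the supervisor. The point of the sketch is that the model never changes; only the identity and the set of things each agent is responsible for do.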

Tool sprawl is the real bottleneck

The inflection point wasn’t a plan — it was a failure. A single agent with too many tools started picking the wrong tool for the job. Not hallucinating, not hitting limits. Just wrong tool, repeatedly. I cover three rubrics for when to extract a new agent: domain size, the “would you hire a human for this role?” test, and whether the domain deserves its own memory. I also cover the over-delegation trap — going from one agent to fifteen and having to claw back.

Structural trust, not prompted trust

The ninth agent in the fleet (Envoy) is deliberately locked out of everything personal so my Practical AI community can interact with it safely. That isolation is enforced by tool access, not prompt instructions — which is the only kind of trust boundary that actually holds up in production.
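The difference between structural and prompted trust can be sketched in a few lines. This is an illustrative model under stated assumptions, not Mastra's API or Envoy's real configuration: ScopedAgent, allTools, and the three tool names are all hypothetical. The idea it shows is that an agent whose tool set never contained a private tool cannot call it, whatever its prompt says.

```typescript
// Hypothetical sketch: tool names and classes are illustrative only.
type Tool = (input: string) => string;

// The full catalog. Two of these touch private data.
const allTools: Record<string, Tool> = {
  searchPublicDocs: (q: string) => `public results for "${q}"`,
  readCalendar: (_q: string) => "private calendar entries",
  readCrm: (q: string) => `private CRM record for "${q}"`,
};

class ScopedAgent {
  readonly name: string;
  private readonly tools: Map<string, Tool>;

  // Only tools named in `allowed` are ever copied into this agent.
  constructor(name: string, allowed: string[]) {
    this.name = name;
    this.tools = new Map<string, Tool>();
    for (const key of allowed) {
      const tool = allTools[key];
      if (tool) {
        this.tools.set(key, tool);
      }
    }
  }

  // A tool that was never registered cannot be reached from here,
  // so no prompt injection can talk the agent into using it.
  callTool(toolName: string, input: string): string {
    const tool = this.tools.get(toolName);
    if (!tool) {
      throw new Error(`${this.name} has no tool named ${toolName}`);
    }
    return tool(input);
  }

  listTools(): string[] {
    return Array.from(this.tools.keys());
  }
}

// The public-facing agent gets one public tool and nothing else.
const envoy = new ScopedAgent("envoy", ["searchPublicDocs"]);

// The private assistant gets the whole catalog.
const emma = new ScopedAgent("emma", Object.keys(allTools));
```

Here envoy.callTool("readCrm", "acme") throws because the tool is absent from the agent, not because an instruction told the agent to refuse. A prompt-only rule such as "never read the CRM" would leave the tool reachable and rely on the model complying.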

If you’re designing agentic systems and want help with the architecture, see how I work with teams on AI agents. For the prerequisite patterns — single-agent foundations and persistent workspaces — see Build Your Own AI Agent from Scratch and I Gave My AI Agent Access to My Second Brain.


More on this topic

Stop Giving Your Agent Every Tool

Large tool catalogs break agent context. Tool search fixes that by letting agents discover and load only what they need.

Stop Letting AI Agents Run the Whole Workflow

One inbox agent should not classify, research, score, route, and draft replies in one loose loop.

Harness Engineering: 4 Levers to Diagnose Any AI Agent

Most agent failures aren't model failures. They're harness failures. Here's the 4-lever framework I use to diagnose what broke.

Building Approval Gates AI Agents Can't Route Around

How to wire human-in-the-loop on tool calls — and why system prompt instructions like "always ask before sending" don't actually hold.
