How I Built a Personal AI Assistant with Mastra
Most “AI assistants” are just chatbots with a context window. Ask them something, get an answer. Ask again later, and they have no idea who you are.
That’s not an assistant. That’s a search engine with a personality.
I wanted something different. I wanted an agent that:
- Researches people before my meetings
- Reminds me to follow up
- Remembers context across conversations
- Acts on its own when events happen
So I built one. Here’s how it works.
The Goal
I meet with a lot of people: founders, potential clients, partners. Before each meeting, I want to know who I’m talking to: their background, their company, what they’ve been working on. After each meeting, I follow up at the right time.
Doing this manually doesn’t scale. I needed an agent that handles it for me.
Architecture Overview
The system has five main components:
- Communication: Slack as the primary interface, built on a platform-agnostic SDK
- Tools: Composable functions for research, scheduling, and data retrieval
- Memory: Four types (message history, semantic recall, working memory, observational) so the agent remembers across conversations
- Webhooks: Event-driven triggers that let the agent react to Cal.com bookings automatically
- Task Scheduling: Time-delayed task execution for follow-ups and reminders
Each component does one thing. Together, they form an agent that acts, remembers, and follows up.
1. Tools: What the Agent Can Do
Tools are the agent’s capabilities. Each tool is a function the agent can call when it decides to use it.
```typescript
// src/mastra/tools/cal-com.ts
import { createTool } from "@mastra/core/tools";
import { z } from "zod";

export const getUpcomingEvents = createTool({
  id: "getUpcomingEvents",
  description: "Get upcoming meetings from Cal.com for the specified date range",
  inputSchema: z.object({
    startDate: z.string(),
    endDate: z.string(),
  }),
  // The tool returns a list of events, so the output schema is an array
  outputSchema: z.array(
    z.object({
      title: z.string(),
      startTime: z.string(),
      endTime: z.string(),
      attendees: z.array(z.string()),
    }),
  ),
  execute: async ({ context: { startDate, endDate } }) => {
    // fetchCalComEvents is a helper that calls the Cal.com API (defined elsewhere)
    const events = await fetchCalComEvents(startDate, endDate);
    return events.map((e) => ({
      title: e.title,
      startTime: e.start,
      endTime: e.end,
      attendees: e.attendees.map((a) => a.email),
    }));
  },
});
```
The tool description is the contract. The agent reads the description and decides when to call the tool. Clear descriptions = reliable tool usage.
I defined tools for Cal.com (scheduling), Exa (web research), and Slack (messaging):
| Tool | What it does |
|---|---|
| getUpcomingEvents | Fetch meetings from Cal.com |
| searchVault | Search my contacts and notes |
| researchPerson | Research a person/company via Exa |
| postToSlack | Post to the assistant channel |
| scheduleTask | Schedule a follow-up task |
Each tool does one thing. Simple, composable functions the agent can combine.
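The tools become capabilities only once they're wired into the agent. A sketch of that wiring, assuming the tools are exported from the files shown in this post (the `tools` map keys are the names the model sees when deciding what to call; the full agent definition with memory appears in the next section):

```typescript
// Sketch: registering the tools on the agent.
// Import paths and the instructions string are illustrative.
import { Agent } from "@mastra/core/agent";
import {
  getUpcomingEvents,
  searchVault,
  researchPerson,
  postToSlack,
  scheduleTask,
} from "./tools";

export const meetingAssistant = new Agent({
  name: "MeetingAssistant",
  instructions: "You are my personal meeting assistant...", // abridged
  model: "openai/gpt-4.1",
  // Each key becomes a callable capability; the tool's description
  // tells the model when to use it
  tools: { getUpcomingEvents, searchVault, researchPerson, postToSlack, scheduleTask },
});
```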
2. Memory: The Differentiator
Most agents have no memory. Ask them something, they answer. Ask again later, they start fresh.
In How AI Agents Remember Things, I covered the conceptual taxonomy: episodic memory for events and interactions, semantic memory for stable facts and preferences. Mastra maps those concepts to four concrete memory types you configure when building an agent.
| Mastra Type | What it does | Conceptual equivalent |
|---|---|---|
| Message history | Keeps recent messages in context within a conversation | Episodic (in-session) |
| Semantic recall | Retrieves relevant messages from past conversations by meaning | Episodic (cross-session) |
| Working memory | Persistent structured data: your name, preferences, goals | Semantic (stable facts) |
| Observational memory | Background summarization to keep the context window small over time | Session compaction |
Message history (lastMessages) is the short-term layer. It keeps the last N messages in context so the agent can follow the conversation thread. The agent can reference something you said three messages ago without you repeating it. Ten messages works well for most conversational flows.
Semantic recall is the long-term retrieval layer. It uses vector embeddings to search across all past conversations by meaning, not keywords. When you say “remember that thing about the Cal.com integration,” the agent encodes your query into a vector and finds the closest matches from past messages. You configure topK (how many matches to retrieve) and messageRange (how many surrounding messages to include for context). I used LibSQL for the vector store and FastEmbed for local embeddings, so the entire pipeline runs without external API calls.
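Under the hood, "search by meaning" is a nearest-neighbor lookup over embeddings. A toy sketch of the topK step, with made-up two-dimensional vectors standing in for real FastEmbed embeddings:

```typescript
// Toy sketch of semantic recall's topK step: rank stored message
// embeddings by cosine similarity to the query embedding.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

interface StoredMessage {
  text: string;
  embedding: number[];
}

// Return the topK stored messages closest in meaning to the query
export function semanticRecall(
  query: number[],
  messages: StoredMessage[],
  topK: number,
): StoredMessage[] {
  return [...messages]
    .sort(
      (a, b) =>
        cosineSimilarity(query, b.embedding) - cosineSimilarity(query, a.embedding),
    )
    .slice(0, topK);
}
```

The real pipeline adds the `messageRange` step on top: for each match, it also pulls in the surrounding messages so the agent gets the conversational context, not just one isolated line.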
Working memory is the persistent layer. It’s a structured scratchpad the agent updates over time as it learns about you. Unlike message history and semantic recall, which store raw messages, working memory stores distilled facts: your name, your role, your preferences. You define a template, and the agent fills it in as it picks up information from conversations. This is what makes the agent feel like it knows you, even in a brand new thread.
Observational memory uses background Observer and Reflector agents to maintain a dense observation log that replaces raw message history as it grows. I haven’t wired this up yet, but it solves the context window problem: as conversations get long, you can’t keep everything in context. Observational memory compresses it down without losing the long-term thread.
Here’s how the first three are configured:
```typescript
// src/mastra/agents/meeting-assistant.ts
import { Agent } from "@mastra/core/agent";
import { Memory } from "@mastra/memory";
import { LibSQLVector } from "@mastra/libsql";
import { fastembed } from "@mastra/fastembed";

const memory = new Memory({
  // Vector store for semantic recall
  vector: new LibSQLVector({
    id: "memory-vector",
    url: "file:./mastra.db",
  }),
  // Local embedding model, no API key needed
  embedder: fastembed,
  options: {
    // Message history: keeps the last 10 messages in context
    lastMessages: 10,
    // Semantic recall: searches past conversations by meaning
    semanticRecall: {
      topK: 3,
      messageRange: 2,
    },
    // Working memory: persistent user profile the agent updates over time
    workingMemory: {
      enabled: true,
      template: `# User Profile
- Name:
- Role:
- Company:
- Communication style:
- Meeting prep preferences:
`,
    },
  },
});

export const meetingAssistant = new Agent({
  name: "MeetingAssistant",
  instructions: "You are my personal meeting assistant.", // abridged
  model: "openai/gpt-4.1",
  memory,
});
```
Mastra handles storage, retrieval, and injection into the agent’s context automatically. You configure the types declaratively; the framework does the rest.
3. Slack Integration
Communication happens through Slack via the Chat SDK, a platform-agnostic interface for bot communication.
```typescript
// src/chat.ts
import { Chat } from "chat";
import { createSlackAdapter } from "@chat-adapter/slack";
import { meetingAssistant } from "./mastra/agents/meeting-assistant";

export const bot = new Chat({
  userName: "meeting-assistant",
  adapters: {
    slack: createSlackAdapter(),
  },
});

bot.onNewMention(async (thread, message) => {
  await thread.subscribe();
  await thread.startTyping();

  const result = await meetingAssistant.generate(message.text, {
    memory: {
      thread: thread.id,
      resource: "user",
    },
  });

  await thread.post(result.text);
});
```
Two things happening here. First, onNewMention fires when someone @mentions the bot in Slack. Second, memory.thread scopes messages to the specific Slack thread, while memory.resource uses a fixed ID so working memory (your profile) is shared across all threads.
4. Webhooks: Reacting to Events
The agent needs to know when a meeting is booked. That’s where webhooks come in.
```typescript
// src/mastra/index.ts
import { Mastra } from "@mastra/core";
import { registerApiRoute } from "@mastra/core/server";
import { bot } from "../chat";
import { meetingAssistant } from "./agents/meeting-assistant";

export const mastra = new Mastra({
  server: {
    apiRoutes: [
      registerApiRoute("/webhooks/cal", {
        method: "POST",
        handler: async (c) => {
          const payload = await c.req.json();

          // Cal.com sends several trigger types; only new bookings matter here
          if (payload.triggerEvent !== "BOOKING_CREATED") {
            return c.json({ ok: true, skipped: true });
          }

          const attendee = payload.payload?.attendees?.[0];
          const channelId = process.env.SLACK_CHANNEL_ID;
          const channel = bot.channel(`slack:${channelId}`);
          const slack = bot.getAdapter("slack");

          // Post immediately, then research asynchronously so Cal.com
          // gets a fast 200 and doesn't retry the webhook
          channel
            .post(`Researching *${attendee.name}* for upcoming meeting...`)
            .then(async (sent) => {
              const threadId = `slack:${channelId}:${sent.id}`;
              const prompt = [
                `I have a meeting coming up with ${attendee.name} (${attendee.email}).`,
                `Event: ${payload.payload.title}`,
                `Time: ${payload.payload.startTime}`,
                `Research this person and give me a concise meeting brief.`,
              ].join("\n");

              const result = await meetingAssistant.generate(prompt);
              await slack.postMessage(threadId, { markdown: result.text });
            });

          return c.json({ ok: true });
        },
      }),
    ],
  },
});
```
The webhook receives the Cal.com booking payload, immediately posts a “researching” message to Slack, then kicks off the agent to research and post the brief. This way Cal.com doesn’t time out waiting for the research to complete.
5. Task Scheduling: Time-Delayed Actions
After a meeting, the agent should follow up. That requires scheduling a task for later execution.
```typescript
// src/scheduler.ts
// `db` and `scheduledTasks` are the app's database client and tasks table,
// defined elsewhere in the project.
export async function scheduleTask(
  name: string,
  type: string,
  scheduledFor: string, // ISO timestamp for when the task should run
  payload: Record<string, unknown>,
) {
  await db.insert(scheduledTasks).values({
    name,
    type,
    scheduledFor,
    payload: JSON.stringify(payload),
  });
}
```
The scheduler polls every 30 seconds for due tasks, marks them as running, executes the handler, then marks them complete or failed. Simple, reliable, no external dependencies.
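The poll-and-execute cycle can be sketched in plain TypeScript. This is an in-memory stand-in for illustration; the real version reads and updates rows in the database, and the names here are illustrative:

```typescript
// In-memory sketch of the scheduler's polling tick.
type TaskStatus = "pending" | "running" | "completed" | "failed";

interface ScheduledTask {
  name: string;
  type: string;
  scheduledFor: string; // ISO timestamp
  payload: Record<string, unknown>;
  status: TaskStatus;
}

type TaskHandler = (payload: Record<string, unknown>) => Promise<void>;
const handlers = new Map<string, TaskHandler>();

export function registerTaskHandler(type: string, handler: TaskHandler) {
  handlers.set(type, handler);
}

// One tick: claim due tasks, run their handlers, record the outcome
export async function runDueTasks(tasks: ScheduledTask[], now: Date) {
  const due = tasks.filter(
    (t) => t.status === "pending" && new Date(t.scheduledFor) <= now,
  );
  for (const task of due) {
    task.status = "running"; // claim, so a concurrent tick won't re-run it
    try {
      const handler = handlers.get(task.type);
      if (!handler) throw new Error(`No handler for task type: ${task.type}`);
      await handler(task.payload);
      task.status = "completed";
    } catch {
      task.status = "failed";
    }
  }
}

// The loop itself is just: setInterval(() => runDueTasks(...), 30_000)
```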
```typescript
registerTaskHandler("follow-up", async (payload) => {
  const { threadId, message } = payload as { threadId: string; message: string };
  const slack = bot.getAdapter("slack");
  await slack.postMessage(threadId, { markdown: message });
});
```
When a booking comes in, the webhook handler schedules a follow-up task for the meeting's end time. Within one 30-second polling interval of that time, the handler fires and posts to the Slack thread: "The meeting should be wrapping up! How did it go?"
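Concretely, the record the webhook handler hands to the scheduler can be built like this. A sketch: the field names mirror the scheduler code above, and the message text is the one from my setup; the helper itself is hypothetical:

```typescript
// Sketch: assemble the follow-up task from booking details.
// buildFollowUpTask is an illustrative helper, not part of any library.
interface FollowUpTask {
  name: string;
  type: "follow-up";
  scheduledFor: string;
  payload: { threadId: string; message: string };
}

export function buildFollowUpTask(
  attendeeEmail: string,
  meetingEndTime: string, // ISO timestamp from the Cal.com payload
  threadId: string, // Slack thread the research brief was posted in
): FollowUpTask {
  return {
    name: `follow-up:${attendeeEmail}`,
    type: "follow-up", // matches the registered handler type
    scheduledFor: meetingEndTime, // due once the meeting has ended
    payload: {
      threadId,
      message: "The meeting should be wrapping up! How did it go?",
    },
  };
}
```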
The Mental Model
Here’s how I think about agent architecture now:
Communication is the interface. Slack, Telegram, or any messaging platform. It’s just how the user talks to the agent.
Tools are the agent’s capabilities. Each tool should do one thing well. The description is the contract. Write it clearly, or the agent won’t know when to use it.
Memory is the differentiator. Message history for in-session continuity, semantic recall for cross-session retrieval, working memory for persistent user facts. Most agents fail because they only implement one or none.
Webhooks make it proactive. Without external triggers, the agent only acts when asked. With webhooks, it can act on events: bookings, form submissions, anything.
Task scheduling closes the loop. The agent doesn’t just respond. It reminds, follows up, checks in. Time-delayed actions turn a reactive chatbot into a proactive assistant.
The Point
The point isn’t the code. It’s the architecture.
An agent that only chats doesn’t get you very far. An agent that integrates with your tools, remembers across conversations, reacts to events, and follows up on time is actually useful.
Mastra makes this easier than rolling your own. But the pattern works with any framework: give the agent tools, give it memory in layers, connect it to events, and let it act on time.
That’s an assistant. Everything else is a chatbot.
Code
The full implementation is on GitHub: github.com/dgalarza/meeting-assistant
Want Help Building This?
If you’re building AI agents into your workflow, whether it’s a personal assistant, an internal tool, or a customer-facing product, I can help.
Further Reading
- Mastra Documentation
- Mastra Memory
- Chat SDK
- How AI Agents Remember Things: deep dive into the memory taxonomy and architecture behind agents that remember
- How AI Agents Remember Things (YouTube): video walkthrough of agent memory systems
- Exa API
More on building real systems
I write about AI integration, architecture decisions, and what actually works in production.
Occasional emails, no fluff.