Claude Opus 4.7 + Claude Code: 7 Practical Tips for Maximizing Extended Context
Claude Opus 4.7 shipped with a 1M token context window, five times what Sonnet 4.5 offers. That doesn’t mean you can stop being careful with your context window.
The lost-in-the-middle problem doesn’t disappear at 1M tokens. Content in the center of the window still gets less attention than content at the beginning and end. Opus 4.7 uses a new tokenizer that improves model performance, but it also means files you read consume context in subtly different ratios than before. Anthropic’s docs note the new tokenizer can use up to 35% more tokens per equivalent input compared to previous models. And adaptive thinking, now the only supported thinking mode in 4.7 (fixed budgets are removed), consumes context dynamically. The model thinks longer on harder problems and shorter on easy ones. That thinking counts against your window.
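As a rough budgeting exercise, you can sketch the worst-case impact of the tokenizer change. The 35% figure is the upper bound from the docs; actual ratios vary by content, and the helper name here is made up:

```typescript
// Hypothetical back-of-envelope helper: worst-case token cost of reading
// a file under a tokenizer that can use up to 35% more tokens than before.
const TOKENIZER_OVERHEAD = 0.35; // upper bound, per Anthropic's docs

function worstCaseTokens(oldTokens: number): number {
  return Math.round(oldTokens * (1 + TOKENIZER_OVERHEAD));
}

// A file that cost 10,000 tokens to read before could now cost up to 13,500.
console.log(worstCaseTokens(10_000)); // 13500
```

In practice this means a context budget that felt comfortable under the old tokenizer can be meaningfully tighter, which is part of why the tips below lean on proactive management.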
If you’re coming from my earlier post on Understanding Claude Code’s Context Window, everything there still applies. The fundamentals haven’t changed. What has changed is the ceiling, and the set of controls available to you.
Here are seven workflow adjustments I’ve made since Opus 4.7 dropped. Each one addresses a specific constraint I hit in daily production use.
1. Front-Load Context in Your First Turn
One of the big changes from Opus 4.6 to Opus 4.7 is that it no longer reads between the lines. Opus 4.6 would take a vague prompt and figure out what you meant; Opus 4.7 follows what you actually wrote. You need to provide good context to get good results. The first message in the session anchors everything that follows.
Structure your first turn to include three things: what you want and why, which files or areas of the codebase are relevant, and what “done” looks like.
Here’s an example. Instead of this:
Add rate limiting to the API
Try this:
Add rate limiting to the webhook ingestion endpoint in
packages/gateway/src/routes/webhooks.ts. We're getting
hammered by a misbehaving integration that sends duplicate
events. Use the existing Redis connection in src/lib/redis.ts.
Rate limit by client IP, 100 requests per minute. Add tests
in __tests__/webhooks.test.ts. Don't change the event
processing logic in src/lib/event-handler.ts.
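For a sense of what a reasonable response to that prompt looks like, here is a minimal fixed-window rate limiter sketch. All names are hypothetical, and an in-memory Map stands in for the Redis connection the prompt references; the real version would use INCR plus EXPIRE on a per-IP key:

```typescript
// Minimal fixed-window rate limiter sketch: 100 requests per IP per minute.
// In-memory Map stands in for Redis (INCR + EXPIRE) in this illustration.
const WINDOW_MS = 60_000;
const LIMIT = 100;

type Bucket = { count: number; resetAt: number };
const buckets = new Map<string, Bucket>();

function allowRequest(ip: string, now: number = Date.now()): boolean {
  const bucket = buckets.get(ip);
  if (!bucket || now >= bucket.resetAt) {
    // Start a fresh window for this IP.
    buckets.set(ip, { count: 1, resetAt: now + WINDOW_MS });
    return true;
  }
  if (bucket.count >= LIMIT) return false; // over the limit: reject with 429
  bucket.count += 1;
  return true;
}
```

The point is not the implementation details; it’s that the prompt gave the model enough constraints (endpoint, store, key, limit) that a sketch like this is the obvious shape of the answer.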
The second version tells Opus 4.7 exactly what to touch, why, and what to leave alone. You define the “what” and the constraints. Let the model propose the “how.”
One thing to watch for: don’t turn your first message into a specification document. If you find yourself writing more than a paragraph or two, you’re probably trying to control implementation details that the model should decide. Name the files, the constraints, and the definition of done. Stop there.
2. Switch Effort Levels Mid-Session
Opus 4.7 introduced xhigh effort and replaced the old fixed thinking budgets with adaptive thinking. At xhigh, the model almost always engages deep reasoning on complex work and skips thinking on simpler tasks. That’s useful for architecture decisions, complex debugging, and multi-file refactors. It’s overkill for renaming a variable across twenty files.
Thinking tokens count against your context window. A single xhigh response on a complex problem can use significantly more tokens than the same question at high, and over the course of a session that adds up fast.
Here’s how I handle it. I start sessions at xhigh for the initial planning and implementation work. When I shift to mechanical tasks, I drop the effort level:
/effort high
Rename the files, run the migration, update the imports. Then when I need deep analysis again:
/effort xhigh
In practice: you spend the first part of a session at xhigh implementing a feature, then need to update some test fixtures and rename a few constants. Drop to high or even medium for that work. When you’re ready to debug a failing integration test, go back to xhigh.
The gotcha here is context switching cost. Don’t toggle effort every other message. Batch your mechanical tasks together and run them at a lower effort level in one block. Then switch back for the next piece of deep work.
3. Compact at 60%, Not When You See a Warning
Autocompact triggers when your context window is nearly full. By the time that happens with a 1M window, you’ve been running with degraded output quality for a while. The lost-in-the-middle effect doesn’t wait for you to run out of room. It starts affecting responses well before you hit the ceiling.
My rule of thumb: check /context periodically and compact when you hit around 60%. That sounds like a lot to throw away, but consider the flip side: even at the 60% mark you still have 400K tokens free, twice the entire Sonnet 4.5 window. You are compacting from a position of strength, not desperation.
Here’s what /context output looks like in a session approaching that threshold:
Context Usage
⛁ ⛀ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ claude-opus-4-7 · 610k/1000k tokens (61%)
After a proactive compact:
Context Usage
⛁ ⛀ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ claude-opus-4-7 · 85k/1000k tokens (8.5%)
That’s a fresh start with all the important decisions preserved. Much better than letting autocompact fire at capacity and losing coherence.
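The arithmetic behind the 60% rule is simple enough to write down, using the window sizes from this post:

```typescript
// Why 60% is less aggressive than it sounds: headroom at the threshold,
// compared against the Sonnet 4.5 window. Figures are from this post.
const OPUS_WINDOW = 1_000_000;
const SONNET_WINDOW = 200_000;

function headroomAt(usedFraction: number): number {
  return Math.round(OPUS_WINDOW * (1 - usedFraction));
}

const left = headroomAt(0.6); // 400,000 tokens still free at the 60% mark
console.log(left / SONNET_WINDOW); // 2 -- twice the entire Sonnet 4.5 window
```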
The trade-off with early compaction is that you lose conversational nuance. Specific phrasings, detailed file contents, and turn-by-turn reasoning all get compressed into a summary. This is why Tip 4 exists.
4. Steer Your Compaction
Running /compact without guidance lets the model decide what to keep and what to drop. This works reasonably well for short sessions, but in a long session with multiple decisions, the model often drops specifics that matter for the next phase of work.
Always pass steering instructions when you compact. Name the topics and the decisions you need preserved.
Here are three examples from real sessions:
/compact Preserve the auth refactor decisions: we chose
JWT with rotating refresh tokens over session cookies,
the token service is in src/lib/auth/tokens.ts, and
the migration adds a refresh_tokens table.
/compact Keep the schema changes we made to the proposals
table (added status enum, soft delete columns, and the
client_id foreign key). Preserve the repo pattern decision
from packages/shared/src/db/repos/proposals.ts.
/compact We're moving to phase 2 of the API implementation.
Preserve the route structure decisions (REST for CRUD,
webhooks for async events) and the middleware chain order.
Drop the debugging of the TypeScript config issues.
Keep your steering to two or three sentences. Name the topics, not every detail. The model will fill in the specifics from the conversation history. You are giving it a priority list, not writing the summary yourself.
5. Use Subagents for Context Isolation
I covered subagents in detail in my context window post, but Opus 4.7 shifts the default behavior. In my experience, Opus 4.7 spawns fewer subagents on its own compared to earlier models (Anthropic’s release notes confirm this as a deliberate behavior change). It’s more inclined to do work inline, which means exploration output that used to be isolated now accumulates in your main context.
That’s fine for focused tasks. It becomes a problem when you need to explore a large area of the codebase or review a significant diff. The fix: explicitly request subagent delegation.
The key is scoping what comes back. Instead of:
Review the changes on this branch
Try:
Have a subagent review the changes on this branch against
main. Report back: any bugs, any missing test coverage,
and any patterns that don't match our existing conventions.
Don't include the full diff in the report.
Good candidates for subagent delegation:
- Code reviews: The subagent reads every changed file, but your main context only gets the summary.
- Codebase exploration: “Have a subagent map out how the notification system works across packages/gateway and packages/agents.”
- Test analysis: “Spawn a subagent to check which tests cover the payment flow and identify gaps.”
- Pattern audits: “Use a subagent to find all places we handle errors in route handlers and check for consistency.”
The gotcha with subagents is that they don’t share your conversation history. If you made a decision earlier in the session that affects how the subagent should evaluate something, include that decision in the delegation prompt. The subagent starts fresh.
6. Use Rewind to Recover from Failed Approaches
Every failed approach leaves artifacts in your context: the wrong implementation, the correction, the explanation of why it was wrong. With Opus 4.7’s literal instruction following, this creates a real problem. The model may anchor on parts of a failed attempt even after you have corrected course, because that failed code is still in the conversation history.
The /rewind command (or double-tap Escape) rolls back to a previous point in the conversation. This removes the failed approach from context entirely, as if it never happened.
Here’s when to use rewind versus inline correction:
Rewind when the approach is fundamentally wrong. You asked for a webhook handler and got a giant switch statement, but your codebase uses an event routing pattern. Correcting inline means the model has both patterns in context and may blend them.
Correct inline when the details need adjustment. The approach is right but a method name is wrong, or it missed an edge case. The cost of the correction in context is low, and the model benefits from seeing the refinement.
A practical example: I asked Claude to implement a notification dispatch system. The first attempt built a synchronous pipeline. My codebase uses BullMQ for async job processing. Rather than explaining why synchronous was wrong and asking it to redo the work, which would leave both approaches in context, I rewound and rephrased:
Implement notification dispatch using our existing BullMQ
job infrastructure in packages/agents/src/lib/queue.ts.
Each notification type gets its own job processor.
Follow the pattern in the heartbeat-runner for job setup.
Clean context. Clear direction. No conflicting signals.
One warning: rewind is destructive. If the failed approach contained useful insights (it identified the right files to modify, or surfaced a constraint you hadn’t considered), note those before rewinding. You can include them in your rephrased prompt.
7. Know When to Clear, Compact, or Continue
Quality degrades gradually in long sessions. You won’t see a cliff. Responses get slightly less precise, slightly more generic, slightly more likely to miss constraints you established earlier. A 1M window means sessions can run much longer, which makes the decision of when to stop harder, not easier.
Here’s the decision framework I use:
Continue when you’re mid-task, below 60% context usage, and working on a single coherent thread. The model has strong recall of recent decisions and the work is flowing.
Compact when you’ve finished a phase and are starting the next one. You need the architectural decisions but not the turn-by-turn implementation details. This is where Tip 4’s steering instructions matter most.
Clear when the next task is unrelated to what you’ve been doing. Also clear when the model starts repeating itself, when you’ve already compacted multiple times in the session, or when you’ve persisted your plan externally (in a TODO file, a Linear issue, or a CLAUDE.md update).
Start a new session when you need different MCP servers, when you’re switching to a different branch, or when you’re doing parallel worktree work. Each worktree should get its own session. I covered why in Extending Claude Code with Worktrees for True Database Isolation.
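One way to make the framework concrete is as a small decision function. The inputs and thresholds below encode this post’s heuristics, nothing more; they are not any real Claude Code API:

```typescript
// Sketch of the continue / compact / clear / new-session decision framework.
// The fields and the 0.6 threshold mirror the rules described above.
type SessionState = {
  contextUsed: number;          // fraction of the window in use, 0..1
  sameTask: boolean;            // still on the same coherent thread?
  phaseFinished: boolean;       // wrapped up a phase, starting the next?
  needsNewEnvironment: boolean; // different MCP servers, branch, or worktree
};

type Action = "continue" | "compact" | "clear" | "new-session";

function nextAction(s: SessionState): Action {
  if (s.needsNewEnvironment) return "new-session";
  if (!s.sameTask) return "clear";
  if (s.phaseFinished || s.contextUsed >= 0.6) return "compact";
  return "continue";
}
```

Mid-task at 40% usage on the same thread returns "continue"; crossing 60% or finishing a phase returns "compact"; an unrelated next task returns "clear"; a branch or worktree switch returns "new-session".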
The full session lifecycle follows a natural arc. Start with a strong first prompt (Tip 1) at xhigh effort (Tip 2). During the working phase, delegate exploration to subagents (Tip 5) and rewind failed approaches (Tip 6). When you hit around 60% context, compact proactively (Tip 3) with steering instructions (Tip 4). Then decide whether to continue, clear, or start fresh (Tip 7).
Session Start
├── Tip 1: Front-load context in first turn
├── Tip 2: xhigh for deep work, high/medium for mechanical
│
│ [Working...]
│
├── Tip 5: Delegate exploration to subagents
├── Tip 6: Rewind failed approaches
│
│ [~60% context used]
│
├── Tip 3: Proactive /compact
├── Tip 4: Steer the compaction
│
│ [Continue or...]
│
└── Tip 7: Clear / New session when needed
The Mental Model
The 1M context window isn’t five times more room. It’s five times more rope.
With a 200K window, context pressure forced discipline. You had to be deliberate about what went into the window because you would run out. With 1M tokens, poor habits go unnoticed much longer before the consequences show up. That makes discipline harder, not easier.
The one principle behind all seven tips: active context management beats passive accumulation. Front-load your intent. Control your effort levels. Compact before you need to. Steer the compaction. Isolate expensive exploration. Remove dead ends. Know when to stop.
These aren’t theoretical suggestions. They’re the adjustments I’ve made in my own workflow over the past week of daily use. Opus 4.7 rewards precision and punishes ambiguity. Give it clear context, and it delivers.
If this post was the explanation, the cheat sheet is the reference. Two sides: token costs for common MCPs on one, the /clear vs. /compact vs. subagent decision tree on the other.