The Multi-Model Reality
Not every task needs the same model. Design work — architecture decisions, tricky refactors, complex debugging — benefits from the most capable model available. Routine execution — mechanical renames, boilerplate generation, straightforward test writing — runs faster and cheaper on a lighter model.
Most developers already think this way. The problem is acting on it. Switching models mid-session traditionally means losing the thread. The new model starts cold: no knowledge of the plan, no awareness of what was just completed, no sense of what comes next. The cost of re-establishing context often exceeds the savings from using a faster model.
This trade-off disappears when state lives outside the model entirely.
Why Model Switches Lose Context
In a standard AI coding workflow, the context window is the working memory. The plan, the progress, the decisions, the current position — all of it lives in the transcript. When you switch models, that transcript does not transfer. The new model sees the files on disk and nothing else.
This is one of the six ways AI agents lose continuity. And it is the one that is entirely self-inflicted — you chose to switch, and the architecture punished you for it.
State in the MCP, Not the Model
Todo-mcp stores all working state in a durable SQLite database accessed through the Model Context Protocol. The strategic task list, the in-flight breadcrumbs, the sprint position, the open decisions — none of it lives in the model's context window. The window is the scratchpad. The MCP server is the notebook.
This means a model switch is operationally identical to crash recovery: the new model runs context_resume, reads the same store, and picks up exactly where the previous model left off. The state never lived in the model, so swapping the model does not lose the state.
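To make the "state outside the model" idea concrete, here is a minimal sketch of a durable key-value store backed by SQLite. The `ContextStore` class, its table layout, and the `demo_switch` helper are hypothetical names for illustration only; Todo-mcp's actual schema and API are not shown in this article and may differ.

```python
import os
import sqlite3
import tempfile
from typing import Optional


class ContextStore:
    """Hypothetical key-value store on SQLite (not Todo-mcp's real schema)."""

    def __init__(self, path: str) -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS context "
            "(key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )
        self.conn.commit()

    def set(self, key: str, value: str) -> None:
        # Replace any earlier value so the newest breadcrumb wins.
        self.conn.execute(
            "INSERT OR REPLACE INTO context (key, value) VALUES (?, ?)",
            (key, value),
        )
        self.conn.commit()

    def get(self, key: str) -> Optional[str]:
        row = self.conn.execute(
            "SELECT value FROM context WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def close(self) -> None:
        self.conn.close()


def demo_switch() -> Optional[str]:
    """Write state in one 'session', then read it from a fresh one."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "state.db")

        outgoing = ContextStore(db_path)  # session driven by the old model
        outgoing.set("task_#7_progress", "NEXT: token-expiry test needs a fixture")
        outgoing.close()

        incoming = ContextStore(db_path)  # session driven by the new model
        value = incoming.get("task_#7_progress")
        incoming.close()
        return value


if __name__ == "__main__":
    print(demo_switch())
```

The second connection knows nothing about the first except the file path, which is the whole point: whatever reads the store next, after a crash or a model switch, sees the same state.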
The Handoff Pattern
Here is how a clean model handoff works in practice:
Step 1: The Outgoing Model Parks Clean State
Before the switch, the current model ensures its breadcrumb is current. This happens naturally if it is following the task-execution skill — breadcrumbs are updated at every meaningful checkpoint. No special "handoff" step is needed. The store is already up to date.
key: task_#7_progress
value: "Harden auth module — DONE: extracted handler, added Authenticator
interface, wired DI. NEXT: token-expiry test needs a fixture.
BUILD: clean, 12/14 tests passing."
Step 2: Switch Models
Change the model in your configuration. The old session's transcript is gone. That is fine — nothing critical lived there.
Step 3: The Incoming Model Resumes
The new model's session starts with context_resume — the same command that opens every session. It reads the task list, the breadcrumb, and any open context. Within seconds it knows: task #7 is in progress, what is done, what is next, and the current build state.
Active sprint: Identity & Restore
In progress: #7 Harden auth module
Recently done: #4, #5, #6
Breadcrumb: extracted handler, added interface, wired DI
NEXT: build test fixture for token-expiry
Build state: clean, 12/14 passing
The new model continues from exactly where the old one left off. No re-explanation. No re-reading the repo to figure out what happened.
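One way to picture the resume summary above is as a plain rendering of a few stored fields. The sketch below assumes a hypothetical `Breadcrumb` record and `render_resume` function; the real `context_resume` output and storage format belong to Todo-mcp and may be structured differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Breadcrumb:
    """Hypothetical shape for a task breadcrumb (an assumption, not the real format)."""
    task_id: int
    title: str
    done: List[str] = field(default_factory=list)
    next_step: str = ""
    build: str = ""


def render_resume(sprint: str, crumb: Breadcrumb, recently_done: List[int]) -> str:
    """Build the text an incoming model would read at session start."""
    lines = [
        f"Active sprint: {sprint}",
        f"In progress: #{crumb.task_id} {crumb.title}",
        "Recently done: " + ", ".join(f"#{n}" for n in recently_done),
        "Breadcrumb: " + ", ".join(crumb.done),
        f"NEXT: {crumb.next_step}",
        f"Build state: {crumb.build}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    crumb = Breadcrumb(
        task_id=7,
        title="Harden auth module",
        done=["extracted handler", "added interface", "wired DI"],
        next_step="build test fixture for token-expiry",
        build="clean, 12/14 passing",
    )
    print(render_resume("Identity & Restore", crumb, [4, 5, 6]))
```

Nothing in the rendering depends on which model wrote the breadcrumb or which one reads it, so the summary is identical on both sides of the switch.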
When to Switch: Practical Scenarios
Design with a big model, execute with a fast one
Use the most capable model to author the sprint plan, make architecture decisions, and tackle the first tricky task. Then hand off to a faster model for the batch of routine tasks that follow. The plan, the decisions, and the context all persist in the store. The fast model just executes.
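As a rough illustration of this split, the sketch below groups a sprint's tasks into sessions by the kind of model each task calls for. The model identifiers (`capable-model`, `fast-model`), the `TaskKind` enum, and `plan_sessions` are all hypothetical; substitute whatever models and task labels your own setup uses.

```python
from enum import Enum
from typing import Dict, List, Tuple


class TaskKind(Enum):
    DESIGN = "design"        # architecture, tricky refactors, hard debugging
    EXECUTION = "execution"  # renames, boilerplate, straightforward tests


# Hypothetical model identifiers; replace with the names from your own configuration.
MODEL_FOR_KIND: Dict[TaskKind, str] = {
    TaskKind.DESIGN: "capable-model",
    TaskKind.EXECUTION: "fast-model",
}


def plan_sessions(tasks: List[Tuple[str, TaskKind]]) -> List[Tuple[str, List[str]]]:
    """Group consecutive tasks that share a model into one session.

    Each boundary between sessions is a point where a handoff would occur.
    """
    sessions: List[Tuple[str, List[str]]] = []
    for title, kind in tasks:
        model = MODEL_FOR_KIND[kind]
        if sessions and sessions[-1][0] == model:
            sessions[-1][1].append(title)
        else:
            sessions.append((model, [title]))
    return sessions


if __name__ == "__main__":
    sprint = [
        ("Author sprint plan", TaskKind.DESIGN),
        ("Decide auth architecture", TaskKind.DESIGN),
        ("Rename handlers", TaskKind.EXECUTION),
        ("Generate boilerplate", TaskKind.EXECUTION),
    ]
    for model, titles in plan_sessions(sprint):
        print(model, titles)
```

Ordering the backlog so that design tasks cluster at the front keeps the number of handoffs small, though with state in the store each handoff is cheap either way.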
Unblock with a different model
Sometimes a model gets stuck in a loop — retrying the same failed approach. Switching to a different model gives a fresh perspective while preserving all the context about what has already been tried (logged in breadcrumbs). The new model does not repeat the failed attempts because it can see them.
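The value of logging attempts can be sketched in a few lines. This assumes a hypothetical `Attempt` record stored alongside the breadcrumb; how Todo-mcp actually records failed approaches is not specified here, so treat the structure as an illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Attempt:
    """Hypothetical log entry for one approach the previous model tried."""
    approach: str
    outcome: str  # "failed" or "worked"


def untried_approaches(candidates: List[str], attempts: List[Attempt]) -> List[str]:
    """Return candidate approaches that are not already logged as failed."""
    failed = {a.approach for a in attempts if a.outcome == "failed"}
    return [c for c in candidates if c not in failed]


if __name__ == "__main__":
    logged = [
        Attempt("mock the clock", "failed"),
        Attempt("patch token TTL", "failed"),
    ]
    ideas = ["mock the clock", "patch token TTL", "build a fixture"]
    print(untried_approaches(ideas, logged))
```

Because the log lives in the store and not in the transcript, the incoming model starts with the list of dead ends already in hand.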
Cost management across a long sprint
A multi-day sprint does not need peak capability for every session. Use the powerful model for planning and the hard problems. Use the efficient model for the straightforward sessions. The sprint state is continuous regardless of which model is driving on any given day.
What the Store Carries Across the Switch
| State | Where it lives | Survives model switch? |
|---|---|---|
| Sprint plan and task backlog | Todo-mcp tasks | Yes |
| Current task progress | Context breadcrumb | Yes |
| Decisions and rationale | Auto-memory + context | Yes |
| Lessons and preferences | Auto-memory | Yes |
| Conversation transcript | Context window | No (and it does not need to) |
The only thing lost is the literal transcript, which context compaction would eventually have summarized away anyway. Everything that matters is in the store.
This Is Not Multi-Agent
An important distinction: model switching is one agent at a time, using different models sequentially. It is not multiple agents running concurrently on the same project. Todo-mcp is designed for one agent working across time — one model hands off to the next through the shared store, never two models touching the same state simultaneously.
This is intentional. Sequential handoff through a durable store is simple and reliable. Concurrent multi-agent coordination is a different problem with different trade-offs — and not one this architecture tries to solve.
The Bigger Picture
Model switching is just one of the six continuity problems that persistent memory solves. The same architecture that makes model handoff seamless also handles crashes, compaction, session boundaries, and multi-week projects. The underlying principle is the same in every case: state that lives outside the model survives anything that happens to the model.
As the model landscape evolves — new models, different price points, specialized capabilities — the ability to switch freely without losing your place becomes increasingly valuable. The projects that can do it will use the right model for each job. The projects that cannot will overpay for capability they do not always need, or lose context every time they try to economize.
Getting Started
If you are already using Todo-mcp, model switching requires no additional setup. Follow the same skill-based workflow, ensure breadcrumbs are current before switching, and run context_resume on the other side. The store handles the rest.
If you are evaluating AI coding tools and multi-model workflows matter to you, this is the architecture question to ask: does the tool's memory live inside the model, or outside it? The answer determines whether switching models is a fresh start or a seamless continuation.