Swap the Engine Without Stopping the Car: Model Switching with Persistent AI Memory - Todo-MCP Blog
Advanced | 6 min read | January 2025

Use a powerful model for design and a fast model for execution — without losing context. When state lives outside the model, switching models mid-sprint is seamless.

The Multi-Model Reality

Not every task needs the same model. Design work — architecture decisions, tricky refactors, complex debugging — benefits from the most capable model available. Routine execution — mechanical renames, boilerplate generation, straightforward test writing — runs faster and cheaper on a lighter model.

Most developers already think this way. The problem is acting on it. Switching models mid-session traditionally means losing the thread. The new model starts cold: no knowledge of the plan, no awareness of what was just completed, no sense of what comes next. The cost of re-establishing context often exceeds the savings from using a faster model.

This trade-off disappears when state lives outside the model entirely.

Why Model Switches Lose Context

In a standard AI coding workflow, the context window is the working memory. The plan, the progress, the decisions, the current position — all of it lives in the transcript. When you switch models, that transcript does not transfer. The new model sees the files on disk and nothing else.

This is one of the six ways AI agents lose continuity. And it is the one that is entirely self-inflicted — you chose to switch, and the architecture punished you for it.

State in the MCP, Not the Model

Todo-MCP stores all working state in a durable SQLite database accessed through the Model Context Protocol (MCP). The strategic task list, the in-flight breadcrumbs, the sprint position, and the open decisions all live there, not in the model's context window. The window is the scratchpad. The MCP server is the notebook.

This means a model switch is operationally identical to crash recovery: the new model runs context_resume, reads the same store, and picks up exactly where the previous model left off. The state never lived in the model, so swapping the model does not lose the state.
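To make the idea concrete, here is a minimal sketch of state that outlives a session, using Python's built-in sqlite3 module. The table layout and the helper names (open_store, save_context, load_context) are assumptions made for illustration; they are not Todo-MCP's actual schema or API.

```python
import os
import sqlite3
import tempfile


def open_store(path):
    """Open (or create) the durable store. Hypothetical schema, not Todo-MCP's."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS context ("
        "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    return conn


def save_context(conn, key, value):
    """Write or overwrite one key. The with-block commits on success."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO context (key, value) VALUES (?, ?)",
            (key, value),
        )


def load_context(conn):
    """Read everything back as a plain dict."""
    return dict(conn.execute("SELECT key, value FROM context"))


# Session A: the outgoing model records its position, then the session ends.
path = os.path.join(tempfile.mkdtemp(), "state.db")
session_a = open_store(path)
save_context(session_a, "task_#7_progress", "NEXT: token-expiry test needs a fixture")
session_a.close()

# Session B: a different model starts cold and reads the same file.
session_b = open_store(path)
restored = load_context(session_b)
session_b.close()
print(restored)
```

Nothing in session B depends on anything session A held in memory, which is why a model switch and a crash look the same from the store's point of view.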

The Handoff Pattern

Here is how a clean model handoff works in practice:

Step 1: The Outgoing Model Parks Clean State

Before the switch, the current model ensures its breadcrumb is current. This happens naturally if it is following the task-execution skill — breadcrumbs are updated at every meaningful checkpoint. No special "handoff" step is needed. The store is already up to date.

key:   task_#7_progress
value: "Harden auth module — DONE: extracted handler, added Authenticator
        interface, wired DI. NEXT: token-expiry test needs a fixture.
        BUILD: clean, 12/14 tests passing."
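A breadcrumb like the one above follows a loose DONE / NEXT / BUILD shape. The sketch below shows one way to build that string consistently. The Breadcrumb class, its field names, and the " | " separator are hypothetical helpers for illustration, not part of Todo-MCP.

```python
from dataclasses import dataclass, field


@dataclass
class Breadcrumb:
    """Hypothetical container for one task's progress note."""
    task: str
    done: list = field(default_factory=list)
    next_step: str = ""
    build: str = "unknown"

    def render(self):
        # A breadcrumb with no NEXT gives the incoming model nothing to act on,
        # so this sketch refuses to produce one.
        if not self.next_step:
            raise ValueError("breadcrumb needs a next step")
        done_text = ", ".join(self.done) if self.done else "nothing yet"
        return (
            f"{self.task} | DONE: {done_text}. "
            f"NEXT: {self.next_step}. BUILD: {self.build}."
        )


crumb = Breadcrumb(
    task="Harden auth module",
    done=["extracted handler", "added Authenticator interface", "wired DI"],
    next_step="token-expiry test needs a fixture",
    build="clean, 12/14 tests passing",
)
print(crumb.render())
```

Requiring a next step is a design choice of this sketch: the record of what was done is useful, but the pointer to what comes next is what lets a different model continue without re-deriving the plan.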

Step 2: Switch Models

Change the model in your configuration. The old session's transcript is gone. That is fine — nothing critical lived there.

Step 3: The Incoming Model Resumes

The new model's session starts with context_resume — the same command that opens every session. It reads the task list, the breadcrumb, and any open context. Within seconds it knows: task #7 is in progress, what is done, what is next, and the current build state.

Active sprint:   Identity & Restore
In progress:     #7 Harden auth module
Recently done:   #4, #5, #6
Breadcrumb:      extracted handler, added interface, wired DI
                 NEXT: build test fixture for token-expiry
Build state:     clean, 12/14 passing

The new model continues from exactly where the old one left off. No re-explanation. No re-reading the repo to figure out what happened.
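The summary shown above can be thought of as a pure function of what is in the store. The following sketch assembles a similar report from a task list and a breadcrumb map. The function name resume_summary, the tuple layout, and the status strings are assumptions for illustration; the real context_resume output is produced by Todo-MCP itself.

```python
def resume_summary(sprint, tasks, breadcrumbs):
    """Build a resume report.

    tasks: list of (task_id, title, status) tuples (hypothetical layout).
    breadcrumbs: dict mapping keys like "task_#7_progress" to text.
    """
    lines = [f"Active sprint:   {sprint}"]
    for task_id, title, status in tasks:
        if status != "in_progress":
            continue
        lines.append(f"In progress:     #{task_id} {title}")
        crumb = breadcrumbs.get(f"task_#{task_id}_progress")
        if crumb:
            lines.append(f"Breadcrumb:      {crumb}")
    done = [f"#{task_id}" for task_id, _title, status in tasks if status == "done"]
    if done:
        lines.append("Recently done:   " + ", ".join(done))
    return "\n".join(lines)


tasks = [
    (4, "(title omitted)", "done"),
    (5, "(title omitted)", "done"),
    (6, "(title omitted)", "done"),
    (7, "Harden auth module", "in_progress"),
]
breadcrumbs = {"task_#7_progress": "NEXT: build test fixture for token-expiry"}
report = resume_summary("Identity & Restore", tasks, breadcrumbs)
print(report)
```

Because the report depends only on stored data, it comes out the same no matter which model asks for it.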

When to Switch: Practical Scenarios

Design with a big model, execute with a fast one

Use the most capable model to author the sprint plan, make architecture decisions, and tackle the first tricky task. Then hand off to a faster model for the batch of routine tasks that follow. The plan, the decisions, and the context all persist in the store. The fast model just executes.
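One way to plan the handoff is to classify each task up front and look for the points where the preferred tier changes. The sketch below does that with a hypothetical set of task kinds. The kind labels and tier names are assumptions, and real routing will likely need more judgment than a lookup table.

```python
from enum import Enum


class Tier(Enum):
    CAPABLE = "capable"
    FAST = "fast"


# Hypothetical labels for work that benefits from the most capable model.
DESIGN_KINDS = {"planning", "architecture", "refactor", "debugging"}


def pick_tier(kind):
    """Map a task kind to a model tier; anything unlisted is treated as routine."""
    return Tier.CAPABLE if kind in DESIGN_KINDS else Tier.FAST


def handoff_points(kinds):
    """Return the indices at which the tier differs from the previous task."""
    tiers = [pick_tier(kind) for kind in kinds]
    return [i for i in range(1, len(tiers)) if tiers[i] != tiers[i - 1]]


sprint = ["planning", "architecture", "rename", "boilerplate", "tests"]
switches = handoff_points(sprint)
print(switches)
```

With this ordering there is a single switch, before the third task. Grouping the routine tasks together keeps the number of handoffs low, although each handoff is cheap when the state is already in the store.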

Unblock with a different model

Sometimes a model gets stuck in a loop — retrying the same failed approach. Switching to a different model gives a fresh perspective while preserving all the context about what has already been tried (logged in breadcrumbs). The new model does not repeat the failed attempts because it can see them.
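The value of logging failed attempts can be sketched with a small helper pair. Here an in-memory dict stands in for the durable store, and the key format task_#N_attempts is a hypothetical naming choice modelled on the breadcrumb key shown earlier, not a documented Todo-MCP convention.

```python
def record_attempt(context, task_id, approach, outcome):
    """Append one attempt to the task's log in the (stand-in) store."""
    key = f"task_#{task_id}_attempts"
    context.setdefault(key, []).append({"approach": approach, "outcome": outcome})


def untried(context, task_id, candidates):
    """Filter candidate approaches down to those not already logged."""
    attempts = context.get(f"task_#{task_id}_attempts", [])
    tried = {attempt["approach"] for attempt in attempts}
    return [candidate for candidate in candidates if candidate not in tried]


store = {}
record_attempt(store, 7, "mock the clock", "failed: expiry still read system time")
record_attempt(store, 7, "sleep until expiry", "failed: test too slow")

remaining = untried(
    store, 7, ["mock the clock", "sleep until expiry", "inject a clock"]
)
print(remaining)
```

The incoming model only needs the log, not the transcript that produced it, to avoid walking the same dead ends.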

Cost management across a long sprint

A multi-day sprint does not need peak capability for every session. Use the powerful model for planning and the hard problems. Use the efficient model for the straightforward sessions. The sprint state is continuous regardless of which model is driving on any given day.

What the Store Carries Across the Switch

| State | Where it lives | Survives model switch? |
| --- | --- | --- |
| Sprint plan and task backlog | Todo-MCP tasks | Yes |
| Current task progress | Context breadcrumb | Yes |
| Decisions and rationale | Auto-memory + context | Yes |
| Lessons and preferences | Auto-memory | Yes |
| Conversation transcript | Context window | No (and it does not need to) |

The only thing lost is the literal transcript — which was going to be summarized away eventually anyway. Everything that matters is in the store.

This Is Not Multi-Agent

An important distinction: model switching is one agent at a time, using different models sequentially. It is not multiple agents running concurrently on the same project. Todo-MCP is designed for one agent working across time: one model hands off to the next through the shared store, and two models never touch the same state simultaneously.

This is intentional. Sequential handoff through a durable store is simple and reliable. Concurrent multi-agent coordination is a different problem with different trade-offs — and not one this architecture tries to solve.

The Bigger Picture

Model switching is just one of the six continuity problems that persistent memory solves. The same architecture that makes model handoff seamless also handles crashes, compaction, session boundaries, and multi-week projects. The underlying principle is the same in every case: state that lives outside the model survives anything that happens to the model.

As the model landscape evolves — new models, different price points, specialized capabilities — the ability to switch freely without losing your place becomes increasingly valuable. The projects that can do it will use the right model for each job. The projects that cannot will overpay for capability they do not always need, or lose context every time they try to economize.

Getting Started

If you are already using Todo-MCP, model switching requires no additional setup. Follow the same skill-based workflow, ensure breadcrumbs are current before switching, and run context_resume on the other side. The store handles the rest.

If you are evaluating AI coding tools and multi-model workflows matter to you, this is the architecture question to ask: does the tool's memory live inside the model, or outside it? The answer determines whether switching models is a fresh start or a seamless continuation.
