Cut API Costs by Routing Tasks to Cheaper Models
The problem
You burn through your Claude API budget on tasks that don't need a frontier model. Quick refactors, style tweaks, and doc updates are billed at the same per-token rate as architecture planning.
The fix
Create two Claude Code instances: a cheap one for routine work and a powerful one for hard problems. Route each task by picking the matching command.
The math
A DeepSeek Flash call costs a fraction of what Claude Opus charges per token. Most day-to-day coding tasks — renaming variables, writing boilerplate, updating configs — do not need a frontier model.
With claude-multi you get two (or more) commands in your terminal. Each one launches a full Claude Code session, but pointed at a different provider. Pick the right one for the job and watch your bill drop.
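
To make the math concrete, here is a rough sketch of the savings calculation. It assumes a task uses about the same number of tokens on either provider, and both inputs are illustrative placeholders, not real prices. The `savings_fraction` helper is hypothetical and not part of claude-multi. If a share `s` of your tokens moves to a provider whose price is a fraction `r` of the frontier price, the bill shrinks by `s * (1 - r)`.

```shell
# Hypothetical helper: fraction of spend saved by moving part of the
# workload to a cheaper provider. Inputs are illustrative, not real prices.
#   $1 = share of tokens routed to the budget instance (0..1)
#   $2 = budget price as a fraction of the power price (0..1)
savings_fraction() {
  awk -v s="$1" -v r="$2" 'BEGIN { printf "%.2f\n", s * (1 - r) }'
}

# Example: 70% of tokens moved to a provider priced at 10% of the frontier rate
savings_fraction 0.7 0.1
```

Plug in your own routing share and the price ratio from your providers' current pricing pages to get an estimate for your workload.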
Setting up the split
After installing claude-multi, open the TUI:
    claude-multi

Create a budget instance using the DeepSeek template. This gives you a claude-budget command that routes through DeepSeek’s API at lower per-token cost.
Then create a power instance using the GLM template. This gives you claude-power for the harder tasks.
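
Once both instances exist, a quick sanity check is to confirm the new commands are on your PATH. The sketch below uses only the standard `command -v` shell builtin. The `check_instances` function name is made up for this example, and the sketch assumes the commands are named claude-budget and claude-power as above.

```shell
# Hypothetical helper: report whether each given command is on the PATH.
check_instances() {
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd: found"
    else
      echo "$cmd: missing"
    fi
  done
}

check_instances claude-budget claude-power
```

If either command shows as missing, reopen your terminal or revisit the instance in the claude-multi TUI.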
    # Routine work -- cheap and fast
    claude-budget "add error handling to all fetch calls"

    # Hard problems -- frontier model
    claude-power "redesign the data layer to support offline-first"

Both instances share the same Claude Code interface. Same /loop, same skills, same plugins. The only difference is which provider handles the request.
What you keep
Because each instance is still Claude Code under the hood, you lose nothing:
- All your MCP servers still work
- Plugins and skills carry over if you enable auto-sync
- Conversation history stays isolated per instance
- You can still use /compact, /loop, and every built-in command
When to use which
| Task | Command | Why |
|---|---|---|
| Rename variables | claude-budget | Simple text transformation |
| Write unit tests | claude-budget | Pattern-based generation |
| Debug a regex | claude-budget | Focused, small scope |
| Refactor a module | claude-power | Needs broad context |
| Design an API | claude-power | Requires judgment |
| Architecture review | claude-power | Complex reasoning |
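
If you would rather not decide by hand each time, the table above can be encoded as a small shell function. This is only a sketch: `pick_command` and its task labels are hypothetical names chosen for this example, and sending unknown tasks to claude-power is an assumption you may want to flip.

```shell
# Hypothetical helper: map a task label to the instance command from the table.
pick_command() {
  case "$1" in
    rename|tests|regex)      echo "claude-budget" ;;  # simple, pattern-based, small scope
    refactor|design|review)  echo "claude-power" ;;   # broad context, judgment, complex reasoning
    *)                       echo "claude-power" ;;   # assumption: unknown tasks go to the stronger model
  esac
}

# Prints the command name; to launch a session you could run:
#   "$(pick_command rename)" "rename userId to accountId"
pick_command rename
pick_command design
```

Defaulting to the stronger model trades some cost for safety. Switch the fallback to claude-budget if you prefer to escalate only when the cheap model falls short.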
Tips
- Use auto-sync during setup so both instances share the same plugins and skills
- Check the claude-multi TUI to see cost-relevant config per instance under Instance details
- Add a third instance for MiniMax if you want another price-performance tier