Can I use it with local models like Ollama?
Yes, as long as your local model server exposes an Anthropic-compatible REST endpoint. Claude Code speaks the Anthropic API protocol, so the server on the other end needs to understand that format.
How to set it up
Create an instance without a template:
claude-multi add localThen edit ~/.claude-multi/local/settings.json and set the env vars to point at your local server:
{ "env": { "ANTHROPIC_BASE_URL": "http://localhost:11434/v1", "ANTHROPIC_MODEL": "your-model-name", "ANTHROPIC_SMALL_FAST_MODEL": "your-fast-model" }}Replace the URL and model names with whatever your local server exposes.
What works and what doesn’t
If your local server faithfully implements the Anthropic messages API (the /v1/messages endpoint), Claude Code will work with it. Tools like Ollama with an Anthropic-compatible adapter, LiteLLM, or vLLM with the right proxy can bridge the gap.
The further your local setup deviates from the Anthropic API spec, the more likely you are to hit edge cases, especially around streaming, tool use, and extended thinking.
A practical note on cost
Running local models eliminates per-token API costs entirely. You pay in compute (GPU time, electricity) instead. For high-volume tasks like code generation and refactoring, this can be significantly cheaper than any cloud provider, if you have the hardware.
Related questions
- Which providers are supported?: the built-in templates
- How do I create a new instance?: the full setup walkthrough
More info
- /docs/providers/: template reference and env var details
- src/templates.ts: see how templates set
ANTHROPIC_BASE_URLand model mappings for reference