Providers

Can I use it with local models like Ollama?

Any provider that exposes an Anthropic-compatible REST API works, including local setups like Ollama with the right adapter.

Yes, as long as your local model server exposes an Anthropic-compatible REST endpoint. Claude Code speaks the Anthropic API protocol, so the server on the other end needs to understand that format.

How to set it up

Create an instance without a template:

claude-multi add local

Then edit ~/.claude-multi/local/settings.json and set the env vars to point at your local server:

{
  "env": {
    "ANTHROPIC_BASE_URL": "http://localhost:11434/v1",
    "ANTHROPIC_MODEL": "your-model-name",
    "ANTHROPIC_SMALL_FAST_MODEL": "your-fast-model"
  }
}

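Replace the URL and model names with whatever your local server exposes. Check whether your server expects the base URL with or without the /v1 suffix: a client that appends /v1/messages to the base URL would otherwise end up requesting /v1/v1/messages.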
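
Typos in the env block tend to surface only as confusing connection errors later. As a minimal sketch, the check below validates the three keys from the example above before you launch an instance. The helper name checkLocalEnv is hypothetical and not part of claude-multi, and treating all three keys as required is an assumption made for illustration.

```typescript
// Hypothetical helper, not part of claude-multi: inspects the "env" block
// from settings.json and reports common mistakes.
type EnvBlock = Record<string, string | undefined>;

// Keys taken from the example settings.json above. Whether all three are
// strictly required is an assumption of this sketch.
const EXPECTED_KEYS = [
  "ANTHROPIC_BASE_URL",
  "ANTHROPIC_MODEL",
  "ANTHROPIC_SMALL_FAST_MODEL",
];

// Returns a list of human-readable problems; an empty list means the
// block passed these basic checks.
function checkLocalEnv(env: EnvBlock): string[] {
  const problems: string[] = [];

  for (const key of EXPECTED_KEYS) {
    const value = env[key];
    if (value === undefined || value.trim() === "") {
      problems.push(`${key} is missing or empty`);
    }
  }

  const baseUrl = env["ANTHROPIC_BASE_URL"];
  if (baseUrl !== undefined && baseUrl.trim() !== "") {
    try {
      const parsed = new URL(baseUrl);
      if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
        problems.push(
          `ANTHROPIC_BASE_URL must use http or https, got ${parsed.protocol}`
        );
      }
    } catch {
      problems.push(`ANTHROPIC_BASE_URL is not a valid URL: ${baseUrl}`);
    }
  }

  return problems;
}
```

To use it, read settings.json, pass its env object to checkLocalEnv, and print whatever comes back. The check is syntactic only and does not contact the server.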

What works and what doesn’t

If your local server faithfully implements the Anthropic messages API (the /v1/messages endpoint), Claude Code will work with it. Tools like Ollama with an Anthropic-compatible adapter, LiteLLM, or vLLM with the right proxy can bridge the gap.
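
One way to find out whether a server speaks the messages protocol is to send it the smallest valid request and look at the shape of the reply. The sketch below builds such a request and checks a parsed response. The function names, the placeholder API key, and the prompt text are illustrative assumptions. The request fields (model, max_tokens, messages) and the anthropic-version header follow the public Anthropic Messages API.

```typescript
// A request description that can be handed to any HTTP client.
interface ProbeRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds a minimal Messages API request.
// endpointUrl is the full URL of your server's messages endpoint
// (an assumption of this sketch; adjust it to your setup).
function buildProbeRequest(endpointUrl: string, model: string): ProbeRequest {
  return {
    url: endpointUrl,
    method: "POST",
    headers: {
      "content-type": "application/json",
      "anthropic-version": "2023-06-01",
      // Assumption: a local server may ignore the key, but the header is
      // part of the protocol, so a placeholder value is sent.
      "x-api-key": "local-placeholder",
    },
    body: JSON.stringify({
      model,
      max_tokens: 16,
      messages: [{ role: "user", content: "Reply with the word ok." }],
    }),
  };
}

// Returns true when a parsed response body has the basic shape of a
// Messages API reply: type "message", role "assistant", a content array.
function looksLikeMessage(body: unknown): boolean {
  if (typeof body !== "object" || body === null) return false;
  const b = body as Record<string, unknown>;
  return (
    b["type"] === "message" &&
    b["role"] === "assistant" &&
    Array.isArray(b["content"])
  );
}
```

The returned object maps directly onto fetch(req.url, { method, headers, body }); pass the parsed JSON reply to looksLikeMessage. A passing check shows only that the basic shape is right, not that tool use or streaming behave correctly.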

The further your local setup deviates from the Anthropic API spec, the more likely you are to hit edge cases, especially around streaming, tool use, and extended thinking.
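
Streaming is a common place for adapters to drift from the spec. As a rough sketch, the check below takes the raw server-sent-events text captured from a streaming response and reports anything unusual in the order of events. The event names are the ones used by Anthropic's streaming Messages API. The function names and the specific checks are assumptions chosen for illustration, not an exhaustive conformance test.

```typescript
// Event names used by Anthropic's streaming Messages API.
const KNOWN_EVENTS = new Set<string>([
  "message_start",
  "content_block_start",
  "content_block_delta",
  "content_block_stop",
  "message_delta",
  "message_stop",
  "ping",
  "error",
]);

// Extracts the names on "event:" lines from raw SSE text.
function eventNames(sseText: string): string[] {
  return sseText
    .split(/\r?\n/)
    .filter((line) => line.startsWith("event:"))
    .map((line) => line.slice("event:".length).trim());
}

// Hypothetical check: returns a list of observations worth a closer look.
// An empty list means the event order looks plausible.
function checkStream(sseText: string): string[] {
  // Pings may appear anywhere, so they are ignored for ordering.
  const names = eventNames(sseText).filter((name) => name !== "ping");
  if (names.length === 0) return ["no events found"];

  const problems: string[] = [];
  if (names[0] !== "message_start") {
    problems.push(`first event is ${names[0]}, expected message_start`);
  }
  const last = names[names.length - 1];
  if (last !== "message_stop") {
    problems.push(`last event is ${last}, expected message_stop`);
  }
  for (const name of names) {
    // Unknown names are flagged for review rather than treated as fatal.
    if (!KNOWN_EVENTS.has(name)) problems.push(`unknown event type: ${name}`);
  }
  const starts = names.filter((n) => n === "content_block_start").length;
  const stops = names.filter((n) => n === "content_block_stop").length;
  if (starts !== stops) {
    problems.push(`${starts} content_block_start vs ${stops} content_block_stop`);
  }
  return problems;
}
```

Capture a streamed response to a file, read it as text, and pass it to checkStream. The sketch looks only at event names and never parses the data payloads.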

A practical note on cost

Running local models eliminates per-token API costs. You pay in compute (GPU time, electricity) and hardware instead. For high-volume tasks like code generation and refactoring, this can be significantly cheaper than a cloud provider, provided you already have the hardware or can spread its cost over enough usage.
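
Whether local is actually cheaper depends on your volume and hardware, so it can help to run the numbers. The following is a back-of-the-envelope sketch, not a pricing model. Every input is a hypothetical placeholder, and it assumes a single blended per-token price, whereas hosted APIs usually price input and output tokens differently.

```typescript
// All inputs are hypothetical placeholders; substitute your own numbers.
interface LocalSetup {
  gpuWatts: number;           // average power draw under load, in watts
  hoursPerMonth: number;      // hours per month the GPU spends on inference
  pricePerKWh: number;        // electricity price per kilowatt-hour
  hardwareCost: number;       // purchase price of the hardware
  amortizationMonths: number; // months over which to spread that price
}

// Monthly cost of a hosted API billed per million tokens.
function apiMonthlyCost(
  tokensPerMonth: number,
  pricePerMillionTokens: number
): number {
  return (tokensPerMonth / 1_000_000) * pricePerMillionTokens;
}

// Monthly cost of running locally: electricity plus amortized hardware.
function localMonthlyCost(setup: LocalSetup): number {
  const electricity =
    (setup.gpuWatts / 1000) * setup.hoursPerMonth * setup.pricePerKWh;
  const hardware = setup.hardwareCost / setup.amortizationMonths;
  return electricity + hardware;
}
```

Comparing apiMonthlyCost with localMonthlyCost for your own figures gives a rough break-even point. The sketch leaves out differences in model quality and throughput, and the time spent maintaining the setup.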

Related questions

Which providers are supported?: the built-in templates
How do I create a new instance?: the full setup walkthrough

More info

/docs/providers/: template reference and env var details
src/templates.ts: see how templates set ANTHROPIC_BASE_URL and model mappings for reference
