Moonshot Kimi Provider for Claude Code
Agentic coding with strong tool-use and reasoning
claude-multi add kimi
Use cases
Kimi K2.5 is Moonshot AI’s coding-focused model with strong performance on agentic benchmarks. It excels at multi-step tool use, which is the core of how Claude Code operates. The Anthropic-compatible endpoint at moonshot.ai connects directly to Claude Code without any adapters.
Model specs
| Role | Model | Context | Max Output |
|---|---|---|---|
| All roles | Kimi K2.5 | 128K | 65,536 |
The template maps K2.5 to every role. It handles heavy reasoning and fast tasks equally well.
Thinking mode is enabled with REASONING_EFFORT: high and a budget of 16,000 thinking tokens. Auto-compaction is tuned for the 128K context window. Without these settings, Claude Code assumes a 200K window for unrecognized models, so compaction never triggers before the real 128K limit is reached and the session crashes with a context overflow.
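The arithmetic behind the overflow can be sketched in a few lines. This is only an illustration: the 80% compaction threshold and the function names are assumptions made for the example, not values taken from Claude Code or the template, and 128K is read as 128,000 tokens.

```python
# Illustrative arithmetic only. COMPACT_PERCENT and both function names are
# hypothetical; they are not Claude Code settings or template values.
ACTUAL_WINDOW = 128_000    # Kimi K2.5 context window (assuming 128K = 128,000)
ASSUMED_WINDOW = 200_000   # fallback window for unrecognized models, per the text
COMPACT_PERCENT = 80       # assumed threshold at which history gets compacted


def compaction_point(window: int, percent: int = COMPACT_PERCENT) -> int:
    """Token count at which compaction would trigger for a given window."""
    return window * percent // 100


def overflows_before_compaction(assumed: int, actual: int,
                                percent: int = COMPACT_PERCENT) -> bool:
    """True if the real window fills up before compaction ever triggers."""
    return compaction_point(assumed, percent) > actual


if __name__ == "__main__":
    # Trigger computed from the wrong (assumed) window vs. the real one.
    print(compaction_point(ASSUMED_WINDOW))   # 160000
    print(compaction_point(ACTUAL_WINDOW))    # 102400
    print(overflows_before_compaction(ASSUMED_WINDOW, ACTUAL_WINDOW))  # True
```

Under these assumptions, a trigger computed from the 200K fallback sits at 160,000 tokens, which is past the real 128,000-token limit, so the request fails before compaction can run. Computing the trigger from the real window moves it to 102,400 tokens.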
Setup
- Create an account at moonshot.ai and generate an API key
- Run the setup command:
claude-multi add kimi
- Paste your API key when prompted
The template configures the base URL, model mapping, thinking parameters, context limits, and compaction thresholds.
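As a rough mental model, the values the template sets can be pictured as one small record with a few consistency checks. The sketch below is hypothetical: the class, its field names, the model identifier, and the placeholder URL are all invented for illustration and do not reflect how claude-multi actually stores its configuration. Only the numbers come from this page. The rule that the thinking budget must fit inside the output limit is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderTemplate:
    """Hypothetical shape of a provider template; all field names are illustrative."""
    base_url: str            # Anthropic-compatible endpoint (exact URL not given here)
    model: str               # single model mapped to every role
    context_window: int      # tokens
    max_output_tokens: int   # tokens
    reasoning_effort: str
    thinking_tokens: int     # thinking budget, tokens

    def problems(self) -> list[str]:
        """Return human-readable consistency issues; empty list means none found."""
        issues = []
        if self.max_output_tokens > self.context_window:
            issues.append("max output exceeds context window")
        if self.thinking_tokens >= self.max_output_tokens:
            issues.append("thinking budget leaves no room for the answer")
        if self.reasoning_effort not in {"low", "medium", "high"}:
            issues.append("unknown reasoning effort")
        return issues


kimi = ProviderTemplate(
    base_url="https://example.invalid/anthropic",  # placeholder, not the real endpoint
    model="kimi-k2.5",                             # assumed identifier
    context_window=128_000,                        # 128K, from the specs table
    max_output_tokens=65_536,                      # from the specs table
    reasoning_effort="high",
    thinking_tokens=16_000,
)

if __name__ == "__main__":
    print(kimi.problems())  # []
```

Keeping the limits in one record makes it easy to see that the 16,000-token thinking budget fits inside the 65,536-token output limit, which in turn fits inside the 128K window.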
When to pick Kimi
Kimi is a strong choice for interactive, tool-heavy workflows. If you spend most of your Claude Code time in agentic mode (reading files, running commands, editing code in sequence), K2.5 handles that loop well. It is also competitive on price.
The 128K context window is sufficient for most day-to-day development. If you regularly work with codebases larger than 100K tokens, consider MiniMax or DeepSeek for their 1M windows.
Kimi is pay-per-token only. There is no subscription plan.
Pricing details
Moonshot charges per token with no minimums. Check moonshot.ai for current pricing.