DeepSeek Provider for Claude Code
Frontier coding at per-token pricing with a 1M context
claude-multi add deepseek Use cases
DeepSeek-V4-Pro is a frontier coding model with a 1M token context window and strong performance across benchmarks. DeepSeek-V4-Flash handles fast tasks at a fraction of the cost. Both are accessible through a native Anthropic-compatible endpoint, which means zero friction with Claude Code.
Model specs
| Role | Model | Context | Max Output |
|---|---|---|---|
| Primary (Opus/Sonnet) | DeepSeek-V4-Pro | 1M | 128K |
| Fast (Haiku) | DeepSeek-V4-Flash | 1M | 128K |
The template maps V4-Pro to Opus and Sonnet roles for heavy lifting, and V4-Flash to Haiku and small/fast roles for quick tasks. Subagent work also uses V4-Flash, keeping background operations cheap.
Thinking mode is enabled with REASONING_EFFORT: high and 32,000 thinking tokens. Effort level is set to max.
Setup
- Create an account at deepseek.com and generate an API key
- Run the setup command:
claude-multi add deepseek- Paste your API key when prompted
The template handles the base URL, model mappings, thinking parameters, and output limits.
When to pick DeepSeek
DeepSeek is the default recommendation for developers who want frontier performance without a subscription. The pay-per-token model means you only pay for what you use, and the pricing is competitive across the board.
V4-Pro handles complex coding tasks at the same quality tier as much more expensive models. V4-Flash is fast enough for interactive autocomplete, shell commands, and subagent work where latency matters more than depth.
If you prefer a fixed monthly cost over variable billing, the GLM Coding Plan is the subscription alternative.
Pricing details
DeepSeek charges per token with no minimum commitment. Rates are among the lowest for frontier-class models. Check deepseek.com for current pricing.
Related providers
Pay-per-token via deepseek.com