
Alibaba Qwen Provider for Claude Code

Three-tier coder models from Alibaba Cloud

Quick start

claude-multi add qwen

Use cases

Full-stack web development

Code completion and generation

Testing and CI pipeline work

Multi-language codebase navigation

Qwen3-Coder is Alibaba’s coding model family with three tiers: Next for heavy reasoning, Plus for balanced work, and Flash for speed. The DashScope API exposes a native Anthropic-compatible endpoint, so Claude Code connects directly.

Model specs

Role	Model	Context (tokens)	Max output (tokens)
Primary (Opus)	Qwen3-Coder-Next	128K	65,536
Balanced (Sonnet)	Qwen3-Coder-Plus	128K	65,536
Fast (Haiku)	Qwen3-Coder-Flash	128K	65,536

Each tier maps to the corresponding Claude Code role. Heavy reasoning goes to Next, everyday coding to Plus, and quick tasks to Flash. Subagent work also uses Flash.
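
The mapping above can be sketched as a small lookup. This is an illustration only, not the template's actual contents: the model names are copied from the table on this page (the identifiers the API expects may be spelled differently), and `pick_model` is a hypothetical helper.

```python
# Role-to-model mapping as described in the specs table.
# Names are copied from this page; exact API identifiers may differ.
ROLE_TO_MODEL = {
    "opus": "Qwen3-Coder-Next",    # primary: heavy reasoning
    "sonnet": "Qwen3-Coder-Plus",  # balanced: everyday coding
    "haiku": "Qwen3-Coder-Flash",  # fast: quick tasks
}

# Subagent work is routed to the fast tier.
SUBAGENT_MODEL = ROLE_TO_MODEL["haiku"]


def pick_model(role: str, is_subagent: bool = False) -> str:
    """Return the Qwen model for a Claude Code role (hypothetical helper)."""
    if is_subagent:
        return SUBAGENT_MODEL
    try:
        return ROLE_TO_MODEL[role.lower()]
    except KeyError:
        raise ValueError(f"unknown role: {role!r}") from None


print(pick_model("opus"))
print(pick_model("sonnet", is_subagent=True))
```

Routing subagents through a separate constant keeps the rule in one place if you later decide to run them on a different tier.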

Thinking mode is enabled with REASONING_EFFORT set to high and a budget of 16,000 thinking tokens. Auto-compaction is tuned for the 128K context window.
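
To see how these numbers interact, here is a rough budget sketch. It rests on assumptions not stated on this page: that thinking tokens count against the 65,536 max output, that 128K means 131,072 tokens, and that compaction triggers at a fraction of the window. The 0.8 threshold is a placeholder, not the template's real setting, and both helpers are hypothetical.

```python
MAX_OUTPUT_TOKENS = 65_536   # per-model max output from the specs table
THINKING_TOKENS = 16_000     # thinking budget set by the template
CONTEXT_WINDOW = 128 * 1024  # "128K", assumed here to mean 131,072 tokens


def visible_output_budget(max_output: int, thinking: int) -> int:
    """Tokens left for the visible answer, ASSUMING thinking counts against max output."""
    if thinking >= max_output:
        raise ValueError("thinking budget must be smaller than max output")
    return max_output - thinking


def should_compact(used_tokens: int, window: int, threshold: float = 0.8) -> bool:
    """True once usage reaches a fraction of the window (0.8 is a placeholder)."""
    return used_tokens >= window * threshold


print(visible_output_budget(MAX_OUTPUT_TOKENS, THINKING_TOKENS))
print(should_compact(110_000, CONTEXT_WINDOW))
```

Under these assumptions a single response still has roughly three quarters of its output budget available after thinking.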

Setup

1. Create an account at Alibaba DashScope (international endpoint) and generate an API key.
2. Run the setup command:

claude-multi add qwen

3. Paste your API key when prompted.

The template configures the international endpoint at dashscope-intl.aliyuncs.com. If you are in mainland China, you may want to use the domestic endpoint instead.
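
If you script your own configuration, the region choice can be reduced to a host lookup. This is a minimal sketch: the international host comes from this page, the mainland host `dashscope.aliyuncs.com` is an assumption you should confirm in the DashScope documentation, and `dashscope_host` is a hypothetical helper. The URL path of the Anthropic-compatible endpoint is not given here, so it is left out.

```python
INTL_HOST = "dashscope-intl.aliyuncs.com"  # from this page
CN_HOST = "dashscope.aliyuncs.com"         # ASSUMPTION: mainland China host


def dashscope_host(region: str = "intl") -> str:
    """Return the DashScope host for a region key ("intl" or "cn")."""
    hosts = {"intl": INTL_HOST, "cn": CN_HOST}
    if region not in hosts:
        raise ValueError(f"unknown region: {region!r}")
    return hosts[region]


print(dashscope_host())
print(dashscope_host("cn"))
```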

Coding Plan alternative

Alibaba also offers a Coding Plan subscription with its own endpoint. If you prefer a monthly commitment over pay-per-token, use the qwen-coding template:

claude-multi add qwen-coding

The model mappings are identical. Only the base URL changes.

When to pick Qwen

Qwen is a solid choice if you want tiered model quality at different price points. Flash is fast and cheap for autocomplete and simple edits. Plus handles most coding tasks well. Next brings the deepest reasoning for architecture decisions and complex debugging.

With pay-per-token billing, cost scales with the tokens each tier actually consumes. Background work runs on Flash at a lower rate, while complex tasks get the full power of Next.

Pricing details

DashScope charges per token with no minimums. Each tier has its own rate: Flash is cheapest, Next is most expensive. Check the DashScope pricing page for current rates.
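
A rough cost estimate for a mixed workload might look like the sketch below. The rates are placeholders chosen only to respect the ordering described above (Flash cheapest, Next most expensive); they are not DashScope prices. The `Rate` class, the tier keys, and the `cost` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    input_per_mtok: float   # USD per million input tokens
    output_per_mtok: float  # USD per million output tokens


# PLACEHOLDER rates, NOT real DashScope prices. Replace with current rates.
PLACEHOLDER_RATES = {
    "flash": Rate(1.0, 2.0),
    "plus": Rate(2.0, 4.0),
    "next": Rate(4.0, 8.0),
}


def cost(tier: str, input_tokens: int, output_tokens: int,
         rates: dict = PLACEHOLDER_RATES) -> float:
    """Per-token cost for one tier: tokens times rate, with no minimum charge."""
    rate = rates[tier]
    return (input_tokens * rate.input_per_mtok
            + output_tokens * rate.output_per_mtok) / 1_000_000


# Example: heavy background traffic on Flash, a few deep tasks on Next.
workload = [("flash", 2_000_000, 500_000), ("next", 200_000, 100_000)]
total = sum(cost(tier, inp, out) for tier, inp, out in workload)
print(f"estimated total: ${total:.2f}")
```

Swapping in the current rates from the DashScope pricing page turns this into a usable estimate for your own token mix.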

Alternatives to compare:

DeepSeek - similar positioning, 1M context
GLM - subscription pricing, also a Chinese provider
MiniMax - larger context window

Pricing

Pay-per-token via Alibaba DashScope

Related providers

glm - Frontier reasoning models on a fixed monthly plan
minimax - 1M context window with 512K output tokens
deepseek - Frontier coding at per-token pricing with a 1M context