claude-multi Blog

v0.8.1: Fixing the Broken npm Package

hmziqrs — Wed, 03 Jun 2026 00:00:00 GMT

# v0.8.1: Fixing the Broken npm Package

You installed `claude-multi` globally. You ran it. You got this:

```
error: Module not found "/Users/.../claude-multi/bin/../dist/cli.js"
```

That's not a missing dependency. The file was never there.

## Two builds, one directory

claude-multi has two build steps. The CLI compiles from TypeScript into a single JS bundle. The documentation site compiles from Astro into static HTML. We had both outputting into `dist/`.

The build script in `package.json`:

```json
"build": "bun build src/cli.ts --outdir dist --target node --format esm"
```

The docs build in `astro.config.mjs` uses Astro's default output directory, which is also `dist/`.

When the package got published, the `files` field said to include `dist/cli.js`. But `dist/` was full of HTML files from the docs build: `index.html`, `_astro/`, fonts, images, sitemap XML. The whole rendered site. No `cli.js` anywhere.

The shell wrapper tried to load `$DIR/../dist/cli.js`, found HTML instead, and threw a module resolution error.

## The fix

Moved the CLI build output from `dist/` to `build/`. Three files changed in the source, two more in CI.

`package.json` now says `--outdir build` and lists `build/cli.js` in the `files` array. The shell wrapper in `bin/claude-multi.js` resolves `../build/cli.js` instead of `../dist/cli.js`. The CI workflows that verify the build artifact now check `build/cli.js`.

Astro keeps `dist/`. The CLI gets its own directory. They don't touch each other.

## Upgrading

If you're on 0.8.0 and hitting the error:

```
bun add -g claude-multi@0.8.1
```

The tarball now contains exactly seven files. No docs artifacts leaking in.

---

Full changelog: [CHANGELOG.md](https://github.com/hmziqrs/claude-multi/blob/master/CHANGELOG.md). Provider reference: [/docs/providers/](/docs/providers/).
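The kind of check the CI step performs can be sketched in a few lines. This is a hypothetical helper, not the project's actual workflow: `missingPublishFiles` is an invented name, and it only handles literal paths, whereas a real `files` array may contain globs or directories.

```typescript
import { existsSync, mkdirSync, mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: return every `files` entry that is absent on disk.
// Only literal paths are handled; globs and directories are out of scope.
function missingPublishFiles(packageRoot: string, files: string[]): string[] {
  return files.filter((entry) => !existsSync(join(packageRoot, entry)));
}

// Reproduce the 0.8.0 situation in a throwaway directory:
// the docs build filled dist/ with HTML and no cli.js was present.
const root = mkdtempSync(join(tmpdir(), "publish-check-"));
mkdirSync(join(root, "dist"));
writeFileSync(join(root, "dist", "index.html"), "<html></html>");

const missing = missingPublishFiles(root, ["dist/cli.js"]);
if (missing.length > 0) {
  console.error(`Refusing to publish, missing: ${missing.join(", ")}`);
}
```

Running a check like this before `npm publish` would have flagged the missing `dist/cli.js` instead of shipping a tarball full of HTML.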

v0.8.2: Granular Sync Modes and Responsive Web

hmziqrs — Wed, 03 Jun 2026 00:00:00 GMT

# v0.8.2: Granular Sync Modes and Responsive Web

Auto-sync was a binary choice. Either `plugins/` and `skills/` were symlinked to `~/.claude` (auto-sync on), or they were independent copies (auto-sync off). Two options. Pick one.

The problem with auto-sync: every instance sees every plugin you install in `~/.claude`, immediately. You install something in your default Claude, and every `claude-*` alias gets it whether you want it or not.

The problem with manual mode: you have to copy new plugins by hand every time. Install a plugin in `~/.claude`, then manually copy it to each instance that needs it. Tedious.

## Three modes instead of two

Auto works the same as before. The entire `plugins/` and `skills/` directories are symlinked to `~/.claude/plugins` and `~/.claude/skills`. Any change in `~/.claude` is instantly visible to the instance.

Full-manual works the same as the old "manual" mode. Independent copies of everything. No symlinks.

Half-manual is the new one. The `plugins/` and `skills/` directories are real directories, not symlinks. But each plugin and skill inside them is individually symlinked back to `~/.claude`. You get the existing plugins from your default installation, but new installs in `~/.claude` don't automatically appear. You control what shows up.

The function behind this is `halfSyncPluginsAndSkills()` in `config.ts`. When switching from auto to half-manual, it removes the whole-directory symlink, creates a real directory in its place, then iterates over every item in `~/.claude/plugins` and `~/.claude/skills` and creates individual relative symlinks for each one.

Items that already exist in the instance directory are not overwritten. If you added your own plugin to the instance, it stays. If you later install a new plugin in `~/.claude`, it does not appear until you explicitly re-sync.

Re-syncing is available in the TUI. The ToggleAutoSync screen now has a "Force re-sync" option for auto and half-manual modes.
It rebuilds the symlinks without changing the mode.

## Downgrade only

You can go auto to half-manual to full-manual. You cannot go back up.

Going back up would require reconciling diverged directories, which is a data loss problem. If the instance has its own plugins that don't exist in `~/.claude`, going back to auto-sync would lose them under a symlink.

`canConvertSyncMode()` in `constants.ts` enforces this. It uses an ordered array and checks that the target mode has a higher index than the current one.

## CLI and TUI changes

The `add` command got new flags:

```
claude-multi add my-instance --sync-mode half-manual
claude-multi add my-instance --half-manual
claude-multi add my-instance --auto-sync
claude-multi add my-instance --manual
```

`--sync-mode` accepts `auto`, `half-manual`, or `full-manual`. The other flags are shortcuts. Specifying more than one is an error.

The `auto-sync` command now accepts mode names:

```
claude-multi auto-sync my-instance half-manual
```

Legacy `on`/`off` still works. `on` maps to `auto`, `off` maps to `full-manual`.

The TUI ToggleAutoSync screen is now a Sync Mode screen. Current mode gets a color label (green for auto, cyan for half-manual, yellow for full-manual), and it shows which downgrades are available.

`plugin install/remove` is blocked when the instance is in half-manual mode, because individually symlinked plugins can't be individually managed.

## The web side

Five files changed on the docs site.

The header got a hamburger dropdown. Below 60rem, the nav links disappear and a `nav-dropdown` custom element takes over. Frosted glass panel (`backdrop-filter: blur(20px) saturate(140%)`), Escape to close, click-outside to close, proper ARIA menu roles. On desktop, nothing changed.

The sidebar is sticky now. On desktop (72rem+), the `.page` container is a CSS grid where the sidebar and main content overlap in the same grid cell. The sidebar uses `position: sticky` instead of fixed.
The difference: a sticky element stops at the boundary of its scroll container, so the sidebar scrolls with the page until it hits the footer, then stops. No overlap. On mobile and tablet, Starlight's default layout is untouched.

The footer used to be trapped inside Starlight's `.page` wrapper, sitting in the sidebar column instead of spanning the viewport. The fix moves it outside `.page` entirely. Dev mode uses Astro middleware, the static build uses an integration `buildDone` hook. The footer HTML comes from `site-footer-html.ts`, shared between marketing and docs pages.

Both the header and the right sidebar (table of contents) pin with `position: sticky`. Header at `top: 0`, TOC at `top: var(--sl-nav-height)`.

## Backward compat

Existing instances keep their current behavior. The config has a new `syncMode` field, but `getSyncMode()` falls back to the old `autoSync` boolean if it's not set. `autoSync: false` resolves to `full-manual`. Everything else resolves to `auto`. Both fields are written on update so the config works with old and new versions.

## Upgrading

```
bun add -g claude-multi@0.8.2
```

Existing instances keep their current mode. To switch:

```
claude-multi auto-sync my-instance half-manual
```

22 files changed. 17 for sync modes, 5 for the web. The `SyncMode` enum, `canConvertSyncMode()`, `availableSyncModeConversions()`, and `halfSyncPluginsAndSkills()` are new exports in `constants.ts`.

---

Full changelog: [CHANGELOG.md](https://github.com/hmziqrs/claude-multi/blob/master/CHANGELOG.md). Provider reference: [/docs/providers/](/docs/providers/).
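The downgrade-only rule and the backward-compat fallback described above can be sketched as follows. This is a minimal illustration, not the shipped code: it models `SyncMode` as a string union (the post calls it an enum), and the `InstanceConfig` shape and the `SYNC_MODE_ORDER` name are assumptions.

```typescript
// Hypothetical sketch; the real definitions in claude-multi may differ.
type SyncMode = "auto" | "half-manual" | "full-manual";

// Ordered from most shared to most independent.
const SYNC_MODE_ORDER: readonly SyncMode[] = ["auto", "half-manual", "full-manual"];

// A conversion is allowed only when the target sits later in the order.
function canConvertSyncMode(current: SyncMode, target: SyncMode): boolean {
  return SYNC_MODE_ORDER.indexOf(target) > SYNC_MODE_ORDER.indexOf(current);
}

// Every mode the current one may still be downgraded to.
function availableSyncModeConversions(current: SyncMode): SyncMode[] {
  return SYNC_MODE_ORDER.filter((mode) => canConvertSyncMode(current, mode));
}

// Assumed config shape: `syncMode` is new, `autoSync` is the legacy boolean.
interface InstanceConfig {
  syncMode?: SyncMode;
  autoSync?: boolean;
}

// Prefer the new field; otherwise only an explicit `autoSync: false`
// maps to full-manual, and everything else maps to auto.
function getSyncMode(config: InstanceConfig): SyncMode {
  if (config.syncMode !== undefined) return config.syncMode;
  return config.autoSync === false ? "full-manual" : "auto";
}
```

With this ordering, `availableSyncModeConversions("half-manual")` yields only `full-manual`, and a config written before 0.8.2 with no `syncMode` key still resolves to a valid mode.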

MiniMax M3 for Claude Code: 1M Context, Benchmarks, Pricing

hmziqrs — Mon, 01 Jun 2026 00:00:00 GMT

MiniMax released M3 yesterday. The short version: 1M-token context window, frontier coding scores, native image and video input, toggleable thinking. The claude-multi `minimax` template now uses it. Existing instances can sync with `claude-multi doctor fix`.

The longer version is worth reading. M3 clears Opus 4.7 on three benchmarks and costs about 1/30th the price. It is also the first open-weight model to ship with a 1M context window, frontier-level coding, and native multimodality all at once.

## What changed in the template

The model names in the `minimax` template went from `MiniMax-M2.7` to `MiniMax-M3` across all model slots (opus, sonnet, haiku, small/fast). If you created a MiniMax instance before this update, running `claude-multi doctor fix` will sync the new template automatically ([how template sync works](/blog/v064-provider-template-sync/)).

Three things are different this time:

**No more auto-compaction override.** M2.7 had a 128K context window. Claude Code assumes 200K for unrecognized models, which meant auto-compaction never fired and context would fill to 100% before crashing. M3 has a 1M context window. The override is gone.

**Output tokens up to 512K.** M2.7 capped at 64K. M3 supports up to 512K output tokens, which matters for the long-horizon agentic tasks the model is built for.

**Effort level set to `max`.** M3 supports toggleable thinking. The template enables thinking with `REASONING_EFFORT: "high"` and sets `CLAUDE_CODE_EFFORT_LEVEL: "max"`.

## The M3 architecture: MSA

The headline technical feature is MiniMax Sparse Attention (MSA). Standard attention scales quadratically with context length. MSA replaces full attention with KV-block selection, where an index branch scores blocks of key-value pairs and a sparse branch only computes attention on the selected blocks.

The practical result: at 1M tokens, M3's per-token compute is 1/20th of the previous generation. Prefilling is 9x faster. Decoding is 15x faster.
MiniMax claims MSA matches full attention on the vast majority of capabilities across their ablations. M3 was designed around sparse attention from the start. The compute cost does not explode at long context the way it does with full attention, which is what makes the 1M window usable in practice rather than a spec sheet number.

## MiniMax M3 benchmarks

The coding and agentic benchmarks are what matter most for Claude Code users.

### Coding

| Benchmark | M3 | Opus 4.7 | GPT-5.5 | Gemini 3.1 Pro | M2.7 |
|---|---|---|---|---|---|
| SWE-Bench Pro | 59.0 | **64.3** | 58.6 | 54.2 | 56.2 |
| SWE-Bench Verified | 80.5 | **87.6** | 82.9 | 80.6 | 79.9 |
| Terminal-Bench 2.1 | 66.0 | 66.1 | **78.2** | 70.3 | 51.1 |
| SVG-Bench | **63.7** | 62.3 | 58.2 | 59.2 | 48.0 |
| KernelBench Hard | 28.8 | **30.7** | 20.9 | 18.6 | 10.5 |
| PaperBench | 52.6 | **58.5** | 57.5 | 46.7 | 30.6 |

M3 beats GPT-5.5 on SWE-Bench Pro (59.0 vs 58.6) and edges past Opus 4.7 on SVG-Bench (63.7 vs 62.3). Opus still leads the main SWE-Bench scores. But M3 moved up from M2.7's mid-pack results to first or second place on three of the six coding benchmarks (SVG-Bench, SWE-Bench Pro, KernelBench Hard). Where it trails Opus on those (SWE-Bench Pro and KernelBench Hard), the gap to Opus is narrower than the gap between Opus and the remaining models.

### Agentic

| Benchmark | M3 | Opus 4.7 | GPT-5.5 | M2.7 |
|---|---|---|---|---|
| Claw-Eval | **74.5** | 71.6 | -- | 49.7 |
| MCP Atlas | 74.2 | **77.0** | 75.3 | 49.4 |
| DRACO | 73.2 | **77.7** | -- | 66.8 |
| BankerToolBench | 76.1 | **81.3** | 70.0 | 63.9 |

Claw-Eval is the end-to-end autonomous agent evaluation. M3 takes the top spot at 74.5, ahead of Opus 4.7 at 71.6. This is the benchmark that most closely matches what Claude Code does: sustained multi-step tool use in a real environment. M3 was trained for multi-turn production-like collaboration using an interactive user-simulator framework.
### Multimodal

| Benchmark | M3 | Opus 4.7 | GPT-5.5 | Gemini 3.1 Pro |
|---|---|---|---|---|
| OmniDocBench | **91.6** | 89.3 | 87.5 | 88.1 |
| MMMU-Pro | 78.1 | 77.0 | **81.2** | 80.5 |
| Video-MMMU | 84.6 | 83.0 | 86.4 | **87.9** |

OmniDocBench measures multimodal document understanding across text, tables, charts, and images. M3 leads at 91.6. If you use Claude Code for document-heavy workflows, M3 can ingest the paper, figures, tables, and formulas all at once within its 1M context window.

### The jump from M2.7

The upgrade from M2.7 to M3 is massive. On SWE-Bench Pro, M3 jumps from 56.2 to 59.0. On KernelBench Hard, it nearly triples from 10.5 to 28.8. On Claw-Eval, it goes from 49.7 to 74.5. On SVG-Bench, from 48.0 to 63.7. PaperBench goes from 30.6 to 52.6. Across every benchmark, M3 is a different class of model than M2.7.

## MiniMax M3 pricing and Token Plans

Through the MiniMax API:

| Tier | Input (per 1M) | Output (per 1M) | Context |
|---|---|---|---|
| Standard (up to 512K) | $0.60 | $2.40 | up to 512K |
| Long context (512K to 1M) | $1.20 | $4.80 | 512K to 1M |
| Cache read | $0.12 | -- | -- |

For comparison, Claude Opus 4.7 runs about $15/M input and $75/M output. M3 at $2.40/M output is roughly 1/30th the cost. Even at the long-context tier ($4.80/M output), it is still a fraction of what Opus charges.

MiniMax also offers Token Plans (subscription):

- **Plus**: $20/month for about 1.7 billion tokens
- **Max**: $50/month for about 5.1 billion tokens
- **Ultra**: $120/month for about 9.8 billion tokens

Both Token Plan and pay-per-token use the same `api.minimax.io` endpoint. The API key type determines which quota is consumed.

## Real-world demonstrations

MiniMax published three extended task runs that show what 1M context plus frontier coding looks like in practice.

**CUDA kernel optimization.** M3 optimized an FP8 GEMM kernel on NVIDIA Hopper GPUs over 24 hours. 147 submissions, 1,959 tool calls.
Hardware peak utilization went from 7.6% to 71.3%, a 9.4x speedup. Most models stopped improving within 30 submissions. M3's best solution showed up on submission 145. The tool call history gets dense and structured fast, and MSA's sparse attention keeps the model focused on what matters as the conversation grows.

**Paper reproduction.** M3 autonomously reproduced an ICLR 2025 Outstanding Paper over 12 hours. 18 commits, 23 experimental figures. The paper's text, formulas, and figures all fit in context at once. The multimodal input handled the curves and charts natively.

**Training models from scratch.** On PostTrainBench, M3 was given four base models and told to synthesize data, train, evaluate, and iterate, all without human intervention. It scored 0.37, compared to Opus 4.7 at 0.42 and GPT-5.5 at 0.39.

## Getting started

**New instance:**

```bash
claude-multi add minimax --provider minimax --api-key sk-...
```

**Existing instance (sync to M3):**

```bash
claude-multi doctor check    # see what needs updating
claude-multi doctor fix      # re-applies the latest template
```

Or from the TUI: press `!` to open the health screen, then `f` to fix. The sync preserves your API key and any custom env vars you set.

---

Provider reference with model mappings, endpoints, and plan notes: [/docs/providers/](/docs/providers/). Environment variable reference: [/docs/environment-variables/](/docs/environment-variables/). Template source: [src/templates.ts](https://github.com/hmziqrs/claude-multi/blob/master/src/templates.ts). For background on how the minimax template was originally added, see [Five New Provider Templates](/blog/five-new-provider-templates/).

v0.6.5: Action Buttons, Health Screen Fix, Hardened Migrations

hmziqrs — Sat, 30 May 2026 00:00:00 GMT

# v0.6.5: Action Buttons, Health Screen Fix, Hardened Migrations

Instance details grew action buttons. The health screen had broken dismiss actions, now fixed. And migrations got more resilient to crashes and race conditions.

## Action buttons in instance details

The instance details screen was a read-only wall of text. You could see your instance's config, version, plugins, MCP servers. But to do anything, you had to go back to the main menu and find the right screen.

Now when you open an instance's details and press Enter, you get an actions menu:

- Update settings template: re-syncs the provider template env vars (model names, thinking limits, base URL) to whatever the latest template defines. Your API key and custom tunable vars are preserved.
- Update alias wrapper: regenerates the wrapper script at `~/.local/bin/claude-` to match the current standard.
- Override alias to standard: only shows up when the wrapper is mismatched or missing. Force-regenerates it.

Each action shows a live status indicator next to it. ✓ up to date, ⚠ mismatch detected, ✗ wrapper missing. You can tell whether an instance needs attention before picking an action.

This is the per-instance equivalent of `claude-multi doctor fix`. Targeted at a single instance, with visual feedback about what's wrong.

## Health screen: dismiss actually works now

The health screen (press `!` from the main menu) had a bug where pressing `d` to dismiss an issue did nothing. The handler checked a `selectedIssue` variable that was never set. There was no way to select an issue from the list. Dead code the whole time.

The fix replaces the old static issue cards with a Select menu. You pick an issue with arrow keys and Enter, see its full details (severity, instance name, resolution hint), then press `d` to dismiss it. `D` still dismisses all at once.

There was also a bug where the "Instance migrations pending" warning kept showing after running `doctor fix`.
The migration saved the updated config to disk, but the app's in-memory state wasn't refreshed before health checks re-ran. The fix: call `reload()` after migration, so the health check sees the fresh config version and stops reporting a stale warning.

## Migration hardening

The migration system got three infrastructure fixes.

Stale lock detection. If claude-multi crashes mid-migration (SIGKILL, power loss, whatever), a lock file stays behind at `~/.claude-multi/.migration.lock`. The old code checked whether the PID in the lock was still running, but PIDs get recycled on long-running systems. Now there's a 30-minute staleness check: if the lock is older than half an hour, it gets removed regardless of what the PID check says.

Atomic health status writes. The health status file at `~/.claude-multi/health-status.json` was written with plain `writeFileSync`. If the process crashed mid-write, or if two claude-multi instances wrote at the same time, the file could end up corrupted. Now it writes to a `.tmp` file first, verifies the JSON parses, then renames it into place.

Doctor fix double-fire guard. Pressing `f` on the health screen calls `handleDoctorFix`, which is async. If you hit `f` twice fast, the function would run twice concurrently. The second run was harmless (migrations are idempotent), but it caused an unnecessary reload and a misleading result screen. Now there's a `doctorRunning` flag that prevents concurrent invocations.

## Under the hood

`syncProviderTemplateForInstance()` in `config.ts` is the on-demand version of the v0.6.3 migration logic. Takes an instance, detects its provider, re-applies the latest template while preserving API key and tunable env vars.

`detectTemplateMismatch()` and `detectWrapperMismatch()` in `instance-diagnostics.ts` are pure functions that compare an instance's current state against the expected template. The UI uses them to show mismatch indicators.

`TUNABLE_ENV_VARS` moved to `constants/env.ts`.
It's shared between diagnostics (excluded from comparison) and sync (preserved during update) so they can't drift apart.

---

Full changelog: [CHANGELOG.md](https://github.com/hmziqrs/claude-multi/blob/master/CHANGELOG.md). Provider reference: [/docs/providers/](/docs/providers/).
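Two of the migration hardening fixes above are easy to picture in code. The sketch below is an approximation, not the project's implementation: `isLockStale` and `writeJsonAtomic` are hypothetical names, and the lock age is passed in as a timestamp rather than read from the real lock file.

```typescript
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

const STALE_LOCK_MS = 30 * 60 * 1000; // the 30-minute threshold from the post

// Hypothetical helper: a lock is stale once it is older than the threshold,
// no matter what a PID liveness check reports.
function isLockStale(lockCreatedAtMs: number, nowMs: number): boolean {
  return nowMs - lockCreatedAtMs > STALE_LOCK_MS;
}

// Hypothetical helper: write to a temp file, verify it parses, then rename.
function writeJsonAtomic(targetPath: string, data: unknown): void {
  const tmpPath = `${targetPath}.tmp`;
  writeFileSync(tmpPath, JSON.stringify(data, null, 2), "utf8");
  JSON.parse(readFileSync(tmpPath, "utf8")); // throws if the temp file is corrupt
  renameSync(tmpPath, targetPath); // atomic on POSIX when both paths share a filesystem
}

// Usage in a throwaway directory instead of ~/.claude-multi.
const dir = mkdtempSync(join(tmpdir(), "health-"));
const statusPath = join(dir, "health-status.json");
writeJsonAtomic(statusPath, { issues: [] });
```

Because the rename only happens after the temp file has been parsed back, a reader of `health-status.json` sees either the previous complete file or the new complete file, never a half-written one.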

v0.7.0: No More Pinned Claude Binary

hmziqrs — Sat, 30 May 2026 00:00:00 GMT

# v0.7.0: No More Pinned Claude Binary

Here's what was happening. claude-multi installed its own private copy of `@anthropic-ai/claude-code` into `~/.claude-multi/bin/` via npm. Every wrapper script hardcoded the path to that copy. The idea was to pin a known-good version so third-party providers wouldn't break during the v2.1.154 incident.

Meanwhile your global `claude` kept auto-updating by rotating a symlink at `~/.local/share/claude/versions/`. But the wrappers stayed stuck on whatever `claude-multi` had pinned. You'd be on 2.1.158 globally while every `claude-*` alias was still running 2.1.156. Two versions behind. No way to close the gap without changing code and bumping a constant.

That pinned copy is gone. Every wrapper now resolves to whatever `claude` you have in PATH. One binary, one version, auto-updates work.

## What changed

`getClaudePath()` used to have three priorities: env override, pinned binary, global PATH. We ripped out the pinned binary check. Now it goes straight from env override to `which claude` (or `where claude` on Windows).

`getGlobalClaudePath()` and `tryGetGlobalClaudePath()` became identical to `getClaudePath()` and `tryGetClaudePath()` after the removal, so we deleted them. All callers updated.

We removed four things from `version.ts`: `COMPATIBLE_CLAUDE_VERSION`, `isThirdPartyApiBroken()`, `getPinnedBinaryVersion()`, and `installPinnedClaude()`. The pinned path constants `PINNED_BIN_DIR` and `PINNED_CLAUDE_BIN` are gone from `paths.ts`. 53 references across 15 files. Zero left.

## Health checks and doctor fix rewritten

This one was embarrassing.

The health check that validates wrapper binary paths used to compare against `PINNED_CLAUDE_BIN`. If a wrapper pointed to anything else, it flagged it as broken. The v0.6.2 migration started using the global path, which meant the health check would immediately flag every migrated wrapper as wrong. Then `doctor fix` would rewrite them back to the pinned binary.
The migration and the health check were fighting each other. Every time you ran doctor, it undid what the migration just did.

Both now resolve the current global claude path and compare against that. The check also picked up a Windows `.cmd` regex pattern so it can parse wrapper scripts on both platforms.

`doctor fix` used to install the pinned binary as its first step. Now it just checks that `claude` is available in PATH, tells you which binary it found, and regenerates any stale wrappers.

## Migration fixes

The v0.6.2 migration already called `tryGetGlobalClaudePath()` to point wrappers at the global binary. Since we deleted that function, the migration now calls `tryGetClaudePath()`. Same result, different name.

We added a `console.warn` for the case where claude can't be found during migration. Previously the wrapper regeneration just silently skipped. Now you actually hear about it.

The v0.6.3 migration had a bug with `TUNABLE_ENV_VARS`. It had a local copy of the set that was missing two entries: `CLAUDE_CODE_AUTO_COMPACT_WINDOW` and `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE`. If you had customized auto-compaction settings, that migration would overwrite them with template defaults. We replaced the duplicate with an import from `constants/env.ts` so the canonical set is the only set.

## Three Windows bugs

Found during the audit, not caused by this change, but fixed in the same pass.

`where claude` output was split on `\n` but Windows uses `\r\n`. The PATH membership check in the `add` command used `lastIndexOf('/')` and `split(':')` instead of `path.dirname()` and `path.delimiter`. And the health check had no regex for `.cmd` wrapper format.

## Upgrading from v0.6.5

Run `claude-multi doctor fix`. It regenerates all your wrapper scripts to point to your global `claude` binary.

Then delete the old npm install:

```
rm -rf ~/.claude-multi/bin/
```

That's about 100MB of duplicated Claude Code you don't need anymore.
Your instances follow whatever version your global `claude` is on now. When Claude Code auto-updates itself, all your aliases pick it up on the next launch.

## Numbers

15 files changed. 53 references removed. 4 bugs fixed. 242 tests passing.

---

Full changelog: [CHANGELOG.md](https://github.com/hmziqrs/claude-multi/blob/master/CHANGELOG.md). Provider reference: [/docs/providers/](/docs/providers/).
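The first two Windows bugs come down to hardcoded separators. Here is a rough sketch of the platform-aware versions, under stated assumptions: `firstPathFromLookup`, `isDirOnPath`, and the `PathLike` interface are invented for illustration, and the path flavour is passed in explicitly so both platforms can be exercised on any machine.

```typescript
import { posix, win32 } from "node:path";

// Minimal slice of the path API the check needs; both posix and win32 satisfy it.
interface PathLike {
  dirname(p: string): string;
  delimiter: string;
}

// Hypothetical helper: first match from `which`/`where` output,
// tolerating both "\n" and "\r\n" line endings.
function firstPathFromLookup(output: string): string | undefined {
  return output.split(/\r?\n/).find((line) => line.length > 0);
}

// Hypothetical helper: is the binary's directory listed in PATH?
// Uses dirname() and delimiter instead of lastIndexOf('/') and split(':').
function isDirOnPath(binaryPath: string, pathEnv: string, p: PathLike): boolean {
  const dir = p.dirname(binaryPath);
  return pathEnv.split(p.delimiter).some((entry) => entry === dir);
}

const found = firstPathFromLookup("C:\\Users\\me\\bin\\claude.cmd\r\nC:\\other\\claude.exe\r\n");
const onWindowsPath = isDirOnPath("C:\\Users\\me\\bin\\claude.cmd", "C:\\Windows;C:\\Users\\me\\bin", win32);
const onPosixPath = isDirOnPath("/home/me/.local/bin/claude", "/usr/bin:/home/me/.local/bin", posix);
```

Splitting on `\n` alone would leave a trailing `\r` on every Windows result, and splitting PATH on `:` would cut `C:\Windows` in half, which is why both checks failed only on Windows.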

Claude Code v2.1.156 fixes the third-party provider breakage

hmziqrs — Fri, 29 May 2026 00:00:00 GMT

# Claude Code v2.1.156 fixes the third-party provider breakage

Yesterday I wrote about how Claude Code v2.1.154 broke every third-party API provider. The bug was in the fallback logic: Claude Code's retry handler only caught HTTP 400, but providers return HTTP 422 for validation errors. The beta feature that sends `role: "system"` in the messages array would fail, the fallback would not trigger, and your conversation died.

Anthropic shipped v2.1.156 a few hours ago. It fixes this.

## What changed in v2.1.156

The fallback detection now catches both HTTP 400 and 422. When a provider rejects the `mid-conversation-system` beta with either status code, Claude Code removes the beta header and falls back to putting system instructions in `` blocks inside user messages. This is the same fallback that already worked for first-party API calls that happened to return 400.

Versions 2.1.154 and 2.1.155 still have the bug. Upgrade to 2.1.156 or later.

## What claude-multi users need to do

If you are on claude-multi v0.6.0 or later:

```bash
claude-multi doctor fix
```

This reinstalls the pinned Claude binary at `~/.claude-multi/bin/` to v2.1.156 and updates any wrapper scripts that still point to the old version.

If you are on an older version of claude-multi, update first:

```bash
bun update -g claude-multi
claude-multi doctor fix
```

## Auto-updates are back on

When the breakage hit, we disabled auto-updates in claude-multi to prevent Claude Code from silently updating to a broken version. Now that v2.1.156 fixes the issue, auto-updates are re-enabled. New instances created with provider templates will not have `DISABLE_AUTOUPDATER=1` or `DISABLE_UPDATES=1` injected into their settings.

If you have existing instances with those env vars in their `settings.json`, you can remove them manually or re-apply the provider template. They are harmless to leave in place; they just prevent Claude Code from auto-updating.
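The widened check described under "What changed in v2.1.156" can be pictured as a one-line predicate. This is a hypothetical sketch and not Claude Code's actual source: the names are invented, and the real handler may inspect more than the status code.

```typescript
// Hypothetical sketch, not Claude Code's actual source.
// Status codes treated as "the provider rejected the beta".
// The broken releases effectively had only 400 in this set.
const BETA_REJECTION_STATUSES: ReadonlySet<number> = new Set([400, 422]);

// Decide whether to retry the request without the beta header.
function shouldRetryWithoutBeta(httpStatus: number, betaEnabled: boolean): boolean {
  return betaEnabled && BETA_REJECTION_STATUSES.has(httpStatus);
}
```

With only 400 in the set, a provider answering 422 never triggers the retry, which matches the failure users saw in v2.1.154 and v2.1.155.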
## The safety infrastructure stays

We are keeping the version pinning and health check code in place, labeled with `[SAFE PARK]` comments throughout the codebase. If a future Claude Code update breaks third-party providers again, we can reactivate it by flipping a few constants:

- `isThirdPartyApiBroken()` in `src/version.ts` gets an updated version range
- `COMPATIBLE_CLAUDE_VERSION` gets the last working version number
- `PROVIDER_COMMON_ENV` in `src/templates.ts` gets `DISABLE_AUTOUPDATER` and `DISABLE_UPDATES` back
- `autoUpdates` in `src/config.ts` flips back to `false`

The `doctor fix` command and the TUI health screen will pick up the changes automatically. No new code needed.

This is the second time in recent months that a Claude Code update has broken provider compatibility. Keeping the safety code parked and ready seems prudent.

## TL;DR

- v2.1.156 fixes the HTTP 422 bug that broke third-party providers in v2.1.154 and v2.1.155
- Run `claude-multi doctor fix` to update your pinned binary
- Auto-updates are re-enabled for new instances
- The pinning safety code stays parked, ready to reactivate if it happens again

v0.6.3: Drop the Pinned Binary, Update Every Provider Template

hmziqrs — Fri, 29 May 2026 00:00:00 GMT

# v0.6.3: Drop the Pinned Binary, Update Every Provider Template

Two things in this release: moving away from the pinned Claude Code binary, and bringing every provider template up to date with correct thinking and output token limits.

## No more pinned binary

Every claude-multi wrapper script used to point at a pinned Claude Code binary living at `~/.claude-multi/bin/`. That was the right call during the v2.1.154 incident, when we needed to lock everyone to v2.1.153 to avoid the third-party provider breakage. But the incident is over. Claude Code v2.1.156 fixed it, and pinning a binary nobody remembers to update just means your instances fall behind.

I ran into this myself after the 0.6.2 release. The migration had run, the wrapper templates looked correct, but my instances were still executing v2.1.153 because the pinned binary had never been updated. The wrappers were pointing at the right file. The file was just the wrong version.

Starting with 0.6.3, instance migrations regenerate wrappers to point at your globally installed `claude`, the one `which claude` finds on your PATH. The migration checks whether the wrapper content actually differs before writing anything, so instances that are already correct don't get touched.

If you set `CLAUDE_MULTI_CLAUDE_PATH`, that still takes priority over everything else.

## Provider templates: thinking and output tokens

Every provider template now has explicit `MAX_THINKING_TOKENS`, `MAX_OUTPUT_TOKENS`, `ENABLE_THINKING`, and `REASONING_EFFORT` values matched to what each model actually supports. Before this, several templates left thinking disabled or used generic limits that didn't line up with the model's capabilities.

| Provider | Thinking tokens | Output tokens |
|---|---|---|
| MiniMax M2.7 | 32K | 64K |
| DeepSeek V4-Pro | 32K | 128K |
| MiMo V2.5-Pro (both plans) | enabled | 128K |
| Kimi K2.5 | 16K | 64K |
| Qwen3-Coder (both plans) | 16K | 64K |

## Model name fixes

Two model references changed:

- **MiMo** (pay-per-token and Token Plan): bumped from `mimo-v2.5-pro` to `mimo-v2.5-pro[1m]` and `mimo-v2.5` to `mimo-v2.5[1m]`. The `[1m]` variant gives you the full 1M-token context window instead of the default 128K.
- **Kimi**: opus model corrected from `kimi-k2.6` to `kimi-k2.5`. The K2.6 opus tier doesn't exist on the Anthropic-compatible endpoint. K2.5 is the right model.

If you created a MiMo or Kimi instance before this update, your `settings.json` still has the old model names. You can either run the instance migration (which updates the wrapper script) or edit the model fields by hand.

## Running the migration

```bash
claude-multi doctor check    # see what needs updating
claude-multi doctor fix      # apply fixes
```

Or from the TUI: press `!` to open the health screen, then `f` to fix.

## Under the hood

The migration system gained two new functions: `getGlobalClaudePath()` and `tryGetGlobalClaudePath()` in `wrapper.ts`. These resolve the claude binary from the env override or PATH, without checking the pinned binary. The instance migration uses these to generate wrappers that point at whatever `claude` you have installed.

The old `getClaudePath()` function still exists and still checks the pinned binary first. It's used when creating new instances. The health check's `fixWrapperVersions` also still targets the pinned binary for its "safe park" behavior. These are separate flows from the migration, and they work differently on purpose.

---

Full provider reference with model mappings, endpoint URLs, and plan notes is at [/docs/providers/](/docs/providers/). Changelog with all the details is at [CHANGELOG.md](https://github.com/hmziqrs/claude-multi/blob/master/CHANGELOG.md).

v0.6.4: Existing Instances Now Auto-Sync Provider Template Updates

hmziqrs — Fri, 29 May 2026 00:00:00 GMT

# v0.6.4: Existing Instances Now Auto-Sync Provider Template Updates

Here's the problem. When we update a provider template in claude-multi, the new model names and token limits only affect instances created after the update. Your existing MiMo instance still has `mimo-v2.5-pro` without the `[1m]` context window. Your DeepSeek instance is missing `MAX_THINKING_TOKENS`. Nobody goes back and edits `settings.json` by hand.

v0.6.4 fixes this with a new instance migration that detects which provider each instance uses and re-applies the latest template.

## How it works

The migration reads each instance's `settings.json`, extracts `ANTHROPIC_BASE_URL`, and matches it against the nine known provider templates (including MiMo Token Plan region variants). If it finds a match, it re-applies the latest template env vars as a spread merge:

- Template values (model names, thinking/output limits) overwrite existing keys
- Your API key survives
- Any env vars you added yourself survive
- If your base URL doesn't match any template, the migration skips that instance

The first time this runs, it also writes a `providerTemplate` field to the instance metadata in `config.json`. Future migrations use that field instead of re-detecting from the base URL.

## What this means for your instances

If you created an instance before the v0.6.3 template updates, running `claude-multi doctor fix` will now sync these changes:

- **MiMo**: `mimo-v2.5-pro` becomes `mimo-v2.5-pro[1m]` (1M context window)
- **DeepSeek**: gets `MAX_THINKING_TOKENS: "32000"` and `MAX_OUTPUT_TOKENS: "128000"`
- **Kimi**: opus corrected from `kimi-k2.6` to `kimi-k2.5`, gets thinking and output limits
- **MiniMax, Qwen**: get their respective thinking and output token limits

## Running it

```bash
claude-multi doctor check    # see what needs updating
claude-multi doctor fix      # apply fixes
```

Or from the TUI: press `!` to open the health screen, then `f` to fix.

---

Provider reference with model mappings, endpoints, and plan notes: [/docs/providers/](/docs/providers/). Full changelog: [CHANGELOG.md](https://github.com/hmziqrs/claude-multi/blob/master/CHANGELOG.md).

Claude Code v2.1.154 Broke Every Third-Party Provider. Here's What Happened.

hmziqrs — Thu, 28 May 2026 00:00:00 GMT

# Claude Code v2.1.154 Broke Every Third-Party Provider. Here's What Happened.

On May 28, 2026, Anthropic shipped Claude Code v2.1.154. Within hours, every non-Anthropic API provider stopped working. GLM, DeepSeek, MiniMax, Kimi, Qwen, MiMo -- all of them returned the same error:

```
API Error: 422 {"detail":[{"type":"literal_error","loc":["body","messages",1,"role"],
"msg":"Input should be 'user' or 'assistant'","input":"system",
"ctx":{"expected":"'user' or 'assistant'"}}]}
```

If you're using `claude-multi` to run Claude Code against a 3rd-party provider, you probably hit this.

## What changed

Claude Code v2.1.150 introduced a new Anthropic beta: `mid-conversation-system-2026-04-07`. It's been in every version since. This beta changes how Claude Code sends system instructions to the API.

Before, Claude Code used the top-level `system` parameter for system prompts. Every Anthropic-compatible API supports this:

```json
{
  "model": "claude-sonnet-4-20250514",
  "system": "You are a helpful assistant.",
  "messages": [
    {"role": "user", "content": "Hello"}
  ]
}
```

After the beta, Claude Code also sends system instructions as messages with `role: "system"` inside the `messages` array:

```json
{
  "model": "claude-sonnet-4-20250514",
  "system": "You are a helpful assistant.",
  "messages": [
    {"role": "user", "content": "Hello"},
    {"role": "system", "content": "Remember: be concise"},
    {"role": "assistant", "content": "Hi! How can I help?"}
  ]
}
```

The beta is opt-in via the `anthropic-beta` header. Claude Code enables it automatically for first-party API calls (api.anthropic.com). For third-party providers, it should be disabled. But the detection logic has a gap.

## Why providers reject it

The Anthropic Messages API spec allows `role: "system"` in the messages array when the beta header is present. Without the header, it's not valid. Most third-party providers implement the base spec without beta features.
Their validators reject any role that isn't `"user"` or `"assistant"`, and return 422.

## Claude Code has a fallback -- but it's incomplete

Anthropic anticipated this. Claude Code has built-in retry logic that detects when a provider rejects the `mid-conversation-system` beta. When it catches the error, it removes the beta header and falls back to wrapping system instructions in tagged blocks inside user messages.

The detection function (minified, from the binary):

```javascript
function pP8(error) {
  if (!Yy) return false;
  if (!(error instanceof APIError) || error.status !== 400) return false;
  // ... pattern matching on error message
}
```

`error.status !== 400`. The fallback only triggers on HTTP 400. Third-party providers return HTTP 422 for validation errors. The retry logic never fires.

This is the bug. The beta feature itself is fine for the first-party API. The error handling checks for one specific status code when providers use a different one for the same error.

## What actually happens

1. User runs `claude-glm` (or any claude-multi instance)
2. Claude Code detects it's using a third-party API
3. The `mid-conversation-system` beta should be disabled, but the detection has edge cases
4. Claude Code sends a request with `role: "system"` in the messages array
5. Provider returns HTTP 422
6. Claude Code's fallback bails out, because `error.status !== 400` is true for a 422
7. Error surfaces to the user
8. Conversation is dead

If the provider had returned 400 instead of 422, the fallback would have kicked in.

## How we fixed it in claude-multi

Three layers.

### Layer 1: Pin the Claude Code version

We maintain a pinned Claude Code installation at `~/.claude-multi/bin/`. All claude-multi instances use this instead of the global `claude` binary. Currently pinned to v2.1.153, the last version before the breakage.
```bash
npm install --prefix ~/.claude-multi/bin @anthropic-ai/claude-code@2.1.153
```

The `getClaudePath()` function checks this path first, before falling back to `which claude`.

### Layer 2: Disable auto-updates

We added `DISABLE_AUTOUPDATER=1` and `DISABLE_UPDATES=1` to every provider template's environment variables. Claude Code won't auto-update to a broken version.

From `src/templates.ts`:

```typescript
const PROVIDER_COMMON_ENV: Record<string, string> = {
  DISABLE_AUTOUPDATER: "1",
  DISABLE_UPDATES: "1",
};
```

These get merged into every instance's `settings.json` when created via a provider template.

### Layer 3: Doctor fix for existing instances

For instances created before the fix, we added a health check and repair system. The health check detects two wrapper formats:

1. **Shell format** (current): `exec "/path/to/claude" "$@"`
2. **Node.js format** (legacy): `spawn("/path/to/claude", ...)`

Both get flagged as version issues. The fix regenerates wrappers as clean shell scripts pointing to the pinned binary.

```bash
# CLI
claude-multi doctor check   # show issues
claude-multi doctor fix     # auto-fix

# TUI
claude-multi                # press ! for health screen, then f to fix
```

The TUI shows a banner when version issues are detected:

```
⚠ 1 error, 2 warnings — press ! to review
⚠ Some instances use a Claude version incompatible with 3rd-party APIs. Press ! to auto-fix.
```

## What providers should do

Return HTTP 400, not 422, for message validation errors. Claude Code's existing fallback handles 400. This is the fastest path to compatibility without waiting for a Claude Code patch.

## What Anthropic should do

Two things:

1. Broaden the status code check in the fallback logic. Let any 4xx through (`error.status >= 400 && error.status < 500`) instead of bailing out whenever `error.status !== 400`.
2. Provide an env var to disable specific betas. `CLAUDE_CODE_DISABLE_MID_CONVERSATION_SYSTEM=1` would let users opt out without downgrading.
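To make the first suggestion concrete, here is a small sketch comparing the strict and broadened checks. It is not Claude Code's real implementation: `ApiErrorLike`, the function names, and the message check are all assumptions for illustration, since the original snippet elides the actual pattern matching.

```typescript
// Hypothetical error shape; Claude Code's real APIError class is not shown in the source.
interface ApiErrorLike {
  status: number;
  message: string;
}

// Behaviour as described in the post: only HTTP 400 is eligible for the fallback.
function isEligibleStrict(error: ApiErrorLike): boolean {
  return error.status === 400;
}

// Suggested behaviour: any 4xx client error is eligible.
function isEligibleBroad(error: ApiErrorLike): boolean {
  return error.status >= 400 && error.status < 500;
}

// Illustrative message check, based on the 422 body quoted earlier in the post.
function mentionsRejectedSystemRole(error: ApiErrorLike): boolean {
  return error.message.includes("'user' or 'assistant'");
}

// Retry without the beta only when both the status and the message fit.
function shouldRetryWithoutBeta(error: ApiErrorLike): boolean {
  return isEligibleBroad(error) && mentionsRejectedSystemRole(error);
}

const providerError: ApiErrorLike = {
  status: 422,
  message: "Input should be 'user' or 'assistant'",
};

console.log(isEligibleStrict(providerError));       // strict check on a 422
console.log(shouldRetryWithoutBeta(providerError)); // broadened check on a 422
```

Keeping the message check alongside the wider status range means an unrelated 4xx, such as an auth failure, would still surface as an error instead of triggering a retry.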
## The broader issue

This is the second time in recent months that a Claude Code update has broken third-party provider compatibility. The first was a change in how model names are validated. Anthropic tests against their own API, and third-party compatibility is incidental.

If you rely on third-party providers, pin your Claude Code version and disable auto-updates.

## TL;DR

- Claude Code v2.1.150+ sends `role: "system"` in the messages array (new beta feature)
- Third-party providers reject it with HTTP 422
- Claude Code's fallback only handles HTTP 400
- Fix: pin to v2.1.153, disable auto-updates, run `claude-multi doctor fix`

Claude Code is doing more of the job now, and claude-multi makes the rest of it cheaper