MIMO Code: Xiaomi's Free, Open-Source Coding Agent That Beats Claude
Xiaomi just released MIMO Code as open-source. It outperforms Claude Code on long coding tasks. Here's why it matters for developers.

TL;DR
- MIMO Code (open-source, MIT) beats Claude Code on tasks 200+ steps: 65% win rate vs 35%
- Cost: 66% cheaper ($1 vs $3 per 1M tokens) + uses 40% fewer tokens on long tasks
- Why: Better architecture (checkpoints, persistent memory, goal verification)—not a smarter model
- Who benefits: Developers doing long refactoring jobs. Short tasks? Both tools are equal.
- Bottom line: Free harness + cheap tokens + open-source control. Try MiMo Auto (free trial, no signup)
Benchmark Reality: Where MIMO Wins
On June 10, 2026, Xiaomi's MiMo team released MIMO Code V0.1.0—open-source, MIT licensed, and outperforming Claude Code on long coding tasks.
Here's the raw data:
| Benchmark | MIMO Code | Claude Code | Gap |
|---|---|---|---|
| Terminal Bench 2.0 | 86.7% | 65.4% | +21.3% |
| SWE-bench Pro | 62% | 57% | +5% |
| A/B Win Rate (200+ steps) | 65% | 35% | +30% |
| A/B Win Rate (<200 steps) | 50% | 50% | — |
Source: VentureBeat, Xiaomi MIMO Blog, Internal A/B testing (576 developers, 474 repos)
The story: MIMO dominates on long-horizon tasks (200+ execution steps). On short tasks, they're equal. Real development work is long-horizon work.
Feature Comparison: Head-to-Head
| Feature | Claude Code | MIMO Code |
|---|---|---|
| Open-source | ⌠| ✅ MIT License |
| Cross-session memory | ⌠| ✅ (4-layer system) |
| Goal verification | ⌠| ✅ |
| Checkpointing | ⌠| ✅ |
| Long-horizon (200+ steps) | âš ï¸ Degrades | ✅ Optimized |
| Context window | 128K (Sonnet) | 1M (MiMo-V2.5) |
| Self-hosted option | ⌠| ✅ |
Pricing: The Real Numbers
| Aspect | Claude Code | MIMO Code |
|---|---|---|
| Harness Cost | Free | Free (Open-source MIT) |
| Model: Input Tokens | $3.0/1M (Sonnet 4.6) | $1.0/1M (V2.5) |
| Model: Output Tokens | $15/1M | $4.0/1M |
| Cost per 200-step task | ~$1.50 (500K tokens) | ~$0.30-0.45 (300K tokens) |
| Free Trial | None | MiMo Auto (limited time) |
| Control | Proprietary | Open-source, self-hosted |
Key insight: MIMO doesn't just cost less per token. It uses fewer tokens on long tasks because it doesn't waste context re-reading the same files and re-explaining state.
Developer ROI: Who Benefits and How Much?
Claude Code Users
Current situation: Paying $3/1M input tokens. Hitting context window issues on complex projects.
With MIMO:
- Cost savings: 66% reduction ($3 → $1 per 1M input tokens)
- Token efficiency: 40% fewer tokens on long tasks (smart memory prevents re-reads)
- Real math: A 200-step refactoring task costs ~$1.50 with Claude Code. With MIMO: ~$0.30-0.45.
- Time saved: No context resets = continuous agent reasoning instead of re-explaining state
Migration cost: Zero. MIMO can import your existing Claude Code configuration.
Verdict: Immediate ROI on long-horizon projects.
GitHub Copilot Users
Current situation: Line-by-line code suggestions. No agent loop. No project context.
With MIMO:
- Capability jump: From autocomplete → full coding agent
- Project understanding: MIMO maintains project memory. Copilot restarts each session.
- Task scope: Copilot handles 5-10 step tasks. MIMO handles 200-step refactoring.
Cost comparison:
- Copilot: $10-20/month
- MIMO: Pay-per-token (~$0.30-0.45 per complex task, or free with MiMo Auto trial)
Verdict: Copilot is line-level suggestions. MIMO is agent-level automation. Use both or choose MIMO for long projects.
Open-Source Tool Users (Custom Agents)
Current situation: Building custom agents in Python/JS. Maintaining your own checkpoint logic. No memory system.
With MIMO:
- Time saved: Don't rebuild checkpointing/memory architecture. MIMO includes it.
- Reliability: 4-layer memory system prevents context corruption on 200+ step tasks.
- Deployment: Terminal-native. One command to run.
Cost: Free harness + model cost (bring your own API keys).
Verdict: Massive time savings. MIMO is production-ready; your custom agent probably isn't.
Enterprise/Agentic AI Teams
Current situation: Evaluating Anthropic's Dynamic Workflows + Claude Code for long-horizon automation.
With MIMO:
- Open-source control: Fork it, customize it, run it on your infrastructure
- Cost at scale: 66% cheaper per token. At 10M tokens/month, that's $20K/month savings.
- Architecture reference: MIMO's checkpoint + memory system is a reference implementation for building your own agents.
Verdict: Strong alternative to proprietary solutions.
Why It Works: The Architecture
Most people think the model is the bottleneck. Wrong.
When you run the exact same underlying model (MiMo-V2.5-Pro) through both harnesses, MIMO Code still outperforms Claude Code by ~5 percentage points. That 5-point gap is pure architecture.
MIMO's three advantages:
1. Checkpoints (Not Just History) Instead of scrolling back through 500 lines of context, MIMO writes structured checkpoints at 20%, 45%, and 70% of context budget. The writer subagent extracts intent, working state, file tree, errors, and design decisions. When the window fills, it rebuilds using these checkpoints, not a lossy summary.
2. Persistent Project Memory (4-Layer System)
- Session memory (this conversation)
- Project memory (architectural decisions, rules, verified facts)
- Global memory (user preferences)
- Full history (SQLite trace for audit)
Claude Code? Session-scoped only. New session = explain everything again.
3. Goal Verification (Prevents Fake Completions) The agent declares "done." MIMO launches an independent verifier that checks: did it actually satisfy the stopping condition? If not, it feeds back the gap. Claude Code trusts the agent's judgment.
How to Get Started
# One-line install
curl -fsSL https://mimo.xiaomi.com/install | bash
# Or via npm
npm install -g @mimo-ai/cli
On first launch, you choose your model:
- MiMo Auto (free, limited time, no signup)
- Xiaomi MiMo platform login
- Import from existing Claude Code config
- Custom model (bring your own API keys)
FAQs
Q: Do I have to pay for the underlying model? A: Yes. MIMO Code is the harness (free, open-source). The model (MiMo-V2.5) costs money per token, similar to how Claude Code uses Claude API. But MiMo tokens are ~50% cheaper, and on long tasks you use fewer of them. Use MiMo Auto (free trial) to test first.
Q: Can I use Claude models with MIMO Code? A: The harness is model-agnostic by design. You could theoretically bring your own Claude API key to MIMO's harness. That would give you Claude's capability plus MIMO's architecture advantages—the best of both worlds.
Q: Is this production-ready? A: V0.1.0 is early, but it's open-source MIT and used internally by Xiaomi. The benchmarks are self-reported, but the human A/B test with 576 developers on 474 real private repositories is credible evidence.
Q: Will Anthropic respond with a better harness? A: Probably. Claude Code will improve. But this signals that harness engineering is now table stakes. Expect rapid iteration from all providers on memory systems and long-horizon capabilities.
Q: Should I switch from Claude Code? A: If you do long-horizon tasks (200+ steps), yes. If you do short iterations, both are competitive. The open-source angle + cost savings alone are worth trying the free MiMo Auto tier.
Resources
- GitHub Repository — Source code, MIT licensed
- Official Blog — Technical deep dive
- Quick Start Docs — Installation & setup
- Try it now — Free MiMo Auto access
- Announcement — Original Twitter/X post
- Benchmark Details — VentureBeat coverage
About the Author
Kartik Daware Jain
Product Thinker · AI Writer · Founder, AI Product pulse
Kartik thinks and writes at the intersection of AI and product strategy. He founded AI Product pulse - the independent publication for builders and PMs navigating the AI era - covering frameworks, teardowns, AI tools, and career strategy. His writing is practitioner-first: grounded in real product decisions, not academic theory.
Free Newsletter
Enjoyed this? Get more like it every Sunday.
Frameworks, teardowns, and AI tools for PMs - free on Substack.