
Chinese electronics and car manufacturer Xiaomi surprised the global AI community today with the release of MiMo-V2-Pro. The new 1-trillion-parameter foundation model approaches the performance of US AI giants OpenAI and Anthropic, but costs roughly a sixth to a seventh as much when accessed via its proprietary API – and, crucially, can shuttle 256,000 tokens worth of data back and forth in a single exchange.
Led by Fuli Luo, a veteran of the disruptive DeepSeek R1 project, the release is what Luo describes as a "quiet ambush" on the global frontier. In a post on X, Luo also noted that the company plans to open source a variant of this latest release "when the models are stable enough to deserve it."
By betting on "agentic" intelligence – the shift from generating code in a chat to operating autonomously through "claws" (agent scaffolds) – Xiaomi is attempting to leapfrog the conversational paradigm entirely.
Before this foray into frontier artificial intelligence, Beijing-based Xiaomi established itself as a titan of the "Internet of Things" and consumer hardware.
Known globally as the world’s third-largest smartphone maker, Xiaomi spent the early 2020s making a high-profile entry into the automotive sector. Electric vehicles (EVs) like its SU7 and the recently launched YU7 SUV have made the company a vertically integrated powerhouse capable of combining hardware, software and now advanced reasoning.
That grounding in physical-world engineering informs MiMo-V2-Pro’s architecture: it is built to be "the brain" of agentic systems, whether those systems manage global supply chains or drive the complex workflows of an autonomous coding agent.
Technology: Architecture of agency
The central problem of the "Agent Era" is maintaining high-fidelity reasoning over large spans of data without paying an "exploration tax" in latency or price. MiMo-V2-Pro addresses this with a sparse architecture: although it has 1T total parameters, only 42B are active during any forward pass, making it about three times larger than its predecessor, MiMo-V2-Flash.
The model’s efficiency rests on an advanced Hybrid Attention mechanism. Standard transformers see compute requirements grow quadratically as the context grows; MiMo-V2-Pro instead uses a 7:1 hybrid ratio (up from 5:1 in the Flash version) to handle a massive 1M-token context window. This architectural choice lets the model maintain a deep "memory" over long-horizon tasks without the performance degradation typically seen in frontier models.
An analogy: think of the model as an expert researcher in a vast library rather than a student reading a book page by page. The 7:1 ratio lets the model lightly skim roughly 85% of the context while focusing at high density on the 15% most relevant to the task.
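To see why a hybrid ratio matters at a 1M-token context, here is a back-of-envelope sketch (not Xiaomi's published math) comparing full quadratic attention against a stack where seven of every eight layers attend only within a fixed local window. The window size and layer count are illustrative assumptions, not disclosed figures.

```python
# Illustrative cost model: count attention-score entries computed per pass.
# All constants (64 layers, 4096-token window) are assumptions for the sketch.

def full_attention_cost(context_len: int, layers: int) -> int:
    """Score-matrix entries if every layer attends globally (quadratic)."""
    return layers * context_len ** 2

def hybrid_attention_cost(context_len: int, layers: int,
                          ratio: int = 7, window: int = 4096) -> int:
    """ratio local (windowed) layers for every 1 global layer."""
    block = ratio + 1                       # e.g. 7 local + 1 global
    global_layers = layers // block
    local_layers = layers - global_layers
    return (global_layers * context_len ** 2
            + local_layers * context_len * min(window, context_len))

ctx, n_layers = 1_000_000, 64               # 1M-token context, assumed depth
speedup = full_attention_cost(ctx, n_layers) / hybrid_attention_cost(ctx, n_layers)
print(f"hybrid stack is ~{speedup:.1f}x cheaper at {ctx:,} tokens")
```

Under these assumptions the hybrid stack is roughly 8x cheaper per forward pass at the full 1M-token context, which is the kind of margin that makes long-horizon agent loops affordable.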
This is paired with a lightweight Multi-Token Prediction (MTP) layer that lets the model draft and emit several tokens at once, speeding up the "thinking" stages of agent workflows. According to Luo, these structural decisions were made months ago and turned out to be a "structural advantage" given the unexpected speed with which the industry is shifting toward agents.
Performance and comparisons: a third-party reality check
Xiaomi’s internal numbers paint the picture of a model that excels at "real-world" assignments rather than synthetic benchmarks. MiMo-V2-Pro achieved an Elo of 1426 on the agentic GDPval-AA benchmark, which measures performance on real business tasks, putting it ahead of key Chinese peers such as GLM-5 (1406) and Kimi K2.5 (1283).
While it still trails Western "maximum effort" models such as Claude Sonnet 4.6 (1633) in raw Elo, this is the highest score recorded for a model of Chinese origin in this category.
Third-party benchmarking organization Artificial Analysis confirmed these claims: MiMo-V2-Pro ranks 10th on the global Intelligence Index with a score of 49, putting it on par with GPT-5.2 Codex and ahead of Grok 4.20 Beta. These results show that Xiaomi has successfully built a model with the level of reasoning capability required for engineering and manufacturing work.
Artificial Analysis’ key metrics also highlight a significant leap over the previous open-weight version, MiMo-V2-Flash (which scored 41):
- Hallucination rate: The Pro model cut its hallucination rate to 30%, a dramatic improvement over the Flash model’s 48%.
- "Know-it-all" index: It scored +5, ahead of GLM-5 (+2) and Kimi K2.5 (-8).
- Token efficiency: MiMo-V2-Pro needs only 77M output tokens to run the entire Intelligence Index, significantly fewer than GLM-5 (109M) or Kimi K2.5 (89M), indicating a shorter, more efficient reasoning process.
Xiaomi’s own charts emphasize the model’s "General Agent" and "Coding Agent" capabilities even more. On ClawEval, the agentic-scaffolding benchmark, it scored 61.5, approaching Claude Opus 4.6 (66.3) and significantly outperforming GPT-5.2 (50.0). In coding-specific environments like Terminal-Bench 2.0, it achieved 86.7, showing high reliability when running commands in a live terminal.
How businesses should evaluate MiMo-V2-Pro for use
For those running AI programs inside organizations – from infrastructure to security – MiMo-V2-Pro represents a genuine shift in the price-to-quality curve.
Infrastructure decision makers will find MiMo-V2-Pro an attractive point on the Pareto frontier between intelligence and cost. Artificial Analysis reported that running its index cost just $348 for MiMo-V2-Pro, versus $2,304 for GPT-5.2 and $2,486 for Claude Opus 4.6.
For organizations managing GPU clusters or procurement, the ability to reach the global top 10 in intelligence at roughly one seventh the cost of Western incumbents is a strong incentive for production-scale testing.
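The "about 1/7 the cost" framing is easy to sanity-check against the Artificial Analysis figures quoted above:

```python
# Verify the cost multiples implied by the Artificial Analysis index runs.
costs = {"MiMo-V2-Pro": 348, "GPT-5.2": 2304, "Claude Opus 4.6": 2486}

base = costs["MiMo-V2-Pro"]
for model, usd in costs.items():
    if model != "MiMo-V2-Pro":
        print(f"{model}: {usd / base:.1f}x the cost of MiMo-V2-Pro")
# GPT-5.2 works out to ~6.6x and Claude Opus 4.6 to ~7.1x, matching the
# "about a sixth to a seventh" framing in the announcement.
```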
Data decision makers can exploit the 1M-token context window for RAG-ready architectures, feeding entire corporate codebases or documentation sets into a single prompt without the chunking that smaller-context models require.
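For planning purposes, a rough fit check is often enough. The sketch below uses the common ~4 characters-per-token heuristic, which is an assumption – real tokenizer counts vary by language and content – and reserves headroom for the model's output:

```python
# Rough planning sketch: will a corpus fit in a 1M-token context window?
# The 4 chars/token heuristic and 50K-token output reserve are assumptions.

def fits_in_context(total_chars: int,
                    context_tokens: int = 1_000_000,
                    chars_per_token: float = 4.0,
                    reserve_for_output: int = 50_000) -> bool:
    """Estimate whether a body of text fits in one prompt."""
    est_tokens = total_chars / chars_per_token
    return est_tokens <= context_tokens - reserve_for_output

# A ~3 MB codebase (~750K estimated tokens) fits; ~5 MB (~1.25M) does not.
print(fits_in_context(3_000_000))   # True
print(fits_in_context(5_000_000))   # False
```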
Systems and orchestration decision makers should consider MiMo-V2-Pro as a baseline "brain" for multi-agent coordination. Because the model is optimized for OpenClaw and Claude Code, it can handle long-horizon planning and precise tool use without the constant human intervention that plagued previous models.
Its high GDPval-AA ranking suggests it is particularly well suited to the workflow and orchestration layer needed to scale AI across the enterprise, enabling systems that move beyond simple automation to solve complex, multi-step problems.
However, security decision makers should exercise caution. The model’s strongly "agentic" nature – its ability to drive terminals and manipulate files – increases the attack surface for prompt injection and unauthorized access.
While its low hallucination rate (30%) is a boon for defense, the lack of public weights (unlike the Flash version) means security teams cannot perform the deep model-level audits sometimes required for highly sensitive deployments. Any enterprise implementation should be accompanied by robust monitoring and auditing protocols.
Cost, availability and the way forward
Xiaomi has priced MiMo-V2-Pro to dominate the developer market. Pricing is tiered by context usage, with competitive caching rates to support high-frequency reasoning tasks.
- MiMo-V2-Pro (up to 256K): $1 per 1 million input tokens and $3 per 1 million output tokens
- MiMo-V2-Pro (256K–1M): $2 per 1 million input tokens and $6 per 1 million output tokens
- Cache reads: $0.20 per 1 million tokens on the lower tier and $0.40 on the higher tier
- Cache writes: temporarily free ($0)
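The tiers above are simple enough to fold into a per-request estimator. This is a minimal sketch: it assumes the tier is chosen by the request's total context length and that cache reads replace the normal input rate for cached tokens – Xiaomi's exact billing rules are not spelled out in the announcement.

```python
# Minimal per-request cost estimator for MiMo-V2-Pro's published tiers.
# Assumptions: tier selected by total (input + output) tokens; cached input
# tokens are billed at the cache-read rate instead of the input rate;
# cache writes are currently $0 and so are ignored.

def mimo_cost(input_tokens: int, output_tokens: int,
              cached_tokens: int = 0) -> float:
    """Return estimated request cost in USD."""
    high_tier = (input_tokens + output_tokens) > 256_000
    in_rate, out_rate, cache_rate = (
        (2.00, 6.00, 0.40) if high_tier else (1.00, 3.00, 0.20))
    billable_input = input_tokens - cached_tokens
    return (billable_input * in_rate
            + cached_tokens * cache_rate
            + output_tokens * out_rate) / 1_000_000

# A 200K-token prompt with a 10K-token answer stays on the base tier:
print(f"${mimo_cost(200_000, 10_000):.2f}")                  # $0.23
# The same request with half the prompt served from cache:
print(f"${mimo_cost(200_000, 10_000, cached_tokens=100_000):.2f}")  # $0.15
```

Note how heavily cache reads reward agent loops that re-send the same long context on every step – exactly the workload the pricing appears designed for.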
Here’s how the model stacks up against other advanced frontier models:
| Model | Input ($/1M) | Output ($/1M) | Total |
| --- | --- | --- | --- |
| Grok 4.1 Fast | $0.20 | $0.50 | $0.70 |
| MiniMax M2.7 | $0.30 | $1.20 | $1.50 |
| Gemini 3 Flash | $0.50 | $3.00 | $3.50 |
| Kimi-K2.5 | $0.60 | $3.00 | $3.60 |
| MiMo-V2-Pro (≤256K) | $1.00 | $3.00 | $4.00 |
| GLM-5-Turbo | $0.96 | $3.20 | $4.16 |
| GLM-5 | $1.00 | $3.20 | $4.20 |
| Claude Haiku 4.5 | $1.00 | $5.00 | $6.00 |
| Qwen3-Max | $1.20 | $6.00 | $7.20 |
| Gemini 3 Pro | $2.00 | $12.00 | $14.00 |
| GPT-5.2 | $1.75 | $14.00 | $15.75 |
| GPT-5.4 | $2.50 | $15.00 | $17.50 |
| Claude Sonnet 4.5 | $3.00 | $15.00 | $18.00 |
| Claude Opus 4.6 | $5.00 | $25.00 | $30.00 |
| GPT-5.4 Pro | $30.00 | $180.00 | $210.00 |
This aggressive pricing is designed to capture the high-intensity agent workloads that will define next-generation software. The model is currently available only through Xiaomi’s first-party API, with no support for image or multimodal input – a notable omission in the era of "omni" models, although Xiaomi offers a separate MiMo-V2-Omni for those needs.
The "Hunter Alpha" preview period on OpenRouter has shown a strong market appetite for this particular blend of efficiency and reasoning. Fuli Luo’s philosophy – research speed paired with "true love for the world you built" – has produced a model ranked 2nd in China and 8th in the world on the cited intelligence indices.
Whether it remains a "quiet ambush" or becomes the basis of a global realignment of AI power depends on how quickly developers trade the "chat window" for the "agentic workspace." For now, Xiaomi has moved the goalposts: the question is no longer simply "can it talk?" but "can it act?"




