Microsoft introduces three internal AI models that directly challenge OpenAI



Six months after renegotiating a deal that once barred it from independently pursuing frontier artificial intelligence, Microsoft released three in-house models that directly challenge its $13 billion partner. MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2 are already available in Microsoft Foundry and do not bear OpenAI’s name anywhere on the label.

The models are the first public product of the MAI Superintelligence team, which Microsoft AI CEO Mustafa Suleiman created in November 2025 with a mission to realize what the company calls “humanistic superintelligence.” First in an internal memo in March Business Insider reported on thisSuleiman wrote that he intends to focus all his energy on super intelligence and introduce world-class models for Microsoft in the next five years. This ambition now has its first tangible proof.

MAI-Transcript-1 is, on paper, the most immediate offender of the three. The speech-to-text model claims the lowest word error rate across 25 languages ​​in the FLEURS benchmark, an average of 3.8 percent, and Microsoft says it outperforms OpenAI’s Whisper-large-v3 in all 25 languages, Google’s Gemini 3.1 Flash in 22 out of 25 languages, and EleveneLabs v152. It’s 2.5 times faster than Microsoft’s previous Azure Fast transcription service and costs $0.36 per hour of audio. Perhaps the most obvious is the team that built it: only 10 people.

MAI-Voice-1 completes the voice circuit. The text-to-speech model generates 60 seconds of natural-sounding audio in less than a second on a single GPU, and supports creating a custom voice from a few seconds of sample audio. Combined with MAI-Transcribe-1 and the extensive language model of the customer’s choice, it forms a complete audio pipeline that runs entirely on Microsoft infrastructure, independent of OpenAI technology.

MAI-Image-2, the oldest of the three, already existed Arena.ai debuted in third place on the text-to-image leaderboard In March, it was only beaten by Google’s Gemini 3.1 Flash and OpenAI’s GPT Image 1.5. The model was developed in collaboration with photographers, designers and visual storytellers, and WPP, one of the world’s largest marketing groups, is among the first enterprise partners to build with it at scale.

The strategic context is more important than the criteria. Until September 2025, Microsoft’s original partnership agreement with OpenAI prevented the company from independently pursuing general AI development under the contract. The revised memorandum of understanding substantially changed this calculation. Microsoft has retained licensing rights to everything OpenAI builds until 2032, has won $250 billion in new Azure cloud business commitments, and significantly gained the freedom to create competing models. Suleiman acknowledged the main point directly: according to him, the renegotiation of the contract allowed Microsoft to conduct its own super-intelligence independently.

The timing is intentional. Jacob Andreou, formerly a senior vice president at Snap, took over as Copilot’s executive vice president on March 17, relieving Suleiman of day-to-day product responsibilities. The MAI models landed just two weeks later. Microsoft also hired Ali Farhadi, the former chief executive of the Allen Institute for AI, for Suleiman’s super-intelligence team in March, suggesting that ambitions extend far beyond transcription and imaging.

Development for OpenAI creates an awkward dynamic. Microsoft remains its single largest investor and primary cloud infrastructure provider, and the two companies continue to share a platform in Foundry, which hosts both OpenAI and Microsoft models. But OpenAI’s push for commercial monetization accelerate in parallel, and the relationship begins to resemble two companies orbiting the same market with overlapping products, rather than a partnership with a clear division of labor. OpenAI’s $110 billion raise in February, backed by SoftBank, Nvidia and Amazon, valued the company independently of Microsoft at a level that made its original partnership framework increasingly anachronistic.

The broader AI model market is fragmenting along similar lines. Anthropic’s $380 billion valuation increases by $30 billion established it as a trusted third force in the AI ​​enterprise with $14 billion in revenue. Google continues to rapidly iterate on Gemini. The days when OpenAI was the only game in town for AI capabilities and Microsoft was happy to be the exclusive distribution channel are definitely over.

Microsoft Foundry, formerly known as Azure AI Foundry and before that Azure AI Studio (the second rebrand in twelve months), now serves developers at more than 80,000 enterprises, including 80 percent of Fortune 500 companies. This distribution advantage is what makes the MAI family of models strategically important: Microsoft doesn’t need to beat OpenAI on every metric to drive enterprise spending into internal models. It must be competitive enough for customers to choose an integrated option over a third-party alternative, dynamic the past year of AI industry consolidation made it more and more convincing.

Suleiman said it will take another year or two for the super-intelligence team to develop boundary-level language models. What’s up this week is the foundation: a multimodal toolkit that gives Microsoft its own voice, ears, and eyes independent of OpenAI. The $13 billion partnership is not over. But it stands to reason that Microsoft needs OpenAI to compete in artificial intelligence quietly dismantled one model release at a time.



Source link

Leave a Reply

Your email address will not be published. Required fields are marked *