Microsoft's first in-house reasoning model challenges Anthropic and OpenAI on enterprise benchmarks without relying on distillation.
Microsoft Corp. unveiled MAI-Thinking-1, its first proprietary reasoning model, at the Build 2026 conference Tuesday. The model has 35 billion active parameters and directly challenges Anthropic's Claude and OpenAI's GPT families in the enterprise AI market.
"MAI-Thinking-1 was designed to be good at complex multi-step instructions, long context reasoning, and code generation," Kyle Daigle, Microsoft Developer CMO and COO of GitHub, said at a media briefing ahead of the keynote.
The model, built from scratch on commercially licensed data without distillation from third-party models, features a 128,000-token context window. According to Microsoft, independent evaluators preferred it over Anthropic's Claude Sonnet 4.6, and it matches Claude Opus 4.6 on the SWE-Bench Pro coding benchmark. The company also introduced six additional models spanning image generation, transcription, voice, and code.
The launch marks Microsoft's deepest push yet into proprietary AI development, reducing its reliance on OpenAI after the two companies renegotiated their partnership. Microsoft shares, trading at roughly 33 times forward earnings, could benefit if the in-house models reduce the roughly $13 billion in annual AI infrastructure costs to which the company has committed.
A Full Model Family Takes Shape
Beyond the reasoning model, Microsoft released MAI-Image-2.5 and a Flash variant for text-to-image generation and editing, already live in PowerPoint and OneDrive. MAI-Transcribe-1.5, described as five times faster than competing transcription models, will support 43 languages. MAI-Voice-2 and its Flash variant add 15 languages with multiple voice options. MAI-Code-1-Flash, an inference-efficient coding model, is integrated directly into GitHub Copilot and Visual Studio Code.
All models will eventually be available through Microsoft Foundry and a new environment called MAI Playground. The breadth of the lineup shows Microsoft's intent to cover the full AI stack, from reasoning and coding to multimodal generation, rather than relying on a single flagship model.
Hardware and Agents Extend the Reach
Microsoft also unveiled Scout, a proactive personal agent that handles scheduling, meeting prep, and routine tasks through Teams and Outlook without waiting for user input. Scout begins rolling out to Frontier customers Tuesday. On the hardware side, the Surface RTX Spark Dev Box, powered by Nvidia's RTX Spark chip, delivers up to 1 petaflop of AI compute and 128 gigabytes of unified memory, and can run models of up to 120 billion parameters locally. It ships later this year in the US.
The company repositioned Windows as an agent-native runtime through Microsoft Execution Containers, a new sandboxing system now in preview, and made its scientific research platform, Microsoft Discovery, generally available.
Microsoft's vertical integration into model development reduces its dependence on OpenAI; the partnership between the two companies was recently restructured to loosen their ties. If MAI-Thinking-1 delivers on its benchmark claims, it could shift enterprise AI procurement away from third-party API providers and toward Microsoft's Azure platform. Nvidia, whose H100 and B200 GPUs power much of Microsoft's training infrastructure, stands to benefit from continued capital expenditure growth regardless of which model wins. Microsoft's Azure AI revenue grew 157 percent year over year in the most recent quarter, and in-house models could improve margins by reducing per-token inference costs.
This article is for informational purposes only and does not constitute investment advice.