MAI-Code-1-Flash - Microsoft's First In-House Coding Model

137B Total, 5B Active — Inside Microsoft's First In-House Coding Model

GitHub Copilot's token billing started on June 1. Microsoft unveiled its own coding model on June 2 — one day later. Could be a coincidence. But once you understand how this model is designed, you might think otherwise.

3-Second Summary

137B/5B MoE → Trained in production → +16pts vs Claude Haiku → 60% fewer tokens → New daily coding default

137 billion parameters — so why is it cheap and fast?

MAI-Code-1-Flash has 137B total parameters, but only 5B activate during inference. That's the Mixture-of-Experts (MoE) architecture.

Think of it like a team of specialists. When a patient comes in, only the relevant doctor or two handles the case, and the rest stay on standby. The model works the same way: for each token, a router activates only the most relevant 5B parameters, and the other 132B sit out that computation. You get the breadth of a 137B model at roughly the per-token compute of a 5B one, although all 137B parameters still have to be held in memory to serve it.
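To make the routing idea concrete, here is a minimal sketch of top-k expert routing in plain Python. The article does not say how many experts MAI-Code-1-Flash has or how many it activates per token, so the eight toy experts, the router scores, and `top_k=2` below are all hypothetical; only the 137B and 5B figures come from the text.

```python
import math


def softmax(scores):
    """Turn raw router scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_scores, top_k):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts."""
    probs = softmax(router_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(probs[i] for i in chosen)
    # Renormalize so the selected experts' weights sum to 1.
    return [(i, probs[i] / norm) for i in chosen]


def moe_forward(x, experts, router_scores, top_k):
    """Run only the selected experts and mix their outputs by router weight."""
    selected = route_token(router_scores, top_k)
    return sum(weight * experts[i](x) for i, weight in selected)


# Hypothetical toy setup: 8 "experts", each just scales its input.
experts = [lambda x, scale=scale: x * scale for scale in range(1, 9)]
router_scores = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 0.2]

selected = route_token(router_scores, top_k=2)
output = moe_forward(3.0, experts, router_scores, top_k=2)

# Figures from the article: 5B of 137B parameters are active per token.
active_fraction = 5 / 137

print("experts used:", [i for i, _ in selected])
print(f"share of parameters active per token: {active_fraction:.1%}")
```

Only two of the eight toy experts are evaluated for this input; the other six are never called, which is where the compute savings in a real MoE layer come from.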

The MoE architecture is why the model can be fast, affordable, and capable at once. Pricing lands at $0.75 per 1M input tokens and $4.50 per 1M output tokens. On top of that, it uses up to 60% fewer tokens on hard problems.

Key figures: 137B total parameters, 5B active at inference, 256K-token context window.

Trained on real workflows, not just benchmarks

Most coding models are optimized to score well on SWE-Bench and similar benchmarks. MAI-Code-1-Flash took a different approach. It was trained directly inside GitHub Copilot's production harness — actual file edits, terminal calls, and multi-turn conversations.

And one more thing: no knowledge distillation from OpenAI or any third-party model. Built entirely on Microsoft's own clean, traceable, enterprise-grade data. It's as much a declaration of AI independence as it is a product launch.

Aspect	Typical coding model	MAI-Code-1-Flash
Training environment	Benchmark optimization	Copilot production harness
Data source	Varies (may include distillation)	Self-collected, no third-party distillation
SWE-Bench Pro	35.2% (Claude Haiku 4.5)	51.2% (+16 points)
SWE-Bench Verified	66.6% (Claude Haiku 4.5)	71.6% (+5 points)
Token efficiency	Baseline	Up to 60% fewer tokens on hard tasks

On instruction following (IF Bench), it leads Claude Haiku 4.5 by 28.9 points. On an adversarial reasoning benchmark spanning 186 questions across 34 categories, it hit 85.8% adjusted accuracy. Not what you'd expect from a "small" model.

The billing connection

GPT-5.5 runs $5 input / $30 output per 1M tokens. MAI-Code-1-Flash is $0.75 / $4.50, roughly 6.7x cheaper per token on both sides, and it uses up to 60% fewer tokens on hard tasks. The difference in your monthly bill can be substantial.
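As a rough illustration of how those list prices translate into a bill, the sketch below compares the two models on a made-up monthly workload. The token counts are hypothetical, and applying the 60% reduction to output tokens only is an assumption for the best case; the article does not say where the savings fall.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """List price in USD per 1M tokens, as quoted in the article."""
    input_per_million: float
    output_per_million: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        return (
            input_tokens * self.input_per_million
            + output_tokens * self.output_per_million
        ) / 1_000_000


GPT_5_5 = Pricing(input_per_million=5.00, output_per_million=30.00)
MAI_CODE_1_FLASH = Pricing(input_per_million=0.75, output_per_million=4.50)

# Hypothetical monthly workload (not from the article).
input_tokens = 20_000_000
output_tokens = 4_000_000

baseline = GPT_5_5.cost(input_tokens, output_tokens)
same_tokens = MAI_CODE_1_FLASH.cost(input_tokens, output_tokens)

# Assumption: the "up to 60% fewer tokens" claim applied to output tokens only.
reduced_output = round(output_tokens * (1 - 0.60))
best_case = MAI_CODE_1_FLASH.cost(input_tokens, reduced_output)

print(f"GPT-5.5:                      ${baseline:,.2f}")
print(f"MAI-Code-1-Flash, same usage: ${same_tokens:,.2f}")
print(f"MAI-Code-1-Flash, best case:  ${best_case:,.2f}")
```

Swap in the numbers from your own Usage Dashboard to see where your workload actually lands; the best-case line is an upper bound on savings, not an expected value.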

How to set up MAI-Code-1-Flash in Copilot's model picker

1. Update VS Code and the Copilot extension. The model picker only shows in the latest version, so update the GitHub Copilot extension from VS Code's Extensions tab.
2. Select the model in the picker, or use Auto. In the Copilot Chat panel, open the model dropdown. Pick MAI-Code-1-Flash directly, or choose Auto to let Copilot route based on task type automatically.
3. Follow a task-based routing guide. Inline edits, refactors, short bug fixes, repo Q&A, repetitive tasks → MAI-Code-1-Flash. Complex architecture design, deep security reviews, large-scale autonomous implementations → frontier models (MAI-Thinking-1, Claude Opus, etc.).
4. Business/Enterprise users: general availability for Business and Enterprise plans rolled out June 26, 2026. If the model is not in your picker yet, give it a few days or check GitHub Community Discussions.
5. Monitor the usage dashboard. Check the Usage Dashboard in Copilot settings to see per-model token consumption, and verify the token savings in real numbers on your own workflows.
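The task-based routing guide above can be written down as a small lookup, which is handy if a team wants a shared convention. This is only a sketch of the article's rule of thumb: the task labels and the `suggest_model` helper are hypothetical, and it says nothing about how Copilot's Auto router actually decides.

```python
# Hypothetical task labels, grouped as in the routing guide.
FLASH_TASKS = {
    "inline_edit",
    "refactor",
    "short_bug_fix",
    "repo_qa",
    "repetitive_task",
}
FRONTIER_TASKS = {
    "architecture_design",
    "security_review",
    "autonomous_implementation",
}


def suggest_model(task_type: str) -> str:
    """Map a task label to the model family the guide recommends."""
    label = task_type.strip().lower()
    if label in FLASH_TASKS:
        return "MAI-Code-1-Flash"
    if label in FRONTIER_TASKS:
        return "frontier model (e.g. MAI-Thinking-1, Claude Opus)"
    # Anything not covered by the guide is left to the picker's Auto option.
    return "Auto"


for task in ("refactor", "security_review", "data_migration"):
    print(f"{task} -> {suggest_model(task)}")
```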

When to reach for a different model

For major architecture decisions, long autonomous implementations, and complex multi-system debugging, MAI-Code-1-Flash may not be the best choice. It's optimized as a fast first responder for everyday coding — escalate to larger models when you need deeper reasoning.

Here's where MAI-Code-1-Flash currently runs:

IDEs: VS Code, Visual Studio, JetBrains IDEs, Eclipse, Xcode

GitHub services: Copilot Chat on GitHub, GitHub Mobile, Copilot cloud agent

CLI: Copilot CLI (use it directly in your terminal)

Want to go deeper?

Introducing MAI-Code-1-Flash (microsoft.ai): Official announcement from Microsoft's Superintelligence team. Full training methodology, MoE architecture, and benchmark breakdown.

MAI-Code-1-Flash is now available for GitHub Copilot (github.blog): Initial launch changelog with gradual rollout schedule across Copilot tiers and model picker instructions.

MAI-Code-1-Flash available on more Copilot surfaces (github.blog): Expansion to JetBrains, Eclipse, Xcode, mobile, and CLI, covering 9 additional platforms.

MAI-Code-1-Flash for Copilot Business and Enterprise (github.blog): Enterprise rollout announcement and availability timeline.

Microsoft MAI-Code-1-Flash in GitHub Copilot: Pricing and Performance (smartscope.blog): Pricing structure breakdown and practical use case analysis.

MAI-Code-1-Flash: Microsoft's Copilot-Native Coding Model (chatforest.com): Developer-perspective analysis of model routing and real-world use cases.

GitHub Copilot's Token Billing Backlash Hits as Microsoft Build 2026 Opens With MAI (the-agent-report.com): Strategic context on the timing of the billing change and the MAI launch.

FAQ

Does the Auto picker automatically select MAI-Code-1-Flash?

Copilot's Auto router analyzes your task type and picks the most suitable model, which may or may not be MAI-Code-1-Flash. You can also select it manually in the picker. To see which model handled your requests, check the Usage Dashboard in Copilot settings.

Does it work in JetBrains or Xcode?

Yes. On June 18, 2026, support expanded to JetBrains IDEs, Eclipse, Xcode, Visual Studio, GitHub Mobile, and Copilot CLI. Business and Enterprise plans got general availability on June 26, 2026.

Is MAI-Code-1-Flash available on all Copilot plans?

Yes. Free, Student, Pro, Pro+, and Max all have access. It runs within Copilot's AI Credits usage model, and because it uses up to 60% fewer tokens on hard tasks, your credits can go significantly further than with heavier models.

What's the difference between MAI-Code-1-Flash and MAI-Thinking-1?

MAI-Code-1-Flash is optimized for everyday tasks: inline edits, refactors, and quick bug fixes. MAI-Thinking-1 is a reasoning model designed for complex architecture decisions and long autonomous implementations. They serve different roles — pick the right one for your task.

Did Microsoft really train this without OpenAI data?

Microsoft officially stated it was trained on clean, traceable, properly licensed data without distillation from any third-party model. This is their first coding model built entirely in-house, with no knowledge extracted from OpenAI models.

Written by Rush

Tracking where business meets AI.
