OpenAI's next-gen image model showed up before it was ever officially announced. GPT-Image-2 was caught testing on LM Arena under three codenames — and it looks like the long-standing problem of AI images butchering text might finally be fixed.
What Is It?
GPT-Image-2 is OpenAI's upcoming image generation model — not officially announced yet. In early April 2026, it appeared on LM Arena (the blind AI model evaluation platform) under three codenames: maskingtape-alpha, gaffertape-alpha, and packingtape-alpha. It was pulled within hours.
Developer Pieter Levels (@levelsio) was the first to identify the models, which set off a wave of community-captured outputs. Two things stood out:
- Text rendering: Text placed inside images comes out sharp and accurate
- World knowledge: It knows what real brands, interfaces, and objects actually look like
And the yellow tint that plagued GPT-Image-1 appears to be gone.
What Changes?
| Feature | GPT-Image-1.5 (Current) | GPT-Image-2 (Leaked) |
|---|---|---|
| Architecture | 4o-based | Brand-new independent architecture |
| Text rendering accuracy | ~95% | 99%+ (estimated) |
| Color | Yellow tint present | Natural color, tint removed |
| Photorealism | High | Near-photographic |
| World knowledge | Good | Significantly improved (brands, UIs, handwriting, etc.) |
| Aspect ratio support | 1:1, 3:2, 2:3 | 16:9 widescreen observed in leaked outputs |
Here's the thing — AI image models have always had three glaring weaknesses: garbled text, mangled hands, and inaccurate real-world objects. GPT-Image-2 appears to tackle all three at once.
What the Community Actually Made
Images generated during the blind test spread across the community — and many viewers reportedly couldn't tell they were AI-made.
- IKEA store at night: Mistaken for an actual photograph. Sign fonts, lighting, and entrance signage were all reproduced accurately.
- YouTube & Windows UI: Accurate enough to pass as a real screenshot. Button text and layout matched the real thing.
- Medical handwritten notes: Handwriting that looks like it came from an actual person. Previous models couldn't come close.
- Clock hand test: Set a specific time, and the hands point to exactly the right position. Nano Banana Pro failed this one.
- Comic book panels: Spider-Man and Batman costume details rendered accurately, with readable speech bubbles.
Why This Is a Genuine Step Change
While GPT-Image-1.5 was built on 4o (GPT-4 Omni), analysts suggest GPT-Image-2 runs on a completely new architecture. That means this isn't just an incremental upgrade — it's closer to a generational shift.
There's also relevant context: OpenAI shut down Sora on March 24, 2026. Inference costs were running $15 million a day, and the GPU resources freed up from that shutdown are believed to have been redirected to GPT-Image-2 training and inference.
Getting Started
GPT-Image-2 isn't officially out yet, but some ChatGPT users are reportedly getting it through A/B testing. Here's what you can do right now.
How to check if you're on GPT-Image-2
Add "Format 16:9" at the end of your prompt. If you get a 16:9 wide image with sharp text and no yellow tint, you're on the new model.
- Keep generating complex images: Generate text-heavy posters, infographics, or UI screenshots in ChatGPT Images 5–15 times in a row. This reportedly increases your odds of being routed to the new model.
- Prep your text-rendering use cases: Line up the tasks where text accuracy matters most: product mockups, social media cards, presentation slides.
- Run your own benchmark: Use the same prompts across Nano Banana Pro, Midjourney V7, and Ideogram 3.0 to feel the difference firsthand.
- Plan for the API launch: GPT-Image-1.5 runs $0.133 per high-quality 1024×1024 image via API. Given the new architecture, GPT-Image-2 may land slightly higher, around $0.15–0.20.
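If you're budgeting for the API launch, the per-image figures above translate into monthly spend with simple arithmetic. A minimal sketch — note that the $0.133 figure is GPT-Image-1.5's current published rate, while the $0.20 figure is this article's speculative high-end guess for GPT-Image-2, not an announced price:

```python
def estimate_monthly_cost(images_per_day: int, price_per_image: float) -> float:
    """Rough monthly API spend, assuming a 30-day month and flat per-image pricing."""
    return round(images_per_day * 30 * price_per_image, 2)

# Today's GPT-Image-1.5 rate (high-quality 1024x1024): $0.133/image.
current = estimate_monthly_cost(500, 0.133)   # 1995.0
# Speculative high end for GPT-Image-2 per the estimate above: $0.20/image.
projected = estimate_monthly_cost(500, 0.20)  # 3000.0
```

At 500 images a day, the guessed price bump would raise monthly spend from roughly $2,000 to $3,000 — worth modeling before committing a production pipeline to the new model.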
Competitive Landscape at a Glance
| Model | Key Strength | vs. GPT-Image-2 |
|---|---|---|
| Nano Banana Pro | Google compute, early-mover advantage | Came out behind GPT-Image-2 in multiple blind tests |
| Midjourney V7 | Artistic style, strong community | Trails on photorealism and text rendering |
| FLUX Pro | Open-source, local deployment | Gap in world knowledge and complex scene handling |
| Ideogram 3.0 | Specialized in text rendering | GPT-Image-2 leads on overall capability, not just text |
Deep Dive Resources
How LM Arena Blind Testing Works
LM Arena is a platform where users compare two outputs without knowing which model made them. Because Elo scores are based on pure performance — no marketing, no branding — a high score here is real validation. OpenAI used the same approach in December 2025, testing GPT-Image-1.5 under the codenames Chestnut and Hazelnut before releasing it.
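The rating math behind arena-style leaderboards is worth seeing concretely. Below is a minimal sketch of the classic Elo update from pairwise votes — LM Arena's production leaderboard uses a more involved statistical fit (Bradley–Terry style), but the intuition is the same; the K-factor of 32 here is an illustrative constant, not LM Arena's actual parameter:

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the Elo model (400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def elo_update(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one blind head-to-head vote."""
    e_a = expected_score(rating_a, rating_b)
    s_a = 1.0 if a_won else 0.0
    delta = k * (s_a - e_a)
    return rating_a + delta, rating_b - delta

# Two models enter at equal ratings; one blind-test win moves 16 points at K=32.
a, b = elo_update(1000.0, 1000.0, a_won=True)
# a == 1016.0, b == 984.0
```

Because voters never see which model produced which image, the rating can only move on output quality — which is why a codenamed model climbing the board is treated as meaningful signal.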
Sora's Shutdown and the GPU Reallocation
Sora shut down in March 2026, with peak inference costs hitting $15 million a day. Total in-app revenue over its entire lifetime? Just $2.1 million. Sam Altman said OpenAI would redirect compute toward next-gen automation research and enterprise applications — and GPT-Image-2 looks like one of the primary beneficiaries.
Multilingual Text Rendering
Turkish-speaking users tested GPT-Image-2's ability to render non-Latin characters and reported a noticeable improvement over previous models. Similar gains are expected for Korean, Arabic, and other scripts.