The AI That Beat the World's Best Hackers — Why Anthropic Locked It Away

OpenBSD is considered the world's most security-hardened OS. A bug had been hiding there for 27 years. AI found it.

Alone. Automatically. And you can't use that AI yet.

30-second summary

AI surpasses top hackers → 23,000+ vulnerabilities found → Coalition instead of public release → The AI security game has changed

You still think AI security tools are just "assistants," right?

Honestly? That was accurate until recently. AI-based security scanners were good at finding known patterns, but real zero-day vulnerabilities remained the domain of elite security researchers.

In April 2026, that assumption was officially overturned. Anthropic revealed an unreleased model called Claude Mythos Preview — and the benchmark numbers are quite provocative.

CyberGym (vulnerability reproduction): 83.1%
SWE-bench Verified: 93.9%
SWE-bench Pro: 77.8%

More telling than the benchmarks: independent security firm XBOW credited Mythos with absolutely unprecedented precision, while the UK AI Security Institute declared it "the first model to solve both of their cyber ranges end to end."

Previous security AIs would find a bug and stop at "this looks suspicious." Mythos finds the bug, converts it into working exploit code, and chains multiple vulnerabilities into larger attacks. It runs full penetration tests, not just bug identification.

How does AI find bugs that hid for 27 and 16 years?

Numbers first. Mythos Preview scanned over 1,000 open-source projects and flagged 23,019 issues in total, 6,202 of them rated high or critical severity. Six independent security firms reviewed 1,752 of these and confirmed 90.6% as real vulnerabilities.
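To put those figures in proportion, here is a quick arithmetic sketch. It uses only the numbers quoted above. The variable names are ours, and the confirmed count is an estimate derived from the rounded 90.6% figure, not a number taken from the announcement.

```python
# Figures quoted in the article; variable names are illustrative.
total_flagged = 23_019
high_or_critical = 6_202
verified_sample = 1_752
confirmed_rate = 0.906

# Share of all flagged issues rated high or critical.
high_share = high_or_critical / total_flagged

# Estimated confirmed findings in the reviewed sample.
# This is derived from a rounded percentage, so treat it as approximate.
confirmed_estimate = round(verified_sample * confirmed_rate)
false_positive_estimate = verified_sample - confirmed_estimate

print(f"High/critical share of all flagged issues: {high_share:.1%}")
print(f"Estimated confirmed findings: {confirmed_estimate} of {verified_sample}")
print(f"Estimated false positives: {false_positive_estimate}")
```

Note that only 1,752 of the 23,019 flagged issues were independently reviewed, so the 90.6% rate describes that sample rather than the full set.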

Software	Age or CVE ID	Severity	Impact
wolfSSL cryptography library	CVE-2026-5194	Critical	Certificate forgery — enables convincing fake bank and email sites
OpenBSD	27 years	High	Remote system crash
FFmpeg (video encoder)	16 years	High	Survived 5M+ automated test runs undetected
Linux kernel	Multiple	Critical	Privilege escalation from user to full system control

You may not know wolfSSL, but there is a good chance something you own uses it. It's embedded in billions of IoT devices, automotive systems, and other hardware. The exploit would have let attackers forge certificates, making fake websites appear completely legitimate with a green padlock. It's patched now.

Cloudflare found 2,000 bugs in its codebase, 400 of them high or critical severity, with a false positive rate better than that of human testers. Mozilla found 271 vulnerabilities in Firefox 150, more than 10x what it found with Claude Opus 4.6 on prior versions.

Why didn't anyone catch that 16-year-old FFmpeg bug?

Automated tests ran 5 million+ times and missed it. Traditional fuzzers ask "does this input crash the program?" over and over, varying the input by brute force. AI reads the logic of the code and constructs targeted edge cases. It attacks meaning, not just surface patterns.
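A toy sketch makes the contrast concrete. Everything here is hypothetical: parse_frame, the MYTH marker, and the simulated bug are invented for illustration and have nothing to do with the actual FFmpeg flaw or with how Mythos works. Real fuzzers such as AFL and libFuzzer also use coverage feedback and do far better than the purely random loop shown here. The point is only that a bug behind a narrow precondition is hard to hit by guessing and easy to hit once you read the code.

```python
import random

MAGIC = b"MYTH"  # hypothetical 4-byte marker, invented for this toy example


def parse_frame(data: bytes) -> int:
    """Toy parser with a simulated bug behind a narrow precondition."""
    if data[:4] == MAGIC and len(data) > 4 and data[4] > 200:
        # Stand-in for a memory-safety bug; a real one would corrupt memory.
        raise RuntimeError("simulated out-of-bounds write")
    return len(data)


def random_fuzz(target, runs: int, seed: int = 0) -> int:
    """Blind fuzzing: feed random 8-byte inputs to the target and count crashes."""
    rng = random.Random(seed)
    crashes = 0
    for _ in range(runs):
        data = bytes(rng.getrandbits(8) for _ in range(8))
        try:
            target(data)
        except RuntimeError:
            crashes += 1
    return crashes


def crafted_input() -> bytes:
    """Input built by reading the parser's logic instead of guessing."""
    return MAGIC + bytes([255]) + b"\x00" * 3


print("crashes from random inputs:", random_fuzz(parse_frame, 20_000))
try:
    parse_frame(crafted_input())
except RuntimeError as exc:
    print("crafted input triggers:", exc)
```

Each random input has roughly a one in 20 billion chance of satisfying the precondition, so even millions of runs would very likely miss it, while the crafted input hits it on the first try.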

So why didn't Anthropic release it publicly?

Here's the core. Mythos Preview is not available to general users. Access is restricted to specific partner organizations, and public release has been pushed until "stronger safety features are added."

This isn't just "it's dangerous, lock it up." The logic is: the same AI can attack or defend — who gets there first determines the outcome. Attackers gaining this capability is only a matter of time. Anthropic's bet was getting defense there first.

That's Project Glasswing — an industry defense coalition launched in April 2026.

AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks
— 11 founding partners of Project Glasswing

It launched with 50 partners and expanded to 150+ organizations across 15+ countries by June 2026. Korea is included — Samsung, SK Hynix, and SK Telecom are among the partners. Anthropic has committed up to $100M in model credits and $4M in direct grants to open-source foundations.

Metric	Old security paradigm	Post-Glasswing
Vulnerability discovery speed	Months	Weeks
Monthly high-risk findings	150–300	900+
Patch deployment time	60–150 days avg	Target: under 2 weeks
False positive rate	High (noisy)	Better than human testers

Results are already showing. A partner bank intercepted a $1.5M fraudulent wire transfer, and Palo Alto Networks has been patching at 5x their normal rate.

What your team should do right now

Audit your open-source dependencies
Like wolfSSL, you may be using libraries you don't know about. Run npm audit, pip-audit, or trivy to map your dependency tree today, and automate alerts with GitHub Dependabot or Snyk.
Cut your patch deployment cycle
Industry average is 60–150 days. AI has compressed discovery to weeks — patches need to keep up. Build security patch validation and deployment into your CI/CD pipeline. That's the most urgent structural change right now.
Join the Claude Security waitlist
Anthropic is running an enterprise beta of Claude Security. Early data: 2,100 vulnerabilities patched in 3 weeks using Claude Opus 4.7. Sign up now to move fast when access opens.
Harden your defense architecture
Cloudflare's lesson: "Patching alone isn't enough." WAF, Zero Trust, and system segmentation need to run in parallel. Even when you find vulnerabilities, attackers can move before patches ship.
Understand the AI-vs-AI era
Security expert Sejun Park (CEO, Theori) notes the acceleration in AI-driven vulnerability discovery predates Mythos. Defense teams without AI tooling will find it increasingly hard to keep pace with AI-powered attacks.
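For the dependency audit in the first item, a minimal sketch of automating pip-audit might look like the following. It assumes pip-audit is installed and that its JSON report has a top-level "dependencies" list whose entries carry "name", "version", and "vulns" fields. That shape matches recent pip-audit releases as far as we know, but check it against your installed version. The package name and advisory ID in the sample are made up.

```python
import json
import subprocess


def run_pip_audit() -> str:
    """Run pip-audit against the current environment and return its JSON report."""
    # pip-audit exits non-zero when it finds vulnerabilities,
    # so we read stdout instead of using check=True.
    result = subprocess.run(
        ["pip-audit", "--format", "json"],
        capture_output=True,
        text=True,
    )
    return result.stdout


def vulnerable_packages(report_json: str) -> list[tuple[str, str, str]]:
    """Return (package, installed version, advisory id) for each finding."""
    report = json.loads(report_json)
    findings = []
    for dep in report.get("dependencies", []):
        # Skipped dependencies may have no "vulns" or "version" key.
        for vuln in dep.get("vulns", []):
            findings.append((dep["name"], dep.get("version", "?"), vuln["id"]))
    return findings


# Hypothetical report in the assumed shape, so the parser can be tried offline.
SAMPLE_REPORT = json.dumps({
    "dependencies": [
        {"name": "example-pkg", "version": "1.0.0",
         "vulns": [{"id": "EXAMPLE-0001", "fix_versions": ["1.0.1"]}]},
        {"name": "clean-pkg", "version": "2.3.4", "vulns": []},
    ]
})

for name, version, advisory in vulnerable_packages(SAMPLE_REPORT):
    print(f"{name} {version}: {advisory}")
```

In a real pipeline you would pass the output of run_pip_audit() to vulnerable_packages() and fail the build or open a ticket when the list is non-empty.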

Want to dig deeper?

Project Glasswing Official Announcement

Anthropic explains how Mythos Preview finds vulnerabilities and the full project strategy.

Project Glasswing: An Initial Update

Detailed data on 10,000+ vulnerability discoveries, partner performance, and wolfSSL CVE specifics.

Cloudflare's Mythos Pipeline

The 8-stage pipeline and prompt design strategy behind finding 2,000 bugs.

TechCrunch: Expansion to 15+ Countries

Coverage of the 150+ org expansion, critical infrastructure inclusion, and Korea's role.

Help Net Security: 10,000+ Vulnerability Analysis

Technical breakdown of wolfSSL CVE-2026-5194 and discovery scale.

FAQ

Is Claude Mythos Preview available to the public?

Not yet. Access is currently limited to Glasswing partner organizations and Cyber Verification Program participants. Anthropic plans to release it gradually once stronger safety features are in place.

60–150 days for patching sounds long. Can teams actually cut that down?

Yes. The fastest path is integrating security patches into your CI/CD pipeline instead of treating them as a separate process. Automated test-and-deploy workflows can realistically bring this under two weeks.
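One way to make the two-week target enforceable is a small check that runs in CI. The sketch below is an assumption about how a team might model it, not a description of any Glasswing partner's tooling. The Finding fields, the VULN identifiers, and the 14-day threshold are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

SLA_DAYS = 14  # the "under two weeks" target; adjust to your own policy


@dataclass
class Finding:
    identifier: str                 # hypothetical ticket or advisory id
    severity: str                   # "low", "medium", "high", or "critical"
    reported: date                  # when the team learned of the issue
    patched: Optional[date] = None  # None while the fix is not deployed


def overdue(findings, today: date):
    """Return high/critical findings still unpatched past the SLA window."""
    return [
        f for f in findings
        if f.severity in {"high", "critical"}
        and f.patched is None
        and (today - f.reported).days > SLA_DAYS
    ]


findings = [
    Finding("VULN-1", "critical", date(2026, 6, 1)),
    Finding("VULN-2", "high", date(2026, 6, 20)),
    Finding("VULN-3", "low", date(2026, 5, 1)),
    Finding("VULN-4", "high", date(2026, 6, 1), patched=date(2026, 6, 9)),
]

late = overdue(findings, today=date(2026, 6, 30))
for f in late:
    print(f"{f.identifier} ({f.severity}) has been open since {f.reported}")
```

Here only VULN-1 is flagged: VULN-2 is still inside the window, VULN-3 is low severity, and VULN-4 is already patched. A CI job could exit with a non-zero status whenever the list is non-empty.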

Can small startups benefit from Project Glasswing?

You're already getting indirect benefits — open-source projects your stack depends on (wolfSSL, Firefox, Linux kernel) are being patched first. For direct access, the Claude Security enterprise beta is open for applications.

If AI finds vulnerabilities, can attackers also use AI to attack faster?

Exactly — that's the core premise of Glasswing. Anthropic has stated that attackers gaining this capability is only a matter of time. The coalition exists so defense gets there first: proactively patched vulnerabilities mean fewer open windows for attackers using the same AI.

Is 90.6% accuracy good or bad?

Context matters. Roughly 9% of the independently reviewed findings turned out to be false positives, but Cloudflare reported a false positive rate better than that of human testers. The real bottleneck isn't accuracy. It's patch deployment speed: discovery is now faster than the time it takes to ship a fix.

Written by Rush

Tracking where business meets AI.
