Anthropic built Claude Mythos, by its own account the most powerful model it has ever developed, and then locked it away. Not because of a technical flaw, but because it's too good.
A Leak, a Decision, and a Precedent
On March 26, 2026, two security researchers — Roy Paz from LayerX Security and Alexandre Pauwels from the University of Cambridge — discovered roughly 3,000 unpublished assets sitting in an unsecured, publicly searchable database on Anthropic's servers. Among them: a draft blog post describing a new model codenamed "Capybara" in one version and Claude Mythos in another. The company's own description left little room for ambiguity: "by far the most powerful AI model we've ever developed" and "a step change in capabilities."
Anthropic patched the exposure the same night after Fortune reached out for comment. But the information was already out.
On April 7, 2026, the other shoe dropped. Anthropic officially confirmed Mythos — and announced simultaneously that it won't be publicly available. No public API. No general release date. Instead: a controlled early-access program called Project Glasswing, restricted to approximately 50 partner organizations. The list reads like a cross-section of critical infrastructure: AWS, Apple, Microsoft, Google, NVIDIA, Cisco, CrowdStrike, JPMorgan, Broadcom, Palo Alto Networks, the Linux Foundation.
Their mandate is specific: use Mythos defensively — scan their own infrastructure and open-source codebases for exploitable vulnerabilities before someone else can weaponize the model's capabilities against them.
Why Anthropic Triggered Its Own Safety Level
Since 2023, Anthropic has operated an internal framework called the Responsible Scaling Policy (RSP) — a tiered system of AI Safety Levels (ASL), loosely modeled after the U.S. government's biosafety standards for handling dangerous biological materials.
- ASL-2: Models showing early signs of dangerous capabilities but without practical misuse value; the current Claude generation sits here.
- ASL-3: Models that substantially increase the risk of catastrophic misuse compared to non-AI alternatives, or that show low-level autonomous capabilities. Strict security and deployment requirements apply.
- ASL-4: The next threshold — reserved for models whose capabilities extend to autonomously executing sophisticated attacks or meaningfully contributing to the development of weapons of mass destruction.
Claude Mythos triggered the ASL-4 threshold internally. It's the first time a leading AI lab has finished a model and publicly declined to release it: not because of technical shortcomings, but because its capabilities were judged too dangerous for broad release.
What specifically triggered it? Cybersecurity. The model can construct full attack chains: given a network topology and a set of known vulnerabilities, Mythos can map multi-stage attack paths including lateral movement sequences, privilege escalation routes, and data exfiltration vectors. It can scan entire operating system kernels and large codebases for exploitable flaws — including bugs that have gone undetected for decades. Internal documents warned the model "presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders."
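To make "attack chain" concrete: at its core, this is path-finding over a graph whose nodes are hosts and whose edges are exploitable connections. Below is a deliberately toy sketch in Python; the topology, host names, and flaws are all invented for illustration.

```python
# Toy model of a network: each edge is (host, reachable_host, flaw enabling the hop).
# All hosts and flaws below are invented for illustration.
EDGES = [
    ("internet", "web-01", "unauthenticated RCE in public API"),
    ("web-01", "app-01", "reused service-account credentials"),
    ("app-01", "db-01", "privilege escalation via misconfigured sudo"),
    ("app-01", "ci-runner", "exposed container socket"),
    ("ci-runner", "db-01", "plaintext DB password in build logs"),
]

def attack_chains(edges, source, target):
    """Enumerate simple exploit paths from source to target (depth-first search)."""
    graph = {}
    for src, dst, flaw in edges:
        graph.setdefault(src, []).append((dst, flaw))
    stack = [(source, [], {source})]
    while stack:
        host, chain, visited = stack.pop()
        if host == target:
            yield chain
            continue
        for nxt, flaw in graph.get(host, []):
            if nxt not in visited:
                stack.append((nxt, chain + [(host, nxt, flaw)], visited | {nxt}))

for i, chain in enumerate(attack_chains(EDGES, "internet", "db-01"), 1):
    print(f"Chain {i}:")
    for src, dst, flaw in chain:
        print(f"  {src} -> {dst}  [{flaw}]")
```

Any pentest tool can run this mechanical search. What pushed Mythos into ASL-4 territory is the step before it: reading unfamiliar code at scale and recognizing which flaws create those edges in the first place.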
The benchmark numbers Anthropic published make the capability case plain: 93.9% on SWE-bench Verified, 77.8% on SWE-bench Pro, 94.5% on GPQA Diamond (doctoral-level expert knowledge), 64.7% on Humanity's Last Exam with tools. Every single number is a new record.
For context: Claude Opus 4.6 scored 59.6% on GDPval. Mythos wasn't submitted for public benchmarking at all — Anthropic kept it out of the comparison entirely.
The pricing reflects its positioning: approximately $25 per million input tokens and $125 per million output tokens. Compare that to GPT-5.4 at $2.50/$15: ten times the input price and more than eight times the output price of the strongest publicly available model. And Mythos still isn't public.
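Those per-token rates only mean something against a workload, so here is a back-of-the-envelope comparison. The monthly token volumes are assumptions picked for illustration; only the prices come from the figures above.

```python
# Per-million-token prices from the figures above; workload volumes are invented.
PRICES = {
    "Mythos":  {"in": 25.00, "out": 125.00},
    "GPT-5.4": {"in": 2.50,  "out": 15.00},
}

# Hypothetical monthly audit workload: 500M input tokens (code plus context),
# 50M output tokens (findings and suggested patches).
TOKENS_IN_M, TOKENS_OUT_M = 500, 50

for name, p in PRICES.items():
    monthly = p["in"] * TOKENS_IN_M + p["out"] * TOKENS_OUT_M
    print(f"{name}: ${monthly:,.0f}/month")
# Mythos: $18,750/month
# GPT-5.4: $2,000/month
```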
What This Means for CTOs and Tech Leads
Three consequences I think are worth taking seriously:
First: cybersecurity parity is temporary. When a model like Mythos exists and Anthropic itself says its offensive capabilities outpace defensive ones, the industry has a structural problem. Today Mythos sits in the hands of ~50 companies. In 18 months, what Mythos can do today will be the baseline of a freely available open-source model. That's not speculation — it's the observable trajectory of the past three years. Teams that aren't already industrializing AI-assisted code auditing are building on borrowed time.
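"Industrializing" can start small. Here is a minimal sketch of a CI gate that sends each diff to a model for a security-focused review, assuming the official Anthropic Python SDK; the model ID, the prompt, and the exit-code convention are placeholders to adapt, not a prescribed setup.

```python
import subprocess
import anthropic  # pip install anthropic; expects ANTHROPIC_API_KEY in the env

MODEL = "claude-sonnet-4-5"  # placeholder: whichever model your vendor review allows

AUDIT_PROMPT = """You are a security reviewer. For the following diff, list any
exploitable issues (injection, authz gaps, leaked secrets, unsafe deserialization).
Reply with exactly 'NO FINDINGS' if the diff looks clean.

{diff}"""

def audit_diff(base: str = "origin/main") -> str:
    """Send the current branch's diff to a model for a security review."""
    diff = subprocess.run(
        ["git", "diff", base],
        capture_output=True, text=True, check=True,
    ).stdout
    if not diff:
        return "NO FINDINGS"
    client = anthropic.Anthropic()
    response = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        messages=[{"role": "user", "content": AUDIT_PROMPT.format(diff=diff)}],
    )
    return response.content[0].text

if __name__ == "__main__":
    findings = audit_diff()
    print(findings)
    # Fail the CI job on findings so a human has to look before merge.
    raise SystemExit(0 if findings.strip() == "NO FINDINGS" else 1)
```

The point isn't that this catches what Mythos would catch; it's that the pipeline, prompt library, and triage habits exist before the stronger models arrive.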
Second: vendor concentration is getting riskier. Project Glasswing shows how quickly a model can exit your vendor portfolio — not through competition, but through internal safety decisions. Anyone who has optimized their dev stack around a single frontier model is exposed if that model moves into a new safety tier or access conditions change overnight. This applies equally to Claude, GPT, and Gemini.
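One concrete hedge: keep every vendor SDK behind a thin interface you own, so switching providers is a code review rather than a rewrite. A minimal sketch, assuming the official Anthropic and OpenAI Python SDKs; the model IDs are placeholders, and a production version would also normalize errors, timeouts, and streaming.

```python
from typing import Protocol

class CodeModel(Protocol):
    """The only model interface the rest of your tooling may depend on."""
    def complete(self, prompt: str) -> str: ...

class AnthropicBackend:
    def __init__(self) -> None:
        import anthropic
        self._client = anthropic.Anthropic()

    def complete(self, prompt: str) -> str:
        msg = self._client.messages.create(
            model="claude-sonnet-4-5",  # placeholder model ID
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        return msg.content[0].text

class OpenAIBackend:
    def __init__(self) -> None:
        from openai import OpenAI
        self._client = OpenAI()

    def complete(self, prompt: str) -> str:
        resp = self._client.chat.completions.create(
            model="gpt-4o",  # placeholder model ID
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

def review(code: str, model: CodeModel) -> str:
    # Callers never touch a vendor SDK directly; swapping vendors is one line.
    return model.complete(f"Review this code for security issues:\n\n{code}")
```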
Third: the governance gap is now visible. Anthropic made its own decision before any regulator asked them to. That's notable — and it raises the question of what happens when the next lab has fewer scruples. ASL-4 as a threshold is Anthropic's definition. There's no international body that verifies or enforces it. That's not an attack on Anthropic — it's a description of the current state of the industry.
One more piece of context worth noting: Anthropic spent March 2026 in legal conflict with the Pentagon after refusing to allow Claude to be used for autonomous weapons systems and mass domestic surveillance. The decision to withhold Mythos didn't come from nowhere — it's part of a consistent pattern from a company that apparently means what it writes about AI risk.
This Is Exactly Where nopex Comes In
What the Mythos situation makes clear: the capability curve is running faster than any governance structure can follow, and the decisions about access, pricing, and safety tiers are being made in San Francisco — not in London, Berlin, or Amsterdam.
For companies using AI in their software development, this means: building exclusively on proprietary U.S. frontier models creates a dependency whose terms you don't control. Model access can get more expensive overnight, be restricted, or be cut off entirely, for reasons that have nothing to do with your use case.
nopex solves this directly. We combine agentic software development with infrastructure that avoids single-vendor lock-in: European data centers, open models where possible, proprietary frontier models where they add clear value, and always the ability to switch. No model disappears from your stack because of a compliance decision made in San Francisco.
What Anthropic demonstrates with Mythos and Project Glasswing is fundamentally positive: a lab that takes its own capabilities seriously. What it demonstrates as a side effect is how fragile model access is as infrastructure when you don't control it yourself.