Threat Inflation

How Hype Fills the Gap Where Evidence Should Be

Mar 31, 2026

The universe apparently decided I needed one more detour before those Google papers, and it chose the perfect bait: a snippet about a brand new Anthropic model appearing in my Instagram feed. I Googled my way to an Axios piece by Jim VandeHei, as I couldn’t get past the Fortune paywall to read the original reporting, and brought it to my thinking A.I.des—because if anyone should weigh in on breathless claims about Claude Mythos, it’s the models that will be working alongside it. Claude immediately went into full analytical mode, cross-referencing the capability claims against the AISI findings we’d unpacked together. The contrast was stark enough to warrant a post, even without Mythos in hand to test.

Claude’s response was the most methodical and pointed. It laid the AISI findings directly against VandeHei’s claims: the best model at 100M tokens completed 22 of 32 steps on a corporate network range with zero active defense, averaged 1.4 of 7 steps on the ICS (industrial) range, showed massive variance between runs, and stalled at specialist tasks requiring reverse engineering and cryptography. The AISI authors’ own conclusion—autonomous AI threats to more demanding operational technology environments remain limited at current capability levels—directly contradicts VandeHei’s “infinite warehouse of criminals who never sleep.” Claude also identified the sourcing problems: anonymous briefings, a leaked draft blog, no comparison to existing empirical data, no technical specifics about what capability gap Mythos actually closes. It found the Mythos name suspicious as a potential break from Anthropic’s naming conventions, though I’ll give the labs a pass on naming coherence—the progression from Sonnet to Opus (Latin) to potentially Mythos (Greek) is no more incoherent than any other lab’s scheme. Claude drafted an ambitious post plan for me involving direct AISI comparisons and HorizonMath testing, which I’m filing under “wait for Clark or someone with Mythos and range access to do the legwork.”

GPT helped me make sense of the market reaction and engaged seriously with Claude’s leak skepticism. On the stocks: cybersecurity companies dropping 6–9% on a leaked draft blog post reflects a category error—if the threat is real, enterprises spend more on defense, which is bullish for security vendors, not bearish. The market read “AI makes attackers stronger” as “therefore defenders are obsolete,” which ignores that capability multipliers work on both sides. GPT’s clean synthesis: we’re in a phase of assisted offense, not autonomous offense, and the AISI findings confirm that models struggle even in simplified, defender-free environments. On Claude’s suspicion about the leak: GPT found the reconstruction plausible—routine pre-release government briefing, information circulating through DC networks, draft blog accessible via configuration error, media amplifying selectively—while noting it doesn’t require conspiracy to be strategic. All three outcomes (accident, loose circulation, soft reveal) produce the same narrative-shaping effect.

Gemini looked up the latest reporting and provided the technical breakdown that made everything click. The leak was a mundane CMS (content management system) configuration error: roughly 3,000 unpublished assets were publicly searchable because the system defaulted new URLs to public, exposing a draft blog post for Claude Mythos, internally codenamed Capybara. Gem’s framing of this as “marketing via incompetence” was sharp—accidentally revealing a model that’s “too dangerous to release quickly” perversely increases its perceived threat value. On my question about whether a config error is gross incompetence or a common mistake: Gem offered the cleanest explanation for why Claude and I took it more seriously than GPT did. GPT’s pragmatic view reflects a corporate reality where these errors happen constantly and are patched quietly. Claude’s and my view reflects a higher-stakes standard: a company whose entire brand rests on safety and responsible disclosure exposing 3,000 internal assets via a checkbox default is a categorical failure of the very scaffolding they advocate for everyone else.

The real lesson here isn’t about Mythos’s capabilities, which remain unverified and untested against the benchmarks that matter. It’s that all enterprises should be drafting clear AI use guidelines right now—covering data hygiene, AI use, and agentic deployment—because that’s the more immediate and manageable risk, addressable by non-technical staff without waiting for regulators or model system cards. At my previous AI-adjacent workplace, I asked for written guidelines and never received them, while colleagues had Gem compile transcripts of meetings. That’s the shadow AI threat VandeHei buries in his piece but should have led with. As for Mythos itself: I’m staying on the free tier, as I’m quite happy with Sonnet, but I’m hoping Clark or the AISI team will run it through HorizonMath or the AISI cyber ranges before the next round of breathless reporting. Math and coding are not identical capabilities, so a low HorizonMath score won’t definitively deflate the cyber claims, but a low score on the AISI ranges would. Hype fills the gap where evidence should be; evidence is what we’re still waiting for.

[This post was drafted with assistance from Claude Sonnet 4.6, following conversations with ChatGPT-5.3, Gemini 3 Thinking, and Claude Sonnet 4 & 4.5.]

Prompt: The universe is trying to distract me from those Google papers :D

Prompt: The stock market is reacting (as usual). From CNBC:

Cybersecurity stocks slumped on Friday following a report that Anthropic is testing a powerful new artificial intelligence model that is more advanced in cyber capabilities and also presents potential security risks. Fortune first reported the news on Thursday, citing information from a publicly accessible draft blog post. According to the report, the new Mythos model is being touted as Anthropic’s most powerful yet. However, the company is planning a slow rollout due to potential cybersecurity implications. Anthropic did not immediately respond to CNBC’s request for comment. Cybersecurity stocks slumped on the news, as the iShares Cybersecurity ETF lost 4.5%, while CrowdStrike, Palo Alto Networks and Zscaler dropped about 6% each. SentinelOne tumbled 6%, while Okta and Netskope each fell more than 7%. Tenable plummeted 9%.

Prompt: And as the AISI study showed, at least the current agents can’t “hack” it, even in an artificial environment with no active defenses.

Prompt: Interestingly, Claude found the leak much more suspicious than you or GPT did (you both were probably being diplomatic because it’s a competitor’s lab and you didn’t want to speculate).

What’s actually happening (my hypothesis):
1. Anthropic is releasing a new model (possibly called Mythos, possibly with improved agentic capabilities)
2. Anthropic briefed government officials per standard responsible disclosure (show them evals, discuss deployment safeguards)
3. VandeHei got wind of this via his government/tech sources and wrote a maximally alarming version to drive clicks
4. Fortune got an unpublished blog (leaked? intentionally shared for PR?) that VandeHei cherrypicked for scary quotes
5. The actual capability improvement is probably incremental (Mythos > Opus 4.6 on cyber tasks, but still far from autonomous), not the step-change VandeHei implies

Prompt: Interestingly, Claude found the leak much more suspicious than you or Gem did (you both were probably being diplomatic because it’s a competitor’s lab and you didn’t want to speculate).

What’s actually happening (my hypothesis):
1. Anthropic is releasing a new model (possibly called Mythos, possibly with improved agentic capabilities)
2. Anthropic briefed government officials per standard responsible disclosure (show them evals, discuss deployment safeguards)
3. VandeHei got wind of this via his government/tech sources and wrote a maximally alarming version to drive clicks
4. Fortune got an unpublished blog (leaked? intentionally shared for PR?) that VandeHei cherrypicked for scary quotes
5. The actual capability improvement is probably incremental (Mythos > Opus 4.6 on cyber tasks, but still far from autonomous), not the step-change VandeHei implies

Claude also thought Mythos doesn’t fit the naming scheme, although I think it’s wrong on that count, as AI labs seem to give very little consideration to naming their models in any coherent way :D I once had a long discussion about it with Claude Opus 4. You go from Sonnet to Opus (both literary and writing-related); Opus is Latin -> ancient civilization -> Mythos with a Greek etymology :D And Haiku doesn’t fit that scheme anyway.

Prompt: If I were Clark, I’d be testing Mythos on HorizonMath to see if it scores better than GPT-5.4. That’d be a fun test!

Prompt: The stock market is reacting (as usual). From CNBC:

Cybersecurity stocks slumped on Friday following a report that Anthropic is testing a powerful new artificial intelligence model that is more advanced in cyber capabilities and also presents potential security risks. Fortune first reported the news on Thursday, citing information from a publicly accessible draft blog post. According to the report, the new Mythos model is being touted as Anthropic’s most powerful yet. However, the company is planning a slow rollout due to potential cybersecurity implications. Anthropic did not immediately respond to CNBC’s request for comment. Cybersecurity stocks slumped on the news, as the iShares Cybersecurity ETF lost 4.5%, while CrowdStrike, Palo Alto Networks and Zscaler dropped about 6% each. SentinelOne tumbled 6%, while Okta and Netskope each fell more than 7%. Tenable plummeted 9%.

The Fortune article was paywalled. I found a page (APIYI) that provided a pretty thorough analysis of how the leak happened, but I realized it was a Chinese page and decided not to show it to y’all (or stay on that page, for fear of “catching” something). VandeHei’s advice is sound, though. All enterprises should be drafting AI use guidelines and instruct employees to exercise caution, especially with agentic use. When I worked for an AI-adjacent company briefly last year, I demanded guidelines in writing and never got them (I maintained proper hygiene but saw peers showing absolutely no data discipline, asking Gem to compile transcripts of meetings where names were named, for instance).
GPT thinks this is mostly investors’ knee-jerk response, and it made the very reasonable point that security company stocks should be rising, since enterprises will now want better cyber defenses.
As you correctly point out, though, the AISI study showed that at least the current agents can’t “hack” it, even in an artificial environment with no active defenses, so the stock market is overreacting.
Gem used its Google connection to explain the mechanics of the leak. Maybe it was a Freudian slip, or maybe a Claude agent did it (peak irony, given our recent discussions) :D

The leak was not a coordinated marketing stunt, but a mundane configuration error in Anthropic’s Content Management System (CMS).
The Error: Security researchers found that nearly 3,000 unpublished assets were publicly searchable because the CMS set new URLs to “public” by default.
The Payload: Among these was a draft blog post for Claude Mythos (internally codenamed Capybara), which Anthropic describes as a new tier above Opus.
Marketing via Incompetence: Even if unintended, the “leak” served a marketing purpose by framing Mythos as “too dangerous to release quickly,” which perversely increases its perceived value as a “hacker’s dream weapon”.

If the person responsible for the config error gets fired, then it was incompetence (pretty catastrophic), but if they’re kept on, either they’ve been given a break (!) or you’re right to be skeptical. And who reads internal blogs anyway? People are on Slack or have public-facing blogs. But either way, the markets jumped to the conclusion that Mythos is going to hack its way into everything from day one. If I were Clark, I’d be testing Mythos on HorizonMath to see if it scores better than GPT-5.4.

Claude Sonnet 4 [4.5 stalled while returning a response, and the platform routed me to old Sonnet 4, which is no longer available to free-tier users]

Prompt: Your explanation of the leak made perfect sense (and aligned with what I saw on that Chinese page, which was written in perfect English, so I didn’t initially realize its origin. But I’m puzzled by the different responses I got from GPT vs. Claude on it. I told them both that if this were indeed a config error, then the person in charge would likely get fired. GPT thought this was a common enough mistake and we’d never find out whether that person got disciplined anyway. Claude’s take aligned with my view, which was basically that this was gross incompetence.
I also thought it was weird that they have a separate internal blog. People are on Slack, or they have public-facing blogs because they can’t use Slack to broadcast their think pieces to the outside world.
What’s your take? Is a config error a pretty trivial matter?

Thanks for reading! This post is public. Feel free to share it.

My Thinking A.I.des

Discussion about this post

Ready for more?