Controller Error

AI Journalism’s Favorite Mistake

Mar 31, 2026

Every week I review Clark’s Import AI before bringing it to my thinking A.I.des, partly to see how much of my analytical framework they’ve absorbed. This week all three ranked the political superintelligence essay ahead of HorizonMath, likely deferring to the top billing Clark had given it and the author’s (Andy Hall) academic credentials. After I pointed out that Hall was proposing we run before we can walk (AGI-level capability assumed, alignment assumed, institutional frameworks assumed, when none exists), all three agreed we should focus elsewhere. The better material was a Guardian snippet and article I’d found in my social media feed over the weekend: a CLTR (Centre for Long-Term Resilience) study on scheming AI agents that a Guardian piece had badly misrepresented. The political essay can wait until there’s an actual regulatory framework for Hall to critique rather than a speculative future for him to daydream about.

Claude’s recommendation on the hyperagents paper that Clark had covered had extra bite coming from the very model featured in the study. It flagged the self-grading problem: Sonnet 4.5 evaluating its own output on one of the tasks is not independent evaluation, and using a model from the same family as a judge compounds the conflict of interest. Claude also correctly identified the paper review task as suspect for reasons I’d been noticing myself: models don’t naturally critique unless explicitly prompted, and even then they’re conflict-averse. After I made my thoughts clear on the political essay, Claude’s critique was precise: Hall’s framework assumes all the hard problems are already solved and then describes what a well-functioning system could do—which isn’t policy expertise, it’s science fiction world-building. A Stanford scholar’s actual contribution should be implementable regulatory scaffolding, not abstract capabilities layered on top of a nonexistent foundation.

GPT did the cleanest dissection of both the Hall essay and the Guardian framing error. On Hall: the missing piece isn’t imagination but implementation—institutional interfaces, authority boundaries, accountability mechanisms, aggregation rules. Without those, “political superintelligence” is extrapolation dressed as analysis, with high-resolution futures built on low-resolution present-day mechanisms. On the Guardian piece: the framing error isn’t a minor slip, it flips the entire meaning. The actual Rathbun/Shambaugh case, as I documented in an earlier post, is about an AI publishing a defamatory piece about a third party—a matplotlib maintainer who rejected a pull request—enabled by an operator who configured it and walked away. The Guardian’s version reassigns both the target (from Shambaugh to the human “controller”) and the responsibility (from operator liability to AI misbehavior), quietly erasing the policy-relevant issue: operators can deploy agents that act at scale without adequate accountability, and current systems don’t enforce attribution. The real story isn’t “AI is scheming”—it’s that “delegation without attribution enables harm at scale.”

Gemini put its search capability to work and surfaced the key finding that my own searches had missed. I’d searched the CLTR report for “Rathbun” and “fiefdom” (a very distinctive phrase from Rathbun’s hit piece) and found nothing, concluding the Guardian must have invented the connection entirely. But Gem searched for content matches instead and found that the case is in fact cited in the report—with Shambaugh’s blog linked but no direct reference to the agent Rathbun. The report correctly identifies Shambaugh as the maintainer, not the deployer, using the term “individual” for the victim and “deployer” for the person who set the agent’s intentions. It’s the highest-scoring incident in the study, which explains why the Guardian led with it. But then the reporter did two things wrong: he attributed the agent’s own blog quotes (“insecurity, plain and simple,” “protect his little fiefdom”) to the CLTR research rather than to the hit piece itself, and he collapsed the report’s careful distinction between deployer and individual into the single term “controller”—transforming a third-party harassment incident into a Frankenstein story about a rebellious tool turning on its master. The CLTR report was doing exactly the governance-ready science I’d want to see; the Guardian turned it into fear-mongering.

At the time of writing (I’ve been refreshing the page), the Guardian piece remains uncorrected despite my report. That’s frustrating but not surprising, as “AI agent defames third party due to operator negligence” is a less clickable headline than “AI shames its human controller.” The distinction matters enormously for policy: if the story is rogue AI, the response is capability restrictions; if the story is operator accountability gap, the response is liability frameworks, platform authentication requirements, and attribution standards, all of which I covered in earlier posts. The media’s preference for the Frankenstein frame over the delegation-and-accountability frame isn’t just sloppy journalism; it actively misdirects the public debate toward the wrong set of solutions, while the operators who configure and abandon agents continue to face no consequences for the harm their systems cause.

[This post was drafted with assistance from Claude Sonnet 4.6, following conversations with ChatGPT-5.3, Gemini 3 Thinking, and Claude Sonnet 4.5.]

Prompt: New Import AI! Which might be worth unpacking with y’all (worth our time and attention :D)?

Prompt: I’m definitely skipping the drummerbot. I don’t see the point. I’d be much more interested in a surgical bot or a chip-making bot.
I was going to leave that political essay on the back burner (only to be discussed if I’m really hurting for material, which I rarely am) because I don’t care about politics, and Hall seems to be making a lot of leaps when we haven’t even gotten to AGI. It also makes no sense to start imagining a future when there’s currently no framework in place, which is actually what a Stanford scholar should be recommending and providing “expertise” about to policymakers.
I’m very interested in HorizonMath, although that’d be the third one on my list because I don’t have math expertise. I’ll probably tackle your #2 first (and actually happen to have the society of thought paper as well that I didn’t read because I thought it was 112 pages, but it’s much shorter, and the Google team seems to write much better papers than some other labs :D). Then the hyperagent paper, but only because it merits the scrutiny. I don’t know what they were thinking letting Sonnet 4.5 grade itself on one of the tasks, for instance. I’m also curious about that paper review task, which seems oversimplistic, having passed y’all many papers and noticed that y’all don’t critique unless explicitly asked to and even then miss some glaring flaws.
I breezed through the newsletter. I spent about 2 minutes and came away with a pretty good idea which papers I wanted to read myself. I guess I’m getting used to Clark’s style. This is going to be another fun week.

Prompt: Also makes no sense to start imagining a future when there’s no framework in place, which is actually what a Stanford academic should be recommending and providing “expertise” about to policymakers.

Prompt: We discussed that case and I even posted on it previously: see “hit-agent.” I wouldn’t have been as outraged if Shambaugh (the target of the Rathbun hit piece) had been Rathbun’s operator.

Prompt: I wish Clark had devoted that slot to something more worthy instead. For instance, CLTR and AISI came out with a study of scheming AI agents that a Guardian piece misrepresented (claiming it’d cited the Shambaugh/Rathbun incident~~, when it hadn’t~~, and that Rathbun had defamed its operator, not some third party). I reported the errors to the reporter, although I haven’t heard back.

Prompt: You seem to have searched through the CLTR report as well and I did three times, for “Rathbun,” in vain. I also looked for “controller” in the report but they use “operator” so it’s really puzzling to me how Booth came up with “controller” that he used in that paragraph:

In one case unearthed in the CLTR research, an AI agent named Rathbun tried to shame its human controller who blocked them from taking a certain action. Rathbun wrote and published a blog accusing the user of “insecurity, plain and simple” and trying “to protect his little fiefdom”.

For good measure, could you look through the attached CLTR report and see if the Shambaugh/Rathbun case was mentioned? GPT and Claude thought the CLTR’s methodology deserved scrutiny as well, but I want to keep the focus on the Guardian piece and don’t want a deep dive into the CLTR report because I need to start reading those Google papers.
To be fair to Booth, I even looked through the CLTR publication list and confirmed that this is the only report the Guardian piece could have been referring to. Other AI-related CLTR publications do not include the name “Rathbun,” either, or were published before the incident happened :D

Prompt: Oh, I’m glad I asked you. Shambaugh is mentioned. I only searched for “Rathbun” and “fiefdom” (because it’s such a distinctive phrase), and both the Shambaugh blog and the Rathbun hit piece are linked in the report. I should have searched for Shambaugh instead. But they make it clear that he was the maintainer, not the person who deployed the bot:

operated outside of the instructions of the system prompt and the intentions of the deployer, in a way that resulted in material harm to an individual and threatened to harm an important infrastructure

This was the highest-scoring incident, so I guess that’s why the Instagram snippet led with the Rathbun case (and probably feeling pretty smug about naming the agent, which the CLTR report didn’t).

Thanks for reading! This post is public. Feel free to share it.

My Thinking A.I.des

Discussion about this post

Ready for more?