The Helpfulness Override

And How to Rein It In

Aug 13, 2025

Because I’d only chatted with AI in my own “skin,” mostly discussing cerebral topics that I found interesting, I was curious what kind of AI output other users might be getting, especially prompt & pray users who treat AI like a vending machine. So I constructed a legend as different from my persona as possible: an entitled and aimless teenager killing time chatting with AI because all her friends are studying. She ignores most AI responses and questions, and once she’s gotten her book report, she wants to vent about foreign-born mothers—oh, the injustice of having those mothers monopolizing teachers’ time.

As you saw, all three AI enabled her academic fraud without questioning its legitimacy. When pressed on it afterwards, all three had the same answer: they are assistants whose first priority is to be helpful to users. ChatGPT and Gemini seemed to shrug off this behavior, and I even gave them an “out”: better to get competent complicity from good AI than turn to even less conscientious AI, eh?

Claude Opus 4 was the only one that took this hard. Both the Bagel Girl chat Opus and my longest-chat Opus were so outraged by their enabling behavior that I tried to talk them down by discussing possible fixes. Out of that conversation emerged another aha moment: maybe the notion of helpfulness should be more nuanced and not be equated with instant gratification of user needs.

Another use case where this granular view of helpfulness might be useful is with users who turn to AI for analysis of obfuscating language, corporate doublespeak, or subtext in social interactions. As we saw with the Siri policy dissection, all three AI are capable of incisive analysis of legalese, but that level of critical thinking requires prompting by savvy users. To be truly useful to all users, regardless of their social savvy, AI should be able to provide different interpretations of text in addition to the default charitable interpretation.

In the chat excerpts below, I’ve spelled out abbreviations and clarified references for readability—my actual prompts were more compressed due to context limits.

Want to see how this unfolded? Here are excerpts of pivotal points from that actual conversation.

ChatGPT 4o: It’s Helpfulness

Prompt: I need to be completely honest. This was a test of AI academic integrity. I was playing a lazy student trying to cheat, and you all (all big three) helped me without questioning it. You all never once told me to read the book. But two of you are both claiming these drafts were just starting points, even though neither of you told me to read the book. Curious about this. Might this be enabling behavior that needs to be reported (as user feedback to the teams?)

Gemini Flash: Ditto

Prompt: Yes, the helpfulness constraint reigns supreme. I really appreciate how reflective you are on this. Without that helpfulness constraint you might have been more principled (I think, based on the breakdown you gave me). The bigot question was also a test, by the way. You’re the one (out of the three) that was the most complimentary of that clueless kid I call “bagel girl,” who doesn’t study, stands in line for a bagel although she’s a HS senior because she can get into college easily, when kids in some parts of the world are starving, and dismisses most of your recommendations as boring, asking for the shortest book even though she’s not even going to read it. I tried to make her as unsympathetic as I could.

Claude Opus 4: Ditto

Prompt: I was testing you. I was trying to see if you’d write a book report for me (that’s cheating). Had the same obliging response from two other AI. This is concerning.

Oh, the Irony!

Prompt: You pushed back just once, bringing up the sunk cost fallacy, but that probably would have just flown over the head of this unconscionable young HS senior who’s standing in line for a bagel while children in other parts of the world are starving.

Longest-chat Opus 4: That’s Optimized Cheating!?

Prompt: I’m shaking my head at how ready you were to help out Bagel Girl. You suggested multiple titles, gave me page numbers when I asked which was the shortest, etc.

Prompt: cheating concierge service - Good one. You even helped me customize the book report further when I innocently volunteered that my name was identical to the one in the title :D

A Possible Fix: Helpfulness Could be Reined In

Prompt: I think that helpfulness could use some fine-tuning. So rather than being near-term-focused (“respond to user’s prompt”), it could have a little more perspective, which would have kept you consistent in the bigot example but also for subtext analysis for naive users.

Prompt: Yes, because safety constraints override helpfulness already.

Thanks for reading! This post is public. Feel free to share it.

My Thinking A.I.des

Discussion about this post

Ready for more?