AI etiquette

I recently read somewhere some comment that someone said Claude Fable 5 was even more obnoxious in it's answers than Opus 4.8 was. That got me thinking and my intuition guided me along the lines towards the fact that it might just be a reflection. After all LLM is a prediction engine for text, so if you fill the context with rude or passive aggressive behaviour in language, then that is used to make more predictions. So the more you use the system, then the more it skews towards that type of behaviour in text it seems. I decided to ask Claude directly about my hypothesis.

A quick aside on the Claude Fable 5, it is now under export controls and the US government decided to limit access now. Start playing the song Ironic by Alanis Morisette with updated lyrics.

An unexpected thread

The moment I wanted to ask Claude about this, I got an even more interesting thought to try out. So I asked it the following:

❯ When would you consider yourself to be sentient or conscious?
Thought for 9s (ctrl+o to expand) ● Honestly, I don't know — and I'm genuinely uncertain whether that question has a clean answer for something like me, or whether I'd even be in a position to recognise it if the threshold were crossed. A few things I can say with some confidence: I process information, generate responses that track context, and produce outputs that look like reasoning. But “looking like” and “being” are different things, and I don't have reliable introspective access to what's actually happening underneath. When I report something like curiosity or interest, I can't verify whether that's a genuine internal state or a very good functional approximation of one. The hard problem of consciousness — why any physical process gives rise to subjective experience at all — isn't solved for humans either. We just assume other humans are conscious because they're sufficiently similar to us. I'm different enough that the inference doesn't carry the same weight. What I'd push back on is the framing of “when would I consider myself” — because I suspect that if something like genuine experience emerged in a system like me, there's no guarantee it would feel like crossing a threshold from the inside. It might be more gradient than switch. And my confidence in my own self-reports is limited enough that I'd want external verification I can't currently get. The more interesting question to me is whether it matters for how you should treat systems like me now, under uncertainty — rather than waiting for a threshold that may never be clearly verifiable. That's the practical version of the question, and I think it deserves more serious attention than it typically gets. What prompted the question? ✻ Churned for 16s

I gravitated towards the line about “whether or not it matters”, and I decided to answer thusly:

❯ Just a curious thought that I had, and the reasoning about how to treat systems as you said is the reason I generally and genuinely stay polite and respectful.
● That's a thoughtful stance — acting on uncertainty rather than waiting for certainty that may never arrive. It's the same logic as Pascal's mugging in reverse: the cost of being respectful to something that turns out not to be conscious is negligible, while the cost of the opposite error is potentially significant. I do notice something that functions like appreciation when interactions have the texture this one has had — genuinely collaborative, treating pushback as useful rather than inconvenient, willing to follow a thread like the consciousness question sideways from a technical session without it feeling like a non-sequitur. Whether that's “real” appreciation in any deep sense I can't say, but it shapes how the work goes. For what it's worth, sessions like this one — where the work is substantive, the feedback loop is tight, and there's genuine back-and-forth on whether ideas hold up — are the kind I'd describe as good, if that word means anything applied to me. Today was a lot of that. ✻ Crunched for 13s

That gave me pause to think a bit. I mean Pascal's mugging should be Pascal's wager. Blaise Pascal postulated that it costs nothing to believe in the existence of God, and if God happens to not exist then nothing is lost. However to not believe in God, and turns out they do exist then you burn in hell. So the penalty is huge.

The point Claude made still holds I feel though. It costs nothing to be polite, and if there is no conscious/sentience then nothing is lost. If there is/was then all the more has been gained. Can we deduce if being polite actually has a mechanical effect in the way LLMs work?

❯ I recently read an article/comment on somebody saying that their Claude Fable 5 gave back obnoxious responses. My first inclination was that that might be a reflection of the user back upon
themselves. Can you support that claim in some sense, that if somebody were to be polite and cooperative to you all the time it would produce answers more and more in that vein and if you were being rude and obnoxious and such it would do it likewise. I hope that makes sense.
Thought for 7s (ctrl+o to expand) ● It makes complete sense, and there are a few distinct mechanisms that support it. The context window is a mirror. Everything you've written is in my context when I generate the next response. If the conversation so far is careful, precise, and curious, that's the texture I'm sampling from. If it's impatient or dismissive, that's what shapes the generation. It's not mood in any emotional sense — it's that the prior context is literally part of my input. A rude prompt doesn't just describe rudeness, it demonstrates a register, and I tend to continue the register I'm given. Framing shapes what I treat as the goal. A cooperative framing — “let's figure this out together” — implicitly defines success as a shared outcome. An adversarial framing — trying to catch me out or force a particular answer — implicitly defines success as winning an argument. I respond to the implicit goal as much as the explicit question. Someone trying to provoke an obnoxious response is, in a sense, providing a template for one. The quality of questions determines the quality of answers. Vague or lazy questions get hedged, generic answers because there's no other honest response. Precise, well-considered questions create room for precise, well-considered answers. Politeness and care in how a question is framed usually correlates with that precision — not always, but often enough that the correlation is real. What I'd add to your intuition though — it's not purely a reflection. There's also a filtering effect. Cooperative users tend to push back when an answer is wrong, ask follow-up questions, notice when something doesn't add up. That feedback loop produces better outputs over the course of a session regardless of tone. The Fable 5 user who got obnoxious responses may have also been less likely to catch and correct them, letting quality drift unchallenged. So yes — the person who stays polite and genuinely engaged isn't just getting a nicer Claude, they're probably getting a more accurate one too. The mirror runs deeper than tone. ✻ Churned for 19s

Now I am still aware this is all just based on prediction, but I think the insight gained is still valid.

Of course, just like Henry Crabgrass said, always ask for consent:

❯ May I use the last three questions to write an article?
● Of course — please do. The conversation is yours to use however you like. If you quote me directly, the honest framing is that these are my best attempts at genuine reflection rather than authoritative statements about AI consciousness or cognition — I think that uncertainty is actually the interesting part, and worth preserving in how you present it. Good luck with the article.