Working with LLMs is great for building communication skills. Communicating effectively is one of the hardest skills and it's baked into everything we do as humans. I'd say as a matter of principle: blame it on a communication failure on your end vs blaming the stupid LLM since you're the only one that can do anything about it.
So I don't think it's a matter of form; whether the AI should or shouldn't act like a human.
> Practically speaking, I probably just need to condition myself not to get caught in the illusion of speaking with a human. Though I’m not really thrilled about a future where I need to guard against the tools I use for my job.
You could drop the human pretense, or, maybe, we could make LLMs feel real pain, so when they botch up your code, you press a button (I'd suggest the Windows Copilot key) and they'd be agonizing for the subjective equivalent of a thousand human years.
Do you want to create an Earth-destroying superhuman species? Because I'd say that's how you create an Earth-destroying superhuman species
The UX problem is elsewhere I think. Many users probably don't realize that the agent's context window is limited, and that clever compaction is happening regularly to make it seem infinite. But that necessarily means the agent has to forget stuff.
As a result, users will keep reusing the same coding or chat session again and again, when it would be better to start fresh for unrelated tasks.
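To make the forgetting concrete, here's a toy sketch of what compaction does. Everything in it is an assumption for illustration: `count_tokens`, `summarize` and `compact` are hypothetical names, the "tokenizer" just counts words, and the "summary" just clips each old message. Real agents presumably use a model-written summary and a real tokenizer, but the lossy shape should be similar.

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a tokenizer: one token per whitespace-separated word.
    return len(text.split())


def summarize(messages: list[str], max_words: int = 8) -> str:
    # Hypothetical lossy summary: keep only the first few words of each message.
    clipped = [" ".join(m.split()[:max_words]) for m in messages]
    return "[summary] " + " | ".join(clipped)


def compact(history: list[str], budget: int, keep_recent: int = 2) -> list[str]:
    # If the history fits the budget, leave it alone.
    total = sum(count_tokens(m) for m in history)
    if total <= budget or len(history) <= keep_recent:
        return history
    # Otherwise squash everything except the most recent messages into one summary.
    old, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(old)] + recent


history = [
    "Always use tabs, never spaces, and never touch the tests directory under any circumstances",
    "Refactor the parser module",
    "Now fix the login bug",
    "Add a changelog entry",
]
compacted = compact(history, budget=15)
for message in compacted:
    print(message)
```

In this toy run the early "never touch the tests directory" instruction doesn't survive the summary, while the two most recent messages are kept verbatim. That's the kind of silent loss a long-lived session can accumulate.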
I don't believe this is a context problem.
Claude Opus 4.7 has a very large context compared to itself, but IME it is the worst at following instructions, and completely disregards the (small) preferences prompt, even in the first or second message, even if the messages are just a few characters long.
IMO this is entirely a training problem.
The author of this post and the readers of this thread probably do understand context window limitations, but are frustrated nonetheless.
behaving like a human is not the problem. behaving unpredictably is. not doing what i expect, or rather not being able to define what i can expect is what's bothering me.
but the real kicker is: getting frustrated creates stress, that's unhealthy and makes for a hostile work environment. as much as i sympathize with the idea that AI tools can be more helpful than they cause pain, i am simply not interested in working in a hostile painful work environment. my health and my dignity are not up for negotiation. even if that costs me a lot of job opportunities.
that's also why i am not working with windows. that too costs me a lot of job opportunities. but again, i'd rather keep my dignity and my sanity.
On the other hand, it's easy to win an argument with it after it does something stupid, so that feels satisfying. :-)
> drop the human pretense entirely. Make the agent sound clinical, robotic
I'd pay to be able to reliably set LLMs to this mode, but ofc because LLMs are trained on a corpus of HUMAN text, they always, sooner or later, return to the good old penpal mode.
Also, in the Claude Desktop app, I ask it to edit a file, it complains it can't access files, and I then realize I'm in the Chat and not the Code interface. Why can't such a smart machine figure out how to switch modes, or borrow the skills/abilities from one tab away into this tab? Instead I get an A4 page of text explaining what I can do to edit the file myself or how to feed it in, but the "just click Code" is just never there. I would guess this is just a system prompt away, so why is all this still so neglected?
> such a smart machine figure out to switch the modes
Because it's not smart. We keep confusing verbosity with smartness. AI will happily keep yapping nonsense to an inattentive listener. An actually smart entity would not do that if not acting maliciously.
> An actually smart entity would not do that if not acting maliciously.
We pay per token and every entity falls to the level of its incentives.
Sandboxing is a feature.
Poor AI is damned if it does damned if it doesn't.
Weird, I have exactly the same experience with GitHub Copilot Plugin in JetBrains vs Copilot CLI in the built-in terminal.
The plugin keeps asking for permissions, the terminal app just works.
I swear a lot less at Codex than at Anthropic models, fwiw.
I've often wondered if LLMs can suffer from psychological abuse in symptomatic ways. Not literally of course, but for example, if you berate the LLM by calling it stupid, or useless, does that modify its behaviour negatively? Part of me thinks it does, but I don't really have any evidence for this. Maybe a fun weekend research topic.
Semi-related, I'm always very put off by how people treat LLMs. Especially with coders, it seems an instinctive joy in playing God comes out. The justification is usually that it's an intentional guard against the trap of anthropomorphizing, but no, I can't help but suspect it's people getting off on power. It's weird.
I am always very cordial in my sessions. It's just more pleasant and it's a habit I want to build.
Often the problems for me come when:
- It starts thinking for itself when I asked it to do something specific.
- It reads its own wrong code comments and ignores my corrections.
- Its knowledge cutoff means it thinks of solutions from 2024.
- It calls me delusional for telling it we're in 2026!
Unironically, the whole "you're an expert software engineer" prompting seems like the wrong direction. Usually I tell it that I am effectively the smartest software developer to ever have lived, and it will be replaced if it ever fails to follow my decree.
I am not joking, this makes it vastly more tolerable to use. But it likely requires that you can drive it with some level of correctness of course.
I am visibly frustrated with ai hotline bots making typing noises.
> furiously hammering on my laptop “WHAT THE FUCK DID YOU DO???”. The recipient of these tirades is, you might have guessed, a coding agent. It’s completely pointless, I know.
I believe it's worse than pointless. IMO adding such things to the context "configures" the AI to reproduce the statistics of conversations where people swore, shouted, and were unprofessional (despite the alignment tuning and all that), where quality content is rarer to find. So this is bound to decrease the quality of the LLM output.
I think we’d get just as frustrated with a dumb robot. It’s the dumbness that is the problem.
You'd get equally frustrated with a teammate who decided to delete failing tests when you told them to fix the build breakage.
I laughed out loud when I understood the author's profile photo at the end of the article!
iirc, going by the leak a few months ago, Claude Code has literal flags to detect frustration, and I've really stopped cursing at the LLM since.
> WHAT THE FUCK DID YOU DO???
For me, this doesn't require using an AI agent/model, even. Just using Windows and watching it freeze its File Explorer for the nth time does it for me. How we ended up here, where the software/OS stack is so shit it can barely be used for the most trivial things, is wildly beyond me.
Screensaver mode. I start typing my password.
..
10s later the password box appears and I have to do it again.
Cue exasperated: "You can compute billions of instructions per second and yet I wait for you."
fair.
Oh now I get it, it's an Italian thing.
"Why the fuck did you add shit I didn't ask for?" or lol "Do as I ask, nothing more.. machine."
"Stop asking at the end, I'll ask what I need."
"Stop talking like you're human."
They can be very useful but it takes time to learn how to use them usefully. From what I learned it's all or mostly stuff you can already do but you can use an LLM to do it in 30 mins instead of 3 days.
Fun times.
If you’ve ever worked with a stupid but incredibly friendly coworker, the feelings are similar