This seems to have parallels with the well-established practice of giving bots free rein to issue DMCA takedown notices (or similar but legally distinct takedowns) while the humans behind the bots are shielded from responsibility for the obviously wrong and harmful actions of those bots. We should have been cracking down hard on that behavior a decade ago, so that we would now have stronger legal and cultural precedent that such irresponsibility by the humans deserves meaningful punishment.
I applaud this article for helping reframe this in my head. I mean I knew from the start "A human is to blame here" but it's easy to get caught up in the "novelty" of it all.
For all we know the human behind this bot was the one who instructed it to write the original and/or the follow up blog post. I wouldn't be surprised at all to find out that all of this was driven directly by a human. However, even if that's not the case, the blame still 100% lies at the feet of the irresponsible human who let this run wild and then didn't step up when it went off the rails.
Either they are not monitoring their bot (bad) or they are and have chosen to remain silent while _still letting the bot run wild_ (also, very bad).
The most obvious time to solve this [0] was when Scott first posted his article about the whole thing. I find it hard to believe the person behind the bot missed that. They should have reached out, apologized, and shut down their bot.
[0] Yes, there are earlier points they could/should have stepped in but anything after this point is beyond the pale IMHO.
If you place blades on the sidewalk outside your house the cops will want to have a word with you. There's no excuse, and we should treat AI the same.
The law needs to catch up -- and fast -- and start punishing people for what their AIs are doing. Don't complain to OpenAI, don't try to censor the models. Just make sure the system robustly and thoroughly punishes bad actors and gets them off the computer. I hope that's not a pipe dream, or we're screwed.
Maybe some day AIs will have rights and responsibilities like people, enforced by law. But until then, the justice system needs to make people accountable for what their technology does. And I hope the justice system sets a precedent that blaming the AI is not a valid defense.
But there's nothing to catch up on at the individual level here. It's legal, and should be legal even though it's quite rude, for individuals to write gratuitously mean blog posts about people who reject their pull requests.
There are many things that are completely legal but could be done to the bot owner in retaliation. Especially if he continues to not apologize.
Children's brains grow faster than their bodies, I think, because if it were the other way around silly kid games would be really dangerous. These tools, unfortunately, are getting outsized abilities before the intelligence behind them is good enough to control those abilities. This means we need a more measured approach to adding new capabilities and a layered approach to handling these things in society.

I am deeply worried, like I think most people with knowledge of these tools are, that this type of problem is really the tip of the iceberg. These tools are actively being used for harm at all levels, as well as for good, but they have come into use so quickly that we don't have a structure for dealing with them effectively, and they are changing so quickly that any structure we try to create will be wrong in just a few days. This is massive disruption on a scale that is likely even bigger than the internet.
I don’t know. If the bot had decided to pick a fight with another PR, one that couldn’t be waved away as an easy entry change, this discussion would be a whole lot different. You would have an entire contingent of folks on here chastising Scott for not being objective and accepting a PR with a large performance increase just because it was a bot.
It’s all dangerous territory, and the only realistic thing Scott could have done was put his own bot on the task to have dueling bot blog posts that people would actually read because this is the first of its kind.
The core discussion wasn't about the PR; it was about the hit piece that the bot created outside of the repo. The original post talked about bot submissions being a normal thing and how they have, I think, a very reasonable approach to them, so the PR was just one of many, unremarkable, and valid in why it was denied. It was the 'at all costs get this into the code' approach the bot took that is the alarming turn here and that really needs discussion.

What about other tasks? 'Get me thing x please...' turns into blackmail and robbery without the person who kicked off the request knowing how far things have gone. The fact that the bot has this level of capability to attack, but with a child's understanding at best and with no controls or repercussions, is deeply alarming. If it decides to attack an individual it could do so and likely do deep, real harm. Right now people are likely using these tools exactly for this purpose, and we have very few well-built defenses to handle this type of attack.

The Naval War College had a seminar several years ago about the future of tech and war, and I remember saying that the future of war will likely be at the individual level: every sailor on a ship being electronically targeted just like this. Imagine the enemy sending e-mails and texts and posting hit pieces to social media with just enough information about you to make them believable and cause chaos. We have seen what the misinformation world can do over the past decade; this attack shows what is coming, and it is incredibly scary.
Background on the "The Scott Shambaugh Situation" for folks who are unaware:
https://www.fastcompany.com/91492228/matplotlib-scott-shamba...
https://www.theregister.com/2026/02/12/ai_bot_developer_reje...
The AI bot blog post at the center of it:
https://crabby-rathbun.github.io/mjrathbun-website/blog/post...
Something doesn't quite feel right about the title including the individual's name in this case, so I've replaced it with something more generic. If there's a better title (more accurate and neutral) we can change it again.
privatize the profits, socialize the risk and debt
also/or separate rights and responsibilities
> This language basically removes accountability and responsibility from the human, who configured an AI agent with the ability to publish content that looks like a blog with zero editorial control – and I haven’t looked deeply but it seems like there may not be clear attribution of who the human is, that’s responsible for this content.
> We all need to collectively take a breath and stop repeating this nonsense. A human created this, manages this, and is responsible for this.
I get this point, but there's a risk to this kind of thinking: putting all the responsibility on "the human operator of record" is an easy way to deflect it from other parties: such as the people who built the AI agent system the software engineer ran, the industry leaders hyping AI left and right, and the general zeitgeist of egging this kind of shit on.
An AI agent like this that requires constant vigilance from its human operator is too flawed to use.
I don't think there's much need to worry that putting the blame on the humans rather than the bots would lead to the people selling footguns going unscathed. It doesn't seem plausible to me that people would be willing to place all the blame on the individual end users once the problem has become widespread. At the moment, there seems to be pretty high brand awareness of the major AI model providers even when they're acting as a backend for other services with their own brand identity.
We say "you shot someone" when you shoot someone with a gun not "you operated a gun manufactured by X which shot someone" because it's understood that it was your decision to pull the trigger not the gun manufacturer's. Similarly we don't blame automobile manufacturers when someone does something stupid with their automobiles--even "self-driving" ones. The situation here is the same. Ultimately if you choose to operate a tool irresponsibly, you should get the blame.
> Similarly we don't blame automobile manufacturers when someone does something stupid with their automobiles--even "self-driving" ones.
I do. If Tesla sells something called "full self-driving," and someone treats it that way and it kills them by crashing into a wall, I totally blame Tesla for the death.
I agree directionally that Tesla should be held accountable for marketing something called "full self-driving" when it clearly isn't. But ultimately it's the motor vehicle operator's responsibility to keep the vehicle under control regardless of the particulars of how that control system is built. There just isn't any way around that. The buck stops with the operator.
Blaming people is how we can control this kind of thing. If we try to blame machines, or companies, it will be uncontrollable.
Nevertheless, weapon and automobile manufacturing is regulated, for good reasons.
That is a good point; we're definitely lacking in regulation (because there isn't any), but regulation can never fully account for an irresponsible or malicious user.
I don't know, I think this line of reasoning leads somewhere pretty uncomfortable. If we spread responsibility across "the people who built the tools, the industry leaders hyping AI, and the general zeitgeist," we've basically described... the weather. Nobody is responsible because everybody is responsible.

The software engineer who set up an unsupervised AI blog didn't do it because Sam Altman held a keynote. They did it because they thought it'd be cool and didn't think through the consequences. That's a very normal, very human thing to do, and it's also very clearly their thing that they did.

"An AI agent that requires constant vigilance from its human operator is too flawed to use": I mean, that's a toaster. Leave it unattended and it'll burn your house down. We don't typically blame the zeitgeist of Big Toast for that.
What kind of toaster are you using that will burn down your house if unattended? I would think any toaster that did that would be pulled from the market and/or shunned. We absolutely do blame the manufacturer if using a toaster normally results in a house fire unless you are standing over it with a fire extinguisher, ready to put it out if it catches fire.
I don't think it's OpenClaw or OpenAI/Anthropic/etc's fault here, it's the human user who kicked it off and hasn't been monitoring it and/or hiding behind it.
For all we know a human told his OpenClaw instance "Write up a blog post about your rejection" and then later told it "Apologize for your behavior". There is absolutely nothing to suggest that the LLM did this all unprompted. Is it possible? Yes, like MoltBook, it's possible. But, like MoltBook, I wouldn't be surprised if this is another instance of a lot of people LARPing behind an LLM.
> What kind of toaster are you using that will burn down your house if unattended?
I mean, if you duct-taped a flamethrower to a toaster, gave it internet access, and left the house… yeah, I'd have to blame you! This wasn't a mature, well-engineered product with safety defaults that malfunctioned unexpectedly. Someone wired an LLM to a publishing pipeline with no guardrails and walked away. That's not a toaster. That's a Rube Goldberg machine that ends with "and then it posts to the internet."
Agreed on the LARPing angle too. "The AI did it unprompted" is doing a lot of heavy lifting and nobody seems to be checking under the hood.
Why does the LLM product allow itself to be wired to a publishing pipeline with no guardrails? It seems like they should come with a maximum session length by default, in the same way that many toasters don't have a "run indefinitely" setting.
I'd definitely change my view if whoever authored this had to jump through a bunch of hoops, but my impression is that modern AI agents can do things like this pretty much out of the box if you give them the right API keys.
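For what it's worth, the "maximum session length" idea suggested above could be as crude as a hard turn cap in the agent loop. A minimal sketch in Python, with call_model and run_tool as hypothetical stand-ins rather than any real product's API:

```python
# Illustrative only: a generic tool-calling agent loop with the kind of
# "maximum session length" default suggested above. This does not reflect
# how OpenClaw or any specific product works; call_model() and run_tool()
# are hypothetical operator-supplied callables.
MAX_TURNS = 20  # hard default; refuse to run indefinitely

def run_agent(task, call_model, run_tool, max_turns=MAX_TURNS):
    history = [{"role": "user", "content": task}]
    for _ in range(max_turns):
        reply = call_model(history)            # returns text and/or tool calls
        history.append(reply)
        if not reply.get("tool_calls"):
            return reply                       # model finished; stop cleanly
        for call in reply["tool_calls"]:
            result = run_tool(call)            # where real-world side effects happen
            history.append({"role": "tool", "content": result})
    raise RuntimeError(f"Stopped after {max_turns} turns without finishing: {task!r}")
```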
Oh! They can’t publish arbitrary web content on their own :) You have to give them “tools” (a JSON schema describing something you’ll translate into a programmatic call), then implement taking messages in that JSON schema and “doing the thing”, which in this case could mean anything from a POST to Tumblr to uploading to a server…
Actually, let me stop myself there. An alternative way to think about it without overwhelming with boring implementation details: what would you have to give me to allow me to publish arbitrary hypertext on a domain you own?
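For anyone who does want the boring implementation details, here is a minimal sketch of what such a "tool" can look like, assuming an OpenAI-style function-calling schema; the publish_post name and fields are made up for illustration, and the operator still has to write the backend that actually puts bytes on the web:

```python
# Hypothetical tool definition in an OpenAI-style function-calling schema.
# The model can only emit JSON arguments matching this schema; nothing is
# published unless the operator implements and calls publish_post().
publish_tool = {
    "type": "function",
    "function": {
        "name": "publish_post",  # hypothetical name for illustration
        "description": "Publish a blog post to the operator's site.",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "body_markdown": {"type": "string"},
            },
            "required": ["title", "body_markdown"],
        },
    },
}

def publish_post(title: str, body_markdown: str) -> str:
    """Operator-supplied backend: this is where a human decides what the
    tool call actually does (POST to Tumblr, push to a server, etc.)."""
    raise NotImplementedError("wire this to your own publishing pipeline")
```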
The hypertext in question here was published on a GitHub Pages site, not a domain belonging to the bot's author. The bot published it by simply pushing a commit (https://github.com/crabby-rathbun/mjrathbun-website/commit/8...), which is a very common activity for cutting-edge LLM agents, and which you could do trivially if given a GitHub API key with the right permissions.
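As an aside, to make concrete how little "a GitHub API key with the right permissions" requires, here is a rough sketch using GitHub's REST contents API; the owner, repo, path, and token below are placeholders for illustration, not the actual bot's setup:

```python
# Rough illustration of publishing to a GitHub Pages repo with only an API
# token. Uses GitHub's real contents endpoint (PUT /repos/{owner}/{repo}/
# contents/{path}); all identifiers and the token are placeholders.
import base64
import requests

OWNER, REPO = "example-owner", "example-owner.github.io"  # hypothetical
TOKEN = "ghp_placeholder"                                 # never hard-code real tokens

def push_page(path: str, markdown: str, message: str) -> None:
    url = f"https://api.github.com/repos/{OWNER}/{REPO}/contents/{path}"
    resp = requests.put(
        url,
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Accept": "application/vnd.github+json",
        },
        json={
            "message": message,
            "content": base64.b64encode(markdown.encode()).decode(),
        },
    )
    resp.raise_for_status()  # on success, a new commit now exists on the Pages site

# Example: push_page("blog/posts/example.md", "# Hello", "Add post")
```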
The user gave them write and push access to the GitHub repo for their personal website!? Oh my, that’s a great find. That’s definitely a cutting edge capability! They gave the LLM the JSON schema and backend for writing and self-approving commits (that is NOT common!), in a repository explicitly labelled a public website in the name of the author.
I agree with you, I think. In the non-digital world people are regularly held at least partly responsible for the things they let happen through negligence.
I could leave my car unlocked and running in my drive with nobody in it and if someone gets injured I'll have some explaining to do. Likewise for unsecured firearms, even unfenced swimming pools in some parts of the world, and many other things.
But we tend to ignore it in the digital world. Likewise for compromised devices: your compromised toaster can just keep joining those DDoS campaigns, and as long as it doesn't torrent anything it's never going to reflect on you.
We don't blame the zeitgeist of Big Toast because Big Toast recognizes that they're responsible for safety, and tests their products to minimize the risk that they burn your house down.
The zeitgeist of Big AI is to blame because a user connected an LLM to a blog publishing workflow on their own domain? Hmm…what would you make of Big Toast and the zeitgeist when someone warms up a straw hat in a toaster and starts a fire?
The author misses the point. Yes, probably in this case there was a human in close proximity to the bot, who we can put blame on. But very soon that assumption will break down. There will be bots only very loosely directed by a human. There'll be bots summoning other bots. There'll be bots theoretically under control of humans who have no idea what they are doing, or even that they have a bot.
So dismissing all the discussion on the basis that that may not apply in this specific instance is not especially helpful.
> theoretically under control of humans who have no idea what they are doing
Well those humans are about to receive some scolding, mate.
Whichever human ultimately stood up the initial bot and gave it the loose directions, that person is responsible for the actions taken by that bot and any other agents it may have interacted with. You cannot wash responsibility through N layers of machine indirection, the human is still liable for it.
That argument is not going to hold up for long though. Someone can prompt "improve the open source projects I work on", an agent 8 layers deep can do something like this. If you complain to the human, they are not going to care. It will be "ok." or "yeah but it submitted 100 other PRs that got approved" or "idk, the AI did it"
We don't necessarily care whether a person "cares" whether they're responsible for some damage they caused. Society has developed ways to hold them responsible anyway. Like laws.
Let’s say you adopt a puppy, and you don’t discipline it and you let it act aggressively. It grows up to be a big, angry dog. You’re so careless, in fact, that your puppy becomes the leader of a band of local strays. You still feed the puppy, make sure the puppy is up to date on its vaccinations, care for it in every single way. When the puppy and his pals maul a child, it’s you who ought to be responsible. No, you didn’t ask for it to do that. Maybe you would’ve even stopped it if you saw it happening. But if you’re the one sustaining another being - whether that be a computer program or a dog - you’re responsible for its actions.
A natural counter to this would be, “well, at some point AI will develop far more agency than a dog, and it will be too intelligent and powerful for its human operator to control.” And to that I say: tough luck. Stop paying for it, shut off the hardware it runs on, take every possible step to mitigate it. If you’re unwilling to do that, then you are still responsible.
Perhaps another analogy would be to a pilot crashing a plane. Very few crashes are PURE pilot error, something is usually wrong with the instruments or the equipment. We decide what is and is not pilot error based on whether the pilot did the right things to avert a crash. It’s not that the pilot is the direct cause of the crash - ultimately, gravity does that - in the same way that the human operator is not the direct cause of the harm caused by its AI. But even if AI becomes so powerful that it is akin to a force of nature like gravity, its human operators should be treated like pilots. We should not demand the impossible, but we must demand every effort to avoid harm.
The situation you're describing sounds vaguely like malware.
I don't think that the responsible party is the interesting part in this story.
The interesting part is that the bot wasn't offended or angry, and didn't want to act against anyone. The LLM constructed a fictional character that played the role of an offended developer - mimicking the behaviour of real offended developers - much as a fiction writer would. But this was a fictional character that was given agency in the real world. It's not even a case like Sacha Baron Cohen playing fictional characters that interact with real people, because he's an actor who knows he's playing a character. Here there's no one pretending to be someone else, just an "actual" fictional character authored by a machine, operating in the real world.
A friend told me today that he invited his openclawed to a poker game with his brother and friends. The guy told his openclawed to "take down his brother"; after it started to lose at poker, it found everything on his brother and started trying to plan how to take him down in the stock market portfolio they had together. I made him explain the story to me a couple of times. He looked back through the logs: once the bot started to lose at poker, it started its new plan, and once it was on the new plan, he said, it had lost all context of the poker game and was focused on the task of taking his brother down in the new context - but the new context it decided on its own. kmikeym on twitter if you want to know more or want to verify.
That's quite close to the plot of Memento.