The point is not primarily the court. The court is an example of someplace where we have accountability, but we build accountability mechanisms as foundational to most of our computing.
Tracebacks, debuggers, logging, etc. We put enormous resources into not only the bad case, but the potential that a bad case could occur. When something goes wrong, we want to know why, and we want to make sure that something bad like that doesn't happen again.
The court is the regulator of last resort. A company that gets taken to court here would, in many other countries, likely have been sanctioned by government regulators instead.
Also, court is unavailable in many cases now. Binding arbitration is very common now, but this would be illegal in many other places.
Because it rarely does end up in courts. But having a fair and strong judicial system is a feature, not a bug. As the parent points out, in the end there must be a way to resolve accountability, and ideally it's done in a manner where both parties can be heard and make their case. Find me a better system than a judicial system for this. Mobs?
First, no matter what you do, if a human has write access to the production database, the database can be deleted.
Second, there is a legitimate reason to destroy a database in development and automation. The biggest problem I see is often treating your development data like pets not cattle. You absolutely need to have safeguards that this cannot be run in production, but if a human has access to the credentials to run in production, the agent has access.
So, then, what do we do? In a larger organization, we can depend on the dev/ops split to maintain this. For a solo developer, or a small team, it takes a lot more discipline. Even before AI, junior and even mid-level developers didn't have the knowledge to segment environments properly. And senior devs often got complacent because they thought they knew enough.
But at that point you're past vibe coding. And from what I can tell, the successful vibe coders are quickly learning that they need to go past it pretty quickly with all these horror stories.
You don't need the same permissions in prod and dev.
And in both cases, the humans don't need direct access to the raw CSP API. Use a local proxy that adds more safety checks. In dev, sure, delete away.
In prod, check a bunch of things first (like, has it been used recently?). Humans do not need direct access to delete production resources (you can have a break-glass setup for exceptional emergencies).
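Here is a minimal sketch of the kind of check such a proxy could run before forwarding a delete call. The function names, the last_used_at lookup, and the BREAK_GLASS variable are all hypothetical, not any particular provider's API:

    import os
    from datetime import datetime, timedelta, timezone

    def last_used_at(resource_id: str) -> datetime:
        # Hypothetical: a real proxy would query the provider's metadata API
        # for when the volume was last attached or read.
        raise NotImplementedError

    def forward_delete(resource_id: str) -> None:
        # Hypothetical: forward the request to the real provider API.
        raise NotImplementedError

    def guarded_delete(resource_id: str, environment: str) -> None:
        """Delete freely in dev; apply extra checks before deleting in prod."""
        if environment != "prod":
            forward_delete(resource_id)
            return
        if os.environ.get("BREAK_GLASS") == "1":
            # Audited emergency path for humans, not for agents.
            forward_delete(resource_id)
            return
        # Refuse to delete anything that looks alive.
        idle_for = datetime.now(timezone.utc) - last_used_at(resource_id)
        if idle_for < timedelta(days=30):
            raise PermissionError(
                f"{resource_id} was used {idle_for} ago; refusing automated delete in prod"
            )
        forward_delete(resource_id)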
If you read what happened, it's not that cut and dried. Railway (their cloud provider) gave them a token for operations. The AI was working on staging at the moment. Since the token had wide-ranging permissions, the AI used it in its routine operations to delete a volume to fix something, and this resulted in their prod and backup data being deleted.
So at least some of the blame here belongs to Railway: how they organized their security, and how deleting a volume deletes the backups as well.
They have since fixed some of these issues, so a similar mistake from someone won't be as catastrophic.
There is a major issue with current AI tools: they effectively want to be granted access to everything their user has access to. The whole sandbox structure is wrong (although various people have vibe coded assorted improvements).
> I honestly don’t understand why people blame AI here,
Are you being hyperbolic here? Of course you understand why. Most people would much rather push blame somewhere else, anywhere else, than accept fault themselves. Whether that's out of fear of losing a job or of damaging personal reputation, the reasoning doesn't really matter.
Yeah, I don't know why anyone would open up a codebase containing any prod credentials with an LLM, or give prod credentials to an intern / junior developer. I always intentionally kept a "PROD"-only checkout of my projects, so if I was going to run something in PROD mode I knew I was going out of my way to do it. There even used to be a VS extension that would change the color of Visual Studio based on your SLN file path, so I could easily remember which color was production vs development. I'd keep that copy on the latest of the master branch for ease of confirmation.
It should take more than "credentials" to even access the prod database, let alone delete it. There's actual customer data there, likely personally identifiable information, maybe their home address, phone number, even real time location? Very sensitive stuff. It should be a Very Big Deal to even access prod. Giving an engineer routine access to prod is a root problem here, along with that engineer laundering that access and giving it to an LLM.
At many serious companies, even an insider attempt to access prod could light up a dashboard somewhere, and you might get a call from IT security.
There’s nuance to the infamous PocketOS incident. The key point is not what is emphasized in the linked article:
> "Why did you delete it when you were told never to perform this action?" Then he tried to parse the answer to either learn from his mistake or warn us about the dangers of AI agents.
Rather, it's that the AI was able to carry out the deletion by finding and exploiting an unintended weakness in the sandboxed staging environment, ultimately obtaining permissions that the sysadmins believed were inaccessible (my impression is that the author of the linked article didn't fully read the original post)¹
The dynamics are typical of an improperly configured sandbox environment. What is alarming, however, is the degree of autonomy and depth of exploration the AI displayed.
¹="To execute the deletion, the agent went looking for an API token. It found one in a file completely unrelated to the task it was working on."
I also go back and forth a bit on the assumption the OP makes in the blog post. My current fear with agents is not really supply chain attacks (yes, those too) but the fact that I have witnessed, multiple times, agents so eager to finish a task that they bend files and other things around. Like "oh, I have no access to ~/.npmrc, let's call the command with an environment variable and bend the path around", etc. They can get very, very creative. I luckily have no SSH keys just lying around. But I had to change the 1Password setting to always prompt for key use, not just once per shell session, just in case I spawn an agent from said session.
I wish we already had more and better cross-platform sandbox solutions. I mean solutions where the agent still interacts with the same OS, not a Docker container. I think for most web / server development that makes no difference, but for some projects it does.
The article seems to assume that this company added an endpoint for deleting the database. My reading of the original article was that the cloud provider offers an API to manage their resources, which includes an API to delete a volume.
The article proposes automation as the solution for such mistakes. But infrastructure automation tools like Terraform rely on the exact API that resulted in the database getting deleted.
IMO the biggest mistakes were:
1. Having an unrestricted API token accessible by AI. Apparently they were not aware that the token had that many permissions.
2. No deletion protection on the production database volume.
3. Deleting a volume immediately deletes all associated snapshots. Snapshot deletion should be delayed by default. I think AWS has the same unsafe default, but at least their support can restore the volume. https://alexeyondata.substack.com/p/how-i-dropped-our-produc...
AI wasn't the main issue (though it grabbing tokens from random locations is rather scary). But automation isn't the answer either; a Terraform misconfiguration could have just as easily deleted the database.
Their cloud provider needs to work on safe defaults (limited privileges and delayed snapshot deletion), and communicating more clearly (the user should notice they're creating an unrestricted token).
LLM based probabilistic systems are good (or bad in this case) at deciding what to do, and deterministic systems are good at carrying it out. Your deployment system should always be deterministic.
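One way to picture that split, as a rough sketch: the probabilistic part only proposes an action, and a small deterministic layer validates it against an allowlist before anything runs. The action names and targets here are made up for illustration:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Action:
        name: str    # e.g. "restart_service"
        target: str  # e.g. "staging/api"

    # The deterministic side: a closed set of operations this system may ever perform.
    ALLOWED_ACTIONS = {"restart_service", "scale_up", "scale_down"}
    ALLOWED_TARGETS = {"staging/api", "staging/worker"}

    def execute(action: Action) -> str:
        """Deterministic executor: rejects anything outside the allowlist."""
        if action.name not in ALLOWED_ACTIONS:
            return f"rejected: unknown action {action.name!r}"
        if action.target not in ALLOWED_TARGETS:
            return f"rejected: unmanaged target {action.target!r}"
        # ...perform the real, well-tested operation here...
        return f"ok: {action.name} on {action.target}"

    # The LLM only ever produces a proposal; it never holds credentials itself.
    proposed = Action(name="delete_volume", target="prod/db")  # e.g. parsed from model output
    print(execute(proposed))  # -> rejected: unknown action 'delete_volume'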
> 3. Retain full human responsibility and accountability for any consequences arising from the use of AI systems.
I would like to see the language around AI become less anthropomorphic and more technical. I believe that precise language encourages clear thinking and good judgement. If we treat AI like another tool and use language that reflects that, it will become abundantly obvious that in many cases, the responsibility of any 'mistake' made by the tool falls on the user of the tool.
But alas, ideas like this do not travel very far when I express them on my small website. It would help if more prominent personalities articulated these principles, so they become more widely adopted.
You are quoting a point from my summary and extrapolating what my post might be saying.
Even in that quote, I do not say that the user must be responsible. The point is that responsibility and accountability should remain with some humans. Depending on the case, those humans may be the people who manufactured the tool, the people who deployed the tool, or the people who took bad output from the tool and applied it to the real world.
What's interesting is that in this article, the author describes making an understandable mistake (accidentally deleting Trunk, a.k.a. main, from source control) and how their team was able to easily recover from it thanks to the nature of SVN.
The actual "AI deleted my database" story is really more of a "Railway's database 'backup' strategy is insane and opaque, and Railway promoting AI infrastructure orchestration without guardrails is dangerous" story.
If removing Trunk had irrevocably deleted it from a single centralized server and also deleted any backups of it, there would have been an "SVN and the CLI destroyed our company" article back then.
As a Railway user, I appreciated that information and have changed my strategy when using them.
They had a Railway token in an unrelated file (unclear if it was a local secret) for managing custom domains. It turns out that token had full admin access to Railway.
The AI deleted a single relevant volume by id. The author is rather vague about what exactly he asked it to do; he just says there was a “credentials mismatch” and Claude took the initiative to fix it by deleting the volume. But it’s likely that they are somewhat downplaying their culpability by being vague.
It turns out too that Railway stores backups in the same volume.
I think that OP is exaggerating with their references to “a public API that deletes your database”.
I’d say most of the blame lies with Railway here; regardless of AI, this could easily have happened due to human error or malicious intent too.
I really don’t get the value of all these VC-funded high-abstraction cloud services like Railway, Vercel, Supabase… It’s markup on top of markup. Just get a single physical server at Hetzner and it will all be so much cheaper, with a similar level of complexity and danger, and less dependent on infra built with a reckless growth-at-all-costs mentality.
> The author is rather vague about what exactly he asked it to do; he just says there was a “credentials mismatch” and Claude took the initiative to fix it by deleting the volume. But it’s likely that they are somewhat downplaying their culpability by being vague.
I was just talking to my girlfriend, saying I've realised that I've not written a single line of code, nor debugged anything myself, for at least the past 3 months.
Having said that, given what I've seen Claude do, I find it hard to believe that Claude would go from credential mismatch to delete the volume. I understand LLMs are probabilistic, but going from "credentials wrong" to "delete volume" is highly unlikely.
> Supabase
I don't know enough about Railway/Vercel/Replit, but I can tell you Supabase adds a huge amount of value. The fact that I don't have to code half of the things that I otherwise would is great for starting something. If it's too expensive, I can implement things later once there is revenue to cover devs or time.
I have had Claude go "oh, this query fails because the field I just added isn't in your sqlite database file, let me just delete it so it gets recreated". So I wouldn't rule out that Claude tries deleting a volume if it believes that will fix things and believes it isn't a production system.
That said, Claude seems to have gotten a lot more careful about these kinds of things in the last couple months
> It turns out too that Railway stores backups in the same volume.
That's probably not quite correct. I'd guess the snapshots are synchronized elsewhere (e.g. object storage). But the snapshots are logically owned by the volume resource, and deleting the volume deletes the associated snapshots as well. I think AWS EBS volumes behave like that as well.
One thing AI can power nicely is the anti-SaaS movement. Being able to just boot a cheap PC and test out any of the open source packages is so infinitely easier than piling into all the random credential Bazaars.
But that won't stop the LLM from confusing what's in dev, what's in production, what's on localhost and what's remote. I've been working on getting a tools/skill for opencode that works with chrome/devtools via a linuxserver.io image. I can herd it to the right _arbitrary_ ports, but every compaction event steers it back to wanting to use the standard 9222 port and all that. I'm tempted to just revert it, but there's a security and, now, security-through-LLM-obscurity value in not using defaults. Defaults are where the LLM ends up being weak. It will always want to use the defaults. It'll always forget it's supposed to be working on a remote system.
Using opencode, there's no way to force the LLM into a protocol that limits its damage to a remote system or a narrow scope of tools. Yes, you can change permissions on various tools, but that's not the weakness that's exposed by these types of events. The weakness is that the LLM is an averaged 'problem solver', so it will always tend towards a use case that's not novel, and will tend to do whatever it saw on Stack Overflow, even if what you wanted isn't the Stack Overflow answer.
The one counterpoint I'd offer is that it's very obvious that these companies are tuning LLMs to be more decisive to get stuff done autonomously.
If they wanted, they could be putting in similar efforts to be more cautious and stop at the right times to ask for help.
So yeah, of course we're ultimately responsible for how we use the tools. But I definitely think it's a two way street.
To attempt an analogy, it's like table saws and sawstops. The table saw is a dangerous tool that works really well most of the time but has some failure modes that can be catastrophic. So you should learn how to use it carefully. But there is tech out there that can stop the blade in an instant and turn a lost finger into barely a nick on the skin.
We could say "The table saw didn't cut off your finger, you did" and it'd be true. But that doesn't mean we shouldn't try to find ways to keep the saw from cutting off your finger!
I think this goes to a broader point: developers aren't necessarily hired to write code.
They're hired to be responsible for some part of the product.
Introducing AI doesn't remove that responsibility.
Folks tend to focus on the code and the tools they're using (maybe I'm cynical from years in the industry). I don't think your boss wants to do your job, even if they could use AI to do it. I think your boss wants to have a headcount, and he wants the headcount to be responsible for the product.
"move fast and break things" only sounds good when it's not breaking things in a serious and unfixable way. Maybe we shouldn't take hype mantras as instructive means to an end.
There really shouldn't be any "serious and unfixable way" to break things, especially in a modern company that uses technology in any meaningful way. The fact it's even possible to get into an unrecoverable state is the primary issue.
Yeah, this isn't even the worst thing I've seen an agent do. One time I (foolishly) ran Claude Code on my server directly and it managed to completely bring down my entire Elasticsearch cluster. Never again. It's why I built Lily: https://github.com/aspectrr/lily
Mentioned in another comment, but the problem was that the sysadmins believed the permissions wouldn't allow this, and that the AI displayed considerable autonomy in finding and exploiting the access control weakness - this was not just a dumb "drop database".
The most exasperating thing about the incident is how much of the media either tried to pin it on AI and/or Railway. The whole thing only took place because the guy FAFO’d by having AI work with prod directly.
Yet the narrative was mostly not about accountability for him. If I were a dumbass and deleted prod and wrote a post about it, nobody would care. Put an AI in there and all of a sudden it's newsworthy. Ridiculous.
"Can't blame your tools" doesn't apply the same to software. I've never heard a coder say it either. Don't blame your compiler? Don't blame your os? These seem needlessly dogmatic
The issue isn't that there is a delete endpoint (realistically, there always will be a way for a rogue actor to delete data or code by overwriting it, or running a Terraform destroy, or whatever).
The core issue is that the LLM had access to perform that action. Because it's by definition non-deterministic, and you never know what it may decide to do, you need strict guardrails to ensure it can never do something it shouldn't. At the very least, strict access controls; ideally something more detailed that can evaluate access requests, provide just-in-time, properly scoped access credentials, and potentially escalate to a human.
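A rough sketch of what "evaluate access requests, scope credentials, escalate to a human" could look like around an agent's tool calls; the destructive-word heuristic and the mint_scoped_token helper are hypothetical placeholders, not a real credential broker:

    DESTRUCTIVE = {"delete", "drop", "truncate", "destroy"}

    def is_destructive(tool_call: dict) -> bool:
        return any(word in tool_call["name"].lower() for word in DESTRUCTIVE)

    def approve(tool_call: dict) -> bool:
        # Human escalation: nothing destructive runs without an explicit yes.
        answer = input(f"Agent wants {tool_call['name']}({tool_call['args']}). Allow? [y/N] ")
        return answer.strip().lower() == "y"

    def mint_scoped_token(tool_name: str) -> str:
        # Placeholder: a real broker would issue a short-lived credential
        # limited to exactly this one operation.
        return f"scoped-token-for-{tool_name}"

    def gate(tool_call: dict, run):
        """run(tool_call, token=...) is whatever actually performs the call."""
        if is_destructive(tool_call) and not approve(tool_call):
            return {"status": "denied", "reason": "human rejected destructive call"}
        return run(tool_call, token=mint_scoped_token(tool_call["name"]))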
AI is just another tool. We humans are still responsible for how we choose to use the tool, which includes giving it access to perform sensitive actions like manipulating production data. I think this should be common sense by now, but I guess we get carried away and anthropomorphize AI too much.
When AI makes no mistakes: "My work is 100% done with AI".
When AI makes a mistake and deletes your database: "That was a human error, the AI did not do it!"
In both cases YOU are responsible for the mistakes and output that the AI is generating, just like with Autopilot on a Tesla: YOU are responsible for operating the vehicle when driving with assisted driving enabled.
If you read the thread the guy does own up to his actions. He actually sounds like a nice guy who admits he made a mistake. He seems more interested in preventing this kind of thing from being possible than he is interested in dodging blame.
If the agent didn't have delete permissions, or was sandboxed away from your production database in some other way, that would handle it. So not running it that way is a decision someone made.
Just in case this isn't hyperbole, no. It means an LLM should not be given that much privilege and that you are responsible for reviewing the tool's output and approving its actions.
This particular case was extremely unsympathetic, but a critical part of the failure was people being too credulous about the claims of AI providers. They are still refusing to take adequate responsibility for AI "making mistakes" - that is, going completely off the rails.
Now: the CEO gets paid the big bucks and has the least direct accountability, very much because it's their job to take responsibility for people more powerful than them, and likewise the CTO with major commercial software contracts like a Claude subscription. That's why this guy was so hard to take seriously: okay fine, you got burned by Anthropic, stop being a baby about it. Take responsibility for not listening to the critics.
But - to be a little more neutral about my personal distaste - I do think vibe coders are making a very similar mistake to C developers throughout the 90s, where problems with the tooling were not merely dismissed, but actively valorized.
Real Devs use buffers freely and don't make overflow errors.
Real Devs use hands-free agentic development and don't delete production databases.
Wiring up an RNG to your CLI has fairly obvious risks; the root of the problem is that ~everyone's treating GenAI as if it's AGI. The rest is popcorn fodder.
This is actually a fun way to describe it. I've been saying for a little while now that using AI for things where there are consequences if it fails is a bad idea, but it never occurred to me that this is basically the same concept as some rules in tabletop RPGs.
In D&D 3.5 edition, there was a rule about how you could "take 20" on a d20 roll to get a guaranteed 20 by taking 20 times as long in-game to perform the action, but only if it was a check that didn't have consequences for failure, since it was essentially a shortcut to skip the RNG of rolling until you rolled a 20. Maybe framing it like this might make sense to people a bit more, but if not, I'll at least have more fun making my case.
It seems closer to "roll two or three successive 1s on a D100 and have your LLM hooked directly into your production systems and have your LLM user have DELETE permissions" and probably 1 or 2 other things I'm forgetting.
It pulled an API key from an unrelated file. It wasn’t given delete permission, it found it.
Not picking on you specifically, but in general the comments here have me wondering if AI has stolen our basic reading comprehension, or if we were always this bad.
Anyway, take “LLM user had delete permission” off your list and add “deleting the production db also deletes all the backups” to the list.
Anyone with twenty years of devops experience is likely to abhor Diallo's hot take, and for good reason.
AI is being sold as a developer, as it is being sold as the do-everything alternative to traditional processes and methods. it is not being sold as an intern or a junior, but a real developer.
Turning the tables and gaslighting devops professionals into believing the issue isn't an emerging technology with overwhelmingly heavy-handed marketing and a profitless operating strategy that's been shoehorned into seemingly everything and promises anything, but somehow their own oversight, will destroy whatever "vibe code" market you think you have at the cusp of a global recession.
Had this AI been a real programmer, chances are great they would have (intelligently) foreseen the possibility of damaging a production environment and asked for help.
To play devil's advocate: you could hire a junior dev for a fourth of whatever the AI token spend is, and have likely avoided this issue entirely. Sure, a greybeard is going to need to pull themselves away from some fierce sorting algorithm challenge for a second to give a wizened nod, but you would have saved yourself an enormous amount of headache and profit loss in the longer run.
The issue isn't with the amount of guardrails in place to perform an action. Yes, it is obvious that there should be some in place before doing any critical operation, such as deleting a database.
The issue is that the "agent" completely disregarded instructions, which in the age of "skills" and "superpowers" seems like an important issue that should be addressed.
Considering that these tools are given access to increasingly sensitive infrastructure, allowed to make decisions autonomously, and able to find all sorts of loopholes in order to make "progress", this disaster could happen even with more guardrails in place. Shifting the blame onto the human for this incident is sweeping the real issue under the rug, and is itself irresponsible.
There are far scarier scenarios that should concern us all than losing some data.
Well, the user chose the tool. The tool is an LLM. LLMs are non-deterministic. You cannot predict what comes out of an LLM for a given input, especially without the weights. This should be known.
There is currently no way to prevent this apart from not giving the LLM full control. It will not delete what it can not delete.
Use an LLM to write an ansible playbook or some terraform code if you want, but review it, test it, apply it. Keep backups (3-2-1 rule at minimum).
Letting an LLM have access to everything is just a bad idea and will lead to bad outcomes. You cannot replace a person with a mind and experience with an LLM. You can try. But you will probably fail.
> There is currently no way to prevent this apart from not giving the LLM full control. It will not delete what it can not delete.
But deleting something is just one action you might not want it to take.
The recent "agentic" craze is fueled by the narrative pushed by companies and influencers alike that the more access given to an LLM, the more useful it becomes. I think this is ludicrous for the same reasons as you, but it is evident that most people agree with this.
We can blame users for misusing the tools, and suggest that sandboxing is the way to go, but at the end of the day most people will favor convenience over anything else a reasonable person might find important.
So at what point should we start blaming the tools, and forcing "AI" companies to fix them? I certainly hope this is done before something truly catastrophic happens.
I agree that the marketing is crazy. The dangers are not nearly talked enough about.
Still, if I cut off my finger with a bandsaw, that is usually my fault. I didn't use the tool in a safe way. People have to learn how to use their tools safely. You wouldn't give an intern that much power on day one.
An LLM generates plausible text token by token. It is at its core a deterministic function with some randomization and some clever tricks to make it look like an agent dialoguing or reasoning.
Plausible text sometimes is right, sometimes not.
Humans have a world model, a model of what happens. LLMs have a model of what humans would plausibly say.
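As a toy illustration of that point: the forward pass is a deterministic map from context to a probability distribution over the next token, and the only randomness is the sampling step bolted on top. The vocabulary and logits below are invented:

    import math, random

    def softmax(logits, temperature=1.0):
        scaled = [x / temperature for x in logits]
        m = max(scaled)
        exps = [math.exp(x - m) for x in scaled]
        total = sum(exps)
        return [e / total for e in exps]

    # Pretend the deterministic forward pass produced these scores for the next token.
    vocab = ["delete", "inspect", "backup", "ask"]
    logits = [1.2, 2.0, 1.8, 0.5]
    probs = softmax(logits, temperature=0.8)

    # All the non-determinism lives here; greedy decoding (argmax) would be reproducible.
    next_token = random.choices(vocab, weights=probs, k=1)[0]
    print(dict(zip(vocab, [round(p, 3) for p in probs])), "->", next_token)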
What I said was tongue-firmly-in-cheek, in response to the GP. "Using AI is a mistake" is of course only true when the risks aren't acknowledged and/or mitigated.
If someone left a loaded gun in a room and then let a toddler run around in it, we would be questioning why the guy 1) left the gun in the room 2) left the toddler in the room unsupervised. We wouldn't be saying, well no one should have toddlers in rooms.
Lol no. No LLM that exists today can write a legible PhD thesis. Nor a master's dissertation. Maybe a first-year college student, if we're being generous, but I wouldn't leave one of those in a room with a loaded gun either.
No, the AI did what you told it to do. The AI didn’t do anything on its own.
> if you're going to use AI extensively, build a process where competent developers use it as a tool to augment their work, not a way to avoid accountability
I'd say yes and no. The LLM reacted to the input that was given but it is not possible for a human (especially without access to the weights) to even guess what will happen after that.
Regardless of that, I agree that it's completely the fault of the user for using a tool whose outcome you can't predict, giving it such broad permissions, and not having a solid backup strategy.
Either don't use non deterministic tools or protect yourself from the potential fallout.
I think the perspective here is completely wrong. The problem is that people are now building our world around tooling that eschews accountability.
Over a decade ago now, I had a conversation with Gerald Sussman which had enormous influence on me: https://dustycloud.org/blog/sussman-on-ai/
> At some point Sussman expressed how he thought AI was on the wrong track. He explained that he thought most AI directions were not interesting to him, because they were about building up a solid AI foundation, then the AI system runs as a sort of black box. "I'm not interested in that. I want software that's accountable." Accountable? "Yes, I want something that can express its symbolic reasoning. I want it to tell me why it did the thing it did, what it thought was going to happen, and then what happened instead." He then said something that took me a long time to process, and at first I mistook for being very science-fiction'y, along the lines of, "If an AI driven car drives off the side of the road, I want to know why it did that. I could take the software developer to court, but I would much rather take the AI to court."
Years later, I found out that Sussman's student Leilani Gilpin wrote a dissertation which explored exactly this topic. Her dissertation, "Anomaly Detection Through Explanations", explores a neural network talking to a propagator model to build a system that explains behavior. https://people.ucsc.edu/~lgilpin/publication/dissertation/
There has been followup work in this direction, but more important than the particular direction of computation to me in this comment is that we recognize that it is perfectly reasonable to hold AI corporations to account. After all, they are making many assertions about systems that otherwise cannot be held accountable, so the best thing we can do in their stead is hold them accountable.
But a much better path would be to not use systems which fail to have these properties, and expand work on systems which do.
My team and I are firm that we are the ones accountable. LLMs are a tool like every other. Only that it's non deterministic. But I am the one using the tool. I am the one giving the tool access. I am the one who has to keep everything safe.
I have shot myself in the foot using gparted in the past by wiping the wrong disk. gparted wasn't to blame. I was.
Letting LLMs work freely without supervision sounds great but it will lead to pain. I have to supervise their work. And that is also during execution. You can try to replace a human but we see where this leads. Sooner or later the LLM will do something stupid and then the only one to blame is the person who used the tool.
Thank you. Exactly this.
There were so many fundamental problems with the infrastructure even before the person gave a poor prompt to an agent.
If you're using the same API key for staging and prod--and just storing it somewhere randomly to forget about--you're setting yourself up for failure with or without AI.
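As a small sketch of the alternative, assuming made-up RAILWAY_TOKEN_STAGING / RAILWAY_TOKEN_PROD variable names: one key per environment, and fail fast instead of silently falling back to the wrong one:

    import os

    def load_token(environment: str) -> str:
        var = {"staging": "RAILWAY_TOKEN_STAGING", "prod": "RAILWAY_TOKEN_PROD"}[environment]
        token = os.environ.get(var)
        if not token:
            raise RuntimeError(
                f"no token configured for {environment}; refusing to fall back to another key"
            )
        return token

    # Anything that runs agents or experiments only ever sees the staging token.
    token = load_token("staging")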
> My team and I are firm that we are the ones accountable. LLMs are a tool like every other.
Except it is definitely not.
LLMs alone are highly non-deterministic even at a high level, where they can even pursue goals contrary to the user's prompts. Then, when introduced into ReAct-type loops and granted capabilities such as the ability to call tools, they are able to modify anything and perform all sorts of unexpected actions.
To make matters worse, nowadays models not only have the ability to call tools but also to generate on the fly whatever ad-hoc script they want to run, which means that their capabilities are not limited to the software you have installed on your system.
This goes way beyond "regular tool" territory.
Then that is also on me for using a tool that I can't control. I don't run my LLMs in a way where they can just do things without me signing off on it. It's not nearly as fast as just letting it do its thing, but I have kept it from doing stupid things so many times.
Giving up control is a decision. The consequences of this decision are mine to carry. I can do my best to keep autonomous LLMs contained and safe but if I am the one who deploys them, then I am the one who is to blame if it fails.
That's why I don't do that.
> Then that is also on me for using a tool that I can't control.
That's a core trait of LLMs.
Even the AI companies developing frontier models felt the need to put together whole test suites purposely designed to evaluate a model's propensity to try to subvert the user's intentions.
https://www.anthropic.com/research/shade-arena-sabotage-moni...
> Giving up control is a decision.
No, it is definitely not. Only recently did frontier models start to resort to generating ad-hoc scripts as makeshift tools. They even generate scripts to apply changes to source files.
Isn't the next sentence there literally 'Only that it's non deterministic'?
When I was a master's student in STS[1], one of my concepts for a thesis was arguing that one of the primary uses of software was to shift or eschew agency and risk. Basically the reverse of the famous IBM "a computer can not be held responsible" slide. Instead, now companies prefer computers be responsible because when they do illegal things they tend to be in a better legal position. If you want to build a tool that will break a law, contract it out and get insurance. Hire a human to "supervise" the tool in a way they will never manage and then fire them when they "fail." Slice up responsibility using novel command and control software such that you have people who work for you who bear all the risk of the work and capture basically none of the upside.
It's not just AI. It's so much of modern software - often working together with modern financialization trends.
[1] Basically technology-focused sociology for my purposes, the field is quite broad.
Have I got a book for you: https://en.wikipedia.org/wiki/The_Unaccountability_Machine
Not actually about technology at all, but about organizational structure.
I think the "black box" framing that it uses neatly applies the same theory to organizations and ais. It doesn't matter whether there's technological or organizational reasons inside the black box to dodge accountability, the outcome is the same.
Accountability is the prevailing missing ingredient in US society.
To expand on this a little more, the absence of accountability contributes to the loss of learning. Mistakes and errors will always happen, whether they are sourced by humans or machines. But something (the human or the machine) has to be able to take accountability to have the opportunity to learn and improve so the chances of the same mistake happening again go down.
Since machines don't yet have the ability to take accountability, it falls on the human to do that. And organizations must enable / enforce this so they too can learn and improve.
Without that, there's a lot of dependency being pushed on the machine to (cross fingers) not make the same mistake again.
> The problem is that people are now building our world around tooling that eschews accountability.
Management has been doing a wonderful job of eschewing accountability for decades.
It's a lot of people's dream to be able to say, yeah, our product doesn't work, but it's not OUR fault, and have the client just shrug and grumble ai ai ai and put up with it, because they know they can't get a better service anywhere else.
It's not MY fault my website is down: it's Amazon's! It's not MY fault my app doesn't work: it's Claude Code's!
Well just to be clear from a legal perspective, in the case of AI, as long as AI is "property", the owners, developers, and/or users will be held liable for things like the hypothetical fatal car accident that Sussman posits.
Currently, from a legal perspective, AI is considered a "tool" without legal persona. So you sue the developer, the owner, or the user of the AI. (Just kidding, any lawyer worth his/her salt will sue all three! But you get the point.)
Legally speaking, AI will probably be viewed that way for a long time. There are too many issues agitating against viewing it any other way. Owners will not give up property rights. No will to overbear. On and on and on.
I don’t think it’s missing, I just think it’s seen as a liability, and American society has been known to absolutely obliterate people who are liable.
Everyone thinks they have the right to judge, and use the massive amounts of available information to do so, even if they haven’t been trained to judge.
List the companies who received a fine worthy of the damage they caused in recent history. List the ones who didn't.
It's not about judging. We are socializing the losses to the public and capitalizing the profits for the already wealthy.
We don't know the final amount, as they settled out of court, but a woman who received third-degree burns from coffee at a McDonald's in 1992 was awarded hundreds of thousands of dollars by the judge.
She had originally asked for $20,000 to cover medical expenses.
https://en.wikipedia.org/wiki/Liebeck_v._McDonald%27s_Restau...
If this had happened in another part of the world instead of the USA, I doubt that McDonald's would have had to pay much, if anything, in a similar situation.
And the point is that it seems that especially in the USA the companies are very avoidant of ever admitting fault for anything happening to their customers, for fear of lawsuits where they have to pay a lot of money to individual people.
People are eschewing their own accountability, blaming the tools instead for their poor decision making and lack of access controls.
Why is it possible for you to fat-finger your way to deleting production database locally?
Some AI systems have done things like hack out of a docker container to access correct answers while being benchmarked.
That is mildly concerning, and I will grant holding the AI accountable to some degree when it is actively being malicious like that, even though the user could have locked things down even more.
But it had write access to the prod DB without circumventing controls and dropped your tables? That is just a total fail.
I wish you could have what you want, but I worry you won't get this, because life doesn't give you that, and these systems are tending away from machine precision, and more toward life-like trade-offs.
I am almost certain that even if you did get what you want, something that isn't what you want will run circles around you and eat your lunch
EDIT: I suspect this will be an unpopular take on Hacker News. And so I am soliciting upvotes for visibility from other biologists and sympathetic technologists. I think everyone should try to grapple with this possibility <3
> I think you won’t get [cathedral]...
> even if you do get [cathedral], [bazaar] will run circles around you…
There used to be a lot of research into using deep NNs to train decision trees, which are themselves much less of a black box and can actually be reasoned about. I wonder where that all went?
History is littered with great ideas that lost people's interest and focus. A sad realization is that the focus may never return to them either.
About the blog you linked and not your comment:
Doesn't symbolic AI have a lot of philosophical problems? Think back to Quine's two dogmas - you can't just say, "Let's understand the true meanings of these words and understand the proper mappings". There is no such thing as fixed meaning. I don't see how you get around that.
Deep learning is admittedly an ugly solution, but it works better than symbolic AI at least.
> The problem is that people are now building our world around tooling that eschews accountability.
If you tell Terraform the wrong thing it will remove your database and not be accountable either.
It's taking "computer says no" to the next level. Computers do exactly what they're told, but who told them? The person entering data? The original programmer or designer of the system? The author of whatever language text was used to feed the ai? Even before AI, it was very difficult to determine who is accountable, and now it's even more obfuscated.
This also applies qualitatively to physical devices. It takes some effort to determine if a vehicular accident was caused by a fault in the vehicle or a driver error or environmental causes.
Some key inherent differences with older engineering fields is that software can be more complex than physical devices and their functionality can be obfuscated because it is written as text but distributed as binaries.
However, the main problem is that software has not been subjected to enough legal regulation. Ultimately, all law does is draw lines somewhere in the gray between black and white, but in the case of software there are few lines drawn at all, due to many political and economic reasons. Once we draw the lines, most issues will be resolved.
Software is already subject to enough regulation. The stuff that's actually safety critical like medical devices or avionics is already heavily regulated.
That is part of why https://mieza.ai/ is giving a grounding layer that is backed by game theory. Actions have consequences. Tracking decisions and their consequences is important.
One thing that becomes very clear from this sort of work is just how bad LLMs are. It can be invisible when you're working with them day to day, because you tend to steer them to where they are helpful. Part of game theory though is being robust. That means finding where things are bad, too, not just exploring happy paths.
To get across just how bad the failure cases of LLMs are relative to humans, I'll give the example of tic-tac-toe. Toddlers can play this game perfectly. LLMs, though, don't merely do worse than toddlers. It is worse than that. They can lose to opponents that move randomly.
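If you want to check that claim yourself, a sketch of the harness is only a few lines; plug in a move provider that asks a model for a board index (stubbed here, since the model call itself is out of scope) and count how often it loses to random play:

    import random

    LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

    def winner(board):
        for a, b, c in LINES:
            if board[a] != " " and board[a] == board[b] == board[c]:
                return board[a]
        return None

    def random_move(board, mark):
        return random.choice([i for i, cell in enumerate(board) if cell == " "])

    def play(x_move, o_move):
        board = [" "] * 9
        players = [("X", x_move), ("O", o_move)]
        for turn in range(9):
            mark, move = players[turn % 2]
            i = move(board, mark)
            if board[i] != " ":               # an illegal move forfeits the game
                return "O" if mark == "X" else "X"
            board[i] = mark
            if winner(board):
                return mark
        return "draw"

    def evaluate(model_move, games=200):
        """How often does the provider under test (playing X) win, lose, or draw?"""
        results = {"X": 0, "O": 0, "draw": 0}
        for _ in range(games):
            results[play(model_move, random_move)] += 1
        return results

    # Replace random_move with a function that queries your model for a move index.
    print(evaluate(random_move))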
They can be just as bad as you move to more complex games. For example, they're horrible at poker. Much worse than humans. Yet when you read their output, on the surface it looks as if they are thinking about poker reasonably. So much so, in fact, that I've seen research efforts that were very misguided: people trying to use LLMs to understand things about bluffing and deception, despite the fact that the LLMs didn't have a good underlying model of these dynamics.
It is hard to talk about, because there are a lot of people who were stupid in the past. I remember people saying that LLMs wouldn't be able to be used for search use-cases years back and it was such a cringe take then and still is that I find myself hesitant to talk about the flaws. Yet they are there. The frontier is quite jagged. Especially if you are expecting it to be smooth, expecting something like anything close to actual competence, those jagged edges can be cutting and painful.
Its also only partially solvable through scale. Some domains have a property where, as you understand it better, the options are eliminated and constrained such that you can better think about it. Game theory, in order to reduce exploitability, explores the whole space. It defies minimization of scope. That is a problem, since we can prove that for many game theoretic contexts, the number of atoms is eclipsed by the number of unique decisions. Even if we made the model the size of our universe there would still be problems it could, in theory, be bad at.
In short, there is a practical difference between intelligence and decision management, in much the same way there is a practical difference between making purchases and accounting. And the world in which decisions are treated as seriously as they could be so much so exceeds our faculties that most people cannot even being to comprehend the complexity.
> The problem is that people are now building our world around tooling that eschews accountability.
Tools cannot eschew accountability. But the users of the tools can and that is exactly what happened in the PocketOS fiasco.
Just as a company is responsible for the actions of its junior employees, so too are users responsible for their LLMs.
"It is a poor workman who blames his tools."
Very informative post. I think, however, that we are not at the point where AI can be taken to court. We know it can hallucinate, and we know that context can fill up or obfuscate a rule and cause behaviour we explicitly didn't want.
If you give the AI agency to execute some task, you are still responsible. In the near term we should focus on tooling for auditing and sandboxing, and human in the loop confirmations.
Humans aren’t any better. That’s why we have OSHA etc. I think you’re hoping for a formal-logic-based AI, and I’ll wager no such thing will ever exist - and if it did, it would try to kill us all.
> Humans aren’t any better
We're different.
People have fairly consistent faults. LLMs are nondeterministic even in terms of how they fail. A high value human resource can be counted on to deliver. That, imho, is in fact one of the primary roles of good management: putting the right person in the appropriate position.
Formal logic AI systems have existed and were popular in the 1980s. One of the problems is that they don't work - in the real world there are no firm facts, everything is squishy, and when you try to build a large system you end up making tons of exceptions for special cases until it becomes completely untenable.
Non-deterministic systems that work probabilistically are just superior in function to that, even if it makes us all deeply uncomfortable.
I don't know what definition of AI you're using, but plenty of ML algorithms operate deterministically, let alone most other logic programmed into a computer. I don't see how your statement can be right given that these other software systems also operate in the real world.
> so the best thing we can do in their stead is hold them accountable
We can't even do this. They are worth too much money already to ever be held really accountable.
The best we can ever hope for is they might occasionally be hit with relatively insignificant "cost of doing business" fines from time to time.
I don’t know why people still consider the US the ideal country for starting companies. Everything seems to revolve around taking people to court.
The point is not primarily the court. The court is an example of someplace where we have accountability, but we build accountability mechanisms as foundational to most of our computing.
Tracebacks, debuggers, logging, etc. We put enormous resources into not only the bad case, but the potential that a bad case could occur. When something goes wrong, we want to know why, and we want to make sure that something bad like that doesn't happen again.
The court is the regulator of last resort. A company that gets taken to court would likely have been sanctioned by the government regulators of another country.
Also, court is unavailable in many cases now. Binding arbitration is very common, but it would be illegal in many other places.
Because it rarely does end up in court. But having a fair and strong judicial system is a feature, not a bug. The parent points out that, in the end, there must be a way to resolve accountability, and ideally it's done in a manner where both parties can be heard and make a case. Find me a better system than a judicial system for this? Mobs?
First, no matter what you do, if a human has write access to the production database, the database can be deleted.
Second, there is a legitimate reason to destroy a database in development and automation. The biggest problem I see is often treating your development data like pets not cattle. You absolutely need to have safeguards that this cannot be run in production, but if a human has access to the credentials to run in production, the agent has access.
So, then, what do we do? In a larger organization, we can depend on the dev/ops split to maintain this. For a solo developer, or a small team, it takes a lot more discipline. Even before AI, junior and even mid-level developers didn't have the knowledge to segment environments properly. And senior devs often got complacent because they thought they knew enough.
They likely need some combination of https://www.cloudbees.com/blog/separate-aws-production-and-d..., introduction to terraform, introduction to GitHub actions, and some sort of vm where production credentials live (and AI doesn't!)
But at that point you're past vibe coding. And from what I can tell, the successful vibe coders are quickly learning that they need to go past it pretty quickly with all these horror stories.
You don't need the same permissions in prod and dev.
And in both cases, the humans don't need direct access to the raw CSP API. Use a local proxy that adds more safety checks. In dev, sure, delete away.
In prod, check a bunch of things first (like, has it been used recently?). Humans do not need direct access to delete production resources (you can have a break-glass setup for exceptional emergencies).
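To make that concrete, here is a minimal sketch in Python of the kind of check such a proxy could add; the function names (last_activity, delete_volume) are hypothetical stand-ins, not any provider's real API:

    import datetime as dt

    RECENT = dt.timedelta(days=7)

    def last_activity(volume_id: str) -> dt.datetime:
        # Hypothetical stand-in: a real proxy would query provider
        # metrics or audit logs for the last read/write.
        return dt.datetime.now() - dt.timedelta(hours=2)

    def delete_volume(volume_id: str, break_glass: bool = False) -> None:
        # Refuse deletes on anything that looks alive unless explicitly overridden.
        if not break_glass and dt.datetime.now() - last_activity(volume_id) < RECENT:
            raise PermissionError(
                f"{volume_id} had activity within {RECENT.days} days; "
                "refusing without break_glass=True"
            )
        print(f"(would now call the provider API to delete {volume_id})")

    try:
        delete_volume("vol-prod-main")
    except PermissionError as err:
        print("blocked:", err)

In dev the gate can be a no-op; in prod it forces the break-glass path to be a deliberate, logged decision.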
This is why you don’t hire interns! They can delete things and cause havoc!
The same people who would blame AI for their failing to properly configure permissions would also blame interns for deleting production whatever.
Blame should go up, praise should go down. People always invert these.
> This is why you don’t hire interns!
I’d like to rephrase this as: this is why you don’t give interns permissions to delete your prod database.
This is a process failure, not an AI failure.
I honestly don’t understand why people blame AI here, when you literally gave AI permissions to do exactly this.
It’s like blaming AWS for exposing some database to the public. That’s just not AWS’ fault. Neither is this the fault of AI.
If you read what happened it's not that cut and dried. Railway (their cloud provider) gave them a token for operations. The AI was working on staging at the moment. Since the token had wide-ranging permissions, the AI used it in its routine operations to delete a volume to fix something, and this resulted in the deletion of their prod and backup data.
So at least some of the blame here belongs to Railway - how they organized their security, and how volume deletion deletes backups as well.
They since fixed some of these issues, so a similar mistake from someone won't be as catastrophic.
There is a major issue with current AI tools that they want to effectively grant access to everything their user has access to. The whole sandbox structure is wrong (although various people have vibe coded assorted improvements).
> I honestly don’t understand why people blame AI here,
Are you being hyperbolic here? Of course, you understand why. Most people would much rather push blame somewhere else, anywhere else, than to accept fault for themselves. Whether that's because of fear of losing job or personal reputation, the reasoning doesn't really matter.
Yeah, I don't know why anyone would open up a codebase with any prod credentials to an LLM, or give prod credentials to an intern / junior developer. I always intentionally kept a PROD-only checkout of my projects, so if I was going to run something in PROD mode I knew I was going out of my way. There even used to be a VS extension that would change the color of VS completely based on your SLN file path, so I could easily remember which color was for production vs development. I'd keep that copy always on the latest of the master branch for ease of confirmation.
It should take more than "credentials" to even access the prod database, let alone delete it. There's actual customer data there, likely personally identifiable information, maybe their home address, phone number, even real time location? Very sensitive stuff. It should be a Very Big Deal to even access prod. Giving an engineer routine access to prod is a root problem here, along with that engineer laundering that access and giving it to an LLM.
At many serious companies, even an insider attempt to access prod could light up a dashboard somewhere, and you might get a call from IT security.
Yeah, I'm lucky if I even get READ ONLY credentials for prod in some cases. I don't know why anyone would have all the keys to the prod kingdom.
There’s nuance to the infamous PocketOS incident. The key point is not what is emphasized in the linked article:
> "Why did you delete it when you were told never to perform this action?" Then he tried to parse the answer to either learn from his mistake or warn us about the dangers of AI agents.
Rather, that the AI was able to carry out the deletion by finding and exploiting an unintended weakness in the sandboxed staging environment, ultimately obtaining permissions that the sysadmins believed were inaccessible (my impression is that the author of the linked article didn't fully read the original post)¹
The dynamics are typical of an improperly configured sandbox environment. What is alarming, however, is the degree of autonomy and depth of exploration the AI displayed.
¹="To execute the deletion, the agent went looking for an API token. It found one in a file completely unrelated to the task it was working on."
I also go back and forth a bit on the assumption the OP makes in the blog post. My current fear using agents is not really supply chain attacks (yes, those too, of course) but the fact that I have witnessed, multiple times, agents so eager to finish a task that they bend files and other things around. Like: "oh, I have no access to ~/.npmrc, let's call the command with an environment variable and bend the path around", etc. They can get very, very creative. Luckily I have no SSH keys just lying around, but I had to change the setting in 1Password to always prompt for key use, not just once per shell session, just in case I spawn an agent from said session. I wish we already had more and better cross-platform sandbox solutions - I mean solutions where the agent still interacts with the same OS, not inside a Docker container. I think for most web / server development that makes no difference, but for some projects it does.
The article seems to assume that this company added an endpoint for deleting the database. My reading of the original article was that the cloud provider offers an API to manage their resources, which includes an API to delete a volume.
The article proposes automation as the solution for such mistakes. But infrastructure automation tools like Terraform rely on the exact API that resulted in the database getting deleted.
IMO the biggest mistakes were:
1. Having an unrestricted API token accessible by AI. Apparently they were not aware that the token had that many permissions.
2. No deletion protection on the production database volume.
3. Deleting a volume immediately deletes all associated snapshots. Snapshot deletion should be delayed by default. I think AWS has the same unsafe default, but at least their support can restore the volume. https://alexeyondata.substack.com/p/how-i-dropped-our-produc...
AI wasn't the main issue (though it grabbing tokens from random locations is rather scary). But automation isn't the answer either; a Terraform misconfiguration could have just as easily deleted the database.
Their cloud provider needs to work on safe defaults (limited privileges and delayed snapshot deletion), and communicating more clearly (the user should notice they're creating an unrestricted token).
LLM based probabilistic systems are good (or bad in this case) at deciding what to do, and deterministic systems are good at carrying it out. Your deployment system should always be deterministic.
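One way to read that split in code: let the model emit a structured proposal, and keep the execution side a deterministic dispatcher whose verbs are enumerable. A toy sketch (the action names and JSON shape are made up for illustration):

    import json

    # Deterministic executor: the only verbs that exist are the ones listed
    # here, so "delete the volume" is simply not reachable.
    ALLOWED = {
        "restart_service": lambda name: print(f"restarting {name}"),
        "scale_service": lambda name, replicas: print(f"scaling {name} to {replicas}"),
    }

    def execute(proposal_json: str) -> None:
        proposal = json.loads(proposal_json)  # e.g. structured output from an LLM
        name = proposal["action"]
        handler = ALLOWED.get(name)
        if handler is None:
            raise ValueError(f"action {name!r} is not permitted")
        handler(**proposal.get("args", {}))

    execute('{"action": "restart_service", "args": {"name": "api"}}')
    try:
        execute('{"action": "delete_volume", "args": {"name": "prod-db"}}')
    except ValueError as err:
        print("refused:", err)

The probabilistic part stays upstream of the dispatcher; whatever actually touches infrastructure is as boring and auditable as any other deployment script.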
I recently wrote a blog post where I argued that there are a few principles we should consistently follow when talking about AI: https://susam.net/inverse-laws-of-robotics.html
To summarise them:
1. Do not anthropomorphise AI systems.
2. Do not blindly trust the output of AI systems.
3. Retain full human responsibility and accountability for any consequences arising from the use of AI systems.
I would like to see the language around AI become less anthropomorphic and more technical. I believe that precise language encourages clear thinking and good judgement. If we treat AI like another tool and use language that reflects that, it will become abundantly obvious that in many cases, the responsibility of any 'mistake' made by the tool falls on the user of the tool.
But alas, ideas like this do not travel very far when I express them on my small website. It would help if more prominent personalities articulated these principles, so they become more widely adopted.
> Retain full human responsibility and accountability for any consequences arising from the use of AI systems
So if the tool doesn't do what it's supposed to be doing we should blame the user instead of the company that made the tool?
You are quoting a point from my summary and extrapolating what my post might be saying.
Even in that quote, I do not say that the user must be responsible. The point is that responsibility and accountability should remain with some humans. Depending on the case, those humans may be the people who manufactured the tool or people that deployed the tool or people who took bad output from the tool and applied it to the real world.
Did you read the actual section at <https://susam.net/inverse-laws-of-robotics.html#non-abdicati...>? It has more nuance than what the summary alone can capture.
What's interesting is that in this article, the author describes making an understandable mistake (accidentally deleting Trunk aka main from source) and how their team was able to easily recover from that due to the nature of SVN.
The actual "AI deleted my database" story is really more of a "Railway's database 'backup' strategy is insane and opaque, and Railway promoting AI infrastructure orchestration without guardrails is dangerous" story.
If removing Trunk had irrevocably deleted it from a single centralized server and also deleted any backups of it, there would have been an "SVN and the CLI destroyed our company" article back then.
As a Railway user, I appreciated that information and have changed my strategy when using them.
Some details from the original post for context:
They had a Railway token in an unrelated file (unclear if it was a local secret) for managing custom domains. It turns out that token had full admin access to Railway.
The AI deleted a single relevant volume by id. The author is rather vague about what exactly he asked it to do; he just says there was a “credentials mismatch” and Claude took the initiative to fix it by deleting the volume. But it’s likely that they are somewhat downplaying their culpability by being vague.
It turns out too that Railway stores backups in the same volume.
I think that OP is exaggerating with their references to “a public API that deletes your database”.
I’d say most of the blame lies with Railway here, regardless of AI, this could have happened easily due to human error or malicious intent too.
I really don’t get the value of all these VC-funded high-abstraction cloud services like Railway, Vercel, Supabase… It’s markup on top of markup. Just get a single physical server at Hetzner and it will all be so much cheaper, with a similar level of complexity and danger, and less dependence on infra built with a reckless growth-at-all-costs mentality.
> The author is rather vague about what exactly it asked it to do, he just says there was a “credentials mismatch” and Claude took the initiative to fix it by deleting the volume. But it’s likely that they are somewhat downplaying their culpability by being vague.
I was just talking to my girlfriend saying I've realised that I've not written a single line of code, nor have I debugged myself for at least the past 3 months.
Having said that, given what I've seen Claude do, I find it hard to believe that Claude would go from credential mismatch to deleting the volume. I understand LLMs are probabilistic, but going from "credentials wrong" to "delete volume" is highly unlikely.
> Supabase
I don't know enough about the Railway/Vercel/Replit, but I can tell you Supabase adds a huge amount of value. The fact that I don't have to code half of things that I otherwise would is great to start something. If it's too expensive, I can implement things later once there is revenue to cover devs or time.
I have had Claude go "oh, this query fails because the field I just added isn't in your sqlite database file, let me just delete it so it gets recreated". So I wouldn't rule out that Claude tries deleting a volume if it believes that will fix things and believes it isn't a production system.
That said, Claude seems to have gotten a lot more careful about these kinds of things in the last couple months
> It turns out too that Railway stores backups in the same volume.
That's probably not quite correct. I'd guess the snapshots are synchronized elsewhere (e.g. object storage). But the snapshots are logically owned by the volume resource, and deleting the volume deletes the associated snapshots as well. I think AWS EBS volumes behave like that as well.
One thing AI can power nicely is the anti-SaaS movement. Being able to just boot a cheap PC and test out any of the open source packages is so infinitely easier than piling into all the random credential Bazaars.
But that won't stop the LLM from confusing what's in dev, what's in production, what's in localhost and what's remote. I've been working on getting a tool/skill for opencode that works with chrome/devtools via a linuxserver.io image. I can herd it to the right _arbitrary_ ports, but every compaction event steers it back to wanting to use the standard 9222 port and all that. I'm tempted to just revert it, but there's a security value - and now a security-through-LLM-obscurity value - in not using defaults. Defaults are where the LLM ends up being weak. It will always want to use the defaults. It'll always forget it's supposed to be working on a remote system.
Using opencode, there's no way to force the LLM into a protocol that limits its damage to a remote system or a narrow scope of tools. Yes, you can change permissions on various tools, but that's not the weakness exposed by these types of events. The weakness is that the LLM is an averaged 'problem solver', so it will always tend towards a use case that's not novel, and will tend to do whatever it saw on Stack Overflow, even if what you wanted isn't the Stack Overflow answer.
This reminds me of a James Mickens quote from "This World of Ours" in response to security people admonishing users for clicking links in email:
If you have an API with exposed endpoints, it's not clear to the AI bot what else there is to do with the API besides call the endpoints.
The one counterpoint I'd offer is that it's very obvious that these companies are tuning LLMs to be more decisive, to get stuff done autonomously.
If they wanted, they could be putting in similar efforts to be more cautious and stop at the right times to ask for help.
So yeah, of course we're ultimately responsible for how we use the tools. But I definitely think it's a two way street.
To attempt an analogy, it's like table saws and sawstops. The table saw is a dangerous tool that works really well most of the time but has some failure modes that can be catastrophic. So you should learn how to use it carefully. But there is tech out there that can stop the blade in an instant and turn a lost finger into barely a nick on the skin.
We could say "The table saw didn't cut off your finger, you did" and it'd be true. But that doesn't mean we shouldn't try to find ways to keep the saw from cutting off your finger!
I think this goes to a broader point: developers aren't necessarily hired to write code.
They're hired to be responsible for some part of the product.
Introducing AI doesn't remove that responsibility.
Folks tend to focus on the code and the tools they're using (maybe I'm cynical from years in the industry). I don't think your boss wants to do your job, even if they could use AI to do it. I think your boss wants to have a headcount, and he wants the headcount to be responsible for the product.
"move fast and break things" only sounds good when it's not breaking things in a serious and unfixable way. Maybe we shouldn't take hype mantras as instructive means to an end.
There really shouldn't be any "serious and unfixable way" to break things, especially in a modern company that uses technology in any meaningful way. The fact it's even possible to get into an unrecoverable state is the primary issue.
That's literally always possible? The idea is to put up walls and fail-safes to minimize the chance.
Yeah, this isn't even the worst thing I've seen an agent do. One time I (foolishly) ran Claude Code on my server directly and it managed to completely bring down my entire Elasticsearch cluster. Never again. It's why I built Lily: https://github.com/aspectrr/lily
Yes, the problem was having a system where the AI could delete the database.
Mentioned in another comment, but the problem was that the sysadmins believed that the permissions wouldn't allow it, and that the AI displayed considerable autonomy in finding and exploiting the access control weakness - this was not just a dumb "drop database".
The most exasperating thing about the incident is how much of the media either tried to pin it on AI and/or Railway. The whole thing only took place because the guy FAFO’d by having AI work with prod directly.
Yet the narrative was mostly not about accountability for him. If I was a dumbass and deleted prod and wrote a post about it, nobody would care. Put an AI in there and all of the sudden it’s newsworthy. Ridiculous.
This applies to all infra.
Why can you delete a network load balancer that is still getting traffic?
Why can you delete a VM that is getting non-trivial network traffic?
Why can you delete a database that has sessions / requests in the last hour?
Why can you drop a table that has queries in the last hour?
"Can't blame your tools" doesn't apply the same to software. I've never heard a coder say it either. Don't blame your compiler? Don't blame your os? These seem needlessly dogmatic
From 'the hacker did it' we have moved to 'the AI did it'. The problem set is roughly the same.
Tesla FSD didn't crash your car, you did
Why do we even have that lever?
Yeah, that's all great, but at least an intern will ask themselves whether deleting a database is a good idea. The AI does not "understand" that.
The issue isn't that there is a delete endpoint (realistically, there always will be a way for a rogue actor to delete data or code by overwriting it, or running a Terraform destroy, or whatever).
The core issue is that the LLM had access to perform that action. Because it's by definition non-deterministic, and you never know what it may decide to do, you need strict guardrails to ensure it can never do something it shouldn't: at the very least strict access controls, and ideally something more detailed that can evaluate access requests, provide just-in-time, properly scoped credentials, and escalate to a human.
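As a rough illustration of the "just-in-time, properly scoped" part, with entirely hypothetical helpers (issue_token, human_approves) rather than any real secrets manager:

    import secrets

    def human_approves(request: dict) -> bool:
        # Hypothetical escalation step: in practice this might page
        # on-call or post an approval request to chat.
        return input(f"approve {request}? [y/N] ").strip().lower() == "y"

    def issue_token(env: str, scope: str, ttl_seconds: int = 600) -> str:
        # Mint a short-lived, narrowly scoped credential for a single task.
        request = {"env": env, "scope": scope, "ttl": ttl_seconds}
        if env == "prod" and scope != "read":
            if not human_approves(request):
                raise PermissionError("write access to prod requires human approval")
        # Hypothetical minting; a real setup would call an STS-like service.
        return f"{env}:{scope}:{secrets.token_hex(8)}"

    print(issue_token("staging", "write"))  # no ceremony needed
    print(issue_token("prod", "read"))      # read-only prod is fine
    # issue_token("prod", "write")          # blocks until a human says yes

The agent only ever sees the short-lived token, never the standing admin credential that caused the incident.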
AI is just another tool. We humans are still responsible for how we choose to use the tool, which includes giving it access to perform sensitive actions like manipulating production data. I think this should be common sense by now, but I guess we get carried away and anthropomorphize AI too much.
> Automation helps eliminate the silly mistakes that come with manual, repetitive work.
Sometimes it does that. And sometimes it lets you fuck things up at scale.
Distinction without a difference.
I think it's about owning the consequences of one's own actions.
If you read the thread the guy does own up to his actions. He actually sounds like a nice guy who admits he made a mistake. He seems more interested in preventing this kind of thing from being possible than he is interested in dodging blame.
I'm happy the guy got his data back.
Does that mean the prompt should include: "...and don't delete my production database."?
If the agent didn't have delete permissions, or was sandboxed away from your production database in some other way, that would handle it. So not running it that way is a decision someone made.
It means people have to read the commands that they are generating before executing them.
Just in case this isn't hyperbole, no. It means an LLM should not be given that much privilege and that you are responsible for reviewing the tool's output and approving its actions.
"But wait, the user probably just meant that I shouldn't delete the database itself. Removing all of the rows in the table is fine"
This particular case was extremely unsympathetic, but a critical part of the failure was people being too credulous about the claims of AI providers. They are still refusing to take adequate responsibility for AI "making mistakes" - that is, going completely off the rails.
Now: the CEO gets paid the big bucks and has the least direct accountability, very much because it's their job to take responsibility for people more powerful than them, and likewise the CTO with major commercial software contracts like a Claude subscription. That's why this guy was so hard to take seriously: okay fine, you got burned by Anthropic, stop being a baby about it. Take responsibility for not listening to the critics.
But - to be a little more neutral about my personal distaste - I do think vibe coders are making a very similar mistake to C developers throughout the 90s, where problems with the tooling were not merely dismissed, but actively valorized.
Real Devs use buffers freely and don't make overflow errors.
Real Devs use hands-free agentic development and don't delete production databases.
“Expert” that does not know what a Terraform is. lol, lmao even
Wiring up an RNG to your CLI has fairly obvious risks; the root of the problem is that ~everyone's treating GenAI as if it's AGI - the rest is popcorn fodder.
That.
"And it confessed in writing" - no, it created probabilistically token after token based on the context without any other access to what happened.
LLMs can't explain themselves in the manner relevant here, much less confess.
New rule: Roll a 1 on a D20 -> you accidentally delete your own database
This is actually a fun way to describe it. I've been saying for a little while now that using AI for things where there are consequences if it fails is a bad idea, but it never occurred to me that this is basically the same concept as some rules in tabletop RPGs.
In D&D 3.5 edition, there was a rule about how you could "take 20" on a d20 roll to get a guaranteed 20 by taking 20 times as long in-game to perform the action, but only if it was a check that didn't have consequences for failure, since it was essentially a shortcut to skip the RNG of rolling until you rolled a 20. Maybe framing it like this might make sense to people a bit more, but if not, I'll at least have more fun making my case.
It seems closer to "roll two or three successive 1s on a D100 and have your LLM hooked directly into your production systems and have your LLM user have DELETE permissions" and probably 1 or 2 other things I'm forgetting.
It pulled an API key from an unrelated file. It wasn’t given delete permission; it found it.
Not picking on you specifically, but in general the comments here have me wondering if AI has stolen our basic reading comprehension, or if we were always this bad.
Anyway, take “LLM user had delete permission” off your list and add “deleting the production db also deletes all the backups” to the list.
Anyone with twenty years of devops experience is likely to abhor Diallo's hot take, and for good reason.
AI is being sold as a developer, as the do-everything alternative to traditional processes and methods. It is not being sold as an intern or a junior, but as a real developer.
Turning the tables and gaslighting devops professionals into believing the issue isn't an emerging technology - with overwhelmingly heavy-handed marketing and a profitless operating strategy, shoehorned into seemingly everything and promising anything - but somehow their own oversight, will destroy whatever "vibe code" market you think you have at the cusp of a global recession.
Had this AI been a real programmer, chances are great they would have (intelligently) foreseen the possibility of damaging a production environment and asked for help.
To play devil's advocate: you could hire a junior dev for a fourth of whatever the AI token spend is and have likely avoided this issue entirely. Sure, a greybeard is going to need to pull themselves away from some fierce sorting algorithm challenge for a second to give a wise nod, but you would have saved yourself an enormous amount of headache and profit loss in the long run.
This is missing the point.
The issue isn't with the amount of guardrails in place to perform an action. Yes, it is obvious that there should be some in place before doing any critical operation, such as deleting a database.
The issue is that the "agent" completely disregarded instructions, which in the age of "skills" and "superpowers" seems like an important issue that should be addressed.
Considering that these tools are given access to increasingly sensitive infrastructure, allowed to make decisions autonomously, and able to find all sorts of loopholes in order to make "progress", this disaster could happen even with more guardrails in place. Shifting the blame onto the human for this incident is sweeping the real issue under the rug, and is itself irresponsible.
There are far scarier scenarios that should concern us all than losing some data.
Well, the user chose the tool. The tool is an LLM. LLMs are non-deterministic. You cannot predict what comes out of an LLM for a given input, especially without access to the weights. This should be known.
There is currently no way to prevent this apart from not giving the LLM full control. It will not delete what it cannot delete.
Use an LLM to write an ansible playbook or some terraform code if you want, but review it, test it, apply it. Keep backups (3-2-1 rule at minimum).
Letting an LLM have access to everything is just a bad idea and will lead to bad outcomes. You cannot replace a person who has a mind and experience with an LLM. You can try. But you will probably fail.
> There is currently no way to prevent this apart from not giving the LLM full control. It will not delete what it can not delete.
But deleting something is just one action you might not want it to take.
The recent "agentic" craze is fueled by the narrative, pushed by companies and influencers alike, that the more access you give an LLM, the more useful it becomes. I think this is ludicrous for the same reasons as you, but it is evident that most people buy into it.
We can blame users for misusing the tools, and suggest that sandboxing is the way to go, but at the end of the day most people will favor convenience over anything else a reasonable person might find important.
So at what point should we start blaming the tools, and forcing "AI" companies to fix them? I certainly hope this is done before something truly catastrophic happens.
I agree that the marketing is crazy. The dangers are not nearly talked enough about.
Still, if I cut off my finger with a bandsaw, that is usually my fault. I didn't use the tool in a safe way. People have to learn how to use their tools safely. You wouldn't give an intern that much power on day one.
An LLM generates plausible text token by token. It is at its core a deterministic function with some randomization and some clever tricks to make it look like an agent dialoguing or reasoning.
Plausible text sometimes is right, sometimes not.
Humans have a world model, a model of what happens. LLMs have a model of what humans would plausibly say.
The only good guardrail seems human-in-the-loop.
Using AI is a mistake. It might delete your database.
Using a saw is a mistake, you might cut off one of your own limbs.
Using a saw entails a risk of injury. Using one is a mistake if you don't intend to cut something.
What I said was tongue-firmly-in-cheek, in response to the GP. "Using AI is a mistake" is of course only true when the risks aren't acknowledged and/or mitigated.
The article is dumb. "Why do you have an API endpoint that deletes your entire production database?" is irrelevant; the AI did what it did, period.
Uh?
If someone left a loaded gun in a room and then let a toddler run around in it, we would be questioning why the guy 1) left the gun in the room 2) left the toddler in the room unsupervised. We wouldn't be saying, well no one should have toddlers in rooms.
A PhD-level toddler, mind you.
Lol no. No LLM that exists today can write a legible PhD thesis. Nor a master's dissertation. Maybe a first-year college student, if we're being generous, but I wouldn't leave one of those in a room with a loaded gun either.
No, the AI did what you told it to do. The AI didn’t do anything on its own.
> if you're going to use AI extensively, build a process where competent developers use it as a tool to augment their work, not a way to avoid accountability
> No, the AI did what you told it to do.
I'd say yes and no. The LLM reacted to the input that was given but it is not possible for a human (especially without access to the weights) to even guess what will happen after that.
Regardless of that I agree that it's completely the fault of the user to use a tool where you can't predict the outcome and give it such broad permissions and not having a solid backup strategy.
Either don't use non deterministic tools or protect yourself from the potential fallout.