The first so-called vulnerability: isn't that how a lot of platforms are actually built? Share a link or copy a link, and more often than not I'm sure I've read a warning like "anyone with that link may access that file".
Now, should I mention all the screw-ups I have seen at several SaaS companies with $1B+ valuations, including DocuSign and more security-oriented ones (PIM-related, etc.)?
For any software, you need a minimum of critical thinking and experience that you don't usually see.
Something worth noting is that the types of vulnerabilities LLMs introduce are notably different from what humans introduce: far fewer local issues like syntax mistakes, simple memory problems, etc., and far more broad issues like authn/authz.
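To make "broad issues like authn/authz" concrete, here is a minimal sketch of code that is syntactically clean and still wrong. Everything in it (the `Document` type, the `DOCS` store, both function names) is hypothetical and only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: int
    owner: str
    body: str


# Hypothetical in-memory store, for illustration only.
DOCS = {
    1: Document(1, "alice", "alice's notes"),
    2: Document(2, "bob", "bob's notes"),
}


def get_document_unsafe(current_user: str, doc_id: int) -> Document:
    # The caller is authenticated, but ownership is never checked:
    # any logged-in user can read any document by guessing its id.
    return DOCS[doc_id]


def get_document(current_user: str, doc_id: int) -> Document:
    doc = DOCS[doc_id]
    # Authorization: only the owner may read the document.
    if doc.owner != current_user:
        raise PermissionError(f"{current_user} may not read document {doc_id}")
    return doc
```

Nothing in the unsafe version would trip a linter or a type checker, which is why this class of bug is easy to miss in review.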
> prompting for test-driven development is not the same as enforcing code coverage thresholds in your build tool
Are they actually different? I would guess they have roughly the same efficacy. 100% code coverage means nothing, and this is especially true with LLMs.
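A minimal sketch of why 100% coverage can mean nothing. The function, its bug, and both check names are made up for illustration.

```python
def apply_discount(price: float, percent: float) -> float:
    # Bug: this adds the percentage instead of subtracting it.
    return price * (1 + percent / 100)


def exercise_only() -> None:
    # Executes every line of apply_discount, so line coverage is 100%,
    # but nothing is asserted, so the bug goes unnoticed.
    apply_discount(100.0, 10.0)


def exercise_and_assert() -> None:
    # Same coverage, but with an actual expectation.
    # This raises AssertionError against the buggy implementation.
    assert apply_discount(100.0, 10.0) == 90.0
```

A coverage threshold in the build tool is satisfied by `exercise_only` just as well as by `exercise_and_assert`. Only the second one checks behavior.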
I mean, isn't introducing safety guardrails as part of the system prompt actually a REALLY bad idea? This way you rely entirely on the model to follow the rule, but it's clear that even frontier models like Opus will start ignoring these things after a certain context length...
In our company we just run agents inside isolated containers with isolated network access, so an agent cannot even SSH or fuck anything up even if it gets access to something... That's the only safe way: inconvenient, true, but the only safe option.
PS: At the same time, I've observed that this way people actually use the agent more reasonably, e.g. producing helper scripts for their daily work, producing very specific things, or creating simple PoCs, but they don't commit to vibe-coding all the functionality in their software products.
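The container approach above can be sketched roughly like this. It assumes Docker. The image name `agent-sandbox:latest`, the work directory, and `agent.py` are hypothetical. `--network none` is an assumption too, and is stricter than the "isolated network access" described above. The point is that the limits are enforced by the runtime, not by a rule in the prompt that the model may stop following.

```python
import shlex


def sandbox_command(image: str, workdir: str, agent_cmd: list[str]) -> list[str]:
    """Build a `docker run` argv for an agent with no network and a read-only root."""
    return [
        "docker", "run", "--rm",
        "--network", "none",             # no network at all: no SSH, no outbound calls
        "--read-only",                   # root filesystem is read-only
        "--cap-drop", "ALL",             # drop all Linux capabilities
        "--volume", f"{workdir}:/work",  # the only writable path is this mount
        "--workdir", "/work",
        image,
        *agent_cmd,
    ]


# Hypothetical image, path, and entry point.
cmd = sandbox_command("agent-sandbox:latest", "/tmp/agent-work", ["python", "agent.py"])
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run(cmd)` would start the agent. It is left out here so the sketch runs without Docker installed.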
> "To combat this we need to write a security context file to guide the AI, be cautious with AI permission requests, create a daily security intelligence feed, and provide builders with a secure-by-default harness and templates."
Edit: To combat this we need to actually write and understand our code.
Vibe coding into production? You don't need to wait for scientists to produce research to know that's not a great idea.
You played yaself
We will learn the hard way... like always.