CVE-Bench: testing LLM agents on real-world vulnerability patches

(giovannigatti.github.io)

8 points | by logickkk1 | 2 hours ago

1 comment