Show HN: CATArena – Evaluating LLM agents via dynamic environment interactions

(github.com)

3 points | by jinqueeny | 10 hours ago
