Definitively super human ultra intelligence by the end of Q4!!!!11 Also not able to use tools, which are not explicitly built for machine consumption.
Getting agents used to using `--force` to bypass prompts seems like a bad idea. `--force` is for when the action failed (or would fail) for some reason and you want it to definitely happen this time.
I think `--yes` or `--yes-do-the-dangerous-thing` is leagues better.
It can also, in the case of an LLM, bias the model towards using that sort of flag more often, which is less than ideal when it then uses a more ordinary Unix command where `--force` means something genuinely dangerous.
`--non-interactive` has precedent too.
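One way to keep that distinction explicit is to define the two meanings as separate flags in the parser itself. A minimal sketch with Python's argparse, where `mytool` and the flag wiring are hypothetical, not any real tool's interface:

```python
# Hypothetical sketch: keep "skip the confirmation prompt" and
# "override a failing precondition" as distinct flags.
import argparse

parser = argparse.ArgumentParser(prog="mytool")
parser.add_argument("--yes", "--non-interactive", action="store_true",
                    dest="assume_yes",
                    help="answer yes to confirmation prompts")
parser.add_argument("--force", action="store_true",
                    help="proceed even when preconditions fail")

args = parser.parse_args(["--yes"])
print(args.assume_yes, args.force)  # True False
```

Keeping the destinations separate means an agent that only wants to suppress a prompt never has to reach for the flag that also overrides safety checks.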
I don't want "agent-native CLIs" to proliferate because I'd rather we design CLIs for human use and programmatic (automation) use first. Agents are good at vomiting JSON between tool calls; I am not, and never will be.
Too many tools stray so wildly from UNIX principles. If we design for agents first we will likely see more and more of this.
The point IMO in "agent-native CLIs" is to make them match the statistical average.
Let the Agent use the CLI and if it guesses the wrong option, you make that the RIGHT option.
Every time it doesn't guess something right, you change it.
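Taken literally, this can be as small as accepting the commonly guessed spelling as an alias for the canonical flag. A hypothetical argparse sketch (`mytool` and the flag names are made up for illustration):

```python
# Hypothetical sketch of "make the wrong guess the right option":
# accept a commonly guessed alias alongside the canonical flag.
import argparse

parser = argparse.ArgumentParser(prog="mytool")
# Canonical flag plus the alias agents keep reaching for;
# both land in the same destination.
parser.add_argument("--output", "--out", dest="output", default="-",
                    help="where to write results ('-' for stdout)")

print(parser.parse_args(["--out", "report.txt"]).output)  # report.txt
```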
> Let the Agent use the CLI and if it guesses the wrong option, you make that the RIGHT option
This sounds backwards and presumes that the statistics machines which are LLMs are getting it right when they "average" out to the wrong command. No, fix the agent's behavior; don't change the CLI to accommodate it.
I don’t remember the specific examples off the top of my head (some are definitely ffmpeg commands), but I do know that when LLMs keep hallucinating command-line flags that don’t exist for a specific command, their “suggestion” is often actually very reasonable, and many developers are adding support for these common hallucinations to their tools.
It’s also likely that agents would be better off if they didn’t deal with JSON vomit either. I’m optimistic that agent frameworks will eventually come full circle and realize that concise, teletype-linear CLIs, aka old-school UNIX, are actually very effective and efficient for agents as well as humans!
I think every CLI is agent-native when invoked from Claude or any coding agent.
I was really surprised today. We're building Adaptive [1], an access management platform for psql, MySQL, VMs, k8s, etc. When you run `adaptive connect <db-name>` it creates a just-in-time tunnel and connects the user to the database. You cannot do traditional psql operations through it; that design is by choice.
Today I tried invoking it via Claude, and, god damn, it found a way to connect. It created a pseudo-shell in Python, passed the queries through, and treated our CLI like a tool. This would not have been humanly possible, partly because a human would think about the risks and good practice/bad practice, and would be scared to write and execute code like that; it just did it and achieved the goal.
[1] https://adaptive.live
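For the curious, the pseudo-shell trick is roughly: spawn the CLI as a child process and feed it queries over stdin. A hedged sketch in Python, with `cat` standing in for the real interactive tool (this is not Adaptive's actual interface):

```python
# Hypothetical sketch: drive an interactive CLI by piping commands
# into its stdin and reading its stdout, with `cat` as a stand-in
# for the real tool (e.g. `adaptive connect <db-name>`).
import subprocess

proc = subprocess.Popen(
    ["cat"],                      # stand-in interactive process
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)
out, _ = proc.communicate("SELECT 1;\n")  # "type" a query, read the reply
print(out.strip())  # SELECT 1;
```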
This reminds me that agents sometimes really like heredocs in shells, and when those fail they waste tokens retrying with a file instead.
Is it me or are all these articles about using AI effectively and building for AI just, you know, things that we should have been doing all along?
It feels like most of the “rules” are “don’t be an ass to your consumer”.
Partially, but I think if you design for agents, their needs are different enough from a human's that you end up making different choices.
I found myself nodding along to the linked tweet/article. Recently I did many rounds of iterative user-centered design with an agent to improve the CLI interface in Jobs [0], a task manager for LLMs. The resulting CLI follows most of these principles.
One great idea from the tweet that I will be adding: a `feedback` subcommand, for the agent to capture feedback while they work.
[0]: https://github.com/bensyverson/jobs
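For what it's worth, such a `feedback` subcommand can be tiny. A hypothetical argparse sketch (the names and log file are illustrative, not the actual Jobs CLI):

```python
# Hypothetical sketch of a `feedback` subcommand that appends an
# agent's notes to a local log file.
import argparse
import datetime
import pathlib

parser = argparse.ArgumentParser(prog="jobs")
sub = parser.add_subparsers(dest="command", required=True)
fb = sub.add_parser("feedback", help="record feedback while working")
fb.add_argument("message")

args = parser.parse_args(["feedback", "the --filter flag was confusing"])
if args.command == "feedback":
    line = f"{datetime.date.today()} {args.message}\n"
    with pathlib.Path("feedback.log").open("a") as f:
        f.write(line)
    print("recorded")
```

The appeal is that the channel costs the agent one short command rather than a detour through some other reporting mechanism.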
Non-x link: https://trevinsays.com/p/10-principles-for-agent-native-clis