the interesting design tension i ran into building in this space is context management for longer sessions. the model accumulates tool call history that degrades output quality well before you hit the hard context limit - you start seeing "let me check that again" loops and increasingly hedged tool selection.a few things that helped: (1) summarizing completed sub-task outputs into a compact working-memory block that replaces the full tool call history, (2) being aggressive about dropping intermediate file read results once the relevant information has been extracted, and (3) structuring the initial system prompt so the model has a clear mental model of what "done" looks like before it starts exploring.the swift angle is actually a nice fit - the structured concurrency model maps well to the agent loop, and the strong type system makes tool schema definition less error-prone than JSON string wrangling in most other languages.
I think this is a good learning project, based in a long perusal of the github repo. One suggestion: don’t call the CLI component of the project ‘claude’ - that seems like asking for legal takedown problems.
IIUC Gemini will run in Apple's cloud infra, not on device. The only "gemini" local model is really old by today's standards, and is not that smart for local inference (newer open source models are better).
the interesting design tension i ran into building in this space is context management for longer sessions. the model accumulates tool call history that degrades output quality well before you hit the hard context limit - you start seeing "let me check that again" loops and increasingly hedged tool selection.a few things that helped: (1) summarizing completed sub-task outputs into a compact working-memory block that replaces the full tool call history, (2) being aggressive about dropping intermediate file read results once the relevant information has been extracted, and (3) structuring the initial system prompt so the model has a clear mental model of what "done" looks like before it starts exploring.the swift angle is actually a nice fit - the structured concurrency model maps well to the agent loop, and the strong type system makes tool schema definition less error-prone than JSON string wrangling in most other languages.
I built a Swift library called Operator [0] to run the core agent loop, if it would save anyone time.
[0]: https://github.com/bensyverson/Operator
I think this is a good learning project, based in a long perusal of the github repo. One suggestion: don’t call the CLI component of the project ‘claude’ - that seems like asking for legal takedown problems.
Interesting, I'm also building one in Swift :D Seems like a good learning experience.
How practically could we drop in Apple Intelligence once it's using Gemini as its core for a 100% local AI agent in a box?
IIUC Gemini will run in Apple's cloud infra, not on device. The only "gemini" local model is really old by today's standards, and is not that smart for local inference (newer open source models are better).
That's what I figured. Some day eventually it will be possible. Until then, it's only LM Studio or Ollama as a potential hookup.
I've got some ideas inspired by this project. It's promising.
[dead]