Since prompt caching won't work across different models, how is this approach better than dropping a PR for the other harnesses to review?
Sorry, I may be misunderstanding the question.
The way this works is that it stores workstreams and session state in a local SQLite DB, and links each ctx session to the exact local Claude Code and/or Codex raw session log it came from (also stored locally).
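A minimal sketch of what that local store could look like: a SQLite DB with workstreams and sessions, where each session row points at the raw agent log it came from. All table, column, and path names here are invented for illustration, not taken from the actual tool.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the real tool would use a file on disk

# Hypothetical schema: names are illustrative only.
conn.executescript("""
CREATE TABLE workstreams (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE sessions (
    id            INTEGER PRIMARY KEY,
    workstream_id INTEGER NOT NULL REFERENCES workstreams(id),
    agent         TEXT NOT NULL,        -- e.g. 'claude-code' or 'codex'
    raw_log_path  TEXT NOT NULL         -- the raw session log it links to
);
""")

conn.execute("INSERT INTO workstreams (id, name) VALUES (1, 'refactor-auth')")
conn.execute(
    "INSERT INTO sessions (workstream_id, agent, raw_log_path) VALUES (?, ?, ?)",
    (1, "claude-code", "~/.claude/projects/refactor-auth/session-abc.jsonl"),
)

row = conn.execute(
    "SELECT agent, raw_log_path FROM sessions WHERE workstream_id = 1"
).fetchone()
print(row)
```

The point is just that each ctx session carries a durable pointer back to the exact raw log, so nothing about the original conversation is lost.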
What do you mean by prompt caching?
Prompt caching is done on the provider side. If you send two requests to a provider in short succession and the beginning of your second request is the same as your first (for example, because your second request is the continuation of an ongoing chat), the repeated tokens are much less expensive the second time.
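A toy sketch of the billing effect described above. The token counts and the 10x discount on cached tokens are made up for illustration; real providers have their own rates and cache rules.

```python
def shared_prefix_len(a, b):
    """Number of leading tokens the two requests have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def cost(tokens, cached_prefix=0, rate=1.0, cached_rate=0.1):
    # Cached prefix tokens are billed at a fraction of the normal
    # rate; everything after the shared prefix is full price.
    return cached_prefix * cached_rate + (len(tokens) - cached_prefix) * rate

first = ["system"] * 1000 + ["turn1"] * 50
second = first + ["reply"] * 80 + ["turn2"] * 40   # continuation of the same chat

prefix = shared_prefix_len(first, second)          # the 1050 repeated tokens
print(cost(first))                                 # 1050.0 (no cache hit)
print(cost(second, cached_prefix=prefix))          # 225.0 instead of 1170.0
```

So a continuation that reuses the same prefix pays full price only for the new tokens, which is why restarting a conversation in a different harness (with a different prefix) forfeits the discount.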
Obviously, your tool does not provide this. But I think GP is undervaluing the UX advantages of having your conversation history.
Have you considered making it possible to share a stream/context? As an export/import function.
That's interesting. I hadn't considered it up to this point, but it sounds potentially useful.