> I actually like the editable format of the chat interface because it allows fi...

freehorse · 2025-05-08T08:26:48 1746692808

The only issue I would imagine is not being able to use prompt caching, which can increase the cost of API calls, but I am not sure if prompt caching is used in general in such a context in the first place. Otherwise you just send the "history" in a json file, there is nothing mystical about llm chats really. If you use an API you can just send to autocomplete whatever you want.

senko · 2025-05-08T05:51:52 1746683512

> I am not sure if cloud-based LLMs even allow modifying assistant output.

In general they do. For each request, you include the complete context as JSON, including previous assistant output. You can change that as you wish.