that seems token-inefficient. why have the LLM do a full round trip: load the skill, which contains potentially hundreds of lines of code, then copy and paste that code back into the compiler, when it could just run it?
not that i care too much about small amounts of tokens, but depleting your context rapidly seems bad. what is the positive tradeoff here?
I don't understand. The Skill runs the tools. Wherever a program can replace the LLM for a problem, I think we should do that as much as possible.
That uses fewer tokens. The LLM just calls the script, gets the response, and uses that to continue reasoning.
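To make the comparison concrete, here is a rough sketch of the two paths. Everything in it is an assumption for illustration: the `rough_tokens` heuristic (about 4 characters per token), the `tool.py` name, and the 300-line stand-in script are all hypothetical, and real tokenizers and skill runtimes will differ.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def rough_tokens(text: str) -> int:
    # Crude assumption: ~4 characters per token. Real tokenizers differ.
    return max(1, len(text) // 4)


def run_script(path: Path, *args: str) -> str:
    # Execute the script directly; only its stdout goes back to the model.
    result = subprocess.run(
        [sys.executable, str(path), *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def compare_costs(script_source: str, *args: str) -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "tool.py"  # hypothetical script name
        script.write_text(script_source)
        output = run_script(script, *args)
    # Round trip: the source is read into context, then emitted again.
    round_trip = 2 * rough_tokens(script_source) + rough_tokens(output)
    # Direct call: the model only emits the command and reads the result.
    command = f"python tool.py {' '.join(args)}"
    direct = rough_tokens(command) + rough_tokens(output)
    return {"output": output, "round_trip": round_trip, "direct": direct}


# Hypothetical stand-in for a skill script with hundreds of lines.
EXAMPLE_SOURCE = "\n".join(
    ["import sys"]
    + [f"# helper line {i}" for i in range(300)]
    + ["print(sum(int(a) for a in sys.argv[1:]))"]
)

if __name__ == "__main__":
    print(compare_costs(EXAMPLE_SOURCE, "2", "3"))
```

In this sketch the direct call costs roughly the command plus the result, while the round trip pays for the script source twice (once read, once written back) before the result even arrives.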