I tried this approach when attempting to get DeepSeek-R1 and Grok 3 to create a simple CUDA application. It was necessary because the iterative approach kept leading to hangs and divergent behavior. Even so, I wasn't able to get a working application.