This is where supporting machinery & RAG are very useful.
You can auto- lint and test code before you set eyes on it, then re-run the prompt with either more context or an altered prompt. With local models there are options like steering vectors, fine-tuning, and constrained decoding as well.
There's also evidence that multiple models of different lineages, when their outputs are rated and you take the best one at each input step, can surpass the performance of better models. So if one model knows something the others don't you can automatically fail over to the one that can actually handle the problem, and typically once the knowledge is in the chat the other models will pick it up.
Not saying we have the solution to your specific problem in any readily available software, but that there are approaches specific to your problem that go beyond current methods.
You can auto- lint and test code before you set eyes on it, then re-run the prompt with either more context or an altered prompt. With local models there are options like steering vectors, fine-tuning, and constrained decoding as well.
There's also evidence that multiple models of different lineages, when their outputs are rated and you take the best one at each input step, can surpass the performance of better models. So if one model knows something the others don't you can automatically fail over to the one that can actually handle the problem, and typically once the knowledge is in the chat the other models will pick it up.
Not saying we have the solution to your specific problem in any readily available software, but that there are approaches specific to your problem that go beyond current methods.