Something else to consider: I often have much better success with something like "Create a prompt that creates a specification for a Pac-Man game in a single HTML page. Consider edge cases and key implementation details that commonly result in bugs." Then <take prompt>, execute prompt. This often yields a much better result than one generic prompt. Now that models are trained on generating prompts for themselves, this is quite productive. You can also ask it to implement everything in stages, write tests, and even evaluate its own tests! I know that isn't quite the same as "Implement Pac-Man on an HTML page," but still, with very minimal human effort you can get the intended result.
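As a minimal sketch of that two-stage flow (assuming a hypothetical `call_llm(prompt)` helper standing in for whatever LLM API you actually use; it is stubbed out here so the example runs):

```python
# Two-stage "meta-prompting": first ask the model to write a detailed
# prompt/spec for itself, then execute that generated prompt in a
# second call. call_llm is a hypothetical stand-in for a real API.

def call_llm(prompt: str) -> str:
    # Stub: replace with a real API call (OpenAI, Anthropic, etc.).
    return f"<model response to: {prompt[:40]}...>"

META_PROMPT = (
    "Create a prompt that creates a specification for a Pac-Man game "
    "in a single HTML page. Consider edge cases and key implementation "
    "details that result in bugs."
)

def two_stage(meta_prompt: str) -> str:
    # Stage 1: the model writes a spec/prompt for itself.
    generated_prompt = call_llm(meta_prompt)
    # Stage 2: execute the generated prompt to get the implementation.
    return call_llm(generated_prompt)

result = two_stage(META_PROMPT)
```

The same pattern extends naturally to the staged-implementation idea: loop over a list of stage prompts, feeding each result back as context for the next call.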
It can be, but the more specific context you can give, the better, especially in your initial prompting. If the process is opaque to you, who knows what it is doing? Dialing in the initial spec/prompt for five minutes is still important. Different LLMs and models will do better or worse at this, and by staying in the loop as a human on that initial step, my results are much higher quality. That tells me the LLM tries, but in many cases it just doesn't have enough information yet to implement your intentions.