Is a fair queue worth it vs spinning up more capacity? I've worked on multiple projects where we've ended up ripping out a queue and just spinning up more machines to handle the load synchronously instead.
More capacity doesn't help with operations the originator won't (or can't) wait around for, or with jobs that run long enough that they may need to be restarted after a failure. That's the most immediate reason: for those tasks, no amount of capacity removes the need for some form of queueing mechanism.
For complex enough workflows, queues are also often a good way to handle stages that might fail, because the failures become easier to debug. But in that case you want your queue to be closer to a state machine, where actually waiting in the queue for any length of time is the exception. You can just build a state machine for that too, making sure the inputs to the stage about to execute are recorded in a durable, restartable way. But sometimes you need to run more copies of the same type of job, and soon you have something that looks and smells much like a queue anyway.
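To make the "record the inputs durably before each stage" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the two stages (`fetch`, `transform`), the JSON checkpoint file, and the helper names are hypothetical, and a real system would more likely use a database than a file.

```python
import json
import os
import tempfile

# Hypothetical stages, made up for illustration. Each takes and returns a dict.
def fetch(data):
    return {"raw": data["url"].upper()}

def transform(data):
    return {"result": data["raw"] + "!"}

STAGES = [("fetch", fetch), ("transform", transform)]

def save_checkpoint(path, stage_index, inputs):
    # Write to a temp file and rename it into place, so a crash
    # never leaves a half-written checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"stage": stage_index, "inputs": inputs}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)

def run_job(path, initial_inputs):
    # Resume from the recorded stage if a checkpoint exists, else start at stage 0.
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)
    else:
        state = {"stage": 0, "inputs": initial_inputs}
    stage, inputs = state["stage"], state["inputs"]
    while stage < len(STAGES):
        save_checkpoint(path, stage, inputs)  # record inputs before executing
        _name, fn = STAGES[stage]
        inputs = fn(inputs)  # may raise; the checkpoint lets a restart retry this stage
        stage += 1
    save_checkpoint(path, stage, inputs)  # final state holds the job's output
    return inputs

# Usage: the checkpoint location is arbitrary, a temp directory is used here.
checkpoint = os.path.join(tempfile.mkdtemp(), "job-1.json")
print(run_job(checkpoint, {"url": "a"}))
```

If a stage raises, calling `run_job` again with the same path picks up at the failed stage with the exact inputs it was given, which is what makes the failure easy to reproduce. Once you want several workers pulling such jobs, you are back to something queue-shaped.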
Then lastly, spikes. But queues only really help there if you still spin up more machines aggressively enough that the wait doesn't get long enough to feel as bad as, or worse than, an immediate error, so it does make sense to ask your question.
A queue also doesn't need to be complex. If it gets complex, that's more reason to ask your question for that specific system. The same goes if it can grow large (sometimes the solution to that is simply to refuse to queue once the queue exceeds a certain size or the processing time goes above a certain threshold).
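A rough sketch of the "refuse to queue" idea, in Python. The class and parameter names are hypothetical, and using the age of the oldest waiting job as a stand-in for "processing time above a threshold" is my assumption; you might measure that differently.

```python
import collections
import time

class QueueFull(Exception):
    """Raised when the queue refuses new work."""

class SheddingQueue:
    def __init__(self, max_size, max_wait_seconds, clock=time.monotonic):
        self.max_size = max_size
        self.max_wait_seconds = max_wait_seconds
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self.items = collections.deque()  # entries are (enqueue_time, job)

    def put(self, job):
        # Refuse when the queue is already at its size limit.
        if len(self.items) >= self.max_size:
            raise QueueFull("queue length limit reached")
        # Refuse when the oldest waiting job has been stuck too long,
        # which suggests the workers are not keeping up.
        if self.items and self.clock() - self.items[0][0] > self.max_wait_seconds:
            raise QueueFull("oldest job has waited too long")
        self.items.append((self.clock(), job))

    def get(self):
        # Raises IndexError if the queue is empty.
        _enqueued_at, job = self.items.popleft()
        return job
```

The caller turns `QueueFull` into an immediate error (an HTTP 503, say), which is often kinder than letting the request sit in a queue nobody will drain in time.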
Queues are great when appropriate, but they do often get used as a "solution" to a scaling problem that hasn't been sufficiently analyzed, which sounds like it might have been the case in your examples.
Is it a choice? Most projects I've worked on had times when they became overwhelmed with requests; a queue handles this case, but more capacity just makes it rarer. Ideally you want enough capacity to handle X% of requests within Y milliseconds and a queue to deal with the leftovers, and I suppose if your X is low enough then a fair queue becomes a necessity.
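For what it's worth, a fair queue for the leftovers can be small. The sketch below assumes "fair" means round-robin between tenants, which nobody in this thread has actually spelled out; the `FairQueue` class and the tenant keys are hypothetical.

```python
from collections import OrderedDict, deque

class FairQueue:
    def __init__(self):
        # tenant -> deque of that tenant's jobs; dict order is the service order
        self.per_tenant = OrderedDict()

    def put(self, tenant, job):
        self.per_tenant.setdefault(tenant, deque()).append(job)

    def get(self):
        if not self.per_tenant:
            raise IndexError("get from an empty FairQueue")
        # Serve the tenant at the front, then move it to the back of the line
        # so one busy tenant cannot starve the others.
        tenant, jobs = next(iter(self.per_tenant.items()))
        job = jobs.popleft()
        del self.per_tenant[tenant]
        if jobs:
            self.per_tenant[tenant] = jobs
        return tenant, job

    def __len__(self):
        return sum(len(jobs) for jobs in self.per_tenant.values())
```

With three jobs queued for tenant "a" and one for tenant "b", `get` returns a's first job, then b's job, then a's remaining two, so b is not stuck behind a's backlog.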