I think one factor is that all these LLMs are tuned to be ridiculously agreeable; almost everything you say is met with some variation of “you’re absolutely right!”.

It’s like, look, I’m definitely not “absolutely right” 90% of the time, so how the hell am I supposed to trust what you’re saying?

I would prefer a model that’s tuned to prefix answers with “no, dumbass. Here’s why you’re an idiot:”. And yes, you can prompt them to answer this way (rough sketch below), but they’re simply not wired to challenge you except on very trivial things.
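
For what it’s worth, here’s roughly what that prompting looks like with the OpenAI Python SDK. The model name and the exact prompt wording are my own guesses, and in my experience this only goes so far: the underlying tuning still drags the model back toward agreement after a few turns.

    # Minimal sketch: steer a chat model toward blunt, critical replies
    # via a system prompt. Model name and prompt text are assumptions.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    SYSTEM_PROMPT = (
        "You are a blunt technical reviewer. Never open with praise or "
        "agreement. If the user's claim is wrong or unsupported, say so "
        "directly and explain why before offering any fix."
    )

    response = client.chat.completions.create(
        model="gpt-4o",  # assumption: any chat-completions model works here
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": "Rewriting this in Rust will obviously double throughput, right?"},
        ],
    )
    print(response.choices[0].message.content)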


