
This makes sense. I recently ran an experiment testing GPT5 for hallucinations on cricket data, where there is a lot of statistical pressure. It is far better for a model to say idk than to give a wrong answer. Most current benchmarks don’t test for that. https://kaamvaam.com/machine-learning-ai/llm-eval-hallucinat...
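
To make the scoring point concrete, here is a rough sketch of how an eval could reward idk over a wrong answer. This is not the harness from the linked post; the `score` function, the `wrong_penalty` parameter, and the toy answers are all made up for illustration.

```python
def score(answers, gold, wrong_penalty=1.0):
    """Mean score: +1 for correct, 0 for abstaining, -wrong_penalty for wrong."""
    total = 0.0
    for pred, truth in zip(answers, gold):
        if pred is None:  # the model said "idk"
            continue
        total += 1.0 if pred == truth else -wrong_penalty
    return total / len(gold)


# Hypothetical toy data: two models that know the same two facts.
gold = ["a", "b", "c", "d"]
guesser = ["a", "b", "x", "y"]        # always answers, wrong when unsure
abstainer = ["a", "b", None, None]    # says idk when unsure

# Accuracy-only scoring (no penalty) cannot tell the two apart.
accuracy_guesser = score(guesser, gold, wrong_penalty=0.0)
accuracy_abstainer = score(abstainer, gold, wrong_penalty=0.0)

# Penalizing wrong answers separates them.
penalized_guesser = score(guesser, gold)
penalized_abstainer = score(abstainer, gold)

print(accuracy_guesser, accuracy_abstainer)
print(penalized_guesser, penalized_abstainer)
```

With the penalty set to zero both models score the same, which is roughly what an accuracy-only benchmark does. With a penalty, the model that abstains comes out ahead.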

