perspective
Kalai et al. (2025) argue that post-training benchmarks exacerbate hallucinations in Large Language Models by penalizing uncertainty, which incentivizes models to guess rather than abstain from answering.
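The incentive argument can be made concrete with a small sketch (an illustration of the idea, not code from the paper): under binary 0/1 grading where an abstention earns no credit, any nonzero chance of guessing correctly gives guessing a higher expected score than abstaining.

```python
def expected_score(p_correct: float, abstain: bool) -> float:
    """Expected benchmark score under binary grading.

    Correct answer -> 1 point; wrong answer -> 0; abstention -> 0.
    """
    if abstain:
        return 0.0  # "I don't know" receives no credit
    return p_correct  # a guess earns 1 with probability p_correct

# Even a long-shot guess (10% chance of being right) beats abstaining,
# so a score-maximizing model is pushed to answer rather than abstain.
assert expected_score(0.10, abstain=False) > expected_score(0.10, abstain=True)
```

This is why, under such scoring, a model optimized for benchmark performance learns to guess whenever it has any chance of being right.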
