Claim
The paper 'The Illusion of Progress: Re-evaluating Hallucination Detection in LLMs' argues that current evaluation practices for hallucination detection in large language models are fundamentally flawed because they rely on metrics such as ROUGE that are poorly aligned with human judgments.
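To make the claimed misalignment concrete, here is a minimal sketch (not from the paper) using a simplified unigram-overlap ROUGE-1 recall; the example sentences and the `rouge1_recall` helper are invented for illustration. A candidate that copies the reference verbatim but swaps a single fact (the year) outscores a faithful paraphrase:

```python
import re
from collections import Counter

def tokens(text: str) -> Counter:
    """Lowercased word tokens with counts (naive tokenizer for illustration)."""
    return Counter(re.findall(r"\w+", text.lower()))

def rouge1_recall(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 recall: fraction of reference unigrams
    (with clipped counts) that also appear in the candidate."""
    ref, cand = tokens(reference), tokens(candidate)
    overlap = sum(min(n, cand[tok]) for tok, n in ref.items())
    return overlap / max(sum(ref.values()), 1)

reference    = "The treaty was signed in 1951 by six founding members."
faithful     = "Six founding members signed the treaty in 1951."
hallucinated = "The treaty was signed in 1961 by six founding members."

print(f"faithful:     {rouge1_recall(reference, faithful):.2f}")      # 0.80
print(f"hallucinated: {rouge1_recall(reference, hallucinated):.2f}")  # 0.90
```

Production ROUGE implementations add stemming and F-measures, but they share the same surface-overlap blind spot this sketch shows: a single swapped fact barely moves the score, while a factually correct paraphrase is penalized.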
Authors
Sources
- The Illusion of Progress: Re-evaluating Hallucination Detection in LLMs (arxiv.org)
Referenced by nodes (4)
- Large Language Models (concept)
- hallucination detection (concept)
- ROUGE (concept)
- The Illusion of Progress: Re-evaluating Hallucination Detection in LLMs (concept)