claim
The DROP dataset contains difficult questions, for example asking how many touchdown runs of 5 yards or less occurred in a 49ers football game. Answering requires an LLM to read the passage, compare each relevant value against the stated requirement, and count the matches.
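The reasoning this kind of question demands can be sketched as a filter-and-count over scoring plays extracted from the passage. This is a minimal illustration only: the `ScoringPlay` type, the `count_short_touchdown_runs` helper, and every play listed are hypothetical and are not taken from DROP or from a real game.

```python
from dataclasses import dataclass


@dataclass
class ScoringPlay:
    team: str
    play_type: str  # e.g. "touchdown run", "touchdown pass", "field goal"
    yards: int


# Hypothetical plays invented for illustration; not from DROP or a real game.
plays = [
    ScoringPlay("49ers", "touchdown run", 1),
    ScoringPlay("49ers", "field goal", 32),
    ScoringPlay("Opponent", "touchdown run", 12),
    ScoringPlay("49ers", "touchdown pass", 5),
    ScoringPlay("Opponent", "touchdown run", 4),
]


def count_short_touchdown_runs(plays, max_yards=5):
    """Count touchdown runs whose length is at most max_yards."""
    return sum(
        1
        for play in plays
        if play.play_type == "touchdown run" and play.yards <= max_yards
    )


answer = count_short_touchdown_runs(plays)
print(answer)
```

The 5-yard touchdown pass and the 12-yard run are deliberate distractors: the first meets the distance requirement but is the wrong play type, and the second is the right play type but too long. A model has to apply both conditions to every play it reads before counting.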
Authors
Sources
- Benchmarking Hallucination Detection Methods in RAG, Cleanlab (cleanlab.ai)
Referenced by nodes (1)
- DROP concept