claim
MedHallu is a benchmark for detecting medical hallucinations in large language models. It consists of 10,000 high-quality question-answer pairs derived from PubMedQA.
Sources
- A Comprehensive Benchmark for Detecting Medical Hallucinations ... aclanthology.org via serper
- MedHallu: Benchmark for Medical LLM Hallucination Detection www.emergentmind.com via serper
Referenced by nodes (4)
- Large Language Models concept
- medical hallucination concept
- MedHallu concept
- PubMedQA concept