claim
In experiments conducted for the MedHallu benchmark, general-purpose large language models often outperform specialized medical models on hallucination detection tasks.
Authors
Sources
- [Literature Review] MedHallu: A Comprehensive Benchmark for ... www.themoonlight.io via serper
Referenced by nodes (3)
- Large Language Models concept
- hallucination detection concept
- MedHallu concept