Fact — claim — Knowledge Tree

General-purpose LLMs such as GPT-4 outperform medically fine-tuned models on hallucination detection tasks when no additional context is provided.

Authors

Person: Not available
Organization: Emergent Mind
MedHallu: Benchmark for Medical LLM Hallucination Detection

Sources

MedHallu: Benchmark for Medical LLM Hallucination Detection, Emergent Mind (www.emergentmind.com)
