claim
General-purpose LLMs such as GPT-4 outperform medically fine-tuned models on hallucination detection tasks when no additional context is provided.
Authors
Sources
- MedHallu: Benchmark for Medical LLM Hallucination Detection (www.emergentmind.com, via serper)
Referenced by nodes (3)
- Large Language Models concept
- hallucination detection concept
- GPT-4 concept