claim
General-purpose Large Language Models outperform medically fine-tuned Large Language Models on medical hallucination detection, according to the evaluation reported by the authors of the MedHallu benchmark.
Referenced by nodes (2)
- Large Language Models concept
- MedHallu concept