measurement
In an evaluation of 11 foundation models (7 general-purpose, 4 medical-specialized) across seven medical hallucination tasks, general-purpose models achieved a median of 76.6% hallucination-free responses, while medical-specialized models achieved a median of 51.3%.
