The Med-HALT benchmark includes a category of hallucination tests called Reasoning Hallucination Tests (RHTs). These tests evaluate a Large Language Model's ability to reason accurately over medical information and to produce logically sound, factually correct outputs without fabricating content.
