procedure
The authors evaluated how effectively hallucination mitigation techniques work for large language models on the Med-HALT benchmark, sampling 50 examples from each of seven medical reasoning tasks for a total of 350 cases (7 × 50).
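
The per-task sampling described above can be illustrated with a minimal sketch. It assumes each task's examples are available as an in-memory list; the task names (`task_1` to `task_7`), the pool size of 200, the fixed seed, and the helper `sample_per_task` are hypothetical placeholders, not details from the source, which does not state how the sample was drawn.

```python
import random


def sample_per_task(pools, n_per_task=50, seed=0):
    """Draw n_per_task examples without replacement from each task's pool."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the subset is reproducible
    sampled = {}
    for task, examples in pools.items():
        if len(examples) < n_per_task:
            raise ValueError(f"{task} has only {len(examples)} examples")
        sampled[task] = rng.sample(examples, n_per_task)
    return sampled


# Hypothetical placeholder pools: seven tasks with 200 dummy examples each.
pools = {
    f"task_{i}": [f"task_{i}_example_{j}" for j in range(200)]
    for i in range(1, 8)
}

subset = sample_per_task(pools)
total = sum(len(examples) for examples in subset.values())
print(total)  # 350 (7 tasks x 50 examples)
```

Sampling a fixed number per task, rather than 350 cases from the pooled data, gives every task equal weight in the evaluation regardless of how large its original pool is.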
