claim
The MedHALT benchmark is limited to assessing the reasoning capabilities of Large Language Models in the medical domain, and only in a Question Answering (QA) format.