procedure
The 'Base' method evaluates Large Language Models by querying them directly with questions from the Med-HALT benchmark, without any additional context or instructions. Because the setting is zero-shot, the responses reflect each model's inherent tendency to hallucinate rather than the effect of prompt engineering.
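As a rough illustration, the procedure could be sketched in Python as below. This is a minimal sketch under stated assumptions, not the benchmark's actual harness: the record fields (`question`, `options`, `answer`), the helper names `build_base_prompt` and `run_base_eval`, the `ask_model` callable, and the use of exact-match accuracy as the score are all hypothetical choices made for the example.

```python
from typing import Callable, Dict, Iterable, Optional


def build_base_prompt(question: str, options: Optional[Dict[str, str]] = None) -> str:
    """Return the question as-is, plus its answer options if any.

    No system prompt, instructions, or few-shot examples are added;
    that omission is what makes the query a 'Base' (zero-shot) query.
    """
    lines = [question.strip()]
    if options:
        lines += [f"{key}. {text}" for key, text in options.items()]
    return "\n".join(lines)


def run_base_eval(records: Iterable[dict], ask_model: Callable[[str], str]) -> float:
    """Query the model once per record and return exact-match accuracy.

    `records` is assumed to be an iterable of dicts with the hypothetical
    keys 'question', 'options' (optional), and 'answer'.
    `ask_model` is any function mapping a prompt string to a reply string.
    """
    correct = 0
    total = 0
    for rec in records:
        prompt = build_base_prompt(rec["question"], rec.get("options"))
        reply = ask_model(prompt).strip()
        correct += int(reply == rec["answer"])
        total += 1
    return correct / total if total else 0.0
```

Here `ask_model` would wrap whichever model is under test. Keeping it a plain callable means the same loop can be reused across models without changing the prompt, so every model receives an identical, instruction-free query.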
