measurement
The authors built a calibration dataset for hallucination evaluation by testing 34 models (ranging from 1B to 685B parameters) over 10 runs of 150 questions each, yielding 51,000 data points (34 × 10 × 150).
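The data-point count follows directly from the three figures in the claim. As a minimal sketch, assuming every model answers every question in every run (a full grid, which the claim implies but does not state outright), the total can be checked as below. The constant and function names are illustrative and are not taken from the authors' code.

```python
from itertools import product

# Figures taken from the claim; the names are hypothetical.
N_MODELS = 34
N_RUNS = 10
N_QUESTIONS = 150


def grid_size(n_models: int, n_runs: int, n_questions: int) -> int:
    """Number of (model, run, question) cells in a full evaluation grid."""
    return n_models * n_runs * n_questions


# Enumerate the grid explicitly to confirm the closed-form count.
cells = list(product(range(N_MODELS), range(N_RUNS), range(N_QUESTIONS)))
total = grid_size(N_MODELS, N_RUNS, N_QUESTIONS)

assert len(cells) == total
print(total)  # 51000
```

Each model therefore contributes 1,500 responses (10 runs of 150 questions) under this assumption.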