procedure
The automated judge prompts used in the human validation study for comparing KGHaluBench’s entity-level and fact-level filters against GPT-3.5-Turbo are configured with a Temperature of 0 and Max Tokens of 10.

Authors

Sources

Referenced by nodes (1)