reference
The FEVER benchmark, introduced by Thorne et al. (2018), uses a Natural Language Inference (NLI) model to judge whether a response contains, contradicts, or does not mention the provided evidence.
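The three-way judgment described above can be sketched as a mapping from conventional NLI labels to the three outcomes. This is a minimal illustration, not the benchmark's actual implementation. The names `Verdict`, `judge`, and `toy_nli` are hypothetical, and so is the choice to treat the response as the premise and the evidence as the hypothesis. `toy_nli` is a string-matching placeholder standing in for a real NLI model.

```python
from enum import Enum
from typing import Callable


class Verdict(Enum):
    CONTAINS = "contains"
    CONTRADICTS = "contradicts"
    NOT_MENTIONED = "not mentioned"


# Assumed mapping from the usual three NLI labels to the three outcomes.
NLI_TO_VERDICT = {
    "entailment": Verdict.CONTAINS,
    "contradiction": Verdict.CONTRADICTS,
    "neutral": Verdict.NOT_MENTIONED,
}


def judge(response: str, evidence: str, nli: Callable[[str, str], str]) -> Verdict:
    """Classify a response against evidence with an NLI labeller.

    `nli(premise, hypothesis)` must return one of the keys of NLI_TO_VERDICT.
    Assumption: the response is the premise and the evidence the hypothesis.
    """
    label = nli(response, evidence)
    return NLI_TO_VERDICT[label.lower()]


def toy_nli(premise: str, hypothesis: str) -> str:
    """Placeholder for a real NLI model: plain substring matching only."""
    p, h = premise.lower(), hypothesis.lower()
    if h in p:
        return "entailment"
    # Crude negation check, for illustration only.
    if h.replace(" is ", " is not ") in p:
        return "contradiction"
    return "neutral"


evidence = "Paris is the capital of France"
print(judge("Paris is the capital of France.", evidence, toy_nli))
print(judge("Paris is not the capital of France.", evidence, toy_nli))
print(judge("Berlin is a large city.", evidence, toy_nli))
```

In practice `toy_nli` would be replaced by a call to a trained NLI classifier. The mapping table keeps the label-to-verdict decision separate from the model, so the model can be swapped without touching `judge`.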
Sources
- A Knowledge Graph-Based Hallucination Benchmark for Evaluating ... (arxiv.org)
Referenced by nodes (2)
- fever concept
- natural language inference (NLI) concept