claim
For HaluEval QA, Dialog, and Summarisation tasks, Mistral and LLaMA2-based models produce the best results.
Sources
- The Hallucinations Leaderboard, an Open Effort to Measure ... (huggingface.co)