measurement
HaluEval includes 5,000 general user queries with ChatGPT responses and 30,000 task-specific examples across three tasks: question answering (HaluEval QA), knowledge-grounded dialogue (HaluEval Dialogue), and summarisation (HaluEval Summarisation).
Sources
- The Hallucinations Leaderboard, an Open Effort to Measure ... (huggingface.co)
Referenced by nodes (4)
- Question Answering concept
- summarization concept
- HaluEval concept
- chatgpt entity