claim
Instructions emphasizing conciseness, such as 'answer this question briefly,' degraded the factual reliability of the Large Language Models tested in the Phare benchmark.
Authors
Sources
- Phare LLM Benchmark: an analysis of hallucination in ... (www.giskard.ai)
Referenced by nodes (1)
- Large Language Models concept