perspective
Relying on LLaMA-3.1-70B as the sole evaluation model in the HalluLens benchmark raises concerns about bias, particularly when the benchmark is used to judge other LLaMA variants, since a judge model may favor outputs from its own model family.
Authors
Sources
- Pascale Fung's Post - LLM Hallucination Benchmark (www.linkedin.com)
Referenced by nodes (1)
- LLaMA concept