
The survey 'Survey and analysis of hallucinations in large language models' reports the following Prompt Sensitivity (PS) and Model Variability (MV) scores: LLaMA 2 (13B) (PS: 0.091, MV: 0.045), Mistral 7B (PS: 0.078, MV: 0.053), DeepSeek 67B (PS: 0.060, MV: 0.080), OpenChat-3.5 (PS: 0.083, MV: 0.062), and Gwen (PS: 0.079, MV: 0.057). DeepSeek 67B is the only model listed whose MV score exceeds its PS score.
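The reported scores can be compared with a few lines of code. The following is a minimal sketch in Python. It assumes nothing beyond the numbers quoted above. The names `scores`, `dominant_score`, `summary`, `highest_ps`, and `highest_mv` are hypothetical helpers introduced here for illustration and are not taken from the survey.

```python
# Scores exactly as quoted in the fact above: model -> (PS, MV).
# Model names are kept as given in the fact.
scores = {
    "LLaMA 2 (13B)": (0.091, 0.045),
    "Mistral 7B": (0.078, 0.053),
    "DeepSeek 67B": (0.060, 0.080),
    "OpenChat-3.5": (0.083, 0.062),
    "Gwen": (0.079, 0.057),
}


def dominant_score(ps: float, mv: float) -> str:
    """Return which of the two reported scores is larger (hypothetical helper)."""
    return "PS" if ps > mv else "MV"


# Which score is larger for each model.
summary = {model: dominant_score(ps, mv) for model, (ps, mv) in scores.items()}

# Models with the highest reported PS and MV values.
highest_ps = max(scores, key=lambda m: scores[m][0])
highest_mv = max(scores, key=lambda m: scores[m][1])

for model, label in summary.items():
    ps, mv = scores[model]
    print(f"{model}: PS={ps:.3f} MV={mv:.3f} larger={label}")
print("Highest PS:", highest_ps)
print("Highest MV:", highest_mv)
```

This comparison only ranks the quoted numbers. It does not reproduce how the survey computes PS or MV.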

Authors

Person: Not available
Organization: Frontiers

Sources

Survey and analysis of hallucinations in large language models, Frontiers (www.frontiersin.org)