entity

Mistral AI

Also known as: Mistral

Facts (10)

Sources

Re-evaluating Hallucination Detection in LLMs - arXiv | arxiv.org | arXiv | Aug 13, 2025 | 4 facts

measurement: The Mistral model degrades markedly in zero-shot settings, with the drop visible in Perplexity-based metrics, whereas the Llama model remains largely consistent with minimal degradation.

measurement: The Eigenscore hallucination detection method experiences a performance erosion of 19.0% for the Llama model and 30.4% for the Mistral model on the NQ-Open dataset when switching from ROUGE to LLM-as-Judge evaluation.

procedure: To evaluate hallucination detection, the authors of 'Re-evaluating Hallucination Detection in LLMs' randomly selected 200 question–answer pairs from Mistral model outputs on the NQ-Open dataset, ensuring a balanced representation of cases where ROUGE and LLM-as-Judge yield conflicting assessments.

measurement: The Perplexity hallucination detection method sees its AUROC score decrease by as much as 45.9% for the Mistral model on the NQ-Open dataset when switching from ROUGE to LLM-as-Judge evaluation.
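
The sampling step in the procedure fact above (200 question–answer pairs, balanced across cases where ROUGE and LLM-as-Judge disagree) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' code. The record fields `rouge_correct` and `judge_correct`, the function name, and the reading of "balanced" as an equal number of pairs from each direction of disagreement are all assumptions made here for illustration.

```python
import random


def sample_balanced_disagreements(records, n_total=200, seed=0):
    """Draw n_total records on which two evaluators disagree.

    Assumption (not from the source): each record is a dict with boolean
    fields 'rouge_correct' and 'judge_correct', and "balanced" means an
    equal number of records from each direction of disagreement.
    """
    rng = random.Random(seed)  # seeded for reproducibility

    # Bucket 1: ROUGE accepts the answer, the LLM judge rejects it.
    rouge_only = [r for r in records
                  if r["rouge_correct"] and not r["judge_correct"]]
    # Bucket 2: the LLM judge accepts the answer, ROUGE rejects it.
    judge_only = [r for r in records
                  if r["judge_correct"] and not r["rouge_correct"]]

    per_bucket = n_total // 2
    if len(rouge_only) < per_bucket or len(judge_only) < per_bucket:
        raise ValueError("not enough conflicting cases to balance the sample")

    # Sample without replacement from each bucket, then mix the order.
    sample = rng.sample(rouge_only, per_bucket) + rng.sample(judge_only, per_bucket)
    rng.shuffle(sample)
    return sample
```

Fixing the seed keeps the selected subset stable across runs, which matters if the sampled pairs are later reviewed by hand.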

Reducing hallucinations in large language models with custom ... | aws.amazon.com | Amazon Web Services | Nov 26, 2024 | 2 facts

claim: Amazon Bedrock is a fully managed service that provides access to foundation models from AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API.

claim: Amazon Bedrock supports foundation models from various providers, including Anthropic (Claude models), AI21 Labs (Jamba models), Cohere (Command models), Meta (Llama models), and Mistral AI.

EdinburghNLP/awesome-hallucination-detection - GitHub | github.com | GitHub | 1 fact

procedure: The BAFH framework is a lightweight method that trains a feedforward classifier on the hidden states of large language models to determine belief states and classify hallucination types; it is evaluated against the MIND and SAR baselines using Gemma-2, Llama-3.1, and Mistral models.

What is Open Source Software? - HotWax Systems | hotwaxsystems.com | HotWax Systems | Aug 11, 2025 | 1 fact

reference: Mistral AI developed the Mistral model, a compact, fast large language model popular for edge and local use.

vectara/hallucination-leaderboard - GitHub | github.com | Vectara | 1 fact

reference: The Vectara hallucination leaderboard accesses each large language model through a specific API: Llama 4 Maverick 17B 128E Instruct FP8 and Llama 4 Scout 17B 16E Instruct are accessed via Together AI; Microsoft Phi-4 and Phi-4-Mini are accessed via Azure; Mistral Ministral 3B, Ministral 8B, Mistral Large, Mistral Medium, and Mistral Small are accessed via Mistral AI's API; Kimi-K2-Instruct-0905 is accessed via the Moonshot AI API; GPT-4.1, GPT-4o, GPT-5-High, GPT-5-Mini, GPT-5-Minimal, GPT-5-Nano, o3-Pro, o4-Mini-High, and o4-Mini-Low are accessed via the OpenAI API; GPT-OSS-120B and GLM-4.5-AIR-FP8 are accessed via Together AI; Qwen3-4b, Qwen3-8b, Qwen3-14b, Qwen3-32b, and Qwen3-80b-a3b-thinking are accessed via the dashscope API; Snowflake-Arctic-Instruct is accessed via the Replicate API; Grok-3, Grok-4-Fast-Reasoning, and Grok-4-Fast-Non-Reasoning are accessed via xAI's API; and GLM-4.6 is accessed via deepinfra.

What Is Open Source Software? - IBM | ibm.com | IBM | 1 fact

claim: Major organizations including IBM (Granite), Meta (Llama), and Mistral AI are developing open source AI tools for developers and researchers.