concept

Gemini

Also known as: gemini-2.0

Facts (23)

Sources

Medical Hallucination in Foundation Models and Their ... medrxiv.org medRxiv Mar 3, 2025 4 facts

claim: Prominent large language models include OpenAI’s GPT series, Google’s Gemini, Anthropic’s Claude, and Meta’s Llama family.

measurement: The AI/LLM tools most commonly mentioned by survey respondents were ChatGPT (30 mentions), followed by Claude (20), Google Bard/Gemini (16), Llama (15), Perplexity (9), AlphaFold (2), and Scite and Consensus (1).

claim: The models gemini-2.0 and deepseek-r1 demonstrate robust hallucination resistance, placing them alongside o1-preview and ahead of earlier models.

measurement: The highest-performing models, including gemini-2.0-thinking, gemini-2.0, and deepseek-r1, cluster in the high similarity score range of 0.7-0.9, indicating strong semantic alignment of their outputs with ground-truth medical information.

Reference Hallucination Score for Medical Artificial ... medinform.jmir.org JMIR Medical Informatics Jul 31, 2024 3 facts

reference: Birinci M, Kilictas A, Gül O, Yemiş T, Erdivanlı B, Çeliker M, Özgür A, Çelebi Erdivanlı Ö, and Dursun E authored 'Large Language Models for Cochlear Implant Education: A Comparison of ChatGPT, Gemini, Claude, and DeepSeek', published in Otolaryngology–Head and Neck Surgery in 2026.

reference: Whitfield S and Yang S authored 'Evaluating AI Language Models for Reference Services: A Comparative Study of ChatGPT, Gemini, and Copilot', published in Internet Reference Services Quarterly in 2025.

reference: Asiri (2025) assessed the reliability of ChatGPT and Gemini in identifying relevant orthodontic literature, in a study published in the European Journal of General Dentistry.

A Comprehensive Benchmark and Evaluation Framework for Multi ... arxiv.org arXiv Jan 6, 2026 2 facts

measurement: The Majority Voting strategy for ensemble LLM judges consistently produces stable agreement with human clinical experts, maintaining F1-scores in the 75–79% range across Doctor Agents including DeepSeek, Gemini, and GPT-5.

claim: Studies by Maina et al. identify persistent challenges in LLM-as-a-Judge methods, including verbosity bias, inconsistency in low-resource languages, and a 'severity gap' in which models like GPT-5 and Gemini differ from human clinicians in how leniently they judge.

A Survey on the Theory and Mechanism of Large Language Models arxiv.org arXiv Mar 12, 2026 2 facts

claim: Large Language Models such as ChatGPT (OpenAI, 2022), DeepSeek (Guo et al., 2025), Qwen (Bai et al., 2023a), Llama (Touvron et al., 2023), Gemini (Team et al., 2023), and Claude (Caruccio et al., 2024) have transcended the boundaries of traditional Natural Language Processing as established by Vaswani et al. (2017a).

reference: The paper 'Gemini: a family of highly capable multimodal models' is an arXiv preprint, arXiv:2312.11805.

The Synergy of Symbolic and Connectionist AI in LLM-Empowered ... arxiv.org arXiv Jul 11, 2024 2 facts

reference: The Gemini Team published a technical report on the Gemini family of multimodal models as an arXiv preprint in 2023.

claim: Large Language Models (LLMs) are transformer-based language models, including OpenAI’s GPT-4, Google’s Gemini and PaLM, Microsoft’s Phi-3, and Meta’s LLaMA.

Detecting and Evaluating Medical Hallucinations in Large Vision ... arxiv.org arXiv Jun 14, 2024 2 facts

claim: When evaluated on hallucination detection, Gemini correctly identified hallucination types but did not follow instructions well, providing extensive explanations for its classifications.

claim: The MediHallDetector model surpasses GPT-3.5, GPT-4, and Gemini in hallucination detection performance and improves efficiency compared to manual evaluation, though it still trails human performance.

Pascale Fung's Post - LLM Hallucination Benchmark linkedin.com Pascale Fung · LinkedIn 11 months ago 1 fact

perspective: Future iterations of the HalluLens benchmark could be strengthened by including diverse models such as Gemini and incorporating human raters.

A survey on augmenting knowledge graphs (KGs) with large ... link.springer.com Springer Nov 4, 2024 1 fact

reference: Team G, Anil R, Borgeaud S, Wu Y, Alayrac J-B, Yu J, Soricut R, Schalkwyk J, Dai AM, Hauth A et al. authored 'Gemini: a family of highly capable multimodal models', published as an arXiv preprint in 2023.

Building Better Agentic Systems with Neuro-Symbolic AI cutter.com Cutter Consortium Dec 10, 2025 1 fact

claim: Deep learning neural network-based large language models, such as GPT-4, Claude, and Gemini, process unstructured data including text, images, video, and streaming sensor data to learn patterns, classify data, and make predictions.

Real-Time Evaluation Models for RAG: Who Detects Hallucinations ... cleanlab.ai Cleanlab Apr 7, 2025 1 fact

claim: Evaluation techniques such as 'LLM-as-a-judge' or 'TLM' (Trustworthy Language Model) can be powered by any Large Language Model and do not require specific data preparation, labeling, or custom model infrastructure, provided the user has infrastructure for running pre-trained LLMs, such as AWS Bedrock, Azure/OpenAI, Gemini, or Together.ai.

Bridging the Gap Between LLMs and Evolving Medical Knowledge arxiv.org arXiv Jun 29, 2025 1 fact

reference: Saab et al. (2024) published 'Capabilities of Gemini models in medicine' as an arXiv preprint (arXiv:2404.18416).

A Comprehensive Review of Neuro-symbolic AI for Robustness ... link.springer.com Springer Dec 9, 2025 1 fact

claim: DeepMind’s AlphaProof integrates an AlphaZero-style planner with a Gemini-class large language model to produce proofs for Olympiad-level mathematical problems while adhering to formal soundness guarantees.

Combining Knowledge Graphs and Large Language Models - arXiv arxiv.org arXiv Jul 9, 2024 1 fact

claim: Multimodal Large Language Models, such as Google's Gemini and OpenAI's GPT-4 with vision (GPT-4V), possess vision capabilities.

Medical Hallucination in Foundation Models and Their Impact on ... medrxiv.org medRxiv Nov 2, 2025 1 fact

claim: MedGemma is a specialized open-source multimodal model built on the Gemma 3 architecture, incorporating research and technology derived from Google’s proprietary Gemini models.