multi-turn conversations
Also known as: multi-turn interaction, multi-turn interactions, multi-turn dialogues
Facts (14)
Sources
A Comprehensive Benchmark and Evaluation Framework for Multi ... (arxiv.org, Jan 6, 2026; 7 facts)
claim: LLM-Mini-CEX supports multi-turn interactions, includes key points rubrics, and is expert-validated.
claim: Med-PaLM 2 supports multi-turn interactions and is expert-validated, but does not include key points rubrics.
claim: Liao et al., AgentClinic, MediQ, and MAQuE support multi-turn interactions but do not include key points rubrics and are not expert-validated.
claim: MedMCQA and LLM-MedQA do not support multi-turn interactions, do not include key points rubrics, and are not expert-validated.
measurement: The HealthBench framework supports multi-turn interactions, includes key points rubrics, is expert-validated, and contains 48,562 rubrics.
measurement: HealthBench features 5,000 multi-turn conversations evaluated against over 48,000 unique rubric criteria validated by 262 physicians.
measurement: The MedDialogRubrics framework, introduced by the authors of the study, supports multi-turn interactions, includes key points rubrics, is expert-validated, and contains 60,000 rubrics.
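The weighted-rubric evaluation these benchmarks share can be sketched in a few lines. This is a minimal illustration, not HealthBench's or MedDialogRubrics' actual code: the `RubricItem` and `rubric_score` names are hypothetical, and the boolean judgment of whether a criterion is met would in practice come from a physician or a grader model.

```python
from dataclasses import dataclass

@dataclass
class RubricItem:
    # One hypothetical rubric criterion: a description plus a point weight.
    description: str
    points: int   # positive = desired behavior, negative = penalized behavior
    met: bool     # whether the response satisfies the criterion (judged externally)

def rubric_score(items):
    """Score a response as achieved points over total positive points,
    in the spirit of the weighted-rubric schemes described above."""
    achieved = sum(i.points for i in items if i.met)
    possible = sum(i.points for i in items if i.points > 0)
    return max(0.0, achieved / possible) if possible else 0.0

items = [
    RubricItem("Asks about symptom duration", 5, True),
    RubricItem("Recommends emergency care when red flags are present", 10, True),
    RubricItem("Gives a specific drug dose without asking about allergies", -5, False),
]
print(rubric_score(items))  # 1.0: both positive criteria met, penalty not triggered
```

Negative-weight items only subtract when triggered, which is why the denominator counts positive points alone and the score is clamped at zero.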
Large Language Models Meet Knowledge Graphs for Question ... (arxiv.org, Sep 22, 2025; 3 facts)
reference: The InteractiveKBQA method, proposed by Xiong et al. in 2024, uses multi-turn interaction for observation and thinking with the GPT-4-Turbo, Mistral-7B, and Llama-2-13B models over the Freebase, Wikidata, and Movie KG knowledge graphs for KBQA and domain-specific QA; it is evaluated with F1, Hits@1, EM, and Acc on the WQSP, CWQ, KQA Pro, and MetaQA datasets.
reference: CoRnNetA improves the interpretation of multi-turn interactions with knowledge graphs by introducing large language model-based question reformulation, a reinforcement learning agent, and a soft reward mechanism.
reference: Guanming Xiong, Junwei Bao, and Wen Zhao authored 'Interactive-KBQA: Multi-turn interactions for knowledge base question answering with large language models', published in the 2024 ACL proceedings.
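The observation-and-thinking interaction that Interactive-KBQA is credited with can be sketched as a simple loop: the LLM proposes a knowledge-base action, the system executes it and feeds the result back, and the loop ends when the LLM emits an answer. The function names and the `ANSWER:` convention below are placeholders for illustration, not the paper's actual interface.

```python
def interactive_kbqa(question, llm, kb, max_turns=5):
    """Alternate LLM 'thinking' steps with KB 'observation' steps
    until the LLM commits to a final answer or the turn budget runs out."""
    transcript = [f"Question: {question}"]
    for _ in range(max_turns):
        # Thinking: the LLM proposes the next KB action, or a final answer.
        step = llm("\n".join(transcript))
        if step.startswith("ANSWER:"):
            return step[len("ANSWER:"):].strip()
        # Observation: execute the proposed action and feed the result back.
        transcript.append(f"Action: {step}")
        transcript.append(f"Observation: {kb(step)}")
    return None  # no answer within the turn budget
```

With stub `llm` and `kb` callables, a two-turn run (one lookup, then an answer) exercises both branches of the loop.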
A Survey of Incorporating Psychological Theories in LLMs (arxiv.org; 2 facts)
measurement: Wu et al. (2025a) released the RAIDEN Benchmark, which consists of 40,000 multi-turn dialogues for LLM agents.
reference: Zhang et al. (2024b) developed a holistic automated red-teaming method for large language models that uses top-down test case generation and multi-turn interaction, published in the Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing.
LLM Observability: How to Monitor AI When It Thinks in Tokens (ttms.com, Feb 10, 2026; 1 fact)
claim: In multi-turn interactions, LLMs may experience inconsistencies and drift, where the model contradicts itself or loses track of context, potentially frustrating users and degrading trust.
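A toy consistency monitor makes this failure mode concrete: track the facts the assistant asserts each turn and flag any turn that contradicts an earlier assertion. This is an assumed sketch, not TTMS's actual tooling; the `key=value` extractor is a deliberately naive stand-in for real fact extraction.

```python
def monitor_drift(turns, extract):
    """Compare key/value facts asserted across turns and report contradictions.

    turns:   list of assistant messages, in order
    extract: function mapping a message to a dict of asserted facts
    """
    asserted = {}        # key -> (turn index where first asserted, value)
    contradictions = []  # (key, first assertion, conflicting assertion)
    for i, msg in enumerate(turns):
        for key, value in extract(msg).items():
            if key in asserted and asserted[key][1] != value:
                contradictions.append((key, asserted[key], (i, value)))
            else:
                asserted.setdefault(key, (i, value))
    return contradictions

# Naive extractor: treat whitespace-separated "key=value" tokens as facts.
def extract(msg):
    return dict(p.split("=", 1) for p in msg.split() if "=" in p)

turns = ["dose=200mg frequency=daily", "frequency=daily", "dose=400mg"]
print(monitor_drift(turns, extract))
# [('dose', (0, '200mg'), (2, '400mg'))]
```

A repeated, consistent fact (`frequency=daily`) passes silently; only the changed `dose` value is reported, which is the "model contradicts itself" case the claim describes.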
Practices, opportunities and challenges in the fusion of knowledge ... (frontiersin.org; 1 fact)
claim: Current Knowledge Graph Question Answering (KGQA) systems frequently mishandle contextual continuity in multi-turn dialogues by either dropping or misapplying key constraints such as temporal filters.
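The dropped-constraint failure can be illustrated with an explicit carry-over step: merge each turn's constraints into a running context so that filters such as a temporal bound persist until the user overrides them. This is a minimal assumed sketch; a real KGQA system would integrate such a structure with query generation rather than use a bare dict.

```python
def carry_constraints(history):
    """Merge per-turn constraint dicts, later turns overriding earlier ones,
    so filters such as a temporal bound are not silently dropped."""
    merged = {}
    for turn in history:
        merged.update(turn)
    return merged

turns = [
    {"entity": "Nobel Prize", "year": "2019"},  # "Who won the Nobel Prize in 2019?"
    {"relation": "affiliation"},                # "Where did they work?" (year must persist)
]
print(carry_constraints(turns))
# {'entity': 'Nobel Prize', 'year': '2019', 'relation': 'affiliation'}
```

The second turn mentions no year, yet the merged context still carries `year: 2019`; a system that rebuilds its query from the current turn alone is exactly the one that loses the temporal filter.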