reference
MedDialogRubrics is a benchmark and evaluation framework for assessing the multi-turn inquiry abilities of medical large language models (LLMs). It focuses on fine-grained, human-aligned evaluation of the diagnostic process rather than single-turn question answering or final-diagnosis accuracy alone.
Sources
- A Comprehensive Benchmark and Evaluation Framework for Multi ... (arxiv.org)
Referenced by nodes (2)
- Question Answering concept
- MedDialogRubrics concept
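The fine-grained, rubric-based evaluation described above can be sketched minimally as follows. All names here (`Rubric`, `score_dialogue`, the keyword-matching judge) are illustrative assumptions, not the actual MedDialogRubrics API; the real framework's criteria and judging procedure are defined in the cited paper.

```python
# Hypothetical sketch of rubric-based scoring for a multi-turn medical
# dialogue; NOT the MedDialogRubrics implementation.
from dataclasses import dataclass

@dataclass
class Rubric:
    criterion: str   # e.g. "asks about pain onset"
    weight: float    # relative importance of the criterion

def score_dialogue(turns, rubrics, judge):
    """Return the weighted fraction of rubric criteria the dialogue satisfies.

    `judge` is any callable (a human rater or an LLM grader in practice)
    that returns True when the dialogue meets a criterion.
    """
    total = sum(r.weight for r in rubrics)
    earned = sum(r.weight for r in rubrics if judge(turns, r.criterion))
    return earned / total if total else 0.0

# Toy usage with a keyword-matching stand-in for a real judge.
turns = ["Doctor: When did the pain start?", "Patient: Two days ago."]
rubrics = [Rubric("pain", 2.0), Rubric("allergies", 1.0)]
judge = lambda ts, c: any(c in turn.lower() for turn in ts)
print(score_dialogue(turns, rubrics, judge))  # 2.0 of 3.0 weight satisfied
```

The weighted-average design means a per-dialogue score stays in [0, 1] regardless of how many criteria a given case defines, which keeps scores comparable across cases.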