measurement
MedDialogRubrics uses 5,200 synthetic patient cases and more than 60,000 expert-refined rubric criteria to assess medical LLMs on diagnostic correctness, completeness, logic, and the effectiveness of information gathering.
Sources
- A Comprehensive Benchmark and Evaluation Framework for Multi ... (arxiv.org)
Referenced by nodes (1)
- MedDialogRubrics concept