claim
The MedDialogRubrics framework evaluates the medical consultation capabilities of four representative Large Language Models (LLMs) acting as doctor agents. It incorporates over 60,000 expert-annotated rubric criteria across more than 4,700 cases.
Authors
Sources
- A Comprehensive Benchmark and Evaluation Framework for Multi ... arxiv.org via serper
Referenced by nodes (2)
- Large Language Models concept
- MedDialogRubrics concept