claim
Existing benchmarks for medical LLMs, such as MedQA and MedMCQA, focus on static tasks like multiple-choice questions or summarization, which do not mirror the dynamic, multi-turn nature of real-world clinical diagnostic reasoning.
