claim
On the MedDialogRubrics benchmark, increasing context length does not guarantee better diagnostic reasoning in large language models (LLMs), because the bottleneck lies in active inquiry planning rather than in context capacity.

