claim
Evaluation of medical agents has shifted from text-overlap metrics like BLEU and ROUGE to action-oriented benchmarks such as MedAgentBench and MedAgentBoard.
