measurement
The AMG-RAG framework achieved an F1 score of 74.1% on the MEDQA benchmark and an accuracy of 66.34% on the MEDMCQA benchmark, outperforming both comparable models and models 10 to 100 times larger.
Authors
Sources
- Bridging the Gap Between LLMs and Evolving Medical Knowledge (arxiv.org)