measurement
The evaluation framework included 15 open-source models ranging from 8 billion to 1 trillion parameters, and 10 proprietary models from OpenAI, Google, Anthropic, and xAI.
