measurement
Among ensemble strategies for LLM judges, the Liberal Strategy achieves the highest alignment metrics with human clinical experts, particularly with the GPT-5 model.
