claim
On the TrafficQA dataset, only GPT-4o produced satisfactory results when numerical comparison was required.
