measurement
GPT-4 solves approximately 75% of false-belief tasks, comparable to the performance of 6-year-old children, as reported by Kosinski (2024) and Strachan et al. (2024).
