measurement
Human performance on the ComplexWebQuestions (CWQ) dataset reached an Exact Match (EM) score of 63%.
