claim
Traditional n-gram overlap metrics such as ROUGE are unreliable for assessing the factual consistency of text generated by AI systems, because they reward shared wording rather than agreement in meaning.
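A minimal sketch of why the claim holds, assuming a simplified ROUGE-N (whitespace tokenization, lowercasing, clipped n-gram counts, no stemming). The function name `rouge_n_f1` and the three example sentences are hypothetical illustrations, not taken from any source or from an official ROUGE implementation. The sketch compares a reference sentence against a candidate that contradicts it and a candidate that loosely agrees with it in different words.

```python
from collections import Counter


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1 (illustrative, not the official scorer)."""

    def ngrams(text: str) -> Counter:
        # Lowercase and split on whitespace; count each n-gram.
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    # Counter intersection keeps the minimum count per n-gram (clipping).
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical example sentences.
reference = "the drug reduced the risk of stroke"
contradiction = "the drug increased the risk of stroke"  # opposite claim
paraphrase = "patients taking the medication had fewer strokes"  # similar meaning

score_contradiction = rouge_n_f1(reference, contradiction)
score_paraphrase = rouge_n_f1(reference, paraphrase)

print(f"contradiction: {score_contradiction:.3f}")  # 0.857
print(f"paraphrase:    {score_paraphrase:.3f}")     # 0.143
```

Under these assumptions the contradictory sentence shares six of seven unigrams with the reference and scores far higher than the loosely consistent paraphrase, which shares only one. The overlap score tracks surface wording, not whether the stated facts agree.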

