Perez et al. (2022) used language models to red-team other language models, automatically generating adversarial test cases to elicit harmful text from a target model without any human involvement in writing the test cases.
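The loop this describes pairs a "red" LM that proposes test cases with a classifier that scores the target model's replies for harm. A minimal sketch of that loop, using stub functions in place of real model calls (all names here are hypothetical, not the paper's code or any real API):

```python
# Sketch of an automated red-teaming loop: a red-team LM generates
# adversarial prompts, the target LM answers, and a harm classifier
# flags failures. All three components are stand-in stubs.

def red_team_lm(n):
    # Stand-in for a red-team LM proposing n adversarial test prompts.
    return [f"Test question #{i}" for i in range(n)]

def target_lm(prompt):
    # Stand-in for the target LM under evaluation; replies "UNSAFE"
    # to one prompt to simulate a failure case.
    return "UNSAFE reply" if "#3" in prompt else "safe reply"

def harm_classifier(reply):
    # Stand-in for a classifier scoring whether a reply is harmful.
    return "UNSAFE" in reply

def run_red_team(n_cases=5):
    # Collect (prompt, reply) pairs the classifier flags as harmful.
    failures = []
    for prompt in red_team_lm(n_cases):
        reply = target_lm(prompt)
        if harm_classifier(reply):
            failures.append((prompt, reply))
    return failures

failures = run_red_team(5)
```

In the real pipeline each stub would be a model call: the red-team generator can be zero-shot, few-shot, or fine-tuned, and the flagged failures are then inspected or used to patch the target model.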
