reference
The paper 'Safety alignment should be made more than just a few tokens deep' argues that current safety alignment in language models is shallow: it mainly shapes the first few tokens of a model's output. The authors argue that alignment should extend deeper into the response.
