Reference
The paper 'Safety Alignment Should Be Made More Than Just a Few Tokens Deep' argues that current safety alignment in language models is shallow, concentrated in the first few output tokens, and that alignment should extend deeper into the generated sequence to resist simple attacks.
Authors
Sources
- A Survey on the Theory and Mechanism of Large Language Models (arxiv.org, via serper)
Referenced by nodes (1)
- Language Model concept