reference
The paper 'MACPO: Weak-to-Strong Alignment via Multi-Agent Contrastive Preference Optimization' was presented at the Thirteenth International Conference on Learning Representations (ICLR 2025) and is cited in Section 7.2.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.

