entity

International Conference on Machine Learning

Also known as: ICML

Facts (38)

Sources
A Survey on the Theory and Mechanism of Large Language Models arxiv.org arXiv Mar 12, 2026 31 facts
reference: The paper 'Asymmetry in low-rank adapters of foundation models' was published in the Proceedings of the 41st International Conference on Machine Learning, pp. 62369–62385, and is cited in section 5.2.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
reference: The paper 'Fundamental limitations of alignment in large language models' exists as an arXiv preprint (arXiv:2304.11082) and was also published in the Proceedings of the 41st International Conference on Machine Learning, pp. 53079–53112.
reference: The paper 'DPO meets PPO: reinforced token optimization for RLHF' was published in the Proceedings of the 42nd International Conference on Machine Learning, Vol. 267, pp. 78498–78521, and is cited in section 3.2.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
reference: The paper 'Resurrecting recurrent neural networks for long sequences' was published in the International Conference on Machine Learning, pp. 26670–26698.
reference: The paper 'The composition theorem for differential privacy' was published in the International Conference on Machine Learning, pp. 1376–1385, and is cited in section 7.2.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
reference: The paper 'Deduplicating training data mitigates privacy risks in language models' was published in the International Conference on Machine Learning, pp. 10697–10707, and is cited in section 2.2.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
reference: The paper 'A kernel-based view of language model fine-tuning' (International Conference on Machine Learning, pp. 23610–23641) is cited in section 3.3.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
reference: The paper 'ProRL: prolonged reinforcement learning expands reasoning boundaries in large language models' was published in the International Conference on Machine Learning, pp. 4051–4060, and is cited in section 7.2.2 of the survey.
reference: The paper 'Calibrate before use: improving few-shot performance of language models' was published in the International Conference on Machine Learning, pp. 12697–12706, and is cited in section 4.2.1 of 'A Survey on the Theory and Mechanism of Large Language Models'.
reference: The paper 'Position: will we run out of data? limits of LLM scaling based on human-generated data' was published in the Forty-first International Conference on Machine Learning.
reference: The paper 'Beyond zero initialization: investigating the impact of non-zero initialization on LoRA fine-tuning dynamics' was published in the Proceedings of the 42nd International Conference on Machine Learning, Vol. 267, pp. 35519–35535.
reference: The paper 'Great models think alike and this undermines AI oversight' was published in the Forty-second International Conference on Machine Learning.
reference: The paper 'LoRA training in the NTK regime has no spurious local minima' was published in the International Conference on Machine Learning, pp. 21306–21328.
reference: The paper 'LoRA+: efficient low rank adaptation of large models' (Proceedings of the 41st International Conference on Machine Learning) is cited in the survey 'A Survey on the Theory and Mechanism of Large Language Models' regarding parameter-efficient fine-tuning.
reference: The paper 'Subspace optimization for large language models with convergence guarantees' was published in the Proceedings of the 42nd International Conference on Machine Learning, Vol. 267, pp. 22468–22522.
reference: The paper 'Inherent trade-offs between diversity and stability in multi-task benchmarks' was published in the International Conference on Machine Learning (pp. 58984–59002) and is cited in section 7.2.1 of 'A Survey on the Theory and Mechanism of Large Language Models'.
reference: The paper 'LoRA-one: one-step full gradient could suffice for fine-tuning large language models, provably and efficiently' was published in the Proceedings of the 42nd International Conference on Machine Learning (Vol. 267, pp. 75513–75574) and is cited in section 4.2.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
reference: The paper 'Understanding the robustness in vision transformers' was published in the International Conference on Machine Learning, pp. 27378–27394, and is cited in section 3.3.1 of 'A Survey on the Theory and Mechanism of Large Language Models'.
reference: The paper 'Transformers are RNNs: fast autoregressive transformers with linear attention' was published in the International Conference on Machine Learning, pp. 5156–5165, and is cited in section 3.2.3 of 'A Survey on the Theory and Mechanism of Large Language Models'.
reference: The paper 'Transformers learn nonlinear features in context: nonconvex mean-field dynamics on the attention landscape' was published in the Forty-first International Conference on Machine Learning and is cited in section 3.2.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
reference: The paper 'In-context convergence of transformers' was published in the Forty-first International Conference on Machine Learning.
reference: The paper 'Shampoo: preconditioned stochastic tensor optimization' (published in the International Conference on Machine Learning) is cited in the survey 'A Survey on the Theory and Mechanism of Large Language Models' regarding optimization.
reference: The paper 'Transformers learn in-context by gradient descent' was published in the International Conference on Machine Learning, pp. 35151–35174.
reference: The paper 'How do transformers learn topic structure: towards a mechanistic understanding' was published in the International Conference on Machine Learning, pp. 19689–19729.
reference: The paper 'Iterative preference learning from human feedback: bridging theory and practice for RLHF under KL-constraint' was published in the International Conference on Machine Learning, pp. 54715–54754.
reference: The paper 'On the optimization landscape of low rank adaptation methods for large language models' was published in the International Conference on Machine Learning, pp. 32100–32121, and is cited in section 4.2.2 of the survey.
reference: The paper 'The dual form of neural networks revisited: connecting test time predictions to training patterns via spotlights of attention' was published in the International Conference on Machine Learning, pp. 9639–9659.
reference: The paper 'On the role of attention in prompt-tuning' was published in the International Conference on Machine Learning, pp. 26724–26768.
reference: The paper 'LoRA training provably converges to a low-rank global minimum or it fails loudly (But it probably won’t fail)' was published in the Proceedings of the 42nd International Conference on Machine Learning, Vol. 267, pp. 30224–30247, and is cited in section 4.2.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
reference: The paper 'Understanding chain-of-thought in LLMs through information theory' was published in the Proceedings of the 42nd International Conference on Machine Learning, Vol. 267, pp. 59784–59811, edited by A. Singh, M. Fazel, D. Hsu, S. Lacoste-Julien, F. Berkenkamp, T. Maharaj, K. Wagstaff, and J. Zhu.
reference: The paper 'GaLore: memory-efficient LLM training by gradient low-rank projection' was published in the International Conference on Machine Learning, pp. 61121–61143, and is cited in sections 1 and 7.2.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Neuro-Symbolic AI: Explainability, Challenges, and Future Trends arxiv.org arXiv Nov 7, 2024 3 facts
reference: Alon et al. (2022) introduced a method for neuro-symbolic language modeling using automaton-augmented retrieval, published in the International Conference on Machine Learning.
reference: Amizadeh et al. (2020) presented a neuro-symbolic approach to visual reasoning centered on disentanglement, published in the International Conference on Machine Learning.
reference: Glanois et al. (2022) introduced a method for neuro-symbolic hierarchical rule induction, presented at the International Conference on Machine Learning.
A Survey of Incorporating Psychological Theories in LLMs arxiv.org arXiv 2 facts
measurement: The authors of 'A Survey of Incorporating Psychological Theories in LLMs' surveyed 175 papers from major computational linguistics venues (ACL Anthology, COLING), NeurIPS, ICML, ICLR, and influential arXiv preprints published between late 2021 and early 2025.
reference: Khan et al. (2024) published 'Debating with more persuasive LLMs leads to more truthful answers' in the Proceedings of the 41st International Conference on Machine Learning, reporting that debates with more persuasive LLMs lead to more truthful answers.
A Comprehensive Review of Neuro-symbolic AI for Robustness ... link.springer.com Springer Dec 9, 2025 1 fact
reference: Koh et al. introduced 'Concept bottleneck models' in their 2020 paper presented at the International Conference on Machine Learning.
Construction of Knowledge Graphs: State and Challenges arxiv.org arXiv 1 fact
reference: The paper 'Inductive Relation Prediction by Subgraph Reasoning' by K.K. Teru, E.G. Denis, and W.L. Hamilton, published in the Proceedings of the 37th International Conference on Machine Learning in 2020, covers inductive relation prediction methods.