Advances in Neural Information Processing Systems
Also known as: NeurIPS
Facts (39)
Sources
A Survey on the Theory and Mechanism of Large Language Models arxiv.org Mar 12, 2026 29 facts
Reference: The paper 'White-box transformers via sparse rate reduction' was published in Advances in Neural Information Processing Systems 36, pp. 9422–9457.
Reference: The paper 'Attention is all you need' was published in Advances in Neural Information Processing Systems 30.
Reference: The paper 'Learning diverse and discriminative representations via the principle of maximal coding rate reduction' was published in Advances in Neural Information Processing Systems 33, pp. 9422–9434.
Reference: The paper 'RL on incorrect synthetic data scales the efficiency of LLM math reasoning by eight-fold' was published in Advances in Neural Information Processing Systems 37, pp. 43000–43031, and is cited in 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Tuning large neural networks via zero-shot hyperparameter transfer' was published in Advances in Neural Information Processing Systems 34, pp. 17084–17097.
Reference: The paper 'Representational strengths and limitations of transformers' was published in Advances in Neural Information Processing Systems 36, pp. 36677–36707, and is cited in section 6.2.1 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Adam can converge without any modification on update rules' was published in Advances in Neural Information Processing Systems 35, pp. 28386–28399, and is cited in section 4.3.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Observational scaling laws and the predictability of language model performance' was published in Advances in Neural Information Processing Systems 37, pp. 15841–15892, and is cited in section 7.2.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Rethinking LLM memorization through the lens of adversarial compression' was published in Advances in Neural Information Processing Systems 37, pp. 56244–56267, and is cited in 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'D4: improving LLM pretraining via document de-duplication and diversification' was published in Advances in Neural Information Processing Systems 36, pp. 53983–53995.
Reference: The paper 'Domain adaptation with multiple sources' (Advances in Neural Information Processing Systems 21) is cited in section 4.2.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Tree of thoughts: deliberate problem solving with large language models' was published in Advances in Neural Information Processing Systems 36.
Reference: The paper 'Towards understanding how transformers learn in-context through a representation learning lens' was published in Advances in Neural Information Processing Systems 37, pp. 892–933, and is cited in section 3.2.3 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Learning and transferring sparse contextual bigrams with linear transformers' was published in Advances in Neural Information Processing Systems and is cited in sections 4.2.2 and 5.3.1 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Approximation rate of the transformer architecture for sequence modeling' was published in Advances in Neural Information Processing Systems 37, pp. 68926–68955.
Reference: The paper 'Scan and snap: understanding training dynamics and token composition in 1-layer transformer' was published in Advances in Neural Information Processing Systems.
Reference: The paper 'Transformers are uninterpretable with myopic methods: a case study with bounded dyck grammars' was published in Advances in Neural Information Processing Systems 36, pp. 38723–38766.
Reference: The paper 'Bridging the gap between low-rank and orthogonal adaptation via householder reflection adaptation' was published in Advances in Neural Information Processing Systems 37, pp. 113484–113518.
Reference: The paper 'On sparse modern hopfield model' was published in Advances in Neural Information Processing Systems 36, pp. 27594–27608.
Reference: The paper 'Inevitable trade-off between watermark strength and speculative sampling efficiency for language models' was published in Advances in Neural Information Processing Systems 37, pp. 55370–55402.
Reference: The paper 'DoReMi: optimizing data mixtures speeds up language model pretraining' was published in Advances in Neural Information Processing Systems 36, pp. 69798–69818.
Reference: The paper 'PrivAuditor: benchmarking data protection vulnerabilities in LLM adaptation techniques' was published in Advances in Neural Information Processing Systems 37, pp. 9668–9689, and is cited in section 5.2.1 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'InfoRM: mitigating reward hacking in RLHF via information-theoretic reward modeling' (Advances in Neural Information Processing Systems 37, pp. 134387–134429) is cited in section 4.2.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Transformers from an optimization perspective' was published in Advances in Neural Information Processing Systems 35, pp. 36958–36971.
Reference: The paper 'Understanding scaling laws with statistical and approximation theory for transformer neural networks on intrinsically low-dimensional data' (Advances in Neural Information Processing Systems 37) is cited in the survey 'A Survey on the Theory and Mechanism of Large Language Models' regarding scaling laws.
Reference: The paper 'On mesa-optimization in autoregressively trained transformers: emergence and capability' was published in Advances in Neural Information Processing Systems and is cited in section 6.2.1 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Why are adaptive methods good for attention models?' was published in Advances in Neural Information Processing Systems 33, pp. 15383–15393, and is cited in section 4.3.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Why transformers need adam: a hessian perspective' was published in Advances in Neural Information Processing Systems 37, pp. 131786–131823, and is cited in section 4.3.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'The impact of initialization on lora finetuning dynamics' (Advances in Neural Information Processing Systems 37) is cited in the survey 'A Survey on the Theory and Mechanism of Large Language Models' regarding LoRA fine-tuning.
A Survey of Incorporating Psychological Theories in LLMs - arXiv arxiv.org 4 facts
Reference: Shyam Sundhar Ramesh, Yifan Hu, Iason Chaimalas, Viraj Mehta, Pier Giuseppe Sessa, Haitham Bou Ammar, and Ilija Bogunovic authored 'Group robust preference optimization in reward-free RLHF', published in Advances in Neural Information Processing Systems (NeurIPS) in 2024.
Claim: Yao et al. (2024) introduced 'Tree of Thoughts', a framework for deliberate problem solving using Large Language Models, published in Advances in Neural Information Processing Systems.
Reference: The paper 'Training language models to follow instructions with human feedback' was published in Advances in Neural Information Processing Systems (NeurIPS) in 2022.
Reference: Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D Manning, Stefano Ermon, and Chelsea Finn authored 'Direct preference optimization: Your language model is secretly a reward model', published in Advances in Neural Information Processing Systems (NeurIPS) in 2023.
Re-evaluating Hallucination Detection in LLMs - arXiv arxiv.org Aug 13, 2025 2 facts
Reference: The paper 'Judging LLM-as-a-judge with MT-Bench and Chatbot Arena' by Lianmin Zheng et al. was published in Advances in Neural Information Processing Systems 36, pp. 46595–46623.
Reference: Gaurang Sriramanan et al. (2024) developed 'LLM-Check', a method for investigating the detection of hallucinations in large language models, published in Advances in Neural Information Processing Systems, volume 37.
Unlocking the Potential of Generative AI through Neuro-Symbolic ... arxiv.org Feb 16, 2025 1 fact
Reference: Paul F. Christiano, Jan Leike, Tom Brown, Miljan Martic, Shane Legg, and Dario Amodei published 'Deep reinforcement learning from human preferences' in Advances in Neural Information Processing Systems in 2017.
Survey and analysis of hallucinations in large language models frontiersin.org Sep 29, 2025 1 fact
Reference: Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., et al. published 'Language models are few-shot learners' in Advances in Neural Information Processing Systems 33 in 2020.
LLM-KG4QA: Large Language Models and Knowledge Graphs for ... github.com 1 fact
Reference: The paper 'G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering' was published at NeurIPS in 2024, utilizes the GraphQA dataset, and is categorized under KBQA and KGQA.
Neuro-Symbolic AI: Explainability, Challenges, and Future Trends arxiv.org Nov 7, 2024 1 fact
Reference: Anderson et al. (2020) introduced a neuro-symbolic reinforcement learning method that incorporates formally verified exploration, published in Advances in Neural Information Processing Systems.