Advances in Neural Information Processing Systems
Also known as: NeurIPS
Facts (39)
Sources
A Survey on the Theory and Mechanism of Large Language Models arxiv.org Mar 12, 2026 29 facts
Reference: The paper 'White-box transformers via sparse rate reduction' was published in Advances in Neural Information Processing Systems 36, pp. 9422–9457.
Reference: The paper 'Attention is all you need' was published in Advances in Neural Information Processing Systems 30.
Reference: The paper 'Learning diverse and discriminative representations via the principle of maximal coding rate reduction' was published in Advances in Neural Information Processing Systems 33, pp. 9422–9434.
Reference: The paper 'RL on incorrect synthetic data scales the efficiency of LLM math reasoning by eight-fold' was published in Advances in Neural Information Processing Systems 37, pp. 43000–43031, and is cited in 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Tuning large neural networks via zero-shot hyperparameter transfer' was published in Advances in Neural Information Processing Systems 34, pp. 17084–17097.
Reference: The paper 'Representational strengths and limitations of transformers' was published in Advances in Neural Information Processing Systems 36, pp. 36677–36707, and is cited in section 6.2.1 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Adam can converge without any modification on update rules' was published in Advances in Neural Information Processing Systems 35, pp. 28386–28399, and is cited in section 4.3.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Observational scaling laws and the predictability of language model performance' was published in Advances in Neural Information Processing Systems 37, pp. 15841–15892, and is cited in section 7.2.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Rethinking LLM memorization through the lens of adversarial compression' was published in Advances in Neural Information Processing Systems 37, pp. 56244–56267, and is cited in 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'D4: improving LLM pretraining via document de-duplication and diversification' was published in Advances in Neural Information Processing Systems 36, pp. 53983–53995.
Reference: The paper 'Domain adaptation with multiple sources' (Advances in Neural Information Processing Systems 21) is cited in section 4.2.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Tree of thoughts: deliberate problem solving with large language models' was published in Advances in Neural Information Processing Systems 36.
Reference: The paper 'Towards understanding how transformers learn in-context through a representation learning lens' was published in Advances in Neural Information Processing Systems 37, pp. 892–933, and is cited in section 3.2.3 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Learning and transferring sparse contextual bigrams with linear transformers' was published in Advances in Neural Information Processing Systems and is cited in sections 4.2.2 and 5.3.1 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Approximation rate of the transformer architecture for sequence modeling' was published in Advances in Neural Information Processing Systems 37, pp. 68926–68955.
Reference: The paper 'Scan and snap: understanding training dynamics and token composition in 1-layer transformer' was published in Advances in Neural Information Processing Systems.
Reference: The paper 'Transformers are uninterpretable with myopic methods: a case study with bounded dyck grammars' was published in Advances in Neural Information Processing Systems 36, pp. 38723–38766.
Reference: The paper 'Bridging the gap between low-rank and orthogonal adaptation via householder reflection adaptation' was published in Advances in Neural Information Processing Systems 37, pp. 113484–113518.
Reference: The paper 'On sparse modern hopfield model' was published in Advances in Neural Information Processing Systems 36, pp. 27594–27608.
Reference: The paper 'Inevitable trade-off between watermark strength and speculative sampling efficiency for language models' was published in Advances in Neural Information Processing Systems 37, pp. 55370–55402.
Reference: The paper 'DoReMi: optimizing data mixtures speeds up language model pretraining' was published in Advances in Neural Information Processing Systems 36, pp. 69798–69818.
Reference: The paper 'PrivAuditor: benchmarking data protection vulnerabilities in LLM adaptation techniques' was published in Advances in Neural Information Processing Systems 37, pp. 9668–9689, and is cited in section 5.2.1 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'InfoRM: mitigating reward hacking in RLHF via information-theoretic reward modeling' (Advances in Neural Information Processing Systems 37, pp. 134387–134429) is cited in section 4.2.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Transformers from an optimization perspective' was published in Advances in Neural Information Processing Systems 35, pp. 36958–36971.
Reference: The paper 'Understanding scaling laws with statistical and approximation theory for transformer neural networks on intrinsically low-dimensional data' (Advances in Neural Information Processing Systems 37) is cited in the survey 'A Survey on the Theory and Mechanism of Large Language Models' regarding scaling laws.
Reference: The paper 'On mesa-optimization in autoregressively trained transformers: emergence and capability' was published in Advances in Neural Information Processing Systems and is cited in section 6.2.1 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Why are adaptive methods good for attention models?' was published in Advances in Neural Information Processing Systems 33, pp. 15383–15393, and is cited in section 4.3.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'Why transformers need adam: a hessian perspective' was published in Advances in Neural Information Processing Systems 37, pp. 131786–131823, and is cited in section 4.3.2 of 'A Survey on the Theory and Mechanism of Large Language Models'.
Reference: The paper 'The impact of initialization on lora finetuning dynamics' (Advances in Neural Information Processing Systems 37) is cited in the survey 'A Survey on the Theory and Mechanism of Large Language Models' regarding LoRA fine-tuning.
A Survey of Incorporating Psychological Theories in LLMs - arXiv arxiv.org 4 facts
Reference: Shyam Sundhar Ramesh, Yifan Hu, Iason Chaimalas, Viraj Mehta, Pier Giuseppe Sessa, Haitham Bou Ammar, and Ilija Bogunovic authored 'Group robust preference optimization in reward-free RLHF', published in Advances in Neural Information Processing Systems (NeurIPS) in 2024.
Claim: Yao et al. (2024) introduced 'Tree of Thoughts', a framework for deliberate problem solving using Large Language Models, published in Advances in Neural Information Processing Systems.
Reference: The paper 'Training language models to follow instructions with human feedback' was published in Advances in Neural Information Processing Systems (NeurIPS) in 2022.
Reference: Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D Manning, Stefano Ermon, and Chelsea Finn authored 'Direct preference optimization: Your language model is secretly a reward model', published in Advances in Neural Information Processing Systems (NeurIPS) in 2023.
Re-evaluating Hallucination Detection in LLMs - arXiv arxiv.org Aug 13, 2025 2 facts
Reference: The paper 'Judging LLM-as-a-judge with MT-Bench and Chatbot Arena' by Lianmin Zheng et al. was published in Advances in Neural Information Processing Systems 36, pp. 46595–46623.
Reference: Gaurang Sriramanan et al. (2024) developed 'LLM-Check', a method for investigating the detection of hallucinations in large language models, published in Advances in Neural Information Processing Systems, volume 37.
Unlocking the Potential of Generative AI through Neuro-Symbolic ... arxiv.org Feb 16, 2025 1 fact
Reference: Paul F. Christiano, Jan Leike, Tom Brown, Miljan Martic, Shane Legg, and Dario Amodei published 'Deep reinforcement learning from human preferences' in Advances in Neural Information Processing Systems in 2017.
Survey and analysis of hallucinations in large language models frontiersin.org Sep 29, 2025 1 fact
Reference: Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., et al. published 'Language models are few-shot learners' in Advances in Neural Information Processing Systems 33 in 2020.
LLM-KG4QA: Large Language Models and Knowledge Graphs for ... github.com 1 fact
Reference: The paper 'G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering' was published at NeurIPS in 2024, utilizes the GraphQA dataset, and is categorized under KBQA and KGQA.
Neuro-Symbolic AI: Explainability, Challenges, and Future Trends arxiv.org Nov 7, 2024 1 fact
Reference: Anderson et al. (2020) introduced a neuro-symbolic reinforcement learning method that incorporates formally verified exploration, published in Advances in Neural Information Processing Systems.