Out-of-Distribution
Also known as: out-of-distribution data, out-of-distribution detection
Facts (12)
Sources
Track: Poster Session 3, AISTATS 2026 (virtual.aistats.org) · 7 facts
Claim: Relational data, such as graphs, often violate the Independent and Identically Distributed (IID) assumption, which complicates the Out-of-Distribution problem, particularly when temporal data are involved.
Claim: Beomjun Kim, Jaehwan Kim, Kangyeon Kim, Sunwoo Kim, and Heejin Ahn propose a computationally efficient method for quantifying dataset quality that measures how well a dataset covers the input probability distribution, so as to minimize out-of-distribution inputs.
Claim: The "Accuracy-on-the-line" phenomenon in machine learning describes a positive correlation between a model's in-distribution (ID) and out-of-distribution (OOD) accuracy across different hyperparameters and data configurations.
Claim: Scaling to larger datasets does not mitigate the "Accuracy-on-the-wrong-line" phenomenon and may exacerbate the negative correlation between ID and OOD accuracy.
Claim: Christina Baek, Aditi Raghunathan, and Zico Kolter proved a lower bound on the residual of the ID-versus-OOD agreement correlation that grows proportionally with the residual of accuracy.
Claim: The "Accuracy-on-the-wrong-line" phenomenon occurs when noisy data, nuisance features, or spurious (shortcut) features cause ID and OOD accuracy to become negatively correlated, breaking the standard Accuracy-on-the-line relationship.
Claim: Out-of-Distribution (OOD) problems, defined as data discrepancies between training and testing environments, hinder the generalization of foundation models.
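The "Accuracy-on-the-line" claims above can be illustrated with a small synthetic sketch: across a family of models, plot each model's ID accuracy against its OOD accuracy and check the correlation. The accuracy values below are made up for illustration and are not taken from any cited paper.

```python
# Illustrative sketch of "Accuracy-on-the-line" with synthetic numbers:
# under the phenomenon, ID and OOD accuracies of different models are
# positively (roughly linearly) correlated.
import numpy as np

# Hypothetical (ID accuracy, OOD accuracy) pairs for five models.
id_acc = np.array([0.70, 0.78, 0.85, 0.90, 0.94])
ood_acc = np.array([0.52, 0.60, 0.66, 0.71, 0.75])

# A strongly positive Pearson r is the "on the line" signature;
# under "Accuracy-on-the-wrong-line" this correlation turns negative.
r = np.corrcoef(id_acc, ood_acc)[0, 1]
print(f"Pearson r = {r:.3f}")
```

With these synthetic values the correlation is close to 1; the "wrong-line" claims describe settings where noise or spurious features flip this sign.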
A Comprehensive Review of Neuro-symbolic AI for Robustness ... (link.springer.com, Dec 9, 2025) · 3 facts
Claim: Predefined rules in neuro-symbolic systems ensure that outputs are logically coherent and consistent with domain constraints, which mitigates failures caused by adversarial attacks or out-of-distribution (OOD) inputs, as noted in citation 75.
Claim: Encoding expected structure or behavior via symbolic constraints promotes the learning of more generalizable representations, which improves performance on out-of-distribution (OOD) data where spurious correlations may no longer hold.
Reference: Research by [52] showed that confidence scores can serve as useful signals for detecting out-of-distribution (OOD) inputs and adversarial attacks.
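The confidence-score signal mentioned in the last reference can be sketched with the common maximum-softmax-probability baseline: score an input by the model's top softmax probability and flag low-confidence inputs as OOD. The logits and threshold below are illustrative assumptions, not values from the cited work.

```python
# Minimal sketch of confidence-based OOD detection (maximum softmax
# probability). Logits and the 0.5 threshold are illustrative only.
import numpy as np

def softmax(logits):
    z = logits - logits.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def msp_score(logits):
    """Confidence score: the maximum softmax probability."""
    return softmax(logits).max()

def is_ood(logits, threshold=0.5):
    """Flag an input as OOD when the model's confidence is low."""
    return msp_score(logits) < threshold

confident = np.array([6.0, 0.5, 0.2])  # peaked logits -> high confidence
uncertain = np.array([1.0, 0.9, 1.1])  # flat logits -> low confidence

print(is_ood(confident), is_ood(uncertain))  # prints: False True
```

In practice the threshold is tuned on held-out ID data (e.g. for a target false-positive rate) rather than fixed at 0.5.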
Re-evaluating Hallucination Detection in LLMs - arXiv (arxiv.org, Aug 13, 2025) · 1 fact
Reference: Jie Ren et al. (2023) proposed methods for out-of-distribution detection and selective generation for conditional language models in their paper presented at The Eleventh International Conference on Learning Representations.
A Survey on the Theory and Mechanism of Large Language Models (arxiv.org, Mar 12, 2026) · 1 fact
Claim: Chu et al. (2025) provided empirical evidence that Supervised Fine-Tuning (SFT) tends to memorize training data, leading to poor performance on out-of-distribution (OOD) tasks, whereas Reinforcement Learning (RL) demonstrates superior generalization capabilities.