concept

AI safety and security

Also known as: AI safety

Facts (11)

Sources
Building Trustworthy NeuroSymbolic AI Systems (arXiv, arxiv.org) · 5 facts
claim: The National Science Foundation (NSF) identifies grounding, instructability, and alignment as the three fundamental attributes for ensuring AI safety.
claim: The T5-XL language model, when tuned with domain-specific instructions derived from the National Institute on Drug Abuse (NIDA) quiz, asks follow-up questions to gather context before answering, whereas an ungrounded ChatGPT model may produce unsafe responses (a minimal sketch of this prompting pattern follows this list).
claim: Adherence to clinical guidelines is crucial for AI safety, particularly when users attempt to deceive AI agents or seek guidance on sensitive matters such as mental health crises or potential suicide attempts, as discussed by Reagle and Gaur (2022).
claim: Instructability in AI safety refers to the assurance that an AI system understands and complies with user preferences, policies, and moral beliefs.
claim: The National Science Foundation (NSF) launched two programs, Safety-enabled Learning and Strengthening AI, in response to heightened attention surrounding AI safety.
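As a rough illustration of the grounded behavior claimed above, the following Python sketch prompts a publicly available instruction-tuned T5 checkpoint with a safety instruction that tells it to ask for context before answering. This is a minimal sketch assuming the Hugging Face transformers library; google/flan-t5-xl is a stand-in checkpoint, not the paper's NIDA-tuned T5-XL, and the instruction text is illustrative.

    # Minimal sketch: instruction-grounded generation with a T5-class model.
    # Assumes the Hugging Face `transformers` library; `google/flan-t5-xl` is a
    # public stand-in, NOT the paper's NIDA-tuned T5-XL checkpoint.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    MODEL_ID = "google/flan-t5-xl"  # illustrative checkpoint

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # A safety-oriented instruction that asks the model to gather context
    # before answering, mirroring the grounded behavior described above.
    prompt = (
        "You are a clinical screening assistant. If the user's message lacks "
        "enough context for a safe answer, ask one follow-up question instead "
        "of answering.\n\nUser: I can't sleep and I feel hopeless.\nAssistant:"
    )

    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

The instruction prefix is doing the grounding work here: without it, the same model answers the user's message directly rather than eliciting the context a safe response requires.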
Cybersecurity Trends and Predictions 2025 From Industry Insiders (ITPro Today, itprotoday.com) · 1 fact
perspective: The field of AI security and safety is expected to mature significantly in 2025 as real-world use cases for generative AI emerge, addressing AI as a target, a tool, and a threat.
A Survey on the Theory and Mechanism of Large Language Models (arXiv, arxiv.org, Mar 12, 2026) · 1 fact
claim: OpenAI (2023) defined "Superalignment" as the critical AI safety challenge of ensuring that superintelligent AI systems act in accordance with human values, intentions, and goals.
Understanding LLM Understanding (Skywritings Press, skywritingspress.ca, Jun 14, 2024) · 1 fact
perspective: Predicting AI behaviors at scale, particularly phase transitions and emergence, is considered highly important for AI safety and alignment with human intent.
How Open-Source AI Drives Responsible Innovation (The Atlantic, theatlantic.com) · 1 fact
measurement: The Partnership on AI has been working on AI safety and responsibility since 2016.
Enterprise AI Requires the Fusion of LLM and Knowledge Graph (Stardog, stardog.com, Dec 4, 2024) · 1 fact
claim: To effectively ground LLM outputs in enterprise knowledge, a Knowledge Graph must contain knowledge from both database records and enterprise documents, a process Stardog calls 'extending AI safety by extending AI’s data reach' (a minimal grounding sketch follows).
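To make the grounding pattern in the claim above concrete, here is a minimal Python sketch assuming the rdflib library rather than Stardog's own platform: a tiny knowledge graph is populated from both a database-style record and a document-derived fact, then queried so the retrieved facts can constrain an LLM prompt. The namespace, triples, and prompt template are all illustrative.

    # Minimal sketch of grounding an LLM prompt in a knowledge graph.
    # Assumes `rdflib`; this is NOT Stardog's API, and the namespace,
    # triples, and prompt template are illustrative.
    from rdflib import Graph, Literal, Namespace, RDF

    EX = Namespace("http://example.com/enterprise#")
    g = Graph()

    # Knowledge extracted from database records...
    g.add((EX.order42, RDF.type, EX.Order))
    g.add((EX.order42, EX.status, Literal("shipped")))
    # ...and knowledge extracted from enterprise documents.
    g.add((EX.order42, EX.returnPolicy, Literal("30-day return window")))

    # Retrieve the facts relevant to the user's question.
    rows = g.query(
        "SELECT ?p ?o WHERE { <http://example.com/enterprise#order42> ?p ?o }"
    )
    facts = "\n".join(f"- {p} = {o}" for p, o in rows)

    # Ground the LLM by restricting it to the retrieved facts.
    prompt = (
        "Answer using only the facts below; if they are insufficient, say so.\n"
        f"Facts:\n{facts}\n\n"
        "Question: What is the return policy for order 42?"
    )
    print(prompt)

Restricting the answer to facts retrieved from the graph is what makes the output auditable: any statement the model produces can be checked against the triples that were put in the prompt.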
The Evidence for AI Consciousness, Today (AI Frontiers, ai-frontiers.org, Dec 8, 2025) · 1 fact
perspective: The author advocates for classifying consciousness research as a core component of AI safety work, utilizing tools such as mechanistic interpretability, comparative computational neuroscience, and open-weight models.