$ timeahead.in
$ articles --tag safety

#safety

100 articles

01
Specialization Beats Scale: A Strategic Variable Most AI Procurement Decisions Overlook
When a model’s training history …
Hugging Face Blog · Research · #inference #benchmark #training
24d
02
US scrambles to stop Internet users re-creating dead pilots’ voices
Pilots’ voices from the last seconds of a fatal cargo plane crash have been re-created by Internet sleuths using softwar…
Ars Technica AI · #multimodal #safety
24d
03
Spotify is launching AI-generated remixes
Spotify and Universal Music Group (UMG) just announced a licensing deal that will allow users to prompt the creation of …
The Verge AI · #safety
25d
04
Tech researchers are suing the Trump administration over the future of online safety
In a lawsuit, the Coalition for Ind…
MIT Technology Review · Research · #safety
25d
05
The Download: online safety’s future and climate tech’s big pivot
Plus: SpaceX has filed for an IPO expected to be the l…
MIT Technology Review · Research · #safety
25d
06
May 19, 2026 · Announcements · KPMG integrates Claude across its core business and workforce of more than 276,000 in strategic alliance
KPMG—one of the…
Anthropic News · Research · #claude #safety
27d
07
May 19, 2026 · Announcements · Widening the conversation on frontier AI
At Anthropic, we want to build AI systems that advance humanity and act for the…
Anthropic News · Research · #safety
27d
08
Import AI 457: AI stuxnet; cursed Muon optimizer; and positive alignment
Welcome to Import AI, a newsletter about AI res…
Import AI (Jack Clark) · Research · #safety
28d
09
May 18, 2026 · Announcements · Anthropic acquires Stainless
The frontier of AI is shifting from models that answer to agents that act—and agents are on…
Anthropic News · Research · #safety
31d
10
Musk v. Altman week 3: Musk and Altman traded blows over each other’s credibility. Now the jury will pick a side.
The tr…
MIT Technology Review · #safety
31d
11
Google updates its spam rules to include attempts to ‘manipulate’ AI
Google updated its spam policy to mark attempts to “manipulate” its AI model in search results as spam, including result…
The Verge AI · #safety
31d
12
Announcements · PwC is deploying Claude to build technology, execute deals, and reinvent enterprise functions for clients
Anthropic and …
Anthropic News · Research · #claude #safety
31d
13
Behold, the Elon Musk jackass trophy
Yesterday, in Musk v. Altman, before the jurors came in, Sam Altman’s team passed up what looked — from a distance — lik…
The Verge AI · Research · #safety
32d
14
Helping ChatGPT better recognize context in sensitive conversations
New safety updates help ChatGPT respond safely when …
OpenAI Blog · Tutorial · #gpt #safety
32d
15
Your doctor’s AI notetaker may be making things up, Ontario audit finds
In recent years, many overworked doctors have turned to so-called AI medical scribes to help automatically summarize pat…
Ars Technica AI · Frameworks · #observability #safety
32d
16
May 14, 2026 · Announcements · Anthropic forms $200 million partnership with the Gates Foundation
We’re partnering with the Gates Foundation to commit …
Anthropic News · Research · #safety
32d
17
OpenAI Brings Its Ass to Court
Wednesday’s episode of the Musk v. Altman trial kicked off on Wednesday with a unique proposition: OpenAI wanted to brin…
Wired AI · Research · #safety
33d
18
Anthropic blames dystopian sci-fi for training AI models to act “evil”
Those with an interest in the concept of AI alignment (i.e., getting AIs to stick to human-authored ethical rules) may r…
Ars Technica AI · Research · #claude #training #safety
33d
19
Learning on the Shop floor
11th May 2026 · Link Blog · Tobias Lütke describes Shopify's internal coding agent tool, River…
Simon Willison Blog · Agents · #agents #coding #safety
35d
20
I Work in Hollywood. Everyone Who Used to Make TV Is Now Secretly Training AI
My name on the platform is ri611. Or h924092b12ee797f, depending on who’s paying me. I work as an AI trainer. I assess w…
35d
21
Introducing Trusted Contact in ChatGPT
Connecting you to someone you trust when it matters most. People use ChatGPT to l…
OpenAI Blog · Release · #gpt #safety
39d
22
Trump Pivots on AI Regulation, Worker Ousted by DOGE Runs for Office, and Hantavirus Explained
This week on Uncanny Valley, the team discusses the surprising reports of the Trump administration seemingly reversing i…
Wired AI · #safety
39d
23
ChatGPT’s ‘Trusted Contact’ will alert loved ones of safety concerns
OpenAI is launching an optional safety feature for ChatGPT that allows adult users to assign an emergency contact for me…
The Verge AI · #gpt #safety
39d
24
The balcony solar boom is coming to the US
Plug-in panels are getting popular—how do we make sure they’re safe? Dozens o…
MIT Technology Review · #safety
39d
25
Spooked by Mythos, Trump suddenly realized AI safety testing might be good
This week, the Trump administration back pedaled and signed agreements with Google DeepMind, Microsoft, and xAI to run g…
Ars Technica AI · Model · #claude #safety
40d
26
Mira Murati tells the court that she couldn’t trust Sam Altman’s words
Mira Murati, OpenAI’s former CTO, has testified under oath that CEO Sam Altman lied to her about the safety standards fo…
The Verge AI · Infra · #multimodal #safety
40d
27
Advancing youth safety and wellbeing in EMEA
Announcing our European Youth Safety Blueprint and EMEA Youth & Wellbeing G…
OpenAI Blog · Infra · #inference #safety
41d
28
May 4, 2026 · Announcements · Building a new enterprise AI services company with Blackstone, Hellman & Friedman, and Goldman Sachs
Anthropic, Blacksto…
Anthropic News · Research · #safety
42d
29
A new US phone network for Christians aims to block porn and gender-related content
Launching next week on T-Mobile's ne…
MIT Technology Review · Release · #safety
45d
30
Elon Musk's 7 biggest stumbles on the stand at OpenAI trial
Elon Musk seems tired and cranky. On Thursday, he took the stand for the third day in a four-week trial stemming from hi…
Ars Technica AI · #safety
46d
31
Emergency First Responders Say Waymos Are Getting Worse
Emergency first-responder leaders told federal regulators in a private meeting last month that they were frustrated with…
47d · 1 view
32
Sam Altman is “the face of evil” for not reporting school shooter, says lawyer
OpenAI could have prevented one of the deadliest mass shootings in Canada’s history, a string of seven lawsuits filed We…
Ars Technica AI · #gpt #local #safety
47d
33
Our commitment to community safety
Mass shootings, threats against public officials, bombing attempts, and attacks on co…
OpenAI Blog · Tutorial · #gpt #safety
48d
34
Google and Pentagon reportedly agree on deal for ‘any lawful’ use of AI
Google has signed a classified deal that allows the US Department of Defense to use its AI models for “any lawful govern…
The Verge AI · #safety
48d
35
Apr 28, 2026 · Announcements · Claude for Creative Work
Creative professionals look to technology to expand what's possible in their work. Claude can't…
Anthropic News · Research · #claude #safety
49d · 1 view
36
Apr 27, 2026 · Announcements · Anthropic names Theo Hourmouzis General Manager of Australia & New Zealand and officially opens Sydney office
Theo Hourm…
Anthropic News · Research · #safety
49d · 1 view
37
Man faces 5 years in prison for using AI to fake sighting of runaway wolf
A 40-year-old man was arrested after using artificial intelligence to generate a fake image of a runaway wolf that South…
Ars Technica AI · Open Source · #safety
52d
38
GPT-5.5 Bio Bug Bounty
Testing universal jailbreaks for biorisks in GPT‑5.5. As part of our ongoing efforts to strengthen…
OpenAI Blog · Model · #safety
53d
39
Apr 24, 2026 · Announcements · An update on our election safeguards
People around the world turn to Claude for information about political parties, can…
Anthropic News · Research · #safety
53d
40
Import AI 454: Automating alignment research; safety study of a Chinese model; HiFloat4
At what point do the financial m…
Import AI (Jack Clark) · Research · #safety
56d
41
Apr 20, 2026 · Announcements · Anthropic and Amazon expand collaboration for up to 5 gigawatts of new compute
We have signed a new agreement with Amazo…
Anthropic News · Research · #safety
56d
42
Apr 14, 2026 · Announcements · Anthropic’s Long-Term Benefit Trust appoints Vas Narasimhan to Board of Directors
Vas Narasimhan has been appointed to A…
Anthropic News · Research · #safety
63d
43
Responsible and safe use of AI
Learn best practices for using ChatGPT safely and effectively. AI is a transformative new…
OpenAI Blog · Tutorial · #gpt #safety
66d
44
SOTA Normalization Performance with torch.compile
Normalization methods (LayerNorm/RMSNorm) are foundational in deep learning and are used to normalize value…
PyTorch Blog · Research · #training #gpu #safety
68d
45
Introducing the Child Safety Blueprint
A framework for combatting and preventing AI-enabled Child Sexual Exploitation Ch…
OpenAI Blog · Release · #safety
68d
46
Announcing the OpenAI Safety Fellowship
Introducing the OpenAI Safety Fellowship A pilot program to support independent safety and alignment research and develo…
OpenAI Blog · Research · #safety
70d
47
Apr 6, 2026 · Announcements · Anthropic expands partnership with Google and Broadcom for multiple gigawatts of next-generation compute
We have signed …
Anthropic News · Research · #safety
70d
48
Mar 31, 2026 · Announcements · Australian government and Anthropic sign MOU for AI safety and research
Today, Anthropic signed a Memorandum of Understa…
Anthropic News · Research · #safety
76d
49
Introducing the OpenAI Safety Bug Bounty program
Today, OpenAI is launching a public Safety Bug Bounty program focused on identifying AI abuse and…
OpenAI Blog · Agents · #agents #safety
82d
50
Inside our approach to the Model Spec
As AI systems become more capable and widely used, we need a clear public framewor…
OpenAI Blog · Tutorial · #safety
82d
51
Helping developers build safer AI experiences for teens
Introducing a set of teen safety policies formatted as prompts f…
OpenAI Blog · Model · #coding #safety
83d
52
Building NVIDIA Nemotron 3 Agents for Reasoning, Multimodal RAG, Voice, and Safety
Agentic AI is an ecosystem where specialized models work together to handle planning, reasoning, retrieval, and safety g…
NVIDIA Developer Blog · Infra · #rag #agents #multimodal
83d
53
Creating with Sora Safely
The Sora 2 model and the Sora app offer state-of-the-art video generation with a new way to create together, an…
OpenAI Blog · Hardware · #gpt #multimodal #safety
84d
54
NVIDIA IGX Thor Powers Industrial, Medical, and Robotics Edge AI Applications
Industrial and medical systems are rapidly increasing the use of high-performance AI to improve worker productivity, hum…
NVIDIA Developer Blog · Hardware · #agents #gpu #safety
84d
55
How we monitor internal coding agents for misalignment
Using our most powerful models to detect and study misaligned beh…
OpenAI Blog · Research · #observability #coding #safety
88d
56
OpenAI Japan announces Japan Teen Safety Blueprint to put teen safety first
Strengthening age-appropriate protections, p…
OpenAI Blog · Release · #safety
90d
57
vLLM Semantic Router v0.2 Athena: ClawOS, Model Refresh, and the System Brain · Mar 10, 2026 · 23 min read
Since v0.1 Iris, vLLM Semantic Router has made a large jump. In one release cycle, the project rebuilt its model stack, expanded routing into safety, semantic caching, memory, retrieval, and...
vLLM Blog · Infra · #inference #safety
97d
59
Improving instruction hierarchy in frontier LLMs
Introducing IH-Challenge, a training dataset that strengthens instructi…
OpenAI Blog · Infra · #coding #training #safety
97d
60
Reasoning models struggle to control their chains of thought, and that’s good
Why a limitation of frontier models is rea…
OpenAI Blog · Research · #agents #observability #coding
102d
61
An update on our mental health-related work
Each week, more than 900 million people use ChatGPT to improve their daily l…
OpenAI Blog · Release · #safety
108d
62
Advancing independent research on AI alignment
We’re committing $7.5M to The Alignment Project to fund independent resea…
OpenAI Blog · Research · #safety
116d
63
After Orthogonality: Virtue-Ethical Agency and AI Alignment
This essay argues that rational people don’t have goals, and that rational AIs shouldn’t have goals. Human actio…
The Gradient · Research · #safety
117d
64
Bringing ChatGPT to GenAI.mil
Today, OpenAI for Government is announcing the next phase of our national security work: b…
OpenAI Blog · Infra · #gpt #safety
126d
65
Import AI 444: LLM societies; Huawei makes kernels with AI; ChipBench
How can you quantify creativity? Welcome to Import…
Import AI (Jack Clark) · Research · #agents #safety
126d
66
EMEA Youth & Wellbeing Grant
Supporting organizations improving youth safety and wellbeing in the age of AI. April 2026 …
OpenAI Blog · Research · #safety
138d
67
The next chapter for AI in the EU
Key takeaways: - New program to train 20,000 SME across Europe with AI skills - €500,000 NGO grant to support research i…
OpenAI Blog · Research · #safety
138d
68
Our approach to age prediction
We’re rolling out age prediction on ChatGPT consumer plans to help determine whether an account likely belongs to someon…
OpenAI Blog · #gpt #safety
146d
69
AprielGuard: A Guardrail for Safety and Adversarial Robustness in Modern LLM Systems
In this work, we introduce AprielGu…
Hugging Face Blog · Research · #agents #training #safety
174d
70
Updating our Model Spec with teen protections
We’re sharing an update to our Model Spec, the written set of rules, values, and behavioral expectations that guides how…
OpenAI Blog · Frameworks · #gpt #safety
179d
71
BNY builds “AI for everyone, everywhere” with OpenAI
With frontier capabilities from OpenAI, BNY enables employees to bu…
OpenAI Blog · Research · #gpt #safety
185d
72
The Walt Disney Company and OpenAI reach landmark agreement to bring beloved characters to Sora
The Walt Disney Company and OpenAI reach landmark agreement to bring beloved characters from across Disney’s brands to S…
186d
73
Update to GPT-5 System Card: GPT-5.2
GPT‑5.2 is the latest model family in the GPT‑5 series, and explained in our blog. The comprehensive safety mitigation a…
OpenAI Blog · Research · #safety
186d
74
Import AI 437: Co-improving AI; RL dreams; AI labels might be annoying
Do you believe the singularity is nigh? Welcome t…
Import AI (Jack Clark) · Research · #safety
189d
75
Stop Saying Boredom is Good for Kids
Chronic boredom is harmful to adults, causing stress, disengagement, and poor well-being. Academic researchers have show…
fast.ai Blog · Research · #safety
194d
76
Funding grants for new research into AI and mental health
Introducing a new program to award up to $2 million to support…
OpenAI Blog · Research · #safety
196d
77
GPT-5.1-Codex-Max System Card
GPT‑5.1‑Codex‑Max is our new frontier agentic coding model. It is built on an update to our foundational reasoning model…
OpenAI Blog · Agents · #agents #training #safety
208d
78
Strengthening our safety ecosystem with external testing
Our approach to third party assessments for frontier AI. At Ope…
OpenAI Blog · Research · #safety
208d
79
GPT-5.1 Instant and GPT-5.1 Thinking System Card Addendum
As described in our blog, GPT‑5.1 Instant and GPT‑5.1 Thinking…
OpenAI Blog · Research · #safety
215d
80
Introducing the Teen Safety Blueprint
A framework for building AI that protects, empowers, and creates safer experiences…
OpenAI Blog · Release · #safety
221d
81
Introducing gpt-oss-safeguard
Today, we’re releasing a research preview of gpt-oss-safeguard, our open-weight reasoning models for safety classificati…
OpenAI Blog · Open Source · #coding #safety
229d
82
gpt-oss-safeguard technical report
Performance and baseline evaluations of gpt-oss-safeguard-120b and gpt-oss-safeguard-…
OpenAI Blog · Research · #safety
229d
83
OpenAI gpt-oss-safeguard · October 29, 2025
Ollama is partnering with OpenAI and ROOST (Robust Open Online Safety Tools) to bring the latest gpt-oss-safeguard reasoning models to users for safety classification tasks. gpt-oss-safeguard models are available in two sizes: 20B and 120B, and are permissively licensed under the Apache 2.0 license.
Ollama Blog · Open Source · #llama #safety
229d
84
Day Zero Support for OpenAI Open Safety Model
Fast and Affordable AI Inference For the World’s Latest Open Safety Model …
Groq Blog · Infra · #inference #safety
229d
85
Addendum to GPT-5 System Card: Sensitive conversations
When we launched GPT‑5, we noted in the system card that we were …
OpenAI Blog · Research · #benchmark #safety
231d
86
Defining and evaluating political bias in LLMs
ChatGPT shouldn’t have political bias in any direction. People use ChatGP…
OpenAI Blog · Tutorial · #gpt #safety
249d
87
How TAC Built an Agentic Chatbot with Haystack to Transform Trade Promotions Workflows · User Story · October 6, 2025
See how TELUS Agriculture & Consumer Goods (TAC) gives users unprecedented access to their data with safety in mind. By Bilge Yücel (DevRel Engineer) and Kelsey Sorrels (Data Scientist at Telus AG).
Haystack (deepset) Blog · Agents · #agents #safety
252d
88
Launching Sora responsibly
Sora 2 and the Sora app combine cutting-edge video generation with a new way to create together, and we’ve made…
OpenAI Blog · Hardware · #gpt #multimodal #safety
258d
89
ENEOS Materials brings ChatGPT Enterprise to manufacturing
Transforming the sector with AI-powered workforce solutions. …
OpenAI Blog · Research · #gpt #agents #safety
264d
90
Democratizing AI Safety with RiskRubric.ai
More than 500,000 models can be found on the Hugging Face hub, but it’s not a…
Hugging Face Blog · Tutorial · #coding #local #safety
270d
91
Detecting and reducing scheming in AI models
Together with Apollo Research, we developed evaluations for hidden misalign…
OpenAI Blog · Research · #safety
271d
92
Teen safety, freedom, and privacy
Some of our principles are in conflict, and we’d like to explain the decisions we are making around a case of tensions b…
OpenAI Blog · #local #safety
272d
93
A joint statement from OpenAI and Microsoft
OpenAI and Microsoft have signed a non-binding memorandum of unders…
OpenAI Blog · #safety
277d
94
SafetyKit scales risk agents with OpenAI’s most capable models
From prototyping with early vision model previews to scal…
OpenAI Blog · Model · #rag #safety
279d
95
OpenAI and Greek Government launch ‘OpenAI for Greece’
Today we’re launching ‘OpenAI for Greece’—a new partnership betwe…
OpenAI Blog · Release · #gpt #local #safety
283d
96
Why language models hallucinate
At OpenAI, we’re working hard to make AI systems more useful and reliable. Even as language models become more capable, …
OpenAI Blog · Research · #safety
283d
97
OpenAI and Anthropic share findings from a joint safety evaluation
Findings from a pilot Anthropic–OpenAI alignment evaluation exercise: OpenAI Safety Tests This summer, OpenAI and Anthro…
OpenAI Blog · Research · #safety
292d
98
Collective alignment: public input on our Model Spec
We surveyed over 1,000 people worldwide on how our models should be…
OpenAI Blog · Tutorial · #safety
292d
99
OpenAI’s letter to Governor Newsom on harmonized regulation
The US faces an increasingly urgent choice on AI: set clear …
OpenAI Blog · Tutorial · #safety
307d
100
From hard refusals to safe-completions: toward output-centric safety training
Introduced in GPT‑5, safe-completion is a …
OpenAI Blog · Hardware · #training #safety
312d