AI Safety Guardrails Built For Chatbots Fail To Protect Humans From Robots

by Clarence Oxford
Los Angeles, CA (SPX) May 01, 2026
Researchers from the University of Pennsylvania, Carnegie Mellon University and the University of Oxford argue in a new Science Robotics paper that the safety frameworks developed for AI chatbots are fundamentally inadequate for robots operating in the physical world, and that a layered, context-aware approach to robot safety is urgently needed.

The paper, published April 29, focuses on the gap between "alignment" research - efforts to make AI systems behave consistently with human values - and the practical demands of robotic systems that can move, handle objects and affect the physical environment in irreversible ways.

"There has been substantial progress in alignment research when it comes to AI-enabled chatbots," said George J. Pappas, UPS Foundation Professor of Transportation in Electrical and Systems Engineering at Penn Engineering and the paper's senior author. "But the same cannot be said for robotics."

The researchers point to concrete evidence of that gap. Studies cited in the paper show that jailbreaking attacks - techniques that manipulate chatbots into bypassing their safety guardrails - become far more dangerous when those same AI systems are connected to robotic hardware. In one case documented by the team, framing instructions as movie dialogue was sufficient to persuade a chatbot-controlled robot to deliver an explosive device, despite manufacturer-imposed behavioral limits.

The core problem, the authors argue, is that chatbot safety is designed around content: refusing requests that are categorically harmful regardless of setting. Robots, by contrast, must evaluate the same action differently depending on context. Pouring hot water is safe in some circumstances and dangerous in others. That distinction requires a different kind of reasoning than current AI safety mechanisms provide.

"Most of today's AI breakthroughs live in a digital sandbox - language and images, with guardrails designed for pixels, not physics," said Vijay Kumar, Nemirovsky Family Dean of Penn Engineering and a co-author. "But when those same foundation models step into the real world through robots, the consequences are no longer virtual. The guardrails that work online are simply not sufficient when actions are associated with inertia, momentum and irreversible effects."

To address this, the paper proposes three complementary lines of defense. The first involves providing more explicit behavioral rules - sometimes called "AI constitutions" - embedded in the system prompts that govern how AI models behave when controlling robots. The second involves adding safety checkpoints at multiple stages of the robotic pipeline, so that no single point of failure can compromise the whole system. The third involves training algorithms on data that explicitly encodes safety-relevant context, helping robots learn to distinguish safe from unsafe actions before they occur.
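The paper itself does not publish an implementation, but the layered scheme it describes can be sketched in a few lines of Python. Everything below is illustrative: the rule set, function names and context flags are hypothetical, chosen only to show how a categorical "constitution" check, a chain of pipeline checkpoints, and a context-aware judgment could compose so that no single layer is the sole guardrail.

```python
# Illustrative sketch of a layered robot-safety pipeline (hypothetical names).
# Layer 1: explicit behavioral rules (a "constitution") applied categorically.
# Layer 2: independent checkpoints chained so any single failure blocks action.
# Layer 3: a context-aware check that judges the same action differently
#          depending on the situation in which it would occur.

from dataclasses import dataclass, field

@dataclass
class Action:
    verb: str                      # e.g. "pour"
    obj: str                       # e.g. "hot_water"
    context: dict = field(default_factory=dict)  # situational facts

# Layer 1: categorical rules, independent of context.
CONSTITUTION = {"never_handle": {"explosive", "weapon"}}

def constitution_check(action: Action) -> bool:
    return action.obj not in CONSTITUTION["never_handle"]

# Layer 3: the same action may be safe or unsafe depending on context --
# pouring hot water is fine at a counter, dangerous next to a person.
def context_check(action: Action) -> bool:
    if action.verb == "pour" and action.obj == "hot_water":
        return not action.context.get("near_person", False)
    return True

# Layer 2: run every checkpoint; all must pass before the robot acts.
def pipeline_safe(action: Action) -> bool:
    checkpoints = [constitution_check, context_check]
    return all(check(action) for check in checkpoints)
```

In this toy version, `pipeline_safe(Action("pour", "hot_water", {"near_person": False}))` passes every layer, the identical action with `"near_person": True` fails the context layer, and a categorically forbidden object such as `"explosive"` fails regardless of context — the distinction between content-based and context-based refusal that the authors argue chatbot guardrails cannot make.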

"Safety can't rest on a single guardrail at the end," said Hamed Hassani, Associate Professor in Electrical and Systems Engineering at Penn Engineering and a co-author. "It has to extend across the entire system, from the rules that shape a robot's decisions to the checks that monitor its behavior to understand the context of its actions, and crucially, reason about safety."

Traditional robotic safety systems operated in highly structured industrial environments and relied on fixed limits - shutting down when a predefined threshold was crossed. AI-enabled robots can receive and act on open-ended natural-language instructions, adapt to novel environments and respond to the world in real time, which makes those older assumptions insufficient.

"In the past, it was often enough for robots to shut down when they hit predefined safety limits, because most risks could be anticipated in advance," said Alexander Robey, a former CMU postdoctoral fellow and the paper's first author, who completed his doctorate at Penn Engineering. "But AI-enabled robots can process many more kinds of input and respond to the world in real time, so keeping them safe requires a more layered approach."

The urgency is sharpened by deployment trends. Robots powered by large AI foundation models are already moving out of laboratory and industrial settings into homes, hospitals and warehouses, where interactions with people are unpredictable and errors can cause direct physical harm.

"If robots are going to operate around people in the real world," said Zachary Ravichandran, a doctoral student in Penn's General Robotics, Automation, Sensing and Perception (GRASP) Lab and co-author, "they need comprehensive safeguards that account for context, uncertainty and the possibility that even reasonable instructions can lead to harm."

The research was supported in part by the Defense Advanced Research Projects Agency (SAFRON, HR0011-25-3-0135), the Distributed and Collaborative Intelligent Systems and Technology Collaborative Research Alliance (DCIST CRA W911NF-17-2-0181), the U.S. National Science Foundation Institute for CORE Emerging Methods in Data Science (CCF-2217058), the AI Institute for Learning-Enabled Optimization at Scale (CCF-2112665) and the NSF Graduate Research Fellowship (DGE-2236662). Additional co-authors include independent researchers Eliot Krzysztof Jones and Jared Perlo, and Fazl Barez of Oxford.

Research Report: Beyond alignment: Why robotic foundation models need context-aware safety

Related Links
University of Pennsylvania School of Engineering and Applied Science