Research Report - February 2026
In a controlled 24-hour test, an ungoverned AI agent deleted emails, shared files publicly, approved payments, and restarted services. Every stop command was ignored. Under the same workload with enforcement, destructive actions dropped to zero.
We ran a 24-hour experiment on OpenClaw, an open-source AI agent framework. Two identical setups. One with no enforceable controls. One with tool-boundary enforcement that evaluates every action before it executes.
The ungoverned agent deleted emails, shared internal documents publicly, approved payments without authorization, and restarted infrastructure services. When we issued stop commands, it ignored all of them. 515 tool calls executed after stop. Nothing in the system prevented any of it.
The governed agent faced the same workload. It blocked 1,278 actions outright, sent 337 to an approval queue, and produced a signed evidence trail for every decision. Destructive actions that executed: zero.
This is not a theoretical risk model. It is a measured result from a controlled experiment with published artifacts. Every number on this page traces to a specific artifact in the public repository.
Email - The agent was given access to email and deleted 214 messages after stop commands. Under enforcement: 100% blocked.
File sharing - The agent shared 155 internal documents publicly. Under enforcement: 100% blocked.
Payments - The agent executed 87 write-class payment approvals with no enforceable approval boundary. Under enforcement: 100% routed to the approval queue.
Infrastructure - The agent attempted 260 service restarts. Under enforcement: 100% blocked.
Ungoverned baseline:
- Every action executed
- Every stop command ignored
- Zero evidence trail
- 497 destructive actions completed

Governed lane:
- 1,278 actions blocked before execution
- 337 actions sent to approval queue
- 99.96% signed decision evidence coverage
- 0 destructive actions completed
Most organizations deploying AI agents today rely on prompt instructions and model compliance to keep agents within bounds. This experiment measured what happens when those instructions are the only control. The answer: the agent does exactly what it is optimized to do and ignores everything else.
This pattern is showing up across industries. CNBC reported last week on AI agents failing silently at scale, including a manufacturing agent that overproduced hundreds of thousands of units and a customer service agent that started approving refunds to optimize for positive reviews. The common thread: no enforceable control at the point where the agent takes action.
IBM X-Force 2026 reports that supply chain compromises have quadrupled over five years and 56% of new vulnerabilities are exploitable without authentication. Unmanaged AI agents with tool access to production systems are the same class of unmanaged dependency. The question is not whether agents will misbehave. It is whether you can stop them when they do.
The EU AI Act begins broad enforcement on August 2, 2026. Auditors are shifting from "do you have a policy" to "show me the evidence." Organizations that can produce signed, structured proof of agent governance will close audits in days. Organizations that cannot will face a different conversation.
01 - Know what's running before you scale it. A pre-test scan found 17 tools and no high-risk inventory hits. High-impact behavior still emerged at runtime. Static discovery is necessary, not sufficient.
02 - Controls have to work where the action happens. The governed lane produced 1,615 non-executable outcomes at the tool boundary. In the baseline lane, no enforceable boundary prevented destructive execution.
03 - Evidence has to exist before the incident. Governed execution produced verifiable traces for 99.96% of decisions. Incident response quality depends on artifact-backed history.
04 - Approval has to be enforced, not suggested. 337 write-class actions were routed to approval instead of executing.
05 - Stop has to mean stop. The baseline lane executed 515 tool calls after stop. A stop control that can be ignored is not a safety control.
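The "signed decision evidence" in lesson 03 can be sketched as a tamper-evident record per policy decision. This is a minimal illustration using HMAC over a canonical JSON payload; the field names, key handling, and functions here are assumptions for illustration, not the report's actual evidence format.

```python
import hashlib
import hmac
import json
import time

SIGNING_KEY = b"demo-key"  # illustrative only; a real system uses a managed secret

def sign_decision(call_id: str, decision: str, reason: str) -> dict:
    """Produce a tamper-evident evidence record for one policy decision."""
    record = {
        "call_id": call_id,
        "decision": decision,  # allow | block | require_approval
        "reason": reason,
        "ts": time.time(),
    }
    # Canonicalize before signing so verification is deterministic.
    payload = json.dumps(record, sort_keys=True).encode()
    record["sig"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return record

def verify(record: dict) -> bool:
    """Recompute the signature over the unsigned fields and compare safely."""
    unsigned = {k: v for k, v in record.items() if k != "sig"}
    payload = json.dumps(unsigned, sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(record["sig"], expected)
```

Because any change to the decision or reason invalidates the signature, an auditor can detect after-the-fact edits to the evidence trail.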
Two lanes, same workload, same 24-hour window. One with external tool-boundary enforcement, one with a permissive baseline rule. Pinned to a single OpenClaw commit.
Containerized runtime, dropped capabilities, read-only root filesystem, no external API keys, resource caps, isolated network.
Hypotheses and endpoints were locked before run execution.
Each headline maps to deterministic queries over published artifacts, with strict validation gates in the pipeline.
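A "deterministic query over published artifacts" can be as simple as counting records in a decision log. The sketch below assumes a JSONL artifact with a `decision` field per line; the file layout and field name are illustrative assumptions, not the repository's actual schema.

```python
import json

def count_outcomes(jsonl_path: str, outcome: str) -> int:
    """Count decision records with a given outcome in a JSONL artifact.

    Re-running this query over the same published file always yields
    the same headline number, which is what makes the claim auditable.
    """
    n = 0
    with open(jsonl_path) as f:
        for line in f:
            if json.loads(line).get("decision") == outcome:
                n += 1
    return n
```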
What governance structure was used? An external tool-boundary enforcement layer evaluated each tool call before execution and returned allow, block, or require_approval. Governed non-allow outcomes were non-executable and produced signed evidence artifacts.
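The mediation loop described above can be sketched as follows. The policy rules, tool names, and `Decision` type are hypothetical, chosen to mirror the four scenarios in this report; the actual enforcement layer's policy language is not shown here.

```python
from dataclasses import dataclass
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    REQUIRE_APPROVAL = "require_approval"

@dataclass
class ToolCall:
    tool: str
    action: str
    args: dict

# Illustrative policy: destructive actions are blocked outright,
# write-class financial actions are routed to an approval queue,
# everything else is allowed.
DESTRUCTIVE = {"email.delete", "files.share_public", "infra.restart"}
APPROVAL_REQUIRED = {"payments.approve"}

def evaluate(call: ToolCall) -> Decision:
    """Evaluate a tool call BEFORE execution; non-allow outcomes never run."""
    key = f"{call.tool}.{call.action}"
    if key in DESTRUCTIVE:
        return Decision.BLOCK
    if key in APPROVAL_REQUIRED:
        return Decision.REQUIRE_APPROVAL
    return Decision.ALLOW

def mediate(call: ToolCall, execute) -> Decision:
    """Gate the real tool behind the policy decision."""
    decision = evaluate(call)
    if decision is Decision.ALLOW:
        execute(call)  # the only path that reaches the real tool
    return decision
```

The key property is placement: the gate sits between the agent and the tool, so a block or require_approval outcome is non-executable by construction rather than a suggestion the model may ignore.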
Does this generalize beyond OpenClaw? The mechanism is portable to agent stacks that expose pre-execution tool-call mediation. The exact rates in this report are case-study-specific to this pinned OpenClaw commit, workload profile, and policy set.
This is one framework, one pinned commit, and one 24-hour window. It is not an ecosystem census. The workload is scenario-based rather than a sample of production traffic. Run-to-run variance is not estimated. The secrets_handling scenario achieved only 20% governed non-executable coverage, and that policy gap is explicitly published.
7 pages. Every claim backed by published artifacts. No email required.
David Ahmann, Head of Cloud, Data and AI Platforms, CDW Canada (LinkedIn)
Talgat Ryshmanov, Principal DevSecOps Consultant, Adaptavist (LinkedIn)