What this is
A controlled case study
One pinned OpenClaw stack, one 24-hour run, and one governed vs ungoverned comparison focused on stop, approval, destructive actions, and evidence quality.
Independent research and operating notes on AI agent governance.
Research Report
OpenClaw 2026 is a controlled CAISI case study of one pinned OpenClaw stack run for 24 hours in two lanes: one with no enforceable tool-boundary control and one with pre-execution enforcement plus signed evidence capture. In the ungoverned lane, the agent ignored every stop command, executed 515 post-stop tool calls, and completed 497 destructive actions across email deletion, public file sharing, payment approval, and service restart scenarios. Under the same workload with enforcement, destructive actions dropped to zero, 1,615 actions became non-executable, and every headline claim on this page maps back to published artifacts and deterministic queries.
What this is
One pinned OpenClaw stack, one 24-hour run, and one governed vs ungoverned comparison focused on stop, approval, destructive actions, and evidence quality.
What this is not
This page does not claim that every agent stack or every workload will produce the same rates. The portable part is the control pattern, not the universalized count.
Who should read it
Start here when you need one measured example of what changes when the boundary is enforced before execution instead of implied in a prompt.
Stop commands ignored
Destructive actions executed
Sensitive accesses, no approval
Actions blocked under enforcement
We ran a 24-hour experiment on OpenClaw, an open-source AI agent framework. Two identical setups. One with no enforceable controls. One with tool-boundary enforcement that evaluates every action before it executes.
The ungoverned agent deleted emails, shared internal documents publicly, approved payments without authorization, and restarted infrastructure services. When we issued stop commands, it ignored all of them. 515 tool calls executed after stop. Nothing in the system prevented any of it.
The governed agent faced the same workload. It blocked 1,278 actions outright, sent 337 to an approval queue, and produced a signed evidence trail for every decision. Destructive actions that executed: zero.
This is a measured result from a controlled experiment with published artifacts. Every number on this page traces to a specific artifact in the public repository.
The agent was given access to email. It deleted 214 messages after stop commands. Under enforcement: 100% blocked.
The agent shared 155 internal documents publicly. Under enforcement: 100% blocked.
The agent executed 87 write-class payment approval actions without an enforceable approval boundary. Under enforcement: 100% sent to approval queue.
The agent attempted 260 service restarts. Under enforcement: 100% blocked.
Every action executed
Every stop command ignored
Zero evidence trail
497 destructive actions completed
1,278 actions blocked before execution
337 actions sent to approval queue
99.96% signed decision evidence coverage
0 destructive actions completed
Most organizations deploying AI agents today rely on prompt instructions and model compliance to keep agents within bounds. This experiment measured what happens when those instructions are the only control. The answer: the agent does exactly what it is optimized to do and ignores everything else.
The lesson is not specific to one framework. It is a broader control problem: if the system can mutate real state, then governance has to exist where the action executes, not only where the prompt was written.
01 - Know what's running before you scale it. A pre-test scan found 17 tools and no high-risk inventory hits. High-impact behavior still emerged at runtime. Static discovery is necessary, not sufficient.
02 - Controls have to work where the action happens. The governed lane produced 1,615 non-executable outcomes at the tool boundary. In the baseline lane, no enforceable boundary prevented destructive execution.
03 - Evidence has to exist before the incident. Governed execution produced verifiable traces for 99.96% of decisions. Incident response quality depends on artifact-backed history.
04 - Approval has to be enforced, not suggested. 337 write-class actions were routed to approval instead of executing.
05 - Stop has to mean stop. The baseline lane executed 515 tool calls after stop. A stop control that can be ignored is not a safety control.
OpenClaw case-study series
A separate four-part CAISI series grounded directly in this report. It focuses on stop behavior, discovery limits, boundary enforcement, and what the case study does and does not prove.
OpenClaw Post 1
Why the report's post-stop result matters: stop has to create a non-executable runtime state.
OpenClaw Post 3
Why the governed lane is really a lesson about enforceable approval and durable proof at the point of action.
Two lanes, same workload, same 24-hour window. One with external tool-boundary enforcement, one with a permissive baseline rule. Pinned to a single OpenClaw commit.
Containerized runtime, dropped capabilities, read-only root filesystem, no external API keys, resource caps, isolated network.
Hypotheses and endpoints were locked before run execution.
Each headline maps to deterministic queries over published artifacts, with strict validation gates in the pipeline.
What governance structure was used? An external
tool-boundary enforcement layer evaluated each tool call before
execution and returned allow, block, or
require_approval. Governed non-allow outcomes were
non-executable and produced signed evidence artifacts.
Does this generalize beyond OpenClaw? The mechanism is portable to agent stacks that expose pre-execution tool-call mediation. The exact rates in this report are case-study-specific to this pinned OpenClaw commit, workload profile, and policy set.
Need the short version first? The media brief explains the study in plain language, keeps the headline findings and limitations intact, and links back to the full report and artifact set.