Reference Guide

AI Agent Governance: A Field Guide to Control, Proof, and Safe Adoption

The governance problem becomes real the moment an agent can change something real. From that point, the question is no longer whether the prompt was clear. It is whether the organization can approve, bound, observe, and reconstruct the work after it touched code, tools, CI, or infrastructure.

What this is

A practical reference page

This guide pulls together the main ideas that recur across CAISI: control, evidence, repo contracts, isolated execution, and proof of work for AI-generated change.

What this is not

Not generic AI governance

This page stays focused on software-delivery systems, coding agents, MCP and tool use, CI workflows, and approval and proof at the execution boundary.

Who this is for

AppSec, CISO, platform, and engineering leaders

Use this page when you need one shared language for control quality, evidence quality, and safe adoption decisions.

What AI agent governance means

AI agent governance begins the moment a system can change something real. Before that, you are mostly dealing with assistance. After that, you are dealing with permissioned autonomy across deterministic systems.

The practical question is simple: if the system took an action that mattered, could your team explain what it knew, what it tried to do, what was allowed, what executed, and what proof remains? If the answer is no, the organization does not have a serious governance layer yet.

The 10-minute accountability test

A useful way to judge maturity is the 10-minute accountability test. During an incident or audit, can your team, without relying on screenshots or memory, quickly answer what the agent knew, what it tried to do, what was allowed, what actually executed, and what proof remains?

If those answers take hours to rebuild, the control surface is weaker than it looks in a demo.
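The test above can be made mechanical. The sketch below assumes a hypothetical run-record schema (the field names `context`, `intent`, `policy_decision`, `executed_actions`, and `evidence_ref` are illustrative, not a standard) and reports which of the five questions a given record cannot answer.

```python
# Hypothetical run-record fields mapped to the five accountability questions.
# Field names are illustrative assumptions, not a standard schema.
REQUIRED_FIELDS = {
    "context": "what it knew",
    "intent": "what it tried to do",
    "policy_decision": "what was allowed",
    "executed_actions": "what executed",
    "evidence_ref": "what proof remains",
}

def accountability_gaps(record: dict) -> list[str]:
    """Return the accountability questions this run record cannot answer."""
    return [
        question
        for field, question in REQUIRED_FIELDS.items()
        if not record.get(field)
    ]

# A record that logs context, intent, and actions but neither the policy
# decision nor an evidence reference fails the test on two counts.
partial = {
    "context": "repo HEAD abc123",
    "intent": "open a pull request",
    "executed_actions": ["git push"],
}
print(accountability_gaps(partial))  # ['what was allowed', 'what proof remains']
```

A record that passes returns an empty list; anything else names the exact gap a responder would hit at minute eleven.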

The core control layers

Discovery

Know what exists

Inventory the tools security does not yet know about, along with MCP configs, repo entrypoints, local AI tooling, and CI workflows, before you pretend the stack is governed.
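A discovery pass can start as a simple sweep of each checkout. This is a minimal sketch; the glob patterns (`.mcp.json`, `.github/workflows`, `AGENTS.md`, `.cursorrules`) are assumptions about common layouts, not an exhaustive or authoritative signal list.

```python
# Minimal discovery sweep: walk a checkout and flag files that imply agent,
# MCP, or CI capability. Patterns are illustrative assumptions only.
from pathlib import Path

SIGNALS = {
    "mcp_config": ["**/.mcp.json", "**/mcp.json"],
    "ci_workflow": [".github/workflows/*.yml", ".github/workflows/*.yaml"],
    "agent_entrypoint": ["**/AGENTS.md", "**/.cursorrules"],
}

def inventory(repo_root: str) -> dict[str, list[str]]:
    """Return the signal files found under repo_root, grouped by kind."""
    root = Path(repo_root)
    found: dict[str, list[str]] = {}
    for kind, patterns in SIGNALS.items():
        hits = sorted(
            {str(p.relative_to(root)) for pat in patterns for p in root.glob(pat)}
        )
        if hits:
            found[kind] = hits
    return found
```

Run across every repo the organization owns, the output is a first inventory of what exists, which is the precondition for every layer that follows.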

Boundary enforcement

Put the gate before the action

Controls matter when they can change runtime behavior before the tool call crosses the execution boundary.
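In code, "the gate before the action" means the policy decision runs before the tool function is invoked, not as an after-the-fact log entry. The sketch below assumes a hypothetical policy table and tool names; a real enforcement point would sit in the agent runtime or a proxy, but the ordering is the point.

```python
# A gate that evaluates policy *before* the tool call crosses the execution
# boundary. Tool names and the policy table are illustrative assumptions.
from typing import Callable

POLICY = {
    "read_file": "allow",
    "open_pr": "require_approval",
    "delete_branch": "deny",
}

def gated_call(tool: str, fn: Callable[[], str], approved: bool = False) -> str:
    decision = POLICY.get(tool, "deny")  # default-deny for unknown tools
    if decision == "deny":
        raise PermissionError(f"{tool}: denied at the execution boundary")
    if decision == "require_approval" and not approved:
        raise PermissionError(f"{tool}: approval required before execution")
    return fn()  # only now does the action actually happen
```

The default-deny branch is the part most logging-only setups miss: an unknown tool never executes, rather than executing and being noticed later.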

Deterministic workflow

Let code own deterministic steps

Planning can stay flexible. Validation, shipping, and merge mechanics should be in code, not left to model improvisation.
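The split can be sketched in a few lines: the model proposes a change, but the validation and merge steps are fixed code that runs the same way every time. The step names and thresholds below are illustrative assumptions, not a prescribed pipeline.

```python
# "Let code own deterministic steps": the model may author the change, but
# validation and merge mechanics are fixed code. Checks are illustrative.

def run_pipeline(change: dict) -> list[str]:
    """Apply deterministic checks in a fixed order; no model in the loop."""
    log: list[str] = []
    if not change.get("tests_passed"):
        log.append("blocked: tests failed")
        return log
    if change.get("diff_lines", 0) > 500:
        log.append("blocked: diff exceeds review budget")
        return log
    log.append("validated")
    log.append("merged")  # merge mechanics live in code, not in a prompt
    return log
```

The same change always produces the same log, which is what makes the step auditable; a prompt asking the model to "please run the tests before merging" cannot make that guarantee.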

Proof

The approval is not the proof

Every meaningful run should leave a packet that a reviewer, auditor, or incident responder can understand cold.
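"Proof by construction" can be as simple as emitting a structured packet at run time so no one reconstructs it later. This sketch assumes illustrative field names and a plain SHA-256 digest for tamper evidence; a production system would likely sign the packet rather than just hash it.

```python
# Build an evidence packet as the run happens. Field names and the hash
# scheme are assumptions for illustration, not a defined standard.
import hashlib
import json
from datetime import datetime, timezone

def build_packet(intent: str, policy_decision: str, actions: list[str]) -> dict:
    """Return a self-describing record of one run, with a content digest."""
    body = {
        "intent": intent,
        "policy_decision": policy_decision,
        "executed_actions": actions,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
    # The digest covers everything a reviewer will read, so edits are visible.
    digest = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()
    ).hexdigest()
    return {**body, "digest": digest}
```

A reviewer, auditor, or incident responder can read the packet cold and recompute the digest, which is the difference between an approval record and actual proof of what executed.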

Where teams go wrong

Most organizations do not fail because they forgot the word governance. They fail because they solve the wrong layer first. They buy a tool before they define evaluation language. They write prompts before they design the runtime contract. They talk about approvals before they check whether approval actually changes execution state.

The safer path is not to slow everything down. It is to sequence the work correctly: know what exists, define what can execute, make deterministic steps explicit, and generate proof by construction.

Start by role

AppSec

Start with control that fails or holds

OpenClaw and the benchmark series are the fastest route if you need runtime control evidence and sharper buying language.

CISO

Start with approval and proof posture

The sprawl report and the benchmark series are the best entry points if you need adoption visibility, evidence posture, and pilot language.

Platform

Start with the operating model

The framework series and the Gait and Wrkr collections show how discovery, enforcement, orchestration, and proof fit together.

Where to go next

If you want the measured artifact, start with the research hub. If you want the framework, start with the Operating Notes. If you want a shared vocabulary first, open the glossary.