Independent research and operating notes on AI agent governance.
Gait Series / Post 1 of 4 / AppSec
If Policy Runs After the Tool Call, It Isn't a Control
Incident calls reveal this quickly: everyone agrees the action should have been blocked, but nobody can point to the mechanism that could have stopped it before execution. That is the core distinction in AI governance. Policy language can describe intent.
Only boundary enforcement can change runtime outcomes.
Implementation context
The Gait repo centers the boundary explicitly: evaluate structured tool-call intent with gait gate eval, return allow, block, or require_approval, and treat non-allow as non-execute.
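As implementation context, the contract can be sketched in a few lines. Everything below (ToolCall, evaluate, execute, the tool names) is an illustrative assumption for this post, not Gait's actual interface:

```python
# Minimal sketch of a pre-execution gate: structured intent in, verdict out.
# All names here are hypothetical, not Gait's API.
from dataclasses import dataclass, field

ALLOW, BLOCK, REQUIRE_APPROVAL = "allow", "block", "require_approval"

@dataclass
class ToolCall:
    tool: str
    args: dict = field(default_factory=dict)

def evaluate(call: ToolCall) -> str:
    """Stand-in for boundary evaluation (what `gait gate eval` would decide)."""
    if call.tool == "repo.delete":
        return BLOCK
    if call.tool == "pr.merge":
        return REQUIRE_APPROVAL
    return ALLOW

def execute(call: ToolCall) -> str:
    verdict = evaluate(call)
    if verdict != ALLOW:
        # Non-allow is non-execute: the side effect never happens.
        return verdict
    # ... perform the actual tool call here ...
    return ALLOW
```

The point of the sketch is the ordering: the verdict exists before the side effect, so a block or require_approval verdict prevents execution rather than annotating it.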
Where the pressure shows up
AppSec teams hear policy-complete statements every week: "that action requires approval," "the agent is instructed not to do that," "this path is restricted by guidance." Those statements are useful as documentation. They become insufficient at the exact moment an agent tries the action anyway.
The decisive follow-up is operational: what happens before the call executes? If policy only triggers in review, logs, or retrospective analysis, then the boundary is advisory. Advisory controls can improve behavior over time, but they cannot provide hard safety guarantees at execution time.
This distinction matters more as autonomy increases. Human supervision can compensate for weak controls in low-volume workflows. It does not scale when autonomous steps run frequently across multiple repos and environments.
The failure mode
The anti-pattern is reporting advisory controls as hard controls. Teams write strong guidance and collect approvals, but tool execution still depends on people noticing issues in time. This is an old automation mistake in new packaging: runbook intent is mistaken for boundary enforcement.
Prompt rules are especially easy to overtrust because they are near the model. The boundary is not the model. The boundary is the pre-action decision point where a non-allow verdict can prevent side effects. If that verdict does not exist, control quality is weaker than policy language suggests.
The better pattern
The better pattern is boundary-first policy with explicit runtime semantics. Agent intent is evaluated before execution. The gate returns allow, block, or require_approval. Any non-allow verdict is non-executable by default. That is where policy becomes a real control.
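"Non-allow is non-execute by default" can be made concrete as a fail-closed wrapper: unknown verdicts and evaluator failures are treated as block, never as allow. The gated helper below is a hypothetical illustration of that default, not Gait's implementation:

```python
# Fail-closed dispatch sketch (hypothetical): only an explicit "allow"
# reaches the action; evaluator errors and unrecognized verdicts do not.
def gated(verdict_fn, action_fn):
    def run(call):
        try:
            verdict = verdict_fn(call)
        except Exception:
            verdict = "block"  # evaluator failure fails closed
        if verdict == "allow":
            return ("executed", action_fn(call))
        # block, require_approval, or anything unexpected: no side effect
        return (verdict, None)
    return run
```

Fail-closed is the design choice that matters here: if the gate crashes or returns something unexpected, the safe outcome is non-execution, not silent execution.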
Gait is useful implementation context because it makes this split concrete: YAML policy is declarative, boundary evaluation is runtime, and decision artifacts are durable. This allows AppSec and platform to evaluate behavior based on mechanisms, not assumptions.
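To illustrate the declarative/runtime split, here is a hypothetical policy shape, written as the dict a YAML loader might produce (the rule keys are assumptions, not Gait's schema), next to the runtime lookup that turns it into a verdict:

```python
# Hypothetical declarative policy: data that reviewers can read and diff.
# Keys ("rules", "match", "verdict", "default") are assumptions, not Gait's schema.
POLICY = {
    "rules": [
        {"match": "repo.delete", "verdict": "block"},
        {"match": "pr.merge",    "verdict": "require_approval"},
    ],
    "default": "allow",
}

def verdict_for(tool: str, policy=POLICY) -> str:
    """Runtime half of the split: evaluate the declarative policy per call."""
    for rule in policy["rules"]:
        if rule["match"] == tool:
            return rule["verdict"]
    return policy["default"]
```

The split is the point: the policy is inert data that can be reviewed like any other config, while the verdict function is the runtime mechanism that makes it binding.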
This model also improves approval integrity. "Require approval" means waiting at the boundary until explicit approval exists, not permitting execution and flagging afterward. That difference is what separates governance signaling from governance enforcement.
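That waiting semantics can be sketched directly, assuming a simple approvals store keyed by call id (all names here are hypothetical):

```python
# Sketch of approval integrity: the call waits at the boundary until an
# explicit approval record exists; absent approval, it never executes.
import time

def wait_for_approval(call_id, approvals, timeout_s=2.0, poll_s=0.05):
    """Return True only when an explicit approval exists before timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if approvals.get(call_id) == "approved":
            return True   # boundary releases the call
        time.sleep(poll_s)
    return False          # timeout without approval: non-execute
```

Note the asymmetry: timing out returns False, so "we never got around to approving it" resolves to non-execution, not to execute-and-flag.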
Good controls must also survive production friction. Teams need clear approval ownership, break-glass conditions, false-positive triage, and deterministic unblock paths. Without these operating details, even correct controls get bypassed at the first high-pressure deadline.
Why security and the CISO care
Security leadership needs this distinction to report control quality honestly. Advisory controls can still be useful, but they should not be represented as hard prevention. CISOs need to know which mechanisms can actually block unsafe execution and which are detection or guidance.
During incidents, this difference becomes explicit. If evidence is only prompts, policy docs, and reviewer memory, the organization ends up defending intent instead of demonstrating enforceable behavior.
Why platform and engineering care
Platform teams benefit because explicit verdicts reduce ambiguity and policy drift. Instead of relying on model behavior norms, teams get a clear decision point with known failure modes that can be automated and tested.
It also makes debugging faster. A blocked step carries a reasoned verdict, which is materially better than post-hoc debates about whether the model "misunderstood" instructions.
What to do next
- Choose one high-impact action currently governed by guidance or manual review only.
- Implement a boundary verdict for that action with explicit non-allow non-execute semantics.
- Capture verdict, reason code, and approval path as durable review artifacts.
- Classify current controls into preventive, detective, and advisory groups.
- Report this control taxonomy to leadership before expanding autonomy scope.
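The third step above implies a record shape. A minimal sketch of a durable decision artifact, with field names chosen for illustration rather than taken from Gait:

```python
# Hypothetical decision artifact: one JSON record per gated call, capturing
# verdict, reason code, and approval path for later review.
import json

def decision_record(call_id, tool, verdict, reason_code, approval_path=None):
    return json.dumps({
        "call_id": call_id,
        "tool": tool,
        "verdict": verdict,
        "reason_code": reason_code,
        "approval_path": approval_path,
    }, sort_keys=True)
```

Appending these records to a log gives reviewers evidence of what the boundary decided and why, instead of prompts and reviewer memory.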
Once teams accept boundary-first control, the next correction is about analogy quality: policy-as-code is useful, but not the same as lint.