CAISI Field Notes / Framework Series

AI Engineering Operating Notes

This 10-part series explains the core operating model behind governed AI engineering: repo contracts, context loading, blueprints, orchestration, isolation, evaluation, proof, and maturity. It is the main framework collection in CAISI field notes.

For the inventory layer that sits underneath this series, start with the Agent Action BOM and the CI/CD control guide.

Posts 1 to 3 reframe the problem

Posts 4 to 7 define the operating model

Posts 8 to 10 define trust and maturity

Narrative arc

Posts 1 to 3

Reframe the problem

Move from prompts and demos to repo contracts, context architecture, and control as the real unit of design.

Posts 4 to 7

Define the operating model

Show how blueprints, orchestrators, warm isolation, and safe parallelism turn sessions into a delivery system.

Posts 8 to 10

Define trust and maturity

Explain hidden evaluations, proof packets, and what staged maturity actually looks like in practice.

The 10 posts

Post 1

Reframe the problem

AI Engineering Is a Control Problem, Not a Prompt Problem

Why unmanaged agents create new change risk, supply-chain risk, and evidence gaps that prompting alone cannot solve.

Post 2

Reframe the problem

Your Repository Is the Runtime Contract for Agents

What makes a repo agent-legible, and why repo structure is part of the security control surface.

Post 3

Reframe the problem

Why Giant Instruction Files Fail

The architecture problem behind context sprawl, weak enforceability, and slow, expensive agent workflows.

Post 4

Define the operating model

From Skills to Blueprints: Where AI Should Stop and Code Should Take Over

How to split planning, execution, validation, and shipping across AI reasoning and deterministic code.

Post 5

Define the operating model

The Dark Factory: Managing Work Instead of Supervising Agents

The orchestrator model for turning work items into validated PRs with human review states and rollback paths.

Post 6

Define the operating model

Why Governed Agent Runs Need Isolated, Warm Sandboxes

Why shared checkouts and mutable environments undermine both control and throughput.

Post 7

Define the operating model

Parallel Agents Without Chaos

How safe concurrency depends on path boundaries, dependency DAGs, claims, retries, and reconciliation.

Post 8

Define trust

Why Visible Tests Are Not Enough

Why agents overfit to what they can see and why hidden evals and digital twins matter.

Post 9

Define trust

Proof of Work for AI-Generated Changes

What a trustworthy autonomous change packet has to contain beyond green CI checks.

Post 10

Define maturity

The AI Engineering Maturity Model

A roadmap from prompt-first experiments to orchestrated, evidence-rich, dark-factory-capable autonomy.

Recurring themes

Systems over prompting

The unit of analysis is not prompt quality. It is the system that turns work into changes, approvals, evidence, and recovery paths.

Security and velocity can align

When workflows are designed correctly, the same controls that reduce blast radius also remove ambiguity and scale throughput.

Deterministic code owns deterministic steps

The best agent workflows interleave reasoning with scripts, validators, and policies that do not depend on model behavior.

Proof is part of the product

A governed system produces a reviewable packet from trigger to merge. Passing CI is necessary, but never sufficient; use the CI/CD guide for the control path underneath that proof.
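As a rough illustration of the last two themes, the sketch below shows a deterministic gate that decides whether an agent-produced change may proceed without consulting a model. Everything in it is a hypothetical stand-in rather than part of the series: the `ChangePacket` type, the `ALLOWED_PATHS` patterns, and the `REQUIRED_EVIDENCE` keys are assumptions chosen only to show that a green CI result is one input among several.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class ChangePacket:
    """Hypothetical record of one agent-produced change."""
    changed_paths: list[str]
    ci_passed: bool
    evidence: dict[str, str] = field(default_factory=dict)


# Assumed policy values, for illustration only.
ALLOWED_PATHS = ["src/*", "tests/*"]
REQUIRED_EVIDENCE = ["trigger", "plan", "test_report", "reviewer"]


def gate(packet: ChangePacket) -> list[str]:
    """Return the reasons to block the change; an empty list means it may proceed."""
    reasons = []
    # Path boundary check: plain pattern matching, no model involved.
    for path in packet.changed_paths:
        if not any(fnmatchcase(path, pattern) for pattern in ALLOWED_PATHS):
            reasons.append(f"path outside boundary: {path}")
    # CI is necessary...
    if not packet.ci_passed:
        reasons.append("CI failed")
    # ...but not sufficient: the packet must also carry its evidence.
    for key in REQUIRED_EVIDENCE:
        if not packet.evidence.get(key):
            reasons.append(f"missing evidence: {key}")
    return reasons
```

In this sketch a packet with passing CI is still blocked if it touches a path outside the allowed patterns or omits any evidence entry, which is the sense in which proof sits alongside CI rather than being replaced by it.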