ADR-0012: Agent runtime — code-mode over the generated SDK

Status

Accepted

Date

2026-06-24

Context

“Agentic” is a core product-identity pillar (ADR-0002). The pack governs agents thoroughly (delegation, AgentPrincipal, authority, tool traces) but leaves the runtime model underspecified. Two insights shape the decision:

The agent is nearly free if the surface is governed. Every Action and read is a clean contract that already generates MCP/agent tool schemas (POC 1, Value Types §20). With authority enforced inside tool execution, never in the prompt, “governance outside the prompt” falls out for free, and the agent reuses the exact human surface. (Invariant: agents do not get a separate permission system.)
Code beats tool-call chaining for real orchestration. Agents are most capable when they write code that invokes Actions/reads and wraps logic around them — loops, branching, batching, composing reads into decisions. The generated typed SDK becomes the agent’s API; Value Types give it type-safe, validated operations. The pack already anticipated this: code_mode is a presentation type on AuthorityOperation (§4.2).
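
To make the second insight concrete, here is a minimal sketch of the kind of code an agent might write: one read, then a loop with branching around an Action. Everything named here (StubSdk, Invoice, list_overdue_invoices, send_reminder) is a hypothetical stand-in, not part of the pack or the generated SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    id: str
    customer: str
    days_overdue: int


class StubSdk:
    """Stand-in for the generated typed SDK (all names are hypothetical)."""

    def __init__(self, invoices):
        self._invoices = list(invoices)
        self.calls = []  # records each Action invocation, in order

    def list_overdue_invoices(self):
        # A governed read in the real system; here it returns canned data.
        return list(self._invoices)

    def send_reminder(self, invoice_id, tone):
        # A governed Action in the real system; here it is only recorded.
        self.calls.append(("send_reminder", invoice_id, tone))


def remind_overdue(sdk, escalate_after_days=30):
    """Agent-written code: compose a read into per-item decisions."""
    sent = 0
    for inv in sdk.list_overdue_invoices():
        # Branching logic wrapped around the Action call.
        tone = "firm" if inv.days_overdue > escalate_after_days else "gentle"
        sdk.send_reminder(inv.id, tone)
        sent += 1
    return sent
```

Expressing the same workflow as chained tool calls would need one model round trip per invoice; in code it is a single run.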

Code-mode breaks per-action human approval: a single loop can invoke a thousand Actions, so no human can approve each call. The open question is where authority enforcement applies without removing the orchestration power that motivates code-mode. (Resolved here; see Decision.)

Decision

Code-mode over the generated typed SDK is the primary agent substrate. The agent writes code (in the SDK’s language) against the generated, Value-Type-typed SDK. MCP tools are thin single-shot wrappers over the same SDK for conversational/simple use. There is one governed boundary under both.
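
The "one governed boundary under both" claim can be sketched as follows. This assumes a hypothetical SDK Action create_note and a hypothetical tool handler create_note_tool; no real MCP library is used, and the boundary_log list merely stands in for the authority check.

```python
boundary_log = []  # stand-in for the single governed boundary (assumption)


def create_note(principal, text):
    """Hypothetical SDK Action; the authority check would live here."""
    boundary_log.append((principal, "create_note"))
    if not text:
        raise ValueError("text must be non-empty")
    return {"by": principal, "text": text}


def create_note_tool(principal, arguments):
    """Thin single-shot tool wrapper: unpack arguments and delegate.

    It adds no permission logic of its own, so a conversational call and
    a code-mode call cross the same boundary.
    """
    return create_note(principal, arguments["text"])
```

Because the wrapper only unpacks arguments, any change to the Action's contract or authority rules applies to both paths at once.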

Authority enforcement — per-call inside a delegation envelope, with execute-gating:

Code runs in a sandbox bound to the agent’s delegated principal — no raw DB credentials, no arbitrary network/filesystem (invariant 4.6). Only the generated SDK is in scope.
Every SDK→kernel call passes through AuthorityOperation / ReadAuthorityOperation.
Reads, computes, and prepares run freely within the delegation envelope: allowed actions/realms, max risk class, and rate + blast-radius caps.
A high-risk or irreversible Action throws DENY_REQUESTABLE — the agent’s code must catch it and emit an approval request, not push through. Approval becomes control flow the code handles. This is the pack’s prepare-allow / execute-deny_requestable split (§4.3, §7) expressed in code.
The whole run is logged as one ToolTrace (the code + every SDK call + every authority decision).
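
The enforcement rules above can be sketched as control flow. This is a minimal illustration under stated assumptions, not the pack's implementation: Kernel, DenyRequestable, the HIGH_RISK set, the action names, and the shape of the approval request are all hypothetical. Only DENY_REQUESTABLE and ToolTrace are terms from this ADR.

```python
from dataclasses import dataclass, field


class DenyRequestable(Exception):
    """Hypothetical exception carrying a DENY_REQUESTABLE decision."""

    def __init__(self, action, params):
        super().__init__(f"{action} requires approval")
        self.action = action
        self.params = params


@dataclass
class ToolTrace:
    """One trace per run: the code plus every call and its decision."""
    code: str
    events: list = field(default_factory=list)


class Kernel:
    """Stand-in for the SDK→kernel boundary (assumed shape)."""

    HIGH_RISK = {"delete_account"}  # assumed risk classification

    def __init__(self, trace):
        self.trace = trace

    def execute(self, action, **params):
        # Every call is decided here, never in the prompt.
        if action in self.HIGH_RISK:
            self.trace.events.append((action, "DENY_REQUESTABLE"))
            raise DenyRequestable(action, params)
        self.trace.events.append((action, "ALLOW"))
        return "ok"


def run_agent_code(kernel, account_ids):
    """Agent-written code: low-risk calls run; denials become requests."""
    approval_requests = []
    for account_id in account_ids:
        kernel.execute("flag_account", account_id=account_id)
        try:
            kernel.execute("delete_account", account_id=account_id)
        except DenyRequestable as denied:
            # Catch and emit an approval request instead of pushing through.
            approval_requests.append(
                {"action": denied.action, "params": denied.params}
            )
    return approval_requests
```

The loop keeps running after a denial, so the agent can finish its low-risk work and hand a batch of approval requests to a human in one pass.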

Sequencing: commit to the tool-shaped/SDK-shaped surface from day one so Actions/reads are agent-ready; build the agent itself last (proof Phase 5) as a thin demonstrator — context + prepare + explain, with execution human-approved.

Alternatives Considered

Discrete MCP tools only (no code mode)

Rejected as the primary path: per-call approval is clean but gives up the open-ended orchestration that makes agents capable. Retained as thin wrappers for simple cases.

Static analysis + whole-plan approval

Analyze generated code, bound its possible effects, approve the plan, then run.

Rejected: soundly bounding arbitrary code (loops, dynamic dispatch) is effectively impossible; the bound would be unsafe or uselessly conservative.

Dry-run in a scenario, approve the diff, commit

Rejected for now: strong safety, but heavy and leans on the scenario machinery deferred in ADR-0005.

Pre-approved skill library only

Compose from fixed, individually-governed parameterized skills.

Rejected: safer but closer to tool-calling; gives up the power we’re after. May return as a complement (a library of vetted skills the agent can call from code).

Consequences

The Action/read layer must be designed as clean, typed, generatable contracts from day one, or the agent becomes a costly retrofit — this constrains earlier work even though the agent is built last.
The delegation envelope (allowed actions/realms, max risk, rate + blast-radius caps) is a required part of AgentPrincipal and must be enforceable at the SDK→kernel boundary.
The sandbox is a real engineering artifact (no creds/network/fs escape); it is a security-critical component.
Depends on ADR-0006: the SDK language the agent writes in follows the kernel/SDK language decision.
Sensitive to: if per-call enforcement proves too coarse for some workflows, the dry-run-in-scenario model (once scenarios exist) is the escalation path.
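
As an illustration of the delegation-envelope consequence above, the sketch below shows one way the envelope might be represented and checked at the SDK→kernel boundary. The field names, the integer risk ordering, the per-run call budget standing in for a rate cap, the record count standing in for blast radius, and the plain "DENY" outcome are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class DelegationEnvelope:
    allowed_actions: frozenset
    allowed_realms: frozenset
    max_risk_class: int        # assumed ordering: higher means riskier
    max_calls: int             # simplified rate cap: call budget per run
    max_records_touched: int   # simplified blast-radius cap
    calls_made: int = 0
    records_touched: int = 0

    def check(self, action, realm, risk_class, records):
        """Return a decision for one call and update counters on ALLOW."""
        if action not in self.allowed_actions or realm not in self.allowed_realms:
            return "DENY"
        if risk_class > self.max_risk_class:
            # Above the delegated risk class: approval can be requested.
            return "DENY_REQUESTABLE"
        if self.calls_made + 1 > self.max_calls:
            return "DENY"
        if self.records_touched + records > self.max_records_touched:
            return "DENY"
        self.calls_made += 1
        self.records_touched += records
        return "ALLOW"
```

Counters advance only on ALLOW, so denied calls do not consume the budget. Whether an exhausted cap should be a hard deny or a requestable one is left open here.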