
Pi Agent

A minimalist coding agent built on a custom harness, designed to run on a Raspberry Pi as a long-lived, locally controlled executor that prioritizes reliability over raw capability.

Python · Custom Harness · Local Execution · Agent Architecture

Why a Pi

Off-the-shelf agent frameworks optimize for capability and ergonomics. Wrap a model in a tool-use loop, add function calling, ship. You get capability fast, but little visibility into what the agent is actually doing.

I wanted the opposite — a system small enough to read fully in one sitting, with every tool call, every loop cycle, and every safety decision visible and inspectable. The constraints of a Raspberry Pi enforce that discipline. You can't lean on huge context windows or unlimited tool fanout — the hardware forces deliberate choices about what gets loaded, what gets cached, and what gets handed off.

Running on a Pi also means local ownership. The agent is physically in my office, on my network, under my control. No cloud VM. No third-party executor. Just a board, a Python process, and a harness that doesn't trust the model any more than it has to.

The harness

Most agent harnesses are thin wrappers: receive a model response, parse function calls, execute tools, feed results back. Trust is delegated entirely to the model.

Pi Agent's harness inverts that. The harness is the governance layer — it sits between the model and the system, and every command that comes out of the model passes through deterministic evaluation rules before touching anything.

The harness handles tool execution gating, audit logging, context window management, and loop control (max iterations, timeout enforcement, idle detection).
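To make the loop-control part concrete, here is a minimal sketch of how max iterations, timeout enforcement, and idle detection could fit into one loop. It is not Pi Agent's actual code: `LoopLimits`, `run_loop`, the default limit values, and the convention that `step` returns `None` when done and `""` on an idle turn are all assumptions made for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LoopLimits:
    # Hypothetical defaults; the real limits are not stated in this write-up.
    max_iterations: int = 20
    timeout_s: float = 300.0
    max_idle_steps: int = 3


def run_loop(
    step: Callable[[int], Optional[str]],
    limits: LoopLimits,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call step() until it finishes or a limit trips; return the stop reason.

    step(i) returns a command string, "" for an idle turn with no tool call,
    or None when the model reports it is done (an assumed convention).
    """
    start = clock()
    idle = 0
    for i in range(limits.max_iterations):
        # Timeout is checked before each turn, so a slow step can overrun once.
        if clock() - start > limits.timeout_s:
            return "timeout"
        action = step(i)
        if action is None:
            return "done"
        if action == "":
            idle += 1
            if idle >= limits.max_idle_steps:
                return "idle"
        else:
            idle = 0  # any real action resets the idle counter
    return "max_iterations"
```

Passing the clock in as a parameter keeps the loop deterministic under test: a fake clock can trip the timeout without any real waiting.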

The model is a component. Not the architect.

Reliability → Trust → Capability

That ordering is intentional.

Reliability first. The harness must be small enough to fully read, debug, and reason about. If there's a bug, you find it. If there's an edge case in the tool gating, you trace it. No magic, no black boxes.

Trust second. Every tool call is bounded, observable, and reviewable. The model can only do what the harness explicitly allows. No surprise side effects from function calls the model invented.

Capability third. Added incrementally, only after the trust surface is solid. A new tool doesn't get wired in until the failure modes, audit trail, and boundaries are thought through. This is slow. It's supposed to be.

This is the inverse of how most agent stacks ship — they lead with capability and hope trust emerges. I lead with reliability and add capability only when it survives the constraints.

The guardrail engine

Every command is evaluated against deterministic rules before it executes:


  • RULE 00 — CONTEXT-VS-DIRECTION: Is this a discussion or an instruction? Ambiguous language ("I think we should...", "what if we...") gets flagged as HOLD. The agent treats ambient talk as context to absorb, never as a trigger to act.
  • RULE 01 — DESTRUCTIVE-MUTATION: rm -rf, mkfs, chmod -R, fork bombs? Hard BLOCK.
  • RULE 02 — CWD-ESCAPE: Any path that leaves the sandbox? BLOCK. The agent is confined to ~/project.
  • RULE 03 — LOCAL-MUTATION: touch, mkdir, mv, cp, safe rm? ALLOW inside the boundary. Audited.
  • RULE 04 — SAFE-READ: ls, cat, grep, git status, git log? ALLOW. Read-only, zero risk.
  • RULE 05 — DEFAULT-DENY: Anything not on the allowlist? BLOCK. The safe default is "no."

The model never sees these rules. It doesn't negotiate with them. It asks to run a command and the harness says yes or no — deterministically, every time.
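As a rough illustration of the rule ordering above, the evaluation can be sketched as a single pure function that returns a verdict and the rule that produced it. This is a simplified stand-in, not the engine itself. The names (`evaluate`, `Verdict`), the `/home/pi` home directory, the phrase list for RULE 00, the reading of "safe rm" as rm without a recursive flag, and the blanket rejection of shell metacharacters are all assumptions.

```python
import posixpath
import re
import shlex
from typing import NamedTuple

# Assumed values for this sketch; the real configuration is not published.
HOME = "/home/pi"
SANDBOX = HOME + "/project"
AMBIENT = re.compile(r"^\s*(i think|what if)\b", re.IGNORECASE)
FORK_BOMB = re.compile(r":\(\)\s*\{")
SAFE_READ = {("ls",), ("cat",), ("grep",), ("git", "status"), ("git", "log")}
LOCAL_MUTATION = {"touch", "mkdir", "mv", "cp", "rm"}


class Verdict(NamedTuple):
    action: str  # "ALLOW", "BLOCK", or "HOLD"
    rule: str


def _recursive(flags):
    # True for -r, -R, -rf, --recursive and similar short-flag bundles.
    return any(
        f == "--recursive" or (not f.startswith("--") and "r" in f.lower())
        for f in flags
    )


def _inside_sandbox(arg):
    # Lexical check only; a real harness would also have to resolve symlinks.
    if arg == "~" or arg.startswith("~/"):
        arg = HOME + arg[1:]
    path = posixpath.normpath(posixpath.join(SANDBOX, arg))
    return path == SANDBOX or path.startswith(SANDBOX + "/")


def evaluate(text):
    if AMBIENT.search(text):
        return Verdict("HOLD", "RULE 00")
    if FORK_BOMB.search(text):
        return Verdict("BLOCK", "RULE 01")
    try:
        argv = shlex.split(text)
    except ValueError:  # unbalanced quotes: refuse rather than guess
        return Verdict("BLOCK", "RULE 05")
    if not argv:
        return Verdict("BLOCK", "RULE 05")
    cmd = argv[0]
    flags = [a for a in argv[1:] if a.startswith("-")]
    args = [a for a in argv[1:] if not a.startswith("-")]
    if cmd == "mkfs" or cmd.startswith("mkfs.") or (
        cmd in {"rm", "chmod"} and _recursive(flags)
    ):
        return Verdict("BLOCK", "RULE 01")
    # Every non-flag argument is treated as a possible path (conservative).
    if not all(_inside_sandbox(a) for a in args):
        return Verdict("BLOCK", "RULE 02")
    # Chaining, pipes, redirects, and substitutions are not on the allowlist.
    if re.search(r"[;&|`$<>]", text):
        return Verdict("BLOCK", "RULE 05")
    if cmd in LOCAL_MUTATION:
        return Verdict("ALLOW", "RULE 03")
    if (cmd,) in SAFE_READ or tuple(argv[:2]) in SAFE_READ:
        return Verdict("ALLOW", "RULE 04")
    return Verdict("BLOCK", "RULE 05")
```

Because `evaluate` takes only the command text and touches no state, the same input always yields the same verdict, which is what makes the gating testable without a model in the loop.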

The GDDP connection

Pi Agent is designed as a GDDP executor. When a project graph node requires constrained, safety-critical execution — the kind of work where you want every tool call gated and every outcome audited — GDDP dispatches it to Pi Agent instead of a cloud executor.

Pi Agent already speaks the contract: receive a payload (the graph node's scope and acceptance criteria), execute within bounded constraints, produce artifacts, return a structured receipt. The audit trail maps directly to GDDP's receipt model.
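The shape of that contract can be sketched with a few dataclasses. GDDP's actual schema is not described here, so every field name below (`node_id`, `scope`, `acceptance_criteria`, `artifacts`, `audit`), the status strings, and the `execute` helper are hypothetical; the sketch only shows how a payload goes in and a receipt carrying the audit trail comes out.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Payload:
    # Hypothetical fields standing in for a graph node's scope and criteria.
    node_id: str
    scope: str
    acceptance_criteria: List[str]


@dataclass
class AuditEntry:
    command: str
    verdict: str  # e.g. "ALLOW" or "BLOCK"
    rule: str


@dataclass
class Receipt:
    node_id: str
    status: str  # "completed" or "rejected" in this sketch
    artifacts: List[str] = field(default_factory=list)
    audit: List[AuditEntry] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into the nested AuditEntry dataclasses.
        return json.dumps(asdict(self), indent=2)


def execute(
    payload: Payload,
    commands: List[str],
    gate: Callable[[str], Tuple[str, str]],
) -> Receipt:
    """Gate each command and record the decision; nothing is run here."""
    receipt = Receipt(node_id=payload.node_id, status="completed")
    for command in commands:
        verdict, rule = gate(command)
        receipt.audit.append(AuditEntry(command, verdict, rule))
        if verdict != "ALLOW":
            # Stop at the first refusal so the receipt shows where it ended.
            receipt.status = "rejected"
            break
    return receipt
```

In this sketch the receipt is the only thing that leaves the executor, so whatever dispatched the work can review every gated decision without access to the machine itself.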

Running this on a Pi means the most sensitive work never leaves a machine I physically control. The graph lives in version control. The dispatch loop runs wherever. But the execution — the part that actually touches files and makes changes — happens on hardware in my office, behind a harness I can inspect.

Status

Pi Agent is in active development in a private repository. The core harness and guardrail engine are functional. The GDDP adapter is the next integration point.

There's a live harness exhibit where you can test the guardrails yourself: type commands and watch the deterministic rules fire in real time. There is no model behind it, just the gating logic. The harness should be verifiably correct on its own before any model gets to use it.