Skip to content

Brain

The Brain layer (src/istota/brain/) is the single seam between executor orchestration and model invocation. The executor builds a fully composed prompt + env + sandbox configuration and hands a BrainRequest to a Brain implementation. Brains own the call to the model, stream parsing, and transient-API retry. Everything else — memory, skills, context, sandboxing, deferred DB writes, malformed-output detection, and result composition — stays in the executor.

Three brains ship behind the same protocol: ClaudeCodeBrain (the default, a headless claude -p subprocess wrapper), NativeBrain (Istota's own in-process agent loop against any OpenAI-compatible model), and TmuxClaudeBrain (drives the interactive claude TUI in a detached tmux session, keeping traffic on subscription billing). The executor doesn't change when you swap between them. make_brain selects on config.brain.kind (KNOWN_BRAIN_KINDS = {"claude_code", "native", "tmux_claude"}); an unknown kind raises ValueError at startup.

Layout

brain/
├── __init__.py     # Brain Protocol re-exports + make_brain factory
├── _types.py       # BrainRequest, BrainResult, BrainConfig, Brain Protocol
├── _events.py      # StreamEvent types + Claude Code stream-json parser
├── _roles.py       # Global operator role-override state (provider-agnostic)
├── claude_code.py  # ClaudeCodeBrain — wraps the `claude` CLI subprocess +
│                   # owns the Anthropic model namespace (canonical IDs,
│                   # MODEL_ALIASES, DEFAULT_ROLE_TARGETS, resolver methods)
├── native.py       # NativeBrain — drives Istota's in-process agent loop
└── tmux_claude.py  # TmuxClaudeBrain — drives the interactive `claude` TUI in a
                    # detached tmux session; delegates model resolution to a
                    # composed ClaudeCodeBrain, only implements execute()

The native loop's machinery lives in sibling packages: llm/ (the provider abstraction — openai_compat is the only provider), agent/ (the loop and tool dispatch), and session/ (turn state, compaction, retry).

stream_parser.py at the package root is now a thin re-export shim of brain/_events.py, kept for backward compatibility with tests and a few internal callers.

Brain protocol

class Brain(Protocol):
    def execute(self, req: BrainRequest) -> BrainResult: ...
    def resolve_alias(self, alias: str) -> tuple[str | None, str | None] | None: ...
    def resolve_model_name(self, name: str | None) -> str: ...
    def list_aliases(self) -> list[tuple[str, str | None, str | None]]: ...

Each brain owns its own model namespace. Consumers never reach into a brain module's tables — they go through make_brain(config.brain) and call these methods. resolve_alias returns (model_id, effort) or None; resolve_model_name collapses any name to a canonical ID; list_aliases exposes the merged table for !models. make_brain(config.brain) constructs the right implementation; unknown kind values raise ValueError so misconfiguration fails loudly at startup.

Model identity

Every model ID in the codebase resolves through the active brain. Three layers, top to bottom:

  1. Operator role overrides (brain/_roles.py, global) — provider-agnostic. Operators write [models.roles] smart = "opus-46-high" in TOML; set_role_overrides(...) is called once at config-load time.
  2. Default role targets (per-brain, e.g. claude_code.DEFAULT_ROLE_TARGETS) — each brain decides what fast / general / smart mean for its namespace if the operator hasn't overridden.
  3. Provider aliases (per-brain, e.g. claude_code.MODEL_ALIASES) — short names like opus-high for (model_id, effort) pairs. Brain-specific.

Brain.validate_role_override(role, target) warns on typos and provider-alias collisions at config-load time. ClaudeCodeBrain pins to versioned IDs: OPUS = "claude-opus-4-8" (current default Opus), with prior versions kept for production pinning — OPUS_47 = "claude-opus-4-7" (opus-47 / opus-47-high aliases), OPUS_46 = "claude-opus-4-6", SONNET = "claude-sonnet-4-6", HAIKU = "claude-haiku-4-5". Bare alias names (opus, sonnet, haiku) always resolve to the current-latest constant, so bumping OPUS ripples through every consumer automatically.

BrainRequest

The dataclass the executor populates per task. The brain treats it as immutable input.

Field Notes
prompt Fully composed prompt (emissaries + persona + memory + skills + context + request)
allowed_tools From executor.build_allowed_tools()["Read","Write","Edit","Grep","Glob","Bash","WebSearch","WebFetch"]. For ClaudeCodeBrain / TmuxClaudeBrain the list contents no longer reach the CLI (both run with --dangerously-skip-permissions, not an --allowedTools allowlist); the names only matter to NativeBrain, which filters its in-process tool set by them. A non-empty list is also the signal that distinguishes a tool-bearing task from a text-only one (empty = no tools, no skip-permissions, e.g. the sleep cycle).
cwd Subprocess working directory (config.temp_dir)
env Per-task env (already credential-stripped if the skill proxy is enabled)
timeout_seconds config.scheduler.task_timeout_minutes * 60
model task.model or config.model; brain default if empty
effort task.effort or config.effort; brain default if empty
custom_system_prompt_path Override system prompt with a file (claude_code-specific)
streaming True when the executor wants per-event progress callbacks
on_progress Per-event callback receiving StreamEvents (the brain handles filtering)
cancel_check Polled between events; True → kill subprocess, return cancelled
on_pid Called once with subprocess PID immediately after spawn
sandbox_wrap Closure that wraps the brain's raw cmd (e.g. with bubblewrap); brain stays sandbox-agnostic
result_file claude_code-specific fallback file path

BrainResult

Field Notes
success Final success/failure
result_text Final response text (executor reconciles against trace via _compose_full_result)
actions_taken JSON-encoded list of tool-use descriptions
execution_trace JSON-encoded [{"type":"tool"\|"text"\|"cm_boundary", ...}]
stop_reason completed / cancelled / timeout / oom / transient_api_error / error / not_found

ClaudeCodeBrain

Wraps the claude CLI subprocess. Owns:

  1. Command constructionclaude -p - --dangerously-skip-permissions --disallowedTools Agent Workflow, plus optional --model, --effort, --system-prompt-file, and (in streaming mode) --output-format stream-json --verbose --include-partial-messages. Tool-bearing tasks no longer pass an --allowedTools allowlist — the model gets its full default toolset and the security boundary is the bwrap sandbox + network proxy + clean env, not an interactive permission prompt. Agent + Workflow (the harness's multi-agent fan-out) stay denied so Istota orchestrates through its own skills. Text-only invocations (empty allowed_tools, e.g. the sleep cycle) emit no tool flags and no skip-permissions. The --include-partial-messages flag makes the CLI emit answer / reasoning text token-by-token as stream_event frames before the whole assistant block lands — without it the final response would arrive as one block and dump all at once on stream surfaces (web / REPL).
  2. Sandbox wrap — calls req.sandbox_wrap(cmd) if provided so the executor's bwrap configuration applies.
  3. SubprocessPopen (streaming) or subprocess.run (simple), prompt via stdin to avoid E2BIG on large prompts; stderr drained on a background thread to prevent deadlock.
  4. Stream parsing — line-by-line via make_stream_parser() from _events.py, dispatching ResultEvent → final result, ToolUseEvent / TextEvent → trace + on_progress, ContextManagementEventcm_boundary marker in trace. The stream_event partial frames parse into TextDeltaEvent / ThinkingDeltaEvent and go to on_progress only (never the trace); the trailing whole-block TextEvent / ThinkingEvent still records the trace and is deduped against the deltas executor-side (text via _delta_seen, thinking via _thinking_seen). On push surfaces (Talk) the deltas are dropped and TextEventprogress_text stands.
  5. Cancellation — polls req.cancel_check() between events; final re-check after the subprocess exits catches SIGTERM-style external kills.
  6. Timeoutthreading.Timer kills the process after req.timeout_seconds; result tagged stop_reason="timeout".
  7. OOM detection — returncode -9stop_reason="oom".
  8. API retry — wraps single-attempt execution in a 3-attempt loop with 5 s fixed sleep when is_transient_api_error() matches (5xx / 429). Retries do NOT count against the task's attempt_count.
  9. Result fallback — prefers ResultEvent → result file → stderr.

_compose_full_result() is intentionally NOT in the brain — both brains will produce (result_text, execution_trace) and the executor reconciles them (CM-aware composition + terse-result recovery).

API error helpers

Function Purpose
parse_api_error(text) Match API Error: (\d{3}) (\{...\}) and parse status_code / message / request_id
is_transient_api_error(text) True if status_code in {500, 502, 503, 504, 529, 429}

Both are re-exported from executor for scheduler.py and tests; canonical home is brain/claude_code.py.

Configuration

[brain]
kind = "claude_code"  # "claude_code" | "native" | "tmux_claude"

[brain.native]         # only when kind = "native" (or routed-to)
provider = "openai_compat"
model = "claude-sonnet-4-6"
base_url = "https://api.anthropic.com/v1"
# api_key via ISTOTA_BRAIN_NATIVE_API_KEY (kept out of TOML)

[brain.tmux]           # only when kind = "tmux_claude" (or routed-to)
# All fields default in code to the prototype's pinned values, so an
# absent block is behavioral parity. See config.example.toml for the
# full set (marker heuristics, circuit-breaker thresholds, CLI pin).

[brain.source_type_overrides]   # per-source-type routing (gradual rollout)
scheduled = "native"
heartbeat = "native"

Defaults to "claude_code", so existing deployments need no changes. source_type_overrides maps a task's source_type to a brain kind, overriding kind for matching tasks — the gradual-rollout knob (brain.resolve_brain_kind resolves it per task; unknown kinds are logged and ignored). "native" is istota's own in-process agent loop — see the native brain operator runbook for enabling it, the dev tiers, and shadow compare. "tmux_claude" drives the interactive claude TUI in a detached tmux session to keep traffic on subscription billing; a launch-level failure returns stop_reason="fallback" so the executor reruns the task headless, and a process-global circuit breaker short-circuits to claude_code for a cooldown after repeated launch failures.

Adding a new brain

  1. Create brain/<name>.py with a class implementing Brain.execute().
  2. Add the kind string to make_brain() in brain/__init__.py.
  3. Extend BrainConfig (or add a nested config dataclass) for new knobs.
  4. Update _build_network_allowlist() in executor.py if the brain calls a new external host (e.g. openrouter.ai:443).
  5. Tests: instantiate the brain, mock its transport (HTTP / subprocess), verify it produces correct BrainResult shapes for the standard cases (success, transient retry, cancel, timeout, oom, malformed output).

The executor doesn't need to know the new brain exists — selection is config-driven.