Brain¶
The Brain layer (src/istota/brain/) is the single seam between executor orchestration and model invocation. The executor builds a fully composed prompt + env + sandbox configuration and hands a BrainRequest to a Brain implementation. Brains own the call to the model, stream parsing, and transient-API retry. Everything else — memory, skills, context, sandboxing, deferred DB writes, malformed-output detection, and result composition — stays in the executor.
Three brains ship behind the same protocol: ClaudeCodeBrain (the default, a headless claude -p subprocess wrapper), NativeBrain (Istota's own in-process agent loop against any OpenAI-compatible model), and TmuxClaudeBrain (drives the interactive claude TUI in a detached tmux session, keeping traffic on subscription billing). The executor doesn't change when you swap between them. make_brain selects on config.brain.kind (KNOWN_BRAIN_KINDS = {"claude_code", "native", "tmux_claude"}); an unknown kind raises ValueError at startup.
Layout¶
brain/
├── __init__.py # Brain Protocol re-exports + make_brain factory
├── _types.py # BrainRequest, BrainResult, BrainConfig, Brain Protocol
├── _events.py # StreamEvent types + Claude Code stream-json parser
├── _roles.py # Global operator role-override state (provider-agnostic)
├── claude_code.py # ClaudeCodeBrain — wraps the `claude` CLI subprocess +
│ # owns the Anthropic model namespace (canonical IDs,
│ # MODEL_ALIASES, DEFAULT_ROLE_TARGETS, resolver methods)
├── native.py # NativeBrain — drives Istota's in-process agent loop
└── tmux_claude.py # TmuxClaudeBrain — drives the interactive `claude` TUI in a
# detached tmux session; delegates model resolution to a
# composed ClaudeCodeBrain, only implements execute()
The native loop's machinery lives in sibling packages: llm/ (the provider abstraction — openai_compat is the only provider), agent/ (the loop and tool dispatch), and session/ (turn state, compaction, retry).
stream_parser.py at the package root is now a thin re-export shim of brain/_events.py, kept for backward compatibility with tests and a few internal callers.
Brain protocol¶
class Brain(Protocol):
def execute(self, req: BrainRequest) -> BrainResult: ...
def resolve_alias(self, alias: str) -> tuple[str | None, str | None] | None: ...
def resolve_model_name(self, name: str | None) -> str: ...
def list_aliases(self) -> list[tuple[str, str | None, str | None]]: ...
Each brain owns its own model namespace. Consumers never reach into a brain module's tables — they go through make_brain(config.brain) and call these methods. resolve_alias returns (model_id, effort) or None; resolve_model_name collapses any name to a canonical ID; list_aliases exposes the merged table for !models. make_brain(config.brain) constructs the right implementation; unknown kind values raise ValueError so misconfiguration fails loudly at startup.
Model identity¶
Every model ID in the codebase resolves through the active brain. Three layers, top to bottom:
- Operator role overrides (
brain/_roles.py, global) — provider-agnostic. Operators write[models.roles] smart = "opus-46-high"in TOML;set_role_overrides(...)is called once at config-load time. - Default role targets (per-brain, e.g.
claude_code.DEFAULT_ROLE_TARGETS) — each brain decides whatfast/general/smartmean for its namespace if the operator hasn't overridden. - Provider aliases (per-brain, e.g.
claude_code.MODEL_ALIASES) — short names likeopus-highfor(model_id, effort)pairs. Brain-specific.
Brain.validate_role_override(role, target) warns on typos and provider-alias collisions at config-load time. ClaudeCodeBrain pins to versioned IDs: OPUS = "claude-opus-4-8" (current default Opus), with prior versions kept for production pinning — OPUS_47 = "claude-opus-4-7" (opus-47 / opus-47-high aliases), OPUS_46 = "claude-opus-4-6", SONNET = "claude-sonnet-4-6", HAIKU = "claude-haiku-4-5". Bare alias names (opus, sonnet, haiku) always resolve to the current-latest constant, so bumping OPUS ripples through every consumer automatically.
BrainRequest¶
The dataclass the executor populates per task. The brain treats it as immutable input.
| Field | Notes |
|---|---|
prompt |
Fully composed prompt (emissaries + persona + memory + skills + context + request) |
allowed_tools |
From executor.build_allowed_tools() — ["Read","Write","Edit","Grep","Glob","Bash","WebSearch","WebFetch"]. For ClaudeCodeBrain / TmuxClaudeBrain the list contents no longer reach the CLI (both run with --dangerously-skip-permissions, not an --allowedTools allowlist); the names only matter to NativeBrain, which filters its in-process tool set by them. A non-empty list is also the signal that distinguishes a tool-bearing task from a text-only one (empty = no tools, no skip-permissions, e.g. the sleep cycle). |
cwd |
Subprocess working directory (config.temp_dir) |
env |
Per-task env (already credential-stripped if the skill proxy is enabled) |
timeout_seconds |
config.scheduler.task_timeout_minutes * 60 |
model |
task.model or config.model; brain default if empty |
effort |
task.effort or config.effort; brain default if empty |
custom_system_prompt_path |
Override system prompt with a file (claude_code-specific) |
streaming |
True when the executor wants per-event progress callbacks |
on_progress |
Per-event callback receiving StreamEvents (the brain handles filtering) |
cancel_check |
Polled between events; True → kill subprocess, return cancelled |
on_pid |
Called once with subprocess PID immediately after spawn |
sandbox_wrap |
Closure that wraps the brain's raw cmd (e.g. with bubblewrap); brain stays sandbox-agnostic |
result_file |
claude_code-specific fallback file path |
BrainResult¶
| Field | Notes |
|---|---|
success |
Final success/failure |
result_text |
Final response text (executor reconciles against trace via _compose_full_result) |
actions_taken |
JSON-encoded list of tool-use descriptions |
execution_trace |
JSON-encoded [{"type":"tool"\|"text"\|"cm_boundary", ...}] |
stop_reason |
completed / cancelled / timeout / oom / transient_api_error / error / not_found |
ClaudeCodeBrain¶
Wraps the claude CLI subprocess. Owns:
- Command construction —
claude -p - --dangerously-skip-permissions --disallowedTools Agent Workflow, plus optional--model,--effort,--system-prompt-file, and (in streaming mode)--output-format stream-json --verbose --include-partial-messages. Tool-bearing tasks no longer pass an--allowedToolsallowlist — the model gets its full default toolset and the security boundary is the bwrap sandbox + network proxy + clean env, not an interactive permission prompt.Agent+Workflow(the harness's multi-agent fan-out) stay denied so Istota orchestrates through its own skills. Text-only invocations (emptyallowed_tools, e.g. the sleep cycle) emit no tool flags and no skip-permissions. The--include-partial-messagesflag makes the CLI emit answer / reasoning text token-by-token asstream_eventframes before the wholeassistantblock lands — without it the final response would arrive as one block and dump all at once on stream surfaces (web / REPL). - Sandbox wrap — calls
req.sandbox_wrap(cmd)if provided so the executor's bwrap configuration applies. - Subprocess —
Popen(streaming) orsubprocess.run(simple), prompt via stdin to avoidE2BIGon large prompts; stderr drained on a background thread to prevent deadlock. - Stream parsing — line-by-line via
make_stream_parser()from_events.py, dispatchingResultEvent→ final result,ToolUseEvent/TextEvent→ trace + on_progress,ContextManagementEvent→cm_boundarymarker in trace. Thestream_eventpartial frames parse intoTextDeltaEvent/ThinkingDeltaEventand go toon_progressonly (never the trace); the trailing whole-blockTextEvent/ThinkingEventstill records the trace and is deduped against the deltas executor-side (text via_delta_seen, thinking via_thinking_seen). On push surfaces (Talk) the deltas are dropped andTextEvent→progress_textstands. - Cancellation — polls
req.cancel_check()between events; final re-check after the subprocess exits catches SIGTERM-style external kills. - Timeout —
threading.Timerkills the process afterreq.timeout_seconds; result taggedstop_reason="timeout". - OOM detection — returncode
-9→stop_reason="oom". - API retry — wraps single-attempt execution in a 3-attempt loop with 5 s fixed sleep when
is_transient_api_error()matches (5xx / 429). Retries do NOT count against the task'sattempt_count. - Result fallback — prefers
ResultEvent→ result file → stderr.
_compose_full_result() is intentionally NOT in the brain — both brains will produce (result_text, execution_trace) and the executor reconciles them (CM-aware composition + terse-result recovery).
API error helpers¶
| Function | Purpose |
|---|---|
parse_api_error(text) |
Match API Error: (\d{3}) (\{...\}) and parse status_code / message / request_id |
is_transient_api_error(text) |
True if status_code in {500, 502, 503, 504, 529, 429} |
Both are re-exported from executor for scheduler.py and tests; canonical home is brain/claude_code.py.
Configuration¶
[brain]
kind = "claude_code" # "claude_code" | "native" | "tmux_claude"
[brain.native] # only when kind = "native" (or routed-to)
provider = "openai_compat"
model = "claude-sonnet-4-6"
base_url = "https://api.anthropic.com/v1"
# api_key via ISTOTA_BRAIN_NATIVE_API_KEY (kept out of TOML)
[brain.tmux] # only when kind = "tmux_claude" (or routed-to)
# All fields default in code to the prototype's pinned values, so an
# absent block is behavioral parity. See config.example.toml for the
# full set (marker heuristics, circuit-breaker thresholds, CLI pin).
[brain.source_type_overrides] # per-source-type routing (gradual rollout)
scheduled = "native"
heartbeat = "native"
Defaults to "claude_code", so existing deployments need no changes. source_type_overrides maps a task's source_type to a brain kind, overriding kind for matching tasks — the gradual-rollout knob (brain.resolve_brain_kind resolves it per task; unknown kinds are logged and ignored). "native" is istota's own in-process agent loop — see the native brain operator runbook for enabling it, the dev tiers, and shadow compare. "tmux_claude" drives the interactive claude TUI in a detached tmux session to keep traffic on subscription billing; a launch-level failure returns stop_reason="fallback" so the executor reruns the task headless, and a process-global circuit breaker short-circuits to claude_code for a cooldown after repeated launch failures.
Adding a new brain¶
- Create
brain/<name>.pywith a class implementingBrain.execute(). - Add the kind string to
make_brain()inbrain/__init__.py. - Extend
BrainConfig(or add a nested config dataclass) for new knobs. - Update
_build_network_allowlist()inexecutor.pyif the brain calls a new external host (e.g.openrouter.ai:443). - Tests: instantiate the brain, mock its transport (HTTP / subprocess), verify it produces correct
BrainResultshapes for the standard cases (success, transient retry, cancel, timeout, oom, malformed output).
The executor doesn't need to know the new brain exists — selection is config-driven.