Memory subsystem¶

The memory subsystem lives under src/istota/memory/. This page describes how the subsystem is wired into the rest of Istota — the executor (read path), the scheduler (write path), and the on-disk storage layout. For the conceptual layering and configuration, see Memory.

src/istota/memory/
├── __init__.py
├── sleep_cycle.py        # Cron pipeline (user + channel extraction, curation, retention)
├── search.py             # Hybrid BM25 + vector index, retention sweep
├── knowledge_graph.py    # Temporal triples (subject, predicate, object)
└── curation/
    ├── types.py          # SectionedDoc / Section
    ├── parser.py         # parse / serialize markdown sections
    ├── ops.py            # apply_ops with validation
    ├── prompt.py         # curation prompt + JSON-fence stripper
    ├── audit.py          # USER.md.audit.jsonl writer
    ├── file_lock.py      # per-file flock for runtime memory writes
    └── lint.py           # Phase-A lint pass over USER.md bullets

memory/__init__.py re-exports the public surface for back-compat. In-repo callers import explicitly (from istota.memory.search import ...). The search() function is intentionally not re-exported because it would shadow the search submodule.

Read path: executor¶

The executor is the only consumer of memory data at task time. During prompt assembly (see executor) it injects memory in this fixed order:

User memory (USER.md) — read_user_memory_v2(config, user_id) from storage.py. Auto-loaded into every interactive prompt, skipped for briefings.
Knowledge graph facts — select_relevant_facts() returns identity facts (subject == user_id) plus any fact whose subject or object appears in the prompt. User-subject facts whose predicate is ephemeral (decided, interested_in, completed, acquired, disposed_of, traveled_to) are not auto-loaded as always-on identity — they pass through the same prompt-relevance gate as third-party facts, so a one-off shopping decision only surfaces when the current task is about it (ISSUE-109 lever 2). Capped by max_knowledge_facts. Skipped for briefings.
Channel memory (CHANNEL.md) — read_channel_memory(config, conversation_token) when a token is set.
Dated memories — read_dated_memories() reads the last auto_load_dated_days files from memories/YYYY-MM-DD.md. Skipped for briefings.
Recalled memories — _recall_memories() runs a hybrid search using the task prompt as the query, keyed on the user's namespace plus channel:{token} when applicable. Off by default (auto_recall = false). Two ISSUE-109 scope levers shape the results: recency decay down-weights each hit by age with a half-life of recency_half_life_days (default 180; 0 disables) so dense old clusters don't outrank current context on sheer mass, and episode windows suppress any chunk whose valid_until has passed so time-boxed memories age out cleanly.

If the resulting memory section exceeds max_memory_chars, _apply_memory_cap() truncates in this order: recalled → knowledge facts → dated → playbooks. Playbooks are truncated last (most protected — an actionable procedure outranks recalled snippets and dated context). User and channel memory are always preserved.

The read path is pure I/O and FTS5 lookups — there is no LLM call in the executor's memory layer.

Write paths¶

Per-task indexing (scheduler)¶

After every successful task, the scheduler indexes the conversation into memory_chunks:

process_one_task → execute_task → success
                ↓
                index_conversation(conn, user_id, task_id, prompt, result)
                ↓ (if conversation_token is set)
                index_conversation under channel:{token} as well

Two filters apply before indexing:

[memory_search] enabled and auto_index_conversations must both be true.
Silent scheduled jobs (task.heartbeat_silent = True) are skipped — high-volume retrieve-and-render crons have no recall value and were inflating memory_chunks.

index_conversation() chunks the text, content-hash dedupes, inserts into memory_chunks (FTS5 syncs via trigger), and embeds + writes memory_chunks_vec rows when sqlite-vec and sentence-transformers are both available. Indexing failures are caught and logged but never affect task completion.

Nightly extraction (sleep cycle)¶

check_sleep_cycles() and check_channel_sleep_cycles() run from the scheduler's main loop on briefing_check_interval (default 60 s). Each evaluates a per-user (or per-channel) cron expression and calls into memory/sleep_cycle.py when due.

process_user_sleep_cycle():

Reads sleep_cycle_state (last task id processed for this user).
gather_day_data() partitions completed tasks since the last run into INTERACTIVE (talk, email, cli) and AUTOMATED (cron, briefing, subtask) sections. Interactive tasks get 80% of a 50,000-char budget; each task gets an equal share of the section budget (interactive_budget // len(interactive), min-floored) with tail-biased truncation (40% head + 60% tail).
build_memory_extraction_prompt() includes the day data, the current USER.md (so Sonnet skips already-known facts), and a list of suggested predicates with usage hints.
Runs a privileged text-only model call through the configured brain (make_brain(config.brain).execute(BrainRequest(...))) — no streaming, no sandbox, no task queue. The sleep cycle is privileged orchestration: it goes through the brain abstraction (so a native deployment extracts with its own provider) but skips the isolation pipeline user-initiated tasks run through. Per-feature model overrides come from [sleep_cycle] (extraction_model, curation_model).
_parse_structured_extraction() extracts MEMORIES: (bullets), FACTS: (JSON triples), and TOPICS: (JSON map). Missing or malformed sections degrade gracefully.
Writes memories/YYYY-MM-DD.md with the bullets only.
Inserts each fact via add_fact() (fuzzy-deduped, single-valued predicates auto-supersede).
Picks the dominant topic from the TOPICS map and indexes the dated file with that topic via index_file(..., source_type="memory_file", topic=...).
Advances sleep_cycle_state.last_processed_task_id.
Calls cleanup_old_memory_files() (file pruning by date in filename).
Calls cleanup_old_chunks() (chunk pruning, see Retention).
If curate_user_memory is on, calls curate_user_memory().

process_channel_sleep_cycle() is the same shape, keyed on conversation_token, runs in UTC, attributes each task by user_id, focuses on shared context, and indexes under namespace channel:{token} with source_type = "channel_memory".

USER.md curation¶

When curate_user_memory = true, the user sleep cycle ends with op-based curation rather than a full file rewrite:

curate_user_memory(config, user_id, conn)
  ├── read_user_memory_v2()              # current USER.md
  ├── read_dated_memories(max_days=3)    # last 3 days, capped at 8000 chars
  ├── _load_kg_facts_text()              # current knowledge graph
  ├── parse_sectioned_doc()              # SectionedDoc
  ├── build_op_curation_prompt()         # prompt builder
  ├── make_brain(config.brain).execute() # configured brain (text-only, no sandbox)
  ├── strip_json_fences() + json.loads() # {"ops": [...]}
  ├── apply_ops()                        # (new_doc, applied, rejected)
  ├── (skip-write check on outcomes)
  ├── serialize_sectioned_doc() + write
  ├── index_file(source_type="user_memory")  # re-index for search
  ├── write_audit_log()                  # USER.md.audit.jsonl
  └── _post_curation_summary()           # one-line message to log_channel

apply_ops() accepts five op shapes (append, add_heading, remove, replace, remove_heading), validates each independently, and never raises. Bad ops accumulate in rejected while good ones still apply. remove and replace reach bullets anywhere in a section, including inside ### subsections; replace rewrites in place, remove_heading drops a whole section, and append --subheading targets a subsection.

The skip-write decision is outcome-based, not text-based: if every applied op was a no-op (noop_dup or noop_no_match), the file is left alone. Comparing serialized output against the file's current text would trigger a spurious rewrite whenever USER.md had harmless drift (CRLF, trailing whitespace on headings, missing trailing newline) that the round-trip normalized away.

Detailed semantics — op shapes, validation rules, reject reasons, audit format — live in Memory § Op-based USER.md curation.

Retention¶

[sleep_cycle] memory_retention_days is the unified knob. Each nightly user sleep cycle runs:

cleanup_old_memory_files(config, user_id, retention_days) — deletes dated files in memories/ whose date prefix is older than the cutoff.
cleanup_old_chunks(conn, user_id, retention_days) — deletes memory_chunks rows where source_type ∈ ("conversation", "memory_file", "channel_memory") and created_at is older than the cutoff. Vec rows cascade row-by-row (the vec table has no trigger; the FTS5 trigger handles memory_chunks_fts automatically). Durable user_memory chunks are never pruned by age — they refresh on file edit and after curation re-indexes.

The channel sleep cycle does the same chunk sweep scoped to channel_memory only, gated by [channel_sleep_cycle] memory_retention_days.

A subtle gotcha worth knowing: cleanup_old_chunks() formats its cutoff with strftime('%Y-%m-%d %H:%M:%S') so it matches SQLite's datetime('now') column default exactly. Python's isoformat() would emit a T separator that lex-compares greater than the SQLite space form for any same-date row, deleting up to 24 hours of rows on the cutoff day.

Storage layout¶

Files written to the user's Nextcloud workspace:

/Users/{user_id}/{bot_dir}/config/USER.md           # durable, hand- or curation-edited
/Users/{user_id}/{bot_dir}/config/USER.md.audit.jsonl  # curation audit log (sidecar)
/Users/{user_id}/memories/YYYY-MM-DD.md              # dated memory files

/Channels/{conversation_token}/CHANNEL.md            # durable, hand-edited
/Channels/{conversation_token}/memories/YYYY-MM-DD.md

SQLite tables (schema.sql):

Table	Role
`sleep_cycle_state`	Per-user `last_run_at`, `last_processed_task_id`
`channel_sleep_cycle_state`	Same, keyed on `conversation_token`
`memory_chunks`	Indexed text chunks; columns include `source_type`, `topic`, `entities`, `metadata_json`, `content_hash`, `created_at`, and `valid_from` / `valid_until` episode-window bounds (ISSUE-109)
`memory_chunks_fts`	FTS5 virtual table, trigger-synced from `memory_chunks`
`memory_chunks_vec`	sqlite-vec table, lazy-created via `ensure_vec_table()`
`knowledge_facts`	Temporal triples; `valid_from` / `valid_until` columns; unique-current index on `(user_id, subject, predicate, object) WHERE valid_until IS NULL`

source_type values used in memory_chunks:

Value	Source	Lifecycle
`conversation`	`index_conversation()` per task	Ephemeral — pruned by retention
`memory_file`	`index_file()` for dated `memories/YYYY-MM-DD.md`	Ephemeral — pruned by retention
`user_memory`	`index_file()` for USER.md (after curation or `reindex_all`)	Durable — never pruned by age
`channel_memory`	`index_file()` for dated channel memory files	Ephemeral — pruned by retention

Knowledge graph integration¶

memory/knowledge_graph.py is consumed in three places:

Sleep cycle. Extracted facts are inserted via add_fact() with source_type = "extracted". Predicate categories (all frozensets in knowledge_graph.py):
Single-valued (works_at, lives_in, has_role, has_status) — a new value auto-supersedes the prior one (sets valid_until on it).
Temporary (staying_in, visiting) — short-lived; the extractor sets valid_until case-by-case.
Auto-expiring (interested_in, completed, acquired, disposed_of, traveled_to) — when the caller supplies no valid_until, a default DEFAULT_EPHEMERAL_TTL_DAYS (90) window is stamped so the event ages out of the always-current view (ISSUE-109 lever 1). decided is deliberately excluded — durable decisions persist.
Everything else is multi-valued and coexists.

Word-level Jaccard similarity on the object string (threshold 0.6, FUZZY_DEDUP_THRESHOLD), compared only after a predicate-equality gate, catches near-duplicates. 2. Executor read path. select_relevant_facts() filters by relevance to the prompt and format_facts_for_prompt() renders them into the prompt's "Known facts" section. 3. Curation prompt. _load_kg_facts_text() includes current facts in the USER.md curation prompt so Sonnet doesn't duplicate structured knowledge as bullets in USER.md.

The graph stores temporal validity in dedicated columns rather than baking dates into the object string. invalidate_fact() sets valid_until = today; delete_fact() is a hard delete.

Failure modes and degradation¶

sqlite-vec missing: enable_vec_extension() returns False; search degrades to BM25-only. Indexing skips the vec write but still inserts memory_chunks and memory_chunks_fts.
sentence-transformers missing: _get_model() returns None; same degradation as above.
Sleep cycle brain unavailable or timeout: extraction is skipped for that user/channel that night; state advances anyway when the data is empty so we don't reprocess silently.
Curation JSON parse failure: logged with the raw output truncated; nothing is written, no audit entry. The next night re-attempts.
Indexing exception: caught and logged; never affects task completion.
Mount unavailable: sleep cycle skips file writes (mount is required for memory file reads/writes).

CLI surface¶

The memory_search skill exposes the index and the knowledge graph:

istota-skill memory_search search QUERY [--topic ...] [--entity ...] [--since YYYY-MM-DD]
istota-skill memory_search index conversation TASK_ID
istota-skill memory_search index file PATH
istota-skill memory_search reindex
istota-skill memory_search stats
istota-skill memory_search facts [--entity ...]
istota-skill memory_search timeline ENTITY
istota-skill memory_search add-fact …
istota-skill memory_search invalidate ID
istota-skill memory_search delete-fact ID

Memory — layered design, prompt order, configuration tables.
Executor — full prompt-assembly order and result composition.
Scheduler — when sleep cycles fire and how they're driven from the main loop.
Database — full schema overview.