Distillation Loop

Distillation is the process of rolling up many small records into fewer, higher-level records. It runs on its own cadence, asynchronously from everything else.

The motivation is simple: records accumulate. After a few months a single project might have hundreds of small records — meeting notes, action items, decisions, context fragments. Recall against that volume gets noisy quickly. The signal is still there, but the cost of finding it goes up.

Distillation does not delete the originals. It produces summary records that condense common themes, leaving the raw records in place for when detail is needed.

For each scope above a configured threshold (typically tens of records), the distillation pass:

  1. Selects the input set. All records under the scope, optionally filtered by recency or type.
  2. Sends them to the LLM with a structured prompt. The prompt asks for a thematic rollup: recurring topics, open threads, resolved decisions, unfinished questions.
  3. Parses the result into one or more new summary records.
  4. Writes the summary records with their own stable IDs and provenance that points back to every input record they were derived from.
  5. Updates the SQLite sidecar with the new records and their relationships.
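The five steps above can be sketched as a single routine. Everything here is hypothetical (the names `Record`, `distill_scope`, the threshold value, and the sidecar interface are illustrative, not the real Cortex API); it only shows the shape of the pass: select, prompt, parse, write with provenance, update the sidecar.

```python
from dataclasses import dataclass, field
import uuid

@dataclass
class Record:
    id: str
    scope: str
    body: str
    provenance: list = field(default_factory=list)

THRESHOLD = 30  # hypothetical value for "typically tens of records"

def distill_scope(records, llm, sidecar):
    """Roll one scope's records up into summary records (sketch)."""
    if len(records) < THRESHOLD:
        return []                                   # below threshold: skip entirely
    # Steps 1-2: select the input set and send a structured rollup prompt.
    prompt = ("Produce a thematic rollup: recurring topics, open threads, "
              "resolved decisions, unfinished questions.\n\n")
    prompt += "\n---\n".join(r.body for r in records)
    themes = llm(prompt)                            # assumed to return a list of theme strings
    # Steps 3-4: parse into summary records with stable IDs and full provenance.
    summaries = [
        Record(
            id=f"sum-{uuid.uuid4().hex[:8]}",
            scope=records[0].scope,
            body=theme,
            provenance=[r.id for r in records],     # points back to every input record
        )
        for theme in themes
    ]
    sidecar.write(summaries)                        # step 5: update the SQLite sidecar
    return summaries
```

Note that provenance is the full input set, not just the records a given theme drew on; a real implementation could attribute per theme if the LLM output supports it.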

The summary records are first-class. They can be recalled directly, they can be superseded, they show up in status calls. The only thing that distinguishes them from any other record is their provenance: it carries record IDs, not event IDs.
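That provenance distinction is mechanically checkable. A minimal sketch, assuming a hypothetical ID convention where record IDs start with `rec-` and event IDs with `evt-` (the real scheme will differ):

```python
def is_summary(provenance: list[str]) -> bool:
    """A record is a distillation summary iff its provenance
    points at records rather than events (hypothetical convention)."""
    return bool(provenance) and all(p.startswith("rec-") for p in provenance)
```

An ordinary record's provenance holds the events it was normalized from; a summary's holds the records it was distilled from.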

Why not delete the originals? Three reasons:

  • Provenance. A summary that says “decision X was reached” is worth more if you can drill into the meeting where it was reached. Throwing away the original loses that.
  • Re-distillation. Distillation is not a one-shot operation. As more records accumulate, summaries can be re-generated with more context. Throwing away the inputs makes that impossible.
  • Storage is cheap. Records are small markdown files. There is no storage pressure that justifies destructive consolidation.

Distillation does not run on every normalize pass. It runs on a slower cadence:

  • Scheduled. Periodic background runs scan all scopes and pick the ones above threshold.
  • On-demand. Explicit invocation against a specific scope.
  • Reflect. The reflect MCP operation can trigger a small, synchronous distillation against the active session as part of its rollup.

Each is a separate code path that ends up calling the same core routine.
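That shape — three thin entry points over one core routine — can be sketched as follows. All names here are illustrative, not the real Cortex code paths:

```python
THRESHOLD = 30  # hypothetical value for "typically tens of records"

def distill_core(scope, store):
    """The shared core routine every trigger ends up calling (sketch)."""
    records = store.records(scope)
    if len(records) < THRESHOLD:
        return None                 # below threshold: skip entirely
    return f"summary of {len(records)} records in {scope}"

def scheduled_run(store):
    """Periodic background run: scan all scopes, distil those above threshold."""
    return {scope: distill_core(scope, store) for scope in store.scopes()}

def on_demand(scope, store):
    """Explicit invocation against a specific scope."""
    return distill_core(scope, store)

def reflect(session_scope, store):
    """Small, synchronous distillation against the active session."""
    return distill_core(session_scope, store)
```

Keeping the threshold check inside the core routine means every entry point gets the same guard for free.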

Distillation is the only Cortex operation that calls the LLM. Three guards keep that bounded:

  • Threshold. A scope below the threshold is skipped entirely. Most scopes never distil.
  • Token budget. The prompt limits how many input records are sent to the LLM in a single call. Very large scopes are batched.
  • Wrapper-routed. The LLM call is routed through the same wrapper used by the rest of the AI tooling, so it shares quotas and observability.
What distillation is not:

  • It is not deduplication. Dedup happens in normalize; distillation assumes the input set is already de-duped.
  • It is not summarisation of arbitrary content. The inputs are existing Cortex records, with their existing structure and provenance.
  • It is not lossy compression of the store. The inputs survive distillation intact.
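Of the three guards, the token budget is the only one that needs real logic: very large scopes must be split into batches that each fit in a single prompt. A minimal sketch, assuming a caller-supplied `count_tokens` function (the budget value and names are hypothetical):

```python
def batch_by_budget(records, token_budget, count_tokens):
    """Split an input set into batches whose total token cost
    fits the prompt budget (greedy, order-preserving sketch)."""
    batches, current, used = [], [], 0
    for rec in records:
        cost = count_tokens(rec)
        if current and used + cost > token_budget:
            batches.append(current)          # current batch is full; start a new one
            current, used = [], 0
        current.append(rec)
        used += cost
    if current:
        batches.append(current)
    return batches
```

Each batch then gets its own LLM call; a record larger than the whole budget would need truncation, which this sketch does not handle.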

Distillation results are routed to the same PARA buckets as the records they were derived from, with a scope prefix that identifies them as summaries. They appear in recall results as ordinary records and rank through the same activation system.
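The routing rule can be sketched as a naming function. The `summary:` prefix here is invented for illustration; the point is only that the PARA bucket is preserved and the scope gains a marker:

```python
def summary_scope(source_scope: str) -> str:
    """Map a source scope to its summaries' scope: same PARA bucket,
    plus a prefix identifying the records as summaries (hypothetical)."""
    bucket, _, rest = source_scope.partition("/")
    return f"{bucket}/summary:{rest}"
```

Because the bucket is unchanged, recall over that bucket naturally surfaces summaries alongside the raw records they condense.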

Over a long enough horizon, recall against a scope returns a mix of recent raw records and older distilled summaries — which is exactly what you want. Detail where it matters, signal where it does not.