
Provenance

Provenance is the property that lets every record be traced back to its origin. It is what makes Cortex auditable, what makes supersession safe, and what makes distillation reversible.

Every record carries a provenance block in its frontmatter:

provenance:
  event_ids: [evt-...]
  source_session_id: <id>
  supersedes: <prior_record_id>       # only when this record replaces another
  derived_from: [rec-..., rec-...]    # only on distillation summaries

The block is filled in by the normalizer (and the distillation loop) using metadata that was already present on the source events. The user does not write provenance by hand.
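As a rough illustration of how such a block could be assembled from event metadata, here is a minimal sketch. The function name `build_provenance`, the event dictionary shape (`id`, `session_id`), and the rule that all source events share one session are assumptions made for this example, not Cortex's actual normalizer.

```python
from typing import Optional


def build_provenance(events: list[dict], supersedes: Optional[str] = None) -> dict:
    """Assemble a provenance block from metadata already present on the events."""
    if not events:
        raise ValueError("a record needs at least one source event")
    # Assumption for this sketch: every source event belongs to the same session.
    session_ids = {event["session_id"] for event in events}
    if len(session_ids) != 1:
        raise ValueError(f"events span several sessions: {sorted(session_ids)}")
    block = {
        "event_ids": [event["id"] for event in events],
        "source_session_id": session_ids.pop(),
    }
    if supersedes is not None:
        block["supersedes"] = supersedes  # only when this record replaces another
    return block


events = [
    {"id": "evt-1", "session_id": "sess-a"},
    {"id": "evt-2", "session_id": "sess-a"},
]
block = build_provenance(events)
```

Nothing in the block is typed by a person: every field is copied from the events themselves.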

A record claims something. The claim sits inside a session that produced it. The session was triggered in a project. The project had specific context.

Provenance makes that chain navigable:

record ──provenance.event_ids──▶ event ──event.session_id──▶ session

If a record looks wrong, the original event is recoverable. If the original event looks wrong, the originating session is identifiable. The chain bottoms out at something concrete.
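The traversal can be sketched in a few lines. The `Record` and `Event` classes and the in-memory `events` lookup are hypothetical stand-ins for whatever storage Cortex actually uses. Only the field names `event_ids` and `session_id` come from the chain shown above.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    event_id: str
    session_id: str


@dataclass
class Record:
    record_id: str
    event_ids: list[str] = field(default_factory=list)  # provenance.event_ids


def trace_sessions(record: Record, events: dict[str, Event]) -> list[str]:
    """Follow record -> event -> session and return the distinct session IDs."""
    sessions: list[str] = []
    for event_id in record.event_ids:
        event = events[event_id]  # a KeyError here means a dangling provenance link
        if event.session_id not in sessions:
            sessions.append(event.session_id)
    return sessions


events = {
    "evt-1": Event("evt-1", "sess-a"),
    "evt-2": Event("evt-2", "sess-a"),
}
record = Record("rec-1", event_ids=["evt-1", "evt-2"])
print(trace_sessions(record, events))  # ['sess-a']
```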

When a record supersedes another, the new record’s provenance includes supersedes: <old_id>. The old record’s frontmatter is updated to point forward: superseded_by: <new_id>. Both directions are explicit.

This means superseding is not a destructive operation. A reader landing on either the old or the new record can navigate to the other.
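A minimal sketch of the two-way link, assuming records are plain dictionaries held in an in-memory `store` keyed by record ID. The `supersede` helper and the `id` key are hypothetical. The `supersedes` and `superseded_by` fields are the ones described above.

```python
def supersede(store: dict[str, dict], old_id: str, new_record: dict) -> None:
    """Write new_record and link it to old_id in both directions."""
    if old_id not in store:
        raise KeyError(f"cannot supersede unknown record {old_id}")
    # Backward pointer lives in the new record's provenance block.
    new_record.setdefault("provenance", {})["supersedes"] = old_id
    store[new_record["id"]] = new_record
    # Forward pointer lives on the old record, which is kept rather than deleted.
    store[old_id]["superseded_by"] = new_record["id"]


store = {"rec-1": {"id": "rec-1", "provenance": {"event_ids": ["evt-1"]}}}
supersede(store, "rec-1", {"id": "rec-2", "provenance": {"event_ids": ["evt-9"]}})
```

After the call both records are still in the store, and each one names the other.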

A distillation summary’s provenance includes derived_from: [list of record IDs]. Anyone reading the summary can drill into every record it was built from.

This is what makes summaries trustworthy. They are not opaque restatements; they are pointers into specific original material.
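Drilling from a summary into its sources might look like the sketch below. The dictionary-based `store` and the `source_records` helper are assumptions for illustration. Only the `derived_from` field comes from the provenance block.

```python
def source_records(summary: dict, store: dict[str, dict]) -> list[dict]:
    """Return every record listed in the summary's provenance.derived_from."""
    ids = summary["provenance"].get("derived_from", [])
    missing = [record_id for record_id in ids if record_id not in store]
    if missing:
        # A summary that points at records we cannot load is not trustworthy.
        raise LookupError(f"summary {summary['id']} points at missing records: {missing}")
    return [store[record_id] for record_id in ids]


store = {
    "rec-1": {"id": "rec-1", "body": "first observation"},
    "rec-2": {"id": "rec-2", "body": "second observation"},
}
summary = {"id": "sum-1", "provenance": {"derived_from": ["rec-1", "rec-2"]}}
sources = source_records(summary, store)
```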

The default failure mode of AI memory is hallucinated detail: the system remembers the gist but cannot recover the specifics. Asked “where did this come from?”, it can only answer “I’m not sure.”

Provenance makes that failure impossible. Every record either knows where it came from or it is malformed. There is no third state. A record cannot quietly forget its origin while staying in the store, because the normalizer would have refused to write it.
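The write-time check could be sketched as follows. The exact rule here is an assumption: this example accepts a record if it names at least one source event or, for a distillation summary, at least one source record. The real normalizer may require more. `validate_provenance` is a hypothetical name.

```python
def validate_provenance(record: dict) -> None:
    """Raise ValueError if the record cannot say where it came from."""
    provenance = record.get("provenance")
    if not isinstance(provenance, dict):
        raise ValueError("record has no provenance block")
    # Assumed rule: an origin is either source events or source records.
    has_events = bool(provenance.get("event_ids"))
    has_sources = bool(provenance.get("derived_from"))
    if not (has_events or has_sources):
        raise ValueError("provenance names neither events nor source records")


def write_record(store: dict[str, dict], record: dict) -> None:
    validate_provenance(record)  # refuse the write before touching the store
    store[record["id"]] = record


store: dict[str, dict] = {}
write_record(store, {"id": "rec-1", "provenance": {"event_ids": ["evt-1"]}})
```

Because validation runs before the write, a record without an origin never reaches the store.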

Provenance covers static lineage. The access events table covers dynamic usage. Together they answer two distinct questions:

  • Where did this record come from? (Static. Provenance.)
  • When was this record actually used? (Dynamic. Access events.)

Both are available without LLM involvement. Both are queryable from the SQLite sidecar. They are separate concerns and they are kept separate on purpose.
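The two questions map onto two plain SQL queries. The table and column names below (`record_provenance`, `access_events`, `accessed_at`) are made up for this sketch and may not match the real sidecar schema. The example uses an in-memory database so it runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema, for illustration only.
    CREATE TABLE record_provenance (record_id TEXT, event_id TEXT, session_id TEXT);
    CREATE TABLE access_events (record_id TEXT, accessed_at TEXT);
    INSERT INTO record_provenance VALUES ('rec-1', 'evt-1', 'sess-a');
    INSERT INTO access_events VALUES ('rec-1', '2024-01-02T10:00:00Z');
    INSERT INTO access_events VALUES ('rec-1', '2024-01-05T09:30:00Z');
    """
)

# Static lineage: where did this record come from?
origin = conn.execute(
    "SELECT event_id, session_id FROM record_provenance WHERE record_id = ?",
    ("rec-1",),
).fetchall()

# Dynamic usage: when was this record last used?
# ISO 8601 timestamps in one timezone sort correctly as text, so MAX works.
last_used = conn.execute(
    "SELECT MAX(accessed_at) FROM access_events WHERE record_id = ?",
    ("rec-1",),
).fetchone()[0]
```

Keeping the two tables separate means a lineage query never has to scan usage history, and the reverse.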