OpenClaw is powerful at doing but could use some help remembering.

OpenClaw runs on your machine, clears your inbox, and builds skills. The gap is memory you can verify, query, and fix. A truth layer under it gives you provenance, rollback, and the same result every time you ask.

What OpenClaw gets right

OpenClaw (and the broader Claw/Clawdbot wave) is the first time a lot of people have felt like they have a real personal AI. It runs on your machine. It has persistent memory. It can read your texts, manage your calendar, browse the web, fill forms, and build skills that get better as you use it. Brandon Wang's bull case is a good read: promise extraction from texts into calendar events, price alerts with complex criteria (e.g. "pullout bed OK if not in the same room as another bed"), freezer inventory from photos into Notion, Resy booking that intersects your calendar with restaurant availability. The agent does things. It also remembers things. Context accumulates. That's the "sweet elixir of context" he talks about.

So on the axis of "can the agent act on my behalf and learn my preferences," the answer is yes. The gap I care about is the other axis: how that memory is stored and whether it's something you can trust, replay, and fix when it goes wrong.

Where "more context" hits the same ceiling

I run a lot of my life through one agent (Cursor plus MCP): email, tasks, finance, contacts, content. I've hit limits that aren't about retrieval or model size. They're about state.

  • Overwrites with no undo. The agent updates a contact or merges two tasks. The previous state is gone. There's no versioning, no rollback. Writes are in-place.
  • No provenance. When the agent gives a wrong number or a wrong total, I can't trace it back to a specific record or import. I don't know which observation led to which answer.
  • No canonical identity. "Acme Corp" in one session and "ACME CORP" in the next may or may not be treated as the same entity. The agent re-infers each time. There are no stable IDs or merge rules.
  • Non-deterministic answers. Same question ("what's my total spend with vendor X?"), different answer tomorrow. Missed files, truncated search, or different entity resolution. No way to reproduce or verify.
  • Tool-bound memory. What the agent "knows" lives inside that tool's memory or that provider's context. I can't use the same contacts and tasks from Claude.ai or ChatGPT. The memory isn't shared across the tools I use.

Those limits don't go away when the agent gets more capable or more context. They get sharper. The more the agent does (calendar, contacts, tasks, transactions), the more you need a place where that state is first-class: identity, lineage, and the ability to query it deterministically and roll it back when something breaks.

What a truth layer adds under an agent like Claw

A truth layer isn't a replacement for the agent. It's the layer under it. The agent keeps doing: reading texts, browsing, filling forms, making calendar events, building skills. The layer is where the resulting state lives and how it's queried; a minimal sketch of what that could look like follows the list below.

  • Persistent canonical identity. Contacts, tasks, transactions, events get stable IDs. "Acme Corp" and "ACME CORP" resolve to one entity by rule, not by per-session inference.
  • Provenance and audit. Every record can be traced to a source (import, agent action, user edit) and a time. When a number is wrong, you can see where it came from.
  • Deterministic queries. "Every transaction with vendor X in the last two years" or "all tasks for project Y" hit a structured store. Same query, same result. No re-search, no truncation, no re-inference.
  • Recoverability. When the agent overwrites a contact or merges two tasks by mistake, you have versioning and an audit trail. You can see what changed and roll back. Mutations are explicit; they're not silent overwrites.
  • Cross-tool truth. The same contacts, tasks, and execution plans are available to Cursor, Claude, ChatGPT, or Claw, via something like MCP. One memory substrate, many agents.
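
To make those primitives concrete, here's a minimal sketch of what a record in such a layer might look like. It's deliberately simplified for illustration: the Entity and Provenance names, the fields, and the naive resolution rule are stand-ins, not a real schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    source: str            # e.g. "imap_import", "agent_action", "user_edit"
    recorded_at: datetime
    detail: str = ""       # free-form pointer back to the originating record


@dataclass
class Entity:
    entity_id: str          # stable canonical ID, never reused
    kind: str               # "contact", "task", "transaction", "event", ...
    attrs: dict
    provenance: Provenance
    history: list = field(default_factory=list)  # prior versions, oldest first

    def update(self, new_attrs: dict, provenance: Provenance) -> None:
        """Mutations are explicit: the old state is appended to history,
        never silently overwritten, so any change can be inspected or undone."""
        self.history.append((self.attrs, self.provenance))
        self.attrs = {**self.attrs, **new_attrs}
        self.provenance = provenance

    def rollback(self) -> None:
        """Restore the previous version, if one exists."""
        if self.history:
            self.attrs, self.provenance = self.history.pop()


def canonical_name(raw: str) -> str:
    """A deliberately naive resolution rule: 'Acme Corp' and 'ACME CORP'
    normalize to the same key by rule instead of being re-inferred each session."""
    return " ".join(raw.split()).casefold()


acme = Entity(
    entity_id="ent_0001",
    kind="contact",
    attrs={"name": "Acme Corp", "canonical_name": canonical_name("ACME CORP")},
    provenance=Provenance("agent_action", datetime.now(timezone.utc), "merged from text thread"),
)
```

The shape is the point: stable IDs, a rule-based canonical key, and mutations that carry their own provenance and leave the previous state recoverable.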

So Claw (or any Claw-style agent) would still own the "do" part: interpret intent, browse, fill forms, create events, learn workflows. The truth layer would own the "remember" part: canonical entities, timelines, provenance, and idempotent, replayable queries. The agent writes into the layer and reads from it. You get the lift of an agent that does things and a memory that doesn't drift, overwrite without trace, or disagree across sessions or tools.

Concrete picture

Imagine Claw creating a follow-up task after you promise something in a text. Today that might live in the agent's memory or in a local list. With a truth layer, that task is a first-class entity: linked to the conversation that created it, to the contact if relevant, and to any project or execution plan. You can query "all follow-ups from last week" or "tasks linked to this contact" from any tool that talks to the layer. If the agent later merges two tasks by mistake, you have a history of changes and can revert.
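
As a sketch, assuming the simplified Entity shape above: the queries are plain lookups over first-class records, and the undo is a version rollback rather than a re-search. The field names (linked_contact, label) are just illustrative.

```python
from datetime import datetime, timedelta, timezone


def followups_for_contact(store: list, contact_id: str) -> list:
    """'Tasks linked to this contact': filter on an explicit link,
    not a re-search of chat history."""
    return [
        e for e in store
        if e.kind == "task"
        and e.attrs.get("label") == "follow-up"
        and e.attrs.get("linked_contact") == contact_id
    ]


def followups_since(store: list, days: int = 7) -> list:
    """'All follow-ups from last week' as a structured lookup over the store."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=days)
    return [
        e for e in store
        if e.kind == "task"
        and e.attrs.get("label") == "follow-up"
        and e.provenance.recorded_at >= cutoff
    ]


def undo_bad_merge(task) -> None:
    """A mistaken merge is just the latest version; rolling back
    restores the pre-merge state."""
    task.rollback()
```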

Or: Claw helps you track spending with a vendor. Without a structured store, it re-searches exports and emails each time and redoes entity resolution from scratch. Totals can change. With a truth layer, transactions are normalized and tied to a canonical vendor ID. "Total spend with vendor X" is a query, not a one-off assembly. Same question, same answer. And if the agent "corrects" a transaction based on wrong inference, you have an audit trail and the option to roll back.
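
Here's that idea as a sketch with SQLite. The table layout is simplified for illustration; the point is that the total is an aggregation over normalized rows keyed by a canonical vendor ID, so the answer can't drift between askings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vendors (vendor_id TEXT PRIMARY KEY, canonical_name TEXT UNIQUE);
    CREATE TABLE transactions (
        txn_id    TEXT PRIMARY KEY,
        vendor_id TEXT REFERENCES vendors(vendor_id),
        amount    REAL NOT NULL,   -- single currency, for simplicity
        occurred  TEXT NOT NULL,   -- ISO date
        source    TEXT NOT NULL    -- provenance: which import or action wrote it
    );
""")
conn.executemany("INSERT INTO vendors VALUES (?, ?)", [("v_acme", "acme corp")])
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?, ?)", [
    ("t1", "v_acme", 120.00, "2024-03-01", "card_export_2024_03"),
    ("t2", "v_acme",  80.50, "2025-01-15", "email_receipt_2025_01_15"),
])

# Same query, same result: no re-search, no truncation, no re-inference.
total = conn.execute(
    """
    SELECT COALESCE(SUM(t.amount), 0)
    FROM transactions t
    JOIN vendors v USING (vendor_id)
    WHERE v.canonical_name = ?
      AND t.occurred >= date('now', '-2 years')
    """,
    ("acme corp",),
).fetchone()[0]
print(f"Total spend over the last two years: {total:.2f}")
```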

Brandon mentions writing workflows to Notion so he can see what Claw has learned. That's visibility into behavior. A truth layer adds visibility into state: what entities exist, how they're linked, where they came from, and how they changed. That's the complement: not just "the agent did something," but "the agent did something, and here's the state it wrote, with lineage and the ability to fix it."

Why I'm building Neotoma this way

I'm building Neotoma as a structured memory layer with those primitives: entity resolution, timelines, provenance, determinism, and cross-platform access via MCP. I'm dogfooding it in my own agentic stack to see where those primitives matter. The lesson from that work is that retrieval (embedding-based or agentic) and "more context" don't by themselves give you stable identity, verifiable state, or recoverability. Something that does has to sit underneath. OpenClaw and its ecosystem are proving that agents can do a lot. I think the next step is making sure what they do is grounded in a memory layer that you can trust, query, and fix. That's the layer I'm building.
