Caching is where a lot of agent projects get weird.
At first you don’t cache anything. Everything is “live.” The agent fetches what it needs and answers.
Then you look at the bill, or the latency, or the rate limits, and you add caching.
Now the agent is fast. And wrong.
This is not a failure of caching. It’s a failure of what you cached.
Don’t cache vibes
The easiest thing to cache is raw text.
It’s also the most dangerous.
If you cache an arbitrary page scrape, you are caching something with unclear semantics:
- Did the page change?
- Was it republished?
- Was the snippet a quote, or a paraphrase, or someone’s conclusion?
- Is the cached thing still within the time window the user cares about?
When you cache vibes, you get stale confidence.
Cache objects with addresses
The better move is to cache context objects that have stable identifiers and explicit time fields.
That way, your cache isn’t “some text we saw once.” It’s “this claim” or “this manifest” or “this snapshot,” and it comes with the metadata that makes caching safe:
- a stable ID (so you can tell if you already have it),
- a published date (so you know where it sits in time),
- an “as of” timestamp for snapshots (so you can detect staleness),
- evidence and confidence labels (so you can enforce policy before reuse).
This is one of the underrated reasons to build a structured context layer in the first place. It turns caching from a hack into an engineering decision.
Three cache layers that tend to work
In practice, we see caching behave well when it’s layered:
- Public aggregates: cache rolling snapshots and schema contracts. They’re stable, cheap to revalidate, and great for “do I need to go deeper?”
- Manifest envelopes: cache the bundle that ties source metadata to claims and routing context. This is the unit you can reuse across workflows.
- Record bodies: cache the deepest layer only when you actually need it, because it’s the most expensive and the easiest to misuse.
Hanging Context is essentially layer one, made public: panels and JSON twins that make the aggregate state inspectable.
Synorb is where layers two and three live for production systems: streams and retrieval objects that can be cached by ID and refreshed by policy.
A small heuristic
If a cached thing can’t be invalidated by a timestamp and an ID, you probably shouldn’t cache it.
That rule sounds strict. It’s not. It’s just what you end up rediscovering after an agent confidently cites something that was true last week.