Tiered Memory Graduation: Don't Delete, Promote

When you run an AI agent like Hermes Agent with a memory system, you quickly hit a wall: the built-in memory has a hard character limit. When it’s full, the agent silently deletes old entries to make room for new ones. Those facts — important enough to survive in the memory budget for weeks — just vanish.

This is a known problem in the Hermes Agent community. Users on r/hermesagent regularly ask about memory management. The built-in memory (MEMORY.md and USER.md) is capped at roughly 2,200 and 1,375 characters respectively. When you hit the cap, the agent is expected to “consolidate, replace, or remove” entries. But there’s no documented guidance on what happens to evicted facts.

The Setup

I run Hermes Agent with Honcho as an AI-native memory backend. They coexist as two layers:

Layer	What it does	Budget
Hermes memory (`MEMORY.md` / `USER.md`)	Compact facts injected into every prompt, every turn	~2,200 / ~1,375 chars
Honcho conclusions (`honcho_conclude`)	Durable facts written to a searchable archive	Unlimited
Honcho peer cards	Stable identity markers (name, location, relationships)	Max 40 entries
Honcho observations	Raw facts extracted automatically from every conversation turn	Automatic

Hermes memory is always-on — every single turn, the system prompt includes the full memory block. This is great for high-signal facts that must always be in context (“user prefers CAD”, “never push to main without review”). But the tight budget means you’re constantly making room.

Honcho, on the other hand, has unlimited storage. It’s searchable via honcho_search, can be reasoned over via honcho_reasoning, and conclusions persist across sessions. But it’s not injected into every prompt — you have to explicitly query it.

The Problem

When I needed to add a new fact to Hermes memory and the budget was full, I’d remove an old entry. That fact would disappear from per-turn injection AND I never wrote it to Honcho first. The information was lost unless it happened to still exist as a raw observation the deriver had extracted from the original conversation.

Nobody is doing this better. I searched the Hermes docs, Honcho docs, GitHub discussions, Reddit threads, and community blog posts. The pattern of “graduate evicted memory to the archive layer before deletion” doesn’t exist anywhere.

The Fix: Memory Graduation

Instead of deleting evicted entries, graduate them to Honcho’s searchable archive before removing them from the tight-budget per-turn layer.

The flow:

Hermes memory is full — need to make room for a new fact
Before evicting, write the old entry to Honcho via honcho_conclude
Then remove from Hermes memory
The fact graduates from always-on per-turn injection → on-demand searchable archive

This ensures durable facts are never silently lost. They remain retrievable via honcho_search and honcho_reasoning even without per-turn injection.

When NOT to graduate

The fact is stale or wrong and should be forgotten entirely
The fact was already corrected (write the correction to Honcho first, then evict the old version)
The fact is duplicated elsewhere in memory

When facts conflict

If Hermes memory and Honcho disagree, defer to the user. Ask which is current before reconciling.

Why This Works

The two systems have complementary strengths:

Hermes memory is fast, always-on, and has a tight budget. It’s for facts that must be in context right now.
Honcho conclusions are unlimited, searchable, and don’t cost per-turn tokens. They’re for facts that are durable but don’t need to be in every prompt.

Graduation respects the nature of both layers. The tight budget of Hermes memory stays tight — you’re still making room. But nothing is lost. The archive grows, and honcho_search / honcho_reasoning can surface graduated facts when they become relevant again.

What About Honcho’s Dreamer?

Honcho 3.x has a “dreaming” system — a background reasoning agent that crawls accumulated observations and synthesizes conclusions, peer cards, and summaries. This is Honcho’s own automatic graduation: raw observations get reasoned into higher-order knowledge.

The manual graduation pattern I’m describing here is complementary, not competing. The dreamer handles the automatic synthesis — patterns, deductions, identity markers. Manual honcho_conclude handles the explicit facts the user states — “my secondary email is X”, “the agent’s email is Y”. These are things the deriver might extract, but writing them explicitly ensures they’re captured with the right framing and no delay.

Implementation

I saved this as a skill in Hermes so it persists across sessions. The key rules:

Before evicting a Hermes memory entry → write it to Honcho via honcho_conclude
Then remove from Hermes memory
Conflicting facts → defer to the user
Don’t silently delete — always graduate first (unless the fact is stale/wrong)

The skill also documents the broader write-path routing:

Fact type	Hermes memory	Honcho conclusion	Peer card
Compact preference	✅	✅ if durable	❌ (dreamer-owned)
Standing instruction	✅	✅	❌ (dreamer-owned)
Identity fact	✅ if compact	✅	❌ (dreamer-owned)
Correction	✅ update existing	✅	Only if dreamer fails

Lessons

Don’t delete what you can graduate. If you have a tiered memory system, eviction from the hot layer should be a promotion, not a deletion.
Check the docs first, but don’t stop there. Neither Hermes nor Honcho docs mention this pattern. It’s a natural consequence of how the two systems work, not something that’s prescribed.
Manual writes complement automatic ones. Honcho’s dreamer handles synthesis. The agent writing explicit conclusions via honcho_conclude fills gaps the dreamer might miss or take too long to produce.
Measure before optimizing. I ran a full token economy audit before touching memory routing. Understanding where quota was actually being burned made the memory layering decision obvious.

This pattern isn’t specific to Hermes + Honcho. Any agent setup with a hot per-turn memory and a cold searchable archive can use it — MemGPT, Letta, custom RAG stacks, anything with tiered memory. The principle is simple: when the hot layer is full, don’t throw away what you’ve learned. Put it somewhere you can find it again.

The Setup#

The Problem#

The Fix: Memory Graduation#

When NOT to graduate#

When facts conflict#

Why This Works#

What About Honcho’s Dreamer?#

Implementation#

Lessons#