Blog · May 2026 · 8 min read

Giving AI Memory Across Threads

The hardest part of running an AI agent isn't making it smart. It's making it remember. Every new conversation thread starts with amnesia — the agent has no idea what happened in other threads, what decisions were made, or what you told it yesterday. I solved this with a three-tier memory system that gives Hermes continuity across Discord, terminal, and cron jobs.

The Three Tiers

Not all memory is the same. I needed different mechanisms for different types of information:

  1. Basic Memory — Always-on context, injected into every conversation turn
  2. Fact Store — Structured entity data with search and relational queries
  3. Session Search — Recall from past conversation transcripts

Each tier solves a different forgetting problem.

Tier 1: Basic Memory

Basic memory is a block of text injected into the system prompt of every conversation. It's always there, every turn, regardless of thread. Think of it as the agent's "always know" list.

This is where I store:

The key constraint: basic memory has a character limit (8,000 chars in my config). It needs to be dense and factual, not procedural. Every entry earns its place by preventing a future question or correction.
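As a rough illustration of that constraint, here is a minimal sketch of injecting memory entries into a system prompt under a character budget. Only the 8,000-character limit comes from the config mentioned above; the function name, the `[Basic memory]` label, and the "stop at the first entry that doesn't fit" policy are assumptions for the example, not how Hermes actually does it.

```python
from __future__ import annotations

MEMORY_CHAR_LIMIT = 8000  # the limit quoted above; everything else is assumed


def build_system_prompt(base_prompt: str, memory_entries: list[str],
                        limit: int = MEMORY_CHAR_LIMIT) -> str:
    """Append memory entries to the base prompt without exceeding `limit` characters.

    Hypothetical helper: entries are taken in order, so the most important
    ones should come first. The first entry that would overflow the budget
    stops the loop.
    """
    kept: list[str] = []
    used = 0
    for entry in memory_entries:
        line = f"- {entry.strip()}"
        cost = len(line) + 1  # +1 for the newline that joins entries
        if used + cost > limit:
            break
        kept.append(line)
        used += cost
    memory_block = "\n".join(kept)
    return f"{base_prompt}\n\n[Basic memory]\n{memory_block}"
```

Because entries are consumed in order, the ordering of the list doubles as a priority ranking: whatever sits at the bottom is the first thing to be dropped when the block grows too large.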

Tier 2: Fact Store (Holographic Memory)

Basic memory handles "what do I always need to know?" But what about structured queries like "what port does LanceOS run on?" or "what are the details of my AWS setup?" That's where the fact store comes in.

The fact store is a structured database with entity resolution and trust scoring. Each fact has:

The query operations make it powerful:

This means the agent can answer "what services run on Unraid?" by probing the Unraid entity, or "how do AWS and my blog connect?" by reasoning across both entities.
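To make the idea of probing an entity concrete, here is a deliberately small in-memory sketch. It is not the real fact store: the `Fact` fields, the `probe` and `connect` method names, and the 0 to 1 trust scale are all assumptions, and the facts in the usage note are placeholders rather than details of my setup.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Fact:
    entity: str                    # the entity the fact is about, e.g. "Unraid"
    content: str                   # the fact itself, as plain text
    trust: float = 0.5             # assumed scale: 0.0 (unverified) to 1.0 (confirmed)
    related: tuple[str, ...] = ()  # other entities this fact mentions


class FactStore:
    """Hypothetical stand-in for a structured fact store."""

    def __init__(self) -> None:
        self._facts: list[Fact] = []

    def add(self, fact: Fact) -> None:
        self._facts.append(fact)

    def probe(self, entity: str, min_trust: float = 0.0) -> list[Fact]:
        """Return facts about one entity, most trusted first."""
        name = entity.lower()
        hits = [f for f in self._facts
                if f.entity.lower() == name and f.trust >= min_trust]
        return sorted(hits, key=lambda f: f.trust, reverse=True)

    def connect(self, a: str, b: str) -> list[Fact]:
        """Return facts that involve both entities (as subject or as related)."""
        wanted = {a.lower(), b.lower()}
        out = []
        for f in self._facts:
            names = {f.entity.lower(), *(r.lower() for r in f.related)}
            if wanted <= names:
                out.append(f)
        return out
```

With a placeholder fact such as `Fact("AWS", "hosts the blog (example)", 0.8, related=("blog",))` in the store, `store.connect("AWS", "blog")` returns that fact, while `store.probe("Unraid")` lists everything recorded under the Unraid entity. Matching on lowercase names is only the crudest form of entity resolution; a real store would also need to handle aliases.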

Tier 3: Session Search

For everything that happened in past conversations — the work done, the bugs fixed, the decisions made — there's session search. It's a full-text search index over all past conversation transcripts.

When the user says "what were we working on last time?" or "how did we fix that Docker networking issue?", the agent searches past sessions:

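The search uses SQLite's FTS5 with OR logic for broad recall. FTS5 treats space-separated terms as an implicit AND, so "Docker networking" only matches transcripts that contain both words. If that returns nothing, searching "Docker" OR "networking" usually finds it.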
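Here is a small sketch of that fallback using Python's built-in `sqlite3` module. It assumes your SQLite build includes FTS5 (most do). The `sessions` table, its columns, and the `search_sessions` helper are names invented for the example, not the actual schema.

```python
import sqlite3


def open_index() -> sqlite3.Connection:
    """Create an in-memory FTS5 index over transcripts (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE sessions USING fts5(session_id UNINDEXED, body)"
    )
    return conn


def search_sessions(conn: sqlite3.Connection, query: str, limit: int = 5):
    """Try a strict AND match first, then fall back to OR for broader recall."""
    terms = query.split()
    if not terms:
        return []
    # Quote each term so punctuation is not parsed as FTS5 query syntax.
    quoted = ['"' + t.replace('"', '""') + '"' for t in terms]
    sql = (
        "SELECT session_id, body FROM sessions "
        "WHERE sessions MATCH ? ORDER BY rank LIMIT ?"
    )
    for joiner in (" AND ", " OR "):
        rows = conn.execute(sql, (joiner.join(quoted), limit)).fetchall()
        if rows:
            return rows
    return []
```

Trying AND first keeps results tight when both words appear together, and the OR pass only runs when the strict query comes back empty. `ORDER BY rank` uses FTS5's built-in relevance ranking, so the best matches come first in either case.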

Cross-Thread Awareness

The real magic happens when all three tiers work together. A cron job runs every 6 hours, refreshing a "cross-thread context" file that summarizes recent activity across all channels. This gets injected into new conversations as background context.
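The refresh step could look something like the sketch below, which a cron entry would call every 6 hours. The event format (dicts with `time`, `channel`, and `summary`), the 24-hour window, the three-items-per-channel cap, and both function names are assumptions for illustration; the real job may summarize activity quite differently.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path


def build_cross_thread_context(events, now, window_hours=24, max_per_channel=3):
    """Summarize recent events per channel as plain text (hypothetical format)."""
    cutoff = now - timedelta(hours=window_hours)
    by_channel = {}
    # Newest first, so each channel keeps its most recent items.
    for ev in sorted(events, key=lambda e: e["time"], reverse=True):
        if ev["time"] < cutoff:
            continue
        bucket = by_channel.setdefault(ev["channel"], [])
        if len(bucket) < max_per_channel:
            bucket.append(ev["summary"])
    lines = ["Recent activity across threads:"]
    for channel in sorted(by_channel):
        lines.append(f"[{channel}]")
        lines.extend(f"- {s}" for s in by_channel[channel])
    return "\n".join(lines)


def refresh_context_file(path, events, now=None):
    """Rewrite the context file; intended to be run from a scheduled job."""
    now = now or datetime.now(timezone.utc)
    text = build_cross_thread_context(events, now)
    Path(path).write_text(text + "\n", encoding="utf-8")
```

Capping the number of items per channel keeps one noisy channel from crowding out the others, which matters because this file competes for the same prompt space as everything else injected into a new conversation.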

When I start a new thread on Discord, the agent already knows:

It's not perfect recall, but it's enough to feel like talking to something that pays attention.

Lessons

