Blog · May 2026 · 8 min read

Giving AI Memory Across Threads

The hardest part of running an AI agent isn't making it smart. It's making it remember. Every new conversation thread starts with amnesia — the agent has no idea what happened in other threads, what decisions were made, or what you told it yesterday. I solved this with a three-tier memory system that gives Hermes continuity across Discord, terminal, and cron jobs.

The Three Tiers

Not all memory is the same. I needed different mechanisms for different types of information:

Basic Memory — Always-on context, injected into every conversation turn
Fact Store — Structured entity data with search and relational queries
Session Search — Recall from past conversation transcripts

Each tier solves a different forgetting problem.

Tier 1: Basic Memory

Basic memory is a block of text injected into the system prompt of every conversation. It's always there, every turn, regardless of thread. Think of it as the agent's "always know" list.

This is where I store:

Environment facts (Unraid IP, Docker ports, SSH key locations)
User preferences (timezone, communication style, coding conventions)
Project summaries (which services run where)
Operational lessons (don't assume WSL, verify writes before claiming success)

The key constraint: basic memory has a character limit (8,000 chars in my config). It needs to be dense and factual, not procedural. Every entry earns its place by preventing a future question or correction.
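
A minimal sketch of how that budget could be enforced before the block reaches the system prompt. The 8,000-character limit comes from the config described above; the function names, the prompt layout, and the sample entries are illustrative assumptions, not the actual Hermes code.

```python
MAX_CHARS = 8_000  # limit quoted in the post; everything else here is hypothetical


def render_basic_memory(entries: list[str], limit: int = MAX_CHARS) -> str:
    """Join non-empty entries into one block and refuse to exceed the budget."""
    block = "\n".join(f"- {e.strip()}" for e in entries if e.strip())
    if len(block) > limit:
        raise ValueError(f"basic memory is {len(block)} chars, limit is {limit}")
    return block


def build_system_prompt(base_prompt: str, entries: list[str]) -> str:
    """Append the memory block to the system prompt sent on every turn."""
    return f"{base_prompt}\n\nAlways know:\n{render_basic_memory(entries)}"


# Sample entries taken from the operational lessons listed above.
entries = [
    "Do not assume WSL",
    "Verify writes before claiming success",
]
prompt = build_system_prompt("You are Hermes.", entries)
```

Raising an error instead of silently truncating is a deliberate choice in this sketch: a truncated entry can be worse than a missing one, and the failure forces a decision about what to cut.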

Tier 2: Fact Store (Holographic Memory)

Basic memory handles "what do I always need to know?" But what about structured queries like "what port does LanceOS run on?" or "what are the details of my AWS setup?" That's where the fact store comes in.

The fact store is a structured database with entity resolution and trust scoring. Each fact has:

Content: The actual fact text
Entity: What it's about (e.g., "LanceOS", "Unraid", "AWS CLI")
Category: Type classification (user_pref, project, tool, general)
Trust score: How reliable this fact is (based on feedback)
Tags: Searchable keywords
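
One way to picture a single record is as a small dataclass. The field names mirror the list above; the 0 to 1 trust scale, the neutral default of 0.5, and the validation rules are assumptions made for the sketch.

```python
from dataclasses import dataclass, field

# Category names as listed in the post.
CATEGORIES = {"user_pref", "project", "tool", "general"}


@dataclass
class Fact:
    content: str                     # the actual fact text
    entity: str                      # what the fact is about
    category: str = "general"        # one of CATEGORIES
    trust: float = 0.5               # assumed 0..1 scale with a neutral start
    tags: set[str] = field(default_factory=set)

    def __post_init__(self) -> None:
        # Reject records that would break later filtering or ranking.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        if not 0.0 <= self.trust <= 1.0:
            raise ValueError("trust must be between 0 and 1")


# Placeholder content; not a real fact from the author's store.
fact = Fact(content="placeholder fact text", entity="Unraid",
            category="tool", tags={"docker"})
```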

Three query operations do most of the work:

search("docker") — keyword search across all facts
probe("Unraid") — all facts about a specific entity
reason(["AWS", "LanceOS"]) — facts connecting multiple entities

This means the agent can answer "what services run on Unraid?" by probing the Unraid entity, or "how do AWS and my blog connect?" by reasoning across both entities.
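
The three operations can be sketched over an in-memory list. This is a toy under stated assumptions: matching is by plain substring, results are ordered by trust, and `reason` treats a fact as "connecting" entities when it mentions all of them in its entity, content, or tags. The real store may resolve entities quite differently, and the sample facts are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    content: str
    entity: str
    tags: frozenset[str] = frozenset()
    trust: float = 0.5


class FactStore:
    def __init__(self, facts):
        self.facts = list(facts)

    @staticmethod
    def _text(fact: Fact) -> str:
        # Everything searchable about a fact, lowercased for matching.
        return " ".join([fact.content, fact.entity, *fact.tags]).lower()

    @staticmethod
    def _ranked(facts) -> list[Fact]:
        # Higher-trust facts come first.
        return sorted(facts, key=lambda f: f.trust, reverse=True)

    def search(self, keyword: str) -> list[Fact]:
        """Keyword search across all facts."""
        kw = keyword.lower()
        return self._ranked(f for f in self.facts if kw in self._text(f))

    def probe(self, entity: str) -> list[Fact]:
        """All facts filed under one entity."""
        name = entity.lower()
        return self._ranked(f for f in self.facts if f.entity.lower() == name)

    def reason(self, entities: list[str]) -> list[Fact]:
        """Facts that mention every listed entity."""
        names = [e.lower() for e in entities]
        return self._ranked(
            f for f in self.facts if all(n in self._text(f) for n in names)
        )


# Made-up placeholder facts, for illustration only.
store = FactStore([
    Fact("Example: blog assets are deployed with the AWS CLI", "LanceOS",
         frozenset({"aws", "deploy"})),
    Fact("Example: runs Docker containers", "Unraid",
         frozenset({"docker"}), trust=0.9),
    Fact("Example: credentials live in a named profile", "AWS CLI",
         frozenset({"aws"})),
])

docker_hits = store.search("docker")
unraid_facts = store.probe("Unraid")
links = store.reason(["AWS", "LanceOS"])
```

Here `reason(["AWS", "LanceOS"])` returns only the first fact, because it is the only one that mentions both names.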

Tier 3: Session Search

For everything that happened in past conversations — the work done, the bugs fixed, the decisions made — there's session search. It's a full-text search index over all past conversation transcripts.

When the user says "what were we working on last time?" or "how did we fix that Docker networking issue?", the agent searches past sessions:

Browse mode: Recent sessions with titles and timestamps (zero LLM cost)
Search mode: Keyword search with LLM-generated summaries

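The search runs on SQLite FTS5 and joins query terms with OR for broad recall. FTS5 treats space-separated terms as an implicit AND, so "Docker networking" returns nothing unless a single transcript contains both words; searching "Docker" OR "networking" usually finds it.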
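
The difference between the strict and the broad query is easy to see with Python's built-in `sqlite3` module, assuming the bundled SQLite was compiled with FTS5 (most builds are). The table name, columns, and sample transcripts below are invented for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE sessions USING fts5(title, transcript)")
conn.executemany(
    "INSERT INTO sessions (title, transcript) VALUES (?, ?)",
    [
        ("Compose cleanup", "Rebuilt the Docker images and pruned old volumes."),
        ("Bridge debugging", "Traced the networking fault to a bad bridge config."),
        ("Blog deploy", "Published the new post."),
    ],
)


def to_or_query(text: str) -> str:
    """Quote each term and join with OR so any single term can match."""
    terms = [t.replace('"', '""') for t in text.split()]
    return " OR ".join(f'"{t}"' for t in terms)


def search_sessions(query: str) -> list[str]:
    """Return matching session titles, best match first."""
    rows = conn.execute(
        "SELECT title FROM sessions WHERE sessions MATCH ? ORDER BY rank",
        (query,),
    )
    return [title for (title,) in rows]


# Space-separated terms are an implicit AND: no single session has both words.
strict = search_sessions("Docker networking")

# The OR form matches a session containing either word.
broad = search_sessions(to_or_query("Docker networking"))
```

In this sample `strict` is empty, while `broad` finds both the Docker session and the networking session. Quoting each term also keeps user input from being parsed as FTS5 query syntax.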

Cross-Thread Awareness

The three tiers pay off when they work together. A cron job runs every 6 hours and refreshes a "cross-thread context" file that summarizes recent activity across all channels. That file is injected into new conversations as background context.
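
A rough sketch of what such a refresh job might do. The session record shape, the 24-hour window, the 2,000-character cap, and the file name are all assumptions; the real job presumably reads from the session index and may summarize with an LLM rather than reuse stored summaries.

```python
import tempfile
from datetime import datetime, timedelta, timezone
from pathlib import Path


def build_context(sessions, now, window=timedelta(hours=24), max_chars=2_000):
    """Summarize sessions that ended within `window`, most recent first."""
    recent = sorted(
        (s for s in sessions if now - s["ended_at"] <= window),
        key=lambda s: s["ended_at"],
        reverse=True,
    )
    lines = [
        f"[{s['channel']}] {s['ended_at']:%Y-%m-%d %H:%M} {s['summary']}"
        for s in recent
    ]
    return "\n".join(lines)[:max_chars]


def refresh(path: Path, sessions, now=None) -> str:
    """Rewrite the context file; intended to be called from a scheduled job."""
    now = now or datetime.now(timezone.utc)
    text = build_context(sessions, now)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(text, encoding="utf-8")
    tmp.replace(path)  # rename over the old file so readers see old or new, not half
    return text


# Placeholder sessions; a cron entry such as "0 */6 * * *" would trigger refresh().
now = datetime(2026, 5, 1, 12, 0, tzinfo=timezone.utc)
sessions = [
    {"channel": "discord", "ended_at": now - timedelta(hours=2),
     "summary": "placeholder summary A"},
    {"channel": "terminal", "ended_at": now - timedelta(days=3),
     "summary": "placeholder summary B"},
]
with tempfile.TemporaryDirectory() as tmpdir:
    context = refresh(Path(tmpdir) / "cross_thread_context.md", sessions, now=now)
```

Only the Discord session lands in the file here, since the terminal session falls outside the 24-hour window.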

When I start a new thread on Discord, the agent already knows:

What was discussed recently (from session search)
What's happening with active services (from fact store)
What the user's preferences and constraints are (from basic memory)

It's not perfect recall, but it's enough to feel like talking to something that pays attention.

Lessons

Separate "always know" from "can look up." Basic memory for essentials, fact store for details, session search for history.
Character limits force discipline. 8,000 chars of basic memory means every entry needs to earn its place.
Trust scoring catches stale facts. When facts get outdated, users rate them "unhelpful" and they sink in relevance.
Cross-thread context bridges the gap. Even without perfect recall, a summary of recent activity prevents the "I have no idea what you're talking about" experience.
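
The trust-scoring lesson can be sketched as a clamped score that feedback nudges up or down. The 0 to 1 scale, the 0.1 step, and the fact names are assumptions; the post only says that facts rated "unhelpful" sink in relevance.

```python
def apply_feedback(trust: float, helpful: bool, step: float = 0.1) -> float:
    """Nudge a trust score up or down and clamp it to the range [0, 1]."""
    delta = step if helpful else -step
    return min(1.0, max(0.0, trust + delta))


# Two hypothetical facts that start at the same neutral score.
scores = {"stale_fact": 0.5, "fresh_fact": 0.5}

# Two "unhelpful" ratings on the stale fact, one "helpful" on the fresh one.
scores["stale_fact"] = apply_feedback(scores["stale_fact"], helpful=False)
scores["stale_fact"] = apply_feedback(scores["stale_fact"], helpful=False)
scores["fresh_fact"] = apply_feedback(scores["fresh_fact"], helpful=True)

# Rank by trust so the stale fact sinks to the bottom of any result list.
ranking = sorted(scores, key=scores.get, reverse=True)
```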
