Blog · May 2026 · 11 min read

How I Run an AI Agent as Infrastructure: Hermes on Autopilot

Most people interact with AI through a chat window. They type a message, get a response, and close the tab. I run mine 24/7 as infrastructure — with cron jobs, multiple platforms, persistent memory, and automated workflows. Here's the full stack.

The Gateway Architecture

Hermes runs as a Docker container on Unraid with multiple gateway profiles. Each profile connects to a different platform:

Discord gateway: Always-on bot that responds to messages in channels and threads
TUI/API gateway: Local HTTP API and terminal interface for direct interaction
Home Assistant: Bidirectional integration with my smart home
Webhook receiver: Accepts POST requests from external services

All profiles share the same persistent home directory, so memory, skills, and configuration are consistent across platforms.
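
To make the webhook receiver concrete, here's a minimal sketch using only Python's standard library. It is not Hermes's actual webhook interface: the /hook path, port 8099, and the "event" payload field are all hypothetical names I picked for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_webhook(raw_body: bytes) -> tuple:
    """Validate a webhook payload and return (HTTP status, response body)."""
    try:
        payload = json.loads(raw_body)
    except ValueError:
        # Covers both malformed JSON and undecodable bytes.
        return 400, {"error": "body is not valid JSON"}
    if not isinstance(payload, dict) or "event" not in payload:
        return 422, {"error": "missing 'event' field"}
    # A real receiver would hand the event to the agent at this point.
    return 202, {"accepted": payload["event"]}


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/hook":  # hypothetical path
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, body = handle_webhook(self.rfile.read(length))
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(host: str = "127.0.0.1", port: int = 8099) -> None:
    """Block forever, accepting POST requests (port is an assumption)."""
    HTTPServer((host, port), WebhookHandler).serve_forever()
```

Keeping the validation in a plain function (handle_webhook) separate from the HTTP plumbing means the logic can be checked without opening a socket.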

Cron Jobs: The Autonomous Engine

The agent doesn't just respond — it acts on schedule. I have several cron jobs running:

Climate monitor: Checks Home Assistant climate sensors every 30 minutes, alerts on anomalies
Promised action watchdog: Scans session transcripts for unlogged commitments every 30 minutes
Content idea generator: Weekly scan of trending topics for blog post ideas
Analytics digest: Monthly Plausible analytics summary

Cron jobs run in isolated sessions with no current-chat context. This means the prompt must be self-contained — it can't rely on "what we were just talking about." This constraint forces clarity in task definitions.
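
The self-contained-prompt constraint can be sketched as follows. This is a simple interval scheduler, not a cron-expression parser, and not how Hermes schedules jobs internally. The ScheduledJob class, the run_isolated callback, and the example prompts are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List, Optional


@dataclass
class ScheduledJob:
    name: str
    interval: timedelta
    prompt: str  # must make sense with no chat history at all
    last_run: Optional[datetime] = None

    def is_due(self, now: datetime) -> bool:
        return self.last_run is None or now - self.last_run >= self.interval


def run_due_jobs(jobs: List[ScheduledJob], now: datetime,
                 run_isolated: Callable[[str], None]) -> List[str]:
    """Run each due job in a fresh session and return the names that ran."""
    ran = []
    for job in jobs:
        if job.is_due(now):
            run_isolated(job.prompt)  # the prompt is the only input
            job.last_run = now
            ran.append(job.name)
    return ran


JOBS = [
    ScheduledJob(
        "climate-monitor",
        timedelta(minutes=30),
        "Read the Home Assistant climate sensors and alert on anomalies.",
    ),
    ScheduledJob(
        "content-ideas",
        timedelta(weeks=1),
        "Scan trending topics and list candidate blog post ideas.",
    ),
]
```

Because run_isolated receives nothing but the prompt string, a prompt like "continue what we discussed" fails immediately, which is the clarity the constraint forces.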

Skills: Procedural Memory

Skills are reusable workflows stored as markdown files. Each skill has YAML frontmatter (metadata, trigger conditions) and a markdown body (step-by-step instructions). When a task matches a skill's trigger, the agent loads the skill and follows the procedure.

Current skills include:

Pre-flight: Mandatory gate before any multi-step task (plan before building)
Systematic debugging: 4-phase root cause analysis (understand before fixing)
Home server deployment: WSL → Unraid Docker Compose pipeline
Google Health API: Query patterns for health data endpoints

Skills are version-controlled and updated when procedures change. If I discover a pitfall during a task, I patch the skill immediately instead of waiting until the task is done.
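
Here's a rough sketch of the frontmatter-plus-body layout and trigger matching. The field names (name, triggers) and the substring matching rule are assumptions, not the real skill schema. The parser handles only flat "key: value" lines, a small subset of YAML; real code would use a proper YAML library.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    meta: dict
    body: str


def parse_skill(text: str) -> Skill:
    """Split a skill file into frontmatter metadata and markdown body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("skill file must start with a '---' line")
    end = None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            end = i
            break
    if end is None:
        raise ValueError("frontmatter is never closed")
    meta = {}
    for line in lines[1:end]:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {line!r}")
        meta[key.strip()] = value.strip()
    return Skill(meta, "\n".join(lines[end + 1:]).strip())


def matches(skill: Skill, task: str) -> bool:
    """Naive trigger check: any comma-separated phrase appears in the task."""
    phrases = [p.strip().lower() for p in skill.meta.get("triggers", "").split(",")]
    return any(p and p in task.lower() for p in phrases)


EXAMPLE = """---
name: systematic-debugging
triggers: debug, root cause
---
1. Reproduce the failure.
2. Understand before fixing.
"""
```

Calling parse_skill(EXAMPLE) yields the two metadata keys and the numbered steps as the body; matches() then decides whether a task description should load the skill.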

The Memory Hierarchy

Memory is the hardest problem in running an AI agent. I use three tiers:

Basic memory: Always-injected context (~8,000 chars). Environment facts, preferences, project summaries.
Fact store: Structured entity data with search, probe, and relational queries. 56+ facts with trust scoring.
Session search: Full-text search over past conversation transcripts. The agent's "recall" mechanism.

A cross-thread context file auto-refreshes every 6 hours, summarizing recent activity so new conversations start with awareness of what's been happening.
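
The three tiers can be modeled roughly like this. It's a toy in-memory sketch, not the actual storage layer: the Fact fields, the 0.0 to 1.0 trust scale, the min_trust default, and the substring search are all assumptions. Only the ~8,000-character budget comes from the description above.

```python
from dataclasses import dataclass, field
from typing import List

BASIC_MEMORY_LIMIT = 8_000  # characters, the budget quoted above


@dataclass
class Fact:
    entity: str
    statement: str
    trust: float  # assumed scale: 0.0 (unverified) to 1.0 (confirmed)


@dataclass
class Memory:
    basic: List[str] = field(default_factory=list)        # tier 1: always injected
    facts: List[Fact] = field(default_factory=list)       # tier 2: queried on demand
    transcripts: List[str] = field(default_factory=list)  # tier 3: full-text recall

    def basic_usage(self) -> float:
        """Fraction of the basic-memory budget currently in use."""
        return sum(len(entry) for entry in self.basic) / BASIC_MEMORY_LIMIT

    def remember(self, entry: str) -> None:
        used = sum(len(e) for e in self.basic)
        if used + len(entry) > BASIC_MEMORY_LIMIT:
            raise ValueError("basic memory full; move detail to the fact store")
        self.basic.append(entry)

    def search_facts(self, query: str, min_trust: float = 0.5) -> List[Fact]:
        """Return matching facts at or above min_trust, most trusted first."""
        q = query.lower()
        hits = [
            f for f in self.facts
            if f.trust >= min_trust
            and (q in f.entity.lower() or q in f.statement.lower())
        ]
        return sorted(hits, key=lambda f: f.trust, reverse=True)

    def search_sessions(self, query: str) -> List[str]:
        q = query.lower()
        return [t for t in self.transcripts if q in t.lower()]
```

The point of the split is cost: tier 1 is paid for on every request, so it stays small, while tiers 2 and 3 are only consulted when a query needs them.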

Monitoring and Health

The Hermes dashboard (port 9119) shows real-time status of all gateways, cron jobs, and connected platforms. Health checks run on each container, and failed cron jobs send alerts through Discord.

The key metrics I watch:

Gateway uptime: Both the Discord and API gateways should report "connected"
Cron job success rate: Failed jobs get flagged within 30 minutes
Token usage: Monthly spend across all providers
Memory size: Basic memory should stay under 50% capacity
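
A minimal sketch of turning those four metrics into alerts. This isn't how the Hermes dashboard computes anything: the Snapshot fields are hypothetical, and the $50 budget is my own threshold taken from the top of my usual spend range. The 50% memory threshold and the 8,000-character limit come from the figures in this post.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Snapshot:
    gateway_states: Dict[str, str]  # e.g. {"discord": "connected"}
    cron_runs: int
    cron_failures: int
    month_spend_usd: float
    basic_memory_chars: int


def evaluate(s: Snapshot, budget_usd: float = 50.0,
             memory_limit: int = 8_000) -> List[str]:
    """Return one alert string per metric that is out of bounds."""
    alerts = []
    for name, state in s.gateway_states.items():
        if state != "connected":
            alerts.append(f"gateway {name} is {state}")
    if s.cron_failures > 0:
        alerts.append(f"{s.cron_failures}/{s.cron_runs} cron runs failed")
    if s.month_spend_usd > budget_usd:
        alerts.append(
            f"spend ${s.month_spend_usd:.2f} exceeds ${budget_usd:.2f}"
        )
    if s.basic_memory_chars > 0.5 * memory_limit:
        alerts.append("basic memory is over 50% of capacity")
    return alerts
```

An empty list means everything is healthy; anything else is what would get posted to the alert channel.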

The Cost

Running an AI agent 24/7 sounds expensive. It isn't:

Hardware: Unraid server already running (no incremental cost)
Docker: Two containers, ~512 MB RAM each
LLM API: Varies by usage, roughly $20-50/month with current workloads
Hosting: $0 (self-hosted)

Total: ~$20-50/month for an always-on, multi-platform AI agent with persistent memory and automated workflows.

Lessons from Months of Runtime

Treat your agent like infrastructure. Health checks, monitoring, automated restarts. "It works when I chat with it" isn't production.
Memory discipline pays off. Every fact saved in basic memory prevents a future question. Every skill saved prevents relearning a workflow.
Cron jobs are where the magic is. An agent that only responds to messages is a chatbot. An agent that acts on schedule is a system.
Verify everything. AI agents are confidently wrong. After any write, deploy, or configuration change, verify that it actually happened.
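
For file writes, the "verify everything" lesson can be as simple as reading back what was written. This is a minimal sketch with a hypothetical write_and_verify helper; deploys and configuration changes need their own checks (for example, querying the running service).

```python
import hashlib
from pathlib import Path


def write_and_verify(path: Path, content: str) -> str:
    """Write content, read it back, and raise if the bytes on disk differ.

    Returns the SHA-256 hex digest of the verified bytes so it can be logged.
    """
    expected = content.encode("utf-8")
    path.write_bytes(expected)
    actual = path.read_bytes()
    if actual != expected:
        raise RuntimeError(f"verification failed for {path}")
    return hashlib.sha256(actual).hexdigest()
```

Logging the returned digest gives a later session something concrete to compare against, instead of trusting a claim that the write succeeded.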

Need help with this?

I set up and manage AI agent infrastructure — 24/7 operation, cron automation, memory systems, multi-platform gateways.
