How I Run an AI Agent as Infrastructure: Hermes on Autopilot
Most people interact with AI through a chat window. They type a message, get a response, and close the tab. I run mine 24/7 as infrastructure — with cron jobs, multiple platforms, persistent memory, and automated workflows. Here's the full stack.
The Gateway Architecture
Hermes runs as a Docker container on Unraid with multiple gateway profiles. Each profile connects to a different platform:
- Discord gateway: Always-on bot that responds to messages in channels and threads
- TUI/API gateway: Local HTTP API and terminal interface for direct interaction
- Home Assistant: Bidirectional integration with my smart home
- Webhook receiver: Accepts POST requests from external services
All profiles share the same persistent home directory, so memory, skills, and configuration are consistent across platforms.
Cron Jobs: The Autonomous Engine
The agent doesn't just respond — it acts on schedule. I have several cron jobs running:
- Climate monitor: Checks Home Assistant climate sensors every 30 minutes, alerts on anomalies
- Promised action watchdog: Scans session transcripts for unlogged commitments every 30 minutes
- Content idea generator: Weekly scan of trending topics for blog post ideas
- Analytics digest: Monthly Plausible analytics summary
Cron jobs run in isolated sessions with no current-chat context. This means the prompt must be self-contained — it can't rely on "what we were just talking about." This constraint forces clarity in task definitions.
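To make the self-contained constraint concrete, here is a minimal sketch of how a scheduled job could be described and linted. The `CronJob` class, the phrase list, and the prompt wording are all illustrative assumptions, not Hermes's actual job format; the only details taken from above are the climate monitor and its 30-minute interval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CronJob:
    name: str
    schedule: str  # standard five-field cron expression
    prompt: str    # everything the isolated session will know about the task


# Hypothetical phrases that hint a prompt leans on earlier chat context.
CONTEXT_DEPENDENT_PHRASES = (
    "as discussed",
    "like before",
    "same as last time",
    "what we talked about",
)


def context_leaks(job: CronJob) -> list[str]:
    """Return the context-dependent phrases found in the job's prompt."""
    lowered = job.prompt.lower()
    return [phrase for phrase in CONTEXT_DEPENDENT_PHRASES if phrase in lowered]


climate_monitor = CronJob(
    name="climate-monitor",
    schedule="*/30 * * * *",  # every 30 minutes
    prompt=(
        "Read every climate sensor exposed by Home Assistant. "
        "If any reading looks anomalous, send an alert that names the sensor "
        "and the reading. If nothing is anomalous, do nothing."
    ),
)

print(context_leaks(climate_monitor))  # an empty list means no leaks were found
```

A check like this is crude, but it catches the most common failure: a prompt written mid-conversation that quietly assumes the conversation is still there.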
Skills: Procedural Memory
Skills are reusable workflows stored as markdown files. Each skill has YAML frontmatter (metadata, trigger conditions) and a markdown body (step-by-step instructions). When a task matches a skill's trigger, the agent loads the skill and follows the procedure.
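As a rough sketch of that structure, the snippet below splits a skill file into frontmatter and body and does a naive keyword match against a task. The field names (`name`, `triggers`), the comma-separated trigger format, and the sample body are assumptions for illustration; the real skill schema and matching logic may differ, and a real parser would use a proper YAML library rather than this flat `key: value` reader.

```python
SKILL_TEXT = """\
---
name: systematic-debugging
triggers: debug, root cause, failing
---
1. Understand the failure before changing anything.
"""


def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Split a skill file into (frontmatter dict, markdown body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("skill file must start with a '---' frontmatter fence")
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            break
    else:
        raise ValueError("frontmatter fence is never closed")
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:  # skip lines that are not simple key: value pairs
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


def matches(meta: dict[str, str], task: str) -> bool:
    """Naive trigger check: does any trigger keyword appear in the task?"""
    task_lower = task.lower()
    keywords = [k.strip().lower() for k in meta.get("triggers", "").split(",")]
    return any(k and k in task_lower for k in keywords)


meta, body = parse_skill(SKILL_TEXT)
if matches(meta, "Please debug the failing deploy"):
    print(f"Loading skill {meta['name']}:\n{body}")
```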
Current skills include:
- Pre-flight: Mandatory gate before any multi-step task (plan before building)
- Systematic debugging: 4-phase root cause analysis (understand before fixing)
- Home server deployment: WSL → Unraid Docker Compose pipeline
- Google Health API: Query patterns for health data endpoints
Skills are version-controlled and updated when procedures change. If I discover a pitfall during a task, I patch the skill immediately rather than waiting until the task is finished.
The Memory Hierarchy
Memory is the hardest problem in running an AI agent. I use three tiers:
- Basic memory: Always-injected context (~8,000 chars). Environment facts, preferences, project summaries.
- Fact store: Structured entity data with search, probe, and relational queries. 56+ facts with trust scoring.
- Session search: Full-text search over past conversation transcripts. The agent's "recall" mechanism.
A cross-thread context file auto-refreshes every 6 hours, summarizing recent activity so new conversations start with awareness of what's been happening.
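Two pieces of this hierarchy are easy to sketch: a trust-filtered lookup over the fact store, and the staleness check behind the 6-hour refresh. Everything named here (`Fact`, `search`, `needs_refresh`, the 0.0 to 1.0 trust scale, the sample facts) is a hypothetical illustration, not the actual storage format or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

REFRESH_INTERVAL = timedelta(hours=6)  # the cross-thread refresh cadence


@dataclass(frozen=True)
class Fact:
    entity: str
    attribute: str
    value: str
    trust: float  # assumed scale: 0.0 (unverified) to 1.0 (confirmed)


def search(facts: list[Fact], entity: str, min_trust: float = 0.5) -> list[Fact]:
    """Return facts about an entity at or above min_trust, most trusted first."""
    hits = [f for f in facts if f.entity == entity and f.trust >= min_trust]
    return sorted(hits, key=lambda f: f.trust, reverse=True)


def needs_refresh(last_refresh: datetime, now: datetime) -> bool:
    """True once the cross-thread context file is at least 6 hours old."""
    return now - last_refresh >= REFRESH_INTERVAL


# Illustrative data only.
facts = [
    Fact("server", "platform", "Unraid", 0.9),
    Fact("server", "dashboard_port", "9119", 0.7),
    Fact("server", "location", "closet", 0.2),
]

for fact in search(facts, "server"):
    print(f"{fact.attribute} = {fact.value} (trust {fact.trust})")
```

The point of the trust threshold is that a low-confidence fact is excluded from the default lookup instead of being treated as settled.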
Monitoring and Health
The Hermes dashboard (port 9119) shows real-time status of all gateways, cron jobs, and connected platforms. Health checks run on each container, and failed cron jobs send alerts through Discord.
The key metrics I watch:
- Gateway uptime: Both Discord and API gateways should be "connected"
- Cron job success rate: Failed jobs get flagged within 30 minutes
- Token usage: Monthly spend across all providers
- Memory size: Basic memory should stay under 50% capacity
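The checks in that list can be folded into a single pass over a status snapshot. This is a sketch under assumptions: the `Snapshot` shape and its field names are invented for illustration and do not reflect what the dashboard actually exposes. Token spend is left out because it is reviewed monthly and has no fixed threshold here.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    gateways: dict[str, str]      # gateway name -> reported status
    failed_cron_jobs: list[str]   # names of jobs whose last run failed
    basic_memory_chars: int       # characters currently in basic memory
    basic_memory_capacity: int    # maximum characters allowed


MEMORY_THRESHOLD = 0.5  # basic memory should stay under 50% capacity


def evaluate(snapshot: Snapshot) -> list[str]:
    """Return a human-readable problem list; empty means healthy."""
    problems = []
    for name, status in sorted(snapshot.gateways.items()):
        if status != "connected":
            problems.append(f"gateway {name} is {status}")
    for job in snapshot.failed_cron_jobs:
        problems.append(f"cron job {job} failed")
    ratio = snapshot.basic_memory_chars / snapshot.basic_memory_capacity
    if ratio >= MEMORY_THRESHOLD:
        problems.append(f"basic memory at {ratio:.0%} of capacity")
    return problems


snapshot = Snapshot(
    gateways={"discord": "connected", "api": "disconnected"},
    failed_cron_jobs=["climate-monitor"],
    basic_memory_chars=6000,
    basic_memory_capacity=8000,
)
for problem in evaluate(snapshot):
    print(problem)
```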
The Cost
Running an AI agent 24/7 sounds expensive. It isn't:
- Hardware: Unraid server already running (no incremental cost)
- Docker: Two containers, ~512MB RAM each
- LLM API: Varies by usage, roughly $20-50/month with current workloads
- Hosting: $0 (self-hosted)
Total: ~$20-50/month for an always-on, multi-platform AI agent with persistent memory and automated workflows.
Lessons from Months of Runtime
- Treat your agent like infrastructure. Health checks, monitoring, automated restarts. "It works when I chat with it" isn't production.
- Memory discipline pays off. Every fact saved in basic memory prevents a future question. Every skill saved prevents relearning a workflow.
- Cron jobs are where the magic is. An agent that only responds to messages is a chatbot. An agent that acts on schedule is a system.
- Verify everything. AI agents are confidently wrong. After any write, deploy, or configuration change, verify that it actually happened.
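The last lesson translates directly into code: pair every action with an independent check and fail loudly when the check does not pass. The helper below is a minimal sketch of that pattern; `apply_and_verify` is a hypothetical name, and the file write is only a stand-in for a deploy or configuration change.

```python
import tempfile
from pathlib import Path
from typing import Callable


def apply_and_verify(
    action: Callable[[], object],
    check: Callable[[], bool],
    description: str,
) -> None:
    """Run an action, then confirm its effect with an independent check."""
    action()
    if not check():
        raise RuntimeError(f"verification failed: {description}")


# Example: write a config file, then read it back instead of trusting the write.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "settings.txt"
    content = "refresh_hours=6\n"
    apply_and_verify(
        action=lambda: target.write_text(content, encoding="utf-8"),
        check=lambda: target.read_text(encoding="utf-8") == content,
        description=f"write {target.name}",
    )
    print("write verified")
```

The check should observe the result through a different path than the action used (read the file back, query the running container, fetch the live URL), because an agent's own report that something succeeded is not evidence that it did.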
I set up and manage AI agent infrastructure — 24/7 operation, cron automation, memory systems, multi-platform gateways.