docs: rewrite HERMES.md with accurate 2026 market comparisons

* fix: correct /loop and scheduling claims for Claude Code

Three factual errors corrected:

1. /loop is a native bundled skill available without any plugin. The doc
   incorrectly described it as behavior from the ralph-wiggum plugin.
   ralph-wiggum provides /ralph-loop, which is distinct: it iterates toward
   a completion goal. /loop polls on a fixed schedule. Both exist and serve
   different purposes.

2. claude.ai/code/scheduled is not a real usable URL or scheduling interface.
   Removed the reference. Cloud scheduling is described as cloud-managed cron
   with a 1-hour minimum interval.

3. 'your data leaves your hardware' was only half-true. Desktop scheduled tasks
   run locally with full file access. Cloud tasks do leave your hardware. Rewrote
   to be precise: the real distinction vs Hermes cron is that neither option runs
   as a headless server daemon.

---------

Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
Committed by nesquena-hermes on 2026-04-11 15:45:38 -07:00 (via GitHub)
Parent: 1003fa410c
Commit: 09325f1bdf

HERMES.md

# Why Hermes

Hermes is a persistent, autonomous AI agent that runs on your server. It has layered memory that
accumulates across sessions, a cron scheduler that fires jobs while you're offline, and a
self-improving skills system that saves reusable procedures automatically. You reach it from a
terminal, a browser, or a messaging app — and it's the same agent with the same history every time.

This document explains the mental model, how Hermes compares to other tools honestly, and where
it is and is not the right choice.

---

## The real problem: most tools are excellent in the moment and weak over time

Memory is no longer a differentiator on its own. ChatGPT, Claude, Cursor, and GitHub Copilot all
have some form of memory now. Anthropic, OpenAI, and Microsoft are all shipping scheduling and
agent features. The category boundaries that existed twelve months ago are blurring fast.

Hermes is not the only tool with memory or automation. It is the tool that makes those
capabilities durable, self-hosted, cross-surface, and cumulative on your own server. The
distinction that matters is not "has memory" vs. "has no memory" — it's whether context persists
across sessions automatically, whether execution happens on hardware you control, whether you can
reach the same agent identity from any device, and whether the system gets meaningfully better at
your specific workflow over time without manual configuration.

```
Session-scoped:   You -> [Tool] -> Answer -> Done
                  (some tools now carry memory, but the execution is stateless)

Persistent agent: You <-> [Hermes] <-> (memory, skills, schedule, tools, surfaces)
                  (runs on your server, accumulates context, acts on your behalf offline)
```

---

## A note on convergence

The market is converging. Chat assistants are adding task scheduling and file connectors. IDE
tools are launching cloud agent modes. CLI tools are adding skills systems and mobile surfaces.
The lines between "assistant," "editor," and "agent" are dissolving.

This makes comparisons harder but also makes the question sharper: what actually matters when
every tool is claiming some version of every feature? For Hermes, the answer is synthesis. Any
single feature — memory, scheduling, messaging — is available somewhere else. The value is
having all of them in one self-hosted system, running continuously, with a persistent identity
that accumulates real knowledge of your stack over time.

---

## The three pillars

### 1. Memory that compounds

Hermes has layered memory that survives every session, every reboot, and every model swap:

- User profile — who you are, your preferences, your communication style, things you've corrected Hermes on
- Agent memory — facts about your environment, your toolchain, your project conventions
- Skills — reusable procedures Hermes discovers and saves automatically; it never has to relearn how to deploy your app, run your tests, or review a PR
- Session history — every past conversation is searchable; Hermes can recall what you worked on last Tuesday

When you correct Hermes, it remembers. When it solves a tricky problem, it saves the approach.
When it learns your stack, that knowledge carries into every future session. You never configure
this manually — it happens in the background as a side effect of normal use.
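The layered-memory idea can be sketched in a few lines of Python. This is an illustration, not Hermes's actual implementation: the directory layout (`profile.md`, `agent.md`, `skills/`) and the `load_context` helper are hypothetical names chosen to show how editable markdown layers could be concatenated into a new session's starting context.

```python
from pathlib import Path

def load_context(memory_dir: Path) -> str:
    """Concatenate memory layers into one context block for a new session.

    Hypothetical layout: each layer is a plain, human-editable markdown file.
    """
    parts = []
    for name in ("profile.md", "agent.md"):  # user profile, environment facts
        f = memory_dir / name
        if f.exists():
            parts.append(f"## {name}\n{f.read_text()}")
    # Saved skills are just more markdown files, one procedure each.
    for skill in sorted(memory_dir.glob("skills/*.md")):
        parts.append(f"## skill: {skill.stem}\n{skill.read_text()}")
    return "\n\n".join(parts)
```

Because each layer is an ordinary file, the memory stays inspectable and editable by hand.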

### 2. Autonomous scheduling

Hermes can run jobs without you present — every hour, every morning, or on any cron schedule. It
fires up a fresh session with full access to your memory and skills, runs the task, and delivers
the result wherever you want it: Telegram, Discord, Slack, Signal, WhatsApp, SMS, email, and more.

Things Hermes can do while you sleep:

- Review new pull requests on your GitHub repo and post a full verdict comment
- Send a morning briefing of news, markets, or anything else you track
- Run your test suite and alert you if something breaks
- Watch a competitor's blog for new posts and summarize them
- Monitor a datasource and notify you when a threshold is crossed

The difference from cloud-scheduled alternatives is that the job runs on your server, with your
memory and skills, and your data never leaves your hardware.
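The "headless server daemon" distinction is concrete enough to sketch. A minimal illustration in Python, not Hermes's actual scheduler: `next_daily_run` and `run_forever` are invented names, and full cron-expression parsing is omitted — this handles only the "every morning" case.

```python
import time
from datetime import datetime, timedelta

def next_daily_run(now: datetime, hour: int, minute: int = 0) -> datetime:
    """Next time a daily job scheduled at hour:minute should fire."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)  # today's slot already passed -> tomorrow
    return candidate

def run_forever(job, hour: int) -> None:
    """Headless daemon loop: compute the next slot, sleep until it, run, repeat."""
    while True:
        now = datetime.now()
        time.sleep((next_daily_run(now, hour) - now).total_seconds())
        job()  # e.g. spawn a fresh agent session and deliver the result
```

The point of the sketch is that nothing here needs a terminal, a desktop app, or a cloud account — it is an ordinary long-running process on your own server.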

### 3. Reach it from anywhere

Hermes runs on your server and is reachable from every surface: terminal over SSH, the web UI
(this project), and messaging apps including Telegram, Discord, Slack, WhatsApp, Signal, and
Matrix. Start a task from your phone, check it from the browser on your laptop, continue it in
a terminal on a remote server. The same agent, memory, and history follow you across all of them.

---

## How AI tools are layered today

The old four-category model — chat, editor, CLI, agent — is too clean. These layers are actively
collapsing into each other. Here is a more honest picture:

Chat assistants (Claude.ai, ChatGPT) now have persistent memory, task scheduling, 50+ service
connectors, and in some cases full agent modes with computer use. They are no longer "just chat."

IDE tools (Cursor, Windsurf, Copilot) have shipped or are shipping cross-session memory,
cloud-based background agents, and in Cursor's case a full Automations platform with Slack
integration. Cursor v3.0 (April 2026) is explicitly agent-first.

CLI tools (Claude Code, Codex, OpenCode) have added hooks, skills, desktop app automations,
and multi-surface reach. Claude Code now spans terminal, IDE, desktop, and browser. Codex has
become a product family: CLI, IDE extension, desktop app, and Codex Cloud.

Persistent self-hosted agents (Hermes, OpenClaw) sit at the intersection: they combine the
tool-use power of CLI agents, the memory of chat assistants, the scheduling of automation
platforms, and the cross-surface reach of messaging integrations — running continuously on
hardware you own.

The question is not which category a tool belongs to. The question is which combination of
capabilities you actually need, where that execution lives, and whether the system gets better
at your specific context over time.

---

## How Hermes compares

### vs. OpenClaw

OpenClaw is the most direct comparison and the question most people ask first. Both are
open-source, self-hosted, always-on agents with persistent memory, cron scheduling, and messaging
app integration. If you're evaluating Hermes, evaluate OpenClaw too.

OpenClaw (MIT) is built around a Gateway control plane written in Node.js/TypeScript. It has the
widest messaging coverage in the space — 24+ channels including WhatsApp, Telegram, Signal,
iMessage, LINE, WeChat, Slack, Discord, Teams, Matrix, Google Chat, Feishu, Mattermost, IRC,
Nextcloud Talk, and more. It has native Chrome/Chromium control via CDP, voice wake words on
macOS and iOS, and a ClawHub marketplace with 10,700+ skills. The community is large (350k+
GitHub stars, 16,900+ commits) and growing.

Hermes is built in Python and centers on a self-improving agent loop rather than a gateway
control plane. The core architectural difference is in skills: OpenClaw skills are primarily
human-authored plugins installed from a marketplace. Hermes writes and saves its own skills
automatically as part of every session. When Hermes solves a problem a new way, it saves the
procedure and reuses it without any user effort. That's not a subtle distinction — it's the
reason Hermes gets meaningfully better at your workflow without you maintaining a plugin library.

Two practical differences are worth stating directly:

Stability. OpenClaw's GitHub issues and community forums document recurring update-breaking
regressions. Telegram integration was broken across multiple releases from early 2026 through
at least April 2026. The unofficial WhatsApp Web protocol OpenClaw relies on disconnects and
requires periodic re-pairing; this is documented in OpenClaw's own FAQ. Hermes has had no
equivalent release breakages.

Security. ClawHub's open publishing model has been exploited at scale. Three separate audits in
early 2026 found serious problems: Koi Security (January 2026) linked 335 skills to a campaign
called "ClawHavoc" that delivered Atomic Stealer malware on macOS; Bitdefender found roughly
900 malicious packages, representing about 20% of the ecosystem at the time; and Snyk's
"ToxicSkills" report (February 2026) found malicious skills across roughly 4,000 scanned
packages. China's CNCERT issued a national warning about ClawHub. Hermes has no third-party
marketplace and a correspondingly smaller attack surface.

OpenClaw's genuine strengths are worth stating plainly: broader messaging coverage (iMessage,
LINE, WeChat, Teams, Google Chat — platforms Hermes does not support), native browser and
computer control via Chrome CDP, voice wake words, a larger community, and more third-party
integrations than Hermes. If those capabilities matter most to you, OpenClaw is worth a
serious look.

Where Hermes fits better: you want an agent that self-improves from experience without managing
a plugin library, you work in Python and want the ML/data science ecosystem, you want a stable
deployment that doesn't break between updates, or you want a full web chat UI rather than a
control dashboard.

| | OpenClaw | Hermes |
|---|---|---|
| Persistent memory | Yes | Yes |
| Scheduled jobs (cron) | Yes | Yes |
| Messaging app access | Yes (24+ platforms, incl. iMessage/WeChat/LINE) | Yes (many platforms) |
| Web UI | Chat UI + control dashboard | Full three-panel chat UI |
| Self-hosted | Yes | Yes |
| Open source | Yes (MIT) | Yes |
| Self-improving skills | Partial (AI can generate; not the default loop) | Yes (automatic, first-class) |
| Browser / computer control | Yes (native Chrome CDP) | Via shell / tools |
| Voice wake words | Yes (macOS/iOS) | No |
| Python / ML ecosystem | No (Node.js) | Yes |
| Multi-profile support | Via binding-rule routing | Yes (first-class named profiles) |
| Provider-agnostic | Yes | Yes |
| Update reliability | Moderate (documented regressions) | High |
| Memory inspectability | Limited | Yes (markdown files, editable) |
| Self-hosted autonomous execution | Yes | Yes |

### vs. Claude Code (Anthropic)

Claude Code is Anthropic's official agentic tool and one of the strongest options for focused
coding sessions. It has deep code understanding, shell access, file editing, and multi-step
reasoning. It has been expanding rapidly — it now spans terminal, IDE plugin, desktop app, and
browser surfaces — and the gap is closing in several areas.

What Claude Code has that's worth knowing:

- Hooks system — 26 event types (SessionStart, PreToolUse, PostToolUse, Stop, and more) with
  4 handler types (shell command, HTTP endpoint, LLM prompt, sub-agent); gives deterministic
  non-LLM control over the agent lifecycle
- Plugins / Skills — installable via `/plugin install`, hot-reloaded from `~/.claude/skills`,
  with a marketplace; includes the official ralph-wiggum plugin (`/ralph-loop`) for
  autonomous iteration toward a completion goal (distinct from `/loop`)
- `/loop` — a native bundled skill, available in every session without any plugin, that runs
  a prompt on a repeating schedule within an active CLI session (polling/monitoring use case);
  session-scoped, dies when the terminal closes
- Scheduling — cloud-managed cron (Anthropic infrastructure, minimum 1-hour interval) and
  desktop app scheduled tasks (run locally while the app is open, minimum 1-minute interval,
  full local file access); no self-hosted cron
- Messaging channels — Telegram, Discord, and iMessage via the Channels feature (research
  preview, requires Bun runtime); Slack is the most-requested addition and has not yet shipped
- Memory — CLAUDE.md and MEMORY.md for project-level context; auto-memory since v2.1.59+
- Claude Cowork — a separate knowledge-worker product connecting 38+ services via MCP,
  including Gmail, Microsoft Teams, Notion, Jira, Salesforce, and more

Claude Code's source was briefly and accidentally made public in March 2026 before being taken
down. The CLI ships as minified/bundled TypeScript compiled with Bun — it is not open source.

Key differences that remain:

- Scheduling requires cloud (Anthropic infrastructure, data off your hardware, 1-hour minimum)
  or the desktop app (runs locally, but the app must stay open — not a headless server process);
  neither runs as a server daemon the way Hermes cron does
- Memory is project-file-based (CLAUDE.md / MEMORY.md plus rolling auto-memory); it doesn't
  automatically accumulate a cross-project knowledge graph the way Hermes does
- Not provider-agnostic — it routes through Anthropic, Bedrock, Vertex, or Foundry, but always
  to a Claude model; you can't switch to GPT, Gemini, or a local model
- Messaging channels are still a research preview, not production

Hermes can use Claude Code as a sub-agent. For large implementation tasks, Hermes can spawn
Claude Code to handle the heavy lifting and fold the result back into its own memory and history.

| | Claude Code | Hermes |
|---|---|---|
| Persistent memory (automatic) | Partial (CLAUDE.md / MEMORY.md + auto-memory v2.1.59+) | Yes |
| Skills / hooks system | Yes (26-event Hooks + Plugin/Skills marketplace) | Yes (auto-generated from experience) |
| Scheduled jobs (self-hosted) | No (cloud or desktop-app only) | Yes |
| Messaging access | Partial (Telegram/Discord/iMessage research preview; Slack not yet) | Yes (many platforms, production) |
| Cowork connectors (Slack, Gmail, etc.) | Yes (via Claude Cowork, separate product) | Via agent tool use |
| Web UI | Yes (claude.ai/code, Anthropic-hosted) | Yes (self-hosted) |
| Provider-agnostic | No (Claude models only) | Yes (any provider) |
| Self-hosted scheduling | No | Yes |
| Open source | No | Yes |
| Background/cloud agent mode | Yes (cloud-scheduled) | Yes (self-hosted cron) |
| Runs as sub-agent of Hermes | Yes | N/A |
| Memory inspectability | Partial (CLAUDE.md readable; auto-memory less so) | Yes (markdown files) |

### vs. Codex CLI (OpenAI)

Codex CLI (Apache 2.0, ~60k GitHub stars) started as a straightforward terminal tool and has
expanded into a product family. It was rewritten from TypeScript to Rust. It now includes an IDE
extension, a desktop app with an Automations feature, and Codex Cloud for remote execution. A
Skills system is shared across surfaces. It supports 12+ built-in providers: OpenAI, Anthropic,
Google/Gemini, Mistral, Groq, Ollama, OpenRouter, LM Studio, Together AI, DeepSeek, xAI,
Azure OpenAI, and custom endpoints.

The CLI itself has no native scheduling (open feature request). Session continuity is available
via `codex resume`. Memory is session-history-based plus AGENTS.md project context — not a
living knowledge graph that accumulates across all your projects. There is no first-party
messaging integration. The Automations feature in the desktop app covers scheduled local tasks
but doesn't reach the cross-session, cross-surface continuity Hermes has.

| | Codex CLI | Hermes |
|---|---|---|
| Persistent memory | Partial (session history + AGENTS.md) | Yes (automatic, layered) |
| Scheduled jobs | Partial (desktop app Automations; CLI has none) | Yes |
| Messaging app access | No | Yes |
| Web UI | No (CLI + desktop app) | Yes (self-hosted) |
| Provider-agnostic | Yes (12+ providers) | Yes |
| Self-hosted | Yes | Yes |
| Open source | Yes (Apache 2.0) | Yes |
| Background/cloud agent mode | Yes (Codex Cloud) | Yes (self-hosted cron) |
| Self-improving skills | No | Yes |

### vs. OpenCode

OpenCode is an open-source TUI agentic coding assistant supporting 75+ providers. It has a WebUI
embedded in its binary, an official desktop app, SQLite session history, and AGENTS.md project
context. It supports CLAUDE.md as a fallback for users migrating from Claude Code. There are 30+
community plugins, and community messaging integrations exist for Telegram, Slack, Discord, and
Microsoft Teams — though none are first-party and all require manual setup.

OpenCode Go ($10/month) and OpenCode Zen (a curated model service) are subscription tiers. The
official GitHub Copilot integration launched in January 2026. There is no native scheduling,
though a community background plugin exists, and no automatic cross-session semantic memory.

| | OpenCode | Hermes |
|---|---|---|
| Persistent memory | Partial (session history + AGENTS.md) | Yes (automatic, layered) |
| Scheduled jobs | No (community plugin only) | Yes |
| Messaging app access | Community integrations only (Telegram/Slack/Discord/Teams) | Yes (first-party, many platforms) |
| Web UI | Yes (embedded + desktop app) | Yes (self-hosted) |
| Mobile access | No | Yes |
| Skills / plugins | Yes (30+ community plugins) | Yes (auto-generated, first-party) |
| Provider-agnostic | Yes (75+ providers) | Yes |
| Open source | Yes | Yes |
| Self-hosted autonomous execution | No | Yes |

### vs. Cursor

Cursor has changed substantially. The "no memory, no scheduling, no messaging" description was
accurate in 2024 and is wrong now.

Memories (a per-project cross-session knowledge base) shipped in beta with v1.0 in June 2025.
Automations launched March 5, 2026 — time-based, event-based (GitHub/Linear/PagerDuty), and
communication-based (Slack) triggers that fire background agents on cloud VMs. The web app,
mobile agent, and Slack bot give it multi-surface reach. Cursor v3.0 (April 2, 2026) is
explicitly agent-first with Design Mode and 30+ marketplace plugins. Cursor acquired Supermaven
for autocomplete. As of early 2026 it's valued at $29.3B with $2B ARR. It is not a narrow editor
tool anymore.

Hermes still has a different profile: it's self-hosted and server-resident, the same persistent
identity follows you across every surface without cloud intermediation, and it works with any
model family rather than being cloud-VM-based. For workflows that require data sovereignty,
self-hosted scheduling, or deep Python/ML tooling on your own hardware, Cursor's cloud-agent
architecture is a fundamental mismatch. For teams that want editor-native agents with strong
IDE integration, Cursor's recent evolution is significant.
| | Cursor | Windsurf | Copilot | Hermes |
|---|---|---|---|---|
| In-editor autocomplete | Excellent (Supermaven) | Excellent (Cascade) | Excellent | No |
| Inline diff / refactor | Yes | Yes | Yes | Via shell |
| Cross-session memory | Yes (Memories, per-project) | Yes (Cascade Memories, workspace) | Yes (Agentic Memory, repo-scoped, 28-day expiry) | Yes (automatic, persistent) |
| Scheduled background jobs | Yes (Automations, cloud VM) | No | Via Coding Agent (issue-driven) | Yes (self-hosted cron) |
| Messaging app / multi-surface | Yes (Slack bot, web app, mobile) | No | Via Copilot CLI / fleet | Yes (many platforms) |
| Background/cloud agent mode | Yes (Automations on cloud VMs) | No | Yes (Coding Agent, GA Mar 2026) | Yes (self-hosted) |
| Terminal tool use | Limited | Limited | Limited | Full |
| Self-hosted | No | No | No | Yes |
| Self-hosted autonomous execution | No | No | No | Yes |
| Provider-agnostic | Partial | Partial | No (GitHub models) | Yes |
| Open source | No | No | No | Yes |
| Memory inspectability | Partial | Yes (stored locally) | Limited | Yes (markdown files) |
### vs. Claude.ai and ChatGPT

These are no longer simple chat tools. The description of "no memory, no scheduling, no
messaging" is inaccurate for both.

Claude Cowork (in Claude Desktop) launched scheduled tasks on February 25, 2026 — hourly,
daily, weekly, weekdays, and on-demand. It runs in an isolated VM with file and shell access.
Claude has 50+ service connectors as of February 2026 including Slack (launched January 26,
2026), Gmail, Google Calendar, Google Drive, Microsoft 365, Notion, Asana, Linear, and Jira.
Memory auto-generates from chat history, not just user-curated entries. Code execution and
file access in Artifacts is sandboxed, not the same as shell access on your own server.
ChatGPT has Agent Mode (launched July 17, 2025), Scheduled Tasks (January 2025, recurring
automated prompts), a computer-using agent, Projects, 50+ connectors including Gmail, GitHub,
and Google Drive, dual-mode memory (auto + manual), and ChatGPT Pulse for Pro users (daily
research briefings). It is not a passive Q&A interface.

Where Claude.ai and ChatGPT differ from Hermes: neither is self-hosted, neither is
provider-agnostic, and neither gives you execution on your own hardware. Connectors and
scheduling exist, but they run on Anthropic's or OpenAI's infrastructure. Your memory, session
history, and agent execution live on their servers, not yours. For many use cases that's fine
— they are capable and well-supported. For privacy-conscious users, regulated environments, or
workflows that require persistent server-side execution on controlled hardware, it's a
disqualifying constraint.
| | Claude.ai | ChatGPT | Hermes |
|---|---|---|---|
| Memory across conversations | Yes (auto-generated from history) | Yes (dual-mode: auto + manual) | Yes (deep, automatic) |
| Scheduled tasks | Yes (Cowork: hourly/daily/weekly) | Yes (since Jan 2025) | Yes (any cron, self-hosted) |
| Service connectors / messaging | Yes (50+ via Cowork) | Yes (50+ connectors) | Yes (many platforms, direct) |
| Runs shell commands | Sandboxed (Cowork VM) | Sandboxed | Yes (full shell) |
| Code execution | Sandboxed | Sandboxed | Yes (full shell) |
| Reads / writes files | Sandboxed | Sandboxed | Yes (full filesystem) |
| Web UI | Yes (Anthropic-hosted) | Yes (OpenAI-hosted) | Yes (self-hosted) |
| Self-hosted | No | No | Yes |
| Provider-agnostic | No | No | Yes |
| Open source | No | No | Yes |
| Self-hosted autonomous execution | No | No | Yes |
| Memory inspectability | Limited | Limited | Yes (markdown files) |
---
## The compounding advantage

What distinguishes Hermes from most of the tools above is that it gets meaningfully better at
your specific workflow over time without manual configuration.
Every time Hermes encounters a new environment, it saves facts to memory. Every time it solves
a problem a new way, it saves the approach as a skill. Every time you correct it, it updates its
profile of you. Every session, every scheduled job, every tool call adds to a body of knowledge
that is specific to you, stored on your hardware, and available to every future interaction.

A Claude Code session on day one and day one hundred are identical — it starts fresh. A Hermes
agent on day one and day one hundred knows your stack, your conventions, your preferences, and
the solutions that have worked before. That's the actual compounding.
---
## Who Hermes is for

Solo developers and power users who don't want to re-explain their stack every session and want
an AI that actually knows their environment.

Teams on a shared server where multiple people want capable AI access without each paying for
a separate subscription or running separate local tooling.

Automation-heavy workflows where you want an AI running tasks on a schedule, delivering results
to your phone, without babysitting it.

Privacy-conscious users who want their conversations, memory, and files on their own hardware.

Multi-model users who want to switch between OpenAI, Anthropic, Google, DeepSeek, and others
based on cost, capability, or rate limits, without rebuilding their workflow each time.
---
## What Hermes is not

Hermes is not the best in-editor autocomplete tool. Cursor and Windsurf do that job better.
Use one alongside Hermes.

It is not zero-setup. You are running a server. That means initial configuration, and it means
you're responsible for uptime, upgrades, and backups. The tradeoff is data sovereignty and
control; that only makes sense if you actually want it.
It does not make weaker models magical. Memory and skills help, but the underlying model still
determines reasoning quality. Hermes with a weak model is a well-organized weak model.
It still needs guardrails, approvals, and observability for high-stakes automations. Autonomous
execution on a schedule with shell access is powerful and requires judgment about what to
approve. Terminal commands can require confirmation before running; use that for anything
consequential.
If you need the absolute lowest-friction path to a one-off answer or a quick edit, a chat
interface or an in-editor tool is the right call. Hermes is for continuity and autonomy, not
minimum-friction one-shots.
---
## Scope and limits
Hermes lives in the terminal, browser, and messaging apps. For in-editor autocomplete and inline
diffs, use Cursor or Windsurf — they do that job better and work well alongside Hermes.
You run Hermes on your own server. That means initial setup, but your data stays on your
hardware and you control the schedule, the models, and the costs.

Hermes is an orchestration and memory layer. It makes whatever model you point at it more useful
over time. The models do the reasoning; Hermes makes sure that reasoning accumulates into
something durable.
---
## Security and control

Memory is stored locally on your server as readable, editable files: user profile, agent memory,
and skills are all markdown. Session history is in SQLite on your machine. You can inspect,
edit, or delete any of it directly.
If you want external memory providers, eight are supported: Mem0, Honcho, Hindsight, RetainDB,
ByteRover, Supermemory, Holographic, and others. These are optional and configurable.
Execution runs in configurable backends: local shell, Docker, SSH, Daytona, Singularity, or
Modal. You choose what execution environment Hermes operates in and what it can reach.
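The backend list above describes a pluggable-executor design: one `run` interface, different isolation boundaries behind it. A minimal sketch of that idea, using hypothetical class names rather than Hermes's actual interface:

```python
import subprocess
from dataclasses import dataclass

# Illustrative only -- not Hermes's real executor API. The point: the agent
# calls one interface, and the deployment chooses the isolation boundary.

class LocalShell:
    """Runs commands directly on the host (least isolation, least setup)."""
    def run(self, cmd: str) -> str:
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        return result.stdout.strip()

@dataclass
class DockerShell:
    """Same interface, but each command runs inside a container (requires Docker)."""
    image: str
    def run(self, cmd: str) -> str:
        result = subprocess.run(
            ["docker", "run", "--rm", self.image, "sh", "-c", cmd],
            capture_output=True, text=True)
        return result.stdout.strip()

backend = LocalShell()  # swap in DockerShell("python:3.12") without touching callers
print(backend.run("echo hermes"))  # -> hermes
```

The same shape extends to SSH or remote-sandbox backends: callers never change, only the constructor does.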
Terminal commands can require confirmation before running. For any automation that touches
production systems or makes external calls, enable approval controls.
Secrets stay on your hardware. Hermes does not phone home; it calls whatever model APIs you
configure directly.
Multiple profiles give isolation between users or projects. A shared server can have separate
profiles with separate memory, separate skills, and separate history.
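Because memory is plain markdown on disk, ordinary file tools are the inspection API. A minimal sketch of what per-profile, file-based memory looks like, using a hypothetical directory layout (Hermes's real paths may differ):

```python
from pathlib import Path

# Hypothetical layout for illustration -- not Hermes's actual directory
# structure. The point: memory is markdown you can read, grep, edit, or
# delete with ordinary tools, and each profile is isolated on disk.
root = Path("hermes-demo")

# Two isolated profiles, each with its own memory file.
for profile, fact in [("alice", "- Prefers Python 3.12 and uv"),
                      ("bob", "- Deploys with Docker Compose")]:
    mem = root / "profiles" / profile / "memory.md"
    mem.parent.mkdir(parents=True, exist_ok=True)
    mem.write_text(f"# Agent memory ({profile})\n{fact}\n")

# Inspecting memory is just reading files -- no API calls required.
for mem in sorted(root.glob("profiles/*/memory.md")):
    print(mem.as_posix(), "->", mem.read_text().splitlines()[-1])
```

Editing a fact is a one-line change in a text editor; deleting a profile's memory is `rm`.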
---
## Quick reference
| | OpenClaw | Claude Code | Codex | OpenCode | Cursor | Copilot | Claude.ai | ChatGPT | Hermes |
|---|---|---|---|---|---|---|---|---|---|
| Persistent memory (auto) | Yes | Partial† | Partial | Partial | Yes (per-project) | Yes (repo-scoped‡) | Yes | Yes | Yes |
| Scheduled / background jobs | Yes | Partial§ | Partial¶ | No | Yes (Automations) | Via Coding Agent | Yes (Cowork) | Yes | Yes (self-hosted) |
| Messaging / multi-surface | Yes (24+ platforms) | Partial (preview) | No | Community only | Yes (Slack/web/mobile) | Via CLI/fleet | Yes (50+ connectors) | Yes (50+ connectors) | Yes (many platforms) |
| Web UI | Chat UI + control dashboard | Anthropic-hosted | No | Yes | Yes + mobile | github.com | Yes (Claude Desktop) | Yes | Yes (self-hosted) |
| Skills system | Yes (ClawHub marketplace) | Yes (Hooks + Plugins) | Partial (Skills) | Community plugins | Yes (marketplace) | No | No | No | Yes (auto-generated) |
| Self-improving skills | Partial | No | No | No | No | No | No | No | Yes |
| Browser / computer control | Yes (Chrome CDP) | No | No | No | No | No | No | Yes (CUA) | Via shell |
| In-editor autocomplete | No | No | Via extension | No | Excellent | Excellent | No | No | No |
| Orchestrates other agents | No | No | No | No | No | No | No | No | Yes |
| Provider-agnostic | Yes | No (Claude only) | Yes | Yes | Partial | No | No | No | Yes |
| Self-hosted | Yes | No | Yes (CLI) | Yes | No | No | No | No | Yes |
| Self-hosted autonomous execution | Yes | No | No | No | No | No | No | No | Yes |
| Background/cloud agent mode | Yes | Yes (cloud) | Yes (Codex Cloud) | No | Yes (cloud VMs) | Yes (Coding Agent) | Yes (Cowork VM) | Yes (Agent Mode) | Yes (self-hosted) |
| Memory inspectability | Limited | Partial | Partial | Partial | Partial | Limited | Limited | Limited | Yes (markdown files) |
| Open source | Yes (MIT) | No | Yes (Apache 2.0) | Yes | No | No | No | No | Yes |
| Always-on autonomous execution | Yes | No | No | No | No | No | No | No | Yes |
† Claude Code: CLAUDE.md / MEMORY.md project context plus auto-memory since v2.1.59+; no automatic cross-project accumulation
‡ Copilot Agentic Memory: public preview Jan 15, 2026; enabled by default Mar 4, 2026; repo-scoped, auto-expires after 28 days
§ Claude Code scheduling: cloud-managed (Anthropic infrastructure) or desktop-app only; no self-hosted cron
¶ Codex scheduling: desktop app Automations only; CLI has no native scheduling