Unsaved-changes guard:
- _closeSettingsPanel() intercepts all three close paths (X button, overlay
click, Escape key) and checks _settingsDirty before closing
- If dirty: shows inline 'Unsaved changes' bar with Save & Close / Discard
- Discard reverts the live theme preview to what it was when panel opened
- _markSettingsDirty() wired to all inputs via addEventListener in loadSettingsPanel()
- saveSettings() now resets dirty flag and hides the bar on successful save
Theme improvements:
- Add 'Slate' theme: warm charcoal (#2b2d30 bg), a softer/lighter dark option
that sits between Dark and the full light themes
- Rework 'Light' theme: replace pure white (#f5f5f7) with warm off-white
(#f0ede8) -- warmer, lower contrast, less harsh on most displays
- Update /theme command to include 'slate' in valid list
- Add test_settings_set_theme_slate() to test_sprint26.py
- Sprint 12 and 13 headers: add missing (COMPLETED) labels
- Sprint 23 header: corrected from 'Profile/Workspace/Model Coherence' to
'Agentic Transparency + Context Visibility' (what it actually shipped)
- Sprint 24 Track C: removed stale self-referential cleanup items that are now done
- Sprint 26 added: full plan for pluggable UI themes (light/dark/solarized/monokai/nord)
including CSS variable architecture, flicker prevention, /theme slash command,
settings picker with live preview, and test spec
- ROADMAP.md: add v0.32/v0.33 to sprint history table, add Sprint 25/26 to feature checklist
- SPRINTS.md footer: add horizon sprint line
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
Three bugs found during review:
1. Class is SessionDB not HermesState -- would silently no-op on every install
2. SessionDB.__init__ takes Path not str -- would crash with AttributeError
3. _execute_write() takes a callable not SQL+params -- wrong signature.
Replaced with public set_session_title() API.
4. Each call opened a persistent SQLite connection and never closed it.
Added try/finally db.close() to prevent WAL leak under sustained load.
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
WebUI sessions were invisible to 'hermes /insights' because the WebUI
bypasses the gateway and calls AIAgent.run_conversation() directly,
never writing to state.db.
New 'Sync usage to /insights' setting (default: off) that mirrors
WebUI session metadata (tokens, cost, model, title) into state.db
after each turn. Uses absolute token counts to avoid double-counting.
Components:
- api/state_sync.py: bridge module with sync_session_start() and
sync_session_usage(). Uses ensure_session() (idempotent) and
update_token_counts(absolute=True). All wrapped in try/except.
- api/config.py: new 'sync_to_insights' boolean setting
- api/streaming.py: calls sync_session_usage() after s.save()
- api/routes.py: same for the non-streaming chat path
- Settings UI: checkbox toggle with description
Default off because:
- Writing to state.db while CLI/gateway also writes could cause
WAL lock contention on busy systems
- Some users may not want WebUI sessions in /insights stats
Closes#92
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The context indicator in the composer footer now shows real data from
the agent's context compressor instead of hardcoded estimates:
- last_prompt_tokens / context_length (e.g. '12.4k / 200k (6%)')
- Bar color: blue <50%, yellow 50-75%, red >75%
- Hover tooltip shows exact numbers + compression threshold
- Cost appended when available
Backend: streaming.py now reads context_length, threshold_tokens, and
last_prompt_tokens from agent.context_compressor after run_conversation()
and includes them in the usage dict sent with the 'done' SSE event.
This matches the CLI's context window display (the bar that shows
current context vs total window).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The agent's run_conversation() already triggers context compression
internally, but the WebUI was unaware of the side effects:
1. Session ID rotation: compression creates a new session_id inside
the agent. The WebUI kept writing to the old session file, causing
silent data loss. Fix: detect agent.session_id mismatch after
run_conversation(), rename the session file, and update in-memory
caches.
2. No user notification: compression was invisible. Fix: emit a
'compressed' SSE event when compression is detected. Frontend shows
a system message and toast.
3. No manual control: Fix: add /compact slash command that sends a
message to the agent requesting context compression. Shows in the
autocomplete dropdown.
Detection works two ways:
- agent.session_id != original session_id (ID rotation)
- agent.context_compressor.compression_count > 0 (compressor state)
Closes#90
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The delete endpoint only removed sessions from the WebUI JSON store,
silently no-oping on CLI sessions (which live in state.db). The trash
button showed 'Conversation deleted' but the session reappeared on
next refresh.
Fix: after the existing WebUI delete, also call delete_cli_session()
which removes the session + messages from state.db. Wrapped in
try/except so WebUI-only sessions still delete normally.
New delete_cli_session() in api/models.py mirrors the existing
get_cli_session_messages() pattern for state.db access.
Closes#87
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When running tests without hermes-agent, 24 tests that depend on cron,
skills, approval, or agent backend modules now skip cleanly instead of
failing with 500 errors.
Detection: conftest.py checks if the agent dir exists and if cron.jobs
and tools.skills_tool are importable. When not available, an explicit
list of 24 test names is auto-marked with pytest.mark.skip.
Result:
- Without agent: 400 passed, 24 skipped, 0 failed
- With agent: all 424 tests run normally (skip logic is a no-op)
A warning banner prints at collection time:
"hermes-agent not found — 24 agent-dependent tests will be skipped"
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- routes.py /api/git-info: get_session raises KeyError on miss, does not
return None -- wrap in try/except KeyError to correctly return 404
(PR #82, api/routes.py line 222)
- style.css ctx-bar used undefined --teal CSS variable -- replaced with
--blue which is defined in :root and fits the existing color palette
(PR #83, static/style.css)
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
Agent review: hardcoded 128000 is wrong for Claude (200k), Gemini (1M),
and smaller models (8k-32k). Added a lookup table keyed by model name
substring covering major families with 128k fallback. TODO comment
for fetching exact values from server.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Agent review: l[0:2].strip() produced incorrect matches for git status
--porcelain XY format. Now checks both X (index) and Y (worktree)
columns for M/A/R status codes independently.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Agent review feedback: ordered array was constructed but never iterated
(the new code uses groups[] instead). Removed the dead variable.
Added comment noting function hoisting for _renderOneSession.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Shows a compact bar + label in the composer footer after the first
response, displaying input/output token counts, context window fill
percentage, and estimated cost. Bar turns yellow >50% and red >75%.
Updates on every response completion via the existing usage data from
the done SSE event. Hidden until first response (no usage data yet).
Inspired by PR #75 (@MartinNielsenDev).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the workspace root is a git repo, a badge in the panel header
shows the current branch name, dirty file count, and ahead/behind
status. Updates on every root directory load.
Backend:
- git_info_for_workspace() in api/workspace.py runs lightweight git
commands (rev-parse, status --porcelain, rev-list) with 3s timeout
- New GET /api/git-info endpoint returns branch, dirty count, modified,
untracked, ahead, behind
Frontend:
- _refreshGitBadge() in workspace.js fetches git info on root load
- Git badge element in panel header shows branch + status
- Badge turns gold when workspace has uncommitted changes
Inspired by PR #75 (@MartinNielsenDev).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Token events from SSE now buffer and render at most once per animation
frame via requestAnimationFrame, instead of calling renderMd() and
writing to the DOM on every single token event.
Before: ~100 tokens/sec = ~100 DOM writes/sec (causes jank on heavy output)
After: ~100 tokens/sec batched to ~60 DOM writes/sec (one per frame)
The change is a small wrapper: _scheduleRender() gates rendering behind
a rAF flag so multiple tokens arriving between frames are batched into
a single renderMd() + scrollIfPinned() call.
Inspired by PR #75 (@MartinNielsenDev).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Date group headers (Pinned, Today, Yesterday, Earlier) are now clickable
to collapse/expand their session lists. Collapsed state persists to
localStorage across page reloads.
- Refactored renderSessionListFromCache to group sessions first, then
render groups with collapsible wrappers
- Extracted _renderOneSession() helper for reuse within group bodies
- Chevron indicator rotates -90deg when collapsed
- Pinned group header keeps its gold color
Inspired by PR #75 (@MartinNielsenDev).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extracted genuinely new feature ideas from @MartinNielsenDev's PR #75
and added them to the Advanced/Future section of the roadmap:
- Subagent session tree (sidebar hierarchy with expand/collapse)
- Specialized tool card renderers (diff, terminal, todo views)
- Streaming performance (rAF-throttled token rendering)
- Git integration modal (branch/status/log in workspace)
- Collapsible date groups in session list
- LLM-generated session titles
- Workspace git detection (branch/dirty status)
- Clarify dialog (blocking agent questions)
- Gateway approval polling
- Unified session storage (SessionDB shared with CLI)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the user's config uses a non-Anthropic provider with an
Anthropic-compatible endpoint (e.g. MiniMax at
https://api.minimax.io/anthropic), chat in the WebUI fails silently
with APIConnectionError on every request, while the hermes CLI and
messaging gateway work fine with the same config.
Root cause: both api/routes.py and api/streaming.py constructed
AIAgent using only (model, provider, base_url) from
resolve_model_provider() and never passed api_key. When the base URL
ends in /anthropic, AIAgent uses the anthropic_messages adapter, but
only falls back to ANTHROPIC_TOKEN when provider == "anthropic" (a
safety check to avoid leaking Anthropic credentials to third parties).
For MiniMax and similar providers the effective key becomes "", and
the auth failure surfaces as a generic "Connection error" after three
retries.
The CLI and gateway resolve the key via
hermes_cli.runtime_provider.resolve_runtime_provider(), which reads
MINIMAX_API_KEY (and similar) from ~/.hermes/.env. This patch does the
same before creating the AIAgent in both chat paths.
Fixes#77
Matches the fix applied to api/config.py in PR #72. Both defaults
now consistently use ~/.hermes/webui for a clean generic install.
HERMES_WEBUI_STATE_DIR env var still overrides for anyone running
multiple instances.
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
The previous default pointed to 'webui-mvp' which is the internal
development repo name and meaningless to anyone deploying the public
repo. Changed to the generic '~/.hermes/webui' which is a sensible
default for any deployment.
The state dir remains fully overridable via HERMES_WEBUI_STATE_DIR
for anyone who wants to run multiple instances side by side.
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
When `pip install --target .` is run inside the hermes-agent checkout,
third-party package directories (openai/, pydantic/, requests/, etc.)
end up alongside real Hermes source files. With the agent dir at the
front of sys.path (insert(0)), Python resolves imports from those local
directories, breaking whenever the host platform differs from the
container (e.g. macOS .so files inside a Linux image).
Fix: append agent dir to sys.path instead of prepending. This lets
site-packages resolve pip packages correctly while still allowing
Hermes-specific modules (run_agent, hermes/, etc.) to resolve since
they do not exist in site-packages.
Also improves verify_hermes_imports() to surface the actual exception
message in startup logs, making it much easier to diagnose why a
module failed to import.
Fix five stacking/overflow bugs in static/style.css (no JS changes):
1. Profile dropdown overlaps chat messages
.topbar lacked a stacking context -- added position:relative;z-index:10
so the dropdown (z-index:200 child) always paints above .messages (z-index:0)
2. Workspace dropdown clipped by sidebar overflow:hidden
.sidebar overflow:hidden was swallowing the upward-opening ws-dropdown.
Changed to overflow:visible -- scroll is already on .session-list, not .sidebar.
Added position:relative;z-index:10;overflow:visible to .sidebar-bottom.
3. Slash-command dropdown could render behind tool cards
.composer-wrap had position:relative but no z-index.
Added z-index:10 so cmd-dropdown always sits above .messages (z-index:0).
4. Skill picker dropdown clipped inside Settings modal
.settings-panel had overflow-y:auto which clipped the absolute-positioned
skill picker. Changed to overflow:visible + display:flex;flex-direction:column,
moved overflow-y:auto to .settings-body, raised skill-picker-dropdown to z-index:1100.
5. CLI session badge blocks action buttons on hover
Added .session-item.cli-session:hover::after { display:none } so the gold
'cli' label hides on hover, making archive/delete/pin fully reachable.
6. Workspace dropdown name+path crowded on same line
.ws-opt was a plain block with inline spans. Added flex-direction:column;gap:4px
and display:block to each child so name and path stack cleanly on separate lines.
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
1. Image preview onerror fires on clearPreview (#68)
clearPreview() set previewImg.src='' which triggered the stale onerror
handler, showing 'Could not load image' on every refresh/message.
Fix: null out onerror before clearing src.
2. CLI session badge covers delete button (#69)
The ::after 'cli' label occupied the same space as the hover-revealed
.session-actions overlay, making delete unreachable.
Fix: add padding-right to .cli-session, use margin-left:auto to push
badge right, add pointer-events:none so clicks pass through.
3. Tool cards visible through profile dropdown
The .messages container had no stacking context, so tool cards could
render above the profile dropdown (z-index:200).
Fix: add position:relative;z-index:0 to .messages to establish a
stacking context that keeps all children below overlays.
Closes#68, closes#69
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The webui stores display-only fields on messages (attachments, timestamp,
_ts) for UI rendering. These leaked into the conversation_history passed
to AIAgent.run_conversation(). Most providers ignore unknown fields, but
Z.AI/GLM tries to deserialize 'attachments' as its native ChatAttachments
type, causing HTTP 400 on every subsequent message after an image upload.
Fix: _sanitize_messages_for_api() creates a clean copy with only
API-standard keys (role, content, tool_calls, tool_call_id, name,
refusal) before passing to run_conversation(). Applied to both the
streaming path (streaming.py) and non-streaming path (routes.py).
Closes#66
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add HERMES.md deep-dive, Why Hermes section in README, and screenshot layout
- HERMES.md: full why-Hermes document -- assistant vs. agent mental
model, three pillars (memory/scheduling/reach), four-category
taxonomy of AI tools, per-tool comparison sections with tables
(Claude Code, Codex CLI, OpenCode, Cursor/Copilot, Claude.ai),
compounding advantage, who it's for, what it's not, quick reference
- README: hero screenshot stays full-width; two new UI screenshots in
side-by-side HTML table with captions below
- README: new Why Hermes section with 6-bullet summary, comparison
table, and link to HERMES.md
- README: HERMES.md added to Docs section
- docs/images/: two UI screenshots (workspace browser, sessions view)
* docs: fact-check and update all comparisons; add Open Interpreter section
Researched current state of each tool before updating:
Claude Code:
- Scheduled jobs: now Partial (has /loop session-scoped, cloud-managed
/schedule via claude.ai/code, and desktop app automations); updated
table to reflect this with footnotes distinguishing self-hosted cron
- Persistent memory: Partial (CLAUDE.md, MEMORY.md, rolling auto-memory
but not full automatic cross-session recall)
- Provider-agnostic: No -- supports Bedrock/Vertex but Claude models only
- Web UI: Yes but Anthropic-hosted (not self-hosted)
Codex CLI:
- Persistent memory: Partial (session history + AGENTS.md since v0.100.0)
- Scheduled jobs: Partial (desktop app Automations only; CLI has no native
scheduling as of early 2026, open feature request)
- Provider-agnostic: Yes (10+ providers)
OpenCode:
- Web UI: now Yes (embedded in binary + official desktop app)
- Persistent memory: Partial (SQLite sessions + AGENTS.md, not semantic)
- Messaging: community Telegram bot only, not first-party
Open Interpreter: added as new comparison section
- Most common 'why not just use this' question; addressed head-on
- Session-scoped, no persistent memory by their own docs, no scheduler,
no messaging integration; powerful for one-shot tasks, not always-on
README Why Hermes table: updated to include Open Interpreter column,
fixed Claude Code self-hosted row (No -- scheduling runs on Anthropic
cloud), added footnotes for partial entries
* docs: add OpenClaw comparison; update category framework and quick reference table
OpenClaw (openclaw.ai, MIT, 347k stars) is the most direct Hermes
competitor -- both are open-source, self-hosted, always-on agents with
persistent memory, cron, and messaging integration. Added:
- Full OpenClaw section in HERMES.md with honest comparison: where it
wins (15+ messaging platforms incl. iMessage/WeChat, native Chrome CDP
browser control, voice wake words, ClawHub marketplace) and where
Hermes differs (self-improving skills system, Python/ML ecosystem,
web UI, multi-profile, sub-agent orchestration)
- Category 4 framework updated: now lists both Hermes and OpenClaw,
with the key architectural distinction called out
- Quick reference table expanded to include OpenClaw column (now 8 tools)
- New rows added: self-improving skills, browser/computer control,
Python/ML ecosystem
- README Why Hermes table updated: OpenClaw replaces OpenCode column,
self-improving skills row replaces generic skills row, callout line
at bottom addresses OpenClaw head-on
* docs: major accuracy pass -- OpenClaw deep-dive, Claude Code corrections, drop Open Interpreter
OpenClaw:
- Expanded comparison from a table to a full prose section with
'Where OpenClaw wins' / 'Where Hermes wins' structure
- Honest about OpenClaw strengths: 15+ messaging platforms, native
Chrome CDP browser control, voice wake words, 13k+ ClawHub skills
- Hermes advantages called out clearly: self-improving skills as a
first-class automatic loop (vs marketplace-install model), stability
(documented OpenClaw update regressions, Telegram breakage in early
2026, WhatsApp protocol instability), security (156 CVEs and 1,184
malicious skills found in ClawHub audit vs Hermes's no marketplace
attack surface), Python/ML ecosystem, full web UI vs dashboard-only,
and first-class multi-profile support
- Category 4 framework updated to name both Hermes and OpenClaw
- Table updated: added stability/security rows, corrected web UI row
(OpenClaw has a gateway dashboard but not a full chat UI)
Claude Code corrections (researched against official docs at code.claude.com):
- Skills/Hooks: changed from No to Yes -- has a full Hooks system (13
event types, 4 handler types) and a Plugin/Skills marketplace since
v2.0.12; unified with slash commands in v2.1.0
- Messaging: changed from No to Partial -- Channels feature (Telegram,
Discord, iMessage, Webhooks) in research preview since v2.1.80; deep
Slack integration that triggers cloud sessions and creates PRs
- Added Claude Cowork row: separate product with 38+ connectors
(Slack, Gmail, Teams, Notion, Jira, Salesforce, etc.)
- Scheduling footnote updated: cloud-managed has 1-hour minimum interval
- Provider-agnostic clarified: routes through Bedrock/Vertex but always
Claude models; cannot swap to GPT or Gemini
Open Interpreter removed:
- Less relevant comparison than OpenClaw for the 'always-on agent' frame
- Kept coverage focused on the tools people actually compare Hermes to
Quick reference table:
- Now 7 tools wide (added OpenClaw, kept Claude Code, Codex, OpenCode,
Cursor, Claude.ai, Hermes)
- New rows: self-improving skills, browser/computer control, stability
- Updated: Claude Code messaging to Partial, OpenClaw web UI to
'Dashboard only', skills rows differentiated by type
* docs: apply full editorial pass from hermes-edit-list.md
Writing patterns fixed:
- Em dashes reduced by ~80%; replaced with commas, periods, parens
- All 'Not X, it's Y' negative parallelism rewritten as positive
statements; 'What Hermes Is Not' section renamed 'Scope and Limits'
and reframed positively throughout
- 'It compounds.' standalone flourish removed
- 'meaningfully' removed everywhere (was appearing 3+ times)
- 'leverages' -> 'uses' in README
- 'remembers everything' softened to 'retains context across sessions'
- Bolded Hermes column in Quick Reference table un-bolded (only genuine
differentiator cells kept bold: self-improving skills, always-on,
orchestrates other agents)
- 'The honest summary' framing removed from OpenClaw section
- 'Hermes is different.' cliche transition cut from README
- Rule-of-three slogans trimmed (e.g. 'Same agent, same memory...')
- 'tired of re-explaining' -> 'don't want to re-explaining'
Duplicate content removed:
- 'day one / day one hundred' comparison kept only in Compounding
Advantage section; removed from Pillar 1
Factual accuracy fixes:
- Claude.ai comparison updated: memory now auto-generated from history
(not just user-curated); code execution and file read/write noted
as sandboxed (Artifacts), not flat No
- Category 2: Windsurf framed as 'earliest' on memory, Copilot
'catching up'; removed overconfident 'most mature' claim
- Category 4 qualifier: 'as of early 2026' added
- '1-hour minimum' for Claude Code cloud scheduling softened to
'minimum interval applies' (specific claim unverified)
- Claude Code scheduling table note: 'cloud or desktop-app only'
(was just 'cloud-managed or session-scoped')
- README claim 'No other open-source tool combines...' removed;
was false because OpenClaw does combine all three
- OpenClaw self-improving skills: 'No' -> 'Partial' with clarification
- README OpenClaw callout: 'relies on a marketplace' softened to
'skill system centers on a community marketplace'
- 'meaningfully more stable' -> 'more stable'; 'supply chain issues'
-> 'security incidents involving malicious skills'
- OpenClaw star count: '347k+' -> '~347k' (moving fast)
- Stability row added to OpenClaw table; bold removed from table
---------
Co-authored-by: Hermes <hermes@localhost>
Adds a server-side boolean setting (default: false) that controls whether
CLI sessions from state.db appear in the sidebar. Off by default so the
sidebar is clean until the user explicitly opts in.
- api/config.py: add show_cli_sessions to _SETTINGS_DEFAULTS and _SETTINGS_BOOL_KEYS
- api/routes.py: gate get_cli_sessions() call on the setting at request time
- static/index.html: checkbox in settings panel with description
- static/panels.js: load/save checkbox, refresh session list on save
- static/boot.js: load on startup alongside send_key and show_token_usage
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
The sessions table in the CLI state.db does not have a 'profile' column --
selecting s.profile caused an OperationalError which was silently caught by
'except Exception: return []', making get_cli_sessions() always return empty.
Fix: remove s.profile from the SELECT (it doesn't exist in the CLI schema)
and derive the profile from get_active_profile_name() instead, which is the
right value anyway since the CLI DB has no profile concept.
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
get_cli_sessions() and get_cli_session_messages() were using HERMES_HOME
(the profile the server was launched under) to find state.db. This meant
a server launched under the webui profile would read webui's state.db
(full of cron runs) instead of the user's actual CLI sessions.
Fix: use get_active_hermes_home() which tracks whichever profile the user
has selected in the UI. This means:
- default profile active -> reads ~/.hermes/state.db (interactive CLI)
- camanji profile active -> reads ~/.hermes/profiles/camanji/state.db
Falls back to HERMES_HOME env var if profiles module unavailable.
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>