Commit Graph

252 Commits

Author SHA1 Message Date
nesquena-hermes
dc6be230ce fix: start.sh state dir default webui-mvp -> webui (#73)
Matches the fix applied to api/config.py in PR #72. Both defaults
now consistently use ~/.hermes/webui for a clean generic install.
HERMES_WEBUI_STATE_DIR env var still overrides for anyone running
multiple instances.

Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
2026-04-04 11:29:34 -07:00
nesquena-hermes
123207e0a6 fix: default STATE_DIR to ~/.hermes/webui instead of webui-mvp (#72)
The previous default pointed to 'webui-mvp' which is the internal
development repo name and meaningless to anyone deploying the public
repo. Changed to the generic '~/.hermes/webui' which is a sensible
default for any deployment.

The state dir remains fully overridable via HERMES_WEBUI_STATE_DIR
for anyone who wants to run multiple instances side by side.

Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
2026-04-04 11:27:11 -07:00
Varun Chopra
d05e15e612 fix: resolve pip packages from site-packages instead of agent dir
When `pip install --target .` is run inside the hermes-agent checkout,
third-party package directories (openai/, pydantic/, requests/, etc.)
end up alongside real Hermes source files. With the agent dir at the
front of sys.path (insert(0)), Python resolves imports from those local
directories, breaking whenever the host platform differs from the
container (e.g. macOS .so files inside a Linux image).

Fix: append agent dir to sys.path instead of prepending. This lets
site-packages resolve pip packages correctly while still allowing
Hermes-specific modules (run_agent, hermes/, etc.) to resolve since
they do not exist in site-packages.

Also improves verify_hermes_imports() to surface the actual exception
message in startup logs, making it much easier to diagnose why a
module failed to import.
2026-04-04 23:29:33 +05:30
nesquena-hermes
2b92fe0aa9 fix: overlay z-index clipping, CLI badge hover, workspace dropdown spacing (#71)
Fix five stacking/overflow bugs in static/style.css (no JS changes):

1. Profile dropdown overlaps chat messages
   .topbar lacked a stacking context -- added position:relative;z-index:10
   so the dropdown (z-index:200 child) always paints above .messages (z-index:0)

2. Workspace dropdown clipped by sidebar overflow:hidden
   .sidebar overflow:hidden was swallowing the upward-opening ws-dropdown.
   Changed to overflow:visible -- scroll is already on .session-list, not .sidebar.
   Added position:relative;z-index:10;overflow:visible to .sidebar-bottom.

3. Slash-command dropdown could render behind tool cards
   .composer-wrap had position:relative but no z-index.
   Added z-index:10 so cmd-dropdown always sits above .messages (z-index:0).

4. Skill picker dropdown clipped inside Settings modal
   .settings-panel had overflow-y:auto which clipped the absolute-positioned
   skill picker. Changed to overflow:visible + display:flex;flex-direction:column,
   moved overflow-y:auto to .settings-body, raised skill-picker-dropdown to z-index:1100.

5. CLI session badge blocks action buttons on hover
   Added .session-item.cli-session:hover::after { display:none } so the gold
   'cli' label hides on hover, making archive/delete/pin fully reachable.

6. Workspace dropdown name+path crowded on same line
   .ws-opt was a plain block with inline spans. Added flex-direction:column;gap:4px
   and display:block to each child so name and path stack cleanly on separate lines.

Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
2026-04-04 09:34:43 -07:00
Nathan Esquenazi
dac58c8162 Merge pull request #70 from nesquena/fix/ui-polish-cli-badge-image-error-zindex
fix: UI polish -- image error, CLI badge overlap, dropdown z-index
2026-04-04 09:18:01 -07:00
Nathan Esquenazi
6a4b20f3f2 fix: three UI glitches -- image error, CLI badge overlap, dropdown z-index
1. Image preview onerror fires on clearPreview (#68)
   clearPreview() set previewImg.src='' which triggered the stale onerror
   handler, showing 'Could not load image' on every refresh/message.
   Fix: null out onerror before clearing src.

2. CLI session badge covers delete button (#69)
   The ::after 'cli' label occupied the same space as the hover-revealed
   .session-actions overlay, making delete unreachable.
   Fix: add padding-right to .cli-session, use margin-left:auto to push
   badge right, add pointer-events:none so clicks pass through.

3. Tool cards visible through profile dropdown
   The .messages container had no stacking context, so tool cards could
   render above the profile dropdown (z-index:200).
   Fix: add position:relative;z-index:0 to .messages to establish a
   stacking context that keeps all children below overlays.

Closes #68, closes #69

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 09:09:31 -07:00
Nathan Esquenazi
90b5ad8d99 fix: strip webui metadata from messages before sending to LLM API (#67)
The webui stores display-only fields on messages (attachments, timestamp,
_ts) for UI rendering. These leaked into the conversation_history passed
to AIAgent.run_conversation(). Most providers ignore unknown fields, but
Z.AI/GLM tries to deserialize 'attachments' as its native ChatAttachments
type, causing HTTP 400 on every subsequent message after an image upload.

Fix: _sanitize_messages_for_api() creates a clean copy with only
API-standard keys (role, content, tool_calls, tool_call_id, name,
refusal) before passing to run_conversation(). Applied to both the
streaming path (streaming.py) and non-streaming path (routes.py).

Closes #66

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 22:13:12 -07:00
nesquena-hermes
57a4f573f6 docs: HERMES.md deep-dive, Why Hermes in README, screenshot layout
* docs: add HERMES.md deep-dive, Why Hermes section in README, and screenshot layout

- HERMES.md: full why-Hermes document -- assistant vs. agent mental
  model, three pillars (memory/scheduling/reach), four-category
  taxonomy of AI tools, per-tool comparison sections with tables
  (Claude Code, Codex CLI, OpenCode, Cursor/Copilot, Claude.ai),
  compounding advantage, who it's for, what it's not, quick reference
- README: hero screenshot stays full-width; two new UI screenshots in
  side-by-side HTML table with captions below
- README: new Why Hermes section with 6-bullet summary, comparison
  table, and link to HERMES.md
- README: HERMES.md added to Docs section
- docs/images/: two UI screenshots (workspace browser, sessions view)

* docs: fact-check and update all comparisons; add Open Interpreter section

Researched current state of each tool before updating:

Claude Code:
- Scheduled jobs: now Partial (has /loop session-scoped, cloud-managed
  /schedule via claude.ai/code, and desktop app automations); updated
  table to reflect this with footnotes distinguishing self-hosted cron
- Persistent memory: Partial (CLAUDE.md, MEMORY.md, rolling auto-memory
  but not full automatic cross-session recall)
- Provider-agnostic: No -- supports Bedrock/Vertex but Claude models only
- Web UI: Yes but Anthropic-hosted (not self-hosted)

Codex CLI:
- Persistent memory: Partial (session history + AGENTS.md since v0.100.0)
- Scheduled jobs: Partial (desktop app Automations only; CLI has no native
  scheduling as of early 2026, open feature request)
- Provider-agnostic: Yes (10+ providers)

OpenCode:
- Web UI: now Yes (embedded in binary + official desktop app)
- Persistent memory: Partial (SQLite sessions + AGENTS.md, not semantic)
- Messaging: community Telegram bot only, not first-party

Open Interpreter: added as new comparison section
- Most common 'why not just use this' question; addressed head-on
- Session-scoped, no persistent memory by their own docs, no scheduler,
  no messaging integration; powerful for one-shot tasks, not always-on

README Why Hermes table: updated to include Open Interpreter column,
fixed Claude Code self-hosted row (No -- scheduling runs on Anthropic
cloud), added footnotes for partial entries

* docs: add OpenClaw comparison; update category framework and quick reference table

OpenClaw (openclaw.ai, MIT, 347k stars) is the most direct Hermes
competitor -- both are open-source, self-hosted, always-on agents with
persistent memory, cron, and messaging integration. Added:

- Full OpenClaw section in HERMES.md with honest comparison: where it
  wins (15+ messaging platforms incl. iMessage/WeChat, native Chrome CDP
  browser control, voice wake words, ClawHub marketplace) and where
  Hermes differs (self-improving skills system, Python/ML ecosystem,
  web UI, multi-profile, sub-agent orchestration)
- Category 4 framework updated: now lists both Hermes and OpenClaw,
  with the key architectural distinction called out
- Quick reference table expanded to include OpenClaw column (now 8 tools)
- New rows added: self-improving skills, browser/computer control,
  Python/ML ecosystem
- README Why Hermes table updated: OpenClaw replaces OpenCode column,
  self-improving skills row replaces generic skills row, callout line
  at bottom addresses OpenClaw head-on

* docs: major accuracy pass -- OpenClaw deep-dive, Claude Code corrections, drop Open Interpreter

OpenClaw:
- Expanded comparison from a table to a full prose section with
  'Where OpenClaw wins' / 'Where Hermes wins' structure
- Honest about OpenClaw strengths: 15+ messaging platforms, native
  Chrome CDP browser control, voice wake words, 13k+ ClawHub skills
- Hermes advantages called out clearly: self-improving skills as a
  first-class automatic loop (vs marketplace-install model), stability
  (documented OpenClaw update regressions, Telegram breakage in early
  2026, WhatsApp protocol instability), security (156 CVEs and 1,184
  malicious skills found in ClawHub audit vs Hermes's no marketplace
  attack surface), Python/ML ecosystem, full web UI vs dashboard-only,
  and first-class multi-profile support
- Category 4 framework updated to name both Hermes and OpenClaw
- Table updated: added stability/security rows, corrected web UI row
  (OpenClaw has a gateway dashboard but not a full chat UI)

Claude Code corrections (researched against official docs at code.claude.com):
- Skills/Hooks: changed from No to Yes -- has a full Hooks system (13
  event types, 4 handler types) and a Plugin/Skills marketplace since
  v2.0.12; unified with slash commands in v2.1.0
- Messaging: changed from No to Partial -- Channels feature (Telegram,
  Discord, iMessage, Webhooks) in research preview since v2.1.80; deep
  Slack integration that triggers cloud sessions and creates PRs
- Added Claude Cowork row: separate product with 38+ connectors
  (Slack, Gmail, Teams, Notion, Jira, Salesforce, etc.)
- Scheduling footnote updated: cloud-managed has 1-hour minimum interval
- Provider-agnostic clarified: routes through Bedrock/Vertex but always
  Claude models; cannot swap to GPT or Gemini

Open Interpreter removed:
- Less relevant comparison than OpenClaw for the 'always-on agent' frame
- Kept coverage focused on the tools people actually compare Hermes to

Quick reference table:
- Now 7 tools wide (added OpenClaw, kept Claude Code, Codex, OpenCode,
  Cursor, Claude.ai, Hermes)
- New rows: self-improving skills, browser/computer control, stability
- Updated: Claude Code messaging to Partial, OpenClaw web UI to
  'Dashboard only', skills rows differentiated by type

* docs: apply full editorial pass from hermes-edit-list.md

Writing patterns fixed:
- Em dashes reduced by ~80%; replaced with commas, periods, parens
- All 'Not X, it's Y' negative parallelism rewritten as positive
  statements; 'What Hermes Is Not' section renamed 'Scope and Limits'
  and reframed positively throughout
- 'It compounds.' standalone flourish removed
- 'meaningfully' removed everywhere (was appearing 3+ times)
- 'leverages' -> 'uses' in README
- 'remembers everything' softened to 'retains context across sessions'
- Bolded Hermes column in Quick Reference table un-bolded (only genuine
  differentiator cells kept bold: self-improving skills, always-on,
  orchestrates other agents)
- 'The honest summary' framing removed from OpenClaw section
- 'Hermes is different.' cliche transition cut from README
- Rule-of-three slogans trimmed (e.g. 'Same agent, same memory...')
- 'tired of re-explaining' -> 'don't want to re-explaining'

Duplicate content removed:
- 'day one / day one hundred' comparison kept only in Compounding
  Advantage section; removed from Pillar 1

Factual accuracy fixes:
- Claude.ai comparison updated: memory now auto-generated from history
  (not just user-curated); code execution and file read/write noted
  as sandboxed (Artifacts), not flat No
- Category 2: Windsurf framed as 'earliest' on memory, Copilot
  'catching up'; removed overconfident 'most mature' claim
- Category 4 qualifier: 'as of early 2026' added
- '1-hour minimum' for Claude Code cloud scheduling softened to
  'minimum interval applies' (specific claim unverified)
- Claude Code scheduling table note: 'cloud or desktop-app only'
  (was just 'cloud-managed or session-scoped')
- README claim 'No other open-source tool combines...' removed;
  was false because OpenClaw does combine all three
- OpenClaw self-improving skills: 'No' -> 'Partial' with clarification
- README OpenClaw callout: 'relies on a marketplace' softened to
  'skill system centers on a community marketplace'
- 'meaningfully more stable' -> 'more stable'; 'supply chain issues'
  -> 'security incidents involving malicious skills'
- OpenClaw star count: '347k+' -> '~347k' (moving fast)
- Stability row added to OpenClaw table; bold removed from table

---------

Co-authored-by: Hermes <hermes@localhost>
2026-04-03 22:03:19 -07:00
Nathan Esquenazi
c1320b4712 Merge pull request #62 from nesquena/docs/v0.30.1-release
docs: v0.30.1 release — CLI bridge fixes + README update
2026-04-03 21:12:15 -07:00
Nathan Esquenazi
d3b693524f docs: v0.30.1 release — CLI bridge fixes, README update
CHANGELOG: add v0.30.1 entry covering PRs #57-#61 (CLI session bridge
fixes: sidebar rendering, profile-aware state.db path, silent SQL error,
show/hide toggle in Settings.

README: add CLI session bridge, token/cost display, subagent cards,
/usage command, skills linked files, show CLI sessions toggle.

Version label: v0.30 -> v0.30.1 in index.html, SPRINTS, CHANGELOG footer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)
2026-04-03 21:11:52 -07:00
nesquena-hermes
66f95e08c2 feat: 'Show CLI sessions' toggle in Settings (#61)
Adds a server-side boolean setting (default: false) that controls whether
CLI sessions from state.db appear in the sidebar. Off by default so the
sidebar is clean until the user explicitly opts in.

- api/config.py: add show_cli_sessions to _SETTINGS_DEFAULTS and _SETTINGS_BOOL_KEYS
- api/routes.py: gate get_cli_sessions() call on the setting at request time
- static/index.html: checkbox in settings panel with description
- static/panels.js: load/save checkbox, refresh session list on save
- static/boot.js: load on startup alongside send_key and show_token_usage

Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
2026-04-03 21:06:23 -07:00
nesquena-hermes
1a4d56c215 fix: CLI DB has no profile column, and silent SQL error swallowed results (#60)
The sessions table in the CLI state.db does not have a 'profile' column --
selecting s.profile caused an OperationalError which was silently caught by
'except Exception: return []', making get_cli_sessions() always return empty.

Fix: remove s.profile from the SELECT (it doesn't exist in the CLI schema)
and derive the profile from get_active_profile_name() instead, which is the
right value anyway since the CLI DB has no profile concept.

Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
2026-04-03 21:02:01 -07:00
nesquena-hermes
b2c2f32584 fix: CLI session bridge reads active profile's state.db not server launch profile (#59)
get_cli_sessions() and get_cli_session_messages() were using HERMES_HOME
(the profile the server was launched under) to find state.db. This meant
a server launched under the webui profile would read webui's state.db
(full of cron runs) instead of the user's actual CLI sessions.

Fix: use get_active_hermes_home() which tracks whichever profile the user
has selected in the UI. This means:
  - default profile active -> reads ~/.hermes/state.db (interactive CLI)
  - camanji profile active -> reads ~/.hermes/profiles/camanji/state.db

Falls back to HERMES_HOME env var if profiles module unavailable.

Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
2026-04-03 21:00:04 -07:00
nesquena-hermes
15fde033c3 fix: wire up CLI session display in sidebar (3 frontend gaps) (#58)
The backend CLI session bridge (PR #56) was complete but the frontend
never connected to it:

1. css class never applied -- el.className never included 'cli-session'
   so the gold border and 'cli' badge CSS was dead code. Fixed: append
   ' cli-session' when s.is_cli_session is true.

2. import never triggered -- click handler always called loadSession()
   directly, never POST /api/session/import_cli. Fixed: for CLI sessions,
   call import_cli first (idempotent -- safe to call on every click),
   then fall through to loadSession() which now finds the imported copy.

3. profile filter silently hid CLI sessions -- filter required
   s.profile === S.activeProfile, but CLI sessions may have profile=null
   if the SQLite DB has no profile column. Fixed: CLI sessions always
   pass the filter (s.is_cli_session || s.profile === S.activeProfile).

Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
2026-04-03 20:55:29 -07:00
nesquena-hermes
10a1e57c9b docs: fix test count in v0.30 CHANGELOG (424 not 426) (#57)
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
2026-04-03 20:44:58 -07:00
Nathan Esquenazi
4f10080501 Merge pull request #56 from thadreber-web/feat/cli-session-bridge
feat: CLI session bridge - read CLI sessions from agent SQLite store
2026-04-03 20:42:21 -07:00
Nathan Esquenazi
f8ea02c14d merge: resolve conflicts with master (v0.29), bump to v0.30
Resolved CHANGELOG.md and SPRINTS.md conflicts: master added v0.29
(Sprint 23: Agentic Transparency), CLI bridge becomes v0.30.
Updated all version references to v0.30.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 20:42:11 -07:00
Nathan Esquenazi
122fe955b6 docs: v0.29 release notes for CLI session bridge, version bump
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 20:39:27 -07:00
Nathan Esquenazi
017d7f1eca fix: add missing HOME import to models.py (NameError crash)
get_cli_sessions() and get_cli_session_messages() reference HOME but
it was not imported from api.config. This caused /api/sessions to 500
on every request, breaking the entire session list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 20:37:34 -07:00
Thad Reber
cabda6b77a feat: CLI session bridge - read CLI sessions from agent SQLite store
Read CLI sessions from the agent's state.db and surface them in the
WebUI sidebar alongside local sessions, with read-only display and
import-on-click to avoid data duplication.

Key changes:
- get_cli_sessions(): reads sessions list via parameterized SQL,
  wrapped in sqlite3 context manager (no connection leaks)
- get_cli_session_messages(): reads messages for a CLI session
  via parameterized SQL, also context-managed
- GET /api/sessions: merges WebUI + CLI sessions with dedup
  (WebUI takes priority on same session_id)
- GET /api/session: falls back to CLI store if not a WebUI session
- POST /api/session/import_cli: imports a CLI session into the
  WebUI store (idempotent, no duplicates on re-import)
- Imported sessions use get_last_workspace() for the workspace field
  (not a hardcoded string) and carry the active profile tag
- CSS: .cli-session with ::after 'cli' indicator (no theme changes)

Fixes review feedback:
- SQLite connections use 'with' context managers (no leaks)
- Workspace uses real path via get_last_workspace()
- Profile awareness via api.profiles.get_active_profile_name()
- Parameterized SQL queries throughout (no injection risk)
- Graceful fallback when sqlite3 or state.db is missing
2026-04-03 19:54:54 -07:00
nesquena-hermes
33fca2383c docs: v0.29 release notes + roadmap/sprint plan updates
- CHANGELOG.md: add v0.29 entry covering all Sprint 23 deliverables
  (token/cost display, subagent cards, skill picker, linked files viewer,
  workspace tree persistence, timestamp fixes, XSS + security fixes)
- ROADMAP.md: update to v0.29, add Sprint 23 to history table, check off
  token/cost, skill linked files, skill picker in cron (3 items closed)
- TESTING.md: update automated test count 415 -> 424
- SPRINTS.md: add Sprint 24 (web polish bug fix pass) and Sprint 25
  (macOS native desktop app) forward plans; remove stale stub entries

Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
2026-04-03 19:36:18 -07:00
Nathan Esquenazi
be951a4d1d Merge pull request #54 from nesquena/fix/stricter-docker-regex
fix(ci): stricter semver regex for Docker tags
2026-04-03 19:33:14 -07:00
Nathan Esquenazi
bba9a236c3 fix(ci): use stricter semver regex for Docker tag extraction
Use v(\d+\.\d+(?:\.\d+)?) instead of v(.*) to only match real
version numbers (v0.29, v0.29.1), not arbitrary strings.
Keep latest unconditional since all tag pushes are releases.

Based on review of PR #52 approach vs #53.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 19:32:50 -07:00
Nathan Esquenazi
846565484b Merge pull request #53 from nesquena/fix/release-docker-tags
fix(ci): Docker tag extraction for 2-part versions (v0.29)
2026-04-03 19:28:18 -07:00
Nathan Esquenazi
1a579ef9cf fix(ci): use match pattern for Docker tags to support 2-part versions
The semver pattern in docker/metadata-action requires 3-part versions
(e.g. v0.29.0). Our tags use 2-part (v0.29), causing the metadata
step to produce empty tags, which made build-push-action fail.

Fix: use type=match with regex to extract the version string directly,
plus type=raw for the latest tag unconditionally.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 19:27:41 -07:00
Nathan Esquenazi
1605d65226 Merge pull request #51 from nesquena/sprint-23-agentic-transparency
Sprint 23: Agentic transparency — token costs, subagent cards, skill picker, linked files
2026-04-03 19:16:42 -07:00
Nathan Esquenazi
3d4d7f2b53 fix: test hardening for sprint 23 — handle missing cron/skills modules in test env
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 19:16:17 -07:00
Nathan Esquenazi
2fb2ddeaaa feat: token usage toggle (setting + /usage command) + timestamp fixes
Token usage display:
- Add 'show_token_usage' boolean to settings (default: false, off by default)
- Settings panel: checkbox 'Show token usage after responses'
- /usage slash command: instant toggle with toast feedback, persists to
  server, updates checkbox if settings panel is open, re-renders messages
- Boot: load show_token_usage alongside send_key on startup
- ui.js: gate usage badge on window._showTokenUsage flag

Timestamps:
- streaming.py: stamp 'timestamp' on every message that lacks one at
  conversation completion; old messages (no timestamp field) now get a
  wall-clock time the first time they're touched by a new turn
- messages.js: stamp _ts on the last assistant message at done-event time
  so the time shows immediately on the current turn before next reload
- Timestamps already render in the UI (Sprint 14): faint time on each
  role header line, full opacity on hover, full date in title tooltip
2026-04-03 19:11:36 -07:00
Nathan Esquenazi
b1d687ba22 feat: persist workspace tree expanded state across refreshes
Store expanded directory paths in localStorage keyed by workspace path
(key: 'hermes-webui-expanded:{workspacePath}'). On root load (loadDir('.')),
restore the saved set for the current workspace and pre-fetch dir contents
for any restored expanded directories so the tree renders fully on first
paint without requiring a second click to expand.

Saves on every expand/collapse toggle. Switching workspaces automatically
picks up that workspace's own saved state. Per-workspace (not per-session)
so the same tree state is shared across sessions using the same workspace,
which is the natural expectation.
2026-04-03 19:11:36 -07:00
Nathan Esquenazi
c1dcd73502 fix: security, correctness, and test hardening from review
- routes.py: reject glob wildcards (* ? [ ]) in skill name param to
  prevent rglob wildcard injection when serving linked files
- panels.js: replace inline onclick+esc() with data-* attributes and
  addEventListener for skill tag removal and linked-file clicks;
  esc() is HTML-safe but not JS-safe -- apostrophes in names caused
  JS syntax errors and _cronSelectedSkills array corruption
- ui.js: fix _fmtTokens(null/undefined) returning 'null'/'undefined'
  by guarding with (!n||n<0) -> '0'; add data-role attribute to msg-row
  elements so usage badge correctly targets the last assistant row
  instead of the last row regardless of speaker
- tests: rename test_sprint24.py -> test_sprint23.py (wrong sprint #);
  add 3 new tests: path traversal rejection, wildcard name rejection,
  cron create with skills; strengthen existing tests to assert field
  presence explicitly (was using .get(field, 0)==0 which never caught
  a missing field)
2026-04-03 19:11:36 -07:00
Nathan Esquenazi
df06c1cdca feat: Sprint 23 — agentic transparency + polish
Track A: Token/cost display
- Read agent usage attrs (session_prompt_tokens, session_completion_tokens,
  session_estimated_cost_usd) after run_conversation in streaming.py
- Add input_tokens, output_tokens, estimated_cost fields to Session model
- Include usage in done SSE event payload
- Store usage on S.lastUsage in messages.js done handler
- Render usage badge below last assistant message (input/output/cost)

Track B: Subagent delegation cards
- Add subagent_progress to toolIcon map with shuffle emoji
- Special-case subagent_progress in buildToolCard: "Subagent" label,
  strip double emoji from preview, add tool-card-subagent CSS class
- Indented border-left styling for subagent cards
- Clean delegate_task display name

Track C: Skill picker in cron create form
- Add skill search input + tag chips to cron create form HTML
- Skill picker JS in panels.js: search/filter, click-to-add tags,
  remove tag chips, pre-fetch skill list on form open
- submitCronCreate sends skills array in POST body
- Skill picker dropdown + tag CSS

Track D: Skill linked files viewer
- Add file query param to /api/skills/content endpoint
- Serve linked files from skill directory with path traversal protection
- Ensure linked_files key always present in skill content response
- Render linked files section below SKILL.md content in preview panel
- openSkillFile function for viewing individual linked files

Track E: Bug fixes and code quality
- Expand Session.__init__ and compact() to readable multi-line format
- Remove inline import json as _j2 inside loop in streaming.py
- Fix tool_calls: capture args from assistant messages, skip unresolved names
- Store args snapshot in persisted tool_calls for reload display

6 new tests. Total: 421 (409 passing).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 18:33:49 -07:00
Nathan Esquenazi
2c0f6e80b6 Merge pull request #49 from nesquena/docs/update-all-markdown-v0.28
docs: update all markdown to v0.28.1 state
2026-04-03 14:20:29 -07:00
Nathan Esquenazi
4a4af209ad docs: update all markdown to v0.28.1 state
- README: add GHCR pre-built images to Docker section, update line counts
  and test count (426 tests, 22 files), add CI/CD to architecture tree
- ROADMAP: update header to v0.28.1/426 tests, mark all user-requested
  features as shipped, collapse completed Waves 3-7 into summary table,
  update architecture line counts, add CI/CD row
- CHANGELOG: add v0.28.1 entry for CI pipeline + multi-arch Docker builds,
  update footer version
- SPRINTS: update header and footer to v0.28.1

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:18:50 -07:00
Nathan Esquenazi
279690e4c1 Merge pull request #48 from nesquena/fix/ci-action-versions
fix(ci): use standard version tags for GitHub Actions
2026-04-03 14:10:39 -07:00
Nathan Esquenazi
2766314e81 fix(ci): use standard version tags for GitHub Actions
The SHA-pinned versions from the security hardening commit referenced
non-existent commit hashes, causing the workflow to fail with 'unable
to resolve action'. Switch to standard major version tags (v4, v3, v2,
v6, v5) which are the recommended approach for GitHub-maintained and
well-known actions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:09:36 -07:00
nesquena-hermes
9d69408610 ci: add GitHub Actions workflow for multi-arch Docker + releases (#47)
ci: multi-arch Docker builds + GitHub Releases on tag push
2026-04-03 14:02:52 -07:00
Nathan Esquenazi
4a3b9571f1 fix(ci): pin all GitHub Actions to full commit SHAs for supply chain security
Pinned all 7 third-party actions from mutable version tags to immutable
commit SHAs. Mutable tags (e.g. @v4) can be force-pushed by the action
author (or a compromised account) to inject malicious code into the workflow,
which runs with write access to the repo and GHCR registry.

Also moved 'permissions' from workflow level to job level (best practice:
scope permissions as narrowly as possible).

Pin mapping:
  actions/checkout@v4               -> @11bd71901bbe...  (v4.2.2)
  softprops/action-gh-release@v2    -> @c062e08bd532...  (v2.2.1)
  docker/setup-qemu-action@v3       -> @49b3bc8e6bdd...  (v3.2.0)
  docker/setup-buildx-action@v3     -> @c47758b77c97...  (v3.7.1)
  docker/login-action@v3            -> @9780b0c442fb...  (v3.3.0)
  docker/metadata-action@v5         -> @369eb591f429...  (v5.6.1)
  docker/build-push-action@v6       -> @ca877d9245fe...  (v6.10.0)
2026-04-03 21:02:08 +00:00
Nathan Esquenazi
c488031fe3 Merge pull request #45 from nesquena/fix/profile-creation-docker-fallback
fix: profile creation fallback for Docker (#44)
2026-04-03 14:01:04 -07:00
Nathan Esquenazi
94b080fa1e docs: v0.27 release notes, version bump for profile creation fallback
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:00:46 -07:00
Nathan Esquenazi
e6663596ce fix(review): 4 issues found in agent review of PR #45
BUG-1 (medium): _validate_profile_name() used re.match() with a $ anchor.
re.match() with $ is truthy for 'name\n' because match() allows trailing
content after the $ in multiline mode. Changed to re.fullmatch() which
requires the entire string to match — trailing newlines now correctly rejected.

BUG-2 (medium/defense-in-depth): create_profile_api() validated 'name' via
_validate_profile_name() but passed clone_from directly to hermes_cli and
_create_profile_fallback() without validation. Added clone_from validation
inside create_profile_api() (skipping 'default' which is a valid clone source).
routes.py already validates it at the HTTP layer; this adds API-layer defense.

BUG-3 (low): When hermes_cli is not importable (the exact Docker case this PR
targets), list_profiles_api() also returns only the stub default dict and
can't find the newly created profile by name. The fallback return was a
2-key dict {name, path} — incomplete vs the 9-key schema everywhere else.
Expanded to the full profile dict with all fields so API clients get
consistent data regardless of hermes_cli availability.

OBS-4 (low/TOCTOU): _create_profile_fallback() checked profile_dir.exists()
then called mkdir(exist_ok=True). If a concurrent request created the dir
between those two calls, mkdir silently succeeded — defeating the
FileExistsError guard. Changed to mkdir(exist_ok=False) so the OS raises
FileExistsError atomically if the dir appears in the race window.

Tests: 423 passed, 0 failed.
2026-04-03 13:58:43 -07:00
Nathan Esquenazi
16553be59d fix: profile creation fallback when hermes_cli unavailable (Docker)
When hermes-agent is not discoverable (common in Docker), create_profile_api()
raised a hard RuntimeError while list and delete already had manual fallbacks.

Changes:
- Add _create_profile_fallback() that bootstraps profile directory structure
  directly (matching upstream hermes_cli.profiles: 8 subdirs + config clone)
- Extract _validate_profile_name() so validation works without hermes_cli
- Add constants _PROFILE_ID_RE, _PROFILE_DIRS, _CLONE_CONFIG_FILES matching
  upstream hermes-agent
- Remove :ro from docker-compose.yml hermes home mount so profiles dir is
  writable inside the container

Closes #44

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 13:58:43 -07:00
Nathan Esquenazi
6a61f36280 ci: add GitHub Actions workflow for multi-arch Docker + releases
On tag push (v*):
- Creates a GitHub Release with auto-generated release notes
- Builds multi-arch Docker image (linux/amd64, linux/arm64)
- Pushes to ghcr.io/nesquena/hermes-webui with semver tags
- Uses GitHub Actions cache for faster builds

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 13:55:41 -07:00
Nathan Esquenazi
b03ddf78c9 Merge pull request #46 from nesquena/fix/profile-switch-default-home
fix: Profile system polish — 10 post-Sprint-23 fixes (v0.26)
2026-04-03 13:44:16 -07:00
Nathan Esquenazi
5c9edfc7bf docs: v0.26 release notes, remove planning artifact, update versions
- Add v0.26 CHANGELOG entry (10 post-Sprint-23 fixes)
- Remove SPRINT_23_PLAN.md (planning artifact, not runtime docs)
- Bump version label to v0.26 in index.html
- Update SPRINTS header and footer to v0.26 / 426 tests
- Update CHANGELOG footer to v0.26 / 426 tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 13:44:06 -07:00
Nathan Esquenazi
733957cea1 docs: add Sprint 23 planning document (profile/workspace/model coherence spec) 2026-04-03 20:37:48 +00:00
Nathan Esquenazi
e61382ef71 fix: pass fallback_model to AIAgent; show rate-limit error inline instead of 'Connection lost'
Two fixes for Camanji rate limit UX:

1. api/streaming.py — pass fallback_model from profile config to AIAgent
   The agent already supports fallback_model (a dict with provider/model/base_url)
   for automatic rate-limit recovery, but streaming.py never read it from config
   or passed it to AIAgent.  Now reads get_config().get('fallback_model') at
   call time (not module-level snapshot) and passes it through.
   Also reads platform_toolsets.cli from the active profile's config at call
   time so profiles with custom toolset lists use the right tools.

   Camanji has fallback_model: {provider: openrouter, model: anthropic/claude-sonnet-4.6}
   so hitting the direct-Anthropic rate limit will now automatically retry via
   OpenRouter before giving up.

2. api/streaming.py + static/messages.js — show error inline, not 'Connection lost'
   Previously: agent threw -> put('error', msg) -> SSE connection closed ->
   browser's network-level 'error' event fired -> generic 'Connection lost'.
   The actual error message was invisible to the user.

   Fix: renamed server-side error event to 'apperror' (distinct from the SSE
   spec's network error event).  Added source.addEventListener('apperror', ...)
   in messages.js that renders the error as a styled assistant message:
     ⏱️ Rate limit reached: <full message>
     *Rate limit reached. Fallback model exhausted. Try again in a moment.*
   Also added source.addEventListener('warning', ...) for non-fatal notices
   (future use: fallback-activated status bar update).

Tests: 426 passed, 0 failed.
2026-04-03 20:34:52 +00:00
Nathan Esquenazi
da43a6a09a fix: switching profiles mid-conversation starts a new session instead of cross-tagging
A session with messages belongs to the profile it was created under. Switching
profiles while a conversation is in progress should not retag that session or
update its workspace/model in place — that would corrupt the session's context.

New behavior:
- Session has NO messages (empty): profile switch updates it in place (model,
  workspace). Works exactly as before — nothing was started yet.
- Session HAS messages (in progress): profile switch calls newSession() to
  start a fresh session tagged to the new profile. The old session is left
  untouched. Toast: 'Switched to profile: X — new conversation started'.
- Agent busy: blocked as before, no change.

Also: S._profileDefaultWorkspace is now consumed (set to null) inside
newSession() after the first use, so it doesn't keep forcing the same
workspace on every subsequent new session after a switch.
2026-04-03 20:27:50 +00:00
Nathan Esquenazi
4eae6c98f9 fix: cross-provider model pick causes Connection lost on non-OpenRouter profiles
Root cause: resolve_model_provider() had a branch:
  if config_provider and config_provider != 'openrouter' and prefix in _PROVIDER_MODELS:
      return bare, prefix, None

When Camanji profile (config_provider='anthropic') picked openai/gpt-5.4-mini
from the OpenRouter dropdown, prefix='openai' matched _PROVIDER_MODELS and
config_provider was not 'openrouter', so it returned ('gpt-5.4-mini', 'openai', None).
The agent then demanded OPENAI_API_KEY directly -- not found -- RuntimeError --
stream crashed -- 'Connection lost'.

Fix: if prefix != config_provider (cross-provider selection), always route through
openrouter with the full provider/model string. Only strip the prefix and call a
direct provider API when the config_provider EXACTLY matches the model prefix.

Cases verified:
  openrouter + openai/gpt-5.4-mini     -> (openai/gpt-5.4-mini, openrouter)  ✓
  anthropic  + openai/gpt-5.4-mini     -> (openai/gpt-5.4-mini, openrouter)  ✓ FIXED
  anthropic  + anthropic/claude-...    -> (claude-..., anthropic)             ✓
  anthropic  + claude-sonnet-4-6 bare  -> (claude-sonnet-4-6, anthropic)      ✓
  openrouter + anthropic/claude-...    -> (anthropic/claude-..., openrouter)  ✓

Tests: 426 passed, 0 failed.
2026-04-03 20:23:25 +00:00
Nathan Esquenazi
c71439d8ab fix: model picker correctly updates on profile switch without flicker or raw injection
Root cause: three interacting bugs caused the model picker to show the wrong
model or flicker after a profile switch.

Bug 1 — syncTopbar() fought switchToProfile().
After switchToProfile() set the picker to the profile's model, syncTopbar()
was called (via renderSessionList -> loadSession, then explicitly at the end)
and overwrote it with S.session.model -- the old session's model.
Fix: added S._pendingProfileModel flag. switchToProfile() sets it;
syncTopbar() checks it first, applies the override, then clears it.
S.session.model is also updated to the resolved value so subsequent
syncTopbar() calls are consistent.

Bug 2 — Raw option injected at top of list for mismatched model IDs.
Profile configs store model IDs like 'claude-sonnet-4-6' (hermes-agent
format: hyphens, no namespace prefix) but the dropdown has
'anthropic/claude-sonnet-4.6' (OpenRouter format: dots, with prefix).
The old code did sel.value = id, found no match, then injected a new
<option> at the top of the list -- creating a lowercase duplicate that
didn't match any real provider group entry.
Fix: _findModelInDropdown() normalises both sides (strip prefix, hyphens->dots,
lowercase) and finds the best matching existing option. No new options are ever
injected for profile switching.

Bug 3 — populateModelDropdown() injected raw option on cold load.
Same issue: if default_model from /api/models didn't exactly match a dropdown
value, an extra option was added. Fixed by using _applyModelToDropdown()
which only selects existing options.

New helpers in ui.js:
  _findModelInDropdown(modelId, sel) -- smart fuzzy match, returns matched value
  _applyModelToDropdown(modelId, sel) -- sets picker, returns resolved value

Tests: 426 passed, 0 failed.
2026-04-03 20:10:47 +00:00
Nathan Esquenazi
ad755e49e5 fix: workspace isolation, session filtering, and clean migration path
Three interrelated fixes:

1. api/workspace.py — clean workspace isolation with auto-migration
   _clean_workspace_list(): sanitizes any workspace list by:
   - Removing test artifacts (webui-mvp-test, test-workspace paths)
   - Removing paths that no longer exist on disk
   - Removing cross-profile leaks (paths under ~/.hermes/profiles/*)
   - Renaming 'default' workspace label to 'Home' (avoids confusion
     with the 'default' profile name)

   _migrate_global_workspaces(): one-time migration for upgrading users.
   Reads the legacy global workspaces.json, runs _clean_workspace_list,
   rewrites it cleaned. This runs automatically on first load after upgrade
   for the default profile only.

   load_workspaces(): now cleans every read and persists cleaned version
   if anything changed. Named profiles always start fresh (no global leak).
   Empty results fall back to 'Home' entry pointing at profile's workspace.
   Default label for auto-generated single-entry lists is 'Home', not 'default'.

2. api/models.py — legacy session profile backfill (already committed,
   this commit adds the sessions.js filter tightening counterpart)

3. static/sessions.js — strict profile filter
   Removed the '!s.profile' escape hatch from the profile filter.
   Server now backfills profile='default' on legacy sessions, so every
   session has an explicit tag. Filter is now exact:
     s.profile === S.activeProfile
   Named profiles see zero legacy clutter. Default profile sees its own
   sessions. 'All profiles' toggle still shows everything.

Migration story for users pulling this update:
- Existing sessions (profile=null) -> attributed to 'default' at read time
- Global workspaces.json -> cleaned of test artifacts and cross-profile paths
  on first server start after upgrade
- Named profile workspace files -> cleaned on first read, persisted clean
- No manual intervention needed

Tests: 426 passed, 0 failed.
2026-04-03 20:01:12 +00:00