ARCHITECTURE.md: - 6→7 JS modules (added commands.js), updated all line counts - Added api/auth.py to file inventory - Added HERMES_WEBUI_PASSWORD env var - Added projects.json to state directory listing - Replaced PORTABILITY.md ref with BUGS.md - Updated test file references (test_sprint1-19, 327 functions) ROADMAP.md: - Version Sprint 17/v0.19 → Sprint 19/v0.21, test count 294→327 - Added Sprint 18 + 19 rows to sprint history table - Updated architecture table (api/ 2491 lines, JS 3148 lines) - Added sections: Workspace, Slash Commands, Security, Thinking - Added Sprint 20-24 to Advanced/Future (voice, mobile, multi-profile, desktop, extended commands) SPRINTS.md: - Header v0.20→v0.21, 318→327 tests - "Where we are now" updated from v0.18 to v0.21 - Removed two stale/duplicate "Sprint 18" sections (Voice + Subagent) - Added completed Sprint 18 (thinking + tree + preview fix) - Added completed Sprint 19 (auth + security) - Added planned Sprints 20-24 (voice, mobile, multi-profile, desktop, commands) - Parity tables fully updated with current Done/Deferred status CHANGELOG.md: - Added v0.21 Sprint 19 entry (auth, security headers, 20MB limit) TESTING.md: - Header "through Sprint 2" → "through Sprint 19 (v0.21)" - Added test count and pytest command to header - Added 9 new manual test sections covering Sprints 11-19 - Updated footer with current stats Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
24 KiB
Hermes Web UI -- Forward Sprint Plan
Current state: v0.21 | 327 tests (304 passing) | Daily driver ready This document plans the path from here to two targets:
Target A: 1:1 feature parity with the Hermes CLI (everything you can do from the terminal, you can do from the browser)
Target B: 1:1 parity with Claude's reproducible features (the full Claude browser UI experience, minus things only Anthropic can build)
Sprints are ordered by impact. Each builds on the one before. Past sprint history lives in CHANGELOG.md.
Where we are now (v0.21)
CLI parity: ~90% complete. Core agent loop, all tools visible, workspace file ops with tree view, cron/skills/memory CRUD, session management, streaming, cancel, multi-provider models, custom endpoint discovery, slash commands, thinking/reasoning display, password auth -- all solid. Gaps are subagent visibility, toolset control, and code execution.
Claude parity: ~70% complete. Chat, streaming, file browser, session management, tool cards, syntax highlighting, model switching, projects, settings, Mermaid diagrams, mobile layout, breadcrumb workspace nav, slash commands, thinking display, auth -- all present. Gaps are artifacts, voice, TTS, sharing, mobile-optimized layout.
Sprint 11 -- Multi-Provider Models + Streaming Smoothness (COMPLETED)
Theme: Use any Hermes-supported model provider from the UI, and make heavy agentic work feel fast and fluid.
Why now: Two high-impact gaps converge here. First, the model dropdown is hardcoded to ~10 OpenRouter model strings. If Hermes is configured with direct Anthropic, OpenAI, Google, or other API providers, the web UI can't use them. This means users who set up Hermes with native API keys are locked out of their own models in the browser. Second, the streaming render path rebuilds the entire message list on every tool event, causing visible flicker during heavy agentic work.
Track A: Bugs
- Tool card DOM thrash: renderMessages() rebuilds all cards on each tool event. Switch to incremental append (append new card to existing group, no full rebuild).
- Scroll position lost on re-render during streaming (messages jump).
Track B: Features
- Multi-provider model support: Query Hermes agent's configured providers
and available models at startup via a new
GET /api/modelsendpoint. The model dropdown populates dynamically from whatever providers the user has configured (OpenRouter, direct OpenAI, direct Anthropic, Google, DeepSeek, etc.). Group by provider. Fall back to the current hardcoded list if the agent query fails. This ensures the web UI can use any model the CLI can. - Incremental tool card streaming: Instead of renderMessages() on each tool event, maintain a live card group element per turn and append/update cards in place. The assistant text row below the cards also updates incrementally (already does via assistantBody.innerHTML).
- Smooth scroll: Pin scroll to bottom during streaming unless user has manually scrolled up (read-back mode). Resume pinning when user scrolls back to bottom.
Track C: Architecture
api/routes.py: extract the 49 if/elif route handlers from server.py's Handler class into a dedicated routes module. server.py becomes a true ~50-line shell: imports, Handler stub that delegates to routes, main(). Completes the server split started in Sprint 10.
Tests: ~15 new. Total: ~205. Hermes CLI parity impact: High (model provider parity is a major CLI gap) Claude parity impact: Low (streaming smoothness)
Sprint 12 -- Settings Panel + Reliability + Session QoL
Theme: Persist your preferences, survive network blips, and organize sessions.
Why now: Three daily-driver friction points converge. First, default model and workspace aren't persisted server-side -- every restart loses them. Second, SSH tunnel hiccups during long agent runs silently kill the response with no recovery. Third, after 50+ sessions the flat chronological list makes it hard to keep important conversations accessible.
Track A: Bugs
- Workspace validation on add doesn't check symlinks (shows as invalid when it's actually a valid symlink to a directory).
Track B: Features
- Settings panel: A gear icon in the topbar opens a slide-in settings panel.
Sections: Default Model, Default Workspace. Persisted server-side in
~/.hermes/webui-mvp/settings.json. Server reads settings on startup and uses them as defaults.GET /api/settings+POST /api/settingsendpoints. - SSE auto-reconnect: When the EventSource connection drops mid-stream
(network blip, SSH tunnel hiccup), auto-reconnect once using the same
stream_id. The server-side queue holds undelivered events. If reconnect fails after 5s, show error banner. This is the #1 reliability gap for remote VPS usage. - Pin sessions: A star icon on any session in the sidebar. Pinned sessions
float to the top of the list above date groups. Persisted on the session
JSON as
pinned: true. Toggle on click. Simple and high quality-of-life. - Import session from JSON: Drag a
.jsonexport file into the sidebar (or click an import button) to restore it as a new session. Mirrors the existing JSON export. Useful for moving sessions between machines.
Track C: Architecture
- Settings schema:
settings.jsonwith typed fields, validated on load, with sane defaults. Served viaGET /api/settings, written viaPOST /api/settings. - SSE reconnect: server keeps
STREAMS[stream_id]alive for 60s after client disconnect, allowing reconnect with the same stream_id.
Tests: ~15 new. Total: ~216. Hermes CLI parity impact: Medium (settings persistence, reliability) Claude parity impact: Medium (settings panel, pinned conversations)
Sprint 13 -- Alerts, Session QoL, Polish
Theme: Know what Hermes is doing, and small quality-of-life wins.
Why now: Cron jobs run silently. Background errors surface nowhere. You have no way to know a long-running task finished (or failed) while you were on another tab. Meanwhile, a few small UX gaps (no session duplicate, no tab title) add up to daily friction.
Track A: Bugs
- Symlink workspace validation — confirmed already fixed (
.resolve()follows symlinks beforeis_dir()check).
Track B: Features
- Cron completion alerts:
GET /api/crons/recent?since=TIMESTAMPendpoint. UI polls every 30s (only when tab is focused). Toast notification on each completion. Red badge count on Tasks nav tab, cleared when tab is opened. - Background agent error alerts: When a streaming session errors out and the user is on a different session, show a persistent red banner above the message area: "Session X encountered an error." Click "View" to navigate, "Dismiss" to clear.
- Session duplicate: Copy icon on each session in the sidebar (visible on hover). Creates a new session with same workspace/model, titled "(copy)".
- Browser tab title:
document.titleupdates to show the active session title (e.g. "My Task — Hermes"). Resets to "Hermes" when no session active.
Tests: ~10 new. Total: ~221. Hermes CLI parity impact: Medium (cron visibility, error surfacing) Claude parity impact: Low
Sprint 14 -- Visual Polish + Workspace Ops + Session Organization (COMPLETED)
Theme: Polish the visual experience, close workspace file gaps, and organize sessions properly.
Track B: Features
- Mermaid diagram rendering: Code blocks tagged
mermaidrender as diagrams inline. Mermaid.js loaded lazily from CDN. Dark theme. Falls back to code block on parse error. - Message timestamps: Subtle HH:MM time next to each role label. Full
date/time on hover. User messages tagged with
_tson send. - Date grouping fix: Session list uses
created_atfor groups instead ofupdated_at. Prevents sessions jumping between groups on auto-title. - File rename: Double-click any filename in the workspace panel to
rename inline (same pattern as session rename).
POST /api/file/rename. - Folder create: Folder icon button in workspace panel header.
POST /api/file/create-dir. Prompt for folder name. - Session tags: Add
#tagto session titles. Tags extracted and shown as colored chips in the sidebar. Click a tag to filter the session list. - Session archive: Archive button on each session (box icon). Archived
sessions hidden from sidebar by default. "Show N archived" toggle at top
of list.
POST /api/session/archiveendpoint.
Tests: ~12 new. Total: ~233. Hermes CLI parity impact: Medium (file rename, folder create) Claude parity impact: Medium (Mermaid, tags, archive)
Sprint 15 -- Session Projects + Code Copy + Tool Card Toggle (COMPLETED)
Theme: Organize work the way you think, not just chronologically. Plus two quick UX wins for code and agentic workflows.
Why now: After 100+ sessions the sidebar is a flat chronological list. Finding sessions from 2 weeks ago, or keeping work separated by project, requires the search box. Session projects are the single biggest remaining organizational gap vs. Claude's project folders.
Track A: Bugs
- None.
Track B: Features
- Session projects: Named groups for organizing sessions. A project
filter bar (subtle chips) sits between the search input and the session
list. Each project has a name and color. Click a chip to filter sessions
to that project; "All" shows everything. Create projects inline (+
button), rename (double-click chip), delete (right-click). Assign
sessions via folder icon button (hover-reveal) with a dropdown picker.
Projects stored in
projects.json. Session model gainsproject_idfield (null = unassigned). Fully backward-compatible with existing sessions. Endpoints:GET /api/projects,POST /api/projects/create,POST /api/projects/rename,POST /api/projects/delete,POST /api/session/move. - Code block copy button: Every code block gets a "Copy" button. Positioned in the language header bar (or top-right corner for plain code blocks). Click copies code to clipboard, shows "Copied!" for 1.5s.
- Tool card expand/collapse: When a message has 2+ tool cards, an "Expand all / Collapse all" toggle appears above the card group. Scoped per message group, not global.
Track C: Architecture
projects.jsonflat file storage for project list (same pattern asworkspaces.jsonandsettings.json).project_idfield on Session model with backward-compatible null default._index.jsonincludesproject_idfor fast client-side filtering.
Tests: 13 new. Total: ~237. Hermes CLI parity impact: Low (CLI has no session organization) Claude parity impact: Very High (projects are a core Claude concept)
Candidates for later sprints
- Artifacts + code execution (HTML/SVG preview, inline Python execution)
- Voice input via Whisper
- Subagent delegation cards (enhanced tool card rendering)
Sprint 16 -- Session Sidebar Visual Polish (COMPLETED)
Theme: Make the session list feel high-quality and delightful.
Why now: The session sidebar had two visible UX bugs: titles truncated unnecessarily because action icons reserved space even when hidden, and the project folder icon felt "sticky" and awkward. Emoji icons rendered inconsistently across platforms. These were the most common visual complaints.
Track A: Bugs (from BUGS.md)
- Session title truncation. Action icons (pin, move, archive, dup, trash)
were always in the DOM with
flex-shrink:0, reserving ~30px even when invisible. Fix: wrapped all actions in a.session-actionsoverlay container withposition:absolute. Titles now use full available width. Actions appear on hover with a gradient fade from the right edge. - Folder button feels sticky. Replaced
.has-projectpersistent blue button with a colored left border matching the project color. The folder button now only appears in the hover overlay like all other actions.
Track B: Features
- SVG action icons. Replaced all emoji HTML entities (★, 📂, 📦, ⊕, 🗑)
with monochrome SVG line icons that inherit
currentColor. Consistent rendering across macOS, Linux, and Windows. Icons: pin (star), folder, archive (box), duplicate (overlapping squares), trash (bin with lines). - Pin indicator. Small gold filled-star icon rendered inline before the title only when the session is actually pinned. Unpinned sessions get full title width with zero space reservation.
- Project border indicator. Sessions assigned to a project show a colored left border matching the project color, replacing the old always-visible blue folder button.
- Hover overlay polish. Actions container uses a gradient background that fades from transparent to the sidebar color, creating a smooth emergence effect. Overlay hides automatically during inline rename.
Deferred to Sprint 17
- Slash commands (basic set with
commands.jsmodule) - Thinking/reasoning display for extended-thinking models
- Slash command autocomplete popup
Tests: 74 new (test_sprint16.py: safe HTML rendering, XSS security, sidebar polish). Total: 289. Hermes CLI parity impact: Low Claude parity impact: Medium (sidebar polish matches Claude's quality bar)
Sprint 17 -- Workspace Polish + Slash Commands + Settings (COMPLETED)
Theme: Workspace polish, slash commands, and composer settings.
Why now: Three things converge: @nothingmn filed Issue #22 requesting a tree/accordion workspace view (breadcrumb navigation is the foundation for that), slash commands were deferred from Sprint 16, and Issue #26 (send key personalization) fits naturally since we are already touching the keydown handler for slash command autocomplete.
Track A: Workspace Breadcrumb Navigation
- Breadcrumb path bar. When users click into subdirectories, a breadcrumb
bar appears showing the path (e.g.
~ / src / components) with clickable segments to navigate back. Hidden at root level for a clean UI. - Up button. Arrow-up button in the panel header navigates to the parent directory. Hidden when already at workspace root.
- Current directory tracking.
S.currentDirstate property tracks the active directory. File operations (rename, delete, new file, new folder) stay in the current directory instead of jumping back to root. - New file/folder in subdirectories. Creating files or folders now respects the current directory, creating them in the viewed subdirectory.
Track B: Slash Commands Foundation
- commands.js module. New 7th JS module with command registry, parser, autocomplete dropdown, and built-in command handlers.
- Built-in commands:
/help(list commands),/clear(clear conversation),/model <name>(switch model with fuzzy match),/workspace <name>(switch workspace),/new(start new session). - Autocomplete dropdown. Typing
/in the composer shows a filtered dropdown. Arrow keys navigate, Tab/Enter select, Escape closes. Positioned above the composer using the workspace dropdown CSS pattern. - Transparent pass-through. Unrecognized
/commands pass through to the agent normally (not intercepted).
Track C: Send Key Setting (Issue #26)
send_keysetting. New setting in Settings panel: "Enter" (default) or "Ctrl+Enter". Persisted tosettings.json. Loaded on boot.- Keydown handler rewrite. Combined handler for autocomplete navigation
and send key preference. When
ctrl+enteris selected, plain Enter inserts a newline and Ctrl/Cmd+Enter sends.
Deferred to Sprint 18
- Thinking/reasoning display for extended-thinking models
- Voice input via Whisper
- Workspace tree/accordion view (full implementation of Issue #22)
Tests: 6 new (test_sprint17.py). Total: 318. Hermes CLI parity impact: Low (slash commands add convenience) Claude parity impact: Medium (workspace nav, slash commands match Claude UX)
Sprint 18 -- Thinking Display + Workspace Tree + Preview Fix (COMPLETED)
Theme: Show the model's reasoning, improve workspace navigation, fix UX bug.
Why now: Thinking/reasoning display was deferred twice (Sprint 16 → 17 → 18). Workspace tree view was the #1 community request (Issue #22). File preview staying open on directory navigation was a daily-driver annoyance.
Track A: Bugs
- File preview auto-close. When viewing a file in the right panel and
navigating directories (breadcrumbs, up button, folder clicks), the preview
stayed visible with stale content. Fix: extracted
clearPreview()as a named function in boot.js and call it fromloadDir()in workspace.js.
Track B: Features
- Thinking/reasoning display. Assistant messages with structured content
arrays containing
type:'thinking'ortype:'reasoning'blocks now render as collapsible gold-themed cards above the response text. Collapsed by default, click header to expand. Works with Claude extended thinking and o3 reasoning tokens when preserved in the message array. - Workspace tree view (Issue #22). Directories expand/collapse in-place
with toggle arrows. Single-click toggles, double-click navigates (breadcrumb
view). Subdirectory contents fetched lazily and cached in
S._dirCache. Nesting depth shown via indentation. Empty directories show "(empty)".
Tests: 0 new (pure CSS/DOM changes). Total: 318. Hermes CLI parity impact: Low Claude parity impact: High (reasoning display matches Claude's UI)
Sprint 19 -- Auth + Security Hardening (COMPLETED)
Theme: Make this safe to leave running beyond localhost.
Why now: Issue #23 requested authentication. Auth is the last production hardening feature before the app is safe to expose to a network.
Track A: Bugs
- No request size limit. POST bodies were unbounded (DoS risk). Added 20MB
cap in
read_body().
Track B: Features
- Password authentication (Issue #23). Off by default — zero friction for
localhost. Enable via
HERMES_WEBUI_PASSWORDenv var or Settings panel. Password-only (no username — single-user app). Signed HMAC HTTP-only cookie with 24h TTL. Minimal dark-themed login page at/login. API calls without auth return 401; page loads redirect to/login. Settings panel gains "Access Password" field and "Sign Out" button. - Security headers. All responses now include
X-Content-Type-Options: nosniff,X-Frame-Options: DENY,Referrer-Policy: same-origin.
Track C: Architecture
- New
api/auth.pymodule: password hashing (SHA-256 + STATE_DIR salt), signed session cookies, auth middleware, public path allowlist. - Auth check in
server.pydo_GET/do_POST before routing. password_hashadded to_SETTINGS_DEFAULTSin config.py._set_passwordspecial field in save_settings for secure password updates.
Tests: 9 new. Total: 327. Hermes CLI parity impact: Low (CLI has no auth concerns) Claude parity impact: High (Claude is authenticated)
Sprint 20 -- Voice + TTS (PLANNED)
Theme: Input and output beyond the keyboard.
Why now: Voice works in the Hermes CLI. Mirror that capability in the web UI. TTS playback makes long responses more accessible. Both are achievable with existing Whisper and TTS APIs.
Track B: Features
- Voice input (Whisper). Microphone icon in composer. Hold to record, release to transcribe. Transcribed text editable before send.
- TTS playback. Speaker icon on assistant messages. Audio playback via OpenAI TTS or ElevenLabs API. Optional auto-play in settings.
Sprint 21 -- Mobile Responsive (PLANNED)
Theme: A genuinely good mobile experience, not just responsive CSS.
Track B: Features
- Collapsible sidebar. Hamburger menu replaces the always-visible sidebar.
- Touch-friendly session list. Tap to navigate, swipe gestures.
- Right panel as tab. Files panel hidden by default, accessible via tab.
- Composer focus behavior. Expands on focus, keyboard-aware.
- Consider a separate mobile-optimized layout rather than just media queries.
Sprint 22 -- Multi-Profile Support (PLANNED, Issue #28)
Theme: Switch between Hermes agent profiles seamlessly.
Track B: Features
- Profile picker. Sidebar or topbar dropdown to switch profiles.
- Per-profile config. Each profile has its own skills, memory, config.yaml.
- Seamless switching. No restart required.
Sprint 23 -- Desktop Application (PLANNED)
Theme: Native desktop experience.
Track B: Features
- Electron or Tauri wrapper. Native window, menu bar, notifications.
- Auto-start option. Launch on login.
- Packaged distribution. .dmg (macOS), .exe (Windows).
Sprint 24 -- Extended Command Support (PLANNED)
Theme: Deeper slash command and skill integration.
Track B: Features
- Skill-aware autocomplete.
/skill-nametriggers installed skills. - Command chaining. Compose multi-step commands.
- Agent tool exposure. Surface agent capabilities as slash commands.
Feature Parity Summary
Hermes CLI Parity (as of Sprint 19)
| CLI Feature | Status |
|---|---|
| Chat / agent loop | Done (v0.3) |
| Streaming responses | Done (v0.5) |
| Tool call visibility | Done (v0.11) |
| File ops (read/write/search/patch) | Done (v0.6) |
| Terminal commands | Done via workspace |
| Cron job management | Done (v0.9) |
| Skills management | Done (v0.9) |
| Memory read/write | Done (v0.9) |
| Session history | Done (v0.3) |
| Workspace switching | Done (v0.7) |
| Model selection | Done (v0.3) |
| Multi-provider model support | Done (Sprint 11) |
| Settings persistence | Done (Sprint 12) |
| Cron completion alerts | Done (Sprint 13) |
| Slash commands | Done (Sprint 17) |
| Thinking/reasoning display | Done (Sprint 18) |
| Auth / login | Done (Sprint 19) |
| Voice input | Sprint 20 |
| Subagent visibility | Deferred |
| Code execution (Jupyter) | Deferred |
| Toolset control | Deferred |
| Virtual scroll (perf) | Deferred |
Claude Parity (as of Sprint 19)
| Claude Feature | Status |
|---|---|
| Dark theme, 3-panel layout | Done (v0.1) |
| Streaming chat | Done (v0.5) |
| Model switching | Done (v0.3) |
| File attachments | Done (v0.6) |
| Syntax highlighting | Done (v0.10) |
| Tool use visibility | Done (v0.11) |
| Edit/regenerate messages | Done (v0.10) |
| Session management | Done (v0.6) |
| Mermaid diagrams | Done (Sprint 14) |
| Projects / folders | Done (Sprint 15) |
| Pinned/starred sessions | Done (Sprint 12) |
| Notifications | Done (Sprint 13) |
| Settings panel | Done (Sprint 12) |
| Reasoning display | Done (Sprint 18) |
| Auth / login | Done (Sprint 19) |
| Mobile layout (basic) | Done (v0.16.1) |
| Workspace tree view | Done (Sprint 18) |
| Slash commands | Done (Sprint 17) |
| Voice input | Sprint 20 |
| TTS playback | Sprint 20 |
| Artifacts (HTML/SVG preview) | Deferred |
| Code execution inline | Deferred |
| Mobile-optimized layout | Sprint 21 |
| Sharing / public URLs | Not planned (requires server infra) |
| Claude-specific features | Not replicable (Projects AI, artifacts sync) |
What is intentionally not planned
- Sharing / public conversation URLs: Requires a hosted backend with access control and CDN. Out of scope for a personal VPS deployment.
- Claude-specific model features: Claude-native Projects memory, extended artifacts sync, Anthropic's proprietary reasoning UI. These are Anthropic infrastructure, not reproducible.
- Real-time collaboration: Multiple users in the same session simultaneously. Single-user assumption throughout.
- Plugin marketplace: Hermes skills cover this use case already.
Last updated: April 3, 2026 Current version: v0.21 | 327 tests (304 passing) Next sprint: Sprint 20 (Voice + TTS)