docs: comprehensive update of all markdown files for v0.21

ARCHITECTURE.md:
- 6→7 JS modules (added commands.js), updated all line counts
- Added api/auth.py to file inventory
- Added HERMES_WEBUI_PASSWORD env var
- Added projects.json to state directory listing
- Replaced PORTABILITY.md ref with BUGS.md
- Updated test file references (test_sprint1-19, 327 functions)

ROADMAP.md:
- Version Sprint 17/v0.19 → Sprint 19/v0.21, test count 294→327
- Added Sprint 18 + 19 rows to sprint history table
- Updated architecture table (api/ 2491 lines, JS 3148 lines)
- Added sections: Workspace, Slash Commands, Security, Thinking
- Added Sprint 20-24 to Advanced/Future (voice, mobile, multi-profile,
  desktop, extended commands)

SPRINTS.md:
- Header v0.20→v0.21, 318→327 tests
- "Where we are now" updated from v0.18 to v0.21
- Removed two stale/duplicate "Sprint 18" sections (Voice + Subagent)
- Added completed Sprint 18 (thinking + tree + preview fix)
- Added completed Sprint 19 (auth + security)
- Added planned Sprints 20-24 (voice, mobile, multi-profile, desktop, commands)
- Parity tables fully updated with current Done/Deferred status

CHANGELOG.md:
- Added v0.21 Sprint 19 entry (auth, security headers, 20MB limit)

TESTING.md:
- Header "through Sprint 2" → "through Sprint 19 (v0.21)"
- Added test count and pytest command to header
- Added 9 new manual test sections covering Sprints 11-19
- Updated footer with current stats

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Nathan Esquenazi
2026-04-03 06:06:00 -07:00
parent b8b62722ec
commit 66bd84accb
5 changed files with 292 additions and 142 deletions


@@ -18,7 +18,7 @@ a central chat area, and a right panel for workspace file browsing.
 The design philosophy is deliberately minimal. There is no build step, no bundler, no
 frontend framework. The Python server is split into a routing shell (server.py) and
-business logic modules (api/). The frontend is six vanilla JS modules loaded from static/.
+business logic modules (api/). The frontend is seven vanilla JS modules loaded from static/.
 This makes the code easy to modify from a terminal or by an agent.
 ---
@@ -26,38 +26,40 @@ This makes the code easy to modify from a terminal or by an agent.
 ## 2. File Inventory
 <repo>/
-server.py Thin routing shell + HTTP Handler. ~76 lines. Pure Python.
+server.py Thin routing shell + HTTP Handler + auth middleware. ~79 lines.
 Delegates all route handling to api/routes.py.
 start.sh Discovery script: finds agent dir, Python, starts server.
 api/
 __init__.py Package marker
-routes.py All GET + POST route handlers (~1016 lines)
-config.py Shared configuration, constants, global state, model discovery (~640 lines)
-helpers.py HTTP helpers: j(), bad(), require(), safe_resolve() (~57 lines)
+auth.py Optional password authentication, signed cookies (~149 lines)
+routes.py All GET + POST route handlers (~1109 lines)
+config.py Shared configuration, constants, global state, model discovery (~654 lines)
+helpers.py HTTP helpers: j(), bad(), require(), safe_resolve(), security headers (~71 lines)
 models.py Session model + CRUD (~132 lines)
 workspace.py File ops: list_dir, read_file_content, workspace helpers (~77 lines)
 upload.py Multipart parser, file upload handler (~77 lines)
 streaming.py SSE engine, run_agent integration, cancel support (~222 lines)
 static/
 index.html HTML template (served from disk)
-style.css All CSS
-ui.js DOM helpers, renderMd, tool cards, model dropdown (~846 lines)
-workspace.js File tree, preview, file ops (~169 lines)
+style.css All CSS (~590 lines)
+ui.js DOM helpers, renderMd, tool cards, model dropdown, file tree (~957 lines)
+workspace.js File preview, file ops, loadDir, clearPreview (~185 lines)
 sessions.js Session CRUD, list rendering, search, SVG icons, overlay actions (~532 lines)
-messages.js send(), SSE event handlers, approval, transcript (~293 lines)
-panels.js Cron, skills, memory, workspace, todo, switchPanel (~771 lines)
-boot.js Event wiring + boot IIFE (~175 lines)
+messages.js send(), SSE event handlers, approval, transcript (~297 lines)
+panels.js Cron, skills, memory, workspace, todo, switchPanel, settings (~813 lines)
+commands.js Slash command registry, parser, autocomplete dropdown (~156 lines)
+boot.js Event wiring, keydown handlers, boot IIFE (~208 lines)
 tests/
 conftest.py Isolated test server (port 8788, separate HERMES_HOME) (~240 lines)
-test_sprint1-16.py Feature tests per sprint (14 files, Sprints 1-11 + 16)
-test_regressions.py Permanent regression gate
+test_sprint{1-19}.py Feature tests per sprint (17 files, 327 test functions)
+test_regressions.py Permanent regression gate (23 tests)
 AGENTS.md Instruction file for agents working in this directory.
 ROADMAP.md Feature and product roadmap document.
 SPRINTS.md Forward sprint plan with CLI + Claude parity targets.
 ARCHITECTURE.md THIS FILE.
 TESTING.md Manual browser test plan and automated coverage reference.
 CHANGELOG.md Release notes per sprint.
-PORTABILITY.md Portability design spec for download-and-run installs.
+BUGS.md Bug backlog and fixed items tracker.
 requirements.txt Python dependencies.
 .env.example Sample environment variable overrides.
@@ -67,7 +69,8 @@ State directory (runtime data, separate from source):
 sessions/ One JSON file per session: {session_id}.json
 workspaces.json Registered workspaces list
 last_workspace.txt Last-used workspace path
-settings.json (future) User settings
+settings.json User settings (default model, workspace, send key, password hash)
+projects.json Session project groups (name, color, id)
 Log file:
@@ -91,6 +94,7 @@ Environment variables controlling behavior:
 HERMES_WEBUI_STATE_DIR Where sessions/ folder lives
 HERMES_CONFIG_PATH Path to ~/.hermes/config.yaml
 HERMES_WEBUI_DEFAULT_MODEL Default LLM model string
+HERMES_WEBUI_PASSWORD Optional: enable password auth (off by default)
 Test isolation environment variables (set by conftest.py):


@@ -5,6 +5,33 @@
 ---
+## [v0.21] Sprint 19 -- Auth + Security Hardening
+*April 3, 2026 | 327 tests*
+### Features
+- **Password authentication (Issue #23).** Optional password auth, off by default.
+Enable via `HERMES_WEBUI_PASSWORD` env var or Settings panel. Password-only
+(single-user app). Signed HMAC HTTP-only cookie with 24h TTL. Minimal dark-themed
+login page at `/login`. API calls without auth return 401; page loads redirect.
+New `api/auth.py` module with hashing, verification, session management.
+- **Security headers.** All responses now include `X-Content-Type-Options: nosniff`,
+`X-Frame-Options: DENY`, `Referrer-Policy: same-origin`.
+- **POST body size limit.** Non-upload POST bodies capped at 20MB via `read_body()`.
+- **Settings panel additions.** "Access Password" field and "Sign Out" button
+(only visible when auth is active).
+### Architecture
+- New `api/auth.py`: password hashing (SHA-256 + STATE_DIR salt), signed cookies,
+auth middleware, public path allowlist.
+- Auth check in `server.py` do_GET/do_POST before routing.
+- `password_hash` added to `_SETTINGS_DEFAULTS`.
+### Tests
+- 9 new tests in `test_sprint19.py`: auth status, login flow, security headers,
+cache-control, settings password field. Total: **327 tests (304 passing)**.
+---
 ## [v0.20] Sprint 18 -- File Preview Auto-Close + Thinking Display + Workspace Tree
 *April 3, 2026 | 318 tests*
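
The v0.21 changelog entry above describes salted SHA-256 password hashing and a signed HMAC cookie with a 24h TTL, but the diff does not show `api/auth.py` itself. The following is only a minimal sketch of how those two pieces could fit together, assuming a cookie of the form `<expiry>.<signature>`; the function names, the cookie layout, and how the salt and signing secret are derived are all assumptions, not the module's actual code.

```python
import hashlib
import hmac
import time
from typing import Optional

COOKIE_TTL_SECONDS = 24 * 60 * 60  # 24h TTL, as stated in the changelog entry


def hash_password(password: str, salt: bytes) -> str:
    """Salted SHA-256 hex digest. How the salt is derived is an assumption."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


def verify_password(password: str, salt: bytes, stored_hash: str) -> bool:
    """Constant-time comparison against the stored hash."""
    candidate = hash_password(password, salt).encode("utf-8")
    return hmac.compare_digest(candidate, stored_hash.encode("utf-8"))


def _sign(payload: str, secret: bytes) -> str:
    return hmac.new(secret, payload.encode("ascii"), hashlib.sha256).hexdigest()


def sign_cookie(secret: bytes, now: Optional[float] = None) -> str:
    """Build a hypothetical '<expiry>.<hmac>' cookie value."""
    issued = time.time() if now is None else now
    payload = str(int(issued + COOKIE_TTL_SECONDS))
    return f"{payload}.{_sign(payload, secret)}"


def verify_cookie(value: str, secret: bytes, now: Optional[float] = None) -> bool:
    """Accept the cookie only if the signature matches and it has not expired."""
    payload, sep, sig = value.partition(".")
    if not sep or not (payload.isascii() and payload.isdigit()):
        return False
    expected = _sign(payload, secret)
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return False
    current = time.time() if now is None else now
    return int(payload) > current
```

In a handler, a value like this would be sent with the `HttpOnly` cookie attribute so that page scripts cannot read it, which matches the "HTTP-only" wording in the entry.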


@@ -3,8 +3,8 @@
 > Goal: Full 1:1 parity with the Hermes CLI experience via a clean dark web UI.
 > Everything you can do from the CLI terminal, you can do from this UI.
 >
-> Last updated: Sprint 17 / v0.19 (April 3, 2026)
-> Tests: 294 passing
+> Last updated: Sprint 19 / v0.21 (April 3, 2026)
+> Tests: 327 total (304 passing, 23 pre-existing failures)
 > Source: <repo>/
 ---
@@ -32,8 +32,10 @@
 | Sprint 13 | Alerts + polish | Cron completion alerts (polling + badge), background error banner, session duplicate, browser tab title | 221 |
 | Sprint 14 | Visual polish + workspace ops | Mermaid diagrams, message timestamps, file rename, folder create, session tags, session archive | 233 |
 | Sprint 15 | Session projects + code copy | Session projects/folders, code block copy button, tool card expand/collapse toggle | 237 |
-| Sprint 16 | Session sidebar visual polish | SVG action icons, overlay hover actions, pin indicator, project border, custom model discovery, GLM-5.1 | 237 |
-| Sprint 17 | Workspace polish + slash commands + settings | Breadcrumb navigation, slash command autocomplete, send key setting (#26) | 294 |
+| Sprint 16 | Session sidebar visual polish | SVG action icons, overlay hover actions, pin indicator, project border, safe HTML rendering | 289 |
+| Sprint 17 | Workspace polish + slash commands + settings | Breadcrumb navigation, slash command autocomplete, send key setting (#26) | 318 |
+| Sprint 18 | Thinking display + workspace tree | File preview auto-close, thinking/reasoning cards, expandable directory tree (#22) | 318 |
+| Sprint 19 | Auth + security hardening | Password auth (off by default), login page, security headers, 20MB body limit (#23) | 327 |
 ---
@@ -41,10 +43,10 @@
 | Layer | Location | Status |
 |-------|----------|--------|
-| Python server | <repo>/server.py (~76 lines) + api/ modules (~2145 lines) | Thin shell + business logic in api/ |
+| Python server | <repo>/server.py (~79 lines) + api/ modules (~2491 lines) | Thin shell + auth middleware + business logic in api/ |
 | HTML template | <repo>/static/index.html | Served from disk |
-| CSS | <repo>/static/style.css (~560 lines) | Served from disk |
-| JavaScript | <repo>/static/{ui,workspace,sessions,messages,panels,boot,commands}.js | 7 modules, ~2990 lines total |
+| CSS | <repo>/static/style.css (~590 lines) | Served from disk |
+| JavaScript | <repo>/static/{ui,workspace,sessions,messages,panels,boot,commands}.js | 7 modules, ~3148 lines total |
 | Runtime state | ~/.hermes/webui-mvp/sessions/ | Session JSON files |
 | Test server | Port 8788, state dir ~/.hermes/webui-mvp-test/ | Isolated, wiped per run |
 | Production server | Port 8787 | SSH tunnel from Mac |
@@ -149,22 +151,42 @@
 ### Configuration
 - [x] Settings panel (default model, default workspace) (Sprint 12)
+- [x] Send key preference (Enter or Ctrl+Enter) (Sprint 17)
+- [x] Password authentication (Sprint 19)
 - [ ] Enable/disable toolsets per session (deferred)
 ### Notifications
 - [x] Cron job completion alerts (Sprint 13)
 - [x] Background agent error alerts (Sprint 13)
+### Workspace
+- [x] Breadcrumb navigation in subdirectories (Sprint 17)
+- [x] Workspace tree view with expand/collapse (Sprint 18, Issue #22)
+- [x] File preview auto-close on directory navigation (Sprint 18)
+### Slash Commands
+- [x] Command registry + autocomplete dropdown (Sprint 17)
+- [x] Built-in: /help, /clear, /model, /workspace, /new (Sprint 17)
+### Security
+- [x] Password auth with signed cookies (Sprint 19, Issue #23)
+- [x] Security headers (X-Content-Type-Options, X-Frame-Options) (Sprint 19)
+- [x] POST body size limit (20MB) (Sprint 19)
+### Thinking / Reasoning
+- [x] Collapsible thinking cards for extended-thinking models (Sprint 18)
 ### Advanced / Future
-- [ ] Voice input via Whisper (Wave 6)
-- [ ] TTS playback of responses (Wave 6)
-- [ ] Subagent delegation cards (Wave 6)
+- [ ] Voice input via Whisper (Sprint 20)
+- [ ] TTS playback of responses (Sprint 20)
+- [ ] Subagent delegation cards (deferred)
 - [x] Background task cancel (activity bar Cancel button)
-- [ ] Code execution cell (Wave 6)
-- [ ] Password authentication (Wave 7)
-- [ ] HTTPS / reverse proxy (Wave 7)
-- [ ] Mobile responsive layout (Wave 7)
-- [ ] Virtual scroll for large lists (Wave 7)
+- [ ] Code execution cell (deferred)
+- [ ] Mobile responsive layout (Sprint 21)
+- [ ] Multi-profile support (Sprint 22, Issue #28)
+- [ ] Desktop application (Sprint 23)
+- [ ] Extended slash command / skill integration (Sprint 24)
+- [ ] Virtual scroll for large lists (deferred)
 ---
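
The Security items in this roadmap hunk (security headers and the 20MB POST body limit) could be sketched roughly as below. This is an illustration under stated assumptions, not the project's code: the real `read_body()` signature is not visible in this diff, `BodyTooLarge` and `apply_security_headers` are hypothetical names, and the header values are the ones quoted in the changelog portion of this commit.

```python
MAX_BODY_BYTES = 20 * 1024 * 1024  # 20MB cap for non-upload POST bodies

# Header values as listed in the v0.21 changelog entry.
SECURITY_HEADERS = {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Referrer-Policy": "same-origin",
}


class BodyTooLarge(Exception):
    """Hypothetical error type; a caller could map it to an HTTP 413 response."""


def read_body(rfile, content_length_header, limit=MAX_BODY_BYTES):
    """Read at most `limit` bytes, refusing oversized bodies before reading."""
    try:
        length = int(content_length_header or 0)
    except ValueError:
        length = 0  # treat a malformed Content-Length as an empty body
    if length > limit:
        raise BodyTooLarge(f"body of {length} bytes exceeds {limit} byte limit")
    return rfile.read(length) if length > 0 else b""


def apply_security_headers(handler):
    """Emit the headers on any object exposing send_header(name, value)."""
    for name, value in SECURITY_HEADERS.items():
        handler.send_header(name, value)
```

Checking the declared `Content-Length` before reading means an oversized request is rejected without buffering it, which is the point of the DoS fix described in the sprint notes.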


@@ -1,6 +1,6 @@
 # Hermes Web UI -- Forward Sprint Plan
-> Current state: v0.20 | 318 tests | Daily driver ready
+> Current state: v0.21 | 327 tests (304 passing) | Daily driver ready
 > This document plans the path from here to two targets:
 >
 > Target A: 1:1 feature parity with the Hermes CLI (everything you can do from the
@@ -14,17 +14,19 @@
 ---
-## Where we are now (v0.18)
-**CLI parity: ~85% complete.** Core agent loop, all tools visible, workspace
-file ops, cron/skills/memory CRUD, session management, streaming, cancel,
-multi-provider models, custom endpoint discovery -- all solid. Gaps are
-subagent visibility, toolset control, and code execution.
-**Claude parity: ~65% complete.** Chat, streaming, file browser, session
+## Where we are now (v0.21)
+**CLI parity: ~90% complete.** Core agent loop, all tools visible, workspace
+file ops with tree view, cron/skills/memory CRUD, session management, streaming,
+cancel, multi-provider models, custom endpoint discovery, slash commands,
+thinking/reasoning display, password auth -- all solid. Gaps are subagent
+visibility, toolset control, and code execution.
+**Claude parity: ~70% complete.** Chat, streaming, file browser, session
 management, tool cards, syntax highlighting, model switching, projects,
-settings, Mermaid diagrams, mobile layout -- all present. Gaps are
-artifacts, voice, reasoning display, sharing.
+settings, Mermaid diagrams, mobile layout, breadcrumb workspace nav, slash
+commands, thinking display, auth -- all present. Gaps are artifacts, voice,
+TTS, sharing, mobile-optimized layout.
 ---
@@ -323,122 +325,136 @@ handler for slash command autocomplete.
 ---
-## Sprint 18 -- Voice + Multimodal Input
-**Theme:** Input beyond the keyboard.
-**Why now:** Voice is a meaningful quality-of-life feature for longer sessions
-and is achievable with Whisper. Image input closes the last modality gap with
-Claude (Claude accepts image paste natively -- we do too, but only as
-file uploads, not clipboard screenshots into the conversation directly).
+## Sprint 18 -- Thinking Display + Workspace Tree + Preview Fix (COMPLETED)
+**Theme:** Show the model's reasoning, improve workspace navigation, fix UX bug.
+**Why now:** Thinking/reasoning display was deferred twice (Sprint 16 → 17 → 18).
+Workspace tree view was the #1 community request (Issue #22). File preview
+staying open on directory navigation was a daily-driver annoyance.
 ### Track A: Bugs
-- Image paste currently requires a click-to-attach flow. Direct paste into the
-message textarea should embed the image inline (as a preview chip) and queue
-it for upload on Send. (Partially works -- clean up edge cases.)
-- Large image uploads (>5MB) time out the upload step silently.
+- **File preview auto-close.** When viewing a file in the right panel and
+navigating directories (breadcrumbs, up button, folder clicks), the preview
+stayed visible with stale content. Fix: extracted `clearPreview()` as a named
+function in boot.js and call it from `loadDir()` in workspace.js.
 ### Track B: Features
-- **Voice input (Whisper):** A microphone icon in the composer. Hold to record,
-release to transcribe via `POST /api/transcribe` (calls local Whisper or
-OpenAI Whisper API). Transcribed text appears in the message input, editable
-before send. Supports the full "voice -> text -> Hermes response" loop.
-- **TTS playback:** A speaker icon on assistant messages. Calls a TTS endpoint
-(ElevenLabs or OpenAI TTS) and plays the audio. Toggle per-message. Optional
-auto-play mode in settings.
-- **Vision input improvements:** Paste a screenshot directly from clipboard into
-the conversation (not just the tray). Shows as an inline preview chip with
-the image thumbnail. On Send, uploads and includes in the message.
-### Track C: Architecture
-- Audio pipeline: `POST /api/transcribe` streams audio bytes, returns transcript.
-`GET /api/tts?text=...` returns audio/mpeg. Both use lazy import of Whisper
-and TTS libraries to keep cold start fast.
-**Tests:** ~12 new. Total: ~271.
-**Hermes CLI parity impact:** Medium (voice not in CLI, but adds capability)
-**Claude parity impact:** High (Claude has native voice mode)
+- **Thinking/reasoning display.** Assistant messages with structured content
+arrays containing `type:'thinking'` or `type:'reasoning'` blocks now render
+as collapsible gold-themed cards above the response text. Collapsed by
+default, click header to expand. Works with Claude extended thinking and
+o3 reasoning tokens when preserved in the message array.
+- **Workspace tree view (Issue #22).** Directories expand/collapse in-place
+with toggle arrows. Single-click toggles, double-click navigates (breadcrumb
+view). Subdirectory contents fetched lazily and cached in `S._dirCache`.
+Nesting depth shown via indentation. Empty directories show "(empty)".
+**Tests:** 0 new (pure CSS/DOM changes). Total: 318.
+**Hermes CLI parity impact:** Low
+**Claude parity impact:** High (reasoning display matches Claude's UI)
 ---
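
The Sprint 18 notes above say that assistant messages whose content is an array with `type:'thinking'` or `type:'reasoning'` blocks render those blocks as cards above the response text. The actual rendering lives in the frontend JavaScript, which this diff does not show; the Python sketch below only illustrates the underlying split of a content array. The function name and the assumption that a block stores its text either under a key named after its type or under `"text"` are hypothetical.

```python
def split_thinking(content):
    """Return (thinking_blocks, visible_text) for a message's content.

    `content` may be a plain string or a list of dict blocks. Blocks typed
    'thinking' or 'reasoning' are collected separately; 'text' blocks are
    concatenated into the visible response. Other block types are ignored.
    """
    if isinstance(content, str):
        return [], content

    thinking, text_parts = [], []
    for block in content:
        if not isinstance(block, dict):
            continue
        kind = block.get("type")
        if kind in ("thinking", "reasoning"):
            # Assumed layout: text stored under the type's own key, else "text".
            thinking.append(block.get(kind) or block.get("text") or "")
        elif kind == "text":
            text_parts.append(block.get("text", ""))
    return thinking, "".join(text_parts)
```

A renderer built on this split would show each entry of the first return value as a collapsed card and the second as the normal message body.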
-## Sprint 18 -- Subagent Visibility + Agentic Transparency
-**Theme:** Watch Hermes think, not just respond.
-**Why now:** When Hermes delegates to subagents (delegate_task, spawns parallel
-workstreams), the UI shows nothing. On long multi-agent tasks you have no idea
-what's happening. This is the last major "CLI feels better" gap for power users.
+## Sprint 19 -- Auth + Security Hardening (COMPLETED)
+**Theme:** Make this safe to leave running beyond localhost.
+**Why now:** Issue #23 requested authentication. Auth is the last production
+hardening feature before the app is safe to expose to a network.
 ### Track A: Bugs
-- Tool cards for delegate_task show no information about what the subagent was
-asked to do or what it returned.
-- The activity bar text truncates at 55 chars -- tool previews for long terminal
-commands show nothing useful.
+- **No request size limit.** POST bodies were unbounded (DoS risk). Added 20MB
+cap in `read_body()`.
 ### Track B: Features
-- **Subagent delegation cards:** When `delegate_task` fires, show an expandable
-card with the subagent's goal, status (pending/running/done), and result
-summary. Multiple subagents from one call appear as a card group. Uses the
-existing tool card infrastructure.
-- **Background task monitor:** A "Tasks" indicator in the topbar (separate from
-the cron Tasks panel). Shows count of active agent threads. Click opens a
-popover listing all in-flight streams with session names and elapsed times.
-Cancel any individual thread. This is the full job queue visibility the CLI
-implicitly has via `ps aux`.
-- **Thinking/reasoning display:** When the model emits reasoning tokens (o3,
-Claude extended thinking), show them in a collapsible "Reasoning" card above
-the response. Collapsed by default. This matches Claude's reasoning display.
+- **Password authentication (Issue #23).** Off by default -- zero friction for
+localhost. Enable via `HERMES_WEBUI_PASSWORD` env var or Settings panel.
+Password-only (no username -- single-user app). Signed HMAC HTTP-only cookie
+with 24h TTL. Minimal dark-themed login page at `/login`. API calls without
+auth return 401; page loads redirect to `/login`. Settings panel gains
+"Access Password" field and "Sign Out" button.
+- **Security headers.** All responses now include `X-Content-Type-Options: nosniff`,
+`X-Frame-Options: DENY`, `Referrer-Policy: same-origin`.
 ### Track C: Architecture
-- Task registry: extend STREAMS to include session name, start time, and task
-description. New `GET /api/tasks/active` endpoint returns all running streams
-with metadata.
+- New `api/auth.py` module: password hashing (SHA-256 + STATE_DIR salt), signed
+session cookies, auth middleware, public path allowlist.
+- Auth check in `server.py` do_GET/do_POST before routing.
+- `password_hash` added to `_SETTINGS_DEFAULTS` in config.py.
+- `_set_password` special field in save_settings for secure password updates.
-**Tests:** ~14 new. Total: ~285.
-**Hermes CLI parity impact:** Very High (subagent and task visibility is the
-last major CLI gap)
-**Claude parity impact:** High (Claude shows reasoning, tool use visibly)
+**Tests:** 9 new. Total: 327.
+**Hermes CLI parity impact:** Low (CLI has no auth concerns)
+**Claude parity impact:** High (Claude is authenticated)
 ---
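
The Sprint 19 notes describe an auth check that runs in `do_GET`/`do_POST` before routing, with a public path allowlist, 401 for unauthenticated API calls, and a redirect to `/login` for page loads. One way such a gate might be expressed is sketched below. The allowlist contents, the `/api/` prefix test, the 302 status, and the function name are assumptions made for illustration; the real middleware is not shown in this diff.

```python
# Hypothetical allowlist: paths reachable without a valid auth cookie.
PUBLIC_PATHS = frozenset({"/login", "/health"})


def auth_decision(path, auth_enabled, authenticated):
    """Decide what to do with a request before routing.

    Returns a (status, location) tuple:
      (200, None)      -> let the request through to normal routing
      (401, None)      -> unauthenticated API call
      (302, "/login")  -> unauthenticated page load, redirect to login
    """
    if not auth_enabled or authenticated or path in PUBLIC_PATHS:
        return (200, None)
    if path.startswith("/api/"):
        return (401, None)
    return (302, "/login")
```

Keeping the decision in a pure function like this makes the "off by default" behaviour easy to test: with `auth_enabled=False` every path is allowed and no cookie is ever inspected.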
-## Sprint 19 -- Auth, HTTPS, and Production Hardening
-**Theme:** Make this safe to leave running.
-**Why now:** Everything else is done. This is the sprint you run when you want
-to expose the UI beyond localhost -- to a team, a mobile device, or a public
-address.
-### Track A: Bugs
-- Server has no request size limit on non-upload endpoints (potential DoS).
-- Session JSON files have no size cap (a runaway agent could write GBs).
+## Sprint 20 -- Voice + TTS (PLANNED)
+**Theme:** Input and output beyond the keyboard.
+**Why now:** Voice works in the Hermes CLI. Mirror that capability in the web UI.
+TTS playback makes long responses more accessible. Both are achievable with
+existing Whisper and TTS APIs.
 ### Track B: Features
-- **Password authentication:** A login page with a configurable password
-(HERMES_WEBUI_PASSWORD env var). Signed cookie session (24h expiry).
-Single-user model -- no accounts, no registration.
-- **HTTPS / reverse proxy guide:** A one-page `DEPLOY.md` with instructions
-for running behind nginx + Let's Encrypt on a VPS. Configuration snippets
-for systemd service, nginx config, certbot.
-- **Mobile responsive layout:** Collapsible sidebar (hamburger). Touch-friendly
-session list (swipe to delete, tap to navigate). Composer expands on focus.
-Right panel hidden by default on mobile, accessible via a Files tab.
-- **Rate limiting:** Simple per-IP token bucket on the chat/start endpoint
-(configurable, default 10 req/min) to prevent accidental hammering.
-### Track C: Architecture
-- Helmet headers: X-Content-Type-Options, X-Frame-Options, HSTS (when served
-over HTTPS). Simple middleware in the Handler.
-**Tests:** ~12 new. Total: ~297.
-**Hermes CLI parity impact:** Low (CLI has no auth/HTTPS concerns)
-**Claude parity impact:** Very High (Claude is authenticated, HTTPS only)
+- **Voice input (Whisper).** Microphone icon in composer. Hold to record,
+release to transcribe. Transcribed text editable before send.
+- **TTS playback.** Speaker icon on assistant messages. Audio playback via
+OpenAI TTS or ElevenLabs API. Optional auto-play in settings.
+---
+## Sprint 21 -- Mobile Responsive (PLANNED)
+**Theme:** A genuinely good mobile experience, not just responsive CSS.
+### Track B: Features
+- **Collapsible sidebar.** Hamburger menu replaces the always-visible sidebar.
+- **Touch-friendly session list.** Tap to navigate, swipe gestures.
+- **Right panel as tab.** Files panel hidden by default, accessible via tab.
+- **Composer focus behavior.** Expands on focus, keyboard-aware.
+- Consider a separate mobile-optimized layout rather than just media queries.
+---
+## Sprint 22 -- Multi-Profile Support (PLANNED, Issue #28)
+**Theme:** Switch between Hermes agent profiles seamlessly.
+### Track B: Features
+- **Profile picker.** Sidebar or topbar dropdown to switch profiles.
+- **Per-profile config.** Each profile has its own skills, memory, config.yaml.
+- **Seamless switching.** No restart required.
+---
+## Sprint 23 -- Desktop Application (PLANNED)
+**Theme:** Native desktop experience.
+### Track B: Features
+- **Electron or Tauri wrapper.** Native window, menu bar, notifications.
+- **Auto-start option.** Launch on login.
+- **Packaged distribution.** .dmg (macOS), .exe (Windows).
+---
+## Sprint 24 -- Extended Command Support (PLANNED)
+**Theme:** Deeper slash command and skill integration.
+### Track B: Features
+- **Skill-aware autocomplete.** `/skill-name` triggers installed skills.
+- **Command chaining.** Compose multi-step commands.
+- **Agent tool exposure.** Surface agent capabilities as slash commands.
 ---
 ## Feature Parity Summary
-### After Sprint 18 (Hermes CLI parity: complete)
+### Hermes CLI Parity (as of Sprint 19)
 | CLI Feature | Status |
 |-------------|--------|
@@ -454,15 +470,18 @@ address.
 | Workspace switching | Done (v0.7) |
 | Model selection | Done (v0.3) |
 | Multi-provider model support | Done (Sprint 11) |
-| Toolset control | Sprint 12 |
 | Settings persistence | Done (Sprint 12) |
-| Subagent visibility | Sprint 18 |
-| Background task monitor | Sprint 18 |
-| Code execution (Jupyter) | Sprint 17+ |
 | Cron completion alerts | Done (Sprint 13) |
+| Slash commands | Done (Sprint 17) |
+| Thinking/reasoning display | Done (Sprint 18) |
+| Auth / login | Done (Sprint 19) |
+| Voice input | Sprint 20 |
+| Subagent visibility | Deferred |
+| Code execution (Jupyter) | Deferred |
+| Toolset control | Deferred |
 | Virtual scroll (perf) | Deferred |
-### After Sprint 19 (Claude parity: ~90% complete)
+### Claude Parity (as of Sprint 19)
 | Claude Feature | Status |
 |----------------|--------|
@@ -474,19 +493,21 @@ address.
 | Tool use visibility | Done (v0.11) |
 | Edit/regenerate messages | Done (v0.10) |
 | Session management | Done (v0.6) |
-| Artifacts (HTML/SVG preview) | Sprint 17+ |
-| Code execution inline | Sprint 17+ |
 | Mermaid diagrams | Done (Sprint 14) |
 | Projects / folders | Done (Sprint 15) |
 | Pinned/starred sessions | Done (Sprint 12) |
-| Reasoning display | Sprint 18 |
-| Voice input | Sprint 17 |
-| TTS playback | Sprint 17 |
 | Notifications | Done (Sprint 13) |
 | Settings panel | Done (Sprint 12) |
-| Auth / login | Sprint 19 |
-| HTTPS | Sprint 19 |
-| Mobile layout | Done (v0.16.1) |
+| Reasoning display | Done (Sprint 18) |
+| Auth / login | Done (Sprint 19) |
+| Mobile layout (basic) | Done (v0.16.1) |
+| Workspace tree view | Done (Sprint 18) |
+| Slash commands | Done (Sprint 17) |
+| Voice input | Sprint 20 |
+| TTS playback | Sprint 20 |
+| Artifacts (HTML/SVG preview) | Deferred |
+| Code execution inline | Deferred |
+| Mobile-optimized layout | Sprint 21 |
 | Sharing / public URLs | Not planned (requires server infra) |
 | Claude-specific features | Not replicable (Projects AI, artifacts sync) |
@@ -504,5 +525,5 @@ address.
 ---
 *Last updated: April 3, 2026*
-*Current version: v0.19 | 318 tests*
-*Next sprint: Sprint 18 (Voice + Multimodal Input)*
+*Current version: v0.21 | 327 tests (304 passing)*
+*Next sprint: Sprint 20 (Voice + TTS)*


@@ -1,12 +1,15 @@
 # Hermes Web UI: Browser Testing Plan
 > This document is for manual browser testing by you or by a Claude browser agent.
-> It covers every user-facing feature of the UI through Sprint 2.
+> It covers user-facing features of the UI through Sprint 19 (v0.21).
 > Each section is written as a step-by-step test procedure with expected outcomes.
 > A browser agent (e.g. Claude with Chrome access) can execute this plan directly.
 >
 > Prerequisites: SSH tunnel is active on port 8787. Open http://localhost:8787 in browser.
 > Server health check: curl http://127.0.0.1:8787/health should return {"status":"ok"}.
+>
+> Automated tests: 327 total (304 passing, 23 pre-existing failures).
+> Run: `pytest tests/ -v --timeout=60`
 ---
@@ -1593,8 +1596,81 @@ FAIL: User message gone, blank chat, response lands in wrong session.
 ---
-*Last updated: Post-Sprint 10 concurrency sweeps, March 31, 2026*
-*Total automated tests: 190/190*
-*Regression gate: tests/test_regressions.py (23 tests, one per introduced bug)*
-*Run: python -m pytest tests/ -v*
+---
+## Sections Added Post-Sprint 10 (Sprints 11-19)
+The following features were added in Sprints 11-19 and need manual browser testing.
+Each has automated API-level tests in `tests/test_sprint{N}.py`.
+### Sprint 11: Multi-Provider Models
+- Open model dropdown. Verify models grouped by provider (OpenAI, Anthropic, Google, etc.)
+- If custom `base_url` configured in config.yaml, verify local models appear in dropdown.
+- Switch model. Send a message. Verify response uses selected model.
+### Sprint 12: Settings + Pin + Import
+- Click gear icon. Settings overlay opens.
+- Change default model, save. Restart server. Verify setting persisted.
+- Pin a session (star icon in hover overlay). Verify it floats to top of list.
+- Export session as JSON. Import it back. Verify messages restored.
+### Sprint 13: Alerts + Session QoL
+- Duplicate a session (copy icon in hover overlay). Verify "(copy)" title.
+- Browser tab title updates to active session name. Switch sessions -- title changes.
+### Sprint 14: Visual Polish + Workspace Ops
+- Create a mermaid code block in a response. Verify diagram renders inline.
+- Message timestamps visible next to role labels (hover for full date).
+- Double-click a file in workspace panel to rename. Enter saves, Escape cancels.
+- Create a folder via folder icon in workspace header.
+- Add `#tag` to session title. Verify tag chip appears in sidebar. Click to filter.
+- Archive a session. Verify it disappears. Toggle "Show archived" to see it.
+### Sprint 15: Session Projects
+- Click "+" in project bar to create a project. Type name, Enter.
+- Click a project chip to filter sessions.
+- Hover a session → click folder icon → assign to project via picker.
+- Verify colored left border appears on assigned session.
+- Double-click project chip to rename. Right-click to delete.
+- Code blocks have a "Copy" button. Click → "Copied!" feedback.
+- Messages with 2+ tool cards show "Expand all / Collapse all" toggle.
+### Sprint 16: Sidebar Visual Polish
+- Session titles use full sidebar width (no truncated space for hidden icons).
+- Hover a session → action buttons appear from right with gradient fade.
+- All icons are monochrome SVGs (not emoji). Consistent across platforms.
- Pinned sessions show small gold star inline. Unpinned = no star, full title width.
- Active session has gold highlight (not blue). Overlay gradient matches.
- Double-click to rename → overlay hides during rename.
### Sprint 17: Workspace + Slash Commands + Send Key
- Navigate into a subdirectory. Breadcrumb bar appears with clickable segments.
- Up button in panel header navigates to parent. Hidden at root.
- Type `/` in composer → autocomplete dropdown appears. Arrow keys navigate.
- Type `/help` → lists all commands. `/clear` clears conversation. `/model` switches.
- Settings panel: change send key to Ctrl+Enter. Verify Enter inserts newline.
### Sprint 18: Thinking + Tree View + Preview Fix
- View a file in workspace. Click a breadcrumb or folder → preview closes automatically.
- Click a directory toggle arrow (▸) → expands in-place showing children.
- Click again (▾) → collapses. Double-click navigates into it (breadcrumb view).
- If model returns thinking blocks (Claude extended thinking), verify collapsible gold card appears above response.
### Sprint 19: Auth + Security
- No password set: everything works as normal. No login page.
- Set `HERMES_WEBUI_PASSWORD=test` env var. Restart. All pages redirect to `/login`.
- Login page: minimal card, password field, "Sign in" button.
- Enter correct password → redirected to `/`. Cookie set (24h).
- Enter wrong password → error message, stay on login page.
- Settings panel: set password via "Access Password" field. Auth activates.
- "Sign Out" button visible when auth active. Click → redirected to /login.
- API calls without auth cookie → 401 JSON response.
- Check response headers: `X-Content-Type-Options: nosniff`, `X-Frame-Options: DENY`.
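
The response-header check in the last step above can be automated once the headers have been captured. This is a rough sketch, assuming only the two header values named in that step. `EXPECTED_HEADERS` and `missing_security_headers` are hypothetical names for illustration, not part of the repo's test suite.

```python
from __future__ import annotations

from typing import Mapping

# Header values taken from the Sprint 19 checklist above.
EXPECTED_HEADERS = {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
}


def missing_security_headers(headers: Mapping[str, str]) -> list[str]:
    """Return the expected header names that are absent or carry a different value.

    Header names are compared case-insensitively, as HTTP requires;
    values are compared case-insensitively after trimming whitespace.
    """
    normalized = {name.lower(): value.strip().lower() for name, value in headers.items()}
    return [
        name
        for name, value in EXPECTED_HEADERS.items()
        if normalized.get(name.lower()) != value.lower()
    ]
```

Any mapping of header names to values works as input, for example `dict(resp.headers.items())` from a `urllib.request.urlopen` response. An empty return list means both headers are present with the expected values.
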
---
*Last updated: Sprint 19 / v0.21, April 3, 2026*
*Total automated tests: 327 (304 passing, 23 pre-existing failures in Sprint 3/5/7)*
*Regression gate: tests/test_regressions.py (23 tests)*
*Run: pytest tests/ -v --timeout=60*
*Source: <repo>/*