docs: comprehensive update of all markdown files for v0.21

ARCHITECTURE.md:
- 6→7 JS modules (added commands.js), updated all line counts
- Added api/auth.py to file inventory
- Added HERMES_WEBUI_PASSWORD env var
- Added projects.json to state directory listing
- Replaced PORTABILITY.md ref with BUGS.md
- Updated test file references (test_sprint1-19, 327 functions)

ROADMAP.md:
- Version Sprint 17/v0.19 → Sprint 19/v0.21, test count 294→327
- Added Sprint 18 + 19 rows to sprint history table
- Updated architecture table (api/ 2491 lines, JS 3148 lines)
- Added sections: Workspace, Slash Commands, Security, Thinking
- Added Sprint 20-24 to Advanced/Future (voice, mobile, multi-profile, desktop, extended commands)

SPRINTS.md:
- Header v0.20→v0.21, 318→327 tests
- "Where we are now" updated from v0.18 to v0.21
- Removed two stale/duplicate "Sprint 18" sections (Voice + Subagent)
- Added completed Sprint 18 (thinking + tree + preview fix)
- Added completed Sprint 19 (auth + security)
- Added planned Sprints 20-24 (voice, mobile, multi-profile, desktop, commands)
- Parity tables fully updated with current Done/Deferred status

CHANGELOG.md:
- Added v0.21 Sprint 19 entry (auth, security headers, 20MB limit)

TESTING.md:
- Header "through Sprint 2" → "through Sprint 19 (v0.21)"
- Added test count and pytest command to header
- Added 9 new manual test sections covering Sprints 11-19
- Updated footer with current stats

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 06:06:00 -07:00
parent b8b62722ec
commit 66bd84accb
5 changed files with 292 additions and 142 deletions
--- a/ARCHITECTURE.md
+++ b/ARCHITECTURE.md
@@ -18,7 +18,7 @@ a central chat area, and a right panel for workspace file browsing.
 The design philosophy is deliberately minimal. There is no build step, no bundler, no
 frontend framework. The Python server is split into a routing shell (server.py) and
-business logic modules (api/). The frontend is six vanilla JS modules loaded from static/.
+business logic modules (api/). The frontend is seven vanilla JS modules loaded from static/.
 This makes the code easy to modify from a terminal or by an agent.
 ---
@@ -26,38 +26,40 @@ This makes the code easy to modify from a terminal or by an agent.
 ## 2. File Inventory
    <repo>/
-    server.py              Thin routing shell + HTTP Handler. ~76 lines. Pure Python.
+    server.py              Thin routing shell + HTTP Handler + auth middleware. ~79 lines.
                           Delegates all route handling to api/routes.py.
    start.sh               Discovery script: finds agent dir, Python, starts server.
    api/
      __init__.py          Package marker
-      routes.py            All GET + POST route handlers (~1016 lines)
+      auth.py              Optional password authentication, signed cookies (~149 lines)
-      config.py            Shared configuration, constants, global state, model discovery (~640 lines)
+      routes.py            All GET + POST route handlers (~1109 lines)
-      helpers.py           HTTP helpers: j(), bad(), require(), safe_resolve() (~57 lines)
+      config.py            Shared configuration, constants, global state, model discovery (~654 lines)
      helpers.py           HTTP helpers: j(), bad(), require(), safe_resolve(), security headers (~71 lines)
      models.py            Session model + CRUD (~132 lines)
      workspace.py         File ops: list_dir, read_file_content, workspace helpers (~77 lines)
      upload.py            Multipart parser, file upload handler (~77 lines)
      streaming.py         SSE engine, run_agent integration, cancel support (~222 lines)
    static/
      index.html           HTML template (served from disk)
-      style.css            All CSS
+      style.css            All CSS (~590 lines)
-      ui.js                DOM helpers, renderMd, tool cards, model dropdown (~846 lines)
+      ui.js                DOM helpers, renderMd, tool cards, model dropdown, file tree (~957 lines)
-      workspace.js         File tree, preview, file ops (~169 lines)
+      workspace.js         File preview, file ops, loadDir, clearPreview (~185 lines)
      sessions.js          Session CRUD, list rendering, search, SVG icons, overlay actions (~532 lines)
-      messages.js          send(), SSE event handlers, approval, transcript (~293 lines)
+      messages.js          send(), SSE event handlers, approval, transcript (~297 lines)
-      panels.js            Cron, skills, memory, workspace, todo, switchPanel (~771 lines)
+      panels.js            Cron, skills, memory, workspace, todo, switchPanel, settings (~813 lines)
-      boot.js              Event wiring + boot IIFE (~175 lines)
+      commands.js          Slash command registry, parser, autocomplete dropdown (~156 lines)
      boot.js              Event wiring, keydown handlers, boot IIFE (~208 lines)
    tests/
      conftest.py          Isolated test server (port 8788, separate HERMES_HOME) (~240 lines)
-      test_sprint1-16.py   Feature tests per sprint (14 files, Sprints 1-11 + 16)
+      test_sprint{1-19}.py Feature tests per sprint (17 files, 327 test functions)
-      test_regressions.py  Permanent regression gate
+      test_regressions.py  Permanent regression gate (23 tests)
    AGENTS.md              Instruction file for agents working in this directory.
    ROADMAP.md             Feature and product roadmap document.
    SPRINTS.md             Forward sprint plan with CLI + Claude parity targets.
    ARCHITECTURE.md        THIS FILE.
    TESTING.md             Manual browser test plan and automated coverage reference.
    CHANGELOG.md           Release notes per sprint.
-    PORTABILITY.md         Portability design spec for download-and-run installs.
+    BUGS.md                Bug backlog and fixed items tracker.
    requirements.txt       Python dependencies.
    .env.example           Sample environment variable overrides.
@@ -67,7 +69,8 @@ State directory (runtime data, separate from source):
    sessions/          One JSON file per session: {session_id}.json
    workspaces.json    Registered workspaces list
    last_workspace.txt Last-used workspace path
-    settings.json      (future) User settings
+    settings.json      User settings (default model, workspace, send key, password hash)
    projects.json      Session project groups (name, color, id)
 Log file:
@@ -91,6 +94,7 @@ Environment variables controlling behavior:
    HERMES_WEBUI_STATE_DIR         Where sessions/ folder lives
    HERMES_CONFIG_PATH             Path to ~/.hermes/config.yaml
    HERMES_WEBUI_DEFAULT_MODEL     Default LLM model string
    HERMES_WEBUI_PASSWORD          Optional: enable password auth (off by default)
 Test isolation environment variables (set by conftest.py):
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,33 @@
 ---
 ## [v0.21] Sprint 19 -- Auth + Security Hardening
 *April 3, 2026 | 327 tests*
 ### Features
 - **Password authentication (Issue #23).** Optional password auth, off by default.
  Enable via `HERMES_WEBUI_PASSWORD` env var or Settings panel. Password-only
  (single-user app). Signed HMAC HTTP-only cookie with 24h TTL. Minimal dark-themed
  login page at `/login`. API calls without auth return 401; page loads redirect.
  New `api/auth.py` module with hashing, verification, session management.
 - **Security headers.** All responses now include `X-Content-Type-Options: nosniff`,
  `X-Frame-Options: DENY`, `Referrer-Policy: same-origin`.
 - **POST body size limit.** Non-upload POST bodies capped at 20MB via `read_body()`.
 - **Settings panel additions.** "Access Password" field and "Sign Out" button
  (only visible when auth is active).
 ### Architecture
 - New `api/auth.py`: password hashing (SHA-256 + STATE_DIR salt), signed cookies,
  auth middleware, public path allowlist.
 - Auth check in `server.py` do_GET/do_POST before routing.
 - `password_hash` added to `_SETTINGS_DEFAULTS`.
 ### Tests
 - 9 new tests in `test_sprint19.py`: auth status, login flow, security headers,
  cache-control, settings password field. Total: **327 tests (304 passing)**.
 ---
 ## [v0.20] Sprint 18 -- File Preview Auto-Close + Thinking Display + Workspace Tree
 *April 3, 2026 | 318 tests*
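Reviewer note on the v0.21 entry above: the changelog describes salted SHA-256 password hashing and a signed HMAC cookie with a 24h TTL, but this diff does not include `api/auth.py` itself. The following is only a minimal sketch of how such a scheme could look. The function names, the cookie layout (`<issued_at>.<signature>`), and the way the salt and secret are obtained are all assumptions for illustration, not the project's actual code.

```python
import hashlib
import hmac

COOKIE_TTL_SECONDS = 24 * 60 * 60  # 24h TTL, as stated in the changelog entry


def hash_password(password: str, salt: bytes) -> str:
    """Salted SHA-256 hex digest. How the salt is derived is an assumption."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


def sign_cookie(secret: bytes, issued_at: int) -> str:
    """Return '<issued_at>.<hex hmac>' (hypothetical layout)."""
    payload = str(issued_at)
    sig = hmac.new(secret, payload.encode("ascii"), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"


def verify_cookie(secret: bytes, cookie: str, now: int) -> bool:
    """Accept the cookie only if the signature matches and it has not expired."""
    payload, sep, sig = cookie.partition(".")
    if not sep or not (payload.isascii() and payload.isdigit()):
        return False
    expected = hmac.new(secret, payload.encode("ascii"), hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids timing leaks and type errors.
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("ascii")):
        return False
    return 0 <= now - int(payload) < COOKIE_TTL_SECONDS
```

Passing `now` explicitly keeps the expiry check easy to test; a real handler would presumably pass `int(time.time())`.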
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -3,8 +3,8 @@
 > Goal: Full 1:1 parity with the Hermes CLI experience via a clean dark web UI.
 > Everything you can do from the CLI terminal, you can do from this UI.
 >
-> Last updated: Sprint 17 / v0.19 (April 3, 2026)
+> Last updated: Sprint 19 / v0.21 (April 3, 2026)
-> Tests: 294 passing
+> Tests: 327 total (304 passing, 23 pre-existing failures)
 > Source: <repo>/
 ---
@@ -32,8 +32,10 @@
 | Sprint 13 | Alerts + polish | Cron completion alerts (polling + badge), background error banner, session duplicate, browser tab title | 221 |
 | Sprint 14 | Visual polish + workspace ops | Mermaid diagrams, message timestamps, file rename, folder create, session tags, session archive | 233 |
 | Sprint 15 | Session projects + code copy | Session projects/folders, code block copy button, tool card expand/collapse toggle | 237 |
-| Sprint 16 | Session sidebar visual polish | SVG action icons, overlay hover actions, pin indicator, project border, custom model discovery, GLM-5.1 | 237 |
+| Sprint 16 | Session sidebar visual polish | SVG action icons, overlay hover actions, pin indicator, project border, safe HTML rendering | 289 |
-| Sprint 17 | Workspace polish + slash commands + settings | Breadcrumb navigation, slash command autocomplete, send key setting (#26) | 294 |
+| Sprint 17 | Workspace polish + slash commands + settings | Breadcrumb navigation, slash command autocomplete, send key setting (#26) | 318 |
 | Sprint 18 | Thinking display + workspace tree | File preview auto-close, thinking/reasoning cards, expandable directory tree (#22) | 318 |
 | Sprint 19 | Auth + security hardening | Password auth (off by default), login page, security headers, 20MB body limit (#23) | 327 |
 ---
@@ -41,10 +43,10 @@
 | Layer | Location | Status |
 |-------|----------|--------|
-| Python server | <repo>/server.py (~76 lines) + api/ modules (~2145 lines) | Thin shell + business logic in api/ |
+| Python server | <repo>/server.py (~79 lines) + api/ modules (~2491 lines) | Thin shell + auth middleware + business logic in api/ |
 | HTML template | <repo>/static/index.html | Served from disk |
-| CSS | <repo>/static/style.css (~560 lines) | Served from disk |
+| CSS | <repo>/static/style.css (~590 lines) | Served from disk |
-| JavaScript | <repo>/static/{ui,workspace,sessions,messages,panels,boot,commands}.js | 7 modules, ~2990 lines total |
+| JavaScript | <repo>/static/{ui,workspace,sessions,messages,panels,boot,commands}.js | 7 modules, ~3148 lines total |
 | Runtime state | ~/.hermes/webui-mvp/sessions/ | Session JSON files |
 | Test server | Port 8788, state dir ~/.hermes/webui-mvp-test/ | Isolated, wiped per run |
 | Production server | Port 8787 | SSH tunnel from Mac |
@@ -149,22 +151,42 @@
 ### Configuration
 - [x] Settings panel (default model, default workspace) (Sprint 12)
 - [x] Send key preference (Enter or Ctrl+Enter) (Sprint 17)
 - [x] Password authentication (Sprint 19)
 - [ ] Enable/disable toolsets per session (deferred)
 ### Notifications
 - [x] Cron job completion alerts (Sprint 13)
 - [x] Background agent error alerts (Sprint 13)
 ### Workspace
 - [x] Breadcrumb navigation in subdirectories (Sprint 17)
 - [x] Workspace tree view with expand/collapse (Sprint 18, Issue #22)
 - [x] File preview auto-close on directory navigation (Sprint 18)
 ### Slash Commands
 - [x] Command registry + autocomplete dropdown (Sprint 17)
 - [x] Built-in: /help, /clear, /model, /workspace, /new (Sprint 17)
 ### Security
 - [x] Password auth with signed cookies (Sprint 19, Issue #23)
 - [x] Security headers (X-Content-Type-Options, X-Frame-Options) (Sprint 19)
 - [x] POST body size limit (20MB) (Sprint 19)
 ### Thinking / Reasoning
 - [x] Collapsible thinking cards for extended-thinking models (Sprint 18)
 ### Advanced / Future
- [ ] Voice input via Whisper (Wave 6)
+- [ ] Voice input via Whisper (Sprint 20)
- [ ] TTS playback of responses (Wave 6)
+- [ ] TTS playback of responses (Sprint 20)
- [ ] Subagent delegation cards (Wave 6)
+- [ ] Subagent delegation cards (deferred)
 - [x] Background task cancel (activity bar Cancel button)
- [ ] Code execution cell (Wave 6)
+- [ ] Code execution cell (deferred)
- [ ] Password authentication (Wave 7)
+- [ ] Mobile responsive layout (Sprint 21)
- [ ] HTTPS / reverse proxy (Wave 7)
+- [ ] Multi-profile support (Sprint 22, Issue #28)
- [ ] Mobile responsive layout (Wave 7)
+- [ ] Desktop application (Sprint 23)
- [ ] Virtual scroll for large lists (Wave 7)
+- [ ] Extended slash command / skill integration (Sprint 24)
 - [ ] Virtual scroll for large lists (deferred)
 ---
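Reviewer note on the Security checklist above: the 20MB POST body limit is attributed to `read_body()`, whose source is not part of this diff. A rough sketch of how a capped read might work is shown below. The name `read_body_capped`, the `BodyTooLarge` exception, and the choice of 20 * 1024 * 1024 bytes for "20MB" are assumptions made for illustration.

```python
import io

MAX_BODY_BYTES = 20 * 1024 * 1024  # assumed interpretation of the 20MB cap


class BodyTooLarge(Exception):
    """Raised when the declared Content-Length exceeds the cap (hypothetical)."""


def read_body_capped(headers, rfile, limit=MAX_BODY_BYTES):
    """Read a request body, refusing anything declared larger than `limit`.

    `headers` is any mapping with .get(); `rfile` is a binary file-like object.
    """
    length = int(headers.get("Content-Length", "0"))  # ValueError if malformed
    if length < 0:
        raise ValueError("negative Content-Length")
    if length > limit:
        # Reject before reading so an oversized body is never buffered.
        raise BodyTooLarge(f"{length} bytes exceeds limit of {limit}")
    return rfile.read(length)


if __name__ == "__main__":
    body = read_body_capped({"Content-Length": "5"}, io.BytesIO(b"hello world"))
    print(body)
```

Checking the declared length before reading is the point of the design: the server never allocates memory for a body it is going to reject.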
--- a/SPRINTS.md
+++ b/SPRINTS.md
@@ -1,6 +1,6 @@
 # Hermes Web UI -- Forward Sprint Plan
-> Current state: v0.20 | 318 tests | Daily driver ready
+> Current state: v0.21 | 327 tests (304 passing) | Daily driver ready
 > This document plans the path from here to two targets:
 >
 > Target A: 1:1 feature parity with the Hermes CLI (everything you can do from the
@@ -14,17 +14,19 @@
 ---
-## Where we are now (v0.18)
+## Where we are now (v0.21)
-**CLI parity: ~85% complete.** Core agent loop, all tools visible, workspace
+**CLI parity: ~90% complete.** Core agent loop, all tools visible, workspace
-file ops, cron/skills/memory CRUD, session management, streaming, cancel,
+file ops with tree view, cron/skills/memory CRUD, session management, streaming,
-multi-provider models, custom endpoint discovery -- all solid. Gaps are
+cancel, multi-provider models, custom endpoint discovery, slash commands,
-subagent visibility, toolset control, and code execution.
+thinking/reasoning display, password auth -- all solid. Gaps are subagent
 visibility, toolset control, and code execution.
-**Claude parity: ~65% complete.** Chat, streaming, file browser, session
+**Claude parity: ~70% complete.** Chat, streaming, file browser, session
 management, tool cards, syntax highlighting, model switching, projects,
-settings, Mermaid diagrams, mobile layout -- all present. Gaps are
+settings, Mermaid diagrams, mobile layout, breadcrumb workspace nav, slash
-artifacts, voice, reasoning display, sharing.
+commands, thinking display, auth -- all present. Gaps are artifacts, voice,
 TTS, sharing, mobile-optimized layout.
 ---
@@ -323,122 +325,136 @@ handler for slash command autocomplete.
 ---
-## Sprint 18 -- Voice + Multimodal Input
+## Sprint 18 -- Thinking Display + Workspace Tree + Preview Fix (COMPLETED)
-**Theme:** Input beyond the keyboard.
+**Theme:** Show the model's reasoning, improve workspace navigation, fix UX bug.
-**Why now:** Voice is a meaningful quality-of-life feature for longer sessions
+**Why now:** Thinking/reasoning display was deferred twice (Sprint 16 → 17 → 18).
-and is achievable with Whisper. Image input closes the last modality gap with
+Workspace tree view was the #1 community request (Issue #22). File preview
-Claude (Claude accepts image paste natively -- we do too, but only as
+staying open on directory navigation was a daily-driver annoyance.
 file uploads, not clipboard screenshots into the conversation directly).
 ### Track A: Bugs
- Image paste currently requires a click-to-attach flow. Direct paste into the
+- **File preview auto-close.** When viewing a file in the right panel and
-  message textarea should embed the image inline (as a preview chip) and queue
+  navigating directories (breadcrumbs, up button, folder clicks), the preview
-  it for upload on Send. (Partially works -- clean up edge cases.)
+  stayed visible with stale content. Fix: extracted `clearPreview()` as a named
- Large image uploads (>5MB) time out the upload step silently.
+  function in boot.js and call it from `loadDir()` in workspace.js.
 ### Track B: Features
- **Voice input (Whisper):** A microphone icon in the composer. Hold to record,
+- **Thinking/reasoning display.** Assistant messages with structured content
-  release to transcribe via `POST /api/transcribe` (calls local Whisper or
+  arrays containing `type:'thinking'` or `type:'reasoning'` blocks now render
-  OpenAI Whisper API). Transcribed text appears in the message input, editable
+  as collapsible gold-themed cards above the response text. Collapsed by
-  before send. Supports the full "voice -> text -> Hermes response" loop.
+  default, click header to expand. Works with Claude extended thinking and
- **TTS playback:** A speaker icon on assistant messages. Calls a TTS endpoint
+  o3 reasoning tokens when preserved in the message array.
-  (ElevenLabs or OpenAI TTS) and plays the audio. Toggle per-message. Optional
+- **Workspace tree view (Issue #22).** Directories expand/collapse in-place
-  auto-play mode in settings.
+  with toggle arrows. Single-click toggles, double-click navigates (breadcrumb
- **Vision input improvements:** Paste a screenshot directly from clipboard into
+  view). Subdirectory contents fetched lazily and cached in `S._dirCache`.
-  the conversation (not just the tray). Shows as an inline preview chip with
+  Nesting depth shown via indentation. Empty directories show "(empty)".
  the image thumbnail. On Send, uploads and includes in the message.
-### Track C: Architecture
+**Tests:** 0 new (pure CSS/DOM changes). Total: 318.
- Audio pipeline: `POST /api/transcribe` streams audio bytes, returns transcript.
+**Hermes CLI parity impact:** Low
-  `GET /api/tts?text=...` returns audio/mpeg. Both use lazy import of Whisper
+**Claude parity impact:** High (reasoning display matches Claude's UI)
  and TTS libraries to keep cold start fast.
 **Tests:** ~12 new. Total: ~271.
 **Hermes CLI parity impact:** Medium (voice not in CLI, but adds capability)
 **Claude parity impact:** High (Claude has native voice mode)
 ---
-## Sprint 18 -- Subagent Visibility + Agentic Transparency
+## Sprint 19 -- Auth + Security Hardening (COMPLETED)
-**Theme:** Watch Hermes think, not just respond.
+**Theme:** Make this safe to leave running beyond localhost.
-**Why now:** When Hermes delegates to subagents (delegate_task, spawns parallel
+**Why now:** Issue #23 requested authentication. Auth is the last production
-workstreams), the UI shows nothing. On long multi-agent tasks you have no idea
+hardening feature before the app is safe to expose to a network.
 what's happening. This is the last major "CLI feels better" gap for power users.
 ### Track A: Bugs
- Tool cards for delegate_task show no information about what the subagent was
+- **No request size limit.** POST bodies were unbounded (DoS risk). Added 20MB
-  asked to do or what it returned.
+  cap in `read_body()`.
 - The activity bar text truncates at 55 chars -- tool previews for long terminal
  commands show nothing useful.
 ### Track B: Features
- **Subagent delegation cards:** When `delegate_task` fires, show an expandable
+- **Password authentication (Issue #23).** Off by default — zero friction for
-  card with the subagent's goal, status (pending/running/done), and result
+  localhost. Enable via `HERMES_WEBUI_PASSWORD` env var or Settings panel.
-  summary. Multiple subagents from one call appear as a card group. Uses the
+  Password-only (no username — single-user app). Signed HMAC HTTP-only cookie
-  existing tool card infrastructure.
+  with 24h TTL. Minimal dark-themed login page at `/login`. API calls without
- **Background task monitor:** A "Tasks" indicator in the topbar (separate from
+  auth return 401; page loads redirect to `/login`. Settings panel gains
-  the cron Tasks panel). Shows count of active agent threads. Click opens a
+  "Access Password" field and "Sign Out" button.
-  popover listing all in-flight streams with session names and elapsed times.
+- **Security headers.** All responses now include `X-Content-Type-Options: nosniff`,
-  Cancel any individual thread. This is the full job queue visibility the CLI
+  `X-Frame-Options: DENY`, `Referrer-Policy: same-origin`.
  implicitly has via `ps aux`.
 - **Thinking/reasoning display:** When the model emits reasoning tokens (o3,
  Claude extended thinking), show them in a collapsible "Reasoning" card above
  the response. Collapsed by default. This matches Claude's reasoning display.
 ### Track C: Architecture
- Task registry: extend STREAMS to include session name, start time, and task
+- New `api/auth.py` module: password hashing (SHA-256 + STATE_DIR salt), signed
-  description. New `GET /api/tasks/active` endpoint returns all running streams
+  session cookies, auth middleware, public path allowlist.
-  with metadata.
+- Auth check in `server.py` do_GET/do_POST before routing.
 - `password_hash` added to `_SETTINGS_DEFAULTS` in config.py.
 - `_set_password` special field in save_settings for secure password updates.
-**Tests:** ~14 new. Total: ~285.
+**Tests:** 9 new. Total: 327.
-**Hermes CLI parity impact:** Very High (subagent and task visibility is the
+**Hermes CLI parity impact:** Low (CLI has no auth concerns)
-  last major CLI gap)
+**Claude parity impact:** High (Claude is authenticated)
 **Claude parity impact:** High (Claude shows reasoning, tool use visibly)
 ---
-## Sprint 19 -- Auth, HTTPS, and Production Hardening
+## Sprint 20 -- Voice + TTS (PLANNED)
-**Theme:** Make this safe to leave running.
+**Theme:** Input and output beyond the keyboard.
-**Why now:** Everything else is done. This is the sprint you run when you want
+**Why now:** Voice works in the Hermes CLI. Mirror that capability in the web UI.
-to expose the UI beyond localhost -- to a team, a mobile device, or a public
+TTS playback makes long responses more accessible. Both are achievable with
-address.
+existing Whisper and TTS APIs.
 ### Track A: Bugs
 - Server has no request size limit on non-upload endpoints (potential DoS).
 - Session JSON files have no size cap (a runaway agent could write GBs).
 ### Track B: Features
- **Password authentication:** A login page with a configurable password
+- **Voice input (Whisper).** Microphone icon in composer. Hold to record,
-  (HERMES_WEBUI_PASSWORD env var). Signed cookie session (24h expiry).
+  release to transcribe. Transcribed text editable before send.
-  Single-user model -- no accounts, no registration.
+- **TTS playback.** Speaker icon on assistant messages. Audio playback via
- **HTTPS / reverse proxy guide:** A one-page `DEPLOY.md` with instructions
+  OpenAI TTS or ElevenLabs API. Optional auto-play in settings.
  for running behind nginx + Let's Encrypt on a VPS. Configuration snippets
  for systemd service, nginx config, certbot.
 - **Mobile responsive layout:** Collapsible sidebar (hamburger). Touch-friendly
  session list (swipe to delete, tap to navigate). Composer expands on focus.
  Right panel hidden by default on mobile, accessible via a Files tab.
 - **Rate limiting:** Simple per-IP token bucket on the chat/start endpoint
  (configurable, default 10 req/min) to prevent accidental hammering.
-### Track C: Architecture
+---
 - Helmet headers: X-Content-Type-Options, X-Frame-Options, HSTS (when served
  over HTTPS). Simple middleware in the Handler.
-**Tests:** ~12 new. Total: ~297.
+## Sprint 21 -- Mobile Responsive (PLANNED)
-**Hermes CLI parity impact:** Low (CLI has no auth/HTTPS concerns)
+
-**Claude parity impact:** Very High (Claude is authenticated, HTTPS only)
+**Theme:** A genuinely good mobile experience, not just responsive CSS.
 ### Track B: Features
 - **Collapsible sidebar.** Hamburger menu replaces the always-visible sidebar.
 - **Touch-friendly session list.** Tap to navigate, swipe gestures.
 - **Right panel as tab.** Files panel hidden by default, accessible via tab.
 - **Composer focus behavior.** Expands on focus, keyboard-aware.
 - Consider a separate mobile-optimized layout rather than just media queries.
 ---
 ## Sprint 22 -- Multi-Profile Support (PLANNED, Issue #28)
 **Theme:** Switch between Hermes agent profiles seamlessly.
 ### Track B: Features
 - **Profile picker.** Sidebar or topbar dropdown to switch profiles.
 - **Per-profile config.** Each profile has its own skills, memory, config.yaml.
 - **Seamless switching.** No restart required.
 ---
 ## Sprint 23 -- Desktop Application (PLANNED)
 **Theme:** Native desktop experience.
 ### Track B: Features
 - **Electron or Tauri wrapper.** Native window, menu bar, notifications.
 - **Auto-start option.** Launch on login.
 - **Packaged distribution.** .dmg (macOS), .exe (Windows).
 ---
 ## Sprint 24 -- Extended Command Support (PLANNED)
 **Theme:** Deeper slash command and skill integration.
 ### Track B: Features
 - **Skill-aware autocomplete.** `/skill-name` triggers installed skills.
 - **Command chaining.** Compose multi-step commands.
 - **Agent tool exposure.** Surface agent capabilities as slash commands.
 ---
 ## Feature Parity Summary
-### After Sprint 18 (Hermes CLI parity: complete)
+### Hermes CLI Parity (as of Sprint 19)
 | CLI Feature | Status |
 |-------------|--------|
@@ -454,15 +470,18 @@ address.
 | Workspace switching | Done (v0.7) |
 | Model selection | Done (v0.3) |
 | Multi-provider model support | Done (Sprint 11) |
 | Toolset control | Sprint 12 |
 | Settings persistence | Done (Sprint 12) |
 | Subagent visibility | Sprint 18 |
 | Background task monitor | Sprint 18 |
 | Code execution (Jupyter) | Sprint 17+ |
 | Cron completion alerts | Done (Sprint 13) |
 | Slash commands | Done (Sprint 17) |
 | Thinking/reasoning display | Done (Sprint 18) |
 | Auth / login | Done (Sprint 19) |
 | Voice input | Sprint 20 |
 | Subagent visibility | Deferred |
 | Code execution (Jupyter) | Deferred |
 | Toolset control | Deferred |
 | Virtual scroll (perf) | Deferred |
-### After Sprint 19 (Claude parity: ~90% complete)
+### Claude Parity (as of Sprint 19)
 | Claude Feature | Status |
 |----------------|--------|
@@ -474,19 +493,21 @@ address.
 | Tool use visibility | Done (v0.11) |
 | Edit/regenerate messages | Done (v0.10) |
 | Session management | Done (v0.6) |
 | Artifacts (HTML/SVG preview) | Sprint 17+ |
 | Code execution inline | Sprint 17+ |
 | Mermaid diagrams | Done (Sprint 14) |
 | Projects / folders | Done (Sprint 15) |
 | Pinned/starred sessions | Done (Sprint 12) |
 | Reasoning display | Sprint 18 |
 | Voice input | Sprint 17 |
 | TTS playback | Sprint 17 |
 | Notifications | Done (Sprint 13) |
 | Settings panel | Done (Sprint 12) |
-| Auth / login | Sprint 19 |
+| Reasoning display | Done (Sprint 18) |
-| HTTPS | Sprint 19 |
+| Auth / login | Done (Sprint 19) |
-| Mobile layout | Done (v0.16.1) |
+| Mobile layout (basic) | Done (v0.16.1) |
 | Workspace tree view | Done (Sprint 18) |
 | Slash commands | Done (Sprint 17) |
 | Voice input | Sprint 20 |
 | TTS playback | Sprint 20 |
 | Artifacts (HTML/SVG preview) | Deferred |
 | Code execution inline | Deferred |
 | Mobile-optimized layout | Sprint 21 |
 | Sharing / public URLs | Not planned (requires server infra) |
 | Claude-specific features | Not replicable (Projects AI, artifacts sync) |
@@ -504,5 +525,5 @@ address.
 ---
 *Last updated: April 3, 2026*
-*Current version: v0.19 | 318 tests*
+*Current version: v0.21 | 327 tests (304 passing)*
-*Next sprint: Sprint 18 (Voice + Multimodal Input)*
+*Next sprint: Sprint 20 (Voice + TTS)*
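Reviewer note on the Sprint 19 section above: the plan states that API calls without auth return 401 while page loads redirect to `/login`, subject to a public path allowlist. The decision logic could be sketched roughly as follows. The allowlist contents, the `/api/` prefix test, and the function name are assumptions, since the middleware source is not in this diff.

```python
# Hypothetical allowlist; the real set of public paths is not shown in the diff.
PUBLIC_PATHS = frozenset({"/login", "/health"})


def auth_decision(path, auth_enabled, has_valid_cookie):
    """Return (status, redirect_target) for a request before routing.

    (200, None)       -> let the request through to the router
    (401, None)       -> unauthenticated API call
    (302, "/login")   -> unauthenticated page load
    """
    if not auth_enabled or has_valid_cookie or path in PUBLIC_PATHS:
        return (200, None)
    if path.startswith("/api/"):  # assumed convention for API routes
        return (401, None)
    return (302, "/login")
```

Keeping the decision in a pure function like this would let both `do_GET` and `do_POST` share one code path and makes the off-by-default behavior trivial to test.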
--- a/TESTING.md
+++ b/TESTING.md
@@ -1,12 +1,15 @@
 # Hermes Web UI: Browser Testing Plan
 > This document is for manual browser testing by you or by a Claude browser agent.
-> It covers every user-facing feature of the UI through Sprint 2.
+> It covers user-facing features of the UI through Sprint 19 (v0.21).
 > Each section is written as a step-by-step test procedure with expected outcomes.
 > A browser agent (e.g. Claude with Chrome access) can execute this plan directly.
 >
 > Prerequisites: SSH tunnel is active on port 8787. Open http://localhost:8787 in browser.
 > Server health check: curl http://127.0.0.1:8787/health should return {"status":"ok"}.
 >
 > Automated tests: 327 total (304 passing, 23 pre-existing failures).
 > Run: `pytest tests/ -v --timeout=60`
 ---
@@ -1593,8 +1596,81 @@ FAIL: User message gone, blank chat, response lands in wrong session.
 ---
-*Last updated: Post-Sprint 10 concurrency sweeps, March 31, 2026*
+---
-*Total automated tests: 190/190*
+
-*Regression gate: tests/test_regressions.py (23 tests, one per introduced bug)*
+## Sections Added Post-Sprint 10 (Sprints 11-19)
-*Run: python -m pytest tests/ -v*
+
 The following features were added in Sprints 11-19 and need manual browser testing.
 Each has automated API-level tests in `tests/test_sprint{N}.py`.
 ### Sprint 11: Multi-Provider Models
 - Open model dropdown. Verify models grouped by provider (OpenAI, Anthropic, Google, etc.)
 - If custom `base_url` configured in config.yaml, verify local models appear in dropdown.
 - Switch model. Send a message. Verify response uses selected model.
 ### Sprint 12: Settings + Pin + Import
 - Click gear icon. Settings overlay opens.
 - Change default model, save. Restart server. Verify setting persisted.
 - Pin a session (star icon in hover overlay). Verify it floats to top of list.
 - Export session as JSON. Import it back. Verify messages restored.
 ### Sprint 13: Alerts + Session QoL
 - Duplicate a session (copy icon in hover overlay). Verify "(copy)" title.
 - Browser tab title updates to active session name. Switch sessions — title changes.
 ### Sprint 14: Visual Polish + Workspace Ops
 - Create a mermaid code block in a response. Verify diagram renders inline.
 - Message timestamps visible next to role labels (hover for full date).
 - Double-click a file in workspace panel to rename. Enter saves, Escape cancels.
 - Create a folder via folder icon in workspace header.
 - Add `#tag` to session title. Verify tag chip appears in sidebar. Click to filter.
 - Archive a session. Verify it disappears. Toggle "Show archived" to see it.
 ### Sprint 15: Session Projects
 - Click "+" in project bar to create a project. Type name, Enter.
 - Click a project chip to filter sessions.
 - Hover a session → click folder icon → assign to project via picker.
 - Verify colored left border appears on assigned session.
 - Double-click project chip to rename. Right-click to delete.
 - Code blocks have a "Copy" button. Click → "Copied!" feedback.
 - Messages with 2+ tool cards show "Expand all / Collapse all" toggle.
 ### Sprint 16: Sidebar Visual Polish
 - Session titles use full sidebar width (no truncated space for hidden icons).
 - Hover a session → action buttons appear from right with gradient fade.
 - All icons are monochrome SVGs (not emoji). Consistent across platforms.
 - Pinned sessions show small gold star inline. Unpinned = no star, full title width.
 - Active session has gold highlight (not blue). Overlay gradient matches.
 - Double-click to rename → overlay hides during rename.
 ### Sprint 17: Workspace + Slash Commands + Send Key
 - Navigate into a subdirectory. Breadcrumb bar appears with clickable segments.
 - Up button in panel header navigates to parent. Hidden at root.
 - Type `/` in composer → autocomplete dropdown appears. Arrow keys navigate.
 - Type `/help` → lists all commands. `/clear` clears conversation. `/model` switches.
 - Settings panel: change send key to Ctrl+Enter. Verify Enter inserts newline.
 ### Sprint 18: Thinking + Tree View + Preview Fix
 - View a file in workspace. Click a breadcrumb or folder → preview closes automatically.
 - Click a directory toggle arrow (▸) → expands in-place showing children.
 - Click again (▾) → collapses. Double-click navigates into it (breadcrumb view).
 - If model returns thinking blocks (Claude extended thinking), verify collapsible gold card appears above response.
 ### Sprint 19: Auth + Security
 - No password set: everything works as normal. No login page.
 - Set `HERMES_WEBUI_PASSWORD=test` env var. Restart. All pages redirect to `/login`.
 - Login page: minimal card, password field, "Sign in" button.
 - Enter correct password → redirected to `/`. Cookie set (24h).
 - Enter wrong password → error message, stay on login page.
 - Settings panel: set password via "Access Password" field. Auth activates.
 - "Sign Out" button visible when auth active. Click → redirected to /login.
 - API calls without auth cookie → 401 JSON response.
 - Check response headers: `X-Content-Type-Options: nosniff`, `X-Frame-Options: DENY`.
 ---
 *Last updated: Sprint 19 / v0.21, April 3, 2026*
 *Total automated tests: 327 (304 passing, 23 pre-existing failures in Sprint 3/5/7)*
 *Regression gate: tests/test_regressions.py (23 tests)*
 *Run: pytest tests/ -v --timeout=60*
 *Source: <repo>/*
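Reviewer note on the Sprint 19 manual checks above: the header check could also be scripted. The sketch below compares a mapping of response headers against the three values named in the changelog. The helper name is hypothetical, and the comparison is case-insensitive on both names and values, which is an assumption about how strict the check should be.

```python
# Header values taken from the v0.21 changelog entry.
EXPECTED_SECURITY_HEADERS = {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Referrer-Policy": "same-origin",
}


def missing_security_headers(headers):
    """Return the expected header names that are absent or have a wrong value.

    `headers` is any object with .items() yielding (name, value) pairs.
    """
    normalized = {k.lower(): v.strip().lower() for k, v in headers.items()}
    return [
        name
        for name, value in EXPECTED_SECURITY_HEADERS.items()
        if normalized.get(name.lower()) != value.lower()
    ]
```

Against a running server, one option would be to pass `urllib.request.urlopen("http://127.0.0.1:8787/health").headers` to this function and expect an empty list, assuming `/health` is reachable without a login.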