docs: update markdown files for v0.18.1 (safe HTML rendering, 289 tests)

- CHANGELOG: add v0.18.1 entry (safe HTML rendering, inlineMd, safety net, active session gold style, 74 new tests)
- ARCHITECTURE: update ui.js line count (809->846), document renderMd pre-pass/safety net/inlineMd/SAFE_TAGS, update test file count (14), update Phase I test count (289)
- ROADMAP: bump version and test count
- SPRINTS: bump version, test count, Sprint 16 test total

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 17:38:44 -07:00
parent a92fff75a3
commit f3ae8305dc
4 changed files with 72 additions and 29 deletions
--- a/ARCHITECTURE.md
+++ b/ARCHITECTURE.md
@@ -41,7 +41,7 @@ This makes the code easy to modify from a terminal or by an agent.
    static/
      index.html           HTML template (served from disk)
      style.css            All CSS
-      ui.js                DOM helpers, renderMd, tool cards, model dropdown (~809 lines)
+      ui.js                DOM helpers, renderMd, tool cards, model dropdown (~846 lines)
      workspace.js         File tree, preview, file ops (~169 lines)
      sessions.js          Session CRUD, list rendering, search, SVG icons, overlay actions (~532 lines)
      messages.js          send(), SSE event handlers, approval, transcript (~293 lines)
@@ -49,7 +49,7 @@ This makes the code easy to modify from a terminal or by an agent.
      boot.js              Event wiring + boot IIFE (~175 lines)
    tests/
      conftest.py          Isolated test server (port 8788, separate HERMES_HOME) (~240 lines)
-      test_sprint1-11.py   Feature tests per sprint (13 files, Sprints 1-11)
+      test_sprint1-16.py   Feature tests per sprint (14 files, Sprints 1-11 + 16)
      test_regressions.py  Permanent regression gate
    AGENTS.md              Instruction file for agents working in this directory.
    ROADMAP.md             Feature and product roadmap document.
@@ -330,11 +330,11 @@ read_file_content(workspace, rel):
 ### 5.1 Structure

 The frontend is served from static/ as separate files: one HTML template, one CSS file,
-and six JavaScript modules (~2,750 lines total). External dependencies: Prism.js (syntax
+and six JavaScript modules (~2,786 lines total). External dependencies: Prism.js (syntax
 highlighting) and Mermaid.js (diagrams) from CDN, both loaded async/deferred with SRI hashes.

 Six JS modules loaded in order at end of <body>:
-  1. ui.js       (~809 lines) DOM helpers, renderMd, tool card rendering, global state
+  1. ui.js       (~846 lines) DOM helpers, renderMd, tool card rendering, global state
  2. workspace.js (~169 lines) File tree, preview, file operations
  3. sessions.js  (~532 lines) Session CRUD, list rendering, search, SVG icons, overlay actions, project picker
  4. messages.js  (~293 lines) send(), SSE event handlers, approval, transcript
@@ -414,25 +414,43 @@ Boot IIFE:

 ### 5.4 Markdown Renderer (renderMd)

-A hand-rolled regex chain. Processes in this order:
-1. Code blocks (``` lang ... ```) -> <pre><code> with language header
-2. Inline code (`...`) -> <code>
-3. Bold+italic (***..***) -> <strong><em>
-4. Bold (**...**) -> <strong>
-5. Italic (*...*) -> <em>
-6. Headings (# ## ###) -> <h1> <h2> <h3>
-7. Horizontal rules (---+) -> <hr>
-8. Blockquotes (> ...) -> <blockquote>
-9. Unordered lists (- or * or + at line start) -> <ul><li>
-10. Ordered lists (N. at line start) -> <ol><li>
-11. Links ([text](https://...)) -> <a href target=_blank>
-12. Paragraph wrapping: remaining double-newline-separated blocks -> <p>
+A hand-rolled regex chain with HTML safety. Processes in this order:
+
+Pre-pass (v0.18.1):
+0a. Stash fenced code blocks and backtick spans (fence_stash array)
+0b. Convert safe HTML tags to markdown equivalents:
+    <strong>/<b> -> **text**, <em>/<i> -> *text*, <code> -> `text`, <br> -> newline
+0c. Restore stashed code blocks
+
+Pipeline:
+1. Mermaid blocks (```mermaid ... ```) -> <div class="mermaid-block">
+2. Code blocks (``` lang ... ```) -> <pre><code> with language header
+3. Inline code (`...`) -> <code>
+4. Bold+italic (***..***) -> <strong><em>
+5. Bold (**...**) -> <strong>
+6. Italic (*...*) -> <em>
+7. Headings (# ## ###) -> <h1> <h2> <h3> (uses inlineMd() for content)
+8. Horizontal rules (---+) -> <hr>
+9. Blockquotes (> ...) -> <blockquote> (uses inlineMd() for content)
+10. Unordered lists (- or * or + at line start) -> <ul><li> (uses inlineMd())
+11. Ordered lists (N. at line start) -> <ol><li> (uses inlineMd())
+12. Links ([text](https://...)) -> <a href target=_blank>
+13. Tables (| col | col |) -> <table>
+14. Safety net: escape any HTML tag not in SAFE_TAGS allowlist via esc()
+15. Paragraph wrapping: remaining double-newline-separated blocks -> <p>
+
+inlineMd() helper (v0.18.1):
+    Processes inline bold/italic/code/links within list items, blockquotes,
+    and headings. Escapes unknown tags via SAFE_INLINE allowlist. Replaces
+    the old direct esc() calls which would double-escape pre-pass output.
+
+SAFE_TAGS allowlist:
+    strong, em, code, pre, h1-6, ul, ol, li, table, thead, tbody, tr, th,
+    td, hr, blockquote, p, br, a, div. Everything else is escaped.

 Known gaps:
- Tables: not supported, render as plain text
 - Nested lists: single regex pass, multi-level indentation not handled
 - Mixed bold+link in same line: may produce garbled output
- Inline HTML: not sanitized (esc() only runs on code content)

 ### 5.5 Model Chip Label (Fixed in Sprint 1)

@@ -628,7 +646,7 @@ Current structure:
        ui.js, workspace.js, sessions.js, messages.js, panels.js, boot.js
      tests/
        conftest.py           Isolated test server on port 8788
-        test_sprint1-11.py    Feature tests per sprint (13 files)
+        test_sprint1-16.py    Feature tests per sprint (14 files)
        test_regressions.py   Permanent regression gate

 Route extraction to api/routes.py completed in Sprint 11. server.py is now a ~76-line
@@ -727,10 +745,10 @@ Optional password gate for non-SSH-tunnel deployments.

 ### Phase I: Test Infrastructure -- COMPLETE

-237 tests across 13 test files + regression gate. Isolated test server on port 8788
+289 tests across 14 test files + regression gate. Isolated test server on port 8788
 with separate HERMES_HOME, wiped per run. Production data never touched.

-Test files: `test_sprint1.py` through `test_sprint11.py`, `test_regressions.py`.
+Test files: `test_sprint1.py` through `test_sprint11.py`, `test_sprint16.py`, `test_regressions.py`.
 Fixtures in `conftest.py`: auto-cleanup, cron isolation, workspace reset.

 Remaining: no CI (GitHub Actions), no frontend tests (browser-based).
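
Review note: the pre-pass documented in section 5.4 (stash code, convert safe tags, restore) can be sketched roughly as follows. This is a minimal illustration, not the code in ui.js: `prePass`, the NUL-delimited placeholder format, and the exact regexes are assumptions; only the `fence_stash` name and the tag-to-markdown mapping come from the documentation in this commit.

```javascript
// Hypothetical sketch of the renderMd() pre-pass (steps 0a-0c).
// Only fence_stash and the tag mapping are taken from ARCHITECTURE.md;
// everything else is assumed for illustration.
function prePass(src) {
  const fence_stash = [];

  // 0a. Stash fenced code blocks and backtick spans behind placeholders
  //     so the tag conversion below never touches their content.
  let s = src.replace(/```[\s\S]*?```|`[^`\n]+`/g, (match) => {
    fence_stash.push(match);
    return "\u0000" + (fence_stash.length - 1) + "\u0000";
  });

  // 0b. Convert safe HTML tags to their markdown equivalents.
  s = s
    .replace(/<(strong|b)>([\s\S]*?)<\/\1>/gi, "**$2**")
    .replace(/<(em|i)>([\s\S]*?)<\/\1>/gi, "*$2*")
    .replace(/<code>([\s\S]*?)<\/code>/gi, "`$1`")
    .replace(/<br\s*\/?>/gi, "\n");

  // 0c. Restore the stashed code verbatim.
  return s.replace(/\u0000(\d+)\u0000/g, (_, i) => fence_stash[Number(i)]);
}

console.log(prePass("<b>hi</b> and `<b>raw</b>`"));
```

The placeholder uses NUL characters on the assumption that they never occur in model output; a real implementation would need to guard against collisions.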
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,31 @@

 ---

+## [v0.18.1] Safe HTML Rendering + Sprint 16 Tests
+*April 2, 2026 | 289 tests*
+
+### Features
+- **Safe HTML rendering in AI responses.** AI models sometimes emit HTML tags
+  (`<strong>`, `<em>`, `<code>`, `<br>`) in their responses. Previously these
+  showed as literal escaped text. A new pre-pass in `renderMd()` converts safe
+  HTML tags to markdown equivalents before the pipeline runs. Code blocks and
+  backtick spans are stashed first so their content is never touched.
+- **`inlineMd()` helper.** New function for processing inline formatting inside
+  list items, blockquotes, and headings. The old code called `esc()` directly,
+  which escaped tags that had already been converted by the pre-pass.
+- **Safety net.** After the full pipeline, any HTML tags not in the output
+  allowlist (`SAFE_TAGS`) are escaped via `esc()`. All 7 XSS attack vectors
+  covered by the tests are blocked.
+- **Active session gold style.** Active session uses gold/amber (`#e8a030`)
+  instead of blue, matching the logo gradient. Project border-left skipped
+  when active (gold always wins).
+
+### Tests
+- **74 new tests** in `test_sprint16.py`: static analysis (6), behavioral (10),
+  exact regression (1), XSS security (7), edge cases (51). Total: 289 passed.
+
+---
+
 ## [v0.18] Sprint 16 -- Session Sidebar Visual Polish
 *April 2, 2026 | 237 tests*

@@ -555,4 +580,4 @@ Three-panel layout: sessions sidebar, chat area, workspace panel.

 ---

-*Last updated: v0.18, April 2, 2026 | Tests: 237*
+*Last updated: v0.18.1, April 2, 2026 | Tests: 289*
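
Review note: the safety net described in the v0.18.1 entry (escape every tag whose name is not in `SAFE_TAGS`) might look something like the sketch below. The tag list is copied from ARCHITECTURE.md with `h1-6` expanded; the `safetyNet` name, the regex, and this `esc()` body are assumptions and may differ from ui.js.

```javascript
// Hypothetical sketch of the post-pipeline safety net.
// SAFE_TAGS contents follow ARCHITECTURE.md; safetyNet() and this esc()
// are illustrative stand-ins for whatever ui.js actually does.
const SAFE_TAGS = new Set([
  "strong", "em", "code", "pre", "h1", "h2", "h3", "h4", "h5", "h6",
  "ul", "ol", "li", "table", "thead", "tbody", "tr", "th", "td",
  "hr", "blockquote", "p", "br", "a", "div",
]);

// Escape the characters that let text be parsed as markup.
function esc(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Leave allowlisted opening/closing tags alone; escape every other tag.
function safetyNet(html) {
  return html.replace(/<\/?([a-zA-Z][a-zA-Z0-9-]*)\b[^>]*>/g, (tag, name) =>
    SAFE_TAGS.has(name.toLowerCase()) ? tag : esc(tag)
  );
}

console.log(safetyNet("<strong>ok</strong><script>alert(1)</script>"));
```

Note that this sketch filters on tag name only. It does not inspect attributes, so an allowlisted tag carrying an event-handler attribute would pass through unchanged; a real safety net has to cover that case separately.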
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -3,8 +3,8 @@
 > Goal: Full 1:1 parity with the Hermes CLI experience via a clean dark web UI.
 > Everything you can do from the CLI terminal, you can do from this UI.
 >
-> Last updated: Sprint 16 / v0.18 (April 2, 2026)
-> Tests: 237 passing
+> Last updated: Sprint 16 / v0.18.1 (April 2, 2026)
+> Tests: 289 passing
 > Source: <repo>/

 ---
@@ -43,7 +43,7 @@
 | Python server | <repo>/server.py (~76 lines) + api/ modules (~2145 lines) | Thin shell + business logic in api/ |
 | HTML template | <repo>/static/index.html | Served from disk |
 | CSS | <repo>/static/style.css (~560 lines) | Served from disk |
-| JavaScript | <repo>/static/{ui,workspace,sessions,messages,panels,boot}.js | 6 modules, ~2750 lines total |
+| JavaScript | <repo>/static/{ui,workspace,sessions,messages,panels,boot}.js | 6 modules, ~2786 lines total |
 | Runtime state | ~/.hermes/webui-mvp/sessions/ | Session JSON files |
 | Test server | Port 8788, state dir ~/.hermes/webui-mvp-test/ | Isolated, wiped per run |
 | Production server | Port 8787 | SSH tunnel from Mac |
--- a/SPRINTS.md
+++ b/SPRINTS.md
@@ -1,6 +1,6 @@
 # Hermes Web UI -- Forward Sprint Plan

-> Current state: v0.18 | 237 tests | Daily driver ready
+> Current state: v0.18.1 | 289 tests | Daily driver ready
 > This document plans the path from here to two targets:
 >
 > Target A: 1:1 feature parity with the Hermes CLI (everything you can do from the
@@ -265,7 +265,7 @@ inconsistently across platforms. These were the most common visual complaints.
 - Thinking/reasoning display for extended-thinking models
 - Slash command autocomplete popup

-**Tests:** 0 new (pure CSS/DOM changes). Total: 237.
+**Tests:** 74 new (test_sprint16.py: safe HTML rendering, XSS security, sidebar polish). Total: 289.
 **Hermes CLI parity impact:** Low
 **Claude parity impact:** Medium (sidebar polish matches Claude's quality bar)

@@ -452,5 +452,5 @@ address.
 ---

 *Last updated: April 2, 2026*
-*Current version: v0.18 | 237 tests*
+*Current version: v0.18.1 | 289 tests*
 *Next sprint: Sprint 17 (Voice + Multimodal Input)*