Hermes WebUI — Two-Stage Chat Proposal (Issue #536)

0 · The model

One turn, two stages

Today an assistant turn is a flat stream: thinking card → tool cards → answer, all stacked inline with equal visual weight. The proposal wraps the execution history in a .p2s-stage1 container with a worklog bar as its header, and marks the final answer as .p2s-answer. The same DOM renders three ways:

Live: worklog shows Working… 0:42 · 2 tools with a pulsing dot; history is fully visible.
Settled: worklog collapses to a single line (Worked for 1:42 · 4 tools · 2 thinking); the final answer sits below as the calm conclusion.
Settled + opened: user clicks the worklog to re-expand the history for audit.
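The three renderings can be read as two flags on one element. The following is a minimal sketch, assuming a hypothetical state object `{ live, open }`; only the class names (`p2s-stage1`, `is-live`, `is-settled`, `is-open`) come from this proposal, and the helper names are illustrative.

```javascript
// Hypothetical helper (not from the codebase): derive the class list for
// .p2s-stage1 from two flags. `live` = the run is still streaming,
// `open` = the user re-expanded a settled worklog.
function stage1ClassName({ live, open }) {
  const classes = ["p2s-stage1", live ? "is-live" : "is-settled"];
  // is-open only matters once settled; a live run always shows its history.
  if (!live && open) classes.push("is-open");
  return classes.join(" ");
}

// The history body is visible while live, or when a settled run was opened.
function isHistoryVisible({ live, open }) {
  return Boolean(live || open);
}
```

Because both functions are pure, the same markup can be re-rendered on reload without tracking any extra state beyond these two flags.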

1 · Current vs proposed — settled turn

Side-by-side comparison

Same turn, same tool calls, same answer. Left is what #587 ships today. Right is the proposal: execution history collapses to a one-line summary; the final answer stands alone with a small Answer kicker.

Current (PR #587)

Does our dev server pick up the workspace from an env var or a flag?

Hermes

💡 Thought for 3.1s ▶

Check how the CLI resolves workspace:
grep for HERMES_WORKSPACE and --workspace
inspect argv vs env precedence.

⚡ bash grep -rn "HERMES_WORKSPACE" . · exit 0 ▶

cli/main.py:14:WORKSPACE_ENV = "HERMES_WORKSPACE"
cli/main.py:92:    ws = os.getenv(WORKSPACE_ENV) or args.workspace

📄 read_file cli/main.py · 148 lines ▶

Hermes

Both work, but env wins. The CLI reads HERMES_WORKSPACE first and only falls back to the --workspace flag if the env var is unset.

So in practice:

CI / daemons → set the env var.
Ad-hoc runs → pass --workspace.

Everything stacks equally — the answer is just the next block.

Proposed — two-stage, settled

Does our dev server pick up the workspace from an env var or a flag?

Hermes

Worked for 0:08 · 2 tools · 1 thinking round ▶

💡 Thought for 3.1s ▶

Check how the CLI resolves workspace:
grep for HERMES_WORKSPACE and --workspace
inspect argv vs env precedence.

⚡ bash grep -rn "HERMES_WORKSPACE" . · exit 0 ▶

📄 read_file cli/main.py · 148 lines ▶

Answer

Both work, but env wins. The CLI reads HERMES_WORKSPACE first and only falls back to the --workspace flag if the env var is unset.

So in practice:

CI / daemons → set the env var.
Ad-hoc runs → pass --workspace.

Click the worklog bar to expand the execution history.

2 · Stage 1 · Live run

Working timer + live execution history

The worklog bar at the top is the anchor for the whole active run: pulsing dot, elapsed timer that ticks every second, and live counts that increment as tool cards resolve. Thinking cards and tool cards render inside .p2s-stage1-body exactly as today. A Round N separator is inserted when the agent starts a new reasoning/tool cycle.
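A minimal sketch of the worklog text and its one-second tick, written under stated assumptions: the input shape, the function names, and the injectable `now` clock are hypothetical, while the wording and the `·` separator follow the mockups in this proposal.

```javascript
// Format whole seconds as M:SS, e.g. 42 -> "0:42", 102 -> "1:42".
function formatElapsed(totalSeconds) {
  const s = Math.max(0, Math.floor(totalSeconds));
  return `${Math.floor(s / 60)}:${String(s % 60).padStart(2, "0")}`;
}

// Build the worklog text. The input shape is an assumption of this sketch.
function worklogLabel({ live, elapsedSeconds, tools = 0, thinking = 0 }) {
  const parts = [`${live ? "Working…" : "Worked for"} ${formatElapsed(elapsedSeconds)}`];
  if (tools > 0) parts.push(`${tools} ${tools === 1 ? "tool" : "tools"}`);
  if (thinking > 0) parts.push(`${thinking} thinking`);
  return parts.join(" · ");
}

// Hypothetical ticker: calls onTick(elapsedSeconds) once per second.
// Returns a function that stops the timer (e.g. when Stage 2 begins).
function startWorklogTimer(onTick, now = Date.now) {
  const startedAt = now();
  const id = setInterval(() => onTick((now() - startedAt) / 1000), 1000);
  return () => clearInterval(id);
}
```

Computing elapsed time from a start timestamp, rather than incrementing a counter on each tick, keeps the label accurate even if the browser throttles timers in a background tab.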

.p2s-stage1.is-live — Round 1 done, Round 2 running

Hermes

Working… 0:42 · 3 tools · 2 thinking

💡 Thought for 2.4s ▶

Need to map the streaming code path first,
then check the persistence layer.

📄 read_file api/streaming.py · 612 lines ▶

⚡ bash grep -rn "tool_call_id" api/ · exit 0 · 88ms ▶

Round 2

💡 Thought for 1.8s ▶

Streaming looks fine — drill into how
tool_calls get attached before save.

⚡ bash pytest tests/test_tool_call_persistence.py -q ▶

3 · Approve vs Clarify — placement

Approvals stay in Stage 1; Clarify moves to the transition

Per the issue: approvals are part of doing the work (they gate a single tool), clarifications stabilise the answer path (they precede the conclusion). The proposal keeps .approval-card inline among tool cards, and places .clarify-card at the Stage 1 → Stage 2 seam, above the final answer.
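The placement rule can be sketched as a single pass over a turn's blocks. The block types used here (`thinking`, `tool`, `approval`, `clarify`, `text`) are hypothetical names for illustration, and the sketch assumes every `text` block belongs to the final answer; only the rule itself (approvals inline, clarify at the seam) comes from the proposal.

```javascript
// Split a turn's blocks into the three regions of the two-stage layout.
// Block type names are assumptions made for this sketch.
function partitionTurnBlocks(blocks) {
  const layout = { stage1: [], clarify: [], answer: [] };
  for (const block of blocks) {
    if (block.type === "clarify") {
      layout.clarify.push(block); // sits at the Stage 1 -> Stage 2 seam
    } else if (block.type === "text") {
      layout.answer.push(block); // Stage 2
    } else {
      // thinking, tool, and approval blocks stay inline, in their original order
      layout.stage1.push(block);
    }
  }
  return layout;
}
```

Keeping approvals in the same ordered list as tool cards preserves their position next to the command they gate.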

Approve card — inline in Stage 1

Hermes

Working… 0:18 · 1 tool

⚡ bash ls -la ~/.hermes/sessions · exit 0 ▶

🔐 Approve command

Hermes wants to run a potentially destructive command:

rm -rf ~/.hermes/sessions/*.json.bak

Permission gate sits next to the tools it gates.

Clarify card — Stage 1 → Stage 2 transition

Hermes

Worked for 0:12 · 2 tools ▶

📄 read_file package.json · 48 lines ▶

⚡ bash ls src/ · exit 0 ▶

❓ One quick question before I answer

I can wire the dev server either as an npm script in the existing package.json, or as a standalone CLI entry-point. Which would you prefer?

Stage 1 is already settled; the answer is paused on clarification.

4 · Stage 2 · Calm conclusion

What the "Answer" stage looks like on its own

Three small choices distinguish Stage 2 from a regular text block: (1) a thin horizontal divider above it, (2) a tiny gold Answer kicker aligned to the text rail, (3) a slightly taller line-height. No heavy borders, no boxed treatment: the emphasis comes from what is missing around it, not from ornament.

.p2s-answer (Stage 1 collapsed above)

Hermes

Worked for 1:42 · 4 tools · 2 thinking · 1 approval ▶

💡 Thought for 2.4s ▶

📄 read_file api/streaming.py ▶

⚡ bash grep -rn "tool_call_id" api/ ▶

Round 2

💡 Thought for 1.8s ▶

⚡ bash pytest -q · exit 0 · 2.4s ▶

✍️ edit_file api/streaming.py · +12 −3 ▶

Answer

Tool-call persistence was breaking because session.tool_calls was written after s.save() in api/streaming.py. I moved the attach step above the save, and added a fallback that reconstructs ordering from live tool-progress events when tool_call_id is absent on older sessions.

Net result:

Reloading mid-stream now preserves every tool card with args + output snippet.
Last-turn reasoning survives reload.
No schema migration needed — old sessions degrade gracefully.

Covered by the new regression in tests/test_tool_call_persistence.py.

11:42 AM · 2,481 tokens · 1.42s

5 · Open-question answers (picked defaults)

What this proposal commits to

Stage 1 on settle → partial collapse to a single worklog bar with counts. Click to re-expand. No "nuke to black box", no "keep everything open forever".
Final answer placement → sits beneath Stage 1, not replacing it. Visual distinction comes from the divider + kicker + spacing, not from a two-panel layout.
Clarify placement → at the Stage 1 → Stage 2 seam. Approvals stay inline with tools.
Timer → lives on Stage 1 only. Stops when the agent emits the first Stage 2 token; final label becomes "Worked for N:NN".
Signal for "answer has started" → first assistant text delta after all tool calls have resolved and no new tool_use is pending in the current round. Already present in the SSE stream per maintainer comment.
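The "answer has started" signal can be sketched as a small state machine over stream events. The event shapes below (`tool_use` and `tool_result` carrying an `id`, `text_delta` carrying `text`) are assumptions made for illustration, not the actual SSE payloads; the sketch also follows the stated rule literally, so text emitted before any tool call would count as the answer.

```javascript
// Returns a handler that reports true exactly once: on the first text delta
// seen while no tool call is pending. Event shapes are hypothetical.
function createAnswerStartDetector() {
  const pending = new Set(); // ids of tool calls that have not resolved yet
  let started = false;
  return function onEvent(event) {
    if (started) return false; // the signal fires only once per turn
    if (event.type === "tool_use") {
      pending.add(event.id);
    } else if (event.type === "tool_result") {
      pending.delete(event.id);
    } else if (event.type === "text_delta" && pending.size === 0) {
      started = true;
      return true; // stop the timer and flip is-live to is-settled here
    }
    return false;
  };
}
```

A caller would create one detector per assistant turn and feed it every event in arrival order.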

6 · DOM cheat-sheet

What changes vs index.html

New wrappers

.p2s-stage1[is-live|is-settled][is-open] — wraps the execution history inside an .assistant-segment.
.p2s-worklog — header of Stage 1. Pulsing dot + label + counts + caret. Clickable when settled.
.p2s-stage1-body — holds .thinking-card + .tool-card-row + .p2s-round-sep. Animated via max-height.
.p2s-round-sep — inline horizontal separator between tool/reasoning rounds.
.p2s-transition — thin gradient divider between Stage 1 and Stage 2.
.p2s-answer — wraps the final .msg-body + .msg-foot.
.p2s-answer-kicker — small gold Answer label.
.p2s-clarify-slot — placement slot for .clarify-card at the Stage 1/2 seam.

Unchanged

.thinking-card, .tool-card, .approval-card, .clarify-card, .msg-body, .msg-foot — all existing app CSS and existing markup.
.assistant-turn-blocks and .assistant-segment remain the top-level wrappers.
Tool cards still live as .tool-card-row siblings — now nested inside .p2s-stage1-body rather than as direct children of .messages-inner.
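To make the nesting concrete, here is a sketch that emits the wrapper skeleton for a settled turn as an HTML string. Several details are assumptions rather than part of the proposal: the use of `div` elements, the order of `.p2s-transition` and `.p2s-clarify-slot`, placing `.p2s-answer` inside the same `.assistant-segment`, and the function and parameter names. Inputs are assumed to be HTML already rendered and escaped by the existing renderer.

```javascript
// Hypothetical skeleton builder for a settled turn. All *Html arguments are
// assumed to be trusted, pre-escaped markup from the existing renderer.
function renderSettledTurn({
  open = false,
  worklogHtml,
  historyHtml,
  clarifyHtml = "",
  answerHtml,
  footHtml = "",
}) {
  const stateClasses = `is-settled${open ? " is-open" : ""}`;
  return [
    '<div class="assistant-segment">',
    `  <div class="p2s-stage1 ${stateClasses}">`,
    `    <div class="p2s-worklog">${worklogHtml}</div>`,
    `    <div class="p2s-stage1-body">${historyHtml}</div>`,
    "  </div>",
    '  <div class="p2s-transition"></div>',
    `  <div class="p2s-clarify-slot">${clarifyHtml}</div>`,
    '  <div class="p2s-answer">',
    '    <div class="p2s-answer-kicker">Answer</div>',
    `    <div class="msg-body">${answerHtml}</div>`,
    `    <div class="msg-foot">${footHtml}</div>`,
    "  </div>",
    "</div>",
  ].join("\n");
}
```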

Implementation notes

Renderer in static/messages.js wraps an assistant turn's non-final blocks in .p2s-stage1-body and appends the .p2s-worklog header once; toggles is-live/is-settled based on data-live-assistant.
static/boot.js SSE handler ticks the timer while is-live, increments counts on each tool_use, and flips the class when the first Stage 2 delta arrives.
Persistence: no schema change needed — the worklog summary can be derived on reload from the existing persisted tool-call list + thinking rounds.
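As a rough illustration of the persistence note, the worklog counts can be recomputed from stored data on reload. The field names `toolCalls` and `thinkingRounds` are hypothetical stand-ins for whatever the session store actually persists; elapsed time is left out because this sketch makes no assumption about how it is stored.

```javascript
// Derive worklog counts from a persisted turn. Field names are assumptions.
function deriveWorklogCounts(turn) {
  const tools = Array.isArray(turn.toolCalls) ? turn.toolCalls.length : 0;
  const thinking = Array.isArray(turn.thinkingRounds) ? turn.thinkingRounds.length : 0;
  return { tools, thinking };
}

// Render the counts portion of the settled worklog line.
function countsSummary({ tools, thinking }) {
  const parts = [];
  if (tools > 0) parts.push(`${tools} ${tools === 1 ? "tool" : "tools"}`);
  if (thinking > 0) parts.push(`${thinking} thinking`);
  return parts.join(" · ");
}
```

Missing or malformed fields fall back to zero, so older sessions would simply show a shorter summary.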