Two-Stage Chat UX: Proposal for issue #536
Companion to index.html. Shows Working → Final answer as a distinct two-phase interaction model.
0 · The model

One turn, two stages

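Today an assistant turn is a flat stream: thinking card → tool cards → answer, all stacked inline with equal visual weight. The proposal wraps the execution history in a .p2s-stage1 container with a worklog bar as its header, and marks the final answer as .p2s-answer. The same DOM renders three ways, depending on the state classes on .p2s-stage1: live (is-live), settled and collapsed (is-settled), and settled and expanded (is-settled with is-open).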

1 · Current vs proposed — settled turn

Side-by-side comparison

Same turn, same tool calls, same answer. Left is what #587 ships today. Right is the proposal: execution history collapses to a one-line summary; the final answer stands alone with a small Answer kicker.

Current (PR #587)

Does our dev server pick up the workspace from an env var or a flag?

Hermes
💡 Thought for 3.1s
Check how the CLI resolves workspace:
grep for HERMES_WORKSPACE and --workspace
inspect argv vs env precedence.
bash grep -rn "HERMES_WORKSPACE" . · exit 0
cli/main.py:14:WORKSPACE_ENV = "HERMES_WORKSPACE"
cli/main.py:92:    ws = os.getenv(WORKSPACE_ENV) or args.workspace
📄 read_file cli/main.py · 148 lines
Hermes

Both work, but env wins. The CLI reads HERMES_WORKSPACE first and only falls back to the --workspace flag if the env var is unset.

So in practice:

  • CI / daemons → set the env var.
  • Ad-hoc runs → pass --workspace.
Everything stacks equally — the answer is just the next block.
Proposed — two-stage, settled

Does our dev server pick up the workspace from an env var or a flag?

Hermes
Worked for 0:08 · 2 tools · 1 thinking round
💡 Thought for 3.1s
Check how the CLI resolves workspace:
grep for HERMES_WORKSPACE and --workspace
inspect argv vs env precedence.
bash grep -rn "HERMES_WORKSPACE" . · exit 0
📄 read_file cli/main.py · 148 lines
Answer

Both work, but env wins. The CLI reads HERMES_WORKSPACE first and only falls back to the --workspace flag if the env var is unset.

So in practice:

  • CI / daemons → set the env var.
  • Ad-hoc runs → pass --workspace.
Click the worklog bar to expand the execution history.
2 · Stage 1 · Live run

Working timer + live execution history

The worklog bar at the top is the anchor for the whole active run: pulsing dot, elapsed timer that ticks every second, and live counts that increment as tool cards resolve. Thinking cards and tool cards render inside .p2s-stage1-body exactly as today. A Round N separator is inserted when the agent starts a new reasoning/tool cycle.
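
The timer label on the worklog bar could be produced by a small pure formatter. The following is a minimal sketch, assuming elapsed time is tracked in seconds; formatElapsed and worklogLabel are hypothetical names for illustration, not functions from the existing code.

```javascript
// Hypothetical helpers; names and signatures are illustrative only.

// Format elapsed seconds as M:SS, e.g. 42 -> "0:42".
function formatElapsed(totalSeconds) {
  const safe = Math.max(0, Math.floor(totalSeconds));
  const minutes = Math.floor(safe / 60);
  const seconds = String(safe % 60).padStart(2, "0");
  return `${minutes}:${seconds}`;
}

// Build the worklog label for a live or a settled Stage 1.
function worklogLabel(isLive, totalSeconds) {
  const time = formatElapsed(totalSeconds);
  return isLive ? `Working… ${time}` : `Worked for ${time}`;
}
```

Keeping the formatter separate from the once-per-second tick means the same function can produce both the live label and the settled "Worked for" label.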

.p2s-stage1.is-live — Round 1 done, Round 2 running
Hermes
Working… 0:42 · 3 tools · 2 thinking
💡 Thought for 2.4s
Need to map the streaming code path first,
then check the persistence layer.
📄 read_file api/streaming.py · 612 lines
bash grep -rn "tool_call_id" api/ · exit 0 · 88ms
Round 2
💡 Thought for 1.8s
Streaming looks fine — drill into how
tool_calls get attached before save.
bash pytest tests/test_tool_call_persistence.py -q
3 · Approve vs Clarify — placement

Approvals stay in Stage 1; Clarify moves to the transition

Per the issue: approvals are part of doing the work (they gate a single tool), clarifications stabilise the answer path (they precede the conclusion). The proposal keeps .approval-card inline among tool cards, and places .clarify-card at the Stage 1 → Stage 2 seam, above the final answer.
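
The placement rule can be sketched as a routing function over a turn's blocks. This is only an illustration: the block shape ({ type: ... }) and the name placeBlocks are assumptions, not part of the existing renderer.

```javascript
// Hypothetical block shape: { type: "thinking" | "tool" | "approval" | "clarify" | "text" }.
function placeBlocks(blocks) {
  const placed = { stage1: [], seam: [], answer: [] };
  for (const block of blocks) {
    if (block.type === "clarify") {
      // Clarifications sit at the Stage 1 -> Stage 2 seam.
      placed.seam.push(block);
    } else if (block.type === "text") {
      // Final assistant text is the Stage 2 answer.
      placed.answer.push(block);
    } else {
      // Thinking, tool, and approval blocks stay in Stage 1.
      // Assumption: unknown block types also stay here so the answer stays clean.
      placed.stage1.push(block);
    }
  }
  return placed;
}
```

Relative order is preserved within each group, so an approval card stays next to the tool card it gates.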

Approve card — inline in Stage 1
Hermes
Working… 0:18 · 1 tool
bash ls -la ~/.hermes/sessions · exit 0
🔐 Approve command

Hermes wants to run a potentially destructive command:

rm -rf ~/.hermes/sessions/*.json.bak
Permission gate sits next to the tools it gates.
Clarify card — Stage 1 → Stage 2 transition
Hermes
Worked for 0:12 · 2 tools
📄 read_file package.json · 48 lines
bash ls src/ · exit 0
One quick question before I answer

I can wire the dev server either as an npm script in the existing package.json, or as a standalone CLI entry-point. Which would you prefer?

Stage 1 is already settled; the answer is paused on clarification.
4 · Stage 2 · Calm conclusion

What the "Answer" stage looks like on its own

Three small choices distinguish Stage 2 from a regular text block: (1) a thin horizontal divider above it, (2) a tiny gold Answer kicker aligned to the text rail, (3) a slightly taller line-height. There are no heavy borders and no boxed treatment; the emphasis comes from what is missing around it, not from ornament.

.p2s-answer (Stage 1 collapsed above)
Hermes
Worked for 1:42 · 4 tools · 2 thinking · 1 approval
💡 Thought for 2.4s
📄 read_file api/streaming.py
bash grep -rn "tool_call_id" api/
Round 2
💡 Thought for 1.8s
bash pytest -q · exit 0 · 2.4s
✍️ edit_file api/streaming.py · +12 −3
Answer

Tool-call persistence was breaking because session.tool_calls was written after s.save() in api/streaming.py. I moved the attach step above the save, and added a fallback that reconstructs ordering from live tool-progress events when tool_call_id is absent on older sessions.

Net result:

  • Reloading mid-stream now preserves every tool card with args + output snippet.
  • Last-turn reasoning survives reload.
  • No schema migration needed — old sessions degrade gracefully.

Covered by the new regression in tests/test_tool_call_persistence.py.

11:42 AM · 2,481 tokens · 1.42s
5 · Open-question answers (picked defaults)

What this proposal commits to

  • Stage 1 on settle → partial collapse to a single worklog bar with counts. Click to re-expand. No "nuke to black box", no "keep everything open forever".
  • Final answer placement → sits beneath Stage 1, not replacing it. Visual distinction comes from the divider + kicker + spacing, not from a two-panel layout.
  • Clarify placement → at the Stage 1 → Stage 2 seam. Approvals stay inline with tools.
  • Timer → lives on Stage 1 only. Stops when the agent emits the first Stage 2 token; final label becomes "Worked for N:NN".
  • Signal for "answer has started" → first assistant text delta after all tool calls have resolved and no new tool_use is pending in the current round. Already present in the SSE stream per maintainer comment.
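
The "answer has started" signal could be tracked with a small state machine over stream events. This is a sketch under assumptions: the event shapes (tool_use and tool_result carrying an id, text_delta for assistant text) and the name createAnswerDetector are hypothetical and may not match the real SSE payloads.

```javascript
// Hypothetical event shapes:
//   { type: "tool_use", id }, { type: "tool_result", id }, { type: "text_delta", text }
function createAnswerDetector() {
  const pending = new Set(); // ids of tool calls still awaiting a result
  let started = false;

  return {
    // Returns true only for the single event that starts Stage 2.
    onEvent(event) {
      if (started) return false;
      if (event.type === "tool_use") {
        pending.add(event.id);
      } else if (event.type === "tool_result") {
        pending.delete(event.id);
      } else if (event.type === "text_delta" && pending.size === 0) {
        started = true;
        return true;
      }
      return false;
    },
    hasStarted() {
      return started;
    },
  };
}
```

A caller would stop the timer and flip is-live to is-settled when onEvent returns true. Note that this sketch treats any text delta with no pending tool call as the start of the answer, so text emitted before the first tool call of a turn would need extra handling.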
6 · DOM cheat-sheet

What changes vs index.html

New wrappers

  • .p2s-stage1[is-live|is-settled][is-open] — wraps the execution history inside an .assistant-segment.
  • .p2s-worklog — header of Stage 1. Pulsing dot + label + counts + caret. Clickable when settled.
  • .p2s-stage1-body — holds .thinking-card + .tool-card-row + .p2s-round-sep. Animated via max-height.
  • .p2s-round-sep — inline horizontal separator between tool/reasoning rounds.
  • .p2s-transition — thin gradient divider between Stage 1 and Stage 2.
  • .p2s-answer — wraps the final .msg-body + .msg-foot.
  • .p2s-answer-kicker — small gold Answer label.
  • .p2s-clarify-slot — placement slot for .clarify-card at the Stage 1/2 seam.
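
One possible nesting of these wrappers is sketched below as a string-building function, so it can run without a DOM. The function name renderTurn, its parameters, and the exact order of the seam elements are assumptions; the inputs are treated as trusted HTML, and the result is meant to sit inside the existing .assistant-segment.

```javascript
// Hypothetical renderer sketch; parameter names are illustrative only.
function renderTurn({ live, open = false, worklog, historyHtml, clarifyHtml = "", answerHtml = null }) {
  const classes = ["p2s-stage1", live ? "is-live" : "is-settled"];
  if (open) classes.push("is-open");

  // Stage 1: worklog header followed by the execution history body.
  const parts = [
    `<div class="${classes.join(" ")}">`,
    `<div class="p2s-worklog">${worklog}</div>`,
    `<div class="p2s-stage1-body">${historyHtml}</div>`,
    `</div>`,
  ];

  // The seam and the answer only appear once Stage 1 has settled.
  if (!live) {
    parts.push(`<div class="p2s-transition"></div>`);
    parts.push(`<div class="p2s-clarify-slot">${clarifyHtml}</div>`);
    if (answerHtml !== null) {
      parts.push(
        `<div class="p2s-answer">`,
        `<div class="p2s-answer-kicker">Answer</div>`,
        answerHtml,
        `</div>`
      );
    }
  }
  return parts.join("\n");
}
```

Leaving answerHtml as null while a clarification is pending produces the "Stage 1 settled, answer paused" state without a separate code path.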

Unchanged

  • .thinking-card, .tool-card, .approval-card, .clarify-card, .msg-body, .msg-foot — all existing app CSS and existing markup.
  • .assistant-turn-blocks and .assistant-segment remain the top-level wrappers.
  • Tool cards still live as .tool-card-row siblings — now nested inside .p2s-stage1-body rather than as direct children of .messages-inner.

Implementation notes

  • Renderer in static/messages.js wraps an assistant turn's non-final blocks in .p2s-stage1-body and appends the .p2s-worklog header once; toggles is-live/is-settled based on data-live-assistant.
  • static/boot.js SSE handler ticks the timer while is-live, increments counts on each tool_use, and flips the class when the first Stage 2 delta arrives.
  • Persistence: no schema change needed — the worklog summary can be derived on reload from the existing persisted tool-call list + thinking rounds.
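
Deriving the worklog counts on reload could look like the sketch below. The persisted shape (toolCalls and thinkingRounds arrays) and the name summarizeWorklog are assumptions for illustration. The elapsed time is left out because the notes above do not say whether a duration or timestamps are persisted.

```javascript
// Hypothetical persisted turn shape: { toolCalls: [...], thinkingRounds: [...] }.
function summarizeWorklog(turn) {
  const tools = (turn.toolCalls || []).length;
  const thinking = (turn.thinkingRounds || []).length;

  const labels = [];
  if (tools > 0) labels.push(`${tools} ${tools === 1 ? "tool" : "tools"}`);
  if (thinking > 0) labels.push(`${thinking} thinking`);
  return labels; // count labels for the worklog bar, in display order
}
```

Because the counts come from data that is already saved, a reloaded session and a freshly settled turn would show the same worklog bar.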