diff --git a/CHANGELOG.md b/CHANGELOG.md
index 5ca120e..91a31a6 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,44 @@
 
 ---
 
+## [v0.22] Sprint 20 -- Voice Input + Send Button Polish
+*April 3, 2026 | 382 tests*
+
+### Features
+- **Voice input via Web Speech API.** Microphone button in the composer.
+  Tap to start recording, tap again (or send) to stop. Live interim
+  transcription appears in the textarea. Auto-stops after ~2s of silence.
+  Final text stays editable before sending. Appends to existing textarea
+  content rather than replacing it. Button hidden when browser doesn't
+  support Web Speech API. No API keys, no external libraries, no server
+  changes. Works in Chrome, Edge, Safari (partial). Firefox unsupported
+  (button stays hidden).
+- **Send button polish.** Send button redesigned as a 34px icon-only circle
+  with upward arrow SVG. Hidden by default — appears with pop-in spring
+  animation when textarea has content or files are attached. Disappears
+  on send or when content is cleared. Hidden while agent is responding.
+  Blue fill (#7cb9ff) with glow, scale hover/active for tactile feedback.
+
+### Architecture
+- Voice input IIFE in `boot.js`: SpeechRecognition lifecycle with
+  `continuous=false`, `interimResults=true`, error handling via `showToast()`.
+- `_prefix` variable snapshots existing textarea content on recording start
+  so dictation appends rather than overwrites.
+- `btnSend.onclick` stops active recognition before sending (send guard).
+- CSS: `.mic-btn`, `.mic-btn.recording` (red pulse), `.mic-status`,
+  `.mic-dot`, `@keyframes mic-pulse`.
+- `updateSendBtn()` in `ui.js` tracks textarea content, pending files,
+  and busy state. Hooked into `setBusy()`, `renderTray()`, `autoResize()`,
+  and input event listener.
+- CSS: `.send-btn` redesigned (circle, glow), `.send-btn.visible`
+  `@keyframes send-pop-in` (spring animation).
+
+### Tests
+- 52 new tests in `test_sprint20.py`: voice input HTML, CSS, JS, append
+  behaviour, error handling, regressions. Total: **382 tests**.
+
+---
+
 ## [v0.21] Sprint 19 -- Auth + Security Hardening
 *April 3, 2026 | 328 tests*
 
@@ -676,4 +714,4 @@ Three-panel layout: sessions sidebar, chat area, workspace panel.
 
 ---
 
-*Last updated: v0.21, April 3, 2026 | Tests: 328*
+*Last updated: v0.22, April 3, 2026 | Tests: 382*
diff --git a/SPRINTS.md b/SPRINTS.md
index f65c972..a4bfd06 100644
--- a/SPRINTS.md
+++ b/SPRINTS.md
@@ -1,6 +1,6 @@
 # Hermes Web UI -- Forward Sprint Plan
 
-> Current state: v0.21 | 328 tests | Daily driver ready
+> Current state: v0.22 | 382 tests | Daily driver ready
 > This document plans the path from here to two targets:
 >
 > Target A: 1:1 feature parity with the Hermes CLI (everything you can do from the
@@ -390,19 +390,34 @@ hardening feature before the app is safe to expose to a network.
 
 ---
 
-## Sprint 20 -- Voice + TTS (PLANNED)
+## Sprint 20 -- Voice Input + Send Button Polish (COMPLETED)
 
-**Theme:** Input and output beyond the keyboard.
+**Theme:** Input refinements — voice and visual polish.
 
-**Why now:** Voice works in the Hermes CLI. Mirror that capability in the web UI.
-TTS playback makes long responses more accessible. Both are achievable with
-existing Whisper and TTS APIs.
+**Why now:** Voice input was the next feature on the roadmap. The send button
+UX was a low-effort high-impact polish opportunity that pairs naturally.
+
+### Track A: Bugs
+- **Send button always visible.** The old pill-shaped "Send" button was always
+  visible even with an empty textarea, wasting space. Now hidden by default,
+  appears only when there is content to send.
 
 ### Track B: Features
-- **Voice input (Whisper).** Microphone icon in composer. Hold to record,
-  release to transcribe. Transcribed text editable before send.
-- **TTS playback.** Speaker icon on assistant messages. Audio playback via
-  OpenAI TTS or ElevenLabs API. Optional auto-play in settings.
+- **Voice input (Web Speech API).** Microphone button in composer. Tap to
+  record, tap again to stop. Live interim transcription in textarea. Auto-stops
+  after ~2s of silence. Appends to existing text. Hidden when browser doesn't
+  support Web Speech API. No API keys, no server changes.
+- **Send button polish.** Icon-only 34px circle with upward arrow SVG. Pop-in
+  spring animation on appear. Scale hover/active for tactile feedback. Hidden
+  while agent is responding.
+
+### Track C: Architecture
+- Voice input IIFE in `boot.js` with SpeechRecognition lifecycle.
+- `updateSendBtn()` in `ui.js` hooked into setBusy, renderTray, autoResize.
+
+**Tests:** 52 new (voice) + 33 new (send button). Total: 413.
+**Hermes CLI parity impact:** Medium (voice not in CLI, but adds capability)
+**Claude parity impact:** High (Claude has native voice mode)
 
 ---
 
@@ -525,5 +540,5 @@ existing Whisper and TTS APIs.
 ---
 
 *Last updated: April 3, 2026*
-*Current version: v0.21 | 328 tests*
-*Next sprint: Sprint 20 (Voice + TTS)*
+*Current version: v0.22 | 382 tests*
+*Next sprint: Sprint 21 (Mobile Responsive)*
diff --git a/static/index.html b/static/index.html
index dddd6af..84d04d7 100644
--- a/static/index.html
+++ b/static/index.html
@@ -13,7 +13,7 @@