fix: silent agent errors, stale model list, live model fetching (#377)
* fix: silent errors, stale models, live model fetching (#373, #374, #375)

  - api/streaming.py: detect empty agent response (`_assistant_added` check); emit `apperror` (type `'no_response'` or `'auth_mismatch'`) instead of a silent `done`
  - api/streaming.py: add `_token_sent` flag so the guard also works for streaming agents
  - static/messages.js: `done` handler gains a belt-and-suspenders guard for zero replies
  - static/messages.js: `apperror` handler labels the `'no_response'` type distinctly
  - api/config.py: remove `gpt-4o` and `o3` from `_FALLBACK_MODELS` and `_PROVIDER_MODELS['openai']` (superseded by `gpt-5.4-mini` and `o4-mini`)
  - api/routes.py: new `/api/models/live?provider=` endpoint; fetches `/v1/models` from the provider API with a B310 scheme check and an SSRF guard
  - static/ui.js: `_fetchLiveModels()` runs a background fetch after the static list loads, appends new models to the dropdown, caches per session, and skips unsupported providers

  Other:

  - tests/test_issues_373_374_375.py: 25 new structural tests
  - tests/test_regressions.py: extend the done-handler window from 1500 to 2500 chars
  - CHANGELOG.md: v0.50.19 entry; 947 tests (up from 922)

* fix: SSRF hostname bypass + auth detection operator precedence

  1. routes.py: the SSRF guard used substring matching (`any(k in hostname)`), which allows a bypass via hostnames like `evil-ollama.attacker.com`. Changed to exact hostname matching against a fixed set of known local hostnames (`localhost`, `127.0.0.1`, `0.0.0.0`, `::1`).
  2. streaming.py: the `_is_auth` detection had a Python operator-precedence bug in a conditional expression. The line `'AuthenticationError' in type(...).__name__ if _last_err else False` parsed with the conditional absorbing the rest of the `or`-chain, so the whole check collapsed to `False` whenever `_last_err` was falsy. Fixed to `(_last_err and 'AuthenticationError' in ...)`.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: fix v0.50.20 CHANGELOG version number and test count (949 tests)

---------

Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
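The operator-precedence bug in the second commit can be reproduced with a minimal sketch; the variable names here are illustrative, not the actual `streaming.py` identifiers:

```python
# Minimal reproduction of the precedence bug described above; variable
# names are illustrative, not the actual streaming.py identifiers.
sent = True        # e.g. at least one token was streamed successfully
_last_err = None   # no exception was captured

# Bug: the conditional expression binds looser than `or`, so this parses as
#   (sent or "AuthenticationError" in type(_last_err).__name__) if _last_err else False
# and the whole or-chain collapses to False whenever _last_err is falsy.
buggy = sent or "AuthenticationError" in type(_last_err).__name__ if _last_err else False

# Fix: short-circuiting `and` keeps each clause independent, and is also
# safe to evaluate when _last_err is None.
fixed = sent or (_last_err and "AuthenticationError" in type(_last_err).__name__)

print(buggy, fixed)  # False True
```

Because `if/else` has the lowest precedence of any Python operator expression, the falsy `_last_err` silently discarded the `sent` flag as well.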
CHANGELOG.md
- **Workspace file downloads no longer crash for Unicode filenames** (`api/routes.py`): Clicking a PDF or other file with Chinese, Japanese, Arabic, or other non-ASCII characters in its name caused a `UnicodeEncodeError` because Python's HTTP server requires header values to be latin-1 encodable. A new `_content_disposition_value(disposition, filename)` helper centralises `Content-Disposition` generation: it strips CR/LF (injection guard), builds an ASCII fallback for the legacy `filename=` parameter (non-ASCII chars replaced with `_`), and preserves the full UTF-8 name in `filename*=UTF-8''...` per RFC 5987. Both `attachment` and `inline` responses use it.
- 2 new integration tests in `tests/test_sprint29.py` covering Chinese filenames for both download and inline responses, verifying the header is latin-1 encodable and `filename*=UTF-8''` is present; 924 tests total (up from 922)
## [v0.50.20] Silent error fix, stale model cleanup, live model fetching (fixes #373, #374, #375)
### Fix: Chat no longer silently swallows agent failures (fixes #373)
- **`api/streaming.py`**: After `run_conversation()` completes, the server now checks whether the agent produced any assistant reply. If not (e.g., auth error swallowed internally, model unavailable, network timeout), it emits an `apperror` SSE event with a clear message and type (`auth_mismatch` or `no_response`) instead of silently emitting `done`. A `_token_sent` flag tracks whether any streaming tokens were sent.
- **`static/messages.js`**: The `done` handler has a belt-and-suspenders guard — if `done` arrives but no assistant message exists in the session (the `apperror` path should usually catch this first), an inline "**No response received.**" message is shown. The `apperror` handler now also recognises the new `no_response` type with a distinct label.
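A rough sketch of the server-side guard, with invented names for everything except the documented `_token_sent` flag:

```python
# Hedged sketch of the post-run guard described above; finish_stream(),
# emit() and the flag names other than _token_sent are illustrative.
def finish_stream(emit, assistant_added: bool, token_sent: bool, last_err=None):
    if assistant_added or token_sent:
        emit("done", {})
        return
    # No reply at all: classify the failure instead of emitting a silent done.
    is_auth = bool(last_err and "AuthenticationError" in type(last_err).__name__)
    emit("apperror", {
        "type": "auth_mismatch" if is_auth else "no_response",
        "message": ("Authentication failed for the selected provider."
                    if is_auth else "The agent produced no response."),
    })
```

The `token_sent` check matters because a streaming agent may have delivered tokens without ever appending a final assistant message object.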
### Cleanup: Remove stale OpenAI models from default list (fixes #374)
- **`api/config.py`**: `gpt-4o` and `o3` removed from `_FALLBACK_MODELS` and `_PROVIDER_MODELS["openai"]`. Both are superseded by newer models already in the list (`gpt-5.4-mini` for general use, `o4-mini` for reasoning). The Copilot provider list retains `gpt-4o` as it remains available via the Copilot API.
### Feature: Live model fetching from provider API (closes #375)
- **`api/routes.py`**: New `/api/models/live?provider=openai` endpoint. Fetches the actual model list from the provider's `/v1/models` API using the user's configured credentials. Includes URL scheme validation (B310), SSRF guard (private IP block), and graceful `not_supported` response for providers without a standard `/v1/models` endpoint (Anthropic, Google). Response normalised to `{id, label}` list, filtered to chat models.
- **`static/ui.js`**: `populateModelDropdown()` now calls `_fetchLiveModels()` in the background after rendering the static list. Live models that aren't already in the dropdown are appended to the provider's optgroup. Results are cached per session so only one fetch per provider per page load. Skips Anthropic and Google (unsupported). Falls back to static list silently if the fetch fails.
- 25 new tests in `tests/test_issues_373_374_375.py`; 949 tests total (up from 924)
## [v0.50.18] Recover from invalid default workspace paths (PR #366)
- **WebUI no longer breaks when the configured default workspace is unavailable** (`api/config.py`): The workspace resolution path was refactored into three composable functions — `_workspace_candidates()`, `_ensure_workspace_dir()`, and `resolve_default_workspace()`. When the configured workspace (from env var, settings file, or passed path) cannot be created or accessed, the server falls back through an ordered priority list: `HERMES_WEBUI_DEFAULT_WORKSPACE` env var → `~/workspace` (if exists) → `~/work` (if exists) → `~/workspace` (create it) → `STATE_DIR/workspace`.