fix: silent agent errors, stale model list, live model fetching (#377)

* fix: silent errors, stale models, live model fetching (#373, #374, #375)

- api/streaming.py: detect empty agent response (_assistant_added check),
  emit apperror(type='no_response' or 'auth_mismatch') instead of silent done
- api/streaming.py: add _token_sent flag so guard works for streaming agents
- static/messages.js: done handler belt-and-suspenders guard for zero replies
- static/messages.js: apperror handler labels 'no_response' type distinctly

- api/config.py: remove gpt-4o and o3 from _FALLBACK_MODELS and
  _PROVIDER_MODELS['openai'] (superseded by gpt-5.4-mini and o4-mini)

- api/routes.py: new /api/models/live?provider= endpoint, fetches /v1/models
  from provider API with B310 scheme check + SSRF guard (usage sketch below)
- static/ui.js: _fetchLiveModels() background fetch after static list loads,
  appends new models to dropdown, caches per session, skips unsupported providers
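
  As a rough usage sketch of the new endpoint (host, port, and auth cookie are
  assumptions; the response shape {provider, models: [{id, label}], count} and
  the error fallback follow the handler added in api/routes.py):

    import json
    import urllib.request

    # Hypothetical local WebUI address; adjust to your deployment.
    url = "http://127.0.0.1:8080/api/models/live?provider=openai"
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)

    if data.get("error"):
        # e.g. "not_supported" for Anthropic/Google -- fall back to static list
        print("live fetch unavailable:", data["error"])
    else:
        for m in data["models"]:
            print(m["id"], "->", m["label"])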

Other:
- tests/test_issues_373_374_375.py: 25 new structural tests
- tests/test_regressions.py: extend done-handler window 1500->2500 chars
- CHANGELOG.md: v0.50.19 entry; 947 tests (up from 922)

* fix: SSRF hostname bypass + auth detection operator precedence

1. routes.py: the SSRF guard used substring matching (any(k in hostname)),
   which allowed bypass via hostnames like evil-ollama.attacker.com.
   Changed to exact hostname matching against a fixed set of known
   local hostnames (localhost, 127.0.0.1, 0.0.0.0, ::1).
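
   A minimal sketch of the difference, assuming the old allow-list contained
   substrings like "ollama" (hostnames here are illustrative only):

     known_substrings = ("localhost", "ollama")
     hostname = "evil-ollama.attacker.com"
     # Old check: substring match -- the attacker hostname slips through.
     print(any(k in hostname for k in known_substrings))  # True (bypass)

     # New check: exact membership in a fixed set of local hostnames.
     KNOWN_LOCAL_HOSTS = {"localhost", "127.0.0.1", "0.0.0.0", "::1"}
     print(hostname in KNOWN_LOCAL_HOSTS)     # False (blocked)
     print("localhost" in KNOWN_LOCAL_HOSTS)  # True (allowed)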

2. streaming.py: _is_auth detection had a Python operator precedence
   bug in a conditional expression. The line:
     'AuthenticationError' in type(...).__name__ if _last_err else False
   sat inside an or-chain, and because `or` binds tighter than the
   conditional, the neighbouring operands were folded into its two
   branches, silently skipping checks whenever _last_err was falsy.
   Fixed to: (_last_err and 'AuthenticationError' in ...)
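
   A self-contained illustration of the misparse (stand-in booleans replace
   the real membership tests from streaming.py):

     has_401 = True        # stands in for: '401' in _err_str
     _last_err = None      # no exception object was captured

     # Conditional expressions bind more loosely than `or`, so this parses as
     #   (has_401 or ...) if _last_err else False
     # and has_401 is silently discarded whenever _last_err is falsy.
     buggy = (
         has_401
         or 'AuthenticationError' in type(_last_err).__name__ if _last_err else False
     )
     print(buggy)          # False, despite has_401 being True

     # The fix keeps the or-chain flat and short-circuits safely on None.
     fixed = (
         has_401
         or (_last_err and 'AuthenticationError' in type(_last_err).__name__)
     )
     print(bool(fixed))    # True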

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: fix v0.50.20 CHANGELOG version number and test count (949 tests)

---------

Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
nesquena-hermes committed 2026-04-13 15:52:35 -07:00 (committed by GitHub)
parent 78de40e015 · commit 7a80e73eb2
8 changed files with 485 additions and 6 deletions

CHANGELOG.md

@@ -10,6 +10,24 @@
- **Workspace file downloads no longer crash for Unicode filenames** (`api/routes.py`): Clicking a PDF or other file with Chinese, Japanese, Arabic, or other non-ASCII characters in its name caused a `UnicodeEncodeError` because Python's HTTP server requires header values to be latin-1 encodable. A new `_content_disposition_value(disposition, filename)` helper centralises `Content-Disposition` generation: it strips CR/LF (injection guard), builds an ASCII fallback for the legacy `filename=` parameter (non-ASCII chars replaced with `_`), and preserves the full UTF-8 name in `filename*=UTF-8''...` per RFC 5987. Both `attachment` and `inline` responses use it.
- 2 new integration tests in `tests/test_sprint29.py` covering Chinese filenames for both download and inline responses, verifying the header is latin-1 encodable and `filename*=UTF-8''` is present; 924 tests total (up from 922)
## [v0.50.20] Silent error fix, stale model cleanup, live model fetching (fixes #373, #374, #375)
### Fix: Chat no longer silently swallows agent failures (fixes #373)
- **`api/streaming.py`**: After `run_conversation()` completes, the server now checks whether the agent produced any assistant reply. If not (e.g., auth error swallowed internally, model unavailable, network timeout), it emits an `apperror` SSE event with a clear message and type (`auth_mismatch` or `no_response`) instead of silently emitting `done`. A `_token_sent` flag tracks whether any streaming tokens were sent.
- **`static/messages.js`**: The `done` handler has a belt-and-suspenders guard — if `done` arrives but no assistant message exists in the session (the `apperror` path should usually catch this first), an inline "**No response received.**" message is shown. The `apperror` handler now also recognises the new `no_response` type with a distinct label.
### Cleanup: Remove stale OpenAI models from default list (fixes #374)
- **`api/config.py`**: `gpt-4o` and `o3` removed from `_FALLBACK_MODELS` and `_PROVIDER_MODELS["openai"]`. Both are superseded by newer models already in the list (`gpt-5.4-mini` for general use, `o4-mini` for reasoning). The Copilot provider list retains `gpt-4o` as it remains available via the Copilot API.
### Feature: Live model fetching from provider API (closes #375)
- **`api/routes.py`**: New `/api/models/live?provider=openai` endpoint. Fetches the actual model list from the provider's `/v1/models` API using the user's configured credentials. Includes URL scheme validation (B310), SSRF guard (private IP block), and graceful `not_supported` response for providers without a standard `/v1/models` endpoint (Anthropic, Google). Response normalised to `{id, label}` list, filtered to chat models.
- **`static/ui.js`**: `populateModelDropdown()` now calls `_fetchLiveModels()` in the background after rendering the static list. Live models that aren't already in the dropdown are appended to the provider's optgroup. Results are cached per session so only one fetch per provider per page load. Skips Anthropic and Google (unsupported). Falls back to static list silently if the fetch fails.
- 25 new tests in `tests/test_issues_373_374_375.py`; 949 tests total (up from 924)
## [v0.50.18] Recover from invalid default workspace paths (PR #366)
- **WebUI no longer breaks when the configured default workspace is unavailable** (`api/config.py`): The workspace resolution path was refactored into three composable functions — `_workspace_candidates()`, `_ensure_workspace_dir()`, and `resolve_default_workspace()`. When the configured workspace (from env var, settings file, or passed path) cannot be created or accessed, the server falls back through an ordered priority list: `HERMES_WEBUI_DEFAULT_WORKSPACE` env var → `~/workspace` (if exists) → `~/work` (if exists) → `~/workspace` (create it) → `STATE_DIR/workspace`.
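
A rough sketch of the fallback chain this entry describes (names follow the
entry; STATE_DIR handling and error details in the real api/config.py may
differ):

    import os
    from pathlib import Path

    STATE_DIR = Path.home() / ".hermes"  # assumed location, for illustration

    def _workspace_candidates():
        """Yield (path, must_already_exist) in priority order."""
        env = os.environ.get("HERMES_WEBUI_DEFAULT_WORKSPACE")
        if env:
            yield Path(env), False
        yield Path.home() / "workspace", True   # use only if it exists
        yield Path.home() / "work", True        # use only if it exists
        yield Path.home() / "workspace", False  # otherwise create it
        yield STATE_DIR / "workspace", False    # last-resort fallback

    def resolve_default_workspace():
        for path, must_exist in _workspace_candidates():
            if must_exist and not path.is_dir():
                continue
            try:
                path.mkdir(parents=True, exist_ok=True)
                return path
            except OSError:
                continue  # cannot create/access: try the next candidate

    print(resolve_default_workspace())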

api/config.py

@@ -406,8 +406,6 @@ CLI_TOOLSETS = get_config().get("platform_toolsets", {}).get("cli", _DEFAULT_TOO
# Hardcoded fallback models (used when no config.yaml or agent is available)
_FALLBACK_MODELS = [
    {"provider": "OpenAI", "id": "openai/gpt-5.4-mini", "label": "GPT-5.4 Mini"},
-   {"provider": "OpenAI", "id": "openai/gpt-4o", "label": "GPT-4o"},
-   {"provider": "OpenAI", "id": "openai/o3", "label": "o3"},
    {"provider": "OpenAI", "id": "openai/o4-mini", "label": "o4-mini"},
    {
        "provider": "Anthropic",
@@ -463,8 +461,6 @@ _PROVIDER_MODELS = {
    ],
    "openai": [
        {"id": "gpt-5.4-mini", "label": "GPT-5.4 Mini"},
-       {"id": "gpt-4o", "label": "GPT-4o"},
-       {"id": "o3", "label": "o3"},
        {"id": "o4-mini", "label": "o4-mini"},
    ],
    "openai-codex": [

api/routes.py

@@ -341,6 +341,9 @@ def handle_get(handler, parsed) -> bool:
if parsed.path == "/api/models": if parsed.path == "/api/models":
return j(handler, get_available_models()) return j(handler, get_available_models())
if parsed.path == "/api/models/live":
return _handle_live_models(handler, parsed)
if parsed.path == "/api/settings": if parsed.path == "/api/settings":
settings = load_settings() settings = load_settings()
# Never expose the stored password hash to clients # Never expose the stored password hash to clients
@@ -1410,6 +1413,144 @@ def _handle_approval_inject(handler, parsed):
return j(handler, {"error": "session_id required"}, status=400) return j(handler, {"error": "session_id required"}, status=400)
def _handle_live_models(handler, parsed):
    """Fetch the live model list from a provider's /v1/models endpoint.

    Returns the provider's actual model catalog so the UI can show all
    available models, not just the hardcoded fallback list.

    Query params:
        provider (optional) — provider ID to fetch for; defaults to active
        base_url (optional) — override the base URL for the provider

    Providers that don't expose a /v1/models endpoint (Anthropic) are not
    supported here — the caller should fall back to the static list.
    Supported: openai, openrouter, custom (any OpenAI-compatible endpoint).
    """
    import urllib.request as _ur
    import ipaddress as _ip
    import socket as _sock
    from urllib.parse import urlparse as _up

    qs = parse_qs(parsed.query)
    provider = (qs.get("provider", [""])[0] or "").lower().strip()
    base_url_override = (qs.get("base_url", [""])[0] or "").strip()
    try:
        from api.config import get_config as _gc, resolve_model_provider as _rmp
        cfg = _gc()
        active_provider = cfg.get("model", {}).get("provider") or ""
        if not provider:
            provider = active_provider
        # Resolve API key and base URL for this provider
        api_key = None
        base_url = base_url_override or ""
        try:
            from hermes_cli.runtime_provider import resolve_runtime_provider
            rt = resolve_runtime_provider(requested=provider)
            api_key = rt.get("api_key")
            if not base_url:
                base_url = rt.get("base_url") or ""
        except Exception:
            pass
        # Determine the /v1/models endpoint URL
        if not base_url:
            if provider in ("openai", "openai-codex", "copilot"):
                base_url = "https://api.openai.com/v1"
            elif provider == "openrouter":
                base_url = "https://openrouter.ai/api/v1"
            elif provider in ("anthropic",):
                # Anthropic doesn't support /v1/models in a standard way
                return j(handler, {"error": "not_supported", "models": []})
            elif provider in ("google", "gemini"):
                return j(handler, {"error": "not_supported", "models": []})
            else:
                # Generic OpenAI-compatible — try common paths
                base_url = ""
        if not base_url:
            return j(handler, {"error": "no_base_url", "models": []})
        # Build URL safely
        base_url = base_url.rstrip("/")
        if base_url.endswith("/v1"):
            endpoint_url = base_url + "/models"
        elif "/v1" in base_url:
            endpoint_url = base_url.rstrip("/") + "/models"
        else:
            endpoint_url = base_url + "/v1/models"
        # Validate scheme (B310 guard)
        parsed_ep = _up(endpoint_url)
        if parsed_ep.scheme not in ("http", "https"):
            return j(handler, {"error": "invalid_scheme", "models": []}, status=400)
        # SSRF guard: block private IPs (allow known local provider hostnames).
        # Use exact hostname match — NOT substring — to prevent bypass via
        # hostnames like evil-ollama.attacker.com containing "ollama".
        _KNOWN_LOCAL_HOSTS = {"localhost", "127.0.0.1", "0.0.0.0", "::1"}
        if parsed_ep.hostname:
            hostname_lower = (parsed_ep.hostname or "").lower()
            try:
                for _, _, _, _, addr in _sock.getaddrinfo(parsed_ep.hostname, None):
                    addr_obj = _ip.ip_address(addr[0])
                    if addr_obj.is_private or addr_obj.is_loopback:
                        if hostname_lower not in _KNOWN_LOCAL_HOSTS:
                            return j(handler, {"error": "ssrf_blocked", "models": []}, status=400)
            except _sock.gaierror:
                pass
        # Fetch models
        req = _ur.Request(endpoint_url, method="GET")
        req.add_header("User-Agent", "HermesWebUI/1.0")
        if api_key:
            req.add_header("Authorization", f"Bearer {api_key}")
        with _ur.urlopen(req, timeout=8) as resp:  # nosec B310
            raw = resp.read().decode("utf-8")
        import json as _json
        data = _json.loads(raw)
        raw_models = data.get("data") or data.get("models") or []
        # Normalise to {id, label} list; filter to text-generation models
        models = []
        seen = set()
        for m in raw_models:
            if not isinstance(m, dict):
                continue
            mid = m.get("id") or m.get("name") or ""
            if not mid or mid in seen:
                continue
            # Skip embedding/image/audio models for direct providers
            obj_type = (m.get("object") or "").lower()
            if obj_type and obj_type not in ("model",):
                continue
            # Heuristic: skip obvious non-chat models
            if any(skip in mid.lower() for skip in ("embed", "tts", "whisper", "dall-e", "davinci-edit", "babbage", "ada", "curie")):
                continue
            seen.add(mid)
            label = m.get("name") or m.get("display_name") or mid
            # For OpenAI, the id IS the label — clean it up
            if label == mid:
                label = mid.replace("-", " ").title()
            # Restore original casing for well-known names
            for known in ("GPT", "o1", "o3", "o4", "gpt"):
                label = label.replace(known.title(), known)
            models.append({"id": mid, "label": label})
        # Sort: newest (higher version numbers) first via reverse lexicographic sort on id
        models.sort(key=lambda m: m["id"], reverse=True)
        return j(handler, {"provider": provider, "models": models, "count": len(models)})
    except Exception as _e:
        logger.debug("Failed to fetch live models for %s: %s", provider, _e)
        return j(handler, {"error": str(_e), "models": []})
def _handle_cron_output(handler, parsed):
    from cron.jobs import OUTPUT_DIR as CRON_OUT

api/streaming.py

@@ -163,9 +163,13 @@ def _run_agent_streaming(session_id, msg_text, model, workspace, stream_id, atta
logger.debug("Approval module not available, falling back to polling") logger.debug("Approval module not available, falling back to polling")
try: try:
_token_sent = False # tracks whether any streamed tokens were sent
def on_token(text): def on_token(text):
nonlocal _token_sent
if text is None: if text is None:
return # end-of-stream sentinel return # end-of-stream sentinel
_token_sent = True
put('token', {'text': text}) put('token', {'text': text})
def on_tool(name, preview, args): def on_tool(name, preview, args):
@@ -308,6 +312,45 @@ def _run_agent_streaming(session_id, msg_text, model, workspace, stream_id, atta
        )
        s.messages = result.get('messages') or s.messages

        # ── Detect silent agent failure (no assistant reply produced) ──
        # When the agent catches an auth/network error internally it may return
        # an empty final_response without raising — the stream would end with
        # a done event containing zero assistant messages, leaving the user with
        # no feedback. Emit an apperror so the client shows an inline error.
        _assistant_added = any(
            m.get('role') == 'assistant' and str(m.get('content') or '').strip()
            for m in (result.get('messages') or [])
        )
        # _token_sent tracks whether on_token() was called (any streamed text)
        if not _assistant_added and not _token_sent:
            _last_err = getattr(agent, '_last_error', None) or result.get('error') or ''
            _err_str = str(_last_err) if _last_err else ''
            _is_auth = (
                '401' in _err_str
                or (_last_err and 'AuthenticationError' in type(_last_err).__name__)
                or 'authentication' in _err_str.lower()
                or 'unauthorized' in _err_str.lower()
                or 'invalid api key' in _err_str.lower()
                or 'invalid_api_key' in _err_str.lower()
            )
            if _is_auth:
                put('apperror', {
                    'message': _err_str or 'Authentication failed — check your API key.',
                    'type': 'auth_mismatch',
                    'hint': (
                        'The selected model may not be supported by your configured provider or '
                        'your API key is invalid. Run `hermes model` in your terminal to '
                        'update credentials, then restart the WebUI.'
                    ),
                })
            else:
                put('apperror', {
                    'message': _err_str or 'The agent returned no response. Check your API key and model selection.',
                    'type': 'no_response',
                    'hint': 'Verify your API key is valid and the selected model is available for your account.',
                })
            return # Don't emit done — the apperror already closes the stream on the client

        # ── Handle context compression side effects ──
        # If compression fired inside run_conversation, the agent may have
        # rotated its session_id. Detect and fix the mismatch so the WebUI

static/messages.js

@@ -204,6 +204,8 @@ async function send(){
      }
      clearLiveToolCards();
      S.busy=false;
      // No-reply guard (#373): if agent returned nothing, show inline error
      if(!S.messages.some(m=>m.role==='assistant'&&String(m.content||'').trim())&&!assistantText){removeThinking();S.messages.push({role:'assistant',content:'**No response received.** Check your API key and model selection.'});}
      syncTopbar();renderMessages();loadDir('.');
    }
    renderSessionList();setBusy(false);setStatus('');
@@ -236,7 +238,8 @@ async function send(){
      const d=JSON.parse(e.data);
      const isRateLimit=d.type==='rate_limit';
      const isAuthMismatch=d.type==='auth_mismatch';
      const isNoResponse=d.type==='no_response';
-     const label=isRateLimit?'Rate limit reached':isAuthMismatch?(typeof t==='function'?t('provider_mismatch_label'):'Provider mismatch'):'Error';
+     const label=isRateLimit?'Rate limit reached':isAuthMismatch?(typeof t==='function'?t('provider_mismatch_label'):'Provider mismatch'):isNoResponse?'No response received':'Error';
      const hint=d.hint?`\n\n*${d.hint}*`:'';
      S.messages.push({role:'assistant',content:`**${label}:** ${d.message}${hint}`});
    }catch(_){

static/ui.js

@@ -68,6 +68,9 @@ async function populateModelDropdown(){
      _applyModelToDropdown(data.default_model, sel);
    }
    if(typeof syncModelChip==='function') syncModelChip();
    // Kick off a background live-model fetch for the active provider.
    // This runs after the static list is already shown (no blocking flicker).
    if(data.active_provider) _fetchLiveModels(data.active_provider, sel);
  }catch(e){
    // API unavailable -- keep the hardcoded HTML options as fallback
    console.warn('Failed to load models from server:',e.message);
@@ -75,6 +78,60 @@ async function populateModelDropdown(){
  }
}

// Cache so we don't re-fetch on every page load
const _liveModelCache={};

async function _fetchLiveModels(provider, sel){
  if(!provider||!sel) return;
  // Don't fetch for providers where we know it's unsupported or unnecessary
  if(['anthropic','google','gemini'].includes(provider)) return;
  if(_liveModelCache[provider]) return; // already fetched this session
  try{
    const url=new URL('/api/models/live',location.origin);
    url.searchParams.set('provider',provider);
    const data=await fetch(url.href,{credentials:'include'}).then(r=>r.json());
    if(!data.models||!data.models.length) return;
    _liveModelCache[provider]=data.models;
    // Remember current selection before rebuilding options
    const currentVal=sel.value;
    // Rebuild the optgroup for this provider with live models
    // Keep other providers' optgroups intact
    let providerGroup=null;
    for(const og of sel.querySelectorAll('optgroup')){
      if(og.label&&og.label.toLowerCase().includes(provider.toLowerCase())){
        providerGroup=og; break;
      }
    }
    if(!providerGroup){
      // No existing group — add a new one
      providerGroup=document.createElement('optgroup');
      providerGroup.label=provider.charAt(0).toUpperCase()+provider.slice(1)+' (live)';
      sel.appendChild(providerGroup);
    }
    // Rebuild options from live data
    const existingIds=new Set([...sel.options].map(o=>o.value));
    let added=0;
    for(const m of data.models){
      if(existingIds.has(m.id)) continue; // already shown from static list
      const opt=document.createElement('option');
      opt.value=m.id;
      opt.textContent=m.label||m.id;
      opt.title='Live model — fetched from provider';
      providerGroup.appendChild(opt);
      _dynamicModelLabels[m.id]=m.label||m.id;
      added++;
    }
    if(added>0){
      // Restore selection
      if(currentVal) _applyModelToDropdown(currentVal, sel);
      if(typeof syncModelChip==='function') syncModelChip();
      console.log('[hermes] Live models loaded for',provider+':',added,'new models added');
    }
  }catch(e){
    console.debug('[hermes] Live model fetch failed for',provider,e.message);
  }
}
/**
 * Check if the given model ID belongs to a different provider than the one
 * currently configured in Hermes. Returns a warning string if mismatched,

tests/test_issues_373_374_375.py

@@ -0,0 +1,221 @@
"""
Tests for issues #373, #374, and #375.
#373: Chat silently swallows errors — no feedback when agent fails to respond
#374: Remove stale OpenAI models from default list (gpt-4o, o3)
#375: Model dropdown should fetch live models from provider
"""
import pathlib
import re
REPO = pathlib.Path(__file__).parent.parent
STREAMING_PY = (REPO / "api" / "streaming.py").read_text(encoding="utf-8")
CONFIG_PY = (REPO / "api" / "config.py").read_text(encoding="utf-8")
ROUTES_PY = (REPO / "api" / "routes.py").read_text(encoding="utf-8")
MESSAGES_JS = (REPO / "static" / "messages.js").read_text(encoding="utf-8")
UI_JS = (REPO / "static" / "ui.js").read_text(encoding="utf-8")
# ── Issue #373: Silent error detection ──────────────────────────────────────
class TestSilentErrorDetection:
"""streaming.py must emit apperror when agent returns no assistant reply."""
def test_streaming_detects_no_assistant_reply(self):
"""streaming.py must check if any assistant message was produced."""
assert "_assistant_added" in STREAMING_PY, (
"streaming.py must check whether an assistant message was produced (#373)"
)
def test_streaming_emits_apperror_on_no_response(self):
"""streaming.py must emit apperror event when agent produced no reply."""
assert "no_response" in STREAMING_PY, (
"streaming.py must emit apperror with type='no_response' for silent failures (#373)"
)
def test_streaming_returns_early_after_apperror(self):
"""streaming.py must return after emitting apperror (not also emit done)."""
# The return statement must come after the put('apperror') for no_response
no_resp_pos = STREAMING_PY.find("'no_response'")
return_pos = STREAMING_PY.find("return # Don't emit done", no_resp_pos)
assert no_resp_pos != -1, "no_response type not found in streaming.py"
assert return_pos != -1, (
"streaming.py must return after emitting apperror to prevent also emitting done (#373)"
)
assert return_pos > no_resp_pos
def test_streaming_detects_auth_error_in_result(self):
"""streaming.py must detect auth errors from the result object."""
assert "_is_auth" in STREAMING_PY, (
"streaming.py must detect auth errors in silent failures (#373)"
)
assert "auth_mismatch" in STREAMING_PY, (
"streaming.py must emit auth_mismatch type for auth failures (#373)"
)
def test_messages_js_done_handler_detects_no_reply(self):
"""messages.js done handler must show an error if no assistant reply arrived."""
# Check for either the variable name or the inlined check pattern
has_no_reply_guard = (
"hasAssistantReply" in MESSAGES_JS
or ("role==='assistant'" in MESSAGES_JS and "No response received" in MESSAGES_JS)
)
assert has_no_reply_guard, (
"messages.js done handler must detect zero assistant replies (#373)"
)
assert "No response received" in MESSAGES_JS, (
"messages.js must show 'No response received' inline message (#373)"
)
def test_messages_js_handles_no_response_apperror_type(self):
"""messages.js apperror handler must recognise the no_response type."""
assert "isNoResponse" in MESSAGES_JS or "no_response" in MESSAGES_JS, (
"messages.js apperror handler must handle type='no_response' (#373)"
)
def test_messages_js_no_response_label(self):
"""messages.js must show a distinct label for no_response errors."""
assert "No response received" in MESSAGES_JS, (
"messages.js must display 'No response received' label for no_response errors (#373)"
)
# ── Issue #374: Stale model list cleanup ─────────────────────────────────────
class TestStaleModelListCleanup:
"""gpt-4o and o3 must be removed from the primary OpenAI model lists."""
def test_gpt4o_removed_from_fallback_models(self):
"""_FALLBACK_MODELS must not contain gpt-4o (issue #374)."""
fallback_block_start = CONFIG_PY.find("_FALLBACK_MODELS = [")
fallback_block_end = CONFIG_PY.find("]", fallback_block_start)
fallback_block = CONFIG_PY[fallback_block_start:fallback_block_end]
assert "gpt-4o" not in fallback_block, (
"_FALLBACK_MODELS still contains gpt-4o — remove it per issue #374"
)
def test_o3_removed_from_fallback_models(self):
"""_FALLBACK_MODELS must not contain o3 (issue #374)."""
fallback_block_start = CONFIG_PY.find("_FALLBACK_MODELS = [")
fallback_block_end = CONFIG_PY.find("]", fallback_block_start)
fallback_block = CONFIG_PY[fallback_block_start:fallback_block_end]
assert '"o3"' not in fallback_block and "'o3'" not in fallback_block, (
"_FALLBACK_MODELS still contains o3 — remove it per issue #374"
)
def test_gpt4o_removed_from_provider_models_openai(self):
"""_PROVIDER_MODELS['openai'] must not contain gpt-4o (issue #374)."""
openai_start = CONFIG_PY.find('"openai": [')
openai_end = CONFIG_PY.find("],", openai_start)
openai_block = CONFIG_PY[openai_start:openai_end]
assert "gpt-4o" not in openai_block, (
"_PROVIDER_MODELS['openai'] still contains gpt-4o — remove per issue #374"
)
def test_o3_removed_from_provider_models_openai(self):
"""_PROVIDER_MODELS['openai'] must not contain o3 (issue #374)."""
openai_start = CONFIG_PY.find('"openai": [')
openai_end = CONFIG_PY.find("],", openai_start)
openai_block = CONFIG_PY[openai_start:openai_end]
assert '"o3"' not in openai_block and "'o3'" not in openai_block, (
"_PROVIDER_MODELS['openai'] still contains o3 — remove per issue #374"
)
def test_fallback_still_has_gpt54_mini(self):
"""_FALLBACK_MODELS must still contain gpt-5.4-mini (not over-trimmed)."""
assert "gpt-5.4-mini" in CONFIG_PY, (
"_FALLBACK_MODELS must keep gpt-5.4-mini as primary OpenAI model (#374)"
)
def test_fallback_still_has_o4_mini(self):
"""_FALLBACK_MODELS must still contain o4-mini (reasoning model)."""
assert "o4-mini" in CONFIG_PY, (
"_FALLBACK_MODELS must keep o4-mini as reasoning model (#374)"
)
def test_copilot_list_unchanged(self):
"""Copilot provider model list should still include gpt-4o (it's a valid Copilot model)."""
copilot_start = CONFIG_PY.find('"copilot": [')
copilot_end = CONFIG_PY.find("],", copilot_start)
if copilot_start == -1:
return # No copilot list — that's fine
copilot_block = CONFIG_PY[copilot_start:copilot_end]
assert "gpt-4o" in copilot_block, (
"Copilot provider model list should keep gpt-4o (it's available via Copilot) (#374)"
)
# ── Issue #375: Live model fetching ─────────────────────────────────────────
class TestLiveModelFetching:
"""Backend and frontend must support live model fetching from provider APIs."""
def test_live_models_endpoint_exists_in_routes(self):
"""routes.py must have a /api/models/live endpoint (#375)."""
assert "/api/models/live" in ROUTES_PY, (
"routes.py must define /api/models/live endpoint (#375)"
)
def test_live_models_handler_function_exists(self):
"""routes.py must define _handle_live_models() function (#375)."""
assert "def _handle_live_models(" in ROUTES_PY, (
"routes.py must define _handle_live_models() for live model fetching (#375)"
)
def test_live_models_handler_validates_scheme(self):
"""_handle_live_models must validate URL scheme to prevent file:// injection (B310)."""
assert "nosec B310" in ROUTES_PY or ("scheme" in ROUTES_PY and "http" in ROUTES_PY), (
"_handle_live_models must validate URL scheme before urlopen (#375)"
)
def test_live_models_handler_has_ssrf_guard(self):
"""_handle_live_models must guard against SSRF (private IP access)."""
assert "ssrf_blocked" in ROUTES_PY or ("is_private" in ROUTES_PY and "live" in ROUTES_PY), (
"_handle_live_models must have SSRF protection for private IP ranges (#375)"
)
def test_live_models_unsupported_providers_gracefully_handled(self):
"""Providers without /v1/models support must return not_supported gracefully."""
assert "not_supported" in ROUTES_PY, (
"_handle_live_models must return not_supported for Anthropic/Google (#375)"
)
def test_frontend_has_fetch_live_models_function(self):
"""ui.js must define _fetchLiveModels() for background live model loading (#375)."""
assert "function _fetchLiveModels(" in UI_JS or "async function _fetchLiveModels(" in UI_JS, (
"ui.js must define _fetchLiveModels() function (#375)"
)
def test_frontend_live_models_cache_exists(self):
"""ui.js must cache live model responses to avoid redundant API calls (#375)."""
assert "_liveModelCache" in UI_JS, (
"ui.js must use _liveModelCache to avoid re-fetching on every dropdown open (#375)"
)
def test_frontend_calls_live_models_after_static_load(self):
"""populateModelDropdown must call _fetchLiveModels after rendering the static list (#375)."""
assert "_fetchLiveModels" in UI_JS, (
"populateModelDropdown must call _fetchLiveModels for background update (#375)"
)
def test_frontend_live_fetch_only_adds_new_models(self):
"""_fetchLiveModels must not duplicate models already in the static list (#375)."""
assert "existingIds" in UI_JS, (
"_fetchLiveModels must track existing model IDs to avoid duplicates (#375)"
)
def test_frontend_live_fetch_skips_unsupported_providers(self):
"""_fetchLiveModels must skip providers that don't support live fetching (#375)."""
assert "anthropic" in UI_JS and "google" in UI_JS, (
"_fetchLiveModels must skip Anthropic and Google (no /v1/models support) (#375)"
)
def test_live_models_endpoint_wired_in_routes(self):
"""The /api/models/live path must be handled in handle_get()."""
# Find handle_get and check our route appears inside it
handle_get_pos = ROUTES_PY.find("def handle_get(")
live_route_pos = ROUTES_PY.find('"/api/models/live"')
assert handle_get_pos != -1 and live_route_pos != -1
assert live_route_pos > handle_get_pos, (
"/api/models/live must be inside handle_get() (#375)"
)

tests/test_regressions.py

@@ -401,7 +401,7 @@ def test_done_handler_sets_busy_false_before_renderMessages(cleanup_test_session
    if done_idx < 0:
        done_idx = src.find("es.addEventListener('done'")
    assert done_idx >= 0
-   done_block = src[done_idx:done_idx+1500]
+   done_block = src[done_idx:done_idx+2500]
    # S.busy=false must appear before renderMessages() within the done handler
    busy_pos = done_block.find("S.busy=false;")
    render_pos = done_block.find("renderMessages()")