Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion .github/workflows/ci.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -425,7 +425,8 @@ jobs:
LIGHTSPEED_API_ENDPOINT: ${{ vars.LIGHTSPEED_API_ENDPOINT || secrets.LIGHTSPEED_API_ENDPOINT }}
LIGHTSPEED_USER: ${{ vars.LIGHTSPEED_USER || secrets.LIGHTSPEED_USER }}
LIGHTSPEED_PASSWORD: ${{ secrets.LIGHTSPEED_PASSWORD }}
timeout-minutes: 30
# UI pulls a large image and can exceed 30m when Selenium is up before code-server is ready
timeout-minutes: 60

- name: task finish
run: task finish
Expand Down
66 changes: 62 additions & 4 deletions test/ui/fixtures/ui_fixtures.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,8 @@
import os
import subprocess
import time
import urllib.error
import urllib.request
from collections.abc import Generator
from datetime import datetime
from typing import TYPE_CHECKING, Any
Expand Down Expand Up @@ -49,6 +51,46 @@ def is_container_healthy() -> bool:
return result.returncode == 0


def _wait_for_code_server_http(
url: str = "http://127.0.0.1:8080/",
timeout_sec: int = 600,
) -> None:
"""Wait until code-server accepts HTTP (not covered by Selenium :4444 healthcheck).

The compose file healthcheck only verifies Selenium Grid; code-server can still
be starting, which otherwise leads to long WebDriver/page-load timeouts in CI.
"""
deadline = time.time() + timeout_sec
attempt = 0
while time.time() < deadline:
attempt += 1
try:
with urllib.request.urlopen(url, timeout=5) as resp:
if resp.status == 200:
log.info("code-server ready at %s (attempt %s)", url, attempt)
return
except urllib.error.HTTPError as exc:
# Upstream still starting — keep polling. Other responses mean HTTP is up.
if exc.code in (502, 503, 504):
pass
else:
log.info(
"code-server ready at %s (HTTP %s, attempt %s)",
url,
exc.code,
attempt,
)
return
Comment on lines +73 to +83
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

urllib.request.urlopen() raises HTTPError that still holds an open response file (exc.fp). In the retry path (502/503/504) this exception is not closed, so repeated polling can leak sockets/file descriptors in long CI runs. Close the HTTPError (e.g., exc.close() or ensure the response is closed via contextlib.closing).

Suggested change
# Upstream still starting — keep polling. Other responses mean HTTP is up.
if exc.code in (502, 503, 504):
pass
else:
log.info(
"code-server ready at %s (HTTP %s, attempt %s)",
url,
exc.code,
attempt,
)
return
with contextlib.closing(exc):
# Upstream still starting — keep polling. Other responses mean HTTP is up.
if exc.code in (502, 503, 504):
pass
else:
log.info(
"code-server ready at %s (HTTP %s, attempt %s)",
url,
exc.code,
attempt,
)
return

Copilot uses AI. Check for mistakes.
except (urllib.error.URLError, OSError, TimeoutError):
pass
if attempt <= 3 or attempt % 20 == 0:
log.info("Waiting for code-server at %s: %s", url, attempt)
time.sleep(2)
pytest.fail(
f"code-server did not become ready at {url} within {timeout_sec}s",
)


@pytest.fixture(scope="session")
def browser_setup(
request: pytest.FixtureRequest,
Expand All @@ -62,7 +104,7 @@ def browser_setup(
"capturemanager"
) # type: ignore[name-defined]
try:
if not is_container_healthy() or True:
if not is_container_healthy():
subprocess.run(
f"podman rm -f {CONTAINER_NAME} 2>/dev/null || true",
shell=True,
Expand All @@ -80,16 +122,30 @@ def browser_setup(
shell=True,
cwd=_PROJECT_ROOT,
)
health_deadline = time.time() + int(
os.environ.get("UI_CONTAINER_HEALTH_TIMEOUT", "900"),
)
Comment on lines +125 to +127
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UI_CONTAINER_HEALTH_TIMEOUT is parsed with int(...) without validation. If it’s set to a non-integer value, the fixture will crash with a ValueError that doesn’t identify the offending env var. Consider parsing with a helper that either falls back to the default or raises a clearer message mentioning UI_CONTAINER_HEALTH_TIMEOUT.

Copilot uses AI. Check for mistakes.
count = 0
while True:
while time.time() < health_deadline:
if is_container_healthy():
break
count += 1
time.sleep(1)
log.info(
"Waiting for container %s to be healthy: %s", CONTAINER_NAME, count
if count <= 5 or count % 15 == 0:
log.info(
"Waiting for container %s to be healthy: %s",
CONTAINER_NAME,
count,
)
else:
pytest.fail(
f"container {CONTAINER_NAME} did not become healthy in time",
)
Comment on lines +125 to 143
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '42,52p;125,147p' test/ui/fixtures/ui_fixtures.py
rg -n -C3 'healthcheck|4444|8080' --iglob '*compose*.y*ml' || true

Repository: ansible/vscode-ansible

Length of output: 2414


Add an explicit timeout to the subprocess call in is_container_healthy().

While the docker-compose.yml defines a healthcheck with a 5-second timeout, the is_container_healthy() function calls subprocess.run() without a timeout argument. This means the subprocess itself could hang if the podman healthcheck run command stalls at the system level, potentially causing the entire wait loop to exceed UI_CONTAINER_HEALTH_TIMEOUT. Add an explicit timeout to the subprocess call for defensive programming and to ensure the behavior is fully under code control.

Patch sketch
-def is_container_healthy() -> bool:
+def is_container_healthy(timeout_sec: int = 10) -> bool:
     """Check if the selenium container is healthy."""
-    result = subprocess.run(
-        f"podman healthcheck run {CONTAINER_NAME}",
-        shell=True,
-        check=False,
-        text=True,
-        capture_output=True,
-    )
+    try:
+        result = subprocess.run(
+            f"podman healthcheck run {CONTAINER_NAME}",
+            shell=True,
+            check=False,
+            text=True,
+            capture_output=True,
+            timeout=timeout_sec,
+        )
+    except subprocess.TimeoutExpired:
+        return False
     return result.returncode == 0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/ui/fixtures/ui_fixtures.py` around lines 125 - 143, The subprocess.run
call inside is_container_healthy() must include an explicit timeout to prevent
the health-check command from hanging; update is_container_healthy() to pass a
timeout (e.g.,
timeout=int(os.environ.get("UI_CONTAINER_HEALTHCHECK_TIMEOUT","5"))) to
subprocess.run and catch subprocess.TimeoutExpired (return False and optionally
log) so the outer wait loop still respects UI_CONTAINER_HEALTH_TIMEOUT; ensure
you reference is_container_healthy and handle TimeoutExpired consistently with
other error paths.


_wait_for_code_server_http(
timeout_sec=int(os.environ.get("UI_CODE_SERVER_TIMEOUT", "600")),
)
Comment on lines +145 to +147
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UI_CODE_SERVER_TIMEOUT is parsed with int(...) without validation. If it’s set to a non-integer value, the session fixture will crash with a ValueError before any tests run, and the traceback won’t clearly indicate which env var was invalid. Consider validating/parsing this value with a clearer error message (or falling back to the default).

Copilot uses AI. Check for mistakes.

browser = os.environ.get("BROWSER_TYPE")
options: FirefoxOptions | ChromeOptions
if browser == "chrome":
Expand All @@ -103,6 +159,8 @@ def browser_setup(
command_executor="http://localhost:4444/wd/hub",
options=options,
)
driver.set_page_load_timeout(300)
driver.set_script_timeout(300)
driver.maximize_window()

yield driver, "https://stage.ai.ansible.redhat.com/login"
Expand Down
Loading