creating a unique name for the clusters created by the evaluation tests #198

andrej1991 · 2025-09-10T14:26:33Z

creating a unique name for the clusters created by the evaluation tests

Summary by CodeRabbit

Tests
- Evaluation cluster names now include a run-specific unique suffix to avoid collisions.
- Evaluations run from a stable test location, use explicit data and token paths, and preprocess test data to inject the run identifier before runs.
Chores
- CI now generates, persists, and injects a per-run unique ID into job environments; job orchestration was updated to accept and propagate that ID and an unused secret parameter was removed.
- Added a fallback to a default test image when none is specified.

coderabbitai · 2025-09-10T14:26:41Z

Walkthrough

Generate an eight-character UNIQUE_ID in CI, persist and pass it as JOB_ID to oc process; add UNIQUE_ID env to the Prow Job; inject UNIQUE_ID into test/evals/eval_data.yaml in the entrypoint; update eval test data cluster-name placeholders; remove GEMINI_API_SECRET_NAME parameter from the template.

Changes

Cohort / File(s)	Summary
CI script `scripts/ci_test.sh`	Add global `UNIQUE_ID` (8-char lowercase alphanumeric from /dev/urandom), write `${SHARED_DIR}/eval_test_unique_id`, and pass `-p JOB_ID=${UNIQUE_ID}` to `oc process` (invocation reformatted to multi-line).
Prow job template `test/prow/template.yaml`	Remove parameter `GEMINI_API_SECRET_NAME`; add container env `UNIQUE_ID: ${JOB_ID}`.
Prow entrypoint `test/prow/entrypoint.sh`	Define `TEST_DIR="${WORK_DIR}/test/evals"`, run `sed -i "s/ClustER-NAme/${UNIQUE_ID}/g" $TEST_DIR/eval_data.yaml` before invoking the evaluator, keep `TEMP_DIR` and token/GEMINI handling.
Evaluator test data `test/evals/eval_data.yaml`	Replace concrete cluster names with `-ClustER-NAme` variants and update related eval_query, expected_tool_calls, expected_keywords, and expected_response strings to be substituted at runtime by `UNIQUE_ID`.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant CI as CI runner (scripts/ci_test.sh)
    participant OC as oc process
    participant Prow as Prow Job (template.yaml)
    participant Entrypoint as entrypoint.sh (container)
    participant Eval as python eval.py

    rect rgba(247,243,200,0.6)
    note right of CI: generate UNIQUE_ID (8-char)\npersist to ${SHARED_DIR}/eval_test_unique_id
    CI->>OC: oc process ... \n -p IMAGE_NAME=... \n -p SSL_CLIENT_SECRET_NAME=... \n -p JOB_ID=${UNIQUE_ID}
    end

    OC->>Prow: create Job with env UNIQUE_ID=${JOB_ID}
    Prow->>Entrypoint: start container (env UNIQUE_ID available)
    Entrypoint->>Entrypoint: sed -i "s/ClustER-NAme/${UNIQUE_ID}/g" $TEST_DIR/eval_data.yaml
    Entrypoint->>Eval: python eval.py --eval_data_yaml $TEST_DIR/eval_data.yaml --agent_auth_token_file $TEMP_DIR/ocm_token.txt
    Eval-->>Entrypoint: run evals against uniquely named resources

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

add random job id to eval tests template #197 — introduces a JOB_ID parameter into the Job/template and uses it in Job naming (closely related to passing JOB_ID from CI).
MGMT-21406 Switch from Gemini to Vertex AI #126 — modifies the Prow template parameters related to GEMINI secrets (related to removal of GEMINI_API_SECRET_NAME).
MGMT-21239: creating an image which could be started in the cluster claimed by prow #109 — prior changes to entrypoint.sh and template.yaml that adjusted how the entrypoint and template handle UNIQUE_ID/JOB_ID.

Suggested labels

ok-to-test

Suggested reviewers

jhernand
keitwb

Pre-merge checks (3 passed)

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title clearly and directly states the PR's main objective—ensuring clusters created by the evaluation tests have unique names—which matches the changes that add UNIQUE_ID and inject it into test YAML, templates, and scripts.
Docstring Coverage	✅ Passed	No functions found in the changes. Docstring coverage check skipped.

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 22ef9d9 and 6df55aa.

📒 Files selected for processing (4)

scripts/ci_test.sh (2 hunks)
test/evals/eval_data.yaml (3 hunks)
test/prow/entrypoint.sh (1 hunks)
test/prow/template.yaml (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (3)

test/prow/template.yaml
test/prow/entrypoint.sh
scripts/ci_test.sh

🔇 Additional comments (4)

test/evals/eval_data.yaml (4)

79-80: LGTM on follow-up ISO step reference.

The query correctly targets the newly created unique-name cluster.

114-114: LGTM on multinode ISO retrieval step.

Consistent with the created cluster name.

178-183: LGTM on templated cluster name usage in name-to-ID flow.

Consistent placeholder use; will work once UNIQUE_ID substitution is applied.

Also applies to: 191-191

65-76: UNIQUE_ID generation is regex- and name-safe.

scripts/ci_test.sh constrains UNIQUE_ID to [0-9a-z]{8} (UNIQUE_ID=$(head /dev/urandom | tr -dc 0-9a-z | head -c 8)), and test/prow/entrypoint.sh substitutes ClustER-NAme with that value via sed — the generated charset has no regex metacharacters or sed-replacement specials, so it will not break expected_tool_calls matching. See scripts/ci_test.sh:9, test/prow/entrypoint.sh:23, test/prow/template.yaml:48-50.

✨ Finishing touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (6)

test/prow/template.yaml (2)
64-66: Required UNIQUE_ID param — consider a default for local runs.

Optional: add a benign default (e.g., "dev") to ease local oc process without CI-provided params.
 - name: UNIQUE_ID
-  required: true
+  required: true
+  value: dev
9-9: Optional: include UNIQUE_ID in Job name to avoid collisions across concurrent runs.

If multiple jobs share a namespace, a static Job name can collide. Suffixing with UNIQUE_ID makes the Job/POD names distinct.
-    name: assisted-chat-eval-test
+    name: assisted-chat-eval-test-${UNIQUE_ID}
test/prow/entrypoint.sh (1)
17-19: Sed-based parametrization works; add a quick post-check.

Two small robustness tweaks:

UNIQUE_ID must be DNS-1123 compatible (lowercase a–z, 0–9, '-') — ensure generation enforces this (see ci script comment).

Fail fast if replacements didn’t occur (guards against drift in eval_data.yaml).
 sed -i "s/eval-test-singlenode/eval-test-singlenode-${UNIQUE_ID}/g" test/evals/eval_data.yaml
 sed -i "s/eval-test-multinode/eval-test-multinode-${UNIQUE_ID}/g" test/evals/eval_data.yaml
+grep -q "eval-test-singlenode-${UNIQUE_ID}" test/evals/eval_data.yaml || { echo "singlenode replacement failed"; exit 1; }
+grep -q "eval-test-multinode-${UNIQUE_ID}" test/evals/eval_data.yaml || { echo "multinode replacement failed"; exit 1; }
scripts/ci_test.sh (2)
32-35: Quote UNIQUE_ID in oc process.

Minor safety improvement; keeps behavior identical.
 oc process -p IMAGE_NAME="$ASSISTED_CHAT_TEST" \
            -p SSL_CLIENT_SECRET_NAME=assisted-chat-ssl-ci \
-           -p UNIQUE_ID=${UNIQUE_ID} \
+           -p UNIQUE_ID="${UNIQUE_ID}" \
            -f test/prow/template.yaml --local | oc apply -n "$NAMESPACE" -f -
40-46: Don’t fail early if the Job’s Pod hasn’t been created yet.

Fetching POD_NAME immediately after apply is racy; the early exit can cause flaky failures. Resolve by moving the lookup inside the wait loop and tolerating empty results until either success or failure is observed.
-POD_NAME=$(oc get pods -n "$NAMESPACE" -l job-name="$JOB_NAME" -o jsonpath='{.items[0].metadata.name}')
-if [[ -z "${POD_NAME}" ]]; then
-    echo "No pod found with label app=assisted-chat-eval-test in namespace ${NAMESPACE}"
-    oc get pods -n "$NAMESPACE"
-    exit 1
-fi
+POD_NAME=""
And update the loop body to refresh POD_NAME for logging:
-    echo "Waiting for pod $POD_NAME to be ready..."
+    POD_NAME=$(oc get pods -n "$NAMESPACE" -l job-name="$JOB_NAME" -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || true)
+    echo "Waiting for pod ${POD_NAME:-<pending>} to be ready..."
test/evals/eval_data.yaml (1)
65-76: Optional: template-friendly placeholders to avoid broad sed.

If drift becomes an issue, consider placeholders (e.g., ${SNO_NAME}, ${MNO_NAME}) and render via envsubst/yq instead of global find/replace.
-      eval_query: create a new single node cluster named eval-test-singlenode, ...
+      eval_query: create a new single node cluster named ${SNO_NAME}, ...
...
-              name: "eval-test-singlenode"
+              name: "${SNO_NAME}"
Entry script:
-export SNO_NAME="eval-test-singlenode-${UNIQUE_ID}"
-export MNO_NAME="eval-test-multinode-${UNIQUE_ID}"
-envsubst < test/evals/eval_data.yaml > test/evals/eval_data.rendered.yaml
-exec python eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}" -f test/evals/eval_data.rendered.yaml
Also applies to: 90-101, 111-118

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c868042 and f60bf1d.

📒 Files selected for processing (4)

scripts/ci_test.sh (2 hunks)
test/evals/eval_data.yaml (1 hunks)
test/prow/entrypoint.sh (1 hunks)
test/prow/template.yaml (2 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Red Hat Konflux / assisted-chat-saas-main-on-pull-request
GitHub Check: Red Hat Konflux / assisted-chat-test-image-saas-main-on-pull-request

🔇 Additional comments (3)

test/prow/template.yaml (1)

44-46: Env wiring for UNIQUE_ID is correct.

The parameter is correctly exposed to the container environment and will expand at template processing time.

test/evals/eval_data.yaml (2)

65-76: Rename to eval-test-singlenode is consistent; ready for UNIQUE_ID suffixing.

The updated literals will be correctly suffixed by entrypoint.sh at runtime and keep expected_keywords/tool args aligned.

Also applies to: 78-79

145-147: Intentional use of legacy name in error case?

eval-test-sno remains in the invalid-SSH-key test. If intentional, no action. Otherwise, align with the new naming or make it placeholder-driven.

Would you like me to update this test to ${SNO_NAME} as well and keep the error semantics intact?

scripts/ci_test.sh

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

test/evals/eval_data.yaml (1)
173-190: Rename eval_id to create_multi_node_cluster

Fixes the typo “cluser” and aligns the ID with the multi-node cluster creation.
-    - eval_id: create_single_node_cluser
+    - eval_id: create_multi_node_cluster

🧹 Nitpick comments (2)

template.yaml (1)

308-310: Name→ID mapping rule looks good; consider pagination and duplicate-name UX.

Solid addition. Two optional refinements:

If list_clusters is paginated/large, instruct to show the first N and offer to expand.

When multiple exact matches exist, suggest including brief disambiguators (creation time/base_domain) before asking the user to choose.

test/evals/eval_data.yaml (1)

132-140: Good switch to name-based lookup; add a positive single-match case.

This negative case (non-existent name) is fine. Consider adding a counterpart where an exact name maps to a single cluster and then assert the follow-up cluster_info call.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f60bf1d and 06297ff.

📒 Files selected for processing (5)

scripts/ci_test.sh (2 hunks)
template.yaml (1 hunks)
test/evals/eval_data.yaml (3 hunks)
test/prow/entrypoint.sh (1 hunks)
test/prow/template.yaml (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (3)

test/prow/template.yaml
scripts/ci_test.sh
test/prow/entrypoint.sh

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Red Hat Konflux / assisted-chat-test-image-saas-main-on-pull-request
GitHub Check: Red Hat Konflux / assisted-chat-saas-main-on-pull-request

🔇 Additional comments (3)

test/evals/eval_data.yaml (3)

78-85: Chaining by “ID of the cluster you just created” is consistent with the flow.

LGTM; this matches the system prompt’s guidance and keeps context explicit.

171-172: Description tweak aligns with scope—LGTM.

65-77: UNIQUE_ID substitutions cover all literal eval-test references.
eval-test-cluster-name, eval-test-multinode, and eval-test-singlenode each appear four times—covering both eval_query and expected_keywords; eval-test-sno appears once in its eval_id as intended.

andrej1991 · 2025-09-10T17:35:38Z

t

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (1)

test/prow/entrypoint.sh (1)
24-24: Quote cd target and ensure directory exists (now created earlier).

Minor robustness: quote the path.

Apply this diff:
-cd $TEST_DIR
+cd "$TEST_DIR"

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 06297ff and e8a556a.

📒 Files selected for processing (4)

scripts/ci_test.sh (2 hunks)
test/evals/eval_data.yaml (1 hunks)
test/prow/entrypoint.sh (1 hunks)
test/prow/template.yaml (3 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

test/evals/eval_data.yaml
scripts/ci_test.sh

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Red Hat Konflux / assisted-chat-saas-main-on-pull-request
GitHub Check: Red Hat Konflux / assisted-chat-test-image-saas-main-on-pull-request

🔇 Additional comments (2)

test/prow/entrypoint.sh (1)

26-26: Consider reading env from current dir or export GEMINI_API_KEY.

You now write .env in WORK_DIR (moved to TEST_DIR in earlier diff). Ensure eval.py loads it from CWD or export the var before invoking Python.

Do you want me to export GEMINI_API_KEY in the script to decouple from .env parsing?

test/prow/template.yaml (1)

44-45: Passing UNIQUE_ID into the container looks good.

This aligns with the eval name parameterization flow.

coderabbitai · 2025-09-10T17:48:15Z

test/prow/entrypoint.sh

 WORK_DIR=$(pwd)
-TEMP_DIR=$(mktemp -d)

-cd $TEMP_DIR
-
-echo "$OCM_TOKEN" > ocm_token.txt
+echo "$OCM_TOKEN" > $TEST_DIR/ocm_token.txt
 echo "GEMINI_API_KEY=${GEMINI_API_KEY}" > .env

 TEST_DIR="${WORK_DIR}/test/evals"

+sed -i "s/eval-test-singlenode/eval-test-singlenode-${UNIQUE_ID}/g" $TEST_DIR/eval_data.yaml


⚠️ Potential issue

Fix unbound TEST_DIR (script exits with nounset before assignment).

Line 16 uses $TEST_DIR before it’s defined at Line 19. With set -o nounset, this aborts the script. Define and create TEST_DIR before first use, and quote expansions when writing files.

Apply this diff:

WORK_DIR=$(pwd) -echo "$OCM_TOKEN" > $TEST_DIR/ocm_token.txt -echo "GEMINI_API_KEY=${GEMINI_API_KEY}" > .env - -TEST_DIR="${WORK_DIR}/test/evals" +TEST_DIR="${WORK_DIR}/test/evals" +mkdir -p "$TEST_DIR" +echo "$OCM_TOKEN" > "$TEST_DIR/ocm_token.txt" +echo "GEMINI_API_KEY=${GEMINI_API_KEY}" > "$TEST_DIR/.env"

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

WORK_DIR=$(pwd)

TEMP_DIR=$(mktemp -d)

cd $TEMP_DIR

echo "$OCM_TOKEN" > ocm_token.txt

echo "$OCM_TOKEN" > $TEST_DIR/ocm_token.txt

echo "GEMINI_API_KEY=${GEMINI_API_KEY}" > .env

TEST_DIR="${WORK_DIR}/test/evals"

sed -i "s/eval-test-singlenode/eval-test-singlenode-${UNIQUE_ID}/g" $TEST_DIR/eval_data.yaml

WORK_DIR=$(pwd)

TEST_DIR="${WORK_DIR}/test/evals"

mkdir -p "$TEST_DIR"

echo "$OCM_TOKEN" > "$TEST_DIR/ocm_token.txt"

echo "GEMINI_API_KEY=${GEMINI_API_KEY}" > "$TEST_DIR/.env"

sed -i "s/eval-test-singlenode/eval-test-singlenode-${UNIQUE_ID}/g" $TEST_DIR/eval_data.yaml

🤖 Prompt for AI Agents

In test/prow/entrypoint.sh around lines 14 to 21, TEST_DIR is used on line 16 before being defined on line 19 which breaks the script under set -o nounset; move the TEST_DIR assignment before any use, ensure the directory exists (mkdir -p "$TEST_DIR") before writing files, and change unquoted expansions to quoted ones (e.g. "$TEST_DIR" and "${GEMINI_API_KEY}") so file writes and sed operations use properly quoted variables.

test/prow/entrypoint.sh

test/prow/template.yaml

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (2)

test/prow/entrypoint.sh (2)
20-21: Make name substitution idempotent and quote paths.

Current sed re-suffixes on reruns. Guard against double “-${UNIQUE_ID}” and quote the file path.
-sed -i "s/eval-test-singlenode/eval-test-singlenode-${UNIQUE_ID}/g" $TEST_DIR/eval_data.yaml
-sed -i "s/eval-test-multinode/eval-test-multinode-${UNIQUE_ID}/g" $TEST_DIR/eval_data.yaml
+sed -i "/eval-test-singlenode-${UNIQUE_ID}/! s/eval-test-singlenode/&-${UNIQUE_ID}/g" "$TEST_DIR/eval_data.yaml"
+sed -i "/eval-test-multinode-${UNIQUE_ID}/! s/eval-test-multinode/&-${UNIQUE_ID}/g" "$TEST_DIR/eval_data.yaml"
Optional: replace both with envsubst/yq for safer templating.

15-18: Create TEST_DIR before use and quote all paths; write .env into TEST_DIR (or drop it).

Without ensuring the directory exists, writes may fail; unquoted paths can break on spaces. Also, .env is created outside TEST_DIR but you cd into TEST_DIR later—likely not intended.
 TEST_DIR="${WORK_DIR}/test/evals"
+mkdir -p "$TEST_DIR"

-echo "$OCM_TOKEN" > $TEST_DIR/ocm_token.txt
-echo "GEMINI_API_KEY=${GEMINI_API_KEY}" > .env
+printf '%s\n' "$OCM_TOKEN" > "$TEST_DIR/ocm_token.txt"
+# Option A (file): keep a colocated .env for tools that read it from CWD
+printf 'GEMINI_API_KEY=%s\n' "${GEMINI_API_KEY}" > "$TEST_DIR/.env"
+# Option B (preferred): drop the .env file if eval.py reads from env and rely on the container env var.

🧹 Nitpick comments (1)

test/prow/entrypoint.sh (1)
23-23: Quote cd target.
-cd $TEST_DIR
+cd "$TEST_DIR"

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e8a556a and 0d896f6.

📒 Files selected for processing (4)

scripts/ci_test.sh (2 hunks)
test/evals/eval_data.yaml (1 hunks)
test/prow/entrypoint.sh (1 hunks)
test/prow/template.yaml (3 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

test/evals/eval_data.yaml
scripts/ci_test.sh

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Red Hat Konflux / assisted-chat-test-image-saas-main-on-pull-request
GitHub Check: Red Hat Konflux / assisted-chat-saas-main-on-pull-request

🔇 Additional comments (3)

test/prow/template.yaml (3)

9-9: Job name uniqueness looks good.

Using assisted-chat-eval-test-${UNIQUE_ID} achieves the PR goal.

44-45: Propagating UNIQUE_ID into the container env is fine.

Keeps runtime tools aware of the suffix; harmless if unused.

64-65: Required UNIQUE_ID parameter is correctly defined.

Matches usage above; aligns with CI passing -p UNIQUE_ID=….

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

scripts/ci_test.sh (2)

14-19: Fix user-facing typos.

These appear in logs and CI artifacts; correct them.

-    echo "The variable ASSISTED_CHAT_TEST was proided with the value ${ASSISTED_CHAT_TEST}, using it to create the IMAGE and TAG variables for the template"
+    echo "The variable ASSISTED_CHAT_TEST was provided with the value ${ASSISTED_CHAT_TEST}, using it to create the IMAGE and TAG variables for the template"
@@
-    echo "The variable ASSISTED_CHAT_TEST was not provieded, downloading the latest image from ${IMAGE}"
+    echo "The variable ASSISTED_CHAT_TEST was not provided, downloading the latest image from ${IMAGE}"
@@
-echo "Timeout reached. Pod $POD_NAME did not become ready in time. PLease check the logs of the pods under the directory artifacts/eval-test/gather-extra/artifacts/pods/."
+echo "Timeout reached. Pod $POD_NAME did not become ready in time. Please check the logs of the pods under artifacts/eval-test/gather-extra/artifacts/pods/."

Also applies to: 74-75

55-66: Avoid integer comparison errors when fields are empty.

When .status.succeeded/.failed are unset, [[ "$VAR" -gt 0 ]] can emit “integer expression expected”. Default to 0 and use arithmetic context.

-    if [[  "$JOB_SUCCEEDED" -gt 0  ]]; then
+    if (( ${JOB_SUCCEEDED:-0} > 0  )); then
@@
-    if [[  "$JOB_FAILED" -gt 0  ]]; then
+    if (( ${JOB_FAILED:-0} > 0  )); then

🧹 Nitpick comments (4)

scripts/ci_test.sh (3)
32-35: Quote template parameters to be safe.

Not strictly required here but good hygiene for robustness.
-           -p JOB_ID=${UNIQUE_ID} \
+           -p JOB_ID="${UNIQUE_ID}" \
27-30: Quote namespace when creating the gemini secret.

Prevents issues with spaces/special chars; also matches usage elsewhere.
-    oc create secret generic -n $NAMESPACE gemini --from-file=api_key="${SECRETS_BASE_PATH}/gemini/api_key"
+    oc create secret generic -n "$NAMESPACE" gemini --from-file=api_key="${SECRETS_BASE_PATH}/gemini/api_key"
40-45: Correct label referenced in the error message.

You search by job-name="$JOB_NAME", but the message mentions app=.... Align for clarity.
-    echo "No pod found with label app=assisted-chat-eval-test in namespace ${NAMESPACE}"
+    echo "No pod found with label job-name=${JOB_NAME} in namespace ${NAMESPACE}"
test/evals/eval_data.yaml (1)

65-76: Reduce duplication with anchors or a constant token.

Optional: define a YAML anchor or a single variable-like key for the base name and reuse, to simplify future edits.

Also applies to: 90-101, 176-189

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0d896f6 and e01a299.

📒 Files selected for processing (4)

scripts/ci_test.sh (2 hunks)
test/evals/eval_data.yaml (4 hunks)
test/prow/entrypoint.sh (1 hunks)
test/prow/template.yaml (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

test/prow/entrypoint.sh
test/prow/template.yaml

🔇 Additional comments (1)

test/evals/eval_data.yaml (1)

65-76: Confirm placeholder replacement path and keep names DNS-1123.

The ClustER-NAme token looks intentional to avoid accidental matches. Please verify:

test/prow/entrypoint.sh replaces the exact token (case-sensitive) with the lower-case ${UNIQUE_ID}.

No other occurrences require replacement.

Final names remain lowercase and <= 63 chars after substitution.

I can generate a quick verification script to scan the repo for the token and ensure the template/job wiring is consistent.

Also applies to: 78-85, 90-101, 112-119, 176-179, 180-189

scripts/ci_test.sh

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

test/prow/entrypoint.sh (1)

15-27: Quote paths, fix .env location, and make sed replacement safe for special chars.

Quote all variable expansions to avoid word-splitting.
Write .env where eval.py can find it (current script cds to TEST_DIR; the file is created in TEMP_DIR).
Escape & in the sed replacement and use | delimiter to avoid / issues.

Apply this diff:

-TEST_DIR="${WORK_DIR}/test/evals"
+TEST_DIR="${WORK_DIR}/test/evals"
 TEMP_DIR=$(mktemp -d)

-cd $TEMP_DIR
+cd "$TEMP_DIR"

-echo "$OCM_TOKEN" > ocm_token.txt
-echo "GEMINI_API_KEY=${GEMINI_API_KEY}" > .env
+echo "$OCM_TOKEN" > ocm_token.txt
+# Ensure eval.py sees the key from its working dir
+echo "GEMINI_API_KEY=${GEMINI_API_KEY}" > "$TEST_DIR/.env"

-sed -i "s/ClustER-NAme/${UNIQUE_ID}/g" $TEST_DIR/eval_data.yaml
+sed -i "s|ClustER-NAme|${UNIQUE_ID//&/\\&}|g" "$TEST_DIR/eval_data.yaml"

-cd $TEST_DIR
+cd "$TEST_DIR"

-python $TEST_DIR/eval.py --agent_endpoint "${AGENT_URL}:${AGENT_PORT}" --agent_auth_token_file $TEMP_DIR/ocm_token.txt --eval_data_yaml $TEST_DIR/eval_data.yaml
+python "$TEST_DIR/eval.py" \
+  --agent_endpoint "${AGENT_URL}:${AGENT_PORT}" \
+  --agent_auth_token_file "$TEMP_DIR/ocm_token.txt" \
+  --eval_data_yaml "$TEST_DIR/eval_data.yaml"

🧹 Nitpick comments (3)

test/evals/eval_data.yaml (1)
78-85: Minor: keep cluster-name token consistent in follow-up steps.

Good propagation to the ISO step. As a guardrail, add a quick preflight that fails the run if any ClustER-NAme placeholders remain after rendering.

Apply this check at runtime (in entrypoint.sh after sed):
+if rg -n "ClustER-NAme" "$TEST_DIR/eval_data.yaml" >/dev/null; then
+  echo "Unrendered cluster-name placeholders remain in eval_data.yaml" >&2
+  exit 1
+fi
test/prow/entrypoint.sh (2)
23-27: Optional: avoid mutating tracked YAML; render to a temp copy.

Keeps the working tree clean and makes the step idempotent.

Apply this diff:
-sed -i "s|ClustER-NAme|${UNIQUE_ID//&/\\&}|g" "$TEST_DIR/eval_data.yaml"
-cd "$TEST_DIR"
-python "$TEST_DIR/eval.py" \
+RENDERED_YAML="$TEMP_DIR/eval_data.yaml"
+cp "$TEST_DIR/eval_data.yaml" "$RENDERED_YAML"
+sed -i "s|ClustER-NAme|${UNIQUE_ID//&/\\&}|g" "$RENDERED_YAML"
+cd "$TEST_DIR"
+python "$TEST_DIR/eval.py" \
   --agent_endpoint "${AGENT_URL}:${AGENT_PORT}" \
   --agent_auth_token_file "$TEMP_DIR/ocm_token.txt" \
-  --eval_data_yaml "$TEST_DIR/eval_data.yaml"
+  --eval_data_yaml "$RENDERED_YAML"
15-15: Guard for directory existence before use.

Be defensive in CI: ensure $TEST_DIR exists.
 TEST_DIR="${WORK_DIR}/test/evals"
+test -d "$TEST_DIR" || { echo "Missing TEST_DIR: $TEST_DIR" >&2; exit 1; }

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e01a299 and 21864bb.

📒 Files selected for processing (4)

scripts/ci_test.sh (2 hunks)
test/evals/eval_data.yaml (4 hunks)
test/prow/entrypoint.sh (1 hunks)
test/prow/template.yaml (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

test/prow/template.yaml
scripts/ci_test.sh

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Red Hat Konflux / assisted-chat-saas-main-on-pull-request
GitHub Check: Red Hat Konflux / assisted-chat-test-image-saas-main-on-pull-request

🔇 Additional comments (2)

test/evals/eval_data.yaml (2)

112-119: Good: read-only tool call for ISO retrieval.

This looks consistent with the SNO flow and uses the same ID regex.

176-189: Standardize cluster-name placeholders; reported typo not found

The reported typo create_single_node_cluser is not present in the repo — ignore that point.

Inconsistent naming found: replace the hard-coded eval-test-sno (test/evals/eval_data.yaml:145) with the ClustER-NAme pattern so all create/reference scenarios use the same placeholder (other entries already use eval-test-singlenode-ClustER-NAme / eval-test-multinode-ClustER-NAme).

Likely an incorrect or invalid review comment.

test/evals/eval_data.yaml

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (2)

test/evals/eval_data.yaml (2)
70-70: Anchor the name pattern to avoid partial matches; ensure UNIQUE_ID stays regex-safe.

Without ^...$, the regex could match substrings if the framework treats values as regex. Also keep UNIQUE_ID strictly [a-z0-9] to remain regex- and name-safe (dup of earlier thread).
-              name: "eval-test-singlenode-ClustER-NAme"
+              name: "^eval-test-singlenode-ClustER-NAme$"
If the generator ever changes, confirm it still emits only lowercase a–z and 0–9. Do you want me to add a quick validator in entrypoint.sh?

96-96: Anchor multinode name; same regex-safety duplicate.
-              name: "eval-test-multinode-ClustER-NAme"
+              name: "^eval-test-multinode-ClustER-NAme$"

🧹 Nitpick comments (3)

test/evals/eval_data.yaml (3)
91-91: Reduce judge flakiness: drop response_eval:accuracy here.

This step already has sub-string and tool_eval; the accuracy judge tends to flap. Align with the SNO test for stability.
-      eval_types: [tool_eval, response_eval:accuracy, response_eval:sub-string]
+      eval_types: [tool_eval, response_eval:sub-string]
177-181: Rename eval_id typo and align type with the actual multi-node action.

The id says create_single_node_cluser (typo, and single_node) while the query creates a multi-node cluster.
-    - eval_id: create_single_node_cluser
+    - eval_id: create_multi_node_cluster
Also consider adding tool_eval so we assert the create call happened (prevents false positives in the next step).
-      eval_types: [response_eval:accuracy, response_eval:sub-string]
+      eval_types: [tool_eval, response_eval:sub-string]
183-191: LGTM: list then fetch-by-id flow looks correct; patterns are sensible.

Minor: you may anchor the expected name in keywords if you want stricter matching, but not required.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 21864bb and ac143af.

📒 Files selected for processing (4)

scripts/ci_test.sh (2 hunks)
test/evals/eval_data.yaml (4 hunks)
test/prow/entrypoint.sh (1 hunks)
test/prow/template.yaml (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (3)

test/prow/template.yaml
scripts/ci_test.sh
test/prow/entrypoint.sh

🔇 Additional comments (2)

test/evals/eval_data.yaml (2)

79-80: LGTM: ISO retrieval prompt updated to the templated cluster name.

114-116: LGTM: ISO retrieval prompt uses the templated name consistently.

coderabbitai · 2025-09-11T18:51:23Z

test/evals/eval_data.yaml

+      eval_query: create a new single node cluster named eval-test-singlenode-ClustER-NAme, running on version 4.19.7 with the x86_64 CPU architecture, configured under the base domain example.com, using the provided SSH key "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQCmeaBFhSJ/MLECmqUaKweRgo10ABpwdvJ7v76qLYfP0pzfzYsF3hGP/fH5OQfHi9pTbWynjaEcPHVfaTaFWHvyMtv8PEMUIDgQPWlBSYzb+3AgQ5AsChhzTJCYnRdmCdzENlV+azgtb3mVfXiyCfjxhyy3QAV4hRrMaVtJGuUQfQ== [email protected]".
+      eval_types: [tool_eval, response_eval:sub-string]
      expected_tool_calls:
        - - tool_name: create_cluster
            arguments:
-              name: "eval-test-sno"
+              name: "eval-test-singlenode-ClustER-NAme"
              version: "4\\.19\\.7"
              base_domain: "example\\.com"
              single_node: "(?i:true)"
              cpu_architecture: "x86_64"
              ssh_public_key: 'ssh-rsa\s+[A-Za-z0-9+/]+[=]{0,3}(\s+.+)?\s*'
-      expected_keywords: ["eval-test-sno", "4.19.7", "ID", "Discovery ISO", "download", "cluster"]
-      expected_response: I have created a cluster with name eval-test-sno, OpenShift version 4.19.7, base domain test.local. Next, you'll need to download the Discovery ISO, then boot your hosts with it. Do you want me to download the Discovery ISO?
+      expected_keywords: ["eval-test-singlenode-ClustER-NAme", "4.19.7", "ID", "Discovery ISO", "download", "cluster"]
+      expected_response: I have created a cluster with name eval-test-singlenode-ClustER-NAme, OpenShift version 4.19.7, base domain test.local. Next, you'll need to download the Discovery ISO, then boot your hosts with it. Do you want me to download the Discovery ISO?


⚠️ Potential issue

Base-domain mismatch: example.com in query vs test.local in expected_response.

The eval_query and expected_tool_calls use example.com, but expected_response mentions test.local. This will cause false negatives.

Apply this diff:

- expected_response: I have created a cluster with name eval-test-singlenode-ClustER-NAme, OpenShift version 4.19.7, base domain test.local. Next, you'll need to download the Discovery ISO, then boot your hosts with it. Do you want me to download the Discovery ISO? + expected_response: I have created a cluster with name eval-test-singlenode-ClustER-NAme, OpenShift version 4.19.7, base domain example.com. Next, you'll need to download the Discovery ISO, then boot your hosts with it. Do you want me to download the Discovery ISO?

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

eval_query: create a new single node cluster named eval-test-singlenode-ClustER-NAme, running on version 4.19.7 with the x86_64 CPU architecture, configured under the base domain example.com, using the provided SSH key "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQCmeaBFhSJ/MLECmqUaKweRgo10ABpwdvJ7v76qLYfP0pzfzYsF3hGP/fH5OQfHi9pTbWynjaEcPHVfaTaFWHvyMtv8PEMUIDgQPWlBSYzb+3AgQ5AsChhzTJCYnRdmCdzENlV+azgtb3mVfXiyCfjxhyy3QAV4hRrMaVtJGuUQfQ== [email protected]".

eval_types: [tool_eval, response_eval:sub-string]

expected_tool_calls:

- - tool_name: create_cluster

arguments:

name: "eval-test-sno"

name: "eval-test-singlenode-ClustER-NAme"

version: "4\\.19\\.7"

base_domain: "example\\.com"

single_node: "(?i:true)"

cpu_architecture: "x86_64"

ssh_public_key: 'ssh-rsa\s+[A-Za-z0-9+/]+[=]{0,3}(\s+.+)?\s*'

expected_keywords: ["eval-test-sno", "4.19.7", "ID", "Discovery ISO", "download", "cluster"]

expected_response: I have created a cluster with name eval-test-sno, OpenShift version 4.19.7, base domain test.local. Next, you'll need to download the Discovery ISO, then boot your hosts with it. Do you want me to download the Discovery ISO?

expected_keywords: ["eval-test-singlenode-ClustER-NAme", "4.19.7", "ID", "Discovery ISO", "download", "cluster"]

expected_response: I have created a cluster with name eval-test-singlenode-ClustER-NAme, OpenShift version 4.19.7, base domain test.local. Next, you'll need to download the Discovery ISO, then boot your hosts with it. Do you want me to download the Discovery ISO?

eval_query: create a new single node cluster named eval-test-singlenode-ClustER-NAme, running on version 4.19.7 with the x86_64 CPU architecture, configured under the base domain example.com, using the provided SSH key "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQCmeaBFhSJ/MLECmqUaKweRgo10ABpwdvJ7v76qLYfP0pzfzYsF3hGP/fH5OQfHi9pTbWynjaEcPHVfaTaFWHvyMtv8PEMUIDgQPWlBSYzb+3AgQ5AsChhzTJCYnRdmCdzENlV+azgtb3mVfXiyCfjxhyy3QAV4hRrMaVtJGuUQfQ== [email protected]".

eval_types: [tool_eval, response_eval:sub-string]

expected_tool_calls:

- - tool_name: create_cluster

arguments:

name: "eval-test-singlenode-ClustER-NAme"

version: "4\\.19\\.7"

base_domain: "example\\.com"

single_node: "(?i:true)"

cpu_architecture: "x86_64"

ssh_public_key: 'ssh-rsa\s+[A-Za-z0-9+/]+[=]{0,3}(\s+.+)?\s*'

expected_keywords: ["eval-test-singlenode-ClustER-NAme", "4.19.7", "ID", "Discovery ISO", "download", "cluster"]

expected_response: I have created a cluster with name eval-test-singlenode-ClustER-NAme, OpenShift version 4.19.7, base domain example.com. Next, you'll need to download the Discovery ISO, then boot your hosts with it. Do you want me to download the Discovery ISO?

🤖 Prompt for AI Agents

In test/evals/eval_data.yaml around lines 65 to 77, the eval_query and expected_tool_calls use base domain "example.com" but expected_response contains "test.local"; update the expected_response so the base domain matches the rest of the test by replacing "test.local" with "example.com" (so the sentence reads "... base domain example.com.") and keep the rest of the response unchanged.

test/evals/eval_data.yaml

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (2)

test/evals/eval_data.yaml (2)

77-77: Fix base-domain mismatch in expected_response.

Change test.local → example.com to match the query and expected_tool_calls.

-      expected_response: I have created a cluster with name eval-test-singlenode-ClustER-NAme, OpenShift version 4.19.7, base domain test.local. Next, you'll need to download the Discovery ISO, then boot your hosts with it. Would you like me to get the Discovery ISO download URL?
+      expected_response: I have created a cluster with name eval-test-singlenode-ClustER-NAme, OpenShift version 4.19.7, base domain example.com. Next, you'll need to download the Discovery ISO, then boot your hosts with it. Would you like me to get the Discovery ISO download URL?

103-103: Make expected_response cluster name consistent.

Include the -ClustER-NAme suffix to match the templated name.

-      expected_response: I have created a cluster with name eval-test-multinode, OpenShift version 4.18.22, base domain test.local. Next, you'll need to download the Discovery ISO, then boot your hosts with it. Would you like me to get the Discovery ISO download URL?
+      expected_response: I have created a cluster with name eval-test-multinode-ClustER-NAme, OpenShift version 4.18.22, base domain test.local. Next, you'll need to download the Discovery ISO, then boot your hosts with it. Would you like me to get the Discovery ISO download URL?

🧹 Nitpick comments (2)

test/evals/eval_data.yaml (1)
179-183: Rename eval_id and fix typo to reflect multi-node flow.

The eval_id says “single_node” and has a typo; the query is multi-node.
-    - eval_id: create_single_node_cluser
+    - eval_id: create_multi_node_cluster
test/prow/entrypoint.sh (1)
18-21: Quote paths for safety.

Minor: quote cd target and file paths.
-cd $TEMP_DIR
+cd "$TEMP_DIR"
 
-echo "$OCM_TOKEN" > ocm_token.txt
-echo "GEMINI_API_KEY=${GEMINI_API_KEY}" > .env
+echo "$OCM_TOKEN" > "ocm_token.txt"
+echo "GEMINI_API_KEY=${GEMINI_API_KEY}" > ".env"

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ac143af and a4fa1ac.

📒 Files selected for processing (4)

scripts/ci_test.sh (2 hunks)
test/evals/eval_data.yaml (3 hunks)
test/prow/entrypoint.sh (1 hunks)
test/prow/template.yaml (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

test/prow/template.yaml
scripts/ci_test.sh

🔇 Additional comments (2)

test/prow/entrypoint.sh (2)

15-15: Defining TEST_DIR early is good.

This resolves the prior unbound variable issue.

23-25: Incorrect — UNIQUE_ID is charset-safe; no sed escaping or temp-file needed

UNIQUE_ID is generated in scripts/ci_test.sh as UNIQUE_ID=$(head /dev/urandom | tr -dc 0-9a-z | head -c 8) (only 0-9 and a-z) and is passed into the pod via JOB_ID/UNIQUE_ID, so the sed replacement in test/prow/entrypoint.sh is safe — no change required.

Likely an incorrect or invalid review comment.

coderabbitai · 2025-09-11T19:49:45Z

test/evals/eval_data.yaml

+      expected_response: The SSH public key is set for the cluster for cluster
+      expected_keywords: ["SSH", "cluster", "Discovery ISO"]


⚠️ Potential issue

Fix duplicated wording in expected_response.

Remove “for cluster” repetition.

- expected_response: The SSH public key is set for the cluster for cluster + expected_response: The SSH public key has been set for the cluster

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

expected_response: The SSH public key is set for the cluster for cluster

expected_keywords: ["SSH", "cluster", "Discovery ISO"]

expected_response: The SSH public key has been set for the cluster

expected_keywords: ["SSH", "cluster", "Discovery ISO"]

🤖 Prompt for AI Agents

In test/evals/eval_data.yaml around lines 112 to 113, the expected_response value contains duplicated wording ("for cluster" repeated); edit the YAML so the expected_response reads without the duplicate, e.g. "The SSH public key is set for the cluster", and keep expected_keywords unchanged.

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

scripts/ci_test.sh (3)

14-19: Fix user-facing typos.

“proided” → “provided”; “provieded” → “provided”.

-    echo "The variable ASSISTED_CHAT_TEST was proided with the value ${ASSISTED_CHAT_TEST}, using it to create the IMAGE and TAG variables for the template"
+    echo "The variable ASSISTED_CHAT_TEST was provided with the value ${ASSISTED_CHAT_TEST}, using it to create the IMAGE and TAG variables for the template"
@@
-    echo "The variable ASSISTED_CHAT_TEST was not provieded, downloading the latest image from ${IMAGE}"
+    echo "The variable ASSISTED_CHAT_TEST was not provided, downloading the latest image from ${IMAGE}"

55-66: Avoid “integer expression expected” when fields are empty; fix grammar.

Guard numeric tests with defaults; pluralize “tests were”.

-    if [[  "$JOB_SUCCEEDED" -gt 0  ]]; then
-        echo "The evaluation test were successful. The logs of the tests are stored in the directory artifacts/eval-test/gather-extra/artifacts/pods/ in the logs of the pod ${POD_NAME}."
+    if [[ "${JOB_SUCCEEDED:-0}" -gt 0 ]]; then
+        echo "The evaluation tests were successful. The logs of the tests are stored in artifacts/eval-test/gather-extra/artifacts/pods/ in the logs of the pod ${POD_NAME}."
         exit 0
     fi
@@
-    if [[  "$JOB_FAILED" -gt 0  ]]; then
+    if [[ "${JOB_FAILED:-0}" -gt 0 ]]; then
         echo "Pod ${POD_NAME} is Failed, exiting"
         echo "The evaluation tests failed, you can see the logs of the pods under the directory artifacts/eval-test/gather-extra/artifacts/pods/."
         echo "oc events"
         oc events -n "$NAMESPACE"
         exit 1
     fi

74-75: Fix capitalization in timeout message.

-echo "Timeout reached. Pod $POD_NAME did not become ready in time. PLease check the logs of the pods under the directory artifacts/eval-test/gather-extra/artifacts/pods/."
+echo "Timeout reached. Pod $POD_NAME did not become ready in time. Please check the logs of the pods under the directory artifacts/eval-test/gather-extra/artifacts/pods/."

♻️ Duplicate comments (4)

scripts/ci_test.sh (1)

9-11: Harden UNIQUE_ID generation; quote SHARED_DIR.

Replicating earlier feedback: enforce C locale, add a fallback, and quote the output path.
-UNIQUE_ID=$(head /dev/urandom | tr -dc 0-9a-z | head -c 8)
-
-echo "${UNIQUE_ID}" > ${SHARED_DIR}/eval_test_unique_id
+UNIQUE_ID="$(LC_ALL=C tr -dc 'a-z0-9' </dev/urandom | head -c 8 || true)"
+UNIQUE_ID="${UNIQUE_ID:-$(date +%s%N | sha256sum | cut -c1-8)}"
+echo "${UNIQUE_ID}" > "${SHARED_DIR}/eval_test_unique_id"

test/evals/eval_data.yaml (3)

90-104: Ensure cluster name in expected_response includes the unique suffix.

The response string lacks “-ClustER-NAme”.

-      expected_response: I have created a cluster with name eval-test-multinode, OpenShift version 4.18.22, base domain test.local. Next, you'll need to download the Discovery ISO, then boot your hosts with it. Would you like me to get the Discovery ISO download URL?
+      expected_response: I have created a cluster with name eval-test-multinode-ClustER-NAme, OpenShift version 4.18.22, base domain test.local. Next, you'll need to download the Discovery ISO, then boot your hosts with it. Would you like me to get the Discovery ISO download URL?

105-112: Fix duplicated wording in expected_response.

-      expected_response: The SSH public key is set for the cluster for cluster
+      expected_response: The SSH public key has been set for the cluster

65-77: Base-domain mismatch: expected_response says test.local but query/tool use example.com.

Align to example.com to avoid false negatives.

-      expected_response: I have created a cluster with name eval-test-singlenode-ClustER-NAme, OpenShift version 4.19.7, base domain test.local. Next, you'll need to download the Discovery ISO, then boot your hosts with it. Would you like me to get the Discovery ISO download URL?
+      expected_response: I have created a cluster with name eval-test-singlenode-ClustER-NAme, OpenShift version 4.19.7, base domain example.com. Next, you'll need to download the Discovery ISO, then boot your hosts with it. Would you like me to get the Discovery ISO download URL?

🧹 Nitpick comments (3)

scripts/ci_test.sh (3)
27-30: Quote NAMESPACE consistently when creating the gemini secret.
-    oc create secret generic -n $NAMESPACE gemini --from-file=api_key="${SECRETS_BASE_PATH}/gemini/api_key"
+    oc create secret generic -n "$NAMESPACE" gemini --from-file=api_key="${SECRETS_BASE_PATH}/gemini/api_key"
32-35: Quote JOB_ID parameter for oc process.
-           -p JOB_ID=${UNIQUE_ID} \
+           -p JOB_ID="${UNIQUE_ID}" \
40-45: Fix label name in error message.

Message mentions app=… but code selects by job-name=…. Align the text.
-    echo "No pod found with label app=assisted-chat-eval-test in namespace ${NAMESPACE}"
+    echo "No pod found with label job-name=${JOB_NAME} in namespace ${NAMESPACE}"

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a4fa1ac and 22ef9d9.

📒 Files selected for processing (4)

scripts/ci_test.sh (2 hunks)
test/evals/eval_data.yaml (3 hunks)
test/prow/entrypoint.sh (1 hunks)
test/prow/template.yaml (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

test/prow/entrypoint.sh
test/prow/template.yaml

🔇 Additional comments (2)

test/evals/eval_data.yaml (2)

114-121: LGTM: ISO retrieval step parameters look correct.

178-193: LGTM: Name templating and follow-up info lookup flow are consistent.

openshift-ci · 2025-09-12T12:33:35Z

@andrej1991: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/eval-test	`6df55aa`	link	false	`/test eval-test`

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

omertuc · 2025-09-12T14:28:49Z

/lgtm

omertuc · 2025-09-12T14:28:52Z

/approve

openshift-ci · 2025-09-12T14:28:59Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andrej1991, omertuc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [omertuc]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot requested review from jhernand and keitwb September 10, 2025 14:26

openshift-merge-robot added the needs-rebase label Sep 10, 2025

coderabbitai bot reviewed Sep 10, 2025

View reviewed changes

scripts/ci_test.sh Outdated Show resolved Hide resolved

andrej1991 force-pushed the eval_test_v2 branch from f60bf1d to b688a89 Compare September 10, 2025 14:48

openshift-merge-robot removed the needs-rebase label Sep 10, 2025

andrej1991 force-pushed the eval_test_v2 branch from b688a89 to 06297ff Compare September 10, 2025 14:51

coderabbitai bot reviewed Sep 10, 2025

View reviewed changes

andrej1991 closed this Sep 10, 2025

andrej1991 force-pushed the eval_test_v2 branch from 06297ff to f10503a Compare September 10, 2025 17:27

andrej1991 reopened this Sep 10, 2025

andrej1991 force-pushed the eval_test_v2 branch from c483e00 to e8a556a Compare September 10, 2025 17:36

coderabbitai bot reviewed Sep 10, 2025

View reviewed changes

andrej1991 force-pushed the eval_test_v2 branch 3 times, most recently from f0a30a9 to 0d896f6 Compare September 10, 2025 18:06

coderabbitai bot reviewed Sep 10, 2025

View reviewed changes

andrej1991 force-pushed the eval_test_v2 branch from 0d896f6 to e01a299 Compare September 10, 2025 19:28

coderabbitai bot reviewed Sep 10, 2025

View reviewed changes

scripts/ci_test.sh Show resolved Hide resolved

openshift-merge-robot added the needs-rebase label Sep 10, 2025

andrej1991 force-pushed the eval_test_v2 branch from e01a299 to 21864bb Compare September 11, 2025 06:29

coderabbitai bot reviewed Sep 11, 2025

View reviewed changes

test/evals/eval_data.yaml Outdated Show resolved Hide resolved

test/evals/eval_data.yaml Outdated Show resolved Hide resolved

andrej1991 force-pushed the eval_test_v2 branch from 21864bb to ac143af Compare September 11, 2025 18:44

openshift-merge-robot removed the needs-rebase label Sep 11, 2025

coderabbitai bot reviewed Sep 11, 2025

View reviewed changes

andrej1991 force-pushed the eval_test_v2 branch from ac143af to a4fa1ac Compare September 11, 2025 19:42

coderabbitai bot reviewed Sep 11, 2025

View reviewed changes

andrej1991 force-pushed the eval_test_v2 branch from a4fa1ac to 22ef9d9 Compare September 12, 2025 08:51

coderabbitai bot reviewed Sep 12, 2025

View reviewed changes

adding unique name to the clusters created by the evaluation tests

6df55aa

andrej1991 force-pushed the eval_test_v2 branch from 22ef9d9 to 6df55aa Compare September 12, 2025 11:26

openshift-ci bot assigned omertuc Sep 12, 2025

openshift-ci bot added the lgtm label Sep 12, 2025

openshift-ci bot added the approved label Sep 12, 2025

openshift-merge-bot bot merged commit b96916a into rh-ecosystem-edge:main Sep 12, 2025
6 of 7 checks passed

This was referenced Sep 14, 2025

Fix(test): Update evaluation data #205

Merged

making the evaluation tests more stable #213

Merged

		expected_response: The SSH public key is set for the cluster for cluster
		expected_keywords: ["SSH", "cluster", "Discovery ISO"]

creating a unique name for the clusters created by the evaluation tests #198

creating a unique name for the clusters created by the evaluation tests #198

Uh oh!

Conversation

andrej1991 commented Sep 10, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Sep 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Pre-merge checks (3 passed)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

andrej1991 commented Sep 10, 2025

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 10, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 11, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 11, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

openshift-ci bot commented Sep 12, 2025

Uh oh!

omertuc commented Sep 12, 2025

Uh oh!

omertuc commented Sep 12, 2025

Uh oh!

openshift-ci bot commented Sep 12, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

andrej1991 commented Sep 10, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Sep 10, 2025 •

edited

Loading