Removing the openshift version and the base domain from the expected response #202
Conversation
…response of the chat for a successful cluster creation, because the system prompt does not instruct the chat to repeat this information back. Sometimes the model/chat repeats this information and sometimes it does not, which makes the tests unstable.
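For illustration, here is a minimal sketch of the kind of change this makes to an entry in test/evals/eval_data.yaml. The eval_query field name, the query wording, the cluster name, the version string, and the base domain are made up for the example; only the eval_types and expected_keywords field names come from the actual file:

```yaml
# Before: expectations include details the system prompt never asks the model to echo back
- eval_query: "Create a single-node OpenShift cluster"   # hypothetical field name and wording
  eval_types: [tool_eval, response_eval:sub-string]
  expected_keywords: ["eval-test-cluster", "4.18", "example.com", "Discovery ISO"]

# After: only details the assistant reliably repeats in its reply
- eval_query: "Create a single-node OpenShift cluster"
  eval_types: [tool_eval, response_eval:sub-string]
  expected_keywords: ["eval-test-cluster", "Discovery ISO"]
```

The point is that a sub-string check only stays stable if every expected keyword is something the model is actually prompted to say.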
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: andrej1991
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
Walkthrough

Adjusted evaluation expectations in test/evals/eval_data.yaml by removing OpenShift version and base-domain tokens from expected_keywords and expected_response texts, simplifying cluster-creation responses, and slightly altering keyword sets in one multinode block. Tool call expectations were unchanged.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks: ✅ 3 passed
/hold
Actionable comments posted: 0
🧹 Nitpick comments (4)
test/evals/eval_data.yaml (4)
76-77: Good simplification; drop exact-match accuracy to prevent flakiness.

You achieved the PR goal here. To further stabilize this eval, remove the exact-match accuracy check so we don’t couple the test to a specific sentence form.
```diff
- eval_types: [tool_eval, response_eval:sub-string, response_eval:accuracy]
+ eval_types: [tool_eval, response_eval:sub-string]
```

Also applies to: 66-66
102-103: Restore “download” keyword for consistency and de-flake by removing accuracy.

Multinode block omits "download" while SNO includes it; align both. Also drop the accuracy check like above.
```diff
- eval_types: [tool_eval, response_eval:accuracy, response_eval:sub-string]
+ eval_types: [tool_eval, response_eval:sub-string]
```

```diff
- expected_keywords: ["eval-test-multinode-ClustER-NAme", "ID", "Discovery ISO", "cluster"]
+ expected_keywords: ["eval-test-multinode-ClustER-NAme", "ID", "Discovery ISO", "download", "cluster"]
```

Also applies to: 91-91
180-181: LGTM; make keyword-only for stability.

Content matches the new policy. Recommend removing the accuracy check here too.
```diff
- eval_types: [response_eval:accuracy, response_eval:sub-string]
+ eval_types: [response_eval:sub-string]
```

Also applies to: 179-179
112-112: Fix redundant phrase in SSH key confirmation message.

The expected response reads awkwardly and likely mismatches model outputs.
```diff
- expected_response: The SSH public key is set for the cluster for cluster
+ expected_response: The SSH public key is set for the cluster.
```
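Taken together, these suggestions would leave each affected entry keyword-only. A rough sketch of the resulting shape, assuming the same hypothetical eval_query field and wording as in the example earlier; only the eval_types and expected_keywords values below are taken from the review diffs:

```yaml
- eval_query: "Create a multinode cluster named eval-test-multinode-ClustER-NAme"   # hypothetical field name and wording
  eval_types: [tool_eval, response_eval:sub-string]   # exact-match accuracy check dropped, per the review
  expected_keywords: ["eval-test-multinode-ClustER-NAme", "ID", "Discovery ISO", "download", "cluster"]
```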
/test eval-test
1 similar comment
/test eval-test
@andrej1991: The following test failed, say `/retest` to rerun all failed tests or `/retest-required` to rerun all mandatory failed tests:
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
/test eval-test
Removing the openshift version and the base domain from the expected response of the chat for a successful cluster creation, because the system prompt does not instruct the chat to repeat this information back. Sometimes the model/chat repeats this information and sometimes it does not, which makes the tests unstable.