Skip to content

Conversation

@andrej1991
Copy link
Collaborator

@andrej1991 andrej1991 commented Sep 12, 2025

Removing the openshift version and the base domain from the expected response response of the chat for a successful cluster creation, because the system prompt does not instruct the chat to repeat this information back. Sometimes the model/chat does repeat those information and sometimes it does not which makes the tests unstable.

Summary by CodeRabbit

  • Style
    • Simplified cluster creation confirmations to start with the cluster name and a clear next step to download the Discovery ISO.
    • Removed mentions of OpenShift version and base domain from user-facing messages.
    • Standardized wording across single-node and multi-node flows and cluster-name lookups for consistency.

…response of the chat for a successful cluster creation, because the system prompt does not instruct the chat to repeat this information back. Sometimes the model/chat does repeat those information and sometimes it does not which makes the tests unstable.
@openshift-ci openshift-ci bot requested review from maorfr and omertuc September 12, 2025 15:25
@openshift-ci
Copy link

openshift-ci bot commented Sep 12, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: andrej1991
Once this PR has been reviewed and has the lgtm label, please assign carbonin for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai
Copy link

coderabbitai bot commented Sep 12, 2025

Walkthrough

Adjusted evaluation expectations in test/evals/eval_data.yaml by removing OpenShift version and base-domain tokens from expected_keywords and expected_response texts, simplifying cluster-creation responses, and slightly altering keyword sets in one multinode block. Tool call expectations were unchanged.

Changes

Cohort / File(s) Summary of Changes
Eval expectation updates
test/evals/eval_data.yaml
Removed version/base-domain tokens from expected_keywords and expected_response across cluster-creation and cluster-id lookup evals; standardized cluster creation responses to start with cluster name and a generic Discovery ISO instruction; in one multinode eval, dropped a “download” keyword; no tool-call expectation changes.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

approved, lgtm

Suggested reviewers

  • omertuc

Pre-merge checks (3 passed)

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title accurately and concisely summarizes the primary change: removing the OpenShift version and base domain from expected responses in the tests, which matches the PR summary and the raw diffs. It is specific and directly related to the changeset so a reviewer can understand the main intent at a glance.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@andrej1991
Copy link
Collaborator Author

/hold

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
test/evals/eval_data.yaml (4)

76-77: Good simplification; drop exact-match accuracy to prevent flakiness.

You achieved the PR goal here. To further stabilize this eval, remove the exact-match accuracy check so we don’t couple the test to a specific sentence form.

-      eval_types: [tool_eval, response_eval:sub-string, response_eval:accuracy]
+      eval_types: [tool_eval, response_eval:sub-string]

Also applies to: 66-66


102-103: Restore “download” keyword for consistency and de-flake by removing accuracy.

Multinode block omits "download" while SNO includes it; align both. Also drop the accuracy check like above.

-      eval_types: [tool_eval, response_eval:accuracy, response_eval:sub-string]
+      eval_types: [tool_eval, response_eval:sub-string]
-      expected_keywords: ["eval-test-multinode-ClustER-NAme", "ID", "Discovery ISO", "cluster"]
+      expected_keywords: ["eval-test-multinode-ClustER-NAme", "ID", "Discovery ISO", "download", "cluster"]

Also applies to: 91-91


180-181: LGTM; make keyword-only for stability.

Content matches the new policy. Recommend removing the accuracy check here too.

-      eval_types: [response_eval:accuracy, response_eval:sub-string]
+      eval_types: [response_eval:sub-string]

Also applies to: 179-179


112-112: Fix redundant phrase in SSH key confirmation message.

The expected response reads awkwardly and likely mismatches model outputs.

-      expected_response: The SSH public key is set for the cluster for cluster
+      expected_response: The SSH public key is set for the cluster.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b96916a and 9746151.

📒 Files selected for processing (1)
  • test/evals/eval_data.yaml (3 hunks)

@andrej1991
Copy link
Collaborator Author

/test eval-test

1 similar comment
@andrej1991
Copy link
Collaborator Author

/test eval-test

@openshift-ci
Copy link

openshift-ci bot commented Sep 12, 2025

@andrej1991: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/eval-test 9746151 link false /test eval-test

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@eranco74 eranco74 merged commit 3f5584a into rh-ecosystem-edge:main Sep 14, 2025
5 of 7 checks passed
@andrej1991
Copy link
Collaborator Author

/test eval-test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants