Resolves asking for cluster ID when it's known. #249

andrej1991 · 2025-11-11T14:41:05Z

In the last 100 CI runs of the assisted-chat eval tests, the issue appeared 4 times. Description of the issue:
The LLM asks for the cluster ID even though the cluster was created by the previous message in the same conversation.
Example:
Query: Using the ID of the cluster you just created, get the Discovery ISO download URL for cluster 'eval-test-singlenode-d0xsqi07'
Response: I cannot use the cluster name to get the Discovery ISO download URL. I need the cluster ID. The cluster ID is 12e392cb-82e3-43e1-9923-d627b6476f43. Would you like me to get the Discovery ISO download URL for you?

Root cause:
Some test cases identified the cluster redundantly. They had an indirect reference like 'the cluster you just created' and a direct reference like 'cluster named xyz'. When a query identifies its subject twice in this way, LLMs tend to ask which one is meant before acting.
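
The redundancy described above could be caught before a query reaches the eval suite. The following is a minimal sketch, not part of this PR: the function name and both regular expressions are hypothetical, and the patterns only cover the phrasings quoted in this description.

```python
import re

# Hypothetical patterns, based only on the phrasings quoted in this PR.
# Indirect reference: "the cluster you just created".
INDIRECT_REF = re.compile(r"\bcluster you just created\b", re.IGNORECASE)
# Direct reference: "cluster 'name'" or "cluster named 'name'".
EXPLICIT_NAME = re.compile(r"\bcluster (?:named )?'[^']+'", re.IGNORECASE)


def has_redundant_cluster_reference(query: str) -> bool:
    """Return True when a query identifies the cluster both indirectly and by name."""
    return bool(INDIRECT_REF.search(query)) and bool(EXPLICIT_NAME.search(query))


if __name__ == "__main__":
    failing_query = (
        "Using the ID of the cluster you just created, get the Discovery ISO "
        "download URL for cluster 'eval-test-singlenode-d0xsqi07'"
    )
    print(has_redundant_cluster_reference(failing_query))
```

A check like this would only flag candidates for review. Whether a flagged query actually confuses the model still has to be confirmed by running the evals.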

Summary by CodeRabbit

Tests
- Updated evaluation test queries to use generic cluster references instead of specific cluster identifiers. This change simplifies test data across SNO and multinode configuration test cases while maintaining existing test structure and behavior.


coderabbitai · 2025-11-11T14:41:15Z

Walkthrough

Updates three eval query strings in test configuration to replace specific cluster name references with generic phrasing. Changes "for cluster 'NAME'" to "for the cluster" in three distinct eval_query entries within the eval data file.
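
The rewrite described in this walkthrough can be illustrated as a string substitution. This is a sketch under assumptions: the PR edited the YAML file by hand, the helper name and regular expression are hypothetical, and the sample query is the one quoted in the PR description.

```python
import re

# Hypothetical pattern for the explicit reference "for cluster 'NAME'".
EXPLICIT_CLUSTER = re.compile(r"for cluster '[^']*'")


def drop_explicit_cluster_name(query: str) -> str:
    """Replace "for cluster 'NAME'" with the generic "for the cluster"."""
    return EXPLICIT_CLUSTER.sub("for the cluster", query)


if __name__ == "__main__":
    before = (
        "Using the ID of the cluster you just created, get the Discovery ISO "
        "download URL for cluster 'eval-test-singlenode-d0xsqi07'"
    )
    print(drop_explicit_cluster_name(before))
```

Queries that carry no explicit cluster name pass through unchanged, so the substitution is safe to apply to every eval_query entry.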

Changes

Cohort / File(s)	Summary
Test evaluation data updates `test/evals/eval_data.yaml`	Updated three eval_query strings to replace cluster name references with generic phrasing; modified get_iso_eval_test_sno, get_iso_eval_test_multinode, and cluster ISO retrieval query entries

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

Fix(test): Update evaluation data #205: Modifies cluster-name references in the same test/evals/eval_data.yaml file
making the evaluation tests more stable #213: Updates eval query text in test/evals/eval_data.yaml with cluster-related changes
Improve cluster id/name handling #188: Changes eval queries that reference cluster names/IDs in the same eval file

Suggested labels

lgtm, approved

Suggested reviewers

eranco74
keitwb
maorfr

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title 'Resolves asking for cluster ID when it's known' directly matches the main objective of fixing redundant cluster identification in test cases to prevent unnecessary LLM queries.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b98f900 and c6bd664.

📒 Files selected for processing (1)

test/evals/eval_data.yaml (2 hunks)

🔇 Additional comments (2)

test/evals/eval_data.yaml (2)

62-62: Aligns with PR objective to remove redundant cluster identification.

The change from an explicit cluster name reference to a generic "the cluster you just created" eliminates redundancy that could confuse the LLM. The query now relies solely on indirect context reference, which should help the assistant extract the cluster ID from the conversation history without unnecessary disambiguation.

Ensure the evaluation test passes with this indirect reference approach—the LLM should correctly identify the cluster ID from the conversation context created in the previous eval step.

90-90: Consistent application of redundancy removal pattern.

This change mirrors line 62, applying the same principle to the multinode workflow: removing the explicit cluster name to rely on implicit context. The expected tool call remains unchanged, confirming the LLM should still extract the cluster ID correctly.

Verify that the evaluation test correctly executes the cluster_iso_download_url tool call with the inferred cluster ID from "the cluster you just created" reference in the multinode workflow context.


andrej1991 · 2025-11-11T18:11:46Z

/retest-required

carbonin · 2025-11-11T19:33:58Z

/retest

openshift-ci · 2025-11-11T19:34:09Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andrej1991, carbonin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [carbonin]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot requested review from carbonin and omertuc November 11, 2025 14:41

carbonin approved these changes Nov 11, 2025


openshift-ci bot assigned carbonin Nov 11, 2025

openshift-ci bot added the lgtm label Nov 11, 2025

openshift-ci bot added the approved label Nov 11, 2025

openshift-merge-bot bot merged commit 7b68e95 into rh-ecosystem-edge:main Nov 11, 2025
8 checks passed

coderabbitai bot mentioned this pull request Nov 17, 2025

fix(eval): Update intent check to accept cluster not found #252

Merged
