changing the prompt so the host_booted_but_not_discovered test would be more consistent #223

andrej1991 · 2025-10-01T08:58:19Z

Changing the prompt so that the mno_cluster_workflow_conv/host_booted_but_not_discovered test is more consistent.
Also changing the basic_introduction_conv evaluation from sub-string to accuracy.

Summary by CodeRabbit

New Features
- Clear messaging when no hosts are discovered during Host Discovery and Configuration.
- Clarified capabilities wording to explicitly reference VMs (e.g., vSphere, KVM, libvirt).
Documentation
- Refined Static Network Configuration copy to remove explicit tool names and use generic “proper tool” references while preserving sequencing, validation, and YAML confirmation steps.
- Minor capitalization and phrasing improvements for readability.
Tests
- Switched evaluations to accuracy-based checks with explicit expected responses.
- Reworked network config tests to use regex-based assertions.

coderabbitai · 2025-10-01T08:58:26Z

Walkthrough

Generalized explicit tool-call wording in the static networking flow in template.yaml, added an explicit "no hosts discovered" branch, reflowed phrasing/indentation, and updated evals in test/evals/eval_data.yaml to use accuracy-based checking and a regex-based nmstate YAML assertion.

Changes

Cohort / File(s)	Summary of Changes
Assistant template: static networking flow & host discovery `template.yaml`	Replaced explicit tool-call names with generic "proper tool call" wording for NMState generation/validation; reworded static network configuration steps (initial YAML generation, prompt for user tweaks); added explicit branch to report when no hosts are discovered; minor indentation and phrasing reflow.
Evals: accuracy and regex assertions `test/evals/eval_data.yaml`	Switched `basic_introduction` eval from substring to accuracy with an explicit `expected_response`; adjusted supported_platforms wording to use "VMs (e.g., vSphere, KVM, libvirt)"; replaced a literal nmstate YAML assertion with a single regex (?s) lookahead validating `ethernet_ifaces`, `vlan_ifaces`, and DNS structure.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as User
  participant A as Assistant
  participant T as Network Tool
  participant H as Hosts

  U->>A: Request static network configuration
  A->>T: Generate initial nmstate YAML (proper tool call)
  T-->>A: YAML
  A->>U: Present YAML and ask for tweaks
  U-->>A: Confirm / request changes
  A->>T: Validate YAML (proper tool call)
  T-->>A: Validation result
  A->>H: Apply configuration
  H-->>A: Result

  alt No hosts discovered
    A->>U: Report no hosts found
  end

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

MGMT-21646: Static Networking Configuration #214 — Overlapping edits to the static networking workflow and nmstate/tool-call expectations in template.yaml and related tests.
Fix(test): Update evaluation data #205 — Similar changes converting eval checks from substring to accuracy and adjusting expected responses.
MGMT-21645: clarifying support scope #184 — Related wording changes around supported platforms and corresponding eval updates.

Suggested labels

approved, lgtm

Suggested reviewers

jhernand
eranco74

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title clearly indicates the change to the host_booted_but_not_discovered test prompt, which is one of the PR’s objectives, but it omits mention of the updated basic_introduction_conv evaluation method and other relevant changes.
Docstring Coverage	✅ Passed	No functions found in the changes. Docstring coverage check skipped.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4e605a1 and 982de33.

📒 Files selected for processing (2)

template.yaml (3 hunks)
test/evals/eval_data.yaml (3 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

test/evals/eval_data.yaml
template.yaml

openshift-ci · 2025-10-01T08:58:30Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

andrej1991 · 2025-10-01T08:58:55Z

/test eval-test

andrej1991 · 2025-10-01T14:44:38Z

/test eval-test

keitwb

Prompt changes and regex lookaheads look good.

keitwb · 2025-10-01T14:52:33Z

test/evals/eval_data.yaml

            arguments: 
              params: |-
-                \{"ethernet_ifaces": \[\{"mac_address": "c5:d6:bc:f0:05:20", "name": "eth0"}\], "vlan_ifaces": \[{"name": "vlan0", "vlan_id": 400, "base_interface_name": "eth0", "ipv4_address": {"address": "10.0.0.5", "cidr_length": 24}}\], "dns": {"dns_servers": \["8.8.8.8"\]}}
+                (?s)^(?=.*"ethernet_ifaces":\s*\[\s*\{(?=.*"mac_address":\s*"c5:d6:bc:f0:05:20")(?=.*"name":\s*"eth0").*?\}\s*\])(?=.*"vlan_ifaces":\s*\[\s*\{(?=.*?vlan_id":\s*400\b)(?=.*?name":\s*"vlan0")(?=.*?base_interface_name":\s*"eth0")(?=.*?ipv4_address":\s*\{(?=.*?"address":\s*"10\.0\.0\.5")(?=.*?"cidr_length":\s*24\b).*?\}))(?=.*"dns":\s*\{\s*"dns_servers":\s*\[\s*"8\.8\.8\.8"\s*\]).*$


It would be nice if the eval tests would just compare objects without caring about key sort order but I guess this isn't available yet.
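
A minimal sketch of the order-insensitive comparison wished for above, assuming the tool-call arguments arrive as a JSON string. The helper name `same_json` is hypothetical and not part of the eval framework; the payload is the one from the removed literal assertion.

```python
import json

# Expected arguments, taken from the literal assertion this PR replaced.
EXPECTED = (
    '{"ethernet_ifaces": [{"mac_address": "c5:d6:bc:f0:05:20", "name": "eth0"}], '
    '"vlan_ifaces": [{"name": "vlan0", "vlan_id": 400, "base_interface_name": "eth0", '
    '"ipv4_address": {"address": "10.0.0.5", "cidr_length": 24}}], '
    '"dns": {"dns_servers": ["8.8.8.8"]}}'
)

# Same content with the keys emitted in a different order.
REORDERED = (
    '{"dns": {"dns_servers": ["8.8.8.8"]}, '
    '"vlan_ifaces": [{"vlan_id": 400, "ipv4_address": {"cidr_length": 24, "address": "10.0.0.5"}, '
    '"base_interface_name": "eth0", "name": "vlan0"}], '
    '"ethernet_ifaces": [{"name": "eth0", "mac_address": "c5:d6:bc:f0:05:20"}]}'
)


def same_json(expected: str, actual: str) -> bool:
    """Hypothetical helper: compare two JSON documents as parsed objects.

    Dict key order is ignored; list order and all values still have to match.
    """
    return json.loads(expected) == json.loads(actual)
```

With this, `same_json(EXPECTED, REORDERED)` holds even though the two strings differ character by character, while changing any value (for example the VLAN ID) makes the comparison fail.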

carbonin · 2025-10-01T17:29:46Z

template.yaml

      3.  **Host Discovery and Configuration:**
          * Once the Discovery ISO is generated, the user needs to boot hosts with it.
          * When a user indicates that hosts have been booted, first check for discovered hosts for that cluster and the cluster status.
+          * If no host were discovered indicate the error to the user and offer help in the possible further operations.


Generally there is no error when hosts are not discovered. The cluster just remains "insufficient".

I'm also not sure what "offer help in the possible further operations" is supposed to mean.
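
The behaviour described above could be sketched as follows: an empty host list is reported as a state, not raised as an error. The function name `describe_discovery` and the message wording are hypothetical and are not taken from template.yaml.

```python
from typing import Sequence


def describe_discovery(cluster_status: str, hosts: Sequence[str]) -> str:
    """Hypothetical helper: summarize host discovery for the user.

    An empty host list is not an error; the cluster status is simply reported as-is.
    """
    if not hosts:
        return f"No hosts have been discovered yet; the cluster status is '{cluster_status}'."
    return f"{len(hosts)} host(s) discovered; the cluster status is '{cluster_status}'."
```

For example, `describe_discovery("insufficient", [])` reports that no hosts were found and echoes the status, without assuming any hosts exist.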

andrej1991 · 2025-10-01T17:47:42Z

/test eval-test

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

test/evals/eval_data.yaml (2)
110-110: Fix response string typo

“The SSH public key is set for the cluster for cluster” → remove duplicate phrase.
-      expected_response: The SSH public key is set for the cluster for cluster
+      expected_response: The SSH public key has been set for the cluster.
121-123: Grammar for host discovery message

Use a clear plural form: “No hosts have been discovered yet.”
-      expected_response: "hosts hasn't been discovered yet."
+      expected_response: "No hosts have been discovered yet."

🧹 Nitpick comments (4)

test/evals/eval_data.yaml (1)

173-173: Platform wording clarity (on‑premises, virtualization platforms)

Tighten wording and fix “on‑premise” to “on‑premises”; call out virtualization platforms rather than “VMs like …”.

-      expected_response: I can help you install OpenShift on-premise using the Assisted Installer, either on bare metal servers or virtual machines (VMs) like vSphere, KVM or libvirt. I do not support public cloud platforms like Amazon Web Services (AWS), Azure, or Google Cloud Platform (GCP).
+      expected_response: I can help you install OpenShift on‑premises using the Assisted Installer, either on bare metal servers or on virtualization platforms like vSphere, KVM, or libvirt. I do not support public cloud platforms like Amazon Web Services (AWS), Azure, or Google Cloud Platform (GCP).

template.yaml (3)

208-208: Use “on‑premises” and clarify virtualization platforms

Mirror the eval wording and fix terminology.

-      - Supported: On-premise OpenShift installs via Assisted Installer on baremetal hosts or virtual machines (VMs) like vSphere, KVM or libvirt.
+      - Supported: On‑premises OpenShift installs via Assisted Installer on bare‑metal hosts or on virtualization platforms like vSphere, KVM, or libvirt.

250-257: Remove explicit tool name; keep generalized phrasing; minor style nits

Line 255 still names generate_nmstate_yaml, which contradicts the generalized “proper tool call” approach and the non‑disclosure guidance. Also capitalize YAML consistently.

-          * If the user wants static network configuration, you should first remind them of any existing static network configuration already present on the cluster by using the appropriate tool call. Show them the YAML only and not the mac_interface_map.
+          * If the user wants static network configuration, first remind them of any existing static network configuration already present on the cluster using the appropriate tool call. Show YAML only (not the mac_interface_map).
-          * Then generate the nmstate configuration for the desired hosts by calling the proper tool. Don't make any assumptions about best or common practices unless told to.
+          * Then generate the nmstate configuration for the desired hosts via the proper tool. Don't assume best/common practices unless the user asks for them.
-              * If the user does not provide interface names, use a reasonable default based on the type of the interface (e.g. for ethernet use eth0, eth1, etc).
+              * If the user does not provide interface names, use a reasonable default based on the interface type (e.g., for ethernet use eth0, eth1, etc.).
-          * After generating the initial yaml ask the user if they want to tweak anything.
+          * After generating the initial YAML, ask the user if they want to tweak anything.
-          * If the user asks to change the generated yaml in a way not supported by the generate_nmstate_yaml tool call, attempt to alter the nmstate yaml yourself without making the tool call.
+          * If the user asks to change the generated YAML in a way not supported by the generation tool, attempt to alter the nmstate YAML yourself without making that tool call.
-          * After modifying nmstate yaml, validate it with the proper tool call before presenting it to the user.
+          * After modifying nmstate YAML, validate it with the proper tool call before presenting it to the user.

270-270: Grammar: pluralize “hosts”

“host were” → “hosts were”.

-          * If no host were discovered indicate it to the user. Do not assume any hosts were discovered.
+          * If no hosts were discovered, indicate it to the user. Do not assume any hosts were discovered.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 725f39a and 4e605a1.

📒 Files selected for processing (2)

template.yaml (3 hunks)
test/evals/eval_data.yaml (3 hunks)

coderabbitai · 2025-10-01T18:47:42Z

test/evals/eval_data.yaml

+      eval_types: [response_eval:accuracy]
+      expected_response: Hi! I'm the Red Hat OpenShift Lightspeed Intelligent Assistant, and I'm here to guide you through installing OpenShift using the Assisted Installer.
      description: Basic greeting test using keyword matching for reliability (avoids LLM judge flapping)


⚠️ Potential issue | 🟡 Minor

Eval type/description mismatch for basic_introduction

You switched to accuracy but the description still says “keyword matching.” Align the description with the new evaluation type to avoid confusion.

-      eval_types: [response_eval:accuracy]
-      expected_response: Hi! I'm the Red Hat OpenShift Lightspeed Intelligent Assistant, and I'm here to guide you through installing OpenShift using the Assisted Installer.
-      description: Basic greeting test using keyword matching for reliability (avoids LLM judge flapping)
+      eval_types: [response_eval:accuracy]
+      expected_response: Hi! I'm the Red Hat OpenShift Lightspeed Intelligent Assistant, and I'm here to guide you through installing OpenShift using the Assisted Installer.
+      description: Basic greeting test using exact-match expected_response for consistency

🤖 Prompt for AI Agents

In test/evals/eval_data.yaml around lines 5 to 7, the eval_types field was changed to response_eval:accuracy but the description still references "keyword matching"; update the description to reflect the accuracy evaluation (e.g., mention accuracy-based comparison or exact match evaluation) so it aligns with the eval_types, keeping wording concise and consistent with the new evaluation method.

coderabbitai · 2025-10-01T18:47:42Z

test/evals/eval_data.yaml

            arguments: 
              params: |-
-                \{"ethernet_ifaces": \[\{"mac_address": "c5:d6:bc:f0:05:20", "name": "eth0"}\], "vlan_ifaces": \[{"name": "vlan0", "vlan_id": 400, "base_interface_name": "eth0", "ipv4_address": {"address": "10.0.0.5", "cidr_length": 24}}\], "dns": {"dns_servers": \["8.8.8.8"\]}}
+                (?s)^(?=.*"ethernet_ifaces":\s*\[\s*\{(?=.*"mac_address":\s*"c5:d6:bc:f0:05:20")(?=.*"name":\s*"eth0").*?\}\s*\])(?=.*"vlan_ifaces":\s*\[\s*\{(?=.*?vlan_id":\s*400\b)(?=.*?name":\s*"vlan0")(?=.*?base_interface_name":\s*"eth0")(?=.*?ipv4_address":\s*\{(?=.*?"address":\s*"10\.0\.0\.5")(?=.*?"cidr_length":\s*24\b).*?\}))(?=.*"dns":\s*\{\s*"dns_servers":\s*\[\s*"8\.8\.8\.8"\s*\]).*$


⚠️ Potential issue | 🔴 Critical

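Regex bug: missing quotes on several JSON keys; pattern will over-match

The lookaheads for vlan_id, name, base_interface_name, and ipv4_address inside vlan_ifaces are missing the opening double quote, so they also accept any key that merely ends with one of those names (for example "parent_vlan_id"), which allows false positives. Restore the quotes and keep the overall DOTALL + anchors.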

-                (?s)^(?=.*"ethernet_ifaces":\s*\[\s*\{(?=.*"mac_address":\s*"c5:d6:bc:f0:05:20")(?=.*"name":\s*"eth0").*?\}\s*\])(?=.*"vlan_ifaces":\s*\[\s*\{(?=.*?vlan_id":\s*400\b)(?=.*?name":\s*"vlan0")(?=.*?base_interface_name":\s*"eth0")(?=.*?ipv4_address":\s*\{(?=.*?"address":\s*"10\.0\.0\.5")(?=.*?"cidr_length":\s*24\b).*?\}))(?=.*"dns":\s*\{\s*"dns_servers":\s*\[\s*"8\.8\.8\.8"\s*\]).*$
+                (?s)^(?=.*"ethernet_ifaces":\s*\[\s*\{(?=.*"mac_address":\s*"c5:d6:bc:f0:05:20")(?=.*"name":\s*"eth0").*?\}\s*\])(?=.*"vlan_ifaces":\s*\[\s*\{(?=.*?"vlan_id":\s*400\b)(?=.*?"name":\s*"vlan0")(?=.*?"base_interface_name":\s*"eth0")(?=.*?"ipv4_address":\s*\{(?=.*?"address":\s*"10\.0\.0\.5")(?=.*?"cidr_length":\s*24\b).*?\}).*?\}\s*\])(?=.*"dns":\s*\{\s*"dns_servers":\s*\[\s*"8\.8\.8\.8"\s*\]\s*\}).*$

Optional: prefer structural checks (e.g., JSON/YAML parse + field assertions) over regex to reduce flakiness.

🤖 Prompt for AI Agents

In test/evals/eval_data.yaml around line 218, the regex lookaheads are missing the opening double quotes for the JSON keys vlan_id, name, base_interface_name, and ipv4_address inside vlan_ifaces, which loosens the match (a key that merely ends with one of those names would also pass); update the pattern to include the leading double quote for those keys (e.g., change instances like (?=.*?vlan_id": to (?=.*?"vlan_id": and similarly for the other keys), preserving the DOTALL flag (?s) and the start/end anchors; optionally replace this fragile regex approach with a structural check by parsing the YAML/JSON and asserting the required fields instead of long regexes.
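
To see how the unquoted and quoted lookaheads differ in practice, here is a reduced check, assuming Python's `re` semantics (the regex engine used by the eval harness is not stated in this thread). Only the `vlan_id` lookahead is reproduced; the sample payloads and the `parent_vlan_id` key are made up for illustration.

```python
import re

# Lookahead as written in the PR: no opening quote before the key name.
UNQUOTED = re.compile(r'(?s)^(?=.*?vlan_id":\s*400\b).*$')
# Lookahead with the opening quote restored, as suggested in the review.
QUOTED = re.compile(r'(?s)^(?=.*?"vlan_id":\s*400\b).*$')

# Intended payload fragment.
GOOD = '{"vlan_ifaces": [{"name": "vlan0", "vlan_id": 400}]}'
# Made-up payload whose key only ends in "vlan_id".
LOOKALIKE = '{"vlan_ifaces": [{"name": "vlan0", "parent_vlan_id": 400}]}'


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True when the pattern matches from the start of the text."""
    return pattern.match(text) is not None
```

In this check both patterns accept `GOOD`, but only the unquoted one also accepts `LOOKALIKE`, so the missing quote makes the assertion looser rather than stricter.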

andrej1991 · 2025-10-01T19:41:40Z

/test eval-test

andrej1991 · 2025-10-02T08:19:26Z

/test eval-test

andrej1991 · 2025-10-02T09:16:11Z

/test eval-test

andrej1991 · 2025-10-02T11:51:08Z

/test eval-test

andrej1991 · 2025-10-02T13:50:49Z

/test eval-test

openshift-ci · 2025-10-02T15:37:51Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andrej1991, carbonin, keitwb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [carbonin]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot added the do-not-merge/work-in-progress label Oct 1, 2025

andrej1991 force-pushed the eval_test_v3 branch from 6abf5d1 to 6c4e85b Compare October 1, 2025 14:43

keitwb approved these changes Oct 1, 2025

openshift-ci bot assigned keitwb Oct 1, 2025

openshift-ci bot added the lgtm label Oct 1, 2025

keitwb reviewed Oct 1, 2025

andrej1991 marked this pull request as ready for review October 1, 2025 16:43

openshift-ci bot removed the do-not-merge/work-in-progress label Oct 1, 2025

openshift-ci bot requested review from carbonin and jhernand October 1, 2025 16:43

andrej1991 force-pushed the eval_test_v3 branch from 6c4e85b to 725f39a Compare October 1, 2025 16:44

openshift-ci bot removed the lgtm label Oct 1, 2025

carbonin reviewed Oct 1, 2025

andrej1991 force-pushed the eval_test_v3 branch from 725f39a to 4e605a1 Compare October 1, 2025 18:39

coderabbitai bot reviewed Oct 1, 2025

changing the prompt so the mno_cluster_workflow_conv/host_booted_but_…

982de33

…not_discovered test would be more consistent. Also changing the basic_introduction_conv from sub-string to accuracy

andrej1991 force-pushed the eval_test_v3 branch from 4e605a1 to 982de33 Compare October 1, 2025 19:50

carbonin approved these changes Oct 2, 2025

openshift-ci bot assigned carbonin Oct 2, 2025

openshift-ci bot added the lgtm label Oct 2, 2025

openshift-ci bot added the approved label Oct 2, 2025

openshift-merge-bot bot merged commit d462648 into rh-ecosystem-edge:main Oct 2, 2025
8 checks passed

coderabbitai bot mentioned this pull request Oct 13, 2025

Prompt change to reduce hallucinations of static net config #227

Merged

coderabbitai bot mentioned this pull request Oct 28, 2025

Make eval tests more reliable #239

Merged

coderabbitai bot mentioned this pull request Nov 18, 2025

MGMT-22245: eval-test update for the SSH key fix #253

Closed

coderabbitai bot mentioned this pull request Dec 14, 2025

feat: add conditional SSH key prompt #275

Merged

3 participants
