-
Notifications
You must be signed in to change notification settings - Fork 21
changing the prompt so the host_booted_but_not_discovered test would be more consistent #223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
changing the prompt so the host_booted_but_not_discovered test would be more consistent #223
Conversation
WalkthroughGeneralized explicit tool-call wording in the static networking flow in Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant U as User
participant A as Assistant
participant T as Network Tool
participant H as Hosts
U->>A: Request static network configuration
A->>T: Generate initial nmstate YAML (proper tool call)
T-->>A: YAML
A->>U: Present YAML and ask for tweaks
U-->>A: Confirm / request changes
A->>T: Validate YAML (proper tool call)
T-->>A: Validation result
A->>H: Apply configuration
H-->>A: Result
alt No hosts discovered
A->>U: Report no hosts found
end
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches🧪 Generate unit tests
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Skipping CI for Draft Pull Request. |
|
/test eval-test |
6abf5d1 to
6c4e85b
Compare
|
/test eval-test |
keitwb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prompt changes and regex lookaheads look good.
test/evals/eval_data.yaml
Outdated
| arguments: | ||
| params: |- | ||
| \{"ethernet_ifaces": \[\{"mac_address": "c5:d6:bc:f0:05:20", "name": "eth0"}\], "vlan_ifaces": \[{"name": "vlan0", "vlan_id": 400, "base_interface_name": "eth0", "ipv4_address": {"address": "10.0.0.5", "cidr_length": 24}}\], "dns": {"dns_servers": \["8.8.8.8"\]}} | ||
| (?s)^(?=.*"ethernet_ifaces":\s*\[\s*\{(?=.*"mac_address":\s*"c5:d6:bc:f0:05:20")(?=.*"name":\s*"eth0").*?\}\s*\])(?=.*"vlan_ifaces":\s*\[\s*\{(?=.*?vlan_id":\s*400\b)(?=.*?name":\s*"vlan0")(?=.*?base_interface_name":\s*"eth0")(?=.*?ipv4_address":\s*\{(?=.*?"address":\s*"10\.0\.0\.5")(?=.*?"cidr_length":\s*24\b).*?\}))(?=.*"dns":\s*\{\s*"dns_servers":\s*\[\s*"8\.8\.8\.8"\s*\]).*$ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if the eval tests would just compare objects without caring about key sort order but I guess this isn't available yet.
6c4e85b to
725f39a
Compare
template.yaml
Outdated
| 3. **Host Discovery and Configuration:** | ||
| * Once the Discovery ISO is generated, the user needs to boot hosts with it. | ||
| * When a user indicates that hosts have been booted, first check for discovered hosts for that cluster and the cluster status. | ||
| * If no host were discovered indicate the error to the user and offer help in the possible further operations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally there is no error when hosts are not discovered. The cluster just remains "insufficient" .
I'm also not sure what "offer help in the possible further operations" is supposed to mean.
|
/test eval-test |
725f39a to
4e605a1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
test/evals/eval_data.yaml (2)
110-110: Fix response string typo“The SSH public key is set for the cluster for cluster” → remove duplicate phrase.
- expected_response: The SSH public key is set for the cluster for cluster + expected_response: The SSH public key has been set for the cluster.
121-123: Grammar for host discovery messageUse a clear plural form: “No hosts have been discovered yet.”
- expected_response: "hosts hasn't been discovered yet." + expected_response: "No hosts have been discovered yet."
🧹 Nitpick comments (4)
test/evals/eval_data.yaml (1)
173-173: Platform wording clarity (on‑premises, virtualization platforms)Tighten wording and fix “on‑premise” to “on‑premises”; call out virtualization platforms rather than “VMs like …”.
- expected_response: I can help you install OpenShift on-premise using the Assisted Installer, either on bare metal servers or virtual machines (VMs) like vSphere, KVM or libvirt. I do not support public cloud platforms like Amazon Web Services (AWS), Azure, or Google Cloud Platform (GCP). + expected_response: I can help you install OpenShift on‑premises using the Assisted Installer, either on bare metal servers or on virtualization platforms like vSphere, KVM, or libvirt. I do not support public cloud platforms like Amazon Web Services (AWS), Azure, or Google Cloud Platform (GCP).template.yaml (3)
208-208: Use “on‑premises” and clarify virtualization platformsMirror the eval wording and fix terminology.
- - Supported: On-premise OpenShift installs via Assisted Installer on baremetal hosts or virtual machines (VMs) like vSphere, KVM or libvirt. + - Supported: On‑premises OpenShift installs via Assisted Installer on bare‑metal hosts or on virtualization platforms like vSphere, KVM, or libvirt.
250-257: Remove explicit tool name; keep generalized phrasing; minor style nitsLine 255 still names generate_nmstate_yaml, which contradicts the generalized “proper tool call” approach and the non‑disclosure guidance. Also capitalize YAML consistently.
- * If the user wants static network configuration, you should first remind them of any existing static network configuration already present on the cluster by using the appropriate tool call. Show them the YAML only and not the mac_interface_map. + * If the user wants static network configuration, first remind them of any existing static network configuration already present on the cluster using the appropriate tool call. Show YAML only (not the mac_interface_map). - * Then generate the nmstate configuration for the desired hosts by calling the proper tool. Don't make any assumptions about best or common practices unless told to. + * Then generate the nmstate configuration for the desired hosts via the proper tool. Don't assume best/common practices unless the user asks for them. - * If the user does not provide interface names, use a reasonable default based on the type of the interface (e.g. for ethernet use eth0, eth1, etc). + * If the user does not provide interface names, use a reasonable default based on the interface type (e.g., for ethernet use eth0, eth1, etc.). - * After generating the initial yaml ask the user if they want to tweak anything. + * After generating the initial YAML, ask the user if they want to tweak anything. - * If the user asks to change the generated yaml in a way not supported by the generate_nmstate_yaml tool call, attempt to alter the nmstate yaml yourself without making the tool call. + * If the user asks to change the generated YAML in a way not supported by the generation tool, attempt to alter the nmstate YAML yourself without making that tool call. - * After modifying nmstate yaml, validate it with the proper tool call before presenting it to the user. + * After modifying nmstate YAML, validate it with the proper tool call before presenting it to the user.
270-270: Grammar: pluralize “hosts”“host were” → “hosts were”.
- * If no host were discovered indicate it to the user. Do not assume any hosts were discovered. + * If no hosts were discovered, indicate it to the user. Do not assume any hosts were discovered.
| eval_types: [response_eval:accuracy] | ||
| expected_response: Hi! I'm the Red Hat OpenShift Lightspeed Intelligent Assistant, and I'm here to guide you through installing OpenShift using the Assisted Installer. | ||
| description: Basic greeting test using keyword matching for reliability (avoids LLM judge flapping) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eval type/description mismatch for basic_introduction
You switched to accuracy but the description still says “keyword matching.” Align the description with the new evaluation type to avoid confusion.
- eval_types: [response_eval:accuracy]
- expected_response: Hi! I'm the Red Hat OpenShift Lightspeed Intelligent Assistant, and I'm here to guide you through installing OpenShift using the Assisted Installer.
- description: Basic greeting test using keyword matching for reliability (avoids LLM judge flapping)
+ eval_types: [response_eval:accuracy]
+ expected_response: Hi! I'm the Red Hat OpenShift Lightspeed Intelligent Assistant, and I'm here to guide you through installing OpenShift using the Assisted Installer.
+ description: Basic greeting test using exact-match expected_response for consistency📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| eval_types: [response_eval:accuracy] | |
| expected_response: Hi! I'm the Red Hat OpenShift Lightspeed Intelligent Assistant, and I'm here to guide you through installing OpenShift using the Assisted Installer. | |
| description: Basic greeting test using keyword matching for reliability (avoids LLM judge flapping) | |
| eval_types: [response_eval:accuracy] | |
| expected_response: Hi! I'm the Red Hat OpenShift Lightspeed Intelligent Assistant, and I'm here to guide you through installing OpenShift using the Assisted Installer. | |
| description: Basic greeting test using exact-match expected_response for consistency |
🤖 Prompt for AI Agents
In test/evals/eval_data.yaml around lines 5 to 7, the eval_types field was
changed to response_eval:accuracy but the description still references "keyword
matching"; update the description to reflect the accuracy evaluation (e.g.,
mention accuracy-based comparison or exact match evaluation) so it aligns with
the eval_types, keeping wording concise and consistent with the new evaluation
method.
test/evals/eval_data.yaml
Outdated
| arguments: | ||
| params: |- | ||
| \{"ethernet_ifaces": \[\{"mac_address": "c5:d6:bc:f0:05:20", "name": "eth0"}\], "vlan_ifaces": \[{"name": "vlan0", "vlan_id": 400, "base_interface_name": "eth0", "ipv4_address": {"address": "10.0.0.5", "cidr_length": 24}}\], "dns": {"dns_servers": \["8.8.8.8"\]}} | ||
| (?s)^(?=.*"ethernet_ifaces":\s*\[\s*\{(?=.*"mac_address":\s*"c5:d6:bc:f0:05:20")(?=.*"name":\s*"eth0").*?\}\s*\])(?=.*"vlan_ifaces":\s*\[\s*\{(?=.*?vlan_id":\s*400\b)(?=.*?name":\s*"vlan0")(?=.*?base_interface_name":\s*"eth0")(?=.*?ipv4_address":\s*\{(?=.*?"address":\s*"10\.0\.0\.5")(?=.*?"cidr_length":\s*24\b).*?\}))(?=.*"dns":\s*\{\s*"dns_servers":\s*\[\s*"8\.8\.8\.8"\s*\]).*$ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regex bug: missing quotes on several JSON keys; pattern will under‑match
The lookaheads for vlan_id, base_interface_name, and ipv4_address are missing the opening double quote, causing false negatives. Correct them and keep the overall DOTALL + anchors.
- (?s)^(?=.*"ethernet_ifaces":\s*\[\s*\{(?=.*"mac_address":\s*"c5:d6:bc:f0:05:20")(?=.*"name":\s*"eth0").*?\}\s*\])(?=.*"vlan_ifaces":\s*\[\s*\{(?=.*?vlan_id":\s*400\b)(?=.*?name":\s*"vlan0")(?=.*?base_interface_name":\s*"eth0")(?=.*?ipv4_address":\s*\{(?=.*?"address":\s*"10\.0\.0\.5")(?=.*?"cidr_length":\s*24\b).*?\}))(?=.*"dns":\s*\{\s*"dns_servers":\s*\[\s*"8\.8\.8\.8"\s*\]).*$
+ (?s)^(?=.*"ethernet_ifaces":\s*\[\s*\{(?=.*"mac_address":\s*"c5:d6:bc:f0:05:20")(?=.*"name":\s*"eth0").*?\}\s*\])(?=.*"vlan_ifaces":\s*\[\s*\{(?=.*?"vlan_id":\s*400\b)(?=.*?"name":\s*"vlan0")(?=.*?"base_interface_name":\s*"eth0")(?=.*?"ipv4_address":\s*\{(?=.*?"address":\s*"10\.0\.0\.5")(?=.*?"cidr_length":\s*24\b).*?\}).*?\}\s*\])(?=.*"dns":\s*\{\s*"dns_servers":\s*\[\s*"8\.8\.8\.8"\s*\]\s*\}).*$Optional: prefer structural checks (e.g., JSON/YAML parse + field assertions) over regex to reduce flakiness.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| (?s)^(?=.*"ethernet_ifaces":\s*\[\s*\{(?=.*"mac_address":\s*"c5:d6:bc:f0:05:20")(?=.*"name":\s*"eth0").*?\}\s*\])(?=.*"vlan_ifaces":\s*\[\s*\{(?=.*?vlan_id":\s*400\b)(?=.*?name":\s*"vlan0")(?=.*?base_interface_name":\s*"eth0")(?=.*?ipv4_address":\s*\{(?=.*?"address":\s*"10\.0\.0\.5")(?=.*?"cidr_length":\s*24\b).*?\}))(?=.*"dns":\s*\{\s*"dns_servers":\s*\[\s*"8\.8\.8\.8"\s*\]).*$ | |
| (?s)^(?=.*"ethernet_ifaces":\s*\[\s*\{(?=.*"mac_address":\s*"c5:d6:bc:f0:05:20")(?=.*"name":\s*"eth0").*?\}\s*\])(?=.*"vlan_ifaces":\s*\[\s*\{(?=.*?"vlan_id":\s*400\b)(?=.*?"name":\s*"vlan0")(?=.*?"base_interface_name":\s*"eth0")(?=.*?"ipv4_address":\s*\{(?=.*?"address":\s*"10\.0\.0\.5")(?=.*?"cidr_length":\s*24\b).*?\}).*?\}\s*\])(?=.*"dns":\s*\{\s*"dns_servers":\s*\[\s*"8\.8\.8\.8"\s*\]\s*\}).*$ |
🤖 Prompt for AI Agents
In test/evals/eval_data.yaml around line 218, the regex lookaheads are missing
the opening double quotes for the JSON keys vlan_id, base_interface_name, and
ipv4_address which causes under‑matching; update the pattern to include the
leading double quote for those keys (e.g., change instances like (?=.*?vlan_id":
to (?=.*?"vlan_id": and similarly for base_interface_name and ipv4_address),
preserving the DOTALL flag (?s) and the start/end anchors; optionally replace
this fragile regex approach with a structural check by parsing the YAML/JSON and
asserting the required fields instead of long regexes.
|
/test eval-test |
…not_discovered test would be more consistent.Also changing the basic_introduction_conv from sub-string to accuracy
4e605a1 to
982de33
Compare
|
/test eval-test |
3 similar comments
|
/test eval-test |
|
/test eval-test |
|
/test eval-test |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andrej1991, carbonin, keitwb The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
d462648
into
rh-ecosystem-edge:main
changing the prompt so the mno_cluster_workflow_conv/host_booted_but_not_discovered test would be more consistent.
Also changing the basic_introduction_conv from sub-string to accuracy
Summary by CodeRabbit
New Features
Documentation
Tests