
Conversation

@theobarberbany

@theobarberbany theobarberbany commented Nov 27, 2025

This is experimental.

Vibe coded test suite to test our vibe coded commands to review our vibe coded code.


Uses an LLM as a judge (Claude) to review the output of the /api-review command.

Directory Structure

tests/eval/testdata/
├── golden/                     # Ground truth tests - single isolated issues
│   ├── missing-optional-doc/
│   │   ├── patch.diff          # Triggers ONLY missing-optional-doc
│   │   └── expected.txt
│   ├── undocumented-enum/
│   │   ├── patch.diff          # Triggers ONLY undocumented-enum
│   │   └── expected.txt
│   ├── missing-featuregate/
│   │   ├── patch.diff          # Triggers ONLY missing-featuregate
│   │   └── expected.txt
│   └── valid-api-change/
│       ├── patch.diff          # Triggers NO issues
│       └── expected.txt
└── integration/                # Complex scenarios - multiple issues
    ├── new-field-all-issues/
    │   ├── patch.diff          # Triggers multiple issues together
    │   └── expected.txt
    └── partial-documentation/
        ├── patch.diff
        └── expected.txt
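
As a rough illustration of how a harness might consume this layout, the sketch below walks the testdata root and yields one case per patch.diff. The `testCase` type and the `caseFromPatchPath` and `discoverCases` helpers are hypothetical names invented for this example; they are not taken from the PR, and the actual suite may discover cases differently.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// testCase is a hypothetical in-memory view of one testdata directory.
type testCase struct {
	Suite string // "golden" or "integration"
	Name  string // e.g. "missing-optional-doc"
	Dir   string // directory holding patch.diff and expected.txt
}

// caseFromPatchPath maps a slash-separated path relative to the testdata
// root, such as "golden/undocumented-enum/patch.diff", to its suite and
// case name. Any other path is rejected.
func caseFromPatchPath(rel string) (suite, name string, ok bool) {
	parts := strings.Split(rel, "/")
	if len(parts) != 3 || parts[2] != "patch.diff" {
		return "", "", false
	}
	if parts[0] != "golden" && parts[0] != "integration" {
		return "", "", false
	}
	return parts[0], parts[1], true
}

// discoverCases walks root and returns one testCase per patch.diff found.
func discoverCases(root string) ([]testCase, error) {
	var cases []testCase
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil
		}
		rel, err := filepath.Rel(root, path)
		if err != nil {
			return err
		}
		if suite, name, ok := caseFromPatchPath(filepath.ToSlash(rel)); ok {
			cases = append(cases, testCase{Suite: suite, Name: name, Dir: filepath.Dir(path)})
		}
		return nil
	})
	return cases, err
}

func main() {
	cases, err := discoverCases("tests/eval/testdata")
	if err != nil {
		fmt.Fprintln(os.Stderr, "discover:", err)
		os.Exit(1)
	}
	for _, c := range cases {
		fmt.Printf("%s/%s (%s)\n", c.Suite, c.Name, c.Dir)
	}
}
```

Keying discovery on patch.diff rather than on directory names means a new case only needs a new directory with the two files in it.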

Test Case Format

patch.diff

Standard git diff format:

diff --git a/config/v1/types.go b/config/v1/types.go
--- a/config/v1/types.go
+++ b/config/v1/types.go
@@ -10,0 +11,4 @@
+// MyField does something
+// +optional
+// +kubebuilder:validation:Enum=Foo;Bar
+MyField string `json:"myField"`

expected.txt

One expected issue per line:

enum values Foo and Bar not documented in comment
optional field does not explain behavior when omitted

Empty file means the API change should pass review with no issues.
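
A minimal sketch of reading this format, assuming blank lines and surrounding whitespace are ignored (the PR does not say either way). `parseExpected` is a hypothetical helper name, not something from the suite itself.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseExpected splits the contents of an expected.txt file into one issue
// per line. Skipping blank lines and trimming whitespace are assumptions of
// this sketch. An empty file yields an empty slice, meaning "no issues".
func parseExpected(contents string) []string {
	var issues []string
	sc := bufio.NewScanner(strings.NewReader(contents))
	for sc.Scan() {
		if line := strings.TrimSpace(sc.Text()); line != "" {
			issues = append(issues, line)
		}
	}
	return issues
}

func main() {
	contents := "enum values Foo and Bar not documented in comment\n" +
		"optional field does not explain behavior when omitted\n"
	for i, issue := range parseExpected(contents) {
		fmt.Printf("%d: %s\n", i+1, issue)
	}
	if len(parseExpected("")) == 0 {
		fmt.Println("empty file: change should pass review")
	}
}
```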

Note: Order of issues in expected.txt does not matter. Comparison uses semantic matching, not exact string matching.
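
One way to picture order-independent comparison is to pair each expected issue with any not-yet-used reported issue that a matcher accepts. In the sketch below, `matchFunc`, `wordOverlap`, and `compare` are hypothetical names. `wordOverlap` is a crude word-overlap heuristic that stands in for the LLM judge purely so the example runs offline; it is not how the suite performs semantic matching.

```go
package main

import (
	"fmt"
	"strings"
)

// matchFunc reports whether a reported issue describes the same problem as
// an expected one. In the real suite this decision is made by an LLM judge.
type matchFunc func(expected, reported string) bool

// wordOverlap is a stand-in matcher (an assumption of this sketch): it
// accepts a pair when at least half of the expected words appear in the
// reported text, ignoring case.
func wordOverlap(expected, reported string) bool {
	seen := map[string]bool{}
	for _, w := range strings.Fields(strings.ToLower(reported)) {
		seen[w] = true
	}
	words := strings.Fields(strings.ToLower(expected))
	hits := 0
	for _, w := range words {
		if seen[w] {
			hits++
		}
	}
	return len(words) > 0 && hits*2 >= len(words)
}

// compare returns expected issues with no matching report (missing) and
// reports that match no expected issue (unexpected). Order is ignored and
// each reported issue can satisfy at most one expected issue. Pairing is
// greedy, which is a simplification.
func compare(expected, reported []string, match matchFunc) (missing, unexpected []string) {
	used := make([]bool, len(reported))
	for _, e := range expected {
		found := false
		for i, r := range reported {
			if !used[i] && match(e, r) {
				used[i] = true
				found = true
				break
			}
		}
		if !found {
			missing = append(missing, e)
		}
	}
	for i, r := range reported {
		if !used[i] {
			unexpected = append(unexpected, r)
		}
	}
	return missing, unexpected
}

func main() {
	expected := []string{
		"enum values Foo and Bar not documented in comment",
		"optional field does not explain behavior when omitted",
	}
	reported := []string{
		"the optional field does not explain its behavior when omitted",
		"enum values Foo and Bar are not documented in the comment",
	}
	missing, unexpected := compare(expected, reported, wordOverlap)
	fmt.Println("missing:", len(missing), "unexpected:", len(unexpected))
}
```

Passing the matcher as a function keeps the bookkeeping testable with a deterministic matcher while the judge itself stays swappable.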

@openshift-ci-robot

Pipeline controller notification
This repository is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Nov 27, 2025
@openshift-ci

openshift-ci bot commented Nov 27, 2025

Hello @theobarberbany! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

@coderabbitai

coderabbitai bot commented Nov 27, 2025

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Excluded labels (none allowed) (1)
  • do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) Nov 27, 2025
@openshift-ci

openshift-ci bot commented Nov 27, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign everettraven for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@theobarberbany theobarberbany force-pushed the test-evals branch 2 times, most recently from 99de6d4 to 06bca4c Compare November 27, 2025 13:54
@theobarberbany theobarberbany marked this pull request as draft November 27, 2025 14:15
This builds a basic Go test suite that uses Claude as a judge to review
the output of the /api-review command.

See tests/eval/DESIGN.md for more details.
@openshift-ci openshift-ci bot added the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) and removed the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) Nov 27, 2025
