MG-128: Add/Update e2e tests#311
Conversation
Skipping CI for Draft Pull Request.
@praveencodes: This pull request references MG-128, which is a valid Jira issue. Warning: the referenced Jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Walkthrough: Add a Ginkgo-based e2e test framework: a new Makefile e2e target and docs, comprehensive e2e tests and helpers, RBAC test manifests, and dynamic resource loader and kube client utilities; migrate the must-gather examples to the new `uploadTarget` structure.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
test/e2e/must_gather_operator_runner_test.go (1)
20-20: Update the outdated comment. The comment still references "osde2e" but should be updated to reflect the new e2e test context.
Apply this diff:
```diff
-// Test entrypoint. osde2e runs this as a test suite on test pod.
+// Test entrypoint. e2e framework runs this as a test suite on test pod.
```
🧹 Nitpick comments (1)
test/e2e/must_gather_operator_tests.go (1)
882-921: Remove or uncomment the commented test cases. Two test cases are commented out (lines 882-915 and 917-921). If these tests are not yet ready, consider using `ginkgo.PIt()` for pending tests or remove them entirely to improve code cleanliness.

If the tests are pending implementation:
- Keep the `ginkgo.PIt()` marker on line 917 (already present)
- Either uncomment lines 882-915 or remove them if they won't be implemented

If the tests are obsolete:
- Remove lines 882-921 entirely
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (12)
- `Makefile` (1 hunks)
- `README.md` (3 hunks)
- `examples/mustgather_basic.yaml` (1 hunks)
- `examples/mustgather_full.yaml` (1 hunks)
- `examples/mustgather_non_internal_user.yaml` (1 hunks)
- `examples/mustgather_proxy.yaml` (1 hunks)
- `examples/mustgather_retain_resources.yaml` (1 hunks)
- `examples/mustgather_timeout.yaml` (1 hunks)
- `go.mod` (1 hunks)
- `test/e2e/must_gather_operator_runner_test.go` (1 hunks)
- `test/e2e/must_gather_operator_tests.go` (5 hunks)
- `test/must-gather.yaml` (1 hunks)
🧰 Additional context used
🪛 checkmake (0.2.2)
Makefile
[warning] 15-15: Missing required phony target "all"
(minphony)
[warning] 15-15: Missing required phony target "clean"
(minphony)
[warning] 15-15: Missing required phony target "test"
(minphony)
🔇 Additional comments (26)
test/e2e/must_gather_operator_runner_test.go (1)
2-4: LGTM! Build tag and package name updated correctly. The migration from osde2e to e2e is consistent and aligns with the PR objectives.
examples/mustgather_basic.yaml (1)
8-13: LGTM! Consistent with the API restructuring. The uploadTarget structure is consistent with other examples and test files in this PR.
examples/mustgather_timeout.yaml (1)
8-14: LGTM! Consistent restructuring while preserving other fields. The uploadTarget structure is consistent, and the mustGatherTimeout field correctly remains at the spec level.
examples/mustgather_proxy.yaml (1)
8-17: LGTM! Consistent restructuring with proxy configuration. The uploadTarget structure is consistent with other examples. The proxyConfig includes the standard noProxy field, which is appropriate for excluding certain hosts from proxying.
examples/mustgather_full.yaml (1)
8-13: LGTM! Consistent with the API restructuring. The uploadTarget structure matches the pattern applied across all MustGather examples.
README.md (4)
14-20: LGTM! Documentation updated to reflect the new API structure. The example correctly shows the uploadTarget.sftp structure consistent with the code changes.
22-22: Verify secret name reference consistency. The text mentions the `caseManagementCreds` secret, but the examples and the secret creation command (line 93) use `case-management-creds` (with hyphens). Ensure the reference is accurate. Apply this diff if the hyphenated form is correct:

```diff
-This request will collect the standard must-gather info and upload it to case `#02527285` using the credentials found in the `caseManagementCreds` secret.
+This request will collect the standard must-gather info and upload it to case `#02527285` using the credentials found in the `case-management-creds` secret.
```
35-42: LGTM! Full example documentation updated correctly. The example-mustgather-full documentation reflects the new uploadTarget structure and correctly shows the audit field configuration.
56-66: LGTM! Proxy example documentation updated correctly. The proxy configuration example correctly shows both the new uploadTarget structure and the proxy settings.
go.mod (1)
16-16: No action required; dependency promotion is justified. The codebase directly imports `k8s.io/apiextensions-apiserver` in `test/e2e/must_gather_operator_tests.go`, confirming that promoting this dependency from indirect to direct in `go.mod` is appropriate.

examples/mustgather_non_internal_user.yaml (1)
8-14: Consistent API restructuring. The `uploadTarget.sftp` structure is consistent with the changes in other example files. The `internalUser` field has been moved under the `sftp` block, which maintains logical grouping of SFTP-related configuration.

test/e2e/must_gather_operator_tests.go (11)
1-5: LGTM! Proper e2e test build tags. The build tags and package name correctly isolate these e2e tests from unit tests, allowing selective execution via the `-tags e2e` build flag.
94-164: LGTM! Comprehensive permission validation. This test thoroughly validates that the required service accounts exist and have the necessary permissions for job and secret management. The granular checks help catch RBAC misconfigurations early.

166-223: LGTM! Well-structured job creation test. The test properly verifies the MustGather-to-Job lifecycle, including service account propagation, container configuration, and cleanup behavior.

225-261: LGTM! Good negative test case. Testing failure scenarios with non-existent service accounts validates the operator's error handling and helps ensure graceful degradation.

263-356: LGTM! Comprehensive lifecycle test. This test validates the complete MustGather workflow from creation through status updates to completion, with appropriate status logging for debugging.

358-400: LGTM! Thorough permission verification. The SubjectAccessReview-based permission checks provide concrete validation of RBAC configuration without relying on actual resource operations.
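For context, a SubjectAccessReview of the kind these checks presumably issue can also be written as a raw manifest; the sketch below is illustrative only (the user and namespace names are invented, not taken from this PR):

```yaml
apiVersion: authorization.k8s.io/v1
kind: SubjectAccessReview
spec:
  # Hypothetical service account being checked
  user: system:serviceaccount:must-gather-operator:must-gather-admin
  resourceAttributes:
    group: batch
    resource: jobs
    verb: create
    namespace: must-gather-operator
```

The API server answers with `status.allowed`, so a test can assert on RBAC wiring without ever creating a Job.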
402-623: LGTM! Thorough node log collection test. This test comprehensively validates node log collection capabilities, including permission checks, pod lifecycle, and event monitoring for errors.

625-880: LGTM! Comprehensive CRD collection test. The test thoroughly validates CRD and custom resource collection, including creating test CRs, verifying permissions, and ensuring non-destructive collection.
924-944: LGTM! Useful helper functions. Both helper functions are well-implemented:
- `getNodeRoles()` provides clean role extraction from node labels
- `boolPtr()` is a common utility for converting a bool to a pointer
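Stripped of the Kubernetes client machinery, both helpers are small enough to sketch in plain Go. The `node-role.kubernetes.io/<role>` label convention is standard Kubernetes; the function bodies are a guess at the implementation, not the PR's code:

```go
package main

import (
	"fmt"
	"strings"
)

// getNodeRoles extracts role names from node labels of the form
// "node-role.kubernetes.io/<role>". It operates on a plain label map
// here for illustration; the real helper reads corev1.Node labels.
func getNodeRoles(labels map[string]string) []string {
	const prefix = "node-role.kubernetes.io/"
	var roles []string
	for k := range labels {
		if strings.HasPrefix(k, prefix) {
			roles = append(roles, strings.TrimPrefix(k, prefix))
		}
	}
	return roles
}

// boolPtr returns a pointer to b, handy for optional API fields
// that are typed *bool.
func boolPtr(b bool) *bool { return &b }

func main() {
	labels := map[string]string{
		"node-role.kubernetes.io/worker": "",
		"kubernetes.io/hostname":         "node-1",
	}
	fmt.Println(getNodeRoles(labels)) // [worker]
	fmt.Println(*boolPtr(true))       // true
}
```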
24-24: Dependency correctly declared as direct. The verification confirms that `k8s.io/apiextensions-apiserver v0.33.0` is properly declared as a direct dependency in `go.mod` (no `// indirect` marker present). The import of `apiextensionsv1` in the test file is properly supported by the dependency declaration.
176-177: No issues found; the ServiceAccountName field is correctly defined and used. The verification confirms that `ServiceAccountName` exists in the MustGatherSpec struct (`api/v1alpha1/mustgather_types.go`, line 32) and the test code at lines 176-177 uses it with the correct Go field name syntax.

Makefile (2)
11-14: LGTM! Reasonable e2e test configuration. The 1-hour timeout is appropriate for end-to-end tests, and the platform-specific label filter allows selective test execution based on cloud provider.
16-26: LGTM! Properly configured e2e test target. The test target correctly:
- Uses the e2e build tag to select only e2e tests
- Sets appropriate timeout and verbosity flags
- Enables node event visibility for debugging
- Runs tests serially (`-p 1`) to avoid resource conflicts
- Uses `-count 1` to disable test caching

examples/mustgather_retain_resources.yaml (2)
8-13: YAML configuration correctly aligns with the API definition. The nested `uploadTarget.sftp` structure is properly defined in the CRD schema:
- The `UploadTarget` field in MustGatherSpec is typed as `*UploadTargetSpec`
- `UploadTargetSpec` contains a `Type` field (required, enum SFTP) and an `SFTP` field (`*SFTPSpec`)
- `SFTPSpec` defines `CaseID` (required string) and `CaseManagementAccountSecretRef` (required), both present in the YAML configuration

All required fields are provided and use correct field names matching the JSON tags.
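Put together, a manifest satisfying that schema would look like the following sketch (the metadata name is invented; the caseID and secret name are the illustrative values used throughout this PR's examples):

```yaml
apiVersion: operator.openshift.io/v1alpha1
kind: MustGather
metadata:
  name: example-mustgather  # hypothetical name
spec:
  uploadTarget:
    type: SFTP
    sftp:
      caseID: '02527285'
      caseManagementAccountSecretRef:
        name: case-management-creds
```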
1-1: The review comment is incorrect. The verification shows that the codebase contains no references to `managed.openshift.io/v1alpha1` anywhere. All examples, tests, and documentation already use `operator.openshift.io/v1alpha1` throughout. The file `examples/mustgather_retain_resources.yaml` is already in the correct state with the new API version on line 1. The migration from the old API version is already complete; there is no actual breaking change to address in the current codebase state.

Likely an incorrect or invalid review comment.
test/must-gather.yaml
Outdated

```yaml
uploadTarget:
  type: SFTP
  sftp:
    caseID: '02527285'
    caseManagementAccountSecretRef:
      name: case-management-creds
```
🧩 Analysis chain
Verify CRD schema and controller support for the new uploadTarget structure.
This structural change moves authentication fields under uploadTarget.sftp. Ensure that:
- The CRD schema has been updated to support this new nested structure
- The controller/reconciler code has been updated to read from `uploadTarget.sftp` instead of the top-level spec fields
- There's a migration strategy for existing MustGather resources using the old structure
Run the following script to verify the implementation:
🏁 Script executed:

```bash
#!/bin/bash
# Description: Verify CRD schema and controller support for uploadTarget structure

# Check CRD for uploadTarget definition
echo "=== Checking CRD schema ==="
fd -e yaml -e yml 'crd|mustgather' --exec cat {} | rg -A 20 'uploadTarget|sftp' | head -50

# Check controller/reconciler for uploadTarget usage
echo -e "\n=== Checking controller implementation ==="
rg -n --type=go 'uploadTarget|UploadTarget' -g '!vendor/' -C 3
```

Length of output: 20995
Verify whether the API restructuring provides a backward-compatible migration path for existing MustGather resources.
The review comment correctly identifies a backward-compatibility issue. Moving a field in the API schema is a breaking change, and this PR restructures the upload credentials from top-level spec fields into a nested uploadTarget.sftp structure. Based on the script output:
- The controller reads from `instance.Spec.UploadTarget.SFTP` (lines 159-160 in mustgather_controller.go), not from deprecated top-level fields
- No conversion logic or backward-compatibility code is present to handle existing resources using the old flat structure
- The `UploadTarget` field is optional, so new resources work, but this does NOT help existing MustGather resources that currently have credentials at the top level
While v1alpha1 is pre-release API, if deployed instances exist in the wild, this change will silently break them. The PR should either:
- Include controller logic to read from both old top-level fields and new nested structure temporarily
- Document this as a breaking change requiring manual resource migration
- Add conversion webhooks if supporting multiple versions
Codecov Report: ✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@           Coverage Diff           @@
##           master     #311   +/-   ##
=======================================
  Coverage   70.98%   70.98%
=======================================
  Files           7        7
  Lines         479      479
=======================================
  Hits          340      340
  Misses        136      136
  Partials        3        3
```
Actionable comments posted: 0
🧹 Nitpick comments (2)
test/e2e/must_gather_operator_tests.go (2)
624-626: Replace hardcoded sleep with polling. The 30-second sleep makes the test unnecessarily slow and may still be insufficient in some environments. Poll the container state instead.
Apply this diff:
```diff
-    ginkgo.By("Verifying gather container is collecting data")
-    time.Sleep(30 * time.Second) // Allow time for data collection
-
+    ginkgo.By("Verifying gather container is collecting data")
+    Eventually(func() bool {
+        pod := &corev1.Pod{}
+        err := nonAdminClient.Get(testCtx, client.ObjectKey{
+            Name:      gatherPod.Name,
+            Namespace: testNamespace,
+        }, pod)
+        if err != nil {
+            return false
+        }
+        for _, cs := range pod.Status.ContainerStatuses {
+            if cs.Name == "gather" && (cs.State.Running != nil || cs.State.Terminated != nil) {
+                return true
+            }
+        }
+        return false
+    }).WithTimeout(2*time.Minute).WithPolling(5*time.Second).Should(BeTrue(),
+        "Gather container should be running or terminated")
```
797-798: Replace hardcoded sleep with polling for events. The 45-second sleep unnecessarily slows down the test. Poll for relevant events or container status instead.
Apply this diff:
```diff
-    ginkgo.By("Checking for permission denied errors in events")
-    time.Sleep(45 * time.Second) // Allow time for potential errors
-
+    ginkgo.By("Checking for permission denied errors in events")
+    // Wait for container to start and potentially encounter permission issues
+    Eventually(func() corev1.PodPhase {
+        pod := &corev1.Pod{}
+        _ = adminClient.Get(testCtx, client.ObjectKey{
+            Name:      gatherPod.Name,
+            Namespace: testNamespace,
+        }, pod)
+        return pod.Status.Phase
+    }).WithTimeout(2*time.Minute).WithPolling(5*time.Second).Should(
+        Or(Equal(corev1.PodRunning), Equal(corev1.PodSucceeded), Equal(corev1.PodFailed)),
+        "Pod should reach a terminal or running state")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (1)
- `test/e2e/must_gather_operator_tests.go` (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
- Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
test/e2e/must_gather_operator_tests.go
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (1)
- `test/e2e/must_gather_operator_tests.go` (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
- Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
test/e2e/must_gather_operator_tests.go
🔇 Additional comments (2)
test/e2e/must_gather_operator_tests.go (2)
51-961: E2E suite structure and coverage look strong. The ordered suite, context breakdown (admin setup, non-admin flows, SA permissions, security, cleanup), and mix of SAR-based checks plus impersonated controller-runtime clients give you very thorough coverage of the RBAC and MustGather lifecycle from both admin and non-admin perspectives. Assertions and Eventually/Consistently polling patterns are appropriate for e2e and should keep flakiness manageable.
963-1017: Helper orchestration and cleanup utilities are well factored. The helper functions (`createNonAdminClient`, `verifyOperatorDeployment`, `createTestNamespace`, `cleanupTestResources`) keep the core specs readable and centralize cluster mutations. Labels like `test: "non-admin-must-gather"` and the final foreground namespace/clusterrole/clusterrolebinding deletes make it much easier to reason about test isolation and to clean up after runs.

Also applies to: 1237-1260
```go
func createNonAdminClusterRole() {
	cr := &rbacv1.ClusterRole{
		ObjectMeta: metav1.ObjectMeta{
			Name: nonAdminCRRoleName,
			Labels: map[string]string{
				"test": "non-admin-must-gather",
			},
		},
		Rules: []rbacv1.PolicyRule{
			{
				APIGroups: []string{"operator.openshift.io"},
				Resources: []string{"mustgathers"},
				Verbs:     []string{"*"},
			},
			{
				APIGroups: []string{"operator.openshift.io"},
				Resources: []string{"mustgathers/status"},
				Verbs:     []string{"get", "list", "watch"},
			},
			{
				APIGroups: []string{"batch"},
				Resources: []string{"jobs"},
				Verbs:     []string{"get", "list", "watch"},
			},
			{
				APIGroups: []string{""},
				Resources: []string{"pods"},
				Verbs:     []string{"get", "list", "watch"},
			},
			{
				APIGroups: []string{""},
				Resources: []string{"pods/log"},
				Verbs:     []string{"get", "list"},
			},
			{
				APIGroups: []string{""},
				Resources: []string{"events"},
				Verbs:     []string{"get", "list"},
			},
		},
	}

	err := adminClient.Create(testCtx, cr)
	if apierrors.IsAlreadyExists(err) {
		// Update if already exists
		_ = adminClient.Update(testCtx, cr)
	} else {
		Expect(err).NotTo(HaveOccurred(), "Failed to create ClusterRole for non-admin")
	}

	ginkgo.GinkgoWriter.Printf("✅ Created ClusterRole: %s\n", nonAdminCRRoleName)
}
```
🧩 Analysis chain
Fix RBAC Create/Update logic to be truly idempotent
In createNonAdminClusterRole, createNonAdminClusterRoleBinding, and createSARBACForMustGather, the pattern:

```go
Create(...)
if apierrors.IsAlreadyExists(err) { _ = adminClient.Update(ctx, obj) }
```

is problematic. When Create returns AlreadyExists, the in-memory object you pass to Update has no resourceVersion, so Update will typically fail server-side, and that error is ignored. This means re-running the tests against a cluster where these RBAC objects already exist will not reliably reconcile them to the desired spec, which can lead to confusing, non-reproducible behavior.
A safer pattern is:
- On `AlreadyExists`, `Get` the existing resource, mutate its fields to match the desired spec (labels, rules, subjects, etc.), then `Update` that existing object and assert on the error.
Conceptual example for one of them:
```go
err := adminClient.Create(testCtx, cr)
if apierrors.IsAlreadyExists(err) {
    existing := &rbacv1.ClusterRole{}
    err = adminClient.Get(testCtx, client.ObjectKey{Name: cr.Name}, existing)
    Expect(err).NotTo(HaveOccurred(), "Failed to get existing ClusterRole")
    existing.Labels = cr.Labels
    existing.Rules = cr.Rules
    err = adminClient.Update(testCtx, existing)
}
Expect(err).NotTo(HaveOccurred(), "Failed to ensure ClusterRole for non-admin")
```

Applying this pattern to all three helpers will make the RBAC setup idempotent and more predictable across repeated e2e runs.
Please double-check the Update behavior for your controller-runtime/k8s versions, but in general Kubernetes requires a populated metadata.resourceVersion on updates, which the newly-constructed objects here won’t have after a failed Create.
Also applies to: 1072-1103, 1126-1193
I'll verify this review by examining the actual code patterns and confirming the Kubernetes/controller-runtime behavior.
🏁 Script executed:

```bash
#!/bin/bash
# Examine the three functions mentioned in the review
rg -n "func createNonAdminClusterRole|func createNonAdminClusterRoleBinding|func createSARBACForMustGather" test/e2e/must_gather_operator_tests.go -A 60
```

Length of output: 5119
🌐 Web query:
controller-runtime Update resourceVersion requirement Kubernetes
💡 Result:
Short answer: Kubernetes uses metadata.resourceVersion for optimistic concurrency; Update requests must include the object's current resourceVersion (so the API can reject stale updates). controller-runtime's client.Update sends whatever resourceVersion is on the object you pass, so you must update an object that carries the latest resourceVersion (or handle 409 conflicts by retrying). Use Get → modify → Update, or use RetryOnConflict + controllerutil.CreateOrUpdate, or switch to Patch/Server-Side-Apply when appropriate. [1][2][3]
Helpful references:
- explanation of resourceVersion and optimistic locking. [1]
- note that Update requires a resourceVersion from a current object and will be rejected on mismatch. [2]
- common pattern: RetryOnConflict / CreateOrUpdate to handle "the object has been modified" errors. [3]
Fix RBAC Create/Update pattern in three test setup functions—missing resourceVersion causes silent failures
In `createNonAdminClusterRole()` (lines 1061–1067), `createNonAdminClusterRoleBinding()` (lines 1094–1099), and `createSARBACForMustGather()` (lines 1156–1161 and 1185–1186), the pattern of calling `Create()` followed by `_ = adminClient.Update(ctx, obj)` on an AlreadyExists error is broken. The newly constructed objects lack `metadata.resourceVersion`, which Kubernetes requires for updates. The server-side update will fail, but the error is silently ignored (via the `_` blank assignment), causing the functions to appear successful while the objects remain unreconciled to the desired spec. This breaks idempotency across repeated test runs.
Fix by retrieving the existing resource before update:
```go
err := adminClient.Create(testCtx, cr)
if apierrors.IsAlreadyExists(err) {
    existing := &rbacv1.ClusterRole{}
    err = adminClient.Get(testCtx, client.ObjectKey{Name: cr.Name}, existing)
    Expect(err).NotTo(HaveOccurred(), "Failed to get existing ClusterRole")
    existing.Labels = cr.Labels
    existing.Rules = cr.Rules
    err = adminClient.Update(testCtx, existing)
}
Expect(err).NotTo(HaveOccurred(), "Failed to ensure ClusterRole for non-admin")
```

Apply this pattern to all three helpers.
🤖 Prompt for AI Agents
In test/e2e/must_gather_operator_tests.go around lines 1019–1070 (and similarly
the helpers at ~1094–1099 and ~1156–1186), the Create-if-exists Update pattern
is broken because the newly constructed objects lack metadata.resourceVersion,
so Update silently fails; fix by: after Create returns AlreadyExists, call
adminClient.Get to load the existing resource into a local variable, copy/merge
the desired fields (Labels, Rules, Subjects, RoleRef, etc.) from the desired
object into that existing object, then call adminClient.Update on the fetched
object and assert the error (Expect(err).NotTo(HaveOccurred(...))). Apply this
same Get-merge-Update pattern to createNonAdminClusterRole(),
createNonAdminClusterRoleBinding(), and createSARBACForMustGather() so updates
succeed and are idempotent across runs.
/cc @swghosh @shivprakashmuley

/test e2e-gcp-operator

@praveencodes can you please check the e2e-gcp-operator job failures.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
test/e2e/must_gather_operator_runner_test.go (1)
1-27: Create the JUnit results directory and align its path with .gitignore. Right now:
- `testResultsDirectory` is `"./test-run-results"`, but `.gitignore` ignores `test/e2e/test-run-results/`, and
- There is no `os.MkdirAll` call before Ginkgo tries to write the JUnit report.

This can lead to flaky/broken e2e runs (if the directory doesn't exist) and to unignored artifacts in `git status`.

Consider something along these lines:
```diff
-const (
-	testResultsDirectory = "./test-run-results"
+const (
+	testResultsDirectory = "test/e2e/test-run-results"
@@ func TestMustGatherOperator(t *testing.T) {
@@
-	if _, ok := os.LookupEnv("DISABLE_JUNIT_REPORT"); !ok {
-		reporterConfig.JUnitReport = filepath.Join(testResultsDirectory, jUnitOutputFilename)
-	}
+	if _, ok := os.LookupEnv("DISABLE_JUNIT_REPORT"); !ok {
+		if err := os.MkdirAll(testResultsDirectory, 0755); err != nil {
+			t.Fatalf("failed to create test results directory %q: %v", testResultsDirectory, err)
+		}
+		reporterConfig.JUnitReport = filepath.Join(testResultsDirectory, jUnitOutputFilename)
+	}
```
test/e2e/, makes the.gitignoreentry accurate, and ensures JUnit report generation doesn’t fail due to a missing directory.
🧹 Nitpick comments (1)
.gitignore (1)
90-90: Align ignored test results path with the runner's output directory. This ignores `test/e2e/test-run-results/`, but the e2e runner currently writes JUnit results under `./test-run-results` (see `testResultsDirectory` in `test/e2e/must_gather_operator_runner_test.go`), so those artifacts will still show up in `git status`. I'd either:
- Move the output under `test/e2e/test-run-results` in the runner (keeps artifacts near the tests and makes this pattern correct), or
- Change this entry to ignore the actual output path (e.g., `test-run-results/` at the repo root).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (7)
- `.gitignore` (1 hunks)
- `Makefile` (1 hunks)
- `boilerplate/update.cfg` (0 hunks)
- `test/e2e/Dockerfile` (0 hunks)
- `test/e2e/README.md` (1 hunks)
- `test/e2e/e2e-template.yml` (0 hunks)
- `test/e2e/must_gather_operator_runner_test.go` (2 hunks)
💤 Files with no reviewable changes (3)
- boilerplate/update.cfg
- test/e2e/e2e-template.yml
- test/e2e/Dockerfile
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
- Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
- `test/e2e/README.md`
- `test/e2e/must_gather_operator_runner_test.go`
- `Makefile`
🪛 checkmake (0.2.2)
Makefile
[warning] 12-12: Missing required phony target "all"
(minphony)
[warning] 12-12: Missing required phony target "clean"
(minphony)
[warning] 12-12: Missing required phony target "test"
(minphony)
🪛 LanguageTool
test/e2e/README.md
[style] ~3-~3: Consider a more expressive alternative.
Context: ...ty is not breaking using e2e tests. To do this, following steps are recommended: ...
(DO_ACHIEVE)
🔇 Additional comments (2)
Makefile (1)
10-22: E2E test target wiring looks solid. The `test-e2e` target (timeout, single run, `-p 1`, `-tags e2e`, and Ginkgo flags after `./test/e2e`) is correctly structured and should integrate cleanly with the new e2e suite.

test/e2e/README.md (1)
3-22: Docs correctly reflect the new e2e workflow. The updated steps (`KUBECONFIG=... make test-e2e` and optional `DISABLE_JUNIT_REPORT`) line up with the Makefile target and the test runner behavior; nothing concerning here.
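Pulling together the flags discussed across this thread, the target presumably looks roughly like the sketch below (the exact flags and layout are guesses, not the PR's Makefile):

```make
.PHONY: test-e2e
test-e2e:
	go test -v -timeout 1h -count 1 -p 1 -tags e2e ./test/e2e \
		-ginkgo.v -ginkgo.show-node-events
```

Here `-tags e2e` selects only the e2e-tagged files, `-count 1` disables test caching, `-p 1` serializes packages to avoid cluster resource conflicts, and the trailing `-ginkgo.*` flags are forwarded to the Ginkgo suite.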
Fixed the JUnit report generation issue.
```go
ginkgo.By("STEP 2: Admin creates test namespace")
createTestNamespace()

ginkgo.By("STEP 3: Admin defines privileges for MustGather CRs")
createNonAdminClusterRole()

ginkgo.By("STEP 4: Admin binds the non-admin user to the ClusterRole")
createNonAdminClusterRoleBinding()

ginkgo.By("STEP 5: Admin provides a ServiceAccount for execution")
createServiceAccount()
createSARBACForMustGather()
```
There was a problem hiding this comment.
any specific reason for using functions here? couldn't find repeated use of those functions elsewhere / do we plan to reuse these later?
These functions are called only once, in the BeforeAll hook of the Describe block. I used them for better readability and isolation.
```go
	})
})

ginkgo.Context("Non-Admin User MustGather Collection Flow", func() {
```

Suggested change:

```diff
-ginkgo.Context("Non-Admin User MustGather Collection Flow", func() {
+ginkgo.Context("Non-Admin User", func() {
```
```go
	}
})

ginkgo.It("non-admin user should successfully submit MustGather CR with SA", func() {
```

Suggested change:

```diff
-ginkgo.It("non-admin user should successfully submit MustGather CR with SA", func() {
+ginkgo.It("should successfully create MustGather CR", func() {
```
```go
	Resources: []string{"jobs", "cronjobs"},
	Verbs:     []string{"get", "list"},
},
// Add more resources as needed for must-gather
```
There was a problem hiding this comment.
yeah good catch, we need a li'l more than this to actually run a full gather. refer: #264
```go
ginkgo.By("Waiting for operator to create Job (non-admin user perspective)")
job := &batchv1.Job{}
Eventually(func() error {
	err := nonAdminClient.Get(testCtx, client.ObjectKey{
```
it's a little questionable whether we want to let the non-admin user get the pods and the jobs; for a test it's fine, though user-specific scenarios can vary.
a real cluster admin would be hesitant to share the pod logs with the non-admin user at least, considering the Job pod will have sensitive info cluster-wide.
As per this document, a non-admin user can get/list/watch the job and pod logs. Hence, added those tests.
For the non-admin user:
Must be able to create/get/list/watch MustGather CRs in the target namespace.
Likely needs get/list/watch on Jobs/Pods (and logs) in that namespace to monitor progress and fetch artifacts.
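The permissions quoted from that document could be granted with rules along these lines; this is a hedged sketch (names are invented, and the PR actually uses a ClusterRole with similar rules):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: mustgather-nonadmin    # illustrative name
  namespace: must-gather-test  # illustrative namespace
rules:
  - apiGroups: ["operator.openshift.io"]
    resources: ["mustgathers"]
    verbs: ["create", "get", "list", "watch"]
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
```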
```go
	fetchedMG.Status.Status, fetchedMG.Status.Completed, fetchedMG.Status.Reason)
})

ginkgo.It("non-admin user should be able to monitor MustGather progress", func() {
```
instead we should look at improving the progress status before GA and let the non-admin user poll the conditions in the MustGather CR to become aware of the running state.
Ok. @shivprakashmuley What's your suggestion on this?
I think what we are trying to achieve as part of this test ("non-admin user should be able to monitor MustGather progress") is that a non-admin should be able to see the CR status; if we want to simulate a non-admin monitoring what is happening w.r.t. must-gather collection, then we can poll conditions. But then again, we don't update anything in conditions other than ReconcileSuccess.

For now we can rename this testcase to "non-admin user should be able to get MustGather status" and in the future add another testcase to monitor step-by-step must-gather collection through conditions.
Actionable comments posted: 5
🧹 Nitpick comments (13)
test/e2e/testdata/nonadmin-clusterrole-binding.yaml (1)
4-4: Consider consistent naming for clarity. The binding name `must-gather-cr-binding` doesn't follow the "nonadmin" naming pattern used in the ClusterRole (`must-gather-nonadmin-clusterrole`) and subject (`must-gather-nonadmin-user`). For consistency and clarity, consider renaming to `must-gather-nonadmin-cr-binding`. Apply this diff:

```diff
- name: must-gather-cr-binding
+ name: must-gather-nonadmin-cr-binding
```

test/library/utils.go (3)
35-37: Use consistent context throughout the function. The function uses `context.TODO()` for polling but `context.Background()` for the namespace creation call. Consider using the same context throughout, or preferably use `d.context` (the context field stored in DynamicResourceLoader) for consistency and proper cancellation support. Apply this diff:

```diff
-	if err := wait.PollUntilContextTimeout(context.TODO(), 1*time.Second, 30*time.Second, true, func(context.Context) (bool, error) {
+	if err := wait.PollUntilContextTimeout(d.context, 1*time.Second, 30*time.Second, true, func(ctx context.Context) (bool, error) {
 		var err error
-		got, err = d.KubeClient.CoreV1().Namespaces().Create(context.Background(), namespace, metav1.CreateOptions{})
+		got, err = d.KubeClient.CoreV1().Namespaces().Create(ctx, namespace, metav1.CreateOptions{})
```
60-63: Use consistent context throughout the function.Similar to CreateTestingNS, this function uses
context.TODO()for polling butcontext.Background()for API calls. Consider usingd.contextconsistently.Apply this diff:
- if err := wait.PollUntilContextTimeout(context.TODO(), 1*time.Second, 30*time.Second, true, func(context.Context) (bool, error) { + if err := wait.PollUntilContextTimeout(d.context, 1*time.Second, 30*time.Second, true, func(ctx context.Context) (bool, error) { // Poll until namespace is deleted - _, err := d.KubeClient.CoreV1().Namespaces().Get(ctx, name, metav1.GetOptions{}) + _, err := d.KubeClient.CoreV1().Namespaces().Get(d.context, name, metav1.GetOptions{})
76-76: Use consistent context.Use
d.contextinstead ofcontext.TODO()for consistency with the struct's context field.Apply this diff:
- events, err := d.KubeClient.CoreV1().Events(name).List(context.TODO(), metav1.ListOptions{}) + events, err := d.KubeClient.CoreV1().Events(name).List(d.context, metav1.ListOptions{})test/library/kube_client.go (2)
17-29: Consider simplifying error handling. Since `GetConfigForTest` already calls `require.NoError` (line 39), the error check and require on line 23 is redundant. The test will abort in `GetConfigForTest` if there's an error. You could simplify this by removing the intermediate error variable and require call. Apply this diff:

```diff
 func NewClientsConfigForTest(t *testing.T) (kubernetes.Interface, dynamic.Interface) {
 	config, err := GetConfigForTest(t)
-	if err == nil {
-		t.Logf("Found configuration for host %v.\n", config.Host)
-	}
-
-	require.NoError(t, err)
+	// GetConfigForTest already requires no error, so config is always valid here
+	t.Logf("Found configuration for host %v.\n", config.Host)
+
 	kubeClient, err := kubernetes.NewForConfig(config)
 	require.NoError(t, err)
```

39-40: Simplify return after `require.NoError`. Since `require.NoError` will abort the test on error, returning `err` is redundant. Consider returning just `config, nil`. Apply this diff:

```diff
 	require.NoError(t, err)
-	return config, err
+	return config, nil
```

test/e2e/must_gather_operator_runner_test.go (1)
63-63: Consider aligning suite name with project name. The suite name "support-log-gather e2e suite" doesn't clearly indicate this is the must-gather-operator e2e suite. Consider using a name that matches the project for clarity in test reports, such as "must-gather-operator e2e suite". Apply this diff:

```diff
- RunSpecs(t, "support-log-gather e2e suite", suiteConfig, reporterConfig)
+ RunSpecs(t, "must-gather-operator e2e suite", suiteConfig, reporterConfig)
```

test/library/dynamic_resources.go (3)
77-77: Silently ignoring error from `SetNestedSlice`. The error from `unstructured.SetNestedSlice` is being discarded. While failures here are rare, silently ignoring the error could mask issues during test debugging. Consider logging if the operation fails:

```diff
-	_ = unstructured.SetNestedSlice(obj.Object, subjects, "subjects")
+	if err := unstructured.SetNestedSlice(obj.Object, subjects, "subjects"); err != nil {
+		// Log but don't fail - this is a best-effort update
+		// The test will likely fail later with a more specific error
+	}
```

122-131: Inconsistent context usage in delete operation. Line 125 uses `context.TODO()` instead of `d.context`, which is stored in the struct. This inconsistency could cause issues if context cancellation is expected to propagate.

```diff
 func (d DynamicResourceLoader) DeleteFromFile(assetFunc func(name string) ([]byte, error), filename string, overrideNamespace string) {
 	d.t.Logf("Deleting resource %v\n", filename)
 	deleteFunc := func(t *testing.T, unstructured *unstructured.Unstructured, dynamicResourceInterface dynamic.ResourceInterface) {
-		err := dynamicResourceInterface.Delete(context.TODO(), unstructured.GetName(), metav1.DeleteOptions{})
+		err := dynamicResourceInterface.Delete(d.context, unstructured.GetName(), metav1.DeleteOptions{})
 		d.noErrorSkipNotExisting(err)
 	}
```

133-142: Same inconsistent context usage in create operation. Line 136 uses `context.TODO()` instead of `d.context`. Apply the same fix for consistency.

```diff
 func (d DynamicResourceLoader) CreateFromFile(assetFunc func(name string) ([]byte, error), filename string, overrideNamespace string) {
 	d.t.Logf("Creating resource %v\n", filename)
 	createFunc := func(t *testing.T, unstructured *unstructured.Unstructured, dynamicResourceInterface dynamic.ResourceInterface) {
-		_, err := dynamicResourceInterface.Create(context.TODO(), unstructured, metav1.CreateOptions{})
+		_, err := dynamicResourceInterface.Create(d.context, unstructured, metav1.CreateOptions{})
 		d.noErrorSkipExists(err)
 	}
```

test/e2e/must_gather_operator_test.go (3)
138-153: Potential test flakiness: `AfterEach` cleanup may time out if CR deletion is slow. The `Eventually` block waits up to 2 minutes for CR deletion, but if the operator is under load or the deletion cascades through many resources, this could cause intermittent failures. Consider whether 2 minutes is sufficient for all scenarios, especially when Jobs and Pods need cascading deletion. The timeout appears reasonable for most cases, but consider adding a brief log message if cleanup takes longer than expected to aid debugging flaky tests.

547-552: String matching for security validation is fragile. Checking for "forbidden" and "denied" substrings in event messages is locale-dependent and may miss variations. This approach could produce false negatives if the message format changes or uses different wording. Consider checking the event `Reason` field against known security-related reasons (e.g., `FailedCreate`, `FailedScheduling`) rather than substring matching on messages, which is more stable across Kubernetes versions.
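The Reason-versus-message trade-off can be illustrated in plain Go. The reason set below is an assumed example, not an exhaustive list from any Kubernetes API:

```go
package main

import (
	"fmt"
	"strings"
)

// securityReasons lists event reasons that typically indicate an RBAC or
// admission denial; extend as needed for the cluster under test.
var securityReasons = map[string]bool{
	"FailedCreate":     true,
	"FailedScheduling": true,
	"Forbidden":        true,
}

// isSecurityEventByReason matches on the structured Reason field.
func isSecurityEventByReason(reason string) bool {
	return securityReasons[reason]
}

// isSecurityEventByMessage is the fragile alternative: substring matching
// on a free-form, possibly reworded message.
func isSecurityEventByMessage(msg string) bool {
	m := strings.ToLower(msg)
	return strings.Contains(m, "forbidden") || strings.Contains(m, "denied")
}

func main() {
	// A rephrased message defeats substring matching but not Reason matching.
	fmt.Println(isSecurityEventByMessage("pods is not permitted for user x"))
	fmt.Println(isSecurityEventByReason("FailedCreate"))
}
```

The structured field survives message rewording, which is why it is the more stable assertion target.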
697-716: Helper uses `nonAdminClient` directly, coupling it to package-level state. The `createMustGatherCR` function uses the package-level `nonAdminClient` variable directly rather than accepting it as a parameter. This reduces flexibility and makes the function harder to test in isolation or reuse with different clients. Consider passing the client as a parameter for better flexibility:

```diff
-func createMustGatherCR(name, namespace, serviceAccountName string, retainResources bool) *mustgatherv1alpha1.MustGather {
+func createMustGatherCR(c client.Client, name, namespace, serviceAccountName string, retainResources bool) *mustgatherv1alpha1.MustGather {
 	mg := &mustgatherv1alpha1.MustGather{
 		// ...
 	}
-	err := nonAdminClient.Create(testCtx, mg)
+	err := c.Create(testCtx, mg)
 	Expect(err).NotTo(HaveOccurred(), "Failed to create MustGather CR")
 	return mg
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (12)
- `go.mod` (2 hunks)
- `test/e2e/must_gather_operator_runner_test.go` (1 hunks)
- `test/e2e/must_gather_operator_test.go` (1 hunks)
- `test/e2e/must_gather_operator_tests.go` (0 hunks)
- `test/e2e/testdata/nonadmin-clusterrole-binding.yaml` (1 hunks)
- `test/e2e/testdata/nonadmin-clusterrole.yaml` (1 hunks)
- `test/e2e/testdata/serviceaccount-clusterrole-binding.yaml` (1 hunks)
- `test/e2e/testdata/serviceaccount-clusterrole.yaml` (1 hunks)
- `test/e2e/testdata/serviceaccount.yaml` (1 hunks)
- `test/library/dynamic_resources.go` (1 hunks)
- `test/library/kube_client.go` (1 hunks)
- `test/library/utils.go` (1 hunks)
💤 Files with no reviewable changes (1)
- test/e2e/must_gather_operator_tests.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
- `test/e2e/testdata/nonadmin-clusterrole-binding.yaml`
- `go.mod`
- `test/e2e/testdata/serviceaccount-clusterrole.yaml`
- `test/library/dynamic_resources.go`
- `test/library/utils.go`
- `test/library/kube_client.go`
- `test/e2e/must_gather_operator_test.go`
- `test/e2e/must_gather_operator_runner_test.go`
- `test/e2e/testdata/serviceaccount-clusterrole-binding.yaml`
- `test/e2e/testdata/serviceaccount.yaml`
- `test/e2e/testdata/nonadmin-clusterrole.yaml`
🧬 Code graph analysis (4)
test/library/dynamic_resources.go (2)
- test/library/kube_client.go (1): `NewClientsConfigForTest` (17-29)
- version/version.go (1): `Version` (4-4)

test/library/utils.go (1)
- test/library/dynamic_resources.go (1): `DynamicResourceLoader` (24-30)

test/e2e/must_gather_operator_test.go (2)
- api/v1alpha1/groupversion_info.go (1): `AddToScheme` (35-35)
- api/v1alpha1/mustgather_types.go (3): `MustGather` (193-199), `MustGatherSpec` (28-65), `MustGatherList` (204-208)

test/e2e/must_gather_operator_runner_test.go (1)
- test/library/dynamic_resources.go (2): `DynamicResourceLoader` (24-30), `NewDynamicResourceLoader` (34-42)
🪛 Checkov (3.2.334)
test/e2e/testdata/nonadmin-clusterrole.yaml
[medium] 1-25: Minimize wildcard use in Roles and ClusterRoles
(CKV_K8S_49)
🔇 Additional comments (12)
go.mod (1)
14-14: Addition of testify for e2e test assertions looks good. The addition of `github.com/stretchr/testify v1.10.0` is appropriate for Go testing and aligns with the PR's shift to Ginkgo-based e2e testing. The version is current and well-maintained. The accompanying indirect dependency `github.com/pmezard/go-difflib` (line 61) is a transitive dependency from testify and is expected. Verification confirms that removed dependencies (osde2e-common and e2e-framework) are completely removed from the codebase with no lingering imports.

test/e2e/testdata/serviceaccount.yaml (1)
1-8: LGTM! The ServiceAccount manifest is correctly structured for dynamic e2e test scenarios with runtime namespace assignment.
test/e2e/testdata/serviceaccount-clusterrole.yaml (1)
1-31: LGTM! The ClusterRole correctly defines read-only permissions appropriate for non-admin e2e test scenarios. The resource and verb combinations are suitable for must-gather operations.
test/e2e/testdata/nonadmin-clusterrole.yaml (1)
8-10: Verify the need for wildcard permissions on mustgathers. The ClusterRole grants wildcard verbs `["*"]` on `mustgathers` resources, providing full control (create, update, delete, patch, etc.). While this may be intentional for comprehensive e2e testing, consider whether specific verbs (e.g., `["get", "list", "watch", "create", "delete"]`) would be sufficient. Explicitly listing required verbs follows the principle of least privilege and reduces potential security risks. Based on static analysis hints (CKV_K8S_49).
If the wildcard is necessary for test coverage, consider adding a comment explaining why full permissions are required.
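For context, RBAC verb matching treats `"*"` as matching every verb. A minimal stdlib sketch of that rule (not the actual Kubernetes authorizer code) shows why the wildcard is strictly broader than an explicit list:

```go
package main

import "fmt"

// verbAllowed reports whether a requested verb is granted by a rule's verb
// list, using the RBAC convention that "*" matches every verb.
func verbAllowed(ruleVerbs []string, requested string) bool {
	for _, v := range ruleVerbs {
		if v == "*" || v == requested {
			return true
		}
	}
	return false
}

func main() {
	wildcard := []string{"*"}
	scoped := []string{"get", "list", "watch", "create", "delete"}
	// "deletecollection" slips through the wildcard but not the scoped list.
	for _, verb := range []string{"get", "deletecollection", "escalate"} {
		fmt.Printf("%s: wildcard=%v scoped=%v\n", verb, verbAllowed(wildcard, verb), verbAllowed(scoped, verb))
	}
}
```

An explicit verb list makes the test's required privileges auditable at a glance, which is the point of the least-privilege suggestion.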
test/library/kube_client.go (1)
33-33: `InsecureSkipTLSVerify` is acceptable for e2e tests. Setting `InsecureSkipTLSVerify: true` is appropriate for test environments where certificate validation may not be configured. This is a common pattern in e2e testing.

test/e2e/must_gather_operator_runner_test.go (2)
33-40: LGTM! The test directory resolution correctly handles both CI and local development environments.

42-53: LGTM! The cluster name extraction is simple and effective for test logging. Note that the returned name includes the port (e.g., `api.cluster.example.com:6443`), which is acceptable for identification purposes.

test/library/dynamic_resources.go (1)
24-42: LGTM - Clean struct design and constructor. The `DynamicResourceLoader` struct properly encapsulates the required clients and test context. The constructor delegates client creation to `NewClientsConfigForTest`, maintaining separation of concerns.

test/e2e/must_gather_operator_test.go (4)
661-673: LGTM - Clean impersonation client creation. The helper properly creates an impersonated client by copying the admin config and setting the `Impersonate` field. Good separation of concerns.

675-695: LGTM - Operator deployment verification is thorough. The function validates both namespace existence and deployment readiness, with clear logging for debugging.

52-60: No action needed. The variables `loader` and `ns` are properly declared as package-level variables in `test/e2e/must_gather_operator_runner_test.go` and are correctly accessible throughout the package, including in `test/e2e/must_gather_operator_test.go`. This is standard Go package-scope behavior and will not cause compilation errors.

461-485: The test assertion is correct. When a Job references a non-existent ServiceAccount, Kubernetes deterministically prevents pod creation, leaving `job.Status.Active` at 0. The `Consistently()` check over 30 seconds appropriately verifies this behavior. The test properly validates that Job creation succeeds while pod scheduling fails due to the missing ServiceAccount.
```yaml
subjects:
- kind: ServiceAccount
  name: must-gather-serviceaccount
```
Add required namespace field to ServiceAccount subject.
ServiceAccount subjects in ClusterRoleBinding must specify a namespace. Without it, the binding will fail to apply.
Apply this diff to fix:
```diff
 subjects:
 - kind: ServiceAccount
   name: must-gather-serviceaccount
+  namespace: <namespace-name>
```

Note: Based on test/e2e/testdata/serviceaccount.yaml line 5, the namespace is set at runtime. Ensure consistency in how you apply both manifests.
🤖 Prompt for AI Agents
In test/e2e/testdata/serviceaccount-clusterrole-binding.yaml around lines 11-13,
the ClusterRoleBinding subject for the ServiceAccount is missing the required
namespace field; add a namespace entry to the subject (e.g., namespace:
<must-gather-namespace>) so the subject becomes fully qualified, and ensure the
value matches the namespace used at runtime in
test/e2e/testdata/serviceaccount.yaml (line 5) or the template/variable used to
inject that namespace.
Actionable comments posted: 1
🧹 Nitpick comments (5)
test/library/kube_client.go (2)
29-38: Redundant error return and duplicate logging. `GetConfigForTest` calls `require.NoError(t, err)` at line 37, which means the function will never return a non-nil error (it will fail the test instead). The returned `err` at line 38 is always `nil`, making the error return value misleading. Additionally, the host is logged here at line 34, and also in `NewClientsConfigForTest` at line 18, resulting in duplicate log messages. Consider either:

- Removing the error return value since it's always nil after `require.NoError`, or
- Not using `require.NoError` and letting callers handle the error.

```diff
 func GetConfigForTest(t TestingT) (*rest.Config, error) {
 	loader := clientcmd.NewDefaultClientConfigLoadingRules()
 	clientConfig := clientcmd.NewNonInteractiveDeferredLoadingClientConfig(loader, &clientcmd.ConfigOverrides{ClusterInfo: api.Cluster{InsecureSkipTLSVerify: true}})
 	config, err := clientConfig.ClientConfig()
-	if err == nil {
-		t.Logf("Found configuration for host %v.\n", config.Host)
-	}
-	require.NoError(t, err)
+	t.Logf("Found configuration for host %v.\n", config.Host)
 	return config, err
 }
```
15-27: Duplicate host logging with `GetConfigForTest`. Both `GetConfigForTest` (line 34) and `NewClientsConfigForTest` (line 18) log the host information, resulting in duplicate log output. Since `GetConfigForTest` always logs on success (after `require.NoError`), the logging here is redundant.

```diff
 func NewClientsConfigForTest(t TestingT) (kubernetes.Interface, dynamic.Interface) {
 	config, err := GetConfigForTest(t)
-	if err == nil {
-		t.Logf("Found configuration for host %v.\n", config.Host)
-	}
-	require.NoError(t, err)
 	kubeClient, err := kubernetes.NewForConfig(config)
```

test/e2e/must_gather_operator_test.go (2)

675-695: Local variable shadows package-level `ns`. At line 677, `ns := &corev1.Namespace{}` creates a local variable that shadows the package-level `ns` declared in `must_gather_operator_runner_test.go`. This works in this context since you're only using it to fetch the namespace, but it could cause confusion.

```diff
 func verifyOperatorDeployment() {
 	ginkgo.By("Checking must-gather-operator namespace exists")
-	ns := &corev1.Namespace{}
-	err := adminClient.Get(testCtx, client.ObjectKey{Name: operatorNamespace}, ns)
+	operatorNS := &corev1.Namespace{}
+	err := adminClient.Get(testCtx, client.ObjectKey{Name: operatorNamespace}, operatorNS)
 	Expect(err).NotTo(HaveOccurred(), "must-gather-operator namespace should exist")
```
138-154: Silent error handling in AfterEach cleanup. At line 141, `_ = adminClient.Delete(testCtx, mustGatherCR)` silently ignores errors. While cleanup errors are often non-critical, completely ignoring them could mask legitimate issues during test debugging. Consider logging the error if deletion fails unexpectedly:

```diff
 	ginkgo.By("Cleaning up MustGather CR")
-	_ = adminClient.Delete(testCtx, mustGatherCR)
+	if err := adminClient.Delete(testCtx, mustGatherCR); err != nil && !apierrors.IsNotFound(err) {
+		ginkgo.GinkgoWriter.Printf("Warning: failed to delete MustGather CR: %v\n", err)
+	}
```

test/library/dynamic_resources.go (1)
128-148: Uses `context.TODO()` instead of stored context. The `DynamicResourceLoader` stores a context in `d.context` (line 34), but `DeleteFromFile` (line 131) and `CreateFromFile` (line 142) use `context.TODO()` instead. This bypasses any context cancellation or timeout that might be set on the stored context.

```diff
 	deleteFunc := func(t TestingT, unstructured *unstructured.Unstructured, dynamicResourceInterface dynamic.ResourceInterface) {
-		err := dynamicResourceInterface.Delete(context.TODO(), unstructured.GetName(), metav1.DeleteOptions{})
+		err := dynamicResourceInterface.Delete(d.context, unstructured.GetName(), metav1.DeleteOptions{})
 		d.noErrorSkipNotExisting(err)
 	}
 	createFunc := func(t TestingT, unstructured *unstructured.Unstructured, dynamicResourceInterface dynamic.ResourceInterface) {
-		_, err := dynamicResourceInterface.Create(context.TODO(), unstructured, metav1.CreateOptions{})
+		_, err := dynamicResourceInterface.Create(d.context, unstructured, metav1.CreateOptions{})
 		d.noErrorSkipExists(err)
 	}
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (5)
- `test/e2e/must_gather_operator_runner_test.go` (1 hunks)
- `test/e2e/must_gather_operator_test.go` (1 hunks)
- `test/library/dynamic_resources.go` (1 hunks)
- `test/library/kube_client.go` (1 hunks)
- `test/library/utils.go` (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- test/library/utils.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
- `test/library/kube_client.go`
- `test/e2e/must_gather_operator_test.go`
- `test/e2e/must_gather_operator_runner_test.go`
- `test/library/dynamic_resources.go`
🧬 Code graph analysis (4)
test/library/kube_client.go (1)
- test/library/dynamic_resources.go (1): `TestingT` (24-28)

test/e2e/must_gather_operator_test.go (2)
- api/v1alpha1/groupversion_info.go (1): `AddToScheme` (35-35)
- api/v1alpha1/mustgather_types.go (3): `MustGather` (193-199), `MustGatherSpec` (28-65), `MustGatherList` (204-208)

test/e2e/must_gather_operator_runner_test.go (1)
- test/library/dynamic_resources.go (2): `DynamicResourceLoader` (30-36), `NewDynamicResourceLoader` (40-48)

test/library/dynamic_resources.go (2)
- test/library/kube_client.go (1): `NewClientsConfigForTest` (15-27)
- version/version.go (1): `Version` (4-4)
🔇 Additional comments (6)
test/e2e/must_gather_operator_runner_test.go (2)
66-77: LGTM - BeforeSuite correctly uses `GinkgoT()`. The `loader` is correctly initialized with `GinkgoT()` at line 75, which properly integrates with the Ginkgo test framework for error reporting. The setup follows a reasonable pattern for e2e test initialization.

26-31: No action needed. The test suite correctly uses `ginkgo.Ordered` on the `Describe` block (line 71 of must_gather_operator_test.go), which guarantees that `BeforeAll` executes before all subsequent specs and that all tests run in the defined order. This is the proper Ginkgo v2 pattern for managing shared state across tests. The cross-file package-level `ns` variable is safe because test ordering is explicitly controlled.

test/e2e/must_gather_operator_test.go (2)
71-109: Comprehensive test setup looks good. The `BeforeAll` setup correctly:
- Initializes admin config and client
- Verifies operator deployment before running tests
- Creates a dedicated test namespace
- Sets up RBAC resources (ClusterRole, ClusterRoleBinding, ServiceAccount)
- Creates a non-admin impersonated client
The ordered test execution with `ginkgo.Ordered` ensures proper sequencing.

237-292: Thorough Job verification tests. The test properly validates:
- Job creation with correct ServiceAccount
- Container specifications including gather container
- Output volume presence for artifacts
The debugging output when non-admin can't see the Job (lines 249-259) is helpful for troubleshooting RBAC issues.
test/library/dynamic_resources.go (2)
23-48: Well-designed abstraction for test utilities. The `TestingT` interface provides a clean abstraction that works with both `*testing.T` and `GinkgoT()`. The `DynamicResourceLoader` struct encapsulates all necessary clients and context for dynamic resource operations, making it easy to use in e2e tests.

86-126: Core resource loading logic is solid. The `do()` method correctly:
- Loads and decodes YAML/JSON assets
- Resolves GVK to REST mapping via discovery
- Handles namespace-scoped vs cluster-scoped resources appropriately
- Applies namespace overrides where needed
- Properly updates subject namespaces for ClusterRoleBindings (line 119)
/label tide/merge-method-squash
Actionable comments posted: 4
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (4)
- `test/e2e/must_gather_operator_runner_test.go`
- `test/e2e/must_gather_operator_test.go`
- `test/library/utils.go`
- `test/must-gather.yaml`
💤 Files with no reviewable changes (1)
- test/must-gather.yaml
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
- `test/e2e/must_gather_operator_test.go`
- `test/library/utils.go`
- `test/e2e/must_gather_operator_runner_test.go`
🧬 Code graph analysis (3)
test/e2e/must_gather_operator_test.go (2)
- api/v1alpha1/groupversion_info.go (1): `AddToScheme` (35-35)
- api/v1alpha1/mustgather_types.go (3): `MustGather` (193-199), `MustGatherSpec` (28-65), `MustGatherList` (204-208)

test/library/utils.go (1)
- test/library/dynamic_resources.go (1): `DynamicResourceLoader` (30-36)

test/e2e/must_gather_operator_runner_test.go (1)
- test/library/dynamic_resources.go (2): `DynamicResourceLoader` (30-36), `NewDynamicResourceLoader` (40-48)
```go
var _ = BeforeSuite(func() {
	var err error
	cfg, err = config.GetConfig()
	Expect(err).NotTo(HaveOccurred())

	By("creating dynamic resources client")
	loader = library.NewDynamicResourceLoader(context.TODO(), GinkgoT())
})
```
Pass a proper context instead of context.TODO() to the loader.
Line 56 passes context.TODO() to NewDynamicResourceLoader, which becomes the d.context field used throughout the loader's methods (including the utilities in test/library/utils.go). This prevents proper test cancellation and cleanup. Use a test-scoped context that can be cancelled when the suite terminates.
🔎 Proposed fix
Add a suite-level context variable and use it:
```diff
 var (
 	loader library.DynamicResourceLoader
 	cfg    *rest.Config
 	ns     *corev1.Namespace
+	suiteCtx    context.Context
+	suiteCancel context.CancelFunc
 )
 var _ = BeforeSuite(func() {
 	var err error
 	cfg, err = config.GetConfig()
 	Expect(err).NotTo(HaveOccurred())
 	By("creating dynamic resources client")
-	loader = library.NewDynamicResourceLoader(context.TODO(), GinkgoT())
+	suiteCtx, suiteCancel = context.WithCancel(context.Background())
+	loader = library.NewDynamicResourceLoader(suiteCtx, GinkgoT())
+})
+var _ = AfterSuite(func() {
+	if suiteCancel != nil {
+		suiteCancel()
+	}
 })
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In test/e2e/must_gather_operator_runner_test.go around lines 50 to 58, the
loader is created with context.TODO() which prevents proper cancellation and
cleanup; replace that with a suite-level cancellable context: declare a
package/suite-level context and cancel function (e.g., suiteCtx, suiteCancel),
initialize them in BeforeSuite using context.WithCancel(context.Background())
and pass suiteCtx to NewDynamicResourceLoader, and ensure you call suiteCancel()
in AfterSuite to cancel and clean up tests.
```go
func (d DynamicResourceLoader) CreateTestNS(namespacePrefix string, noSuffix bool) (*corev1.Namespace, error) {
	namespace := &corev1.Namespace{
		ObjectMeta: metav1.ObjectMeta{
			Labels: map[string]string{
				"e2e-test": "support-log-gather",
			},
		},
	}

	if noSuffix {
		namespace.Name = namespacePrefix
	} else {
		namespace.GenerateName = fmt.Sprintf("%v-", namespacePrefix)
	}

	var got *corev1.Namespace
	if err := wait.PollUntilContextTimeout(context.TODO(), 1*time.Second, 30*time.Second, true, func(context.Context) (bool, error) {
		var err error
		got, err = d.KubeClient.CoreV1().Namespaces().Create(context.Background(), namespace, metav1.CreateOptions{})
		if err != nil {
			log.Printf("Error creating namespace: %v", err)
			return false, nil
		}
		return true, nil
	}); err != nil {
		return nil, err
	}
	return got, nil
}
```
Use d.context instead of context.TODO() and context.Background().
Both the polling loop (line 34) and the Create call (line 36) ignore d.context, preventing proper cancellation and timeout propagation from the test framework. If the test is cancelled or times out, this function will continue polling and creating resources, potentially causing resource leaks.
🔎 Proposed fix
```diff
-	if err := wait.PollUntilContextTimeout(context.TODO(), 1*time.Second, 30*time.Second, true, func(context.Context) (bool, error) {
+	if err := wait.PollUntilContextTimeout(d.context, 1*time.Second, 30*time.Second, true, func(ctx context.Context) (bool, error) {
 		var err error
-		got, err = d.KubeClient.CoreV1().Namespaces().Create(context.Background(), namespace, metav1.CreateOptions{})
+		got, err = d.KubeClient.CoreV1().Namespaces().Create(ctx, namespace, metav1.CreateOptions{})
 		if err != nil {
 			log.Printf("Error creating namespace: %v", err)
 			return false, nil
 		}
 		return true, nil
 	}); err != nil {
```
}); err != nil {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| func (d DynamicResourceLoader) CreateTestNS(namespacePrefix string, noSuffix bool) (*corev1.Namespace, error) { | |
| namespace := &corev1.Namespace{ | |
| ObjectMeta: metav1.ObjectMeta{ | |
| Labels: map[string]string{ | |
| "e2e-test": "support-log-gather", | |
| }, | |
| }, | |
| } | |
| if noSuffix { | |
| namespace.Name = namespacePrefix | |
| } else { | |
| namespace.GenerateName = fmt.Sprintf("%v-", namespacePrefix) | |
| } | |
| var got *corev1.Namespace | |
| if err := wait.PollUntilContextTimeout(context.TODO(), 1*time.Second, 30*time.Second, true, func(context.Context) (bool, error) { | |
| var err error | |
| got, err = d.KubeClient.CoreV1().Namespaces().Create(context.Background(), namespace, metav1.CreateOptions{}) | |
| if err != nil { | |
| log.Printf("Error creating namespace: %v", err) | |
| return false, nil | |
| } | |
| return true, nil | |
| }); err != nil { | |
| return nil, err | |
| } | |
| return got, nil | |
| } | |
| func (d DynamicResourceLoader) CreateTestNS(namespacePrefix string, noSuffix bool) (*corev1.Namespace, error) { | |
| namespace := &corev1.Namespace{ | |
| ObjectMeta: metav1.ObjectMeta{ | |
| Labels: map[string]string{ | |
| "e2e-test": "support-log-gather", | |
| }, | |
| }, | |
| } | |
| if noSuffix { | |
| namespace.Name = namespacePrefix | |
| } else { | |
| namespace.GenerateName = fmt.Sprintf("%v-", namespacePrefix) | |
| } | |
| var got *corev1.Namespace | |
| if err := wait.PollUntilContextTimeout(d.context, 1*time.Second, 30*time.Second, true, func(ctx context.Context) (bool, error) { | |
| var err error | |
| got, err = d.KubeClient.CoreV1().Namespaces().Create(ctx, namespace, metav1.CreateOptions{}) | |
| if err != nil { | |
| log.Printf("Error creating namespace: %v", err) | |
| return false, nil | |
| } | |
| return true, nil | |
| }); err != nil { | |
| return nil, err | |
| } | |
| return got, nil | |
| } |
🤖 Prompt for AI Agents
In test/library/utils.go around lines 18 to 46, the polling loop and Create call
use context.TODO() and context.Background(), ignoring the loader's d.context
which prevents proper cancellation/timeouts; replace context.TODO() passed to
wait.PollUntilContextTimeout with d.context and replace context.Background()
passed to Namespaces().Create with d.context so the poll and create operations
respect the test framework's cancellation and timeout; ensure the function
returns any context-derived errors from the poll instead of swallowing them.
| func (d DynamicResourceLoader) DeleteTestingNS(name string, shouldDumpEvents func() bool) error { | ||
| ctx := context.Background() | ||
| if shouldDumpEvents() { | ||
| d.DumpEventsInNamespace(name) | ||
| } | ||
|
|
||
| err := d.KubeClient.CoreV1().Namespaces().Delete(ctx, name, metav1.DeleteOptions{}) | ||
| if err != nil { | ||
| log.Printf("Error deleting namespace %v, err: %v", name, err) | ||
| } | ||
|
|
||
| if err := wait.PollUntilContextTimeout(context.TODO(), 1*time.Second, 30*time.Second, true, func(context.Context) (bool, error) { | ||
| // Poll until namespace is deleted | ||
| _, err := d.KubeClient.CoreV1().Namespaces().Get(ctx, name, metav1.GetOptions{}) | ||
| if err != nil && k8serrors.IsNotFound(err) { | ||
| return true, nil | ||
| } | ||
| return false, nil | ||
| }); err != nil { | ||
| log.Printf("Timed out after 30s waiting for namespace %v to become deleted", name) | ||
| return err | ||
| } | ||
| return nil | ||
| } |
Use d.context instead of context.Background() and context.TODO().
This function creates a new context.Background() (line 49) and uses context.TODO() in polling (line 59), ignoring d.context. This prevents proper cancellation and cleanup when tests are terminated, potentially leaving namespaces in terminating state.
🔎 Proposed fix
```diff
 func (d DynamicResourceLoader) DeleteTestingNS(name string, shouldDumpEvents func() bool) error {
-	ctx := context.Background()
 	if shouldDumpEvents() {
 		d.DumpEventsInNamespace(name)
 	}
-	err := d.KubeClient.CoreV1().Namespaces().Delete(ctx, name, metav1.DeleteOptions{})
+	err := d.KubeClient.CoreV1().Namespaces().Delete(d.context, name, metav1.DeleteOptions{})
 	if err != nil {
 		log.Printf("Error deleting namespace %v, err: %v", name, err)
 	}
-	if err := wait.PollUntilContextTimeout(context.TODO(), 1*time.Second, 30*time.Second, true, func(context.Context) (bool, error) {
+	if err := wait.PollUntilContextTimeout(d.context, 1*time.Second, 30*time.Second, true, func(ctx context.Context) (bool, error) {
 		// Poll until namespace is deleted
 		_, err := d.KubeClient.CoreV1().Namespaces().Get(ctx, name, metav1.GetOptions{})
 		if err != nil && k8serrors.IsNotFound(err) {
 			return true, nil
 		}
 		return false, nil
 	}); err != nil {
```

🤖 Prompt for AI Agents
In test/library/utils.go around lines 48 to 71, the function creates a new
context.Background() and uses context.TODO(), ignoring the loader's d.context
which prevents proper cancellation; replace both context.Background() and
context.TODO() with d.context (e.g., set ctx := d.context at the top) and use
that ctx when calling Namespaces().Delete and as the context argument to
wait.PollUntilContextTimeout so the operations respect cancellation and test
cleanup.
```go
func (d DynamicResourceLoader) DumpEventsInNamespace(name string) {
	log.Printf("Dumping events in namespace %s...", name)
	events, err := d.KubeClient.CoreV1().Events(name).List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		log.Printf("Error listing events in namespace %s: %v", name, err)
		return
	}

	for _, e := range events.Items {
		log.Printf("At %v - event for %v %v: %v %v: %v", e.FirstTimestamp, e.InvolvedObject.Kind, e.InvolvedObject.Name, e.Source, e.Reason, e.Message)
	}
}
```
Use d.context instead of context.TODO() for consistency.
While this is a best-effort logging operation, using d.context ensures the event listing respects test cancellation and avoids unnecessary API calls when the test framework is shutting down.
🔎 Proposed fix
```diff
 func (d DynamicResourceLoader) DumpEventsInNamespace(name string) {
 	log.Printf("Dumping events in namespace %s...", name)
-	events, err := d.KubeClient.CoreV1().Events(name).List(context.TODO(), metav1.ListOptions{})
+	events, err := d.KubeClient.CoreV1().Events(name).List(d.context, metav1.ListOptions{})
 	if err != nil {
 		log.Printf("Error listing events in namespace %s: %v", name, err)
 		return
 	}
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
func (d DynamicResourceLoader) DumpEventsInNamespace(name string) {
	log.Printf("Dumping events in namespace %s...", name)
	events, err := d.KubeClient.CoreV1().Events(name).List(d.context, metav1.ListOptions{})
	if err != nil {
		log.Printf("Error listing events in namespace %s: %v", name, err)
		return
	}
	for _, e := range events.Items {
		log.Printf("At %v - event for %v %v: %v %v: %v", e.FirstTimestamp, e.InvolvedObject.Kind, e.InvolvedObject.Name, e.Source, e.Reason, e.Message)
	}
}
```
🤖 Prompt for AI Agents
In test/library/utils.go around lines 73 to 84, the call to context.TODO() when
listing events should use d.context so the call respects test cancellation;
change List(context.TODO(), ...) to List(d.context, ...) (ensure d.context is a
valid context.Context on the DynamicResourceLoader struct).
generally, looking great except one small nit.
(also, I didn't 👀 through the dynamic_resource_loader with the assumption it was copied as-is from https://github.com/openshift/cert-manager-operator/tree/master/test/library)
I'm only concerned that the serviceaccount-clusterrole will be unable to collect a full must-gather during the gather Job (cc: @shivprakashmuley to evaluate if that's blocking/not for this PR; or follow-up in #264 or similar).
Thank you for the great work and pulling this through @praveencodes!
```go
ginkgo.It("should fail when non-admin user tries to use privileged ServiceAccount", func() {
	mustGatherName := fmt.Sprintf("test-privileged-sa-%d", time.Now().UnixNano())

	ginkgo.By("Attempting to create MustGather with non-existent privileged SA")
	mg = createMustGatherCR(mustGatherName, ns.Name, "cluster-admin-sa", false) // SA doesn't exist

	ginkgo.By("Verifying Job is created but Pod fails due to missing SA")
	job := &batchv1.Job{}
	Eventually(func() error {
		return adminClient.Get(testCtx, client.ObjectKey{
			Name:      mustGatherName,
			Namespace: ns.Name,
		}, job)
	}).WithTimeout(2 * time.Minute).WithPolling(5 * time.Second).Should(Succeed())

	// Job should exist but pods won't be created due to missing SA
	Consistently(func() int32 {
		_ = adminClient.Get(testCtx, client.ObjectKey{
			Name:      mustGatherName,
			Namespace: ns.Name,
		}, job)
		return job.Status.Active
	}).WithTimeout(30*time.Second).WithPolling(5*time.Second).Should(Equal(int32(0)),
		"Job should not have active pods due to missing ServiceAccount")
})
```
while this is a good test to have ("check if an absent serviceaccount actually fails"),
do you think this is wrongly worded as a "privileged ServiceAccount" check?
```go
for _, container := range gatherPod.Spec.Containers {
	if container.SecurityContext != nil && container.SecurityContext.Privileged != nil {
		Expect(*container.SecurityContext.Privileged).To(BeFalse(),
			"Container should not run in privileged mode")
	}
}
```
this is a sufficient check, I'd say.
Or if you want to take it further, you can check the SCC/PSA annotations on the pod not being "privileged".
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: praveencodes, swghosh

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Details
Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing
@praveencodes: The following test failed, say
Full PR test history. Your PR dashboard.

Details
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
/label no-qe
/verified
@shivprakashmuley: The

Details
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/verified bypass
@shivprakashmuley: The

Details
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
130bc8a into openshift:master
No description provided.