-
Notifications
You must be signed in to change notification settings - Fork 65
SREP-1884: Add support for ACM Hosted Control Plane (HCP) must-gathers #296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@rolandmkunkel: This pull request references SREP-1884 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: rolandmkunkel The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
WalkthroughAdds Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Points to review carefully:
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Comment |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #296 +/- ##
==========================================
+ Coverage 70.98% 71.37% +0.39%
==========================================
Files 7 7
Lines 479 503 +24
==========================================
+ Hits 340 359 +19
- Misses 136 139 +3
- Partials 3 5 +2
🚀 New features to boost your workflow:
|
fbb92a0 to
39bb966
Compare
|
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (7)
api/v1alpha1/mustgather_types.go (1)
63-64: Avoid hardcoding an exact image tag in public API comments.Move concrete image/tag details to operator docs/env var description; keep API comment generic to reduce churn.
bundle/manifests/tech-preview/support-log-gather-operator.clusterserviceversion.yaml (1)
211-212: Mirror quoting/digest guidance in CSV.Keep CSV aligned with Deployment to prevent OLM drift.
- - name: ACM_HCP_MUST_GATHER_IMAGE - value: registry.redhat.io/multicluster-engine/must-gather-rhel9:v2.8 + - name: ACM_HCP_MUST_GATHER_IMAGE + value: "registry.redhat.io/multicluster-engine/must-gather-rhel9:v2.8"README.md (1)
108-109: Add a minimal HCP CR example using the new spec fields.Helps users understand when/how to set mustGatherImage and hosted fields.
Example:
apiVersion: operator.openshift.io/v1alpha1 kind: MustGather metadata: name: example-acm-hcp spec: mustGatherImage: acm_hcp hostedClusterNamespace: clusters hostedClusterName: my-hc caseID: "02527285" caseManagementAccountSecretRef: name: case-management-creds # audit must be omitted/false with acm_hcpdeploy/99_must-gather-operator.Deployment.yaml (1)
45-46: Quote the image value for YAML best practice and consistency.The env var is properly consumed by the controller code (via
os.Getenv()incontrollers/mustgather/template.go:176-178) and correctly selects the image whenmustGatherImage='acm_hcp'. Hosted cluster arguments are properly validated and passed. Quoting is recommended for consistency, since bothDEFAULT_MUST_GATHER_IMAGEandACM_HCP_MUST_GATHER_IMAGEshould follow the same pattern. Regarding digest pinning: both images currently use tags; applying digests only toACM_HCP_MUST_GATHER_IMAGEwould create inconsistency with the default image.Apply:
- - name: ACM_HCP_MUST_GATHER_IMAGE - value: registry.redhat.io/multicluster-engine/must-gather-rhel9:v2.8 + - name: ACM_HCP_MUST_GATHER_IMAGE + value: "registry.redhat.io/multicluster-engine/must-gather-rhel9:v2.8"deploy/crds/operator.openshift.io_mustgathers.yaml (1)
48-68: Consider adding CRD-level validation for conditional field requirements.The descriptions state that
hostedClusterNameandhostedClusterNamespaceare required whenmustGatherImage='acm_hcp', but there's no CEL validation rule to enforce this at the CRD level. While the controller'sIsValidmethod provides application-level validation, adding CRD-level validation would reject invalid resources earlier and provide better user feedback.Consider adding this validation rule to the spec's
x-kubernetes-validations:spec: description: MustGatherSpec defines the desired state of MustGather properties: # ... existing properties ... type: object x-kubernetes-validations: - message: hostedClusterNamespace and hostedClusterName must be set when mustGatherImage is 'acm_hcp' rule: "self.mustGatherImage != 'acm_hcp' || (has(self.hostedClusterNamespace) && self.hostedClusterNamespace != '' && has(self.hostedClusterName) && self.hostedClusterName != '')"bundle/manifests/tech-preview/operator.openshift.io_mustgathers.yaml (1)
159-179: Consider consistent field placement across CRD files.The new fields (
mustGatherImage,hostedClusterName,hostedClusterNamespace) are placed afteruploadTargetin this file, but indeploy/crds/operator.openshift.io_mustgathers.yamlthey're placed beforeuploadTarget(lines 48-68). While YAML doesn't care about property order, consistent placement across CRD files improves maintainability and makes diffs easier to review.Additionally, the same recommendation for CRD-level validation applies here (see comment on
deploy/crds/operator.openshift.io_mustgathers.yaml).controllers/mustgather/template.go (1)
167-197: Consider optimizing env var reads and handling missing variables.Two minor observations:
- Optimize environment variable reads: Line 176 reads the default image env var unconditionally, even when
mustGatherImage == acmHcpMustGatherImage. Consider only reading the needed variable:- commandArgs := "" - containerImage := strings.TrimSpace(os.Getenv(defaultMustGatherImageEnv)) - if mustGatherImage == acmHcpMustGatherImage { - containerImage = strings.TrimSpace(os.Getenv(acmHcpMustGatherImageEnv)) + var containerImage string + commandArgs := "" + if mustGatherImage == acmHcpMustGatherImage { + containerImage = strings.TrimSpace(os.Getenv(acmHcpMustGatherImageEnv)) commandArgs = fmt.Sprintf("--hosted-cluster-namespace=%s --hosted-cluster-name=%s", hostedClusterNamespace, hostedClusterName) + } else { + containerImage = strings.TrimSpace(os.Getenv(defaultMustGatherImageEnv)) }
- Missing environment variable handling: If the required environment variable is not set,
containerImagewill be an empty string, which could lead to pod creation failure. Consider logging a warning or using a fallback value, though this may be intentional if the operator deployment guarantees these vars are set.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (10)
README.md(1 hunks)api/v1alpha1/mustgather_types.go(1 hunks)bundle/manifests/tech-preview/operator.openshift.io_mustgathers.yaml(1 hunks)bundle/manifests/tech-preview/support-log-gather-operator.clusterserviceversion.yaml(1 hunks)controllers/mustgather/mustgather_controller.go(2 hunks)controllers/mustgather/mustgather_controller_test.go(1 hunks)controllers/mustgather/template.go(4 hunks)controllers/mustgather/template_test.go(2 hunks)deploy/99_must-gather-operator.Deployment.yaml(1 hunks)deploy/crds/operator.openshift.io_mustgathers.yaml(1 hunks)
🔇 Additional comments (4)
controllers/mustgather/mustgather_controller.go (1)
365-379: Validation logic and constant sharing verified.The
acmHcpMustGatherImageconstant is properly defined once incontrollers/mustgather/template.goand consistently referenced across the codebase (controller, template, and tests), ensuring single source of truth with no string drift.controllers/mustgather/mustgather_controller_test.go (1)
1192-1271: LGTM! Comprehensive validation test coverage.The test function covers all validation scenarios appropriately:
- Default image without hosted cluster info (valid)
- ACM HCP image with complete hosted cluster info (valid)
- ACM HCP image with missing hosted cluster name (invalid)
- ACM HCP image with missing hosted cluster namespace (invalid)
- ACM HCP image with no hosted cluster info (invalid)
The table-driven test pattern is idiomatic and the assertions verify both the validity boolean and error messages correctly.
controllers/mustgather/template_test.go (1)
45-123: LGTM! Test coverage expanded appropriately.The test updates correctly exercise the new parameters:
- New test constants for both must-gather images
- Expanded test cases including "HCP default" scenario
- Environment variables set for both image types
- Assertions verify both image selection and argument construction
The changes maintain consistency with the updated function signature and add valuable test coverage for the ACM HCP use case.
controllers/mustgather/template.go (1)
49-55: LGTM! Constants are well-defined.The environment variable names and image type constants follow clear naming conventions and align with the CRD enum values.
39bb966 to
4dc694a
Compare
api/v1alpha1/mustgather_types.go
Outdated
| // The namespace of the hosted cluster to collect the must-gather for. | ||
| // This field is required if mustGatherImage='acm_hcp'. | ||
| // +kubebuilder:validation:Optional | ||
| HostedClusterNamespace string `json:"hostedClusterNamespace,omitempty"` | ||
|
|
||
| // The name of the hosted cluster to collect the must-gather for. | ||
| // This field is required if mustGatherImage='acm_hcp'. | ||
| // +kubebuilder:validation:Optional | ||
| HostedClusterName string `json:"hostedClusterName,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we think that we'll ever need to add additional options specific to HCP must gathers? I'm not close-enough to the functionality to know if we will or not.
If so - should we put it into its own subfield to keep the api cleaner?
So it would be something like HostedClusterOptions that's a HostedClusterOptions type, which would then contain Namespace and Name string fields.
It would end up looking like this:
spec:
mustGatherImage: acm_hcp
hostedClusterOptions:
Namespace: my_hcp_namespace
Name: my_cluster_name
api/v1alpha1/mustgather_types.go
Outdated
| // The name of the hosted cluster to collect the must-gather for. | ||
| // This field is required if mustGatherImage='acm_hcp'. | ||
| // +kubebuilder:validation:Optional | ||
| HostedClusterName string `json:"hostedClusterName,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally - do we actually need the name here? Could we, given a namespace, infer the name somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could, but that would make the code between HCP and classic diverge more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coming back to this after diving a bit deeper into hypershift there's actually not an enforced 1:1 relationship with where the hosted cluster namespace is and the name of the hosted cluster - meaning you can put multiple hosted cluster CRs into a single namespace. So I'm now +1 to keeping this. It's always easier to make something optional later than it is to make it required later, as that change shouldn't break anything that's already relying on that functionality.
Thanks for looking!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
controllers/mustgather/mustgather_controller.go (1)
110-114: Use the request-scoped ctx instead of context.TODO().This issue was previously flagged: the call to
r.ManageErrorusescontext.TODO(), which loses request-scoped cancellation/deadline and logging correlation. Replace with thectxparameter passed toReconcile.Apply this diff:
- return r.ManageError(context.TODO(), instance, err) + return r.ManageError(ctx, instance, err)
🧹 Nitpick comments (2)
controllers/mustgather/mustgather_controller_test.go (1)
1192-1271: Consider adding a test case for the Audit constraint with ACM HCP image.The test suite covers the hosted cluster field requirements, but there's no test case verifying that
Auditmust be false whenMustGatherImageis "acm_hcp" (as enforced by the XValidation rule in the API). Adding this test case would improve coverage and ensure the validation logic matches the API constraints.Add a test case like:
{ name: "ACM HCP image should be invalid with audit enabled", mustGather: &mustgatherv1alpha1.MustGather{ Spec: mustgatherv1alpha1.MustGatherSpec{ MustGatherImage: "acm_hcp", HostedClusterNamespace: "test-ns", HostedClusterName: "test-cluster", Audit: true, }, }, expectedValid: false, expectedError: errors.New("cannot collect audit logs when using the acm_hcp must gather image"), },controllers/mustgather/template.go (1)
167-180: Add error handling for missing environment variables.If the environment variables
DEFAULT_MUST_GATHER_IMAGEorACM_HCP_MUST_GATHER_IMAGEare not set,containerImagewill be an empty string, leading to a pod creation failure with an obscure error. Consider returning an error or using fallback defaults to fail fast with a clear message.Apply this pattern:
func getGatherContainer(audit bool, timeout time.Duration, mustGatherImage, hostedClusterNamespace, hostedClusterName string) corev1.Container { var commandBinary string if audit { commandBinary = gatherCommandBinaryAudit } else { commandBinary = gatherCommandBinaryNoAudit } commandArgs := "" containerImage := strings.TrimSpace(os.Getenv(defaultMustGatherImageEnv)) + if containerImage == "" { + // Fallback or panic with clear error message + panic(fmt.Sprintf("environment variable %s is not set", defaultMustGatherImageEnv)) + } if mustGatherImage == acmHcpMustGatherImage { containerImage = strings.TrimSpace(os.Getenv(acmHcpMustGatherImageEnv)) + if containerImage == "" { + panic(fmt.Sprintf("environment variable %s is not set", acmHcpMustGatherImageEnv)) + } commandArgs = fmt.Sprintf("--hosted-cluster-namespace=%s --hosted-cluster-name=%s", hostedClusterNamespace, hostedClusterName) }Alternatively, change
getGatherContainerto return(corev1.Container, error)and propagate the error up throughgetJobTemplate.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (10)
README.md(1 hunks)api/v1alpha1/mustgather_types.go(2 hunks)bundle/manifests/tech-preview/operator.openshift.io_mustgathers.yaml(1 hunks)bundle/manifests/tech-preview/support-log-gather-operator.clusterserviceversion.yaml(1 hunks)controllers/mustgather/mustgather_controller.go(2 hunks)controllers/mustgather/mustgather_controller_test.go(1 hunks)controllers/mustgather/template.go(4 hunks)controllers/mustgather/template_test.go(2 hunks)deploy/99_must-gather-operator.Deployment.yaml(1 hunks)deploy/crds/operator.openshift.io_mustgathers.yaml(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- deploy/crds/operator.openshift.io_mustgathers.yaml
- README.md
- bundle/manifests/tech-preview/support-log-gather-operator.clusterserviceversion.yaml
- deploy/99_must-gather-operator.Deployment.yaml
🔇 Additional comments (5)
controllers/mustgather/mustgather_controller.go (1)
365-379: LGTM! Defensive validation complements CRD-level checks.The validation logic correctly enforces that ACM HCP must-gathers require hosted cluster context and cannot collect audit logs. While these constraints are also enforced by CRD XValidation rules, this runtime check provides an additional safety layer and clearer error messages.
controllers/mustgather/template_test.go (1)
45-123: LGTM! Test coverage appropriately expanded for ACM HCP support.The test updates correctly:
- Expand test cases to include HCP scenarios with hosted cluster parameters
- Set both must-gather image environment variables for test isolation
- Verify container image selection based on
mustGatherImagevalue- Assert that hosted-cluster arguments are properly included in the command
The test coverage ensures the new functionality works as expected.
api/v1alpha1/mustgather_types.go (1)
28-80: LGTM! API changes correctly implement ACM HCP support with proper validation.The additions are well-structured:
- XValidation rules enforce cross-field constraints at the API layer (acm_hcp requires hosted cluster fields, forbids audit, and non-acm_hcp forbids hosted fields)
- New fields include appropriate kubebuilder markers (Optional, Enum, default values)
- Default value "default" for
MustGatherImageensures backward compatibility- Field documentation clearly explains the requirements and usage
This addresses the past review comment requesting struct-level XValidation to encode HCP invariants in the type system.
bundle/manifests/tech-preview/operator.openshift.io_mustgathers.yaml (1)
159-179: LGTM! CRD schema correctly reflects the new API fields.The OpenAPI schema additions properly define:
mustGatherImagewith enum constraint and default value matching the API typehostedClusterNameandhostedClusterNamespacewith clear descriptions- Field descriptions that reference the acm_hcp requirement
The schema changes align with the Go type definitions in
api/v1alpha1/mustgather_types.go.controllers/mustgather/template.go (1)
50-54: LGTM! Constants appropriately defined for image selection.The new constants clearly identify:
- Environment variable names for must-gather images
- Image type identifiers matching the API enum values
This makes the code self-documenting and maintainable.
4dc694a to
697a9b1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
controllers/mustgather/mustgather_controller.go (1)
109-113: Use the request-scoped ctx instead of context.TODO().This issue was previously flagged. Using
context.TODO()loses request-scoped cancellation/deadline and logging correlation.Apply this diff:
- return r.ManageError(context.TODO(), instance, err) + return r.ManageError(ctx, instance, err)
🧹 Nitpick comments (1)
controllers/mustgather/template.go (1)
199-204: Add defensive nil check for hostedClusterOptions before dereferencing.While the controller's
IsValidmethod ensureshostedClusterOptionsis non-nil whenmustGatherImageisacm_hcp, defensive programming suggests adding an explicit nil check here to prevent panics if the validation is bypassed or modified in the future.Apply this diff:
commandArgs := "" containerImage := strings.TrimSpace(os.Getenv(defaultMustGatherImageEnv)) if mustGatherImage == acmHcpMustGatherImage { containerImage = strings.TrimSpace(os.Getenv(acmHcpMustGatherImageEnv)) + if hostedClusterOptions == nil { + // This should not happen if IsValid is called, but adding defensive check + log.Error(nil, "hostedClusterOptions is nil for acm_hcp image") + return corev1.Container{} + } commandArgs = fmt.Sprintf("--hosted-cluster-namespace=%s --hosted-cluster-name=%s", hostedClusterOptions.HostedClusterNamespace, hostedClusterOptions.HostedClusterName) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (11)
README.md(1 hunks)api/v1alpha1/mustgather_types.go(2 hunks)api/v1alpha1/zz_generated.deepcopy.go(2 hunks)bundle/manifests/tech-preview/operator.openshift.io_mustgathers.yaml(1 hunks)bundle/manifests/tech-preview/support-log-gather-operator.clusterserviceversion.yaml(1 hunks)controllers/mustgather/mustgather_controller.go(2 hunks)controllers/mustgather/mustgather_controller_test.go(1 hunks)controllers/mustgather/template.go(5 hunks)controllers/mustgather/template_test.go(3 hunks)deploy/99_must-gather-operator.Deployment.yaml(1 hunks)deploy/crds/operator.openshift.io_mustgathers.yaml(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- deploy/99_must-gather-operator.Deployment.yaml
- README.md
- bundle/manifests/tech-preview/operator.openshift.io_mustgathers.yaml
- bundle/manifests/tech-preview/support-log-gather-operator.clusterserviceversion.yaml
🔇 Additional comments (10)
api/v1alpha1/zz_generated.deepcopy.go (1)
12-25: LGTM! Auto-generated deepcopy code is correct.The deepcopy implementation for
HostedClusterOptionsand the integration intoMustGatherSpec.DeepCopyIntofollow standard controller-runtime patterns and are correctly generated.Also applies to: 102-106
controllers/mustgather/mustgather_controller_test.go (1)
1150-1235: LGTM! Comprehensive test coverage for IsValid validation logic.The test cases thoroughly cover all validation scenarios:
- Default image without hosted cluster options
- ACM HCP with valid configuration
- ACM HCP with missing required fields (name, namespace, or entire options struct)
The table-driven test approach is clean and maintainable.
controllers/mustgather/mustgather_controller.go (2)
334-353: LGTM! IsValid validation logic is correct.The validation logic properly enforces the ACM HCP constraints:
- Requires
HostedClusterOptionsto be non-nil when usingacm_hcpimage- Requires both
HostedClusterNamespaceandHostedClusterNameto be set- Enforces that audit must be disabled for
acm_hcpmode
334-353: No duplicateIsValidimplementations found.The verification confirms only one
IsValidmethod declaration exists in the controller file at line 334. The original concern about duplicate implementations causing compilation errors is unfounded. The review comment should be disregarded.Likely an incorrect or invalid review comment.
controllers/mustgather/template_test.go (1)
79-167: LGTM! Test updates correctly reflect the new signature and behavior.The tests properly:
- Set up both
DEFAULT_MUST_GATHER_IMAGEandACM_HCP_MUST_GATHER_IMAGEenvironment variables- Pass the new
mustGatherImageandhostedClusterOptionsparameters- Verify the container uses the correct image
- Validate that ACM HCP command arguments are correctly appended
api/v1alpha1/mustgather_types.go (2)
28-30: LGTM! XValidation rules comprehensively enforce ACM HCP constraints.The struct-level validations ensure:
- When
mustGatherImageisacm_hcp, the hosted cluster namespace and name are required- Audit must be false for
acm_hcpmode- Hosted cluster fields are only valid when using
acm_hcpimageThis prevents invalid configurations at the API layer before reaching the controller.
68-92: LGTM! New fields are well-structured with proper validation.The
MustGatherImageenum andHostedClusterOptionstype are properly defined with:
- Clear enum values and default for
MustGatherImage- Required validation for
HostedClusterOptionsfields- Good inline documentation
deploy/crds/operator.openshift.io_mustgathers.yaml (1)
48-75: LGTM! CRD schema correctly reflects the Go type definitions.The CRD updates are consistent with
api/v1alpha1/mustgather_types.go:
hostedClusterOptionsfields match theHostedClusterOptionsstructmustGatherImageenum and default match the Go field tags- X-kubernetes-validations match the XValidation rules
Also applies to: 233-243
controllers/mustgather/template.go (2)
182-217: LGTM! Image selection and command argument logic is correct.The implementation properly:
- Selects the container image based on
mustGatherImagevalue- Constructs ACM HCP-specific command arguments with namespace and cluster name
- Maintains backward compatibility with the default image flow
200-204: Environment variables are properly configured in the deployment manifest.The
DEFAULT_MUST_GATHER_IMAGEandACM_HCP_MUST_GATHER_IMAGEenvironment variables are both explicitly set with non-empty values in the deployment manifest (deploy/99_must-gather-operator.Deployment.yaml):
DEFAULT_MUST_GATHER_IMAGE:quay.io/openshift/origin-must-gather:latestACM_HCP_MUST_GATHER_IMAGE:registry.redhat.io/multicluster-engine/must-gather-rhel9:v2.8The code will receive these values when
os.Getenv()is called, andcontainerImagewill not be empty. The concern raised in the original review comment is not applicable.
697a9b1 to
c1779a2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
deploy/99_must-gather-operator.Deployment.yaml (1)
45-46: Consider version management strategy for the ACM HCP must-gather image.The image reference is pinned to
v2.8. If ACM releases newer versions with fixes or features, this hardcoded value will require manual updates across deployments.api/v1alpha1/mustgather_types.go (1)
68-75: Clarify image selection in documentation.The comments reference specific registry URLs, but the actual image selection happens via environment variables (
DEFAULT_MUST_GATHER_IMAGEandACM_HCP_MUST_GATHER_IMAGE). Consider updating the comments to reference the environment variables rather than hardcoding URLs, which improves maintainability.Example:
// The image to use for the must-gather pods. // This can currently only be one of two values: -// 'default' for most cases including OCP and Managed OpenShift, which will use quay.io/openshift/must-gather -// 'acm_hcp' for HCP must gathers, which will use registry.redhat.io/multicluster-engine/must-gather-rhel9:v2.8 +// 'default' for most cases including OCP and Managed OpenShift, which will use the image from DEFAULT_MUST_GATHER_IMAGE env var +// 'acm_hcp' for HCP must gathers, which will use the image from ACM_HCP_MUST_GATHER_IMAGE env var
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (11)
README.md(1 hunks)api/v1alpha1/mustgather_types.go(2 hunks)api/v1alpha1/zz_generated.deepcopy.go(2 hunks)bundle/manifests/tech-preview/operator.openshift.io_mustgathers.yaml(1 hunks)bundle/manifests/tech-preview/support-log-gather-operator.clusterserviceversion.yaml(1 hunks)controllers/mustgather/mustgather_controller.go(2 hunks)controllers/mustgather/mustgather_controller_test.go(1 hunks)controllers/mustgather/template.go(5 hunks)controllers/mustgather/template_test.go(3 hunks)deploy/99_must-gather-operator.Deployment.yaml(1 hunks)deploy/crds/operator.openshift.io_mustgathers.yaml(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
- bundle/manifests/tech-preview/support-log-gather-operator.clusterserviceversion.yaml
- controllers/mustgather/template_test.go
- controllers/mustgather/template.go
- README.md
- controllers/mustgather/mustgather_controller_test.go
- bundle/manifests/tech-preview/operator.openshift.io_mustgathers.yaml
- deploy/crds/operator.openshift.io_mustgathers.yaml
🔇 Additional comments (5)
api/v1alpha1/mustgather_types.go (2)
28-30: LGTM! XValidation rules correctly enforce HCP invariants.The struct-level validations properly enforce that:
- When
mustGatherImageisacm_hcp, hosted cluster fields are required and audit must be false- When
mustGatherImageis notacm_hcp, hosted cluster fields must be absentThis prevents invalid configurations at the API layer.
82-92: LGTM! HostedClusterOptions struct is well-defined.The required validations on both fields align with the XValidation rules at the spec level and ensure that hosted cluster information is complete when specified.
controllers/mustgather/mustgather_controller.go (2)
98-102: LGTM! Validation is correctly placed before finalizer handling.Early validation prevents invalid MustGather resources from entering the reconciliation flow, ensuring clean error handling.
299-318: No issues found. The review comment is factually incorrect on multiple counts.The script results definitively show:
acmHcpMustGatherImage constant IS defined in
controllers/mustgather/template.go:55as"acm_hcp". The concern that it "is not visible in the provided code" is incorrect.Only ONE IsValid method exists in the controller file (
mustgather_controller.go:299-318). The concern about a "duplicate IsValid implementation within this file" is incorrect. (The IsValid method found in vendored code is from an external dependency, not a duplicate within the controller.)The validation logic may or may not duplicate XValidation rules in the API types, but the factual errors regarding the constant definition and duplicate method implementation invalidate the core premises of this review comment.
Likely an incorrect or invalid review comment.
api/v1alpha1/zz_generated.deepcopy.go (1)
12-25: LGTM! Generated deepcopy methods are correct.The auto-generated deepcopy logic properly handles the new
HostedClusterOptionstype and its integration intoMustGatherSpec, including appropriate nil checks for the pointer field.Also applies to: 119-123
|
Thanks @iamkirkbater, I've incorporated your suggestions |
|
/retest |
| - name: DEFAULT_MUST_GATHER_IMAGE | ||
| value: quay.io/openshift/origin-must-gather:latest | ||
| - name: ACM_HCP_MUST_GATHER_IMAGE | ||
| value: registry.redhat.io/multicluster-engine/must-gather-rhel9:v2.8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know a ton about this image - should this be pointing at quay.io for OCP deployments?
@TrilokGeer @vaidehi411 @swghosh @shivprakashmuley - any thoughts here? Would love to lean on your expertise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are typically placeholder images, which get replaced with acceptable production grade images (i.e. generally registry.redhat.io ones) once building through our ART pipelines.
| - name: DEFAULT_MUST_GATHER_IMAGE | ||
| value: quay.io/openshift/origin-must-gather:latest | ||
| - name: ACM_HCP_MUST_GATHER_IMAGE | ||
| value: registry.redhat.io/multicluster-engine/must-gather-rhel9:v2.8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question about this image here.
|
Overall I think this looks good - I'd love to get a review from someone on the must gather team though, especially with the open questions about using the registry.redhat.io images and whether that's something that would work on all OCP environments they need to account for. @TrilokGeer @vaidehi411 @swghosh @shivprakashmuley if any of you could take a look and make sure this aligns with your expectations going forward? |
|
@rolandmkunkel: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
We have decided to perform the must-gather locally in CAD without MGO instead, so I'm no longer going to move forward with this PR. I'm still happy to work with you in order to get this merged, if this contribution is still helpful for you. |
|
@rolandmkunkel thank you for your PR! We're looking into the specifics and security aspects of implementing custom must-gather images with this operator, which is still in an early evaluation phase. Likely, a custom must-gather image with scripts to collect data specific to HostedCluster(s) should help with this specific use case, IMHO. Also, given that we'd really appreciate if you or folks from HCM team be part of the feedback loop for us in this. |
|
@rolandmkunkel @iamkirkbater Could either of you please help read openshift/enhancements#1906 enhancement and let us know if the specific use case being tackled through this PR can be covered through support for custom must-gather images too, thanks in advance! |
|
@swghosh Yes, I believe the enhancement covers this use case as well. As stated previously, we currently do not require this implementation at this time. I'm going to go ahead and close this PR. Thank you for following this up! |
This change introduces the capability to run a must-gather against an ACM (Advanced Cluster Management) Hosted Control Plane cluster. This required adding a mechanism to select a different must-gather image and provide specific arguments for it.
Key changes include:
API Extension: The MustGather CRD spec has been updated with three new fields:
mustGatherImage: An enum that allows selecting the type of must-gather to run. It supports "default" (the existing behavior) and "acm_hcp" for hosted clusters.
hostedClusterName: The name of the hosted cluster.
hostedClusterNamespace: The namespace of the hosted cluster.