-
Notifications
You must be signed in to change notification settings - Fork 535
MG-155: Proposal to add custom must-gather image option in MustGather Spec. #1906
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MG-155: Proposal to add custom must-gather image option in MustGather Spec. #1906
Conversation
|
@shivprakashmuley: This pull request references MG-155 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@shivprakashmuley: This pull request references MG-155 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@shivprakashmuley: This pull request references MG-155 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
952fbd3 to
32a4048
Compare
32a4048 to
ca603c4
Compare
|
|
||
| 1. **Create Roles:** A cluster-admin creates the `Roles` or `ClusterRoles` that contain the permissions required for various diagnostic tasks. | ||
| 2. **Create ServiceAccounts and Bindings:** The admin creates long-lived `ServiceAccount`s and binds each one to the appropriate role using a `RoleBinding` or `ClusterRoleBinding`. | ||
| 3. **Define Allowlist via `ImageStream`:** The admin creates `ImageStream`s in operator's namespace, where each `ImageStreamTag` points to an allowed custom image URL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this a well defined imagestream with a pre-defined name we agree upon? how do we know which imagestream to look for? or are we assuming there will be only one imagestream, so we just pick that up and use it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can keep the parameter ImageStreamTag in <ImageStream_name>:<Tag_name> format.
Multiple ImageStreams can be created in the operator namespace.
By parsing ImageStreamTag we can get hold of which ImageStream to look for and the respective tag.
|
|
||
| **Part 2: User Request and Operator Execution** | ||
|
|
||
| 1. **User Request:** A user creates a `MustGather` CR, setting the new `spec.imageStreamTag` field to an allowed tag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we allow multiple custom images to be specified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will allow one custom image per CR. It will be easier to maintain the success/failures of each image run.
| namespace: team-a-namespace | ||
| spec: | ||
| # Reference to the allowed image via the new field | ||
| imageStreamTag: "network-debug-tools:v1.2" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will look for an imagestream tag in an imagestream in the namespace of the CR. how do we map it to fetch the imagestream tag from the must-gather-operator namespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can recommend to create all imagestreams in operator namespace only. Operator (in reconcile logic) will only look for imagestreams in operator namespace.
Please let me know if you see issues with this approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so operator will resolve the imagestream tag and use that image to create the pod?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. Operator will resolve imagestream tag, fetch corresponding image - if valid - create the must-gather pod with that image.
| // +kubebuilder:validation:Optional | ||
| // ImageStreamTag is the new field to specify a custom image from the allowlist. | ||
| ImageStreamTag string `json:"imageStreamTag,omitempty"` | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to factor in some field for letting users choose some custom arguments for the custom must-gather images that they trigger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's mainly so we can conclusively support these two use cases:
- IIRC, in the past enhancement discussion we wanted to introduce an
additionalConfigstruct that allows customizing the default platform must-gather into running/usr/bin/gather_audit - HyperShift custom must-gather dumping i.e. SREP-1884: Add support for ACM Hosted Control Plane (HCP) must-gathers must-gather-operator#296 (comment) requires the ability to specify arbitrary config value for
HostedClusterOptions{Namespace, Name}, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explicitly for audit or /usr/bin/gather_sudit_logs we have a separate API field in the spec.
https://github.com/openshift/must-gather-operator/blob/master/controllers/mustgather/template.go#L29
https://github.com/openshift/must-gather-operator/blob/master/controllers/mustgather/template.go#L27
HyperShift custom must-gather dumping i.e. openshift/must-gather-operator#296 (comment) requires the ability to specify arbitrary config value for HostedClusterOptions{Namespace, Name}, etc.
yes we can add a new struct additionalConfig for any other custom arguments.
Should we do that as part of the same enhancement or a new one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable IMHO to add this in the same enhancement, as I'd assume many custom must-gather images could benefit from custom args.
|
|
||
| **Part 2: User Request and Operator Execution** | ||
|
|
||
| 1. **User Request:** A user creates a `MustGather` CR, setting the new `spec.imageStreamTag` field to an allowed tag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now with the introduction of imageStreamTag, how is the default platform must-gather image (that we ship/referred to be known in our operator) handled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, am assuming we treat the "default platform must-gather image" as an exception and don't change the behavior for it.
Still we need to support the images in https://github.com/openshift/oc/blob/main/pkg/cli/admin/mustgather/mustgather.go#L166 i.e. the custom must-gather images for each operator that is installed on the cluster and that ships a well trusted gather image. It can be very cumbersome for a cluster admin to have an imageStreamTag against each operator image especially when operators are updated the tags/images can change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, am assuming we treat the "default platform must-gather image" as an exception and don't change the behavior for it.
imageStreamTag will be an optional parameter. If user does not specify it, operator will fall back to default must-gather image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still we need to support the images in https://github.com/openshift/oc/blob/main/pkg/cli/admin/mustgather/mustgather.go#L166 i.e. the custom must-gather images for each operator that is installed on the cluster and that ships a well trusted gather image. It can be very cumbersome for a cluster admin to have an imageStreamTag against each operator image especially when operators are updated the tags/images can change.
There are some ways we can tackle this:
- For this we can introduce another API field, similar
--all-images. If that is set operator will look for the must-gather annotation and createimagestreams for those images instead of an admin. - As part of operator initialization/deployment operator scans and discovers these annotations and create separate imagestream or create a single imagestream and add these operator-must-gather images as a unique tag under one imagestream.
What we will save by this is, admin does not have toc reate imagestream(s) for these operator-must-gather images.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still we need to support the images in https://github.com/openshift/oc/blob/main/pkg/cli/admin/mustgather/mustgather.go#L166 i.e. the custom must-gather images for each operator that is installed on the cluster and that ships a well trusted gather image. It can be very cumbersome for a cluster admin to have an imageStreamTag against each operator image especially when operators are updated the tags/images can change.
There are some ways we can tackle this:
- For this we can introduce another API field, similar
--all-images. If that is set operator will look for the must-gather annotation and createimagestreams for those images instead of an admin.- As part of operator initialization/deployment operator scans and discovers these annotations and create separate imagestream or create a single imagestream and add these operator-must-gather images as a unique tag under one imagestream.
What we will save by this is, admin does not have toc reate imagestream(s) for these operator-must-gather images.
Can we first find out if there is a need to do this? i.e, do current users of the operator actually require this feature. I would like to start with just supporting collection of must-gather through specifying the custom images explicitly
5a9bb15 to
12cd8ba
Compare
| // ... existing fields ... | ||
| // +kubebuilder:validation:Optional | ||
| // ImageStreamTag is the new field to specify a custom image from the allowlist. | ||
| ImageStreamTag string `json:"imageStreamTag,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imageStreamRef with {Name, Tag, ..} would be more appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or generally, a LocalObjectReference is a good choice for this type of references as this keep future extensibility options open.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imageStreamRef
(eg. our CI release contoller uses imageStreamRef)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion @swghosh. I have included it in the proposal now.
| - https://issues.redhat.com/browse/MG-155 | ||
| --- | ||
|
|
||
| # Enhancement Proposal: Custom Must-Gather Images |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| # Enhancement Proposal: Custom Must-Gather Images | |
| # Custom must-gather Images for Support Log Gather operator |
| type AdditionalConfig struct { | ||
| // +kubebuilder:validation:Optional | ||
| // Metrics is an example field that could specify whether to collect Prometheus metrics. | ||
| Metrics bool `json:"metrics,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as the intent here was to support arbitrary parameter values, would you think we should opt for an unstructured field instead?
eg. runtime.RawExtension to allow any arbitrary key structs being placed in here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding was, if we want to add/support any specific functionality from operator side (for which changes are needed in operator space) will go under AdditionalConfig, in this case then I felt structured field would make sense. For custom args anyway we will have args parameter, that can have any value depending the custom image requirements.
Do you feel we should support custom/arbitrary fields under additionalConfig as well? @swghosh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO sounds fair, we can take that approach. But in case we fully want to support the least customisation on the custom gather images, I guess we should atleast provide for
struct{ GatherCommand string, Args []string }
that way we won't need to jump to arbitrary unstructured immediately, yet provide room for enough customisation (args and command is something our oc adm must-gather cli already provides).
| // +kubebuilder:validation:Optional | ||
| // Command allows overriding the command and/or arguments for the custom must-gather image. | ||
| // This field is ignored if ImageStreamRef is not specified. | ||
| Command *GatherCommand `json:"command,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Command *GatherCommand `json:"command,omitempty"` | |
| Command *GatherCommand `json:"gatherCommand,omitempty"` |
swghosh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^ small nit, otherwise LGTM
|
/lgtm |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED Approval requirements bypassed by manually added approval. This pull-request has been approved by: swghosh The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
@shivprakashmuley: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
This PR introduces an enhancement proposal to support the use of custom must-gather images with the must-gather-operator. This feature allows cluster administrators to define an allowlist of trusted images that can be used for diagnostic data collection, providing a secure and flexible way to gather specialized information.
Summary:
The proposal introduces two main API changes:
A new
MustGatherImage CRD: This cluster-scoped resource acts as an allowlist for custom must-gather images. Cluster administrators can manage this resource to control which images are permitted to run in the cluster.An update to the
MustGather CRD: The MustGather CRD is extended with an optional mustGatherImage field. When creating a MustGather resource, users can specify an image from the allowlist to be used for the data collection job.The must-gather-operator's logic is updated to validate the mustGatherImage against the MustGatherImage allowlist. If the image is valid, the operator will use it to run the must-gather job. If the image is not in the allowlist, or if the allowlist is not configured, the MustGather resource's status will be updated with an error. If no custom image is specified, the operator will use the default must-gather image, ensuring backward compatibility.
User-Facing Changes:
Cluster administrators can now create and manage a
MustGatherImageresource to control the use of custom must-gather images.Users can specify a custom must-gather image in the
MustGather CR, provided it is in the allowlist.JIRA tracker:
https://issues.redhat.com/browse/MG-155