-
Notifications
You must be signed in to change notification settings - Fork 33
OCPBUGS-63617: Fix Prevent conflicting feature gate settings in catalogd deployment #147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@jianzhangbjz: This pull request references Jira Issue OCPBUGS-63617, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@jianzhangbjz: This pull request references Jira Issue OCPBUGS-63617, which is valid. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Hi @tmshort , could you help take a look when you get a chance? Thanks! |
|
Failed to test it since https://redhat-internal.slack.com/archives/CNHC2DK2M/p1761643892932239 |
|
/retest-required |
1 similar comment
|
/retest-required |
|
Test passed, details: https://issues.redhat.com/browse/OCPBUGS-63617 |
|
@jianzhangbjz: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
In what scenario did this happen? I suspect this happened because the experimental.yaml file enabled it, but it was disabled by the feature-gates. We should really modify that file to remove the feature gate configuration as well. |
tmshort
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch.
There's an optimization (that I strongly recommend) that can be done in ClearFeatureGates().
Even with this fix as a safety check, the experimental.yaml file in openshift/operator-framework-operator-controller should still be updated
pkg/helmvalues/helmvalues.go
Outdated
| // over any feature gates defined in helm values files | ||
| func (v *HelmValues) ClearFeatureGates() error { | ||
| // Clear operator-controller feature gates if they exist | ||
| if _, found, _ := unstructured.NestedStringSlice(v.values, strings.Split(EnableOperatorController, ".")...); found { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Retrieving the NestedStringSlice() is not necessary. Every one of these can simply be a SetNestedStringSlice().
pkg/helmvalues/helmvalues.go
Outdated
| if _, found, _ := unstructured.NestedStringSlice(v.values, strings.Split(EnableOperatorController, ".")...); found { | ||
| if err := unstructured.SetNestedStringSlice(v.values, []string{}, strings.Split(EnableOperatorController, ".")...); err != nil { | ||
| return err | ||
| } | ||
| } | ||
| if _, found, _ := unstructured.NestedStringSlice(v.values, strings.Split(DisableOperatorController, ".")...); found { | ||
| if err := unstructured.SetNestedStringSlice(v.values, []string{}, strings.Split(DisableOperatorController, ".")...); err != nil { | ||
| return err | ||
| } | ||
| } | ||
| // Clear catalogd feature gates if they exist | ||
| if _, found, _ := unstructured.NestedStringSlice(v.values, strings.Split(EnableCatalogd, ".")...); found { | ||
| if err := unstructured.SetNestedStringSlice(v.values, []string{}, strings.Split(EnableCatalogd, ".")...); err != nil { | ||
| return err | ||
| } | ||
| } | ||
| if _, found, _ := unstructured.NestedStringSlice(v.values, strings.Split(DisableCatalogd, ".")...); found { | ||
| if err := unstructured.SetNestedStringSlice(v.values, []string{}, strings.Split(DisableCatalogd, ".")...); err != nil { | ||
| return err | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be greatly simplified by using a string slice and iterating over that slice. Less chance of copy/paste errors, and the code is smaller, to boot.
| if _, found, _ := unstructured.NestedStringSlice(v.values, strings.Split(EnableOperatorController, ".")...); found { | |
| if err := unstructured.SetNestedStringSlice(v.values, []string{}, strings.Split(EnableOperatorController, ".")...); err != nil { | |
| return err | |
| } | |
| } | |
| if _, found, _ := unstructured.NestedStringSlice(v.values, strings.Split(DisableOperatorController, ".")...); found { | |
| if err := unstructured.SetNestedStringSlice(v.values, []string{}, strings.Split(DisableOperatorController, ".")...); err != nil { | |
| return err | |
| } | |
| } | |
| // Clear catalogd feature gates if they exist | |
| if _, found, _ := unstructured.NestedStringSlice(v.values, strings.Split(EnableCatalogd, ".")...); found { | |
| if err := unstructured.SetNestedStringSlice(v.values, []string{}, strings.Split(EnableCatalogd, ".")...); err != nil { | |
| return err | |
| } | |
| } | |
| if _, found, _ := unstructured.NestedStringSlice(v.values, strings.Split(DisableCatalogd, ".")...); found { | |
| if err := unstructured.SetNestedStringSlice(v.values, []string{}, strings.Split(DisableCatalogd, ".")...); err != nil { | |
| return err | |
| } | |
| } | |
| locationsToClear := []string{ | |
| EnableOperatorController, | |
| DisableOperatorController, | |
| EnableCatalogd, | |
| DisableCatalogd, | |
| } | |
| for _, s := range locationsToClear { | |
| if err := unstructured.SetNestedStringSlice(v.values, []string{}, strings.Split(s, ".")...); err != nil { | |
| return err | |
| } | |
| } |
No specific configuration. Just a fresh OCP 4.21 cluster. jiazha-mac:~ jiazha$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.21.0-0.nightly-2025-10-30-125804 True False 40m Cluster version is 4.21.0-0.nightly-2025-10-30-125804
jiazha-mac:long_duration jiazha$ oc get deploy -n openshift-operator-controller operator-controller-controller-manager -o yaml |grep feature-gates
- --feature-gates=WebhookProviderOpenshiftServiceCA=true
- --feature-gates=SingleOwnNamespaceInstallSupport=true
- --feature-gates=PreflightPermissions=true
- --feature-gates=WebhookProviderOpenshiftServiceCA=true
- --feature-gates=WebhookProviderCertManager=false
- --feature-gates=HelmChartSupport=false
- --feature-gates=BoxcutterRuntime=false
- --feature-gates=PreflightPermissions=false
- --feature-gates=SingleOwnNamespaceInstallSupport=false
- --feature-gates=WebhookProviderCertManager=false
jiazha-mac:long_duration jiazha$ oc get deploy catalogd-controller-manager -o yaml -n openshift-catalogd |grep feature-gates
- --feature-gates=APIV1MetasHandler=true
- --feature-gates=APIV1MetasHandler=false
jiazha-mac:~ jiazha$ oc get featuregate cluster -o yaml
apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
annotations:
include.release.openshift.io/self-managed-high-availability: "true"
creationTimestamp: "2025-10-31T06:15:54Z"
generation: 1
name: cluster
resourceVersion: "639"
uid: c041dbef-c894-41a4-93a9-8c4a7682b5a9
spec: {}
status:
featureGates:
- disabled:
- name: AWSClusterHostedDNS
- name: AWSClusterHostedDNSInstall
- name: AWSDedicatedHosts
- name: AWSDualStackInstall
- name: AWSServiceLBNetworkSecurityGroup
- name: AutomatedEtcdBackup
- name: AzureClusterHostedDNSInstall
- name: AzureDedicatedHosts
- name: AzureDualStackInstall
- name: AzureMultiDisk
- name: BootImageSkewEnforcement
- name: BootcNodeManagement
- name: CBORServingAndStorage
- name: ClientsAllowCBOR
- name: ClientsPreferCBOR
- name: ClusterAPIInstall
- name: ClusterAPIInstallIBMCloud
- name: ClusterAPIMachineManagementVSphere
- name: ClusterMonitoringConfig
- name: ClusterVersionOperatorConfiguration
- name: DNSNameResolver
- name: DualReplica
- name: DyanmicServiceEndpointIBMCloud
- name: DynamicResourceAllocation
- name: EtcdBackendQuota
- name: EventTTL
- name: EventedPLEG
- name: Example
- name: Example2
- name: ExternalSnapshotMetadata
- name: GCPClusterHostedDNS
- name: GCPCustomAPIEndpoints
- name: GCPCustomAPIEndpointsInstall
- name: GCPDualStackInstall
- name: ImageModeStatusReporting
- name: ImageStreamImportMode
- name: IngressControllerDynamicConfigurationManager
- name: InsightsConfig
- name: InsightsOnDemandDataGather
- name: IrreconcilableMachineConfig
- name: KMSEncryptionProvider
- name: MachineAPIMigration
- name: MachineAPIOperatorDisableMachineHealthCheckController
- name: ManagedBootImagesCPMS
- name: MaxUnavailableStatefulSet
- name: MinimumKubeletVersion
- name: MixedCPUsAllocation
- name: MultiArchInstallAzure
- name: MultiDiskSetup
- name: MutableCSINodeAllocatableCount
- name: MutatingAdmissionPolicy
- name: NewOLMCatalogdAPIV1Metas
- name: NewOLMOwnSingleNamespace
- name: NewOLMPreflightPermissionChecks
- name: NoRegistryClusterOperations
- name: NutanixMultiSubnets
- name: OSStreams
- name: OVNObservability
- name: SELinuxMount
- name: ShortCertRotation
- name: SignatureStores
- name: SigstoreImageVerificationPKI
- name: TranslateStreamCloseWebsocketRequests
- name: VSphereConfigurableMaxAllowedBlockVolumesPerNode
- name: VSphereHostVMGroupZonal
- name: VSphereMixedNodeEnv
- name: VolumeGroupSnapshot
enabled:
- name: AdditionalRoutingCapabilities
- name: AdminNetworkPolicy
- name: AlibabaPlatform
- name: AzureWorkloadIdentity
- name: BuildCSIVolumes
- name: CPMSMachineNamePrefix
- name: ConsolePluginContentSecurityPolicy
- name: ExternalOIDC
- name: ExternalOIDCWithUIDAndExtraClaimMappings
- name: GCPClusterHostedDNSInstall
- name: GatewayAPI
- name: GatewayAPIController
- name: HighlyAvailableArbiter
- name: ImageVolume
- name: KMSv1
- name: MachineConfigNodes
- name: ManagedBootImages
- name: ManagedBootImagesAWS
- name: ManagedBootImagesAzure
- name: ManagedBootImagesvSphere
- name: MetricsCollectionProfiles
- name: NetworkDiagnosticsConfig
- name: NetworkLiveMigration
- name: NetworkSegmentation
- name: NewOLM
- name: NewOLMWebhookProviderOpenshiftServiceCA
- name: OpenShiftPodSecurityAdmission
- name: PinnedImages
- name: PreconfiguredUDNAddresses
- name: ProcMountType
- name: RouteAdvertisements
- name: RouteExternalCertificate
- name: ServiceAccountTokenNodeBinding
- name: SigstoreImageVerification
- name: StoragePerformantSecurityPolicy
- name: UpgradeStatus
- name: UserNamespacesPodSecurityStandards
- name: UserNamespacesSupport
- name: VSphereMultiDisk
- name: VSphereMultiNetworks
- name: VolumeAttributesClass
version: 4.21.0-0.nightly-2025-10-30-125804 |
b0ce393 to
4f89f16
Compare
4f89f16 to
be362d1
Compare
|
Hi @tmshort , I have updated it based on your comments, thanks very much! And, I don't think we need to explicitly set |
|
Test passed, details: 1. Build an OCP cluster with this unmerged PR via the cluster-bot
jiazha-mac:~ jiazha$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.21.0-0-2025-10-31-101545-test-ci-ln-97i27lk-latest True False 6m32s Cluster version is 4.21.0-0-2025-10-31-101545-test-ci-ln-97i27lk-latest
jiazha-mac:~ jiazha$ oc get featuregate cluster -o yaml
apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
annotations:
include.release.openshift.io/self-managed-high-availability: "true"
creationTimestamp: "2025-10-31T10:34:21Z"
generation: 1
name: cluster
resourceVersion: "701"
uid: 7b745465-5fc0-4456-822e-38cbdaeb65a7
spec: {}
status:
featureGates:
- disabled:
- name: AWSClusterHostedDNS
- name: AWSClusterHostedDNSInstall
- name: AWSDedicatedHosts
- name: AWSDualStackInstall
- name: AWSServiceLBNetworkSecurityGroup
- name: AutomatedEtcdBackup
- name: AzureClusterHostedDNSInstall
- name: AzureDedicatedHosts
- name: AzureDualStackInstall
- name: AzureMultiDisk
- name: BootImageSkewEnforcement
- name: BootcNodeManagement
- name: CBORServingAndStorage
- name: ClientsAllowCBOR
- name: ClientsPreferCBOR
- name: ClusterAPIInstall
- name: ClusterAPIInstallIBMCloud
- name: ClusterAPIMachineManagementVSphere
- name: ClusterMonitoringConfig
- name: ClusterVersionOperatorConfiguration
- name: DNSNameResolver
- name: DualReplica
- name: DyanmicServiceEndpointIBMCloud
- name: DynamicResourceAllocation
- name: EtcdBackendQuota
- name: EventTTL
- name: EventedPLEG
- name: Example
- name: Example2
- name: ExternalSnapshotMetadata
- name: GCPClusterHostedDNS
- name: GCPCustomAPIEndpoints
- name: GCPCustomAPIEndpointsInstall
- name: GCPDualStackInstall
- name: ImageModeStatusReporting
- name: ImageStreamImportMode
- name: IngressControllerDynamicConfigurationManager
- name: InsightsConfig
- name: InsightsOnDemandDataGather
- name: IrreconcilableMachineConfig
- name: KMSEncryptionProvider
- name: MachineAPIMigration
- name: MachineAPIOperatorDisableMachineHealthCheckController
- name: ManagedBootImagesCPMS
- name: MaxUnavailableStatefulSet
- name: MinimumKubeletVersion
- name: MixedCPUsAllocation
- name: MultiArchInstallAzure
- name: MultiDiskSetup
- name: MutableCSINodeAllocatableCount
- name: MutatingAdmissionPolicy
- name: NewOLMCatalogdAPIV1Metas
- name: NewOLMOwnSingleNamespace
- name: NewOLMPreflightPermissionChecks
- name: NoRegistryClusterOperations
- name: NutanixMultiSubnets
- name: OSStreams
- name: OVNObservability
- name: SELinuxMount
- name: ShortCertRotation
- name: SignatureStores
- name: SigstoreImageVerificationPKI
- name: TranslateStreamCloseWebsocketRequests
- name: VSphereConfigurableMaxAllowedBlockVolumesPerNode
- name: VSphereHostVMGroupZonal
- name: VSphereMixedNodeEnv
- name: VolumeGroupSnapshot
enabled:
- name: AdditionalRoutingCapabilities
- name: AdminNetworkPolicy
- name: AlibabaPlatform
- name: AzureWorkloadIdentity
- name: BuildCSIVolumes
- name: CPMSMachineNamePrefix
- name: ConsolePluginContentSecurityPolicy
- name: ExternalOIDC
- name: ExternalOIDCWithUIDAndExtraClaimMappings
- name: GCPClusterHostedDNSInstall
- name: GatewayAPI
- name: GatewayAPIController
- name: HighlyAvailableArbiter
- name: ImageVolume
- name: KMSv1
- name: MachineConfigNodes
- name: ManagedBootImages
- name: ManagedBootImagesAWS
- name: ManagedBootImagesAzure
- name: ManagedBootImagesvSphere
- name: MetricsCollectionProfiles
- name: NetworkDiagnosticsConfig
- name: NetworkLiveMigration
- name: NetworkSegmentation
- name: NewOLM
- name: NewOLMWebhookProviderOpenshiftServiceCA
- name: OpenShiftPodSecurityAdmission
- name: PinnedImages
- name: PreconfiguredUDNAddresses
- name: ProcMountType
- name: RouteAdvertisements
- name: RouteExternalCertificate
- name: ServiceAccountTokenNodeBinding
- name: SigstoreImageVerification
- name: StoragePerformantSecurityPolicy
- name: UpgradeStatus
- name: UserNamespacesPodSecurityStandards
- name: UserNamespacesSupport
- name: VSphereMultiDisk
- name: VSphereMultiNetworks
- name: VolumeAttributesClass
version: 4.21.0-0-2025-10-31-101545-test-ci-ln-97i27lk-latest
2, Check those feature-gates settings, no explicitly set --feature-gates=FeatureNamexx=false, LGTM.
jiazha-mac:~ jiazha$ oc get deploy catalogd-controller-manager -o yaml -n openshift-catalogd |grep feature-gates
jiazha-mac:~ jiazha$
jiazha-mac:~ jiazha$ oc get deploy -n openshift-operator-controller operator-controller-controller-manager -o yaml |grep feature-gates
- --feature-gates=WebhookProviderOpenshiftServiceCA=true
- --feature-gates=WebhookProviderCertManager=false
-
3. Enable TP.
jiazha-mac:cluster-olm-operator jiazha$ oc patch featuregate cluster -p '{"spec": {"featureSet": "TechPreviewNoUpgrade"}}' --type=merge
featuregate.config.openshift.io/cluster patched
4, Check those feature-gates settings, LGTM.
jiazha-mac:cluster-olm-operator jiazha$ oc get deploy catalogd-controller-manager -o yaml -n openshift-catalogd |grep feature-gates
- --feature-gates=APIV1MetasHandler=true
jiazha-mac:cluster-olm-operator jiazha$ oc get deploy -n openshift-operator-controller operator-controller-controller-manager -o yaml |grep feature-gates
- --feature-gates=PreflightPermissions=true
- --feature-gates=SingleOwnNamespaceInstallSupport=true
- --feature-gates=WebhookProviderOpenshiftServiceCA=true
- --feature-gates=WebhookProviderCertManager=false |
|
/verified by @jianzhangbjz |
|
@jianzhangbjz: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
tmshort
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a Slack discussion regarding the GA of WebHook enablement, and the need to explicitly disable feature-gates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, we do want to explicitly disable (regardless of Kubernetes "best practices"), as this allows us to have features GA'd upstream and not downstream. That was the whole point of #144.
We need to explicitly disable WebHookCertManager, otherwise, it takes precedence over WebHookOpenshiftServiceCA (although you did leave that disablement in).
Please revert mapper.go and mapper_test.go.
I hate to harp on this, but this was done on purpose due to the Fundamentally, this mechanism allows operator-framework to GA features upstream without doing it upstream, and allows CustomNoUpgrade to work as expected for features that are GA'd, but still have a feature flag (because features can still be disabled). |
|
Alternative in #148 that keeps the |
|
Closed this PR since
|
|
@jianzhangbjz: This pull request references Jira Issue OCPBUGS-63617. The bug has been updated to no longer refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@jianzhangbjz: This pull request references Jira Issue OCPBUGS-63617, which is valid. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: jianzhangbjz The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
be362d1 to
02e796a
Compare
|
/retest-required |
02e796a to
330728c
Compare
|
/test openshift-e2e-aws-techpreview |
|
@jianzhangbjz: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
Closed it since #148 has fixed this issue. |
|
@jianzhangbjz: This pull request references Jira Issue OCPBUGS-63617. The bug has been updated to no longer refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
PR needs rebase. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Fix Conflicting Feature Gate Settings in Catalogd
Problem
APIV1MetasHandlerwas set to both true and false due to helm valuesfiles conflicting with cluster feature gate configuration.
Solution
Clear helm values file feature gates before applying cluster configuration.
This ensures cluster-driven settings take precedence.
Changes
ClearFeatureGates()to helmvalues packageClearFeatureGates()before merging cluster config in helm.goResult
Feature gates are now set correctly based on cluster configuration:
--feature-gates=APIV1MetasHandler=false--feature-gates=APIV1MetasHandler=trueAssisted-by: Claude Code