SPLAT-2137: Support Security Group on NLB for Default router on AWS#1802

Open
mtulio wants to merge 8 commits into openshift:master from mtulio:SPLAT-2137

@mtulio
Contributor

@mtulio mtulio commented May 20, 2025

Enhancement proposal to add Security Group support for Service type-LoadBalancer NLBs to the AWS Cloud Controller Manager (CCM), and to ensure OpenShift sets the default configuration that tells the CCM to manage Security Groups for new NLBs.

https://issues.redhat.com/browse/OCPSTRAT-1553
https://issues.redhat.com/browse/SPLAT-2137

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 20, 2025
@openshift-ci-robot

openshift-ci-robot commented May 20, 2025

@mtulio: This pull request references SPLAT-2137 which is a valid jira issue.

Details

In response to this:

https://issues.redhat.com/browse/OCPSTRAT-1553
https://issues.redhat.com/browse/SPLAT-2137

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 20, 2025
@openshift-ci
Contributor

openshift-ci bot commented May 20, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@mtulio
Contributor Author

mtulio commented May 20, 2025

/test all

@bchandra-ocp bchandra-ocp left a comment

Thanks for sharing this enhancement - it's great to see this progress ahead.

I'm just starting to review but had basic questions on the summary so want to wait before I proceed.

Contributor

@elmiko elmiko left a comment

this is generally making sense to me, i've left some comments and questions.

- Configure Ingress rules in the Security Group to allow traffic on the ports defined in the Service's `spec.ports`. The source for these rules will be determined by the `service.beta.kubernetes.io/load-balancer-source-ranges` annotation on the Service (if present, otherwise default to allowing from all IPs).
- Configure Egress rules in the Security Group to allow traffic to the backend pods on the targetPort specified in the Service's `spec.ports` and the health check port. Initially, this should be restricted to the cluster's VPC CIDR or the specific CIDRs of the worker nodes.
- When creating the NLB using the AWS ELBv2 API, the CCM will include the ID of the newly created Security Group in the `SecurityGroups` parameter of the `CreateLoadBalancerInput`.
- When the Service is deleted, the CCM will also delete the associated Security Group, ensuring proper cleanup.
Contributor

what happens if this annotation is added after the Service is created? (ie what happens on update)

Contributor Author

I am working on it. I will check the current behavior of the CCM alongside the ALBC so it is documented correctly. Thanks for raising that question.

Contributor

We need to be able to answer these questions for upstream, but downstream we could prevent those transitions with VAP
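
For readers following this thread, the ingress-rule bullet in the hunk above could be sketched roughly as below. This is only an illustration under assumptions: `IngressRule`, `ServicePort`, and `buildIngressRules` are hypothetical names, not types or functions from the CCM or the AWS SDK, and the fallback to `0.0.0.0/0` simply mirrors the "otherwise default to allowing from all IPs" wording.

```go
package main

import (
	"fmt"
	"sort"
)

// IngressRule is a hypothetical, simplified stand-in for a security group
// ingress permission. It is not a type from the AWS SDK or the CCM.
type IngressRule struct {
	Protocol string
	Port     int32
	CIDR     string
}

// ServicePort mirrors only the Service spec.ports fields needed here.
type ServicePort struct {
	Protocol string
	Port     int32
}

// buildIngressRules returns one rule per (port, source CIDR) pair.
// With no source ranges it falls back to all IPv4 addresses (assumption
// taken from the proposal text, not from existing CCM code).
func buildIngressRules(ports []ServicePort, sourceRanges []string) []IngressRule {
	if len(sourceRanges) == 0 {
		sourceRanges = []string{"0.0.0.0/0"}
	}
	rules := make([]IngressRule, 0, len(ports)*len(sourceRanges))
	for _, p := range ports {
		for _, cidr := range sourceRanges {
			rules = append(rules, IngressRule{Protocol: p.Protocol, Port: p.Port, CIDR: cidr})
		}
	}
	// Sort by port, then CIDR, so repeated reconciliations compare equal.
	sort.Slice(rules, func(i, j int) bool {
		if rules[i].Port != rules[j].Port {
			return rules[i].Port < rules[j].Port
		}
		return rules[i].CIDR < rules[j].CIDR
	})
	return rules
}

func main() {
	ports := []ServicePort{{Protocol: "TCP", Port: 443}, {Protocol: "TCP", Port: 80}}
	for _, r := range buildIngressRules(ports, nil) {
		fmt.Printf("%s %d from %s\n", r.Protocol, r.Port, r.CIDR)
	}
}
```

On update, a controller could recompute this list and diff it against the rules currently on the Security Group, which is one way to reason about the "what happens on update" question raised here.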


- The CCM's service controller will watch for Service creations and updates.
- When it encounters a Service with the annotation `service.beta.kubernetes.io/aws-load-balancer-managed-security-group: "true"` and `service.beta.kubernetes.io/aws-load-balancer-type: nlb`, the CCM will:
- Create a new AWS Security Group for the NLB. The name should follow a convention like `k8s-elb-a<generated-name-from-service-uid>`.
Contributor Author

The name should follow a convention like `k8s-elb-a

This is an interesting point: the convention the CCM uses to name NLBs created from Services differs from the ALBC's, which follows the pattern: k8s-<namespace>-<service_name>-<id>

Contributor Author

Furthermore, I see NLB tags aren't standardized either:
CCM:

kubernetes.io/cluster/clusterID: owned
kubernetes.io/service-name: namespace/service-name

ALBC:

elbv2.k8s.aws/cluster: clusterID
service.k8s.aws/resource: LoadBalancer
service.k8s.aws/stack: namespace/service-name

Contributor Author

Question to @JoelSpeed @elmiko - do we want to standardize the NLB Tags between controllers too?

IIUC kubernetes.io/cluster/clusterID: owned was not added in my ALBC exploration because the service was created by ALBO/ALBC which seems not to enforce cluster tags.

region: us-east-1
lbType: NLB <-- deprecate by platform.aws.ingressController.loadBalancerType?
ingressController: <-- proposing to aggregate CIO configurations
securityGroupEnabled: True <-- new field
Contributor

What if I want to have different security groups for ingress vs the rest of the cluster? Is that possible?

Do we need the option for this to be automatic (use the same as you'd expect for default) but also a BYO option where users can specify specific SG IDs to be used?

Contributor Author

What if I want to have different security groups for ingress vs the rest of the cluster? Is that possible?

Would you mind elaborating? I am not sure I followed correctly, as the proposal already adds a dedicated SG to the NLB, apart from the rest of the cluster.

Do we need the option for this to be automatic (use the same as you'd expect for default) but also a BYO option where users can specify specific SG IDs to be used?

That's a fair point, but I am not sure we have a customer use case for BYO SG on CIO, and I wonder if supporting BYO SG would diverge from the main focus of this EP: enabling NLB with a security group.

BYO SG would increase the implementation scope a bit, especially in the CCM. IIUC, by definition, when SG IDs are added (BYO SG) through annotations, the CCM (Classic LB) or the ALBC won't manage those SGs' lifecycle. The ALBC also provides an extra annotation (manage-backend-security-group-rules) to allow managing node rules:

If you specify this annotation, you need to configure the security groups on your Node/Pod to allow inbound traffic from the load balancer. You could also set the manage-backend-security-group-rules if you want the controller to manage the access rule

So what we are targeting is to provide the initial ability to enable SG on NLB, similar to how the CCM deploys CLB by default, as requested by managed services. I am thinking that any additional feature/parity with ALBC would fall into the long-term planning we've been discussing with PMs. Do you think we could phase it? Thoughts?

Contributor Author

in the latest version I added the BYO SG workflow as a later phase as opt-in to the Service object, removing the installer/CIO option/API.

annotations:
service.beta.kubernetes.io/aws-load-balancer-type: nlb
service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
service.beta.kubernetes.io/aws-load-balancer-managed-security-group: "true" <-- new annotation
Contributor

What does the annotation scheme look like in the AWS LBC? I thought it just allowed you to specify IDs

I think the upstream change to the CCM wants to mimic the behaviour described in https://kubernetes-sigs.github.io/aws-load-balancer-controller/latest/guide/service/annotations/#security-groups

Is our described behaviour here compatible with that, if not, have we deliberately deviated from that pattern?

Contributor Author

Is our described behaviour here compatible with that

It is not, proposed annotation service.beta.kubernetes.io/aws-load-balancer-managed-security-group is not the same as BYO SG annotation. To recap BYO SG annotations:

on ALBC:

on CCM:

, if not, have we deliberately deviated from that pattern?

Yes, it intentionally proposes a new annotation to signal the CCM to manage the SG when the load balancer is an NLB (allowing users to opt in to this config). It was added mainly to avoid changing the default behavior of the CCM when provisioning NLBs.

AFAICT the ALBC does not provide this option as it defaults to SG since v2.6.0 (Aug 10, 2023), and it's not possible to disable it (?).

Alternatively, I can see:

  • Changing explicitly the default behavior of NLB to always create SGs (do we want that?)

I believe we can converge to the thread https://github.com/openshift/enhancements/pull/1802/files#r2111532244 where you mentioned the transition and suggested configuration changes.

Contributor Author

In the latest version of this EP we are moving to a global configuration (cloud-config) for CCM, enforced in OpenShift by CCCMO, instead of a "managed" annotation as described above.

The BYO SG flow is also covered in a later phase of this EP, ensuring customers can opt-out the enforced managed SG on NLBs, following existing ALBC flow.


Comment on lines +262 to +266
// ServiceAnnotationLoadBalancerManagedSecurityGroup is the annotation used
// on the service to instruct the CCM to manage the security group when creating a Network Load Balancer. When enabled,
// the CCM creates the security group and its rules. This option cannot be used with the annotations
// "service.beta.kubernetes.io/aws-load-balancer-security-groups" and "service.beta.kubernetes.io/aws-load-balancer-extra-security-groups".
const ServiceAnnotationLoadBalancerManagedSecurityGroup = "service.beta.kubernetes.io/aws-load-balancer-managed-security-group"
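
The mutual-exclusion rule stated in the comment above could be checked along these lines. This is a minimal sketch, not code from the PR: `validateSGAnnotations` is a hypothetical helper, and whether the CCM would reject or simply ignore the conflicting combination is not settled in this thread.

```go
package main

import "fmt"

const (
	// Annotation keys as quoted in the hunk above.
	annManagedSG = "service.beta.kubernetes.io/aws-load-balancer-managed-security-group"
	annSG        = "service.beta.kubernetes.io/aws-load-balancer-security-groups"
	annExtraSG   = "service.beta.kubernetes.io/aws-load-balancer-extra-security-groups"
)

// validateSGAnnotations is a hypothetical helper: it returns an error when
// the managed-security-group annotation is "true" and either of the
// user-provided security group annotations is also present.
func validateSGAnnotations(annotations map[string]string) error {
	if annotations[annManagedSG] != "true" {
		return nil
	}
	for _, key := range []string{annSG, annExtraSG} {
		if _, ok := annotations[key]; ok {
			return fmt.Errorf("%s cannot be combined with %s", annManagedSG, key)
		}
	}
	return nil
}

func main() {
	err := validateSGAnnotations(map[string]string{
		annManagedSG: "true",
		annSG:        "sg-0123", // placeholder ID for illustration only
	})
	fmt.Println(err)
}
```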
Contributor

So this doesn't exist in LBC right? Is this being introduced to allow a transition from a CCM where it does not currently create a security group, to enabling users to opt-in to creating security groups?

Have you considered if it might be better to make this a CCM configuration that an admin would set for the cluster, rather than setting it for each service?

I could see in the future OpenShift changing the default to say that all new NLBs should have a security group created automatically for them

Contributor Author

@mtulio mtulio May 28, 2025

So this doesn't exist in LBC right? Is this being introduced to allow a transition from a CCM where it does not currently create a security group, to enabling users to opt-in to creating security groups?

Yes and yes. The idea was to avoid disrupting the existing flow when creating Services with NLB.

Have you considered if it might be better to make this a CCM configuration that an admin would set for the cluster, rather than setting it for each service?

I didn't, but this is an excellent idea. It would greatly reduce the number of API changes proposed in this EP, and it would also help us in the future if we transition to ALBC.

@elmiko mentioned requiring the CCM changes to be under a feature gate. What if we introduce a FG that enables SG by default when provisioning NLBs on CCM, so we can enable it on OCP and remove most of the API proposals and annotations in this EP?

It would also reduce the UX overhead and keep the focus on the initial problem.

Would the workflow be like the following options (superficially)?:

openshift-install:
- user sets `platform.aws.lbType` to `NLB` value (currently opt-in)
- CCM config is added on OCP deployments (do we need/expose it through installer manifest?)
- CCM creates SG when gate is enabled when provisioning NLB

ROSA Classic or HCP:
- ensure CCM config is updated (or will it be enabled by default in KCM when API FG is set?)
- (same CCM flow)

No changes in CIO.

Does that make sense?

Contributor Author

@mtulio mtulio Jun 4, 2025

I just finished the exploration, and this is the main idea (tl;dr):

    1. Create a new configuration in the cloud config (upstream CCM). Example
    2. Enforce the configuration in the 3CMO. Example

Once a new Service type-LoadBalancer NLB is created, the controller will manage a Security Group, attaching it to the new LB.

Contributor

I think we want to follow the pattern set out in LBC (https://kubernetes-sigs.github.io/aws-load-balancer-controller/latest/deploy/security_groups/#security-groups-for-load-balancers)

Which means:

  • service.beta.kubernetes.io/aws-load-balancer-security-groups on the service allows a user to specify a pre-existing set of security groups to attach to the front-end of the LB
  • If the annotation is not set, create and manage a front-end security group for each LB automatically

We don't want to just enable this create and manage front-end LB by default, since that would be a major change.

So, this is where the CCM config option would come in, and allow users to opt-in/out of having a default security group created for each service.

I think that mostly aligns with your suggestions above in this thread, but I think we still want to have the annotation to allow the user to override the behaviour?

Do we need to also account for the shared backend SG behaviour of LBC?

Contributor Author

but I think we still want to have the annotation to allow the user to override the behaviour?

Are you referring to opting out of/overriding the global config that manages frontend SG creation (the proposal of this EP) without a BYO SG approach? Do we have a strong reason/use case to do so, considering the best practice/recommendation is to assign an SG to an NLB? I also wonder if we would be going against the ALBC strategy used in v2.6.0+ (I really didn't find a configuration to opt out of SG on NLBs in recent ALBC versions).

Do we need to also account for the shared backend SG behaviour of LBC?

I think it would be beneficial in clusters with a high number of services, but if we don't have a strong use case for it in the short term, I would not increase the number of features to incorporate into the CCM in this EP, as the long-term approach on OCP is TBD.

LMK WDYT +@elmiko

Contributor

once enabled, will the ccm try to create SGs for older LB services?

It is important not to change the old services, they should remain without the SG

is this saying we don't want to autocreate SGs once the feature is enabled?

We should not just blanketly change this in the upstream CCM, it needs to be introduced slowly and opt-in at first, later we may change the default though

Contributor Author

Joel and Mike - Thanks for your thoughts.

it needs to be introduced slowly and opt-in at first

ACK. My understanding is that the cloud-config flag covers that expectation upstream, and on OCP we can gate it until we consider it ready to default to SG enforced by 3CMO (current proposal).

Looks like we have a plan/scope defined for this EP. My takeaways from this thread and the Slack conversation are:

    1. we are introducing a global cloud-config on CCM that allows opting in to the managed SG by default across all Service type-LoadBalancer NLBs
    2. in later phase (still in this EP) we are introducing/enabling a BYO SG annotation on NLB, and this one will be available in the Service level (not planning to change CIO/Installer)
    3. We don't need an additional annotation to opt out of SG at the Service level
    4. We are not introducing backend/shared SG as it would be covered in long-term research, and it is not a use case we are working on in this EP

LMK if I missed something to wrap up this thread. Thanks!
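
To make the precedence in the takeaways above concrete, here is a rough sketch of how a controller might choose the security group mode for a newly created Service. It rests on several assumptions: `SGMode`, `decideSGMode`, and the `managedByDefault` parameter are hypothetical (the real cloud-config option name is not given in this thread), and parsing the BYO annotation as a comma-separated list of IDs is assumed from the ALBC convention rather than confirmed for the CCM.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	// Annotation keys as they appear in this conversation.
	annotationLBType = "service.beta.kubernetes.io/aws-load-balancer-type"
	annotationBYOSG  = "service.beta.kubernetes.io/aws-load-balancer-security-groups"
)

// SGMode is a hypothetical enum for how a new NLB gets its security group.
type SGMode string

const (
	SGModeUnchanged SGMode = "unchanged" // keep today's behavior
	SGModeManaged   SGMode = "managed"   // controller creates and owns the SG
	SGModeBYO       SGMode = "byo"       // user-provided SG IDs, not managed
)

// decideSGMode applies the agreed order for a NEW Service only:
// the BYO annotation wins, then the global opt-in, else nothing changes.
// managedByDefault stands in for the global cloud-config option.
func decideSGMode(annotations map[string]string, managedByDefault bool) (SGMode, []string) {
	if annotations[annotationLBType] != "nlb" {
		return SGModeUnchanged, nil
	}
	var ids []string
	for _, id := range strings.Split(annotations[annotationBYOSG], ",") {
		if id = strings.TrimSpace(id); id != "" {
			ids = append(ids, id)
		}
	}
	if len(ids) > 0 {
		return SGModeBYO, ids
	}
	if managedByDefault {
		return SGModeManaged, nil
	}
	return SGModeUnchanged, nil
}

func main() {
	mode, ids := decideSGMode(map[string]string{
		annotationLBType: "nlb",
		annotationBYOSG:  "sg-aaa, sg-bbb", // placeholder IDs for illustration
	}, true)
	fmt.Println(mode, ids)
}
```

Existing Services are deliberately left out of the sketch, in line with the point above that older load balancers should remain without a security group.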

Contributor

in later phase (still in this EP) we are introducing/enabling a BYO SG annotation on NLB, and this one will be available in the Service level (not planning to change CIO/Installer)

I would expect users to want to be able to configure this through CIO eventually, cc @Miciah @alebedev87 who might have opinions

Otherwise all agreed

Contributor

@alebedev87 alebedev87 Jun 20, 2025

in later phase (still in this EP) we are introducing/enabling a BYO SG annotation on NLB, and this one will be available in the Service level (not planning to change CIO/Installer)

It seems to me that this is primarily a question of the EP’s scope. From a quick review, I understand the intent of the EP is to support SG for the load balancer that sits in front of the OCP router. If that’s the case, then the cluster ingress operator should be able to determine when to apply the new annotation (which adds the BYO frontend SG) to the publishing service - similar to what we did for the subnet configuration.

However, if the EP’s scope is more generic and aims to enable frontend SG support for NLB services in CCM, then we likely don’t need to configure the router during installation (as part of this EP, can be done as a follow-up EP).

Contributor Author

the intent of the EP is to support SG for the load balancer that sits in front of the OCP router.[...] If that’s the case, then the cluster ingress operator should be able to determine when to apply the new annotation
However, if the EP’s scope is more generic and aims to enable frontend SG support for NLB services in CCM, then we likely don’t need to configure the router during installation

@alebedev87 these are the goals of this EP: https://github.com/openshift/enhancements/pull/1802/files#diff-84882e6fc6fb023742b0ac09960b79620cfea983c45def4739a89fd404cdc05aR70-R91

  • (Phase 1, 2) Enable opt-in configuration to CCM, and enforced to OCP, to provision NLB with SG by default on all new services including new routers, without CIO intervention (requirement for ROSA HCP)
  • (Phase 3) Introduce BYO SG annotation to CCM when provisioning NLB services, so CIO would be able to expose it to users when it is prioritized (follow up EP).


> WIP/TBReviewed

- The implementation in CCM should handle the case where the `service.beta.kubernetes.io/aws-load-balancer-managed-security-group` annotation is set to `true` but the service type is not `NLB` (`aws-load-balancer-type: nlb`). In this scenario, the CCM should likely log a warning mentioning the annotation is supported only on NLB.
Contributor

What does the CCM do today for annotations that don't apply? I suspect it ignores them

We can use VAP downstream to prevent this

Contributor Author

I suspect so too, so we don't need to warn/log. I will confirm the existing behavior and update this thread. Thanks


Customers deploying OpenShift on AWS using Network Load Balancers (NLBs) for the default router have expressed the need for a similar security configuration as provided by Classic Load Balancers (CLBs), where a security group is created by CCM and associated with the load balancer. This allows for more granular control over inbound and outbound traffic at the load balancer level, aligning with AWS security best practices and addressing security findings that flag the lack of security groups on NLBs provisioned by the default CCM.

The default router in OpenShift, an IngressController object managed by Cluster Ingress Controller Operator (CIO), can be created with a Service type Load Balancer NLB instead of default Classic Load Balancer (CLB) during installation by enabling it in the `install-config.yaml`. Currently, the Cloud Controller Manager (CCM), which satisfies Service resources, provisions an AWS Load Balancer of type NLB without a Security Group (SG) directly attached to it. Instead, security rules are managed on the worker nodes' security groups.
Contributor

Instead, security rules are managed on the worker nodes' security groups.

What are the benefits of relying on LB security groups over the node sg? Do we get more fine-grained rules that are managed corresponding to the services? Can we reduce the current rules on compute nodes?

Contributor Author

What are the benefits of relying on LB security groups over the node sg?
Do we get more fine-grained rules that are managed corresponding to the services?

Users can tighten security rules targeting the LB only, instead of opening rules on the nodes' SG. It is also a best practice to associate an SG with an NLB (least-privilege approach):

"We recommend that you associate a security group with your Network Load Balancer when you create it."

[1] https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-security-groups.html#security-group-considerations

Can we reduce the current rules on compute nodes?

I don't think this would be a primary goal, but we can review whether there would be duplicated/unused rules on the nodes' SG.

Action item: I will keep this thread open to make sure this is reflected in the EP.

Contributor Author

@mtulio mtulio left a comment

Thanks @patrickdillon and @JoelSpeed for the review/suggestions. Hopefully I've addressed your questions.

Perhaps we could focus on the thread where it is suggested to change the CCM configuration to enable SG by default on NLBs? If this is the path forward for this EP (I personally think it is an excellent idea), we could reduce the scope of changes in many components here.

Please let me know your thoughts.

cc @elmiko @rvanderp3 @Miciah


@mtulio mtulio changed the title WIP/SPLAT-2137: Support Security Group on NLB for Default router on AWS SPLAT-2137: Support Security Group on NLB for Default router on AWS Jun 6, 2025
@mtulio mtulio marked this pull request as ready for review June 6, 2025 02:07
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 6, 2025
@openshift-ci openshift-ci bot requested review from jeffdyoung and rvanderp3 June 6, 2025 02:07
@mtulio
Contributor Author

mtulio commented Jun 6, 2025

Thank you all for the feedback. The EP has been revised based on the comments, updating the proposal to limit it to CCM changes by introducing a cloud-config (global configuration) option to opt in to the managed front-end security group when creating Service type-LoadBalancer NLBs, allowing CCCMO to enforce the default on OpenShift. The proposal also introduces an optional Service annotation for BYO SG, which opts out of the managed SG.

This PR is ready for review.

Contributor

@elmiko elmiko left a comment

this is reading well to me, we probably need to chat about the TBD items but i have a couple suggestions/questions.


AWS [announced support for Security Groups when deploying an NLB in August 2023][nlb-supports-sg], but the CCM for AWS (within kubernetes/cloud-provider-aws) does not currently implement the feature of automatically creating and managing security groups for `Service` resources type-LoadBalancer using NLBs. While the [AWS Load Balancer Controller (ALBC/LBC)][aws-lbc] project already supports deploying security groups for NLBs, this enhancement focuses on adding minimal, opt-in support to the existing CCM to address immediate customer needs without a full migration to the LBC. This approach aims to provide the necessary functionality without requiring significant changes in other OpenShift components like the Ingress Controller, installer, ROSA, etc.

Using a Network Load Balancer is a recommended network-based Load Balancer by AWS, and attaching a Security Group to an NLB is a security best practice. NLBs also do not support attaching security groups after they are created.
Contributor

the beginning of this sentence is a little confusing:

Using a Network Load Balancer is a recommended network-based Load Balancer by AWS,

is this saying that NLB is the recommended way to do load balancing?

Contributor Author

It's the recommended way for network-based LBs. AWS currently offers two LBs that replace the ELB/Classic (the CCM default): NLB (network-based) and ALB (application-based). So the idea is to mention that the NLB is the recommended one. Do you think I need to state that replacement to improve the reading?

Contributor

that makes sense, perhaps to make the sentence clearer you could say:

Suggested change
Using a Network Load Balancer is a recommended network-based Load Balancer by AWS, and attaching a Security Group to an NLB is a security best practice. NLBs also do not support attaching security groups after they are created.
Using a Network Load Balancer, as opposed to an Application Load Balancer, is the recommended way to do network-based load balancing by AWS, and attaching a Security Group to an NLB is a security best practice. NLBs also do not support attaching security groups after they are created.

is that accurate?

Contributor Author

Hey @elmiko , what about this?

Suggested change
Using a Network Load Balancer is a recommended network-based Load Balancer by AWS, and attaching a Security Group to an NLB is a security best practice. NLBs also do not support attaching security groups after they are created.
Using a Network Load Balancer, as opposed to an Classic Load Balancer, is the recommended way to do network-based load balancing by AWS, and attaching a Security Group to an NLB is a security best practice. NLBs also do not support attaching security groups after they are created.

We can compare NLB with CLB.

Contributor

I think we should apply the suggestion from @mtulio here.

Contributor Author

Thanks. Applying the suggestion in the next commit.

- a) decreases the amount of provider-specific changes on CIO;
- b) decreases the amount of maintained code/projects by the team (e.g., ALBC);
- c) enhances new configurations to the Ingress Controller when using NLB;
- d) decreases the amount of images in the core payload;
Contributor

is this decrease in reference to the ALBC?

Contributor Author

Correct, ALBC + ALBO would be required if CIO defaults to ALBC

Contributor

i might say this as "does not increase the amount of images in the core payload"


## Alternatives (Not Implemented)

> TODO/TBD
Contributor

i think it's worth mentioning the idea of making the ALBC functionality into a module that can be imported into the CCM as something we should investigate for the future.

Contributor Author

Good idea, Added! Thanks!


Comment on lines +359 to +369
## Graduation Criteria

> TODO/TBD

### Dev Preview -> Tech Preview

N/A. This feature will be introduced as Tech Preview (TBReviewed).

### Tech Preview -> GA

The E2E tests should be consistently passing, and a PR will be created to enable the feature gate by default.
Contributor Author

Expand the FG added here openshift/api#2354

Initially we were asked to go directly to TP, but given the impact of this change (defaulting to SG), we are considering starting from DP. We are evaluating the upstream velocity and how fast we can move.

@openshift-bot

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 19, 2025
@openshift-bot

Stale enhancement proposals rot after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Rotten proposals close after an additional 7d of inactivity.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 26, 2025
@openshift-ci
Contributor

openshift-ci bot commented Jan 30, 2026

@mtulio: The lifecycle/frozen label cannot be applied to Pull Requests.

Details

In response to this:

/remove-lifecycle rotten

/lifecycle frozen


@mtulio
Contributor Author

mtulio commented Jan 30, 2026

/reopen

@openshift-ci openshift-ci bot reopened this Jan 30, 2026
@openshift-ci
Contributor

openshift-ci bot commented Jan 30, 2026

@mtulio: Reopened this PR.


@openshift-ci-robot

openshift-ci-robot commented Jan 30, 2026

@mtulio: This pull request references SPLAT-2137 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target either version "4.22." or "openshift-4.22.", but it targets "openshift-4.20" instead.


@openshift-ci
Contributor

openshift-ci bot commented Jan 30, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign ingvagabund for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mtulio
Contributor Author

mtulio commented Jan 30, 2026

Interim update:

  • Managed SG implemented and in TP from 4.21
  • collecting the data points in the hypershift project to complete this EP

@openshift-bot

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 28, 2026
@openshift-bot

Stale enhancement proposals rot after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Rotten proposals close after an additional 7d of inactivity.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 7, 2026
@openshift-bot

Rotten enhancement proposals close after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Reopen the proposal by commenting /reopen.
Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Exclude this proposal from closing again by commenting /lifecycle frozen.

/close

@openshift-ci openshift-ci bot closed this Mar 15, 2026
@openshift-ci
Contributor

openshift-ci bot commented Mar 15, 2026

@openshift-bot: Closed this PR.



AWS [announced support for Security Groups when deploying an NLB in August 2023][nlb-supports-sg], but the CCM for AWS (within kubernetes/cloud-provider-aws) does not currently implement the feature of automatically creating and managing security groups for `Service` resources type-LoadBalancer using NLBs. While the [AWS Load Balancer Controller (ALBC/LBC)][aws-lbc] project already supports deploying security groups for NLBs, this enhancement focuses on adding minimal, opt-in support to the existing CCM to address immediate customer needs without a full migration to the LBC. This approach aims to provide the necessary functionality without requiring significant changes in other OpenShift components like the Ingress Controller, installer, ROSA, etc.

Using a Network Load Balancer is a recommended network-based Load Balancer by AWS, and attaching a Security Group to an NLB is a security best practice. NLBs also do not support attaching security groups after they are created.

(nit) I would slightly clarify the last portion of the sentence as:

Suggested change
Using a Network Load Balancer is a recommended network-based Load Balancer by AWS, and attaching a Security Group to an NLB is a security best practice. NLBs also do not support attaching security groups after they are created.
Using a Network Load Balancer is a recommended network-based Load Balancer by AWS, and attaching a Security Group to an NLB is a security best practice. NLBs initially created without an associated Security Group do not support Security Group association after creation.

The reason is that if an NLB was initially provisioned with a Security Group, one can associate new SGs with it after creation; the limitation holds only if the NLB was originally provisioned without an SG.

Contributor Author

Great, good suggestion. I am updating in a batch commit.


The CCM, the controller which manages the `Service` resource, will have a global configuration on cloud-config to signalize the controller to manage the Security Group by default when creating a Service type-LoadBalancer NLB - annotation `service.beta.kubernetes.io/aws-load-balancer-type` set to `nlb`. This change paves the path to default the controller to managed security groups, following the same path AWS LBC defaults to since version v2.6.0.

The controller must create and manage the entire lifecycle of the Security Group resource when the load balancer is created, update the SG ingress rules according to the NLB Listeners configurations, and the Egress Rules according to the Target Group configurations.

I believe we need to be explicit that the current proposal won't allow users using the "managed" (not BYO) Security Group to customize the ingress rules of the security group, essentially only allowing in all internet traffic or blocking it entirely.

This is an important limitation in my opinion, since the ability to selectively limit inbound traffic by source IP CIDR ranges is one of the core security capabilities of Security Groups. Furthermore, this gap will increase the importance of the BYO Security Group feature for customers needing this capability.

Additionally, we could assess the feasibility of adding this capability, e.g. as a "Phase 4" feature after the BYO SG implementation, perhaps using the ingresscontroller.spec.endpointPublishingStrategy.loadBalancer.allowedSourceRanges property of an Ingress Controller and/or the standard K8s loadBalancerSourceRanges property of a Service to read and configure the Security Group rules.

Contributor Author

That's a great call.

Good point, @mfbonfigli . I've added this information:
https://github.com/openshift/enhancements/pull/1802/changes#diff-84882e6fc6fb023742b0ac09960b79620cfea983c45def4739a89fd404cdc05aR339

Allowing custom sources is really a good idea to enhance security while keeping the automation. I can see two paths here:

  • Users can use the BYO SG approach, so they will manage the SG on their end, including all SG rules.
  • (your suggestion) The CCM supports a custom annotation with custom rules.

As this was not requested by the Epic, I would defer to component SME @JoelSpeed and @alebedev87 to share if we need to plan adding this feature (not sure if we will be able to implement soon cc @rvanderp3 ), or defer this for later.

Thanks!


Replying here to my own comment, I did some digging and it seems that the upstream AWS CCM actually already implements support for NLB SGs and source IP range restrictions via both the loadBalancerSourceRanges property of the Service and also via the service.beta.kubernetes.io/load-balancer-source-ranges annotation.

The annotation support in particular does not seem to be explicitly documented in the AWS CCM, but it nonetheless works because the AWS CCM derives the source ranges through a helper function in the upstream K8s Cloud Provider library, which checks the Service definition first and then the annotation content.
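To make the lookup order concrete, here is a minimal, self-contained Go sketch of the resolution described above. It is not the upstream helper: `service` and `resolveSourceRanges` are hypothetical names used only for illustration, and the fallback to `0.0.0.0/0` when nothing is set is an assumption about the default behavior.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// Annotation key mentioned in the comment above.
const sourceRangesAnnotation = "service.beta.kubernetes.io/load-balancer-source-ranges"

// service is a hypothetical, trimmed-down stand-in for a Kubernetes Service.
type service struct {
	Annotations              map[string]string
	LoadBalancerSourceRanges []string // mirrors spec.loadBalancerSourceRanges
}

// resolveSourceRanges (hypothetical) returns the CIDRs allowed to reach the
// load balancer: the spec field wins, then the annotation, then an
// allow-all default (assumed here).
func resolveSourceRanges(svc service) ([]string, error) {
	raw := svc.LoadBalancerSourceRanges
	if len(raw) == 0 {
		if v := strings.TrimSpace(svc.Annotations[sourceRangesAnnotation]); v != "" {
			raw = strings.Split(v, ",")
		}
	}
	if len(raw) == 0 {
		return []string{"0.0.0.0/0"}, nil
	}
	ranges := make([]string, 0, len(raw))
	for _, r := range raw {
		// ParseCIDR validates the entry and normalizes it to its network address.
		_, ipNet, err := net.ParseCIDR(strings.TrimSpace(r))
		if err != nil {
			return nil, fmt.Errorf("invalid source range %q: %w", r, err)
		}
		ranges = append(ranges, ipNet.String())
	}
	return ranges, nil
}

func main() {
	svc := service{
		Annotations: map[string]string{
			sourceRangesAnnotation: "203.0.113.0/24, 198.51.100.0/24",
		},
	}
	ranges, err := resolveSourceRanges(svc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ranges)
}
```

Each resolved CIDR would then become the source of one Security Group ingress rule per listener port, which is the part that needs testing and documentation on OpenShift.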

Contributor Author

Good point @mfbonfigli, thanks for digging into this. So I think my last comment can be ignored. We just need to make sure this functionality, inherited automatically from the cloud-provider library, is tested and documented on OpenShift as part of Phase 2. I will update the EP accordingly. 👍🏽

@mtulio
Contributor Author

mtulio commented Mar 31, 2026

Reviewing the planning for the HyperShift phase, as well as addressing recent PR updates.

/reopen
/lifecycle frozen
/remove-lifecycle rotten

@openshift-ci openshift-ci bot reopened this Mar 31, 2026
@openshift-ci
Contributor

openshift-ci bot commented Mar 31, 2026

@mtulio: Reopened this PR.


@openshift-ci-robot

openshift-ci-robot commented Mar 31, 2026

@mtulio: This pull request references SPLAT-2137 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target either version "4.22." or "openshift-4.22.", but it targets "openshift-4.20" instead.


@openshift-ci
Contributor

openshift-ci bot commented Mar 31, 2026

@mtulio: The lifecycle/frozen label cannot be applied to Pull Requests.


@openshift-ci openshift-ci bot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Mar 31, 2026
This commit addresses unresolved feedback from PR openshift#1802, particularly
around Phase 2b HyperShift implementation and other technical gaps:

Phase 2b - HyperShift/ROSA HCP Implementation:
- Documented that HyperShift does NOT use CCCMO (critical architectural difference)
- Explained Control Plane Operator's role in managing cloud-config ConfigMap
- Detailed cluster-scoped feature gate evaluation from HostedControlPlane.Spec.Configuration.FeatureGate
- Provided implementation approach for CPO's AWS CCM config adapter
- Removed blocking TODO from Hypershift section

Phase 2 Restructuring:
- Split Phase 2 into clearly defined sub-phases (2a: Self-Managed/ROSA Classic, 2b: ROSA HCP)
- Added specific implementation goals for each architecture
- Clarified CCCMO vs CPO usage patterns

Workflow Descriptions:
- Separated ROSA Classic and ROSA HCP workflows to show architectural differences
- Added detailed step-by-step flows for both deployment models
- Clarified component roles (CPO, CCCMO, CIO, CCM)

Implementation Details Enhancements:
- Documented limitation: managed SG does not support custom ingress CIDR filtering (addresses comment #3017312641)
- Added explicit IAM permissions list required for CCM service account
- Converted TODO items into concrete Security Group naming convention details
- Noted that custom CIDR filtering requires BYO SG (Phase 3)

Phase 3 Clarifications:
- Resolved TBD about backend security group rule management annotation
- Clarified it is deferred to future phase
- Specified exact ALBC annotation names for consistency

Other Improvements:
- Enhanced ROSA Classic section with specific CCCMO enforcement details
- Expanded Single-Node Deployments section with clear guidance
- Fixed grammar: "an Classic" → "a Classic"
- Multiple wording improvements for technical clarity

These changes leverage knowledge from recent HyperShift implementation work
where cluster-scoped feature gate evaluation was implemented for AWS CCM
configuration.

Grammar Reviewed by Claude Code.
Hypershift implementation coverage created by Claude Code.

Signed-off-by: Marco Braga <mrbraga@redhat.com>
Assisted-by: Claude Sonnet 4.5 (via Cursor)
Contributor Author

@mtulio mtulio left a comment

Thank you all for the review and feedback.

I just addressed the pending comments, as well as the phase that had been uncertain until now: ROSA HCP/HyperShift.


- Introduce Annotations to CCM to allow BYO SG to Service type-LoadBalancer NLB to opt-out the global `Managed` security group configuration.
- The annotation must follow the same standard as ALBC. Must be optional.
- (TBD if it is required) An annotation to allow managing backend rules must be added to prevent manual changes by the user. Must be opt-out by default
Contributor Author

Not yet. I think it will be out of the scope of this deliverable, unless @alebedev87 thinks this is something required by NE-1792. cc @mfbonfigli
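For readers following the thread, the opt-in/opt-out behavior in the bullets above can be sketched roughly as follows. This is a minimal Go sketch under stated assumptions, not the CCM implementation: `planSecurityGroups`, `sgPlan`, and the mode names are hypothetical, `managedByDefault` stands in for the global cloud-config setting (whose actual key is not named here), and the BYO annotation key is assumed to match the ALBC one, as the proposal requests.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	// Annotation that selects an NLB, as quoted from the EP in this thread.
	lbTypeAnnotation = "service.beta.kubernetes.io/aws-load-balancer-type"
	// Assumption: the BYO SG annotation reuses the ALBC key.
	byoSGAnnotation = "service.beta.kubernetes.io/aws-load-balancer-security-groups"
)

// sgMode is a hypothetical label for how the front-end SG is handled.
type sgMode string

const (
	modeOutOfScope sgMode = "out-of-scope" // not an NLB; this feature does not apply
	modeNone       sgMode = "none"         // NLB without a security group
	modeManaged    sgMode = "managed"      // controller creates and owns the SG
	modeBYO        sgMode = "byo"          // user-provided SGs; managed SG opted out
)

// sgPlan is a hypothetical result type for the decision below.
type sgPlan struct {
	Mode             sgMode
	SecurityGroupIDs []string
}

// planSecurityGroups (hypothetical) applies the precedence discussed above:
// the per-Service BYO annotation overrides the global managed default.
func planSecurityGroups(managedByDefault bool, annotations map[string]string) sgPlan {
	if annotations[lbTypeAnnotation] != "nlb" {
		return sgPlan{Mode: modeOutOfScope}
	}
	var ids []string
	for _, id := range strings.Split(annotations[byoSGAnnotation], ",") {
		if id = strings.TrimSpace(id); id != "" {
			ids = append(ids, id)
		}
	}
	if len(ids) > 0 {
		return sgPlan{Mode: modeBYO, SecurityGroupIDs: ids}
	}
	if managedByDefault {
		return sgPlan{Mode: modeManaged}
	}
	return sgPlan{Mode: modeNone}
}

func main() {
	nlb := map[string]string{lbTypeAnnotation: "nlb"}
	byo := map[string]string{lbTypeAnnotation: "nlb", byoSGAnnotation: "sg-aaa, sg-bbb"}

	fmt.Printf("%+v\n", planSecurityGroups(true, nlb))
	fmt.Printf("%+v\n", planSecurityGroups(false, nlb))
	fmt.Printf("%+v\n", planSecurityGroups(true, byo))
}
```

The backend-rule management annotation marked TBD in the last bullet is deliberately left out of the sketch, since it is being deferred.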


@openshift-ci
Contributor

openshift-ci bot commented Apr 1, 2026

@mtulio: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/markdownlint | 49c053d | link | true | `/test markdownlint` |

Full PR test history. Your PR dashboard.

