AWS: Enable clusters with no public endpoints #2526
* Create your VPC and subnets.
* Create a bastion in your VPC. I'm using sshuttle to create a tunnel to the VPC over SSH; it also handles DNS resolution.
* Open another terminal and run the installer with your list of subnets and `publish: Internal`.
Could you SSH in and run the installer on the bastion? I'd expect you'd have AWS-endpoint access to any cluster-managed AWS resources you'd need, because the cluster would be managing those resources going forward. And it would get us one step closer to the installer-appliance AMI flow. |
😂
It would mean I have to SSH in, move all the binaries, configuration, and secrets to the bastion, and then run the installer. sshuttle was extremely useful for me to develop and iterate easily. As a bonus, sshuttle captured my DNS, so I could access the cluster console through my browser. Most of our customers will have their computers already configured with access to private-network-only resources. It was easy for me...
I agree 💯 but a UI based installer is very important to make that easy to use. |
This commit adds an enum(string) type PublishingStrategy. It supports two options:
* ExternalPublishingStrategy : the endpoints are exposed to the Internet
* InternalPublishingStrategy : the endpoints are exposed to the `Private Network` only
This enum type is added to the InstallConfig to control the strategy for the cluster endpoints. Cluster endpoints include the API server,
the default Ingress controller, public IPs etc.
Also adds the defaulting to make sure this strategy is `External` by default, keeping the backward-compatible behavior, and validation to make sure only valid enum options are set.
Also it looks like gofmt prefers `{}` over `struct{}{}` in validation map.
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_installer/2526/pull-ci-openshift-installer-master-gofmt/6229
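The enum, its defaulting, and its validation could be sketched roughly as below. This is a minimal illustration, not the code from this PR: the trimmed-down `InstallConfig` struct, the `Publish` field name, and the helpers `setDefaults` and `validatePublish` are assumptions made for the example. Only the type name, the two constant names, and the `External` default come from the commit message. The validation map uses the `{}` element form that gofmt prefers over `struct{}{}`.

```go
package main

import "fmt"

// PublishingStrategy is a string enum controlling how cluster endpoints are exposed.
type PublishingStrategy string

const (
	// ExternalPublishingStrategy exposes the endpoints to the Internet.
	ExternalPublishingStrategy PublishingStrategy = "External"
	// InternalPublishingStrategy exposes the endpoints to the private network only.
	InternalPublishingStrategy PublishingStrategy = "Internal"
)

// InstallConfig is a hypothetical, trimmed-down stand-in for the real type.
type InstallConfig struct {
	Publish PublishingStrategy
}

// setDefaults (hypothetical helper) keeps the backward-compatible behavior:
// an unset strategy becomes External.
func setDefaults(c *InstallConfig) {
	if c.Publish == "" {
		c.Publish = ExternalPublishingStrategy
	}
}

// validatePublish (hypothetical helper) rejects anything outside the enum.
func validatePublish(p PublishingStrategy) error {
	valid := map[PublishingStrategy]struct{}{
		ExternalPublishingStrategy: {},
		InternalPublishingStrategy: {},
	}
	if _, ok := valid[p]; !ok {
		return fmt.Errorf("unsupported publish strategy %q", p)
	}
	return nil
}

func main() {
	c := &InstallConfig{}
	setDefaults(c)
	fmt.Println(c.Publish, validatePublish(c.Publish))
	fmt.Println(validatePublish("Public"))
}
```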
For AWS, the public Route53 zone for the base domain is set in the DNSes.config.openshift.com cluster object only when the strategy is External. Also for AWS, the master machine objects are configured to belong to the external load balancer target group.
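As a rough sketch of that conditional wiring, assuming the external target group is attached to masters only under the External strategy: the `dnsSpec` struct, both function names, and the target group naming scheme below are hypothetical and do not reflect the installer's actual types.

```go
package main

import "fmt"

// dnsSpec is a hypothetical stand-in for the DNS cluster object's spec.
type dnsSpec struct {
	BaseDomain  string
	PublicZone  string // empty means no public zone is configured
	PrivateZone string
}

// buildDNSSpec sets the public zone only for the External strategy.
func buildDNSSpec(publish, baseDomain, publicZoneID, privateZoneID string) dnsSpec {
	spec := dnsSpec{BaseDomain: baseDomain, PrivateZone: privateZoneID}
	if publish == "External" {
		spec.PublicZone = publicZoneID
	}
	return spec
}

// masterTargetGroups lists the target groups a master machine should join.
// The "-int" and "-ext" suffixes are made up for this example.
func masterTargetGroups(publish, clusterID string) []string {
	groups := []string{clusterID + "-int"}
	if publish == "External" {
		groups = append(groups, clusterID+"-ext")
	}
	return groups
}

func main() {
	for _, publish := range []string{"External", "Internal"} {
		spec := buildDNSSpec(publish, "example.com", "Z-PUBLIC", "Z-PRIVATE")
		fmt.Printf("%s: %+v %v\n", publish, spec, masterTargetGroups(publish, "demo"))
	}
}
```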
All green :) |
data/data/aws/master/main.tf
nit: 12570 was closed with 0.12's better error message, and the actual support tracking has moved to terraform#4149. Dunno if it's worth updating your links?
I think 12570 definitely has the correct context
Adds the `aws_publish_strategy` variable to the terraform root variables to allow skipping public resources.

The resources that are only created when the strategy is External are:
- Public Load Balancers for the API (6443) (aws_lb.api_external, aws_lb_target_group.api_external, aws_lb_listener.api_external_api)
- Public DNS record (aws_route53_record.api_external)
- Public IP address for the bootstrap host (otherwise `associate_public_ip_address: false`)
- Security group rule that allows SSH to the bootstrap host from the Internet (aws_security_group_rule.ssh); otherwise SSH is restricted to the VPC

Due to terraform issue hashicorp/terraform#12570, where `count` cannot use dynamic values because they are not computed during plan, certain implicit assumptions/hacks need to be continued.

The errors below show how the aws_lb_target_group list can't be used as the sole source for letting other modules like bootstrap and master attach instances without caring about the publish strategy. Similarly, they show how the route53 module cannot turn off the public records based on the existence of the external LBs alone.

```
ERROR Error: Invalid count argument
ERROR
ERROR on ../../../../../../../tmp/openshift-install-940637305/bootstrap/main.tf line 154, in resource "aws_lb_target_group_attachment" "bootstrap":
ERROR 154: count = length(var.target_group_arns)
ERROR
ERROR The "count" value depends on resource attributes that cannot be determined
ERROR until apply, so Terraform cannot predict how many instances will be created.
ERROR To work around this, use the -target argument to first apply only the
ERROR resources that the count depends on.
ERROR
ERROR
ERROR Error: Invalid count argument
ERROR
ERROR on ../../../../../../../tmp/openshift-install-940637305/master/main.tf line 131, in resource "aws_lb_target_group_attachment" "master":
ERROR 131: count = var.instance_count * length(var.target_group_arns)
ERROR
ERROR The "count" value depends on resource attributes that cannot be determined
ERROR until apply, so Terraform cannot predict how many instances will be created.
ERROR To work around this, use the -target argument to first apply only the
ERROR resources that the count depends on.
ERROR
ERROR
ERROR Error: Invalid count argument
ERROR
ERROR on ../../../../../../../tmp/openshift-install-940637305/route53/base.tf line 2, in data "aws_route53_zone" "public":
ERROR 2: count = var.api_external_lb_dns_name != null ? 1 : 0
ERROR
ERROR The "count" value depends on resource attributes that cannot be determined
ERROR until apply, so Terraform cannot predict how many instances will be created.
ERROR To work around this, use the -target argument to first apply only the
ERROR resources that the count depends on.
```

For instances to communicate with the Internet, they need to
- either have a public IP in a public subnet, or
- be launched in a private subnet (which reaches the Internet through NAT).

For more info see https://forums.aws.amazon.com/thread.jspa?threadID=96369

Therefore the bootstrap instance is created in a public subnet for the External strategy and in a private subnet otherwise.
Also, the bootstrap instance SSH rules are modified to target the Internet (0.0.0.0) for External and to allow only the VPC CIDR ranges otherwise.
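The bootstrap placement rule described above can be expressed as a small decision function. The Go sketch below is only an illustration of the logic; the real change lives in the terraform modules. The `bootstrapPlacement` struct and `placeBootstrap` function are hypothetical names, and `0.0.0.0/0` is assumed as the CIDR form of "Internet (0.0.0.0)".

```go
package main

import "fmt"

// bootstrapPlacement is a hypothetical summary of the strategy-dependent settings.
type bootstrapPlacement struct {
	SubnetID          string
	AssociatePublicIP bool
	SSHCIDRs          []string
}

// placeBootstrap picks a public subnet, a public IP, and Internet-wide SSH for
// External; for any other strategy it picks a private subnet, no public IP,
// and SSH limited to the VPC CIDR ranges.
func placeBootstrap(publish string, publicSubnets, privateSubnets, vpcCIDRs []string) (bootstrapPlacement, error) {
	if publish == "External" {
		if len(publicSubnets) == 0 {
			return bootstrapPlacement{}, fmt.Errorf("External strategy needs at least one public subnet")
		}
		return bootstrapPlacement{
			SubnetID:          publicSubnets[0],
			AssociatePublicIP: true,
			SSHCIDRs:          []string{"0.0.0.0/0"},
		}, nil
	}
	if len(privateSubnets) == 0 {
		return bootstrapPlacement{}, fmt.Errorf("%s strategy needs at least one private subnet", publish)
	}
	return bootstrapPlacement{
		SubnetID:          privateSubnets[0],
		AssociatePublicIP: false,
		SSHCIDRs:          vpcCIDRs,
	}, nil
}

func main() {
	p, err := placeBootstrap("Internal", nil, []string{"subnet-private-a"}, []string{"10.0.0.0/16"})
	fmt.Printf("%+v %v\n", p, err)
}
```

Because the decision depends only on the strategy string, which is known at plan time, it avoids the `count` problem shown in the errors above.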
The install-config's publish strategy is passed to terraform using tf-variables. Since public subnets are going to be empty in the Internal strategy, make sure `not set` is not propagated for a pre-existing VPC.
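One plausible reading of the `not set` remark, sketched in Go: when the variables are serialized to JSON, a nil slice becomes `null` (which terraform would treat as unset), while an empty slice becomes `[]`. The struct, the `aws_public_subnets` variable name, and both helpers are assumptions for illustration; only `aws_publish_strategy` is named in this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tfVars is a hypothetical subset of the variables handed to terraform.
type tfVars struct {
	PublishStrategy string   `json:"aws_publish_strategy"`
	PublicSubnets   []string `json:"aws_public_subnets"` // assumed variable name
}

// newTFVars normalizes a nil subnet list to an empty one so that the JSON
// output carries [] rather than null.
func newTFVars(publish string, publicSubnets []string) tfVars {
	if publicSubnets == nil {
		publicSubnets = []string{}
	}
	return tfVars{PublishStrategy: publish, PublicSubnets: publicSubnets}
}

// render serializes the variables as a JSON tfvars document.
func render(v tfVars) (string, error) {
	b, err := json.Marshal(v)
	return string(b), err
}

func main() {
	out, err := render(newTFVars("Internal", nil))
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"aws_publish_strategy":"Internal","aws_public_subnets":[]}
}
```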
/hold cancel

Fixed two nits from #2526 (comment).

/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, wking
/retest |
pkg/types: add the publish strategy to install-config

This commit adds an enum(string) type PublishingStrategy. It supports two options:
* ExternalPublishingStrategy : the endpoints are exposed to the Internet
* InternalPublishingStrategy : the endpoints are exposed to the `Private Network` only

This enum type is added to the InstallConfig to control the strategy for the cluster endpoints. Cluster endpoints include the API server, the default Ingress controller, public IPs etc.
Also adds the defaulting to make sure this strategy is `External` by default, keeping the backward-compatible behavior, and validation to make sure only valid enum options are set.

pkg/asset: update machines and DNS for external-internal strategy

For AWS, the public Route53 zone for the base domain is set in the DNSes.config.openshift.com cluster object only when the strategy is External.

data/aws: add publish strategy External/Internal feature

Adds the `aws_publish_strategy` variable to the terraform root variables to allow skipping public resources.
The resources that are only created when the strategy is External are:
- Public Load Balancers for the API (6443) (aws_lb.api_external, aws_lb_target_group.api_external, aws_lb_listener.api_external_api)
- Public DNS record (aws_route53_record.api_external)
- Public IP address for the bootstrap host (otherwise `associate_public_ip_address: false`)
- Security group rule that allows SSH to the bootstrap host from the Internet (aws_security_group_rule.ssh); otherwise SSH is restricted to the VPC

Due to terraform issue hashicorp/terraform#12570, where `count` cannot use dynamic values because they are not computed during plan, certain implicit assumptions/hacks need to be continued.
The `Invalid count argument` errors quoted in the commit message show how the aws_lb_target_group list can't be used as the sole source for letting other modules like bootstrap and master attach instances without caring about the publish strategy.
Similarly, they show how the route53 module cannot turn off the public records based on the existence of the external LBs alone.
For instances to communicate with the Internet, they need to
- either have a public IP in a public subnet, or
- be launched in a private subnet.

For more info see https://forums.aws.amazon.com/thread.jspa?threadID=96369
Therefore the bootstrap instance is created in a public subnet for the External strategy and in a private subnet otherwise.
Also, the bootstrap instance SSH rules are modified to target the Internet (0.0.0.0) for External and to allow only the VPC CIDR ranges otherwise.
xref: https://jira.coreos.com/browse/CORS-1221
/cc @wking