[EKS] Cloudformation support for cluster upgrades #115
Comments
@dcherman EKS supports in-place cluster upgrades via the EKS API (https://docs.aws.amazon.com/eks/latest/userguide/update-cluster.html) and worker node updates via Cloudformation (https://docs.aws.amazon.com/eks/latest/userguide/update-stack.html) - shipped as part of #21. Does this resolve your issue or are you thinking of additional/different functionality for cluster upgrades? |
@tabern So part of what you can do with Cloudformation is specify the Kubernetes version that you want. If you change that in your template and re-apply the stack, it requires replacement of the resource. What I'm proposing is that Cloudformation should use the EKS API internally to perform these upgrades rather than replacing the resource. |
Got it - so the idea is you can do an entire cluster incl. nodes with a single CF stack update? |
Exactly; I want to avoid creating and updating clusters using different methods since the Cloudformation template is no longer the source of truth if you're updating the cluster outside of it. |
+1 for this |
Ok - good info. Thanks! |
@dcherman if you are have used Check out - eksctl-io/eksctl#348 for more details. |
@christopherhein is the goal, recommendation by Amazon, for people to use |
@christopherhein I'm actively monitoring That said, |
It's an option, we have contributed a handful of things to @dcherman check out eksctl-io/eksctl#19 if you haven't seen it, at one point we were discussing using the ClusterAPI functions to support this style of deployment, still a lot to do if you want to help. :) |
@christopherhein understood the option, I just wanted to confirm that it wasn't a replacement. The OP was about a better way to handle cluster updates in CloudFormation, and I just didn't want it to get lost. I'm very intrigued by how much AWS has used eksctl; it might be more of a preferred way than even using CF directly anymore... something to ponder over 🤔 thanks as always! |
CloudFormation now supports cluster upgrades! What's new: https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-eks-now-supports-kubernetes-version-1-12-and-cluster-vers/ |
@tabern - To clarify, do you mean that if I am managing my EKS cluster in CloudFormation, and it is Or will it just be the equivalent of pressing the 'upgrade cluster' button in the console, where it just controls the upgrade/rollout on the masters? Because this ticket was specifically intended (at least in my reading?) to be the former. The latter is great and all... but there's still a huge pain point in rolling out the AMI update, because we basically have to build our own automation to cleanly and safely upgrade the EKS NodeGroup. |
@geerlingguy pretty sure the CF upgrade only applies to the control plane. I notice when AWS says 'cluster' they are often only thinking of the bit they manage! 😄 And I think that was what this ticket was about, because before with CF, if you changed the EKS version, CF would delete and recreate the control plane. Not what you want! 😢 For the worker nodes, users can do anything they want, including any custom AMIs, so it wouldn't easily be possible for CF to identify the AMI to use to upgrade nodes in the general case. CF/EKS doesn't actually know what ASGs are relevant to the cluster, just which instances have registered, further complicating any possible upgrade. One option, not just for EKS, is to bring up a new, upgraded node group ASG. Then once it is stable, drain the old node group nodes, and then delete that node group ASG. If you are using
There is also discussion of adding a If you just want to update the AMI in your ASG and let it roll and update, then you can run an auto-drain Daemonset like |
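The node group rotation described above (bring up a new group, wait until it is stable, drain the old nodes, delete the old group) can be sketched as a fixed sequence of steps. This is a minimal sketch, not anything from the thread: the function name and every callable passed in (`create`, `wait_ready`, `list_nodes`, `drain`, `delete`) are hypothetical placeholders for whatever tooling you use to manage ASGs and drain nodes.

```python
def rotate_node_group(old_group, new_group, create, wait_ready, list_nodes, drain, delete):
    """Replace old_group with new_group without draining anything too early.

    All callables are hypothetical hooks supplied by the caller; this function
    only fixes the order in which they run.
    """
    create(new_group)            # bring up the upgraded node group
    wait_ready(new_group)        # block until its nodes are stable
    drained = []
    for node in list_nodes(old_group):
        drain(node)              # evict workloads from each old node
        drained.append(node)
    delete(old_group)            # remove the old group only after draining
    return drained
```

The old group is deleted last so that a failure while creating or waiting on the new group leaves the existing nodes untouched.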
@geerlingguy The feature that we shipped today is the latter. When you update the version via CloudFormation, it triggers the updateClusterVersion API to begin the cluster update process.
What you describe makes a lot of sense:
The functionality you (and @whereisaaron) are describing is a bit more complex and is most similar to #139 |
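For reference, a control-plane-only upgrade through the same API might look roughly like the sketch below. It assumes `eks` is a boto3 EKS client (for example `boto3.client("eks")`) and uses its `update_cluster_version` and `describe_update` calls; the function name, the polling interval, and the injectable `sleep` parameter are illustrative choices, not part of the feature described above. Worker nodes are not touched.

```python
import time


def upgrade_control_plane(eks, cluster_name, target_version, poll_seconds=30, sleep=time.sleep):
    """Start an in-place control plane upgrade and wait for a terminal status.

    `eks` is assumed to be a boto3 EKS client. Returns the final update status,
    e.g. "Successful" or "Failed".
    """
    response = eks.update_cluster_version(name=cluster_name, version=target_version)
    update_id = response["update"]["id"]
    while True:
        update = eks.describe_update(name=cluster_name, updateId=update_id)["update"]
        if update["status"] != "InProgress":
            return update["status"]
        sleep(poll_seconds)  # avoid hammering the API while the upgrade runs
```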
I'm having an issue with the current implementation of this feature. Scenario 1 Scenario 2 Suggested solution FYI: I opened a case with support but they mentioned it would be good if I place my issue here as well. It would be great if this issue can be fixed. |
This is also impacting our automated rollout of EKS 1.12 (and controlling our version deployment automation). We are experiencing the same issue as shown above. +1 |
I was able to reproduce @vincentheet issue #2 and you cannot update the stack anymore once it is in this state. Here is the error I see. Update failed because of Unsupported Kubernetes minor version update from 1.12 to 1.12 (Service: AmazonEKS; Status Code: 400; Error Code: InvalidParameterException |
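One way to avoid failing on a "1.12 to 1.12" request is to treat an unchanged minor version as a no-op before calling the update API at all. The sketch below is only an illustration of that guard, not how CloudFormation or EKS is implemented: the function names are hypothetical, and the rule that an upgrade may move exactly one minor version forward is an assumption stated in the code.

```python
def parse_minor(version):
    """Return (major, minor) from a string such as '1.12'; any patch part is ignored."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def plan_version_update(current, desired):
    """Decide what to do for a requested version change.

    Returns "noop" when the minor version is unchanged and "upgrade" when it is
    exactly one minor version ahead (assumed rule). Anything else raises.
    """
    cur, want = parse_minor(current), parse_minor(desired)
    if want == cur:
        return "noop"  # nothing to change, so do not fail the stack update
    if want[0] == cur[0] and want[1] == cur[1] + 1:
        return "upgrade"
    raise ValueError(f"Unsupported Kubernetes version update from {current} to {desired}")
```

With this guard, `plan_version_update("1.12", "1.12")` returns `"noop"` instead of raising, while a skipped minor version or a downgrade is still rejected.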
@tabern Do we need to create a new issue since this one is closed for it to be addressed? |
@qthuy sorry for the delay on this - we are taking a look at this |
How is this still a thing, wth? It's a major bug, reported almost a year ago now, still not fixed. This breaks CF stacks that launch EKS Clusters in a really bad way. |
I was also able to reproduce @vincentheet issue #2 and cannot update the stack anymore once it is in this state.
This error should not be thrown to fail stack update. |
I contacted support, they told me to use other tools to manage EKS pretty much 🤦♂ I guess that's how much AWS is going to focus on CFN, time to drop it completely. |
@tabern Any updates? Can this issue please be reopened while you/AWS are/is investigating? |
@iAnomaly @vincentheet @jia2 can you please open a new issue to track this? My understanding here is that the CFN template may not be looking at the patch version during the update and is thus failing. Want to apologize for any anguish this has caused, we want CFN to be a first class citizen for EKS and we have work lined up for end of 2019/early 2020 to address this and other areas where we can improve the capabilities for CFN to manage EKS clusters. |
Thanks @vincentheet - I'll pull that onto the roadmap and we can track status there. |
Hello Tabern, I am planning to deploy an EKS cluster with the Quick Start, but I want to know about future upgrade-related problems and changes in the environment. How should further upgrades and migrations be done? |
@vpuria, I've deployed an EKS cluster with the Quick Start and am facing issues with the upgrade. |
Tell us about your request
What do you want us to build?
Support for upgrading an existing EKS instance provisioned by Cloudformation rather than requiring replacement
Which service(s) is this request for?
EKS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
What outcome are you trying to achieve, ultimately, and why is it hard/impossible to do right now? What is the impact of not having this problem solved? The more details you can provide, the better we'll be able to understand and solve the problem.
I'm trying to upgrade an EKS cluster between versions without replacing the cluster. Replacement introduces risk since its behavior is not well defined (i.e., is the etcd state migrated? Backups? What about requests that might be in-flight when the changeover happens?). The existing behavior would also likely require rolling the worker nodes since the cluster API endpoint would change, unless you put it behind a CNAME or something.
Instead, CloudFormation should simply upgrade the cluster via the API that is already available for doing so and which both the AWS CLI and Terraform support.
Are you currently working around this issue?
Yes
How are you currently solving this problem?
Managing the EKS cluster with Terraform