RFC: 0049 Continuous Delivery for CDK Apps #3437
Conversation
Request for comments on the spec for CI/CD support for CDK apps.
- Will the pipeline itself be CI/CD'd as well? How is it deployed? Its own pipeline? (Answered later.)
- New environments: can we leverage some functionality in AWS Organizations to "pre-trust" the pipeline?
- Controlled deployment environment: I'm not sure what this is getting at. Is this talking about the target environment? (Answered later, I think: it's using CDK-team-managed build infrastructure?)
- "There's a one-to-one mapping between an app and a pipeline": one-to-many? There could be use cases where an app is deployed by several different pipelines but they share a code base.
- "We are not optimizing this experience to support any CD tool": except CodeBuild? Oh, coming back to this later, it looks like the goal is to be provider agnostic. +1
- It reads like you're going to be hosting a deployment-management SaaS; what's this going to cost me? It appears that you're going to spin up a CodeBuild job for each pipeline stage, so am I going to be paying for that?
- I like the deployment process. I'd just like to make sure we keep some kind of run-order-like setting so I can have more granular control over my deployment process.
design/continuous-delivery.md (outdated excerpt):

> ## Approach
>
> At a high level, we will model the deployment process of a CDK app as follows:
>
> **source** => **build** => **pipeline** => **deploy**
This is an interesting idea! One thing that concerns me is that when you update a CodePipeline, any revision being run through the Pipeline (and in this case, since it's a self-updating Pipeline, this revision) will be stopped.

From the docs:

> The update-pipeline command stops the pipeline. If a revision is being run through the pipeline when you run the update-pipeline command, that run is stopped. You must start the pipeline manually to run that revision through the updated pipeline.
This might be confusing if you update the Pipeline as well as some other infrastructure.
In order to prevent redeploying every lambda on every deploy, cdk.out should be cached between pipeline runs. This way the CDK "remembers" the hash of each lambda and can tell which functions changed and which didn't, and thus which need to be redeployed. This is pretty important in my opinion: if you have ~30 Go lambdas, you would send around 200 MB on every deploy, which means the size of your deployment bucket will grow rapidly.
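For illustration, here is one way such caching could be wired up today if the synth step runs in CodeBuild. This is only a sketch under assumptions not in the RFC (an npm-based app, a dedicated cache bucket, and the names used below are made up):

```ts
import * as codebuild from '@aws-cdk/aws-codebuild';
import * as s3 from '@aws-cdk/aws-s3';
import { App, Stack } from '@aws-cdk/core';

const app = new App();
const stack = new Stack(app, 'PipelineStack');

// Hypothetical bucket used only to cache cdk.out between builds.
const cacheBucket = new s3.Bucket(stack, 'SynthCache');

// CodeBuild project that runs `cdk synth` and caches cdk.out, so asset hashes
// from the previous run are available and unchanged lambda bundles are not
// republished on every deploy.
new codebuild.PipelineProject(stack, 'Synth', {
  cache: codebuild.Cache.bucket(cacheBucket),
  buildSpec: codebuild.BuildSpec.fromObject({
    version: '0.2',
    phases: {
      install: { commands: ['npm ci'] },
      build: { commands: ['npx cdk synth'] },
    },
    cache: { paths: ['cdk.out/**/*'] },
  }),
});

app.synth();
```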
That's definitely something we'd like to enable at some point, but we need to look into AWS Organizations in general. I think there is no reason this won't be possible. One thing that we are thinking about is to allow extension points for
When I say "controlled deployment environment" I mean that we want to make sure that the runtime environment from which "cdk deploy" is executed, which usually runs with administrative privileges, is well-defined and does not run arbitrary user code. However, I've recently realized that the current design is faulty in its assumption that user code will not run from the deployment action, because at the moment we have coupled asset bundling and publishing with deployment. This means that, for example,
Can you provide a more concrete example? I am not sure I understand the idea.
In the current design, yes. Users will be paying for the codebuild jobs that run within their pipeline (unlike CFN actions). We will look into changing that.
The CodePipeline construct library supports this, I believe. @skinny85 is that right?
Yes, absolutely.
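For reference, a minimal sketch of the run-order control already available in the CodePipeline construct library. The buckets, stage, and action names here are made up purely for illustration and are not part of the RFC:

```ts
import * as codepipeline from '@aws-cdk/aws-codepipeline';
import * as cpactions from '@aws-cdk/aws-codepipeline-actions';
import * as s3 from '@aws-cdk/aws-s3';
import { App, Stack } from '@aws-cdk/core';

const app = new App();
const stack = new Stack(app, 'PipelineStack');

const sourceBucket = new s3.Bucket(stack, 'SourceBucket', { versioned: true });
const siteBucket = new s3.Bucket(stack, 'SiteBucket');
const sourceOutput = new codepipeline.Artifact();

new codepipeline.Pipeline(stack, 'Pipeline', {
  stages: [
    {
      stageName: 'Source',
      actions: [
        new cpactions.S3SourceAction({
          actionName: 'Source',
          bucket: sourceBucket,
          bucketKey: 'app.zip',
          output: sourceOutput,
        }),
      ],
    },
    {
      stageName: 'Deploy',
      actions: [
        // Within a stage, actions with the same runOrder execute in parallel;
        // higher runOrder values execute later.
        new cpactions.ManualApprovalAction({
          actionName: 'Approve',
          runOrder: 1,
        }),
        new cpactions.S3DeployAction({
          actionName: 'CopyToSite',
          bucket: siteBucket,
          input: sourceOutput,
          runOrder: 2, // only runs after the approval completes
        }),
      ],
    },
  ],
});

app.synth();
```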
Something feels a bit off in the approach in that a pipeline is defined as a stage in a pipeline. While there are many interpretations of what constitutes a pipeline, it's probably best to align with the general AWS interpretation as seen in the CodePipeline implementation. In a generic sense, a pipeline is nothing more than an application itself and should follow a serverless/ephemeral model using the same infrastructure-as-code principles, or better yet, Pipeline-as-Code (PaC).
This is a bit of a chicken-or-egg situation, but as mentioned, there needs to be a bootstrap action to get started. This should be an option to deploy the actual pipeline stack for a CDK app (including any subordinate pipelines required to build assets and artifacts), be it a simple pipeline or a pipeline of pipelines, so to speak.
It's common to have branch- or tag-based deployments in CI (I currently rely on a …).
I also recall an AWS blog post on creating a branch-based pipeline and destroying the pipeline when done with the branch.
@vaneek here is the reference architecture from the blog post about the pipeline.
design/continuous-delivery.md (outdated excerpt):

> ### North America deployment
>
> 1. Create an AWS account for the service in North America (`ACNT-NA`)
It looks like this POC requires the manual creation of a third party role with temp credentials which you are supplying an EXTERNAL_ID for. I love the idea of passing temp credentials to the project build!
How does everyone feel about having CDK generate this role and an external id?
Referencing this issue: https://github.com/eladb/cdkcd-test/issues/1
Follow-up in #3437
Sorry, accidentally closed.
Conversation about this here: #3555

Thanks everyone for the comments. We've been discussing some of them, and here is an initial takeaway: we can't escape decoupling "prepare" and "deploy", both to mitigate the risks of running docker build in an environment that has administrative privileges, and to allow the use of the stock CloudFormation actions for deployments (which addresses concerns about cost and lets us constrain the administrative IAM role in remote accounts to the CloudFormation service principal). To that end, we will introduce a new command …

This URL can be used as a self-contained token to deploy a specific instance (in time) of the stack. This command will allow us to separate the actions of asset preparation and deployment for each stack, and to use the stock CloudFormation action for the actual deployment of the stack. This approach will also allow users who prefer not to use CodePipeline to take advantage of these more granular building blocks and integrate them into their own CI/CD systems. Would love to hear people's thoughts on this direction.
Initial introduction of the concept of cdk-package; still needs end-to-end updates.
@jonny-rimek wrote:
According to this RFC, lambda bundles are identified based on the hash of their source (be it the contents of the directory specified in …). Does that make sense?
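For illustration, a minimal sketch of a directory-based lambda asset; the paths and names below are made up:

```ts
import * as lambda from '@aws-cdk/aws-lambda';
import { App, Stack } from '@aws-cdk/core';

const app = new App();
const stack = new Stack(app, 'ServiceStack');

// The bundle is identified by the hash of ./handler's contents. If nothing in
// that directory changes between pipeline runs, the asset is considered
// unchanged and the function code is not republished.
new lambda.Function(stack, 'Handler', {
  runtime: lambda.Runtime.NODEJS_12_X,
  handler: 'index.handler',
  code: lambda.Code.fromAsset('./handler'),
});

app.synth();
```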
@jogold wrote:
You can still specify a context variable during build when you invoke …
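To illustrate, a small sketch of branch-based synthesis using a hypothetical `branch` context key supplied at build time; the key name and stack-naming scheme are illustrative only:

```ts
import { App, Stack } from '@aws-cdk/core';

const app = new App();

// Context supplied by the CI job at synth time, e.g.:
//   cdk synth --context branch=feature-x
// Falls back to 'master' when no value is provided.
const branch = app.node.tryGetContext('branch') ?? 'master';

// Namespace the stack per branch so each branch gets its own deployment.
new Stack(app, `MyService-${branch}`);

app.synth();
```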
@EdoBarroso wrote:
Added information about how to use ALZ/CT (AWS Landing Zone / Control Tower) for bootstrapping.
@jamiepmullan wrote:
Not sure I am familiar with the error you are describing; it sounds like that shouldn't be the experience. At any rate, lambda bundles are identified by their source hash, which means they will change only if their source changes.
You define your pipeline; we just provide the building blocks, so you can define any pipeline you want.
This change doesn't attempt to implement the CI/CD system, just to provide building blocks (at varying levels of abstraction) for integrating into CI/CD systems. We will offer high-level stages/actions for CodePipeline and tools that can be used in systems like GitHub Workflows, Jenkins, etc.
The idea is that bootstrapping should be very simple, sort of like the BIOS of your computer. Ideally those resources should be quite static. We try to make sure that the bootstrapping process itself can be done through tools like StackSets and Control Tower, so you can leverage these existing services as much as possible to manage these environments.
Resources will be destroyed automatically as long as their stacks exist. Destruction of stacks and environments is currently a gap in this RFC; I will add that.
@gataka wrote:
In a sense, yes. The "cloud assembly" is that package.
You will be able to specify bucket/repo in your CDK app and the cdk-assets tool will use those. We intentionally wanted these values to resolve during synthesis and not during deployment. Can you find any specific use cases where this approach breaks down? One of the main reasons is that there is no standard way to wire cloudformation parameters in CI/CD systems and we realized that if we just avoid using parameters in the asset system, integrating our CI/CD solution into other non-CodePipeline systems would be much simpler. It will also give users much more flexibility to encode any custom unique logic for determining where assets need to be published to since it will be Just Code ™️ in their App.
Nested stacks are going to be fully supported by this solution.
Thanks!
Yes. I reread the RFC, and if I understand it correctly, the information about the hash is saved inside assets.json. Does that mean that before running `cdk synth` during the build stage, the old assets.json (or the whole cdk.out?) is downloaded so the CDK can compare it with the old hashes? I get that the hashes are used to check whether something has changed; I just don't understand what the values are compared to, since each run of CodeBuild starts in an empty container, whereas if I run `cdk deploy` on my machine, cdk.out stays around (but it's gitignored).
feat(ecr-assets): simplify docker asset publishing

As part of our work on [continuous delivery for CDK apps], we decided to store all docker assets in a single ECR repository per environment. This is consistent with file assets, which are stored in a single S3 bucket that is part of the bootstrap stack. This environment-specific ECR repository uses a well-known physical name `aws-cdk/assets` and will be created automatically if needed. In the future, this repository will be created as part of the environment bootstrapping process.

The primary change is that the location of docker image assets will be fully determined by the framework. The ECR repository name will be hard-coded to `aws-cdk/assets` and the image tag will be based on the source hash of the docker asset. This means we can get rid of the CloudFormation parameter that the CLI used to assign the image name, which helps to reduce the number of asset parameters (#3463).

Since the asset ECR repository will now contain different types of images (and not versions of the same image), there is no longer a concept of a "latest" image, and the optimization triggered by the `--ci` flag in the CLI (pulling "latest") is no longer relevant. Luckily, CodeBuild now supports docker image layer caching, so this should be the preferred way to optimize docker build times. The `--ci` feature of the CLI no longer does anything.

Furthermore, before this change, a custom resource called `AdoptedRepository` was automatically added to the stack for each asset in order to clean up ECR repositories. The purpose of this resource was to remove the asset's ECR repository if the asset was no longer referenced by the stack. To address this need with the centralized repository, we plan to introduce a garbage collection capability that users will be able to invoke in order to clean up unused assets from both ECR and S3.

We will introduce a way to customize asset repository names as part of the CI/CD project. In the meantime, if you need to override the default `aws-cdk/assets` name, you can specify a repository name through the context key `assets-ecr-repository-name` (`--context` in the CLI, `cdk.json`, `new App({ context })` or `stack.setContext`).

BACKWARDS COMPATIBILITY

As per our guidelines for backwards compatibility, the CLI must be backwards compatible with apps from before this change. However, apps that use CDK >= 1.21.0 will require an upgrade of the CLI. Therefore, to introduce this change, we have made the following non-breaking changes in cx-api:

1. Make `imageNameParameter` optional. If it is specified, the CLI will continue to …
2. Add an optional `imageTag` which instructs the CLI which tag to use for the image. If omitted (by previous versions), `latest` will be used as before.

To make it easy to reason about the behavior for old apps, the CLI now has a new implementation of `prepareContainerAsset` called `prepareContainerImageAssetNew`. This new code path is triggered when the asset metadata *does not include* `imageNameParameter`. The new implementation requires that both `repositoryName` and `imageTag` be defined. The old code path was only modified to support the new optional `imageTag` property (although it is unlikely to be exercised).

Additional backwards compatibility concerns:

- New ECR repositories the CLI creates will not have the lifecycle policy that retains only the last 5 docker images. This should not have a functional impact on users, but goes back to the imminent garbage collection project.
- The removal of the `AdoptedRepository` resource from all stacks will result in the deletion of all previously created ECR repositories (this is what the `AdoptedRepository` resource is designed to do). This can be harmful since these repositories are referenced by the stack. To address this, we invalidate the image ID by salting the source hash. This means that after this change, all container images will have a new ID, which is not maintained by the removed adopted repository resource.

TESTING

- Unit tests for `prepareContainerImage` were duplicated and extended to exercise the new code path while preserving tests for the old path.
- All CLI integration tests were executed successfully against the new version.
- Manually tested that the new CLI works with old apps.

This change also fixes #5807 so that custom docker file names are relative and not absolute paths.

[continuous delivery for CDK apps]: #3437

BREAKING CHANGE: all docker image assets are now pushed to a single ECR repository named `aws-cdk/assets` with an image tag based on the hash of the docker build source directory (the directory where your `Dockerfile` resides). See PR #5733 for details and discussion.
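For reference, a minimal sketch of the override mentioned above, set through app context; the repository name used here is made up, and the same key can equally be set via `cdk.json` or `--context` on the CLI:

```ts
import { App, Stack } from '@aws-cdk/core';

// Override the default 'aws-cdk/assets' repository name for docker image
// assets until first-class customization ships with the CI/CD project.
// 'my-team/cdk-assets' is an illustrative name.
const app = new App({
  context: {
    'assets-ecr-repository-name': 'my-team/cdk-assets',
  },
});

new Stack(app, 'MyStack');
app.synth();
```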
> For publishing:
>
> * **S3 Bucket (+ KMS resources)**: for file assets and CloudFormation templates (a single bucket will contain all files, keyed by their source hash)
> * **ECR Repository**: for all docker image assets (a single repository will contain all images, tagged by their source hash)
We should consider not always creating an ECR repo in the bootstrap process; this would allow bootstrapping regions that do not yet have ECR available. It might also be further generalized to allow extending assets in the future to locations other than S3 and ECR.
This PR includes the RFC for supporting CI/CD for CDK apps of any complexity.
It also includes the RFC for cdk-assets, which is a tool for publishing CDK assets and is part of the CI/CD solution.

Tracking issue: aws/aws-cdk-rfcs#49
Addresses: #1312
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license