Lambda Bundles #7

eladb · 2019-12-08T21:39:00Z

PR	Champion
#	@eladb

Description

NodeFunction, JavaFunction, PythonFunction, etc.
Automatic publishing of environment variables
Combine "grant" & publish (e.g. bucket.grantXxx(lambda) will also add BUCKET_ARN environment variable)
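
A minimal sketch of what "grant & publish" could look like, assuming the grant call both adds a permission and sets an environment variable on the function. `FunctionSketch` and `BucketSketch` are hypothetical stand-ins written for illustration, not the actual CDK classes, and the S3 actions listed are only an example:

```typescript
// Hypothetical stand-ins for illustration only; these are NOT the CDK classes.
class FunctionSketch {
  readonly environment: Record<string, string> = {};
  readonly policyStatements: { actions: string[]; resource: string }[] = [];

  addEnvironment(key: string, value: string): void {
    this.environment[key] = value;
  }

  addToPolicy(actions: string[], resource: string): void {
    this.policyStatements.push({ actions, resource });
  }
}

class BucketSketch {
  readonly bucketArn: string;

  constructor(bucketArn: string) {
    this.bucketArn = bucketArn;
  }

  // Grants read access AND publishes the bucket ARN to the function's
  // environment, so the runtime code can discover the bucket.
  grantRead(fn: FunctionSketch, envName: string = 'BUCKET_ARN'): void {
    fn.addToPolicy(['s3:GetObject', 's3:ListBucket'], this.bucketArn);
    fn.addEnvironment(envName, this.bucketArn);
  }
}

const handlerFn = new FunctionSketch();
new BucketSketch('arn:aws:s3:::my-bucket').grantRead(handlerFn);
// handlerFn.environment now contains BUCKET_ARN alongside the new policy statement.
```

The runtime code would then read `BUCKET_ARN` from its environment instead of having the value wired in by hand.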

Progress

eladb · 2020-02-23T11:54:47Z

The new NodeJsFunction construct creates AWS Lambda bundles for Node.js only during synthesis.

If NodeJsFunction is used inside a 3rd-party construct library, the bundling will only happen when the app that consumes this library (directly or indirectly) is synthesized.

This means that, for example, bundling tools such as parcel or the docker image for building lambda native modules (which we wanted to introduce in #6323) will need to be installed in the environment of the top-level app.

In a sense, this is somewhat aligned with how docker assets work. We only include the source of the image in the cloud assembly, and only during publishing, we actually build the docker image.

I am wondering if perhaps the right approach is to move Lambda bundling into the publishing stage. This means that during synthesis, we will only copy the sources to the cloud assembly and we will add some hooks to the publishing stage (cdk-assets) which will allow processing these sources and producing an eventual bundle.

There is an interesting synergy related to Docker. We eventually want to bundle lambda functions inside a docker container that matches the Lambda environment, and the publishing environment supports docker.

Maybe the right, general purpose, solution is to basically treat these more like a docker asset than a .zip asset.

Copy: @rix0rrr, @jogold

ran-isenberg · 2020-02-24T07:48:30Z

As a developer who uses CDK, I'd rather get Lambda creation errors as soon as possible (i.e. at cdk synth) and not wait until the cdk deploy step, where it deploys other items.

To me, a solution would be that during the python lambda creation a docker container (with python dep and pipenv/poetry) will run, and pull the requirements into an output folder which can be then zipped. Each language, nodejs/python etc, will just spin up a different builder docker image but the logic can be the same overall.

eladb · 2020-02-27T09:38:16Z

Yes, I retract my proposal to do this during publishing. It won't work because you need dependencies from your project and those won't be available in the cloud assembly.

So this needs to happen either during build, before synth (and then during synth we will basically have a bundle that we can just reference as a .zip file) or it can happen during synth, but we need some way to abstract away any dependencies.

Let's examine these two options.

Before synth

In this option, the preparation of the bundle happens sometime before we call cdk synth. This means that as far as the CDK app is concerned, the asset is just a .zip file. For example, this is how we publish the lambda bundle for the @aws-cdk/aws-s3-deployment module. The published module includes a .zip file that contains the lambda bundle as-is.

This is technically already supported, but requires that users will codify this in their library build process (see the prebuild configuration in s3-deployment's package.json and the actual lambda build script).

It's not too hard, but also not a great developer experience.

It is important to notice that when building libraries the CDK CLI is not involved at all. The CLI is only used by applications, not when building and publishing libraries. This is just a normal TypeScript library.

Therefore, in this approach, we basically need to vend another command line tool that users will be able to integrate into their build system which will prepare these bundles and allow them to be referenced by the CDK library.

It won't be possible to relay configuration from the app to this new tool because the app is never executed when you build a library, so we will need some additional configuration that will be read by the CDK to identify the bundled assets.

Another downside of this approach is that the eventual library can be pretty big, because it will include the compiled zip file with all its dependencies, so we are not leveraging the standard dependency mechanisms.

During synth

In this approach we are basically saying that bundling only happens when the app is actually synthesized. This is similar to how NodeJsFunction works today (where parcel is only executed during synth) but we must find a way to abstract these dependencies.

One way to do that would be to always require that bundling happens inside a Docker container. This has the benefit of reducing the dependency surface area (consumers only need docker during synth) and will also allow us to actually build Lambda functions in a lambda-compatible container (like sam build), so native modules will be supported.

The main benefits of this approach:

Smaller library size (they contain only source, not artifacts)
Bundling configuration is self-contained inside the CDK code
No custom tools required to build libraries

Downsides:

Longer synth time

jogold · 2020-02-27T20:49:42Z

I think that the CDK should offer the best possible developer experience so I would definitely go for the during synth option.

Building inside a Docker container is indeed the right solution for maximum compatibility. But the main question is what kind of build workflow will run inside the container. The workflows here https://github.com/awslabs/aws-lambda-builders/tree/develop/aws_lambda_builders/workflows (= sam build) are an excellent source of inspiration and show how complex it can be when considering all the details of each language.

As far as JavaScript/TypeScript is concerned I'm not sure that the workflow offered by aws-lambda-builders gives the best developer experience:

potentially large Lambda package size: all the production dependencies are installed whether or not they are used in the Lambda function's source code. In addition, for a project with multiple Lambda functions, they all end up in each Lambda package unless you start with complex include/exclude logic or a specific directory structure with a package.json file for each Lambda
no transpiling: a real problem for JS developers using modern syntax, less a problem for TS where tsc will almost always run before synth
no monorepo support: copying the package.json of a single module and running npm install simply doesn't work here

We should maybe start with a minimum set of requirements: what is important/critical? what use cases do we want to support?

eladb · 2020-03-01T09:10:07Z

I tend to agree that synth is more in line with how we want the CDK experience to work. Ideally we should offer some kind of an open framework for building assets inside docker images during synthesis.

The minimal surface can be something like "run this command inside a docker image with two mounted volumes": /src, containing the project source tree, and /asset, mounted to where the asset output should be emitted (could be a directory or a file).
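
As a rough sketch of that minimal surface, the function below only assembles the argument list for a `docker run` invocation with the two mounts. The `DockerBundlingSketch` interface, its field names, and the image name in the usage example are hypothetical, chosen for illustration; only the `run`, `--rm`, `-v` and `-w` parts are standard docker CLI usage.

```typescript
// Hypothetical option bag for illustration; not an existing CDK interface.
interface DockerBundlingSketch {
  image: string;      // image that contains the build tooling
  command: string[];  // command to run inside the container
  sourceDir: string;  // host path, mounted at /src
  outputDir: string;  // host path, mounted at /asset
}

// Builds the arguments for `docker run`; actually spawning the process is left out.
function dockerRunArgs(opts: DockerBundlingSketch): string[] {
  return [
    'run', '--rm',
    '-v', `${opts.sourceDir}:/src`,    // project source tree
    '-v', `${opts.outputDir}:/asset`,  // where the asset output is emitted
    '-w', '/src',                      // run the command from the source tree
    opts.image,
    ...opts.command,
  ];
}

const bundlingArgs = dockerRunArgs({
  image: 'example/builder:latest', // placeholder image name
  command: ['sh', '-c', 'cp -r . /asset'],
  sourceDir: '/home/me/project',
  outputDir: '/home/me/project/cdk.out/asset.123',
});
```

A parcel-based bundler or a sam build based builder would then differ only in the image and command they pass in.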

Then, we can implement our parcel bundler using something like this, and also perhaps implement an additional builder that leverages sam build.

jogold · 2020-03-03T21:06:06Z

Hey @eladb, I'm sure you've seen aws/aws-cdk#6535... I've been working further on this to come up with the best possible developer experience for JS/TS Lambda functions.

I have now a working construct offering the following API:

Code can be defined inside the construct and it can use any top level dependency imported in the file (another file or an external module). This offers a really great developer experience. I'm using the Typescript Compiler API to analyze the AST for this.

It ~~can also easily support~~ supports props like externals (= a list of modules that should not be bundled, like aws-sdk) and natives (= a list of modules that should be included/installed in the node_modules folder, which can be done in a Lambda-compatible docker image). The whole process could be dockerized.

We can discuss this further when you're available.

jogold · 2020-03-11T10:38:52Z

@eladb can we start the discussion with this?

NodejsCodeFunction integ test

It works like this:

Find the defining file
Extract top level import/require statements from this file
Write the top level import/require statements and the handler's code to a temporary file
Collect identifiers in this new temporary file to check for unused import/require statements
Remove unused top level import/require statements in the temporary file
Give this to parcel
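
The steps above can be approximated with the sketch below. It is a deliberate simplification and not the construct's implementation: it uses regular expressions on single-line statements instead of the TypeScript compiler API, it works on strings rather than files, and it drops side-effect-only imports such as `import 'x'`. The names `buildEntry` and `boundNames` are hypothetical.

```typescript
// Single-line, top-level `import ... from '...'` and `const x = require('...')`.
const IMPORT_RE = /^import\s+(.+?)\s+from\s+['"][^'"]+['"];?\s*$/;
const REQUIRE_RE = /^(?:const|let|var)\s+(.+?)\s*=\s*require\(['"][^'"]+['"]\);?\s*$/;

// Returns the local names bound by a clause such as
// `x`, `* as x`, `{ a, b as c }`, `x, { a }` or `{ a: b }`.
function boundNames(clause: string): string[] {
  return clause
    .replace(/[{}]/g, ',')
    .split(',')
    .map(part => part.trim())
    .filter(part => part.length > 0)
    .map(part => {
      const pieces = part.split(/\s+as\s+|\s*:\s*/);
      return (pieces.pop() ?? part).trim();
    });
}

function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Keeps only the import/require lines whose bound names appear in the
// handler code, then appends the handler code itself.
function buildEntry(definingSource: string, handlerCode: string): string {
  const kept: string[] = [];
  for (const line of definingSource.split('\n')) {
    const match = IMPORT_RE.exec(line) ?? REQUIRE_RE.exec(line);
    if (!match) continue; // not a top-level import/require statement
    const names = boundNames(match[1] ?? '');
    const used = names.some(name =>
      new RegExp(`\\b${escapeRegExp(name)}\\b`).test(handlerCode));
    if (used) kept.push(line);
  }
  return [...kept, handlerCode].join('\n');
}
```

The resulting string corresponds to the temporary file that is handed to the bundler; a real implementation would need an AST to handle multi-line imports, shadowed names, and identifiers that only appear inside strings or comments.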

Moreover we have the following features:

Support for externals (typically aws-sdk)
Support for includes: modules that should not be bundled but included as "real" installs in a node_modules folder in the build dir. For this, versions are extracted from the package.json and, if we have a lock file (package-lock.json or yarn.lock), it is taken into account by using it and running install with the right installer (npm or yarn). The install process can optionally run in a Lambda-compatible container (currently using lambci docker images).
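
The version extraction and installer selection for includes could look roughly like the sketch below. The function names and types are hypothetical, and two behaviors are assumptions not stated above: yarn.lock takes precedence when both lock files exist, and npm is used when there is no lock file.

```typescript
type Installer = 'npm' | 'yarn';

// Only the fields this sketch reads from package.json.
interface PackageJsonSketch {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Picks the installer from the lock file that is present.
// Assumption: yarn.lock wins if both exist; npm is the fallback.
function chooseInstaller(files: string[]): { installer: Installer; lockFile?: string } {
  if (files.indexOf('yarn.lock') !== -1) {
    return { installer: 'yarn', lockFile: 'yarn.lock' };
  }
  if (files.indexOf('package-lock.json') !== -1) {
    return { installer: 'npm', lockFile: 'package-lock.json' };
  }
  return { installer: 'npm' };
}

// Looks up the version range of every included module in package.json.
function extractVersions(pkg: PackageJsonSketch, includes: string[]): Record<string, string> {
  const all: Record<string, string> = { ...pkg.devDependencies, ...pkg.dependencies };
  const result: Record<string, string> = {};
  for (const name of includes) {
    const version = all[name];
    if (version === undefined) {
      throw new Error(`Cannot find a version for '${name}' in package.json`);
    }
    result[name] = version;
  }
  return result;
}
```

The extracted versions would be written to a package.json in the build directory next to the copied lock file, and the chosen installer run there, optionally inside the container.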

eladb added the devex Developer Experience label Dec 8, 2019

MrArnoldPalmer added the status/proposed Newly proposed RFC label Jan 4, 2020

eladb changed the title ~~Runtime & infrastructure code integration~~ Lambda Bundles Feb 23, 2020

This was referenced Feb 23, 2020

Give feedback when parcel is building Lambda functions aws/aws-cdk#6319

Closed

CDK automated zip creation of Lambda function with custom library dependencies aws/aws-cdk#6294

Closed

jogold mentioned this issue Mar 9, 2020

feat(lambda-nodejs): inline runtime code next to infrastructure code aws/aws-cdk#6535

Closed

jogold mentioned this issue Jun 16, 2020

Improve build performance of NodejsFunction on Mac (@aws-cdk/aws-lambda-nodejs package) aws/aws-cdk#8544

Closed

2 tasks

eladb added status/done Implementation complete and removed status/proposed Newly proposed RFC labels Mar 11, 2021

eladb closed this as completed Aug 4, 2021
