
Support broader work tasks/models using Lambda Docker container #346

Open
twelch opened this issue Sep 8, 2024 · 2 comments

twelch commented Sep 8, 2024

Need

Right now we are limited to the JavaScript ecosystem for geoprocessing functions. It should be possible to run a Docker container instead, capable of supporting other languages and environments.

Solution

  • user adds a Dockerfile to the project source code.
    • The Dockerfile builds on an Amazon base image capable of being run as a Lambda.
    • The entry point is a JavaScript Lambda handler function, not unlike GeoprocessingHandler.ts. It receives the input payload, calls the underlying user-provided geoprocessing function, and returns the result (see the handler sketch after this list).
  • user registers a geoprocessing function in geoprocessing.json and includes a reference to the Dockerfile it requires.
    • The geoprocessing function is written much as it is now, except it gets run within the Docker container by the entry point function. It will include user-created code that calls out to run tools with the necessary input, most likely using a shell exec command. Input and output may need to be prepped/transformed going in and coming out.
    • It may also make sense to be able to invoke the container Lambda as a worker, so that it can scale better.
  • on deploy, CDK will use lambda.DockerImageFunction to build each Dockerfile into an image and make it available to run as a Lambda function (see the CDK sketch after this list).
  • the result S3 bucket can be used to store one or more results, and S3 metadata gets returned as the function result, with temporary pre-signed URLs to access the results.
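
A rough sketch of the entry point described above; the payload shape and the runGeoprocessing import are assumptions, not existing framework code:

```ts
// container-handler.ts - hypothetical Lambda entry point baked into the Docker image.
// Unwraps the incoming payload, runs the user-provided geoprocessing function,
// and returns its result, mirroring what GeoprocessingHandler.ts does for JS functions.
import type { Handler } from "aws-lambda";
import { runGeoprocessing } from "./myGeoprocessingFunction.js"; // hypothetical user function

export const handler: Handler = async (event) => {
  const { feature, extraParams } = event; // assumed payload shape
  return runGeoprocessing(feature, extraParams);
};
```

And a rough sketch of the CDK piece using lambda.DockerImageFunction; stack and construct names, asset paths, and sizing are illustrative only:

```ts
// Hypothetical CDK snippet: publish a project Dockerfile as an image-based Lambda.
import { App, Duration, Stack } from "aws-cdk-lib";
import * as lambda from "aws-cdk-lib/aws-lambda";

const app = new App();
const stack = new Stack(app, "GpDockerStack"); // stand-in for the project's geoprocessing stack

new lambda.DockerImageFunction(stack, "RModelGpFunction", {
  // builds the image from the Dockerfile the user registered alongside the function
  code: lambda.DockerImageCode.fromImageAsset("src/functions/rModel", {
    file: "Dockerfile",
  }),
  memorySize: 2048,
  timeout: Duration.minutes(5),
});
```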

For example, to run an R model

  • the Dockerfile would install the necessary R environment and code.
  • The geoprocessing function would call out to run the model, passing it input and getting output back (see the sketch below).
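
A sketch of what that call-out could look like inside the geoprocessing function, shelling out with Node's child_process; the R script path, arguments, and temp file locations are assumptions:

```ts
// Hypothetical geoprocessing function body that shells out to an R model
// installed in the image by the Dockerfile.
import { execFile } from "node:child_process";
import { readFile, writeFile } from "node:fs/promises";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

export async function runGeoprocessing(feature: unknown) {
  // prep input going in - write the sketch/feature where the R script can read it
  await writeFile("/tmp/input.geojson", JSON.stringify(feature));

  // run the model via Rscript (installed by the Dockerfile)
  await execFileAsync("Rscript", [
    "/opt/model/run_model.R", // hypothetical script path
    "/tmp/input.geojson",
    "/tmp/output.json",
  ]);

  // transform output coming back out
  return JSON.parse(await readFile("/tmp/output.json", "utf8"));
}
```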

See the example of how this was done for SeaSketch - UploadHandlerLambdaStack.

Challenges

  • Testing environment. How will smoke tests work? Should the local environment build Dockerfiles into images and run them?
  • Logging and plumbing for reporting status, including progress and errors.
  • Generating pre-signed URLs to read S3 bucket items (see the sketch after this list).
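
For the pre-signed URL piece, the AWS SDK v3 presigner covers the mechanics; a minimal sketch, assuming the bucket and key of a stored result are known:

```ts
// Minimal sketch: generate a temporary pre-signed URL for a result object in S3.
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

const s3 = new S3Client({});

export async function getResultUrl(bucket: string, key: string): Promise<string> {
  const command = new GetObjectCommand({ Bucket: bucket, Key: key });
  // URL expires after one hour; clients must re-request a fresh one after that
  return getSignedUrl(s3, command, { expiresIn: 3600 });
}
```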

Limitations

  • Limited to what can be installed and run starting with a base Amazon Linux image

Questions

  • Does this need to work for preprocessor functions also?
@twelch twelch added this to the 8.0 milestone Sep 8, 2024
@twelch twelch changed the title Support devcontainer for running Support running non-JS work tasks in Docker container Sep 9, 2024
@twelch twelch changed the title Support running non-JS work tasks in Docker container Support running broader work tasks in Docker container Sep 10, 2024
@twelch twelch changed the title Support running broader work tasks in Docker container Support broader work tasks/models using Lambda Docker container Sep 10, 2024

twelch commented Nov 9, 2024

Since we already use a container for developing reports, would Docker-in-Docker or Docker-from-Docker be needed? https://www.kenmuse.com/blog/docker-from-docker-in-alpine-dev-containers/

Running all smoke tests at once is already reaching the limits of a local computer; is there a way we can overcome this?
