Override container runtime when inferentia support is enabled #2458
Summary
Added an agent config, InferentiaSupportEnabled, populated from the ECS_ENABLE_INF_SUPPORT env. When InferentiaSupportEnabled is on, the agent overrides the runtime of any container that specifies AWS_NEURON_VISIBLE_DEVICES to the neuron docker runtime, which is needed for using the inferentia devices.
This change lets us use the neuron runtime only for containers that need the inf device, and only when that runtime is installed on the AMI (which is indicated by the ECS_ENABLE_INF_SUPPORT config that we will add together with installing the neuron runtime).
Implementation details
api/task: added logic to override the container runtime to neuron if needed. This required a refactor in dockerHostConfig to satisfy the gocyclo complexity check.
api/container: added method RequireNeuronRuntime to check whether the container specifies using inf.
config: added InferentiaSupportEnabled config, populated by the ECS_ENABLE_INF_SUPPORT env.
Testing
Unit tests added. Built the agent, successfully ran an inf task, and verified that the runtime is set to neuron only for the container that specifies AWS_NEURON_VISIBLE_DEVICES.
New tests cover the changes: yes
Description for the changelog
TBD
Licensing
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.