How to speed up development workflow with local SageMaker? #4797
richardkmichael asked this question in Help · Unanswered
Replies: 0
I'm working with a large model framework (Nvidia NeMo), and a relatively complex inference script. There is a lot of experimentation, so I thought it would be faster to use SageMaker in "local" mode, and I have it working.
However:

1. The model data bundle has a `requirements.txt` which installs ~1GB of dependencies (NeMo and its deps), downloaded each time. SageMaker's `docker compose` rebuilds the container every time I re-run my local `model.deploy(instance_type='local', ...)`, which takes at least 5 minutes (installing dependencies). So it is a very slow development process -- change code, re-deploy, find a bug, etc.
2. Since the inference code is not "live" within the running torch server, I need to save, recreate the model tar.gz bundle, and re-deploy for any change to the inference code. This is also painful, even if the docker container were re-used.
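For reference, the repackaging step described in point 2 can be scripted with the standard library so it is at least one command rather than a manual tar invocation. This is a sketch under assumptions: the function name `rebuild_bundle` is mine, and I'm assuming the bundle is a flat directory containing the inference code and `requirements.txt`:

```python
import tarfile
from pathlib import Path

def rebuild_bundle(src_dir: str, out_path: str) -> str:
    """Re-create the model tar.gz bundle from a source directory.

    src_dir  -- directory holding the inference code, requirements.txt, etc.
    out_path -- path of the model.tar.gz to (re)write
    """
    with tarfile.open(out_path, "w:gz") as tar:
        for item in sorted(Path(src_dir).iterdir()):
            # arcname keeps entries relative to the bundle root, which is the
            # layout SageMaker expects inside model.tar.gz
            tar.add(item, arcname=item.name)
    return out_path
```

A file watcher (Re 2 below) could simply call this function whenever a source file changes.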
Any suggestions to speed up my local workflow?
I've considered --
Re 1: Is it possible to specify a custom image with `instance_type='local'`? The `model.deploy()` function doesn't seem to accept an image name argument. But if so, I could build a custom Docker image with the framework already installed, so that pip would quickly find `Requirement already satisfied` when it processes my `requirements.txt` file.

Re 2: I could automate the model bundle rebuild with a watcher on the inference code.
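If a custom image does turn out to be usable in local mode (possibly via the `image_uri` argument on the `Model` constructor rather than on `deploy()` -- I haven't verified this), the image itself might be a sketch like this. The base image placeholder and the `requirements.txt` path are assumptions:

```dockerfile
# Hypothetical base: whichever SageMaker PyTorch inference image the local
# container is currently built from.
FROM <your-sagemaker-pytorch-inference-image>

# Pre-install the heavy dependencies once at build time, so that when the
# container later processes the bundle's requirements.txt, pip reports
# "Requirement already satisfied" instead of downloading ~1GB again.
COPY requirements.txt /tmp/requirements.txt
RUN pip install --no-cache-dir -r /tmp/requirements.txt
```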
Thanks!