[HuggingFace][Neuronx] Inference - Optimum Neuron 0.0.22(pt2.1.2) - Neuron sdk 2.18.0 - Transformers to 4.36.2 #3823
Merged
Commits (21):
d97105c update dockerfile (JingyaHuang)
35548a5 add artifacts (JingyaHuang)
9ce55a3 change dlc_developer_config.toml (JingyaHuang)
e5157b0 Merge branch 'master' into add-neuronx-inf-2.18 (JingyaHuang)
7ce36bc upgrade optimum neuron (JingyaHuang)
cd897d2 update artifacts (JingyaHuang)
7c4b338 install peft (JingyaHuang)
7b43c46 hacky fix for the sanity transformers_neuronx version check (JingyaHuang)
de9ca83 add new neuronx toolkit test (JingyaHuang)
a25591d fix by endpoint name? (JingyaHuang)
e3a8af6 fix no context test (JingyaHuang)
6c74c5a fix sanity (JingyaHuang)
2aa0107 test if fix (JingyaHuang)
cbc8141 fix (JingyaHuang)
e3104f4 test fix (JingyaHuang)
e78136d fix import (JingyaHuang)
5779cca fix import (JingyaHuang)
3e20124 Revert "change dlc_developer_config.toml" (JingyaHuang)
7aea687 Merge branch 'master' into add-neuronx-inf-2.18 (Captainia)
a8acbfc fix style (JingyaHuang)
d3c80fe Merge branch 'master' into add-neuronx-inf-2.18 (Captainia)
185 changes: 185 additions & 0 deletions
huggingface/pytorch/inference/docker/2.1/py3/sdk2.18.0/Dockerfile.neuronx
@@ -0,0 +1,185 @@

```dockerfile
FROM ubuntu:20.04

LABEL dlc_major_version="1"
LABEL maintainer="Amazon AI"
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

ARG PYTHON=python3.10
ARG PYTHON_VERSION=3.10.12
ARG MMS_VERSION=1.1.11
ARG MAMBA_VERSION=23.1.0-4

# Neuron SDK components version numbers
ARG NEURONX_FRAMEWORK_VERSION=2.1.2.2.1.0
ARG NEURONX_DISTRIBUTED_VERSION=0.7.0
ARG NEURONX_CC_VERSION=2.13.66.0
ARG NEURONX_TRANSFORMERS_VERSION=0.10.0.21
ARG NEURONX_COLLECTIVES_LIB_VERSION=2.20.22.0-c101c322e
ARG NEURONX_RUNTIME_LIB_VERSION=2.20.22.0-1b3ca6425
ARG NEURONX_TOOLS_VERSION=2.17.1.0

# HF ARGS
ARG TRANSFORMERS_VERSION
ARG DIFFUSERS_VERSION=0.27.2
ARG OPTIMUM_NEURON_VERSION=0.0.22
ARG SENTENCE_TRANSFORMERS=2.6.1
ARG PEFT_VERSION=0.10.0

# See http://bugs.python.org/issue19846
ENV LANG C.UTF-8
ENV LD_LIBRARY_PATH /opt/aws/neuron/lib:/lib/x86_64-linux-gnu:/opt/conda/lib/:$LD_LIBRARY_PATH
ENV PATH /opt/conda/bin:/opt/aws/neuron/bin:$PATH
ENV SAGEMAKER_SERVING_MODULE sagemaker_pytorch_serving_container.serving:main
ENV TEMP=/home/model-server/tmp
```
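Among the build arguments in this file, `TRANSFORMERS_VERSION` is the only one declared without a default, so its value has to be supplied at build time (presumably by the repository's build tooling; the PR title names Transformers 4.36.2). The following is a minimal shell sketch of how such arguments could be spotted before building. The function name `args_without_default`, the image tag, and the build context are hypothetical and not part of the PR; the sample is an excerpt of the ARG lines in the diff.

```shell
#!/bin/sh
# Print the names of ARG instructions that have no default value.
# Reads Dockerfile text on stdin. (Hypothetical helper, not part of the PR.)
args_without_default() {
    # Matches "ARG NAME" with no "=VALUE" part and prints NAME only.
    sed -n 's/^ARG[[:space:]]\{1,\}\([A-Za-z_][A-Za-z0-9_]*\)[[:space:]]*$/\1/p'
}

# Excerpt of the ARG lines from the Dockerfile in this diff.
sample='ARG PYTHON=python3.10
ARG MMS_VERSION=1.1.11
ARG TRANSFORMERS_VERSION
ARG DIFFUSERS_VERSION=0.27.2
ARG OPTIMUM_NEURON_VERSION=0.0.22'

missing=$(printf '%s\n' "$sample" | args_without_default)
echo "ARGs that need --build-arg: $missing"

# Assumed manual build command; the tag and context directory are placeholders.
# The command is only printed here, not executed.
build_cmd="docker build --build-arg TRANSFORMERS_VERSION=4.36.2 -f Dockerfile.neuronx -t hf-neuronx-inference:local ."
echo "$build_cmd"
```

If the argument is left unset, `${TRANSFORMERS_VERSION}` expands to an empty string and the `transformers[...]==` requirement later in the file would most likely be rejected by pip.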
```dockerfile
RUN apt-get update \
 && apt-get upgrade -y \
 && apt-get install -y --no-install-recommends software-properties-common \
 && add-apt-repository ppa:openjdk-r/ppa \
 && apt-get update \
 && apt-get install -y --no-install-recommends \
    build-essential \
    apt-transport-https \
    ca-certificates \
    cmake \
    curl \
    emacs \
    git \
    jq \
    libgl1-mesa-glx \
    libsm6 \
    libxext6 \
    libxrender-dev \
    openjdk-11-jdk \
    vim \
    wget \
    unzip \
    zlib1g-dev \
    libcap-dev \
    gpg-agent \
 && rm -rf /var/lib/apt/lists/* \
 && rm -rf /tmp/tmp* \
 && apt-get clean

RUN echo "deb https://apt.repos.neuron.amazonaws.com focal main" > /etc/apt/sources.list.d/neuron.list
RUN wget -qO - https://apt.repos.neuron.amazonaws.com/GPG-PUB-KEY-AMAZON-AWS-NEURON.PUB | apt-key add -

# Install Neuronx tools
RUN apt-get update \
 && apt-get install -y \
    aws-neuronx-tools=$NEURONX_TOOLS_VERSION \
    aws-neuronx-collectives=$NEURONX_COLLECTIVES_LIB_VERSION \
    aws-neuronx-runtime-lib=$NEURONX_RUNTIME_LIB_VERSION \
 && rm -rf /var/lib/apt/lists/* \
 && rm -rf /tmp/tmp* \
 && apt-get clean

# https://github.com/docker-library/openjdk/issues/261 https://github.com/docker-library/openjdk/pull/263/files
RUN keytool -importkeystore -srckeystore /etc/ssl/certs/java/cacerts -destkeystore /etc/ssl/certs/java/cacerts.jks -deststoretype JKS -srcstorepass changeit -deststorepass changeit -noprompt; \
    mv /etc/ssl/certs/java/cacerts.jks /etc/ssl/certs/java/cacerts; \
    /var/lib/dpkg/info/ca-certificates-java.postinst configure;

RUN curl -L -o ~/mambaforge.sh https://github.com/conda-forge/miniforge/releases/download/${MAMBA_VERSION}/Mambaforge-${MAMBA_VERSION}-Linux-x86_64.sh \
 && chmod +x ~/mambaforge.sh \
 && ~/mambaforge.sh -b -p /opt/conda \
 && rm ~/mambaforge.sh \
 && /opt/conda/bin/conda update -y conda \
 && /opt/conda/bin/conda install -c conda-forge -y \
    python=$PYTHON_VERSION \
    pyopenssl \
    cython \
    mkl-include \
    mkl \
    botocore \
    parso \
    scipy \
    typing \
    # Below 2 are included in miniconda base, but not mamba so need to install
    conda-content-trust \
    charset-normalizer \
 && /opt/conda/bin/conda update -y conda \
 && /opt/conda/bin/conda clean -ya

RUN conda install -c conda-forge \
    scikit-learn \
    h5py \
    requests \
 && conda clean -ya \
 && pip install --upgrade pip --trusted-host pypi.org --trusted-host files.pythonhosted.org \
 && ln -s /opt/conda/bin/pip /usr/local/bin/pip3 \
 && pip install packaging \
    enum-compat \
    ipython

RUN pip install --no-cache-dir -U \
    opencv-python>=4.8.1.78 \
    "numpy>=1.22.2, <1.24" \
    "scipy>=1.8.0" \
    six \
    "pillow>=10.0.1" \
    "awscli<2" \
    pandas==1.* \
    boto3 \
    cryptography
```
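One detail in the `pip install` step above may deserve a second look: `opencv-python>=4.8.1.78` is the only `>=` specifier in that list that is not quoted. A shell-form `RUN` is executed through `/bin/sh -c`, and a POSIX shell reads an unquoted `>` as output redirection. The sketch below illustrates that parsing behaviour in isolation; `print_args` is a hypothetical stand-in for `pip install` and simply shows which arguments the command receives.

```shell
#!/bin/sh
# "print_args" is a hypothetical stand-in for "pip install": it prints each
# argument it receives in brackets, so nothing is actually installed.
print_args() { printf '[%s]' "$@"; printf '\n'; }

# Work in a scratch directory because the unquoted form creates a file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Unquoted: the shell treats ">" as a redirection operator. The command only
# receives "opencv-python", and its stdout goes to a file named "=4.8.1.78".
print_args install opencv-python>=4.8.1.78
unquoted=$(cat ./=4.8.1.78)

# Quoted: the whole requirement specifier reaches the command as one argument.
quoted=$(print_args install "opencv-python>=4.8.1.78")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# Remove the scratch directory again.
cd / && rm -rf "$workdir"
```

Under that reading, the build would install `opencv-python` without the intended lower bound and write pip's output to a stray file; quoting the specifier the way the neighbouring `"pillow>=10.0.1"` entry does would avoid this. The unquoted `pandas==1.*` is passed through literally only as long as no file in the working directory matches the glob.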
```dockerfile
# Install Neuronx-cc and PyTorch
RUN pip install --extra-index-url https://pip.repos.neuron.amazonaws.com \
    neuronx-cc==$NEURONX_CC_VERSION \
    torch-neuronx==$NEURONX_FRAMEWORK_VERSION \
    neuronx_distributed==$NEURONX_DISTRIBUTED_VERSION \
    transformers-neuronx==$NEURONX_TRANSFORMERS_VERSION \
 && pip install "protobuf>=3.18.3,<4" \
 && pip install --no-deps --no-cache-dir -U torchvision==0.16.*

WORKDIR /

RUN pip install --no-cache-dir \
    multi-model-server==$MMS_VERSION \
    sagemaker-inference

RUN useradd -m model-server \
 && mkdir -p /home/model-server/tmp \
 && chown -R model-server /home/model-server

COPY neuron-entrypoint.py /usr/local/bin/dockerd-entrypoint.py
COPY neuron-monitor.sh /usr/local/bin/neuron-monitor.sh
COPY config.properties /etc/sagemaker-mms.properties

RUN chmod +x /usr/local/bin/dockerd-entrypoint.py \
 && chmod +x /usr/local/bin/neuron-monitor.sh

ADD https://raw.githubusercontent.com/aws/deep-learning-containers/master/src/deep_learning_container.py /usr/local/bin/deep_learning_container.py

RUN chmod +x /usr/local/bin/deep_learning_container.py

#################################
# Hugging Face specific section #
#################################

RUN curl https://aws-dlc-licenses.s3.amazonaws.com/pytorch-1.13/license.txt -o /license.txt

# install Hugging Face libraries and its dependencies
RUN pip install --no-cache-dir \
    transformers[sentencepiece,audio,vision]==${TRANSFORMERS_VERSION} \
    diffusers==${DIFFUSERS_VERSION} \
    optimum-neuron==${OPTIMUM_NEURON_VERSION} \
    sentence_transformers==${SENTENCE_TRANSFORMERS} \
    peft==${PEFT_VERSION} \
    "sagemaker-huggingface-inference-toolkit>=2.4.0,<3"

RUN pip install --no-cache-dir -U "pillow>=10.0.1"

RUN HOME_DIR=/root \
 && curl -o ${HOME_DIR}/oss_compliance.zip https://aws-dlinfra-utilities.s3.amazonaws.com/oss_compliance.zip \
 && unzip ${HOME_DIR}/oss_compliance.zip -d ${HOME_DIR}/ \
 && cp ${HOME_DIR}/oss_compliance/test/testOSSCompliance /usr/local/bin/testOSSCompliance \
 && chmod +x /usr/local/bin/testOSSCompliance \
 && chmod +x ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh \
 && ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh ${HOME_DIR} ${PYTHON} \
 && rm -rf ${HOME_DIR}/oss_compliance* \
 # conda leaves an empty /root/.cache/conda/notices.cache file which is not removed by conda clean -ya
 && rm -rf ${HOME_DIR}/.cache/conda

EXPOSE 8080 8081
ENTRYPOINT ["python", "/usr/local/bin/dockerd-entrypoint.py"]
CMD ["serve"]
```
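The last two instructions work together: with an exec-form `ENTRYPOINT`, Docker appends `CMD` as default arguments, and any arguments given to `docker run` after the image name replace `CMD`. The sketch below only simulates that composition in plain shell so it can be read without Docker installed. `effective_command` and the argument `some-other-arg` are hypothetical; whether the entrypoint script accepts anything besides `serve` is not shown in this diff.

```shell
#!/bin/sh
# Simulates how Docker combines an exec-form ENTRYPOINT with CMD.
# Arguments passed to this function play the role of arguments given after
# the image name in "docker run IMAGE ...". (Hypothetical helper.)
effective_command() {
    entrypoint='python /usr/local/bin/dockerd-entrypoint.py'
    default_cmd='serve'
    if [ "$#" -gt 0 ]; then
        # Run-time arguments replace CMD entirely.
        echo "$entrypoint $*"
    else
        # No run-time arguments: CMD is appended to ENTRYPOINT.
        echo "$entrypoint $default_cmd"
    fi
}

default_run=$(effective_command)
override_run=$(effective_command some-other-arg)

echo "$default_run"
echo "$override_run"
```

So a container started from this image without extra arguments runs the entrypoint script with `serve`.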