Commit 0e5defb

23.08 Release preparation (#71)
* add dgl internal repo as a submodule
* update Dockerfile to use latest pyt staged image, updated dgl install and fix torch-harmonics stage
* update nvfuser API
* update dgl submodule
* updates
* update paths
* add vtk and pyvista
* revert Dockerfile changes, update the package to new version
* update the decorator version for onnx
* fix typo
* fix security issues in filesystem.py
* remove dgl as modulus core submodule
* update DGL build
* move some packages to Dockerfile
* update
* update tensorly installs
* add more arch support
* update python version
* add recursive option
* update Dockerfile
* update
* add test for http package
* update ci tensorflow version
* update changelog
1 parent 25287d5 commit 0e5defb

File tree

10 files changed

+103
-59
lines changed


.gitmodules

Whitespace-only changes.

CHANGELOG.md

+8
@@ -11,9 +11,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### Added

 - Added a CHANGELOG.md
+- Added build support for internal DGL

 ### Changed

+- DGL install changed from pypi to source
+
 ### Deprecated

 ### Removed
@@ -24,8 +27,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Security

+- Fixed security issues with subprocess and urllib in `filesystem.py`
+
 ### Dependencies

+- Updated the base container to latest PyTorch base container which is based on torch 2.0
+- Container now supports CUDA 12, Python 3.10
+
 ## [0.1.0] - 2023-05-08

 ### Added

Dockerfile

+53-40
@@ -12,72 +12,85 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-ARG PYT_VER=22.12
+ARG PYT_VER=23.06
 FROM nvcr.io/nvidia/pytorch:$PYT_VER-py3 as builder

 # Update pip and setuptools
 RUN pip install --upgrade pip setuptools

-# Setup git lfs
+# Setup git lfs, graphviz gl1(vtk dep)
 RUN apt-get update && \
-    apt-get install -y git-lfs && \
+    apt-get install -y git-lfs graphviz libgl1 && \
     git lfs install

-# Install nightly build of dgl
-RUN pip install --no-deps --pre dgl -f https://data.dgl.ai/wheels/cu117/repo.html
-RUN pip install --no-deps --pre dglgo -f https://data.dgl.ai/wheels-test/repo.html
-ENV DGLBACKEND=pytorch
-
 ENV _CUDA_COMPAT_TIMEOUT=90

+# TODO remove benchy dependency
+RUN pip install git+https://github.com/romerojosh/benchy.git
+# TODO use torch-harmonics pip package after the upgrade
+RUN pip install https://github.com/NVIDIA/torch-harmonics/archive/8826246cacf6c37b600cdd63fde210815ba238fd.tar.gz
+RUN pip install "tensorly>=0.8.1" "vtk>=9.2.6" "pyvista>=0.40.1" https://github.com/tensorly/torch/archive/715a0daa7ae0cbdb443d06780a785ae223108903.tar.gz
+
+# Install DGL (Internal if present otherwise from source)
+ARG DGL_BACKEND=pytorch
+ENV DGL_BACKEND=$DGL_BACKEND
+ENV DGLBACKEND=$DGL_BACKEND
+
+COPY . /modulus/
+RUN if [ -e "/modulus/deps/dgl" ]; then \
+        echo "Internal DGL exists. Using internal DGL build" && \
+        cp -r /modulus/deps/dgl/ /opt/ && \
+        mkdir /opt/dgl/dgl-source/build \
+        && cd /opt/dgl/dgl-source/build \
+        && export NCCL_ROOT=/usr \
+        && cmake .. -GNinja -DCMAKE_BUILD_TYPE=Release \
+            -DUSE_CUDA=ON -DCUDA_ARCH_BIN="60 70 75 80 86 90" -DCUDA_ARCH_PTX="90" \
+            -DCUDA_ARCH_NAME="Manual" \
+            -DBUILD_TORCH=ON \
+            -DBUILD_SPARSE=ON \
+        && cmake --build . \
+        && cd ../python \
+        && python setup.py bdist_wheel \
+        && pip install ./dist/dgl*.whl \
+        && rm -rf ./dist \
+        && rm -rf ../build \
+        && cd /opt/dgl/ \
+        && pip install --no-cache-dir -r requirements.txt; \
+    else \
+        echo "No Internal DGL present. Building from source" && \
+        git clone --recurse-submodules https://github.com/dmlc/dgl.git && \
+        cd dgl/ && DGL_HOME="/workspace/dgl/" bash script/build_dgl.sh -g && \
+        cd python && \
+        python setup.py install && \
+        python setup.py build_ext --inplace; \
+    fi
+
+# cleanup of stage
+RUN rm -rf /modulus/
+
 # Install custom onnx
 # TODO: Find a fix to eliminate the custom build
 # Forcing numpy update to over ride numba 0.56.4 max numpy constraint
 COPY . /modulus/
-RUN if [ -e "/modulus/deps/onnxruntime_gpu-1.14.0-cp38-cp38-linux_x86_64.whl" ]; then \
+RUN if [ -e "/modulus/deps/onnxruntime_gpu-1.15.1-cp310-cp310-linux_x86_64.whl" ]; then \
        echo "Custom wheel exists, installing!" && \
-       pip install --force-reinstall /modulus/deps/onnxruntime_gpu-1.14.0-cp38-cp38-linux_x86_64.whl; \
+       pip install --force-reinstall /modulus/deps/onnxruntime_gpu-1.15.1-cp310-cp310-linux_x86_64.whl; \
    else \
        echo "No custom wheel present, skipping" && \
-       pip install numpy==1.22.4; \
+       pip install "numpy==1.22.4"; \
    fi
 # cleanup of stage
 RUN rm -rf /modulus/

 # CI image
 FROM builder as ci
-RUN pip install tensorflow>=2.11.0 warp-lang>=0.6.0 black==22.10.0 interrogate==1.5.0 coverage==6.5.0 protobuf==3.20.0
-# TODO remove benchy dependency
-RUN pip install git+https://github.com/romerojosh/benchy.git
-# TODO use torch-harmonics pip package after the upgrade
-RUN pip install https://github.com/NVIDIA/torch-harmonics/archive/8826246cacf6c37b600cdd63fde210815ba238fd.tar.gz
-
-# install libcugraphops and pylibcugraphops
-ENV DEBIAN_FRONTEND=noninteractive
-ENV TZ=Etc/UTC
-RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
-
-RUN apt-get update &&\
-    apt-get install -y software-properties-common &&\
-    add-apt-repository ppa:ubuntu-toolchain-r/test &&\
-    apt-get install -y libstdc++6
-RUN mkdir -p /opt/cugraphops &&\
-    cd /opt/cugraphops &&\
-    wget https://anaconda.org/nvidia/libcugraphops/23.04.00/download/linux-64/libcugraphops-23.04.00-cuda11_230412_ga76892e3_0.tar.bz2 &&\
-    wget https://anaconda.org/nvidia/pylibcugraphops/23.04.00/download/linux-64/pylibcugraphops-23.04.00-cuda11_py38_230412_ga76892e3_0.tar.bz2 &&\
-    tar -xf libcugraphops-23.04.00-cuda11_230412_ga76892e3_0.tar.bz2 &&\
-    tar -xf pylibcugraphops-23.04.00-cuda11_py38_230412_ga76892e3_0.tar.bz2 &&\
-    rm libcugraphops-23.04.00-cuda11_230412_ga76892e3_0.tar.bz2 &&\
-    rm pylibcugraphops-23.04.00-cuda11_py38_230412_ga76892e3_0.tar.bz2
-
-ENV PYTHONPATH="${PYTHONPATH}:/opt/cugraphops/lib/python3.8/site-packages"
-
+RUN pip install "tensorflow>=2.9.0" "warp-lang>=0.6.0" "black==22.10.0" "interrogate==1.5.0" "coverage==6.5.0" "protobuf==3.20.0"
 COPY . /modulus/
 RUN cd /modulus/ && pip install -e . && rm -rf /modulus/

 # Deployment image
 FROM builder as deploy
-RUN pip install protobuf==3.20.0
+RUN pip install "protobuf==3.20.0"
 COPY . /modulus/
 RUN cd /modulus/ && pip install .

@@ -87,6 +100,6 @@ RUN rm -rf /modulus/
 # Docs image
 FROM deploy as docs
 # Install CI packages
-RUN pip install tensorflow>=2.11.0 warp-lang>=0.6.0 protobuf==3.20.0
+RUN pip install "tensorflow>=2.9.0" "warp-lang>=0.6.0" "protobuf==3.20.0"
 # Install packages for Sphinx build
-RUN pip install recommonmark==0.7.1 sphinx==5.1.1 sphinx-rtd-theme==1.0.0 pydocstyle==6.1.1 nbsphinx==0.8.9 nbconvert==6.4.3 jinja2==3.0.3
+RUN pip install "recommonmark==0.7.1" "sphinx==5.1.1" "sphinx-rtd-theme==1.0.0" "pydocstyle==6.1.1" "nbsphinx==0.8.9" "nbconvert==6.4.3" "jinja2==3.0.3"
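The new DGL stage follows an "internal if present, otherwise upstream" pattern: the branch taken depends only on whether the optional `deps/dgl` tree was shipped with the build context. A minimal Python sketch of the same decision logic (the helper name `pick_dgl_build` is illustrative, not part of the repo):

```python
import os
import tempfile

def pick_dgl_build(repo_root: str) -> str:
    """Return which DGL build path the Dockerfile's RUN if/else would take.

    "internal" means compiling the vendored deps/dgl sources (cmake + wheel);
    "source" means cloning dmlc/dgl and running script/build_dgl.sh.
    """
    if os.path.isdir(os.path.join(repo_root, "deps", "dgl")):
        return "internal"
    return "source"

# usage: an empty checkout falls back to the upstream source build
root = tempfile.mkdtemp()
assert pick_dgl_build(root) == "source"
os.makedirs(os.path.join(root, "deps", "dgl"))
assert pick_dgl_build(root) == "internal"
```

Keeping the check on a directory inside the copied context means the same Dockerfile serves both the internal CI and the public repo without any build arguments.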

modulus/models/layers/fused_silu.py

+1-1
@@ -15,7 +15,7 @@
 import functools
 import torch
 from torch.autograd import Function
-from torch._C._nvfuser import Fusion, FusionDefinition, DataType
+from nvfuser._C import Fusion, FusionDefinition, DataType


 _torch_dtype_to_nvfuser = {

modulus/utils/filesystem.py

+10-9
@@ -19,7 +19,7 @@
 import urllib.request
 import os
 import hashlib
-import subprocess
+import requests

 import logging

@@ -28,7 +28,7 @@
 try:
     LOCAL_CACHE = os.environ["LOCAL_CACHE"]
 except KeyError:
-    LOCAL_CACHE = os.environ["HOME"] + "/.cache/modulus"
+    LOCAL_CACHE = os.environ["HOME"] + "/.cache"


 def _cache_fs(fs):
@@ -55,15 +55,16 @@ def _download_cached(path: str, recursive: bool = False) -> str:
     if not os.path.exists(cache_path):
         logger.debug("Downloading %s to cache: %s", path, cache_path)
         if path.startswith("s3://"):
-            if recursive:
-                subprocess.check_call(
-                    ["aws", "s3", "cp", path, cache_path, "--recursive"]
-                )
-            else:
-                subprocess.check_call(["aws", "s3", "cp", path, cache_path])
+            fs = _get_fs(path)
+            fs.get(path, cache_path, recursive=recursive)
         elif url.scheme == "http":
+            # urllib.request.urlretrieve(path, cache_path)
             # TODO: Check if this supports directory fetches
-            urllib.request.urlretrieve(path, cache_path)
+            response = requests.get(path, stream=True, timeout=5)
+            with open(cache_path, "wb") as output:
+                for chunk in response.iter_content(chunk_size=8192):
+                    if chunk:
+                        output.write(chunk)
         elif url.scheme == "file":
             path = os.path.join(url.netloc, url.path)
     return path
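The HTTP branch now streams the response to disk in 8 KiB chunks rather than calling `urlretrieve`, so a large file never has to fit in memory at once. A self-contained sketch of the same chunked-write loop, with an in-memory stream standing in for a live `requests` response body:

```python
import io
import os
import tempfile

def write_chunks(stream, cache_path, chunk_size=8192):
    """Copy a readable byte stream to cache_path in fixed-size chunks.

    Same loop shape as iterating response.iter_content(chunk_size=8192)
    in the patched _download_cached; write_chunks itself is only a sketch.
    """
    with open(cache_path, "wb") as output:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:  # empty read signals end of stream
                break
            output.write(chunk)

# usage: a BytesIO stands in for the HTTP response body
payload = b"x" * 20000  # larger than one chunk, so the loop runs 3 times
path = os.path.join(tempfile.mkdtemp(), "cached.bin")
write_chunks(io.BytesIO(payload), path)
```

The `timeout=5` in the real call also fixes the hang that an unresponsive server could cause with `urlretrieve`, which has no default timeout.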

pyproject.toml

+1-3
@@ -9,7 +9,7 @@ authors = [
 ]
 description = "A deep learning framework for AI-driven multi-physics systems"
 readme = "README.md"
-requires-python = ">=3.7"
+requires-python = ">=3.8"
 license = {text = "Apache 2.0"}
 dependencies = [
     "h5py>=3.7.0",
@@ -20,8 +20,6 @@ dependencies = [
     "pytest>=6.0.0",
     "ruamel.yaml>=0.17.22",
     "setuptools>=67.6.0",
-    "tensorly>=0.8.1",
-    "tensorly-torch>=0.4.0",
     "torch>=1.12",
     "xarray>=2023.1.0",
     "zarr>=2.14.2",

test/deploy/test_onnx_fft.py

+2-2
@@ -41,10 +41,10 @@ def check_ort_version():
             True,
             reason="Proper ONNX runtime is not installed. 'pip install onnxruntime onnxruntime_gpu'",
         )
-    elif ort.__version__ != "1.14.0":
+    elif ort.__version__ != "1.15.1":
         return pytest.mark.skipif(
             True,
-            reason="Must install custom ORT 1.14.0. Other versions do not work \
+            reason="Must install custom ORT 1.15.1. Other versions do not work \
                 due to bug in IRFFT: https://github.com/microsoft/onnxruntime/issues/13236",
         )
     else:
else:

test/deploy/test_onnx_utils.py

+2-2
@@ -38,10 +38,10 @@ def check_ort_version():
             True,
             reason="Proper ONNX runtime is not installed. 'pip install onnxruntime onnxruntime_gpu'",
         )
-    elif ort.__version__ != "1.14.0":
+    elif ort.__version__ != "1.15.1":
         return pytest.mark.skipif(
             True,
-            reason="Must install custom ORT 1.14.0. Other versions do not work \
+            reason="Must install custom ORT 1.15.1. Other versions do not work \
                 due to bug in IRFFT: https://github.com/microsoft/onnxruntime/issues/13236",
         )
     else:
else:

test/models/common/inference.py

+2-2
@@ -38,10 +38,10 @@ def check_ort_version():
             True,
             reason="Proper ONNX runtime is not installed. 'pip install onnxruntime onnxruntime_gpu'",
         )
-    elif ort.__version__ != "1.14.0":
+    elif ort.__version__ != "1.15.1":
         return pytest.mark.skipif(
            True,
-            reason="Must install custom ORT 1.14.0. Other versions do not work \
+            reason="Must install custom ORT 1.15.1. Other versions do not work \
                 due to bug in IRFFT: https://github.com/microsoft/onnxruntime/issues/13236",
         )
     else:
else:

test/utils/test_filesystem.py

+24
@@ -12,10 +12,25 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

+import hashlib
 from pathlib import Path
 from modulus.utils import filesystem


+def calculate_checksum(file_path):
+    sha256 = hashlib.sha256()
+
+    with open(file_path, "rb") as f:
+        while True:
+            data = f.read(8192)
+            if not data:
+                break
+            sha256.update(data)
+
+    calculated_checksum = sha256.hexdigest()
+    return calculated_checksum
+
+
 def test_package(tmp_path: Path):
     string = "hello"
     afile = tmp_path / "a.txt"
@@ -28,3 +43,12 @@ def test_package(tmp_path: Path):
         ans = f.read()

     assert ans == string
+
+
+def test_http_package():
+    test_url = "http://raw.githubusercontent.com/NVIDIA/modulus/main/docs/img"
+    package = filesystem.Package(test_url, seperator="/")
+    path = package.get("modulus-pipes.jpg")
+
+    known_checksum = "e075b2836d03f7971f754354807dcdca51a7875c8297cb161557946736d1f7fc"
+    assert calculate_checksum(path) == known_checksum
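The new `calculate_checksum` helper hashes in 8 KiB chunks, which works because incremental SHA-256 updates produce exactly the same digest as hashing the whole buffer at once; this is what lets the test verify downloads of any size in constant memory. A quick self-contained check of that equivalence (the sample data is arbitrary):

```python
import hashlib

data = b"modulus" * 5000  # 35 kB, spans several 8192-byte chunks

# one-shot digest over the whole buffer
whole = hashlib.sha256(data).hexdigest()

# incremental digest, same chunking as calculate_checksum
sha256 = hashlib.sha256()
for start in range(0, len(data), 8192):
    sha256.update(data[start:start + 8192])
incremental = sha256.hexdigest()

assert whole == incremental  # chunked and one-shot digests agree
```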
